Beluga_parserParsing and disambiguation of Beluga signatures.
Make sure to familiarize yourself with the beluga-language-specification.tex file which exhaustively details the functional requirements of this parser.
Parsing of Beluga signatures is implemented in three phases:
Token and Lexer)Parser_combinator and modules with the suffix _parser)Disambiguation_state and modules with the suffix _disambiguation)The lexing phase converts UTF-8 encoded input files to tokens using the sedlex library (see the dune file for the pre-processing configuration of the Lexer module). Tokenization works like in other programming language implementations, except for a special handling of the dot . operator with the Token.DOT, Token.DOT_IDENT s and Token.DOT_INTLIT i token kinds. See the Common_parser module for a note on how those tokens are handled.
The context-free parsing phase is a recursive descent parser implemented with parser combinators and selective unlimited backtracking. This phase handles parsing of static operators, and constructs a parse tree with ambiguous nodes. These ambiguous nodes are qualified and dotted qualified identifiers (which may represent projections instead of references to constants in namespaces), applications (the juxtaposition of terms and user-defined operators), contextual LF substitutions and contexts, and old-style LF declarations (LF type-level and term-level declarations).
The context-sensitive disambiguation phase is a stateful translation from the parser syntax to the external syntax. The disambiguation state constructs the lexical referencing environment during the traversal so that identifiers and qualified identifiers may be resolved to their binding sites to determine what kind of AST node to produce. Overloading of identifiers is not supported. The disambiguation state is mutable to ensure good performance.
module Config_parser : sig ... endReader for configuration files for Beluga signatures.
module Token : sig ... endToken definition for lexing.
module Located_token : sig ... endLexer tokens annotated with locations.
module Lexer : sig ... endTokenization of UTF-8 text input.
module type PARSER_STATE = Parser_combinator.PARSER_STATEmodule Parser_combinator : sig ... endGeneral purpose parser combinator library.
module Common_parser : sig ... endCommon parser definitions for parsing Beluga signatures.
module Lf_parser : sig ... endmodule Clf_parser : sig ... endmodule Meta_parser : sig ... endmodule Comp_parser : sig ... endmodule Harpoon_parser : sig ... endmodule Signature_parser : sig ... endmodule type PARSING = sig ... endAbstract definition of parsing for Beluga.
module Make_parsing
(Parser_state :
PARSER_STATE
with type token = Located_token.t
and type location = Beluga_syntax.Location.t) :
PARSING
with type state = Parser_state.state
and type token = Parser_state.token
and type location = Parser_state.locationFunctor creating a Beluga parser using the given implementation of state.
module type DISAMBIGUATION_STATE = Disambiguation_state.DISAMBIGUATION_STATEmodule Disambiguation_state : sig ... endDefinition and implementation of the state required to disambiguate the Beluga syntax from the parser AST to the external AST.
module Application_disambiguation : sig ... endDisambiguation of expression applications with user-defined operators and juxtapositions.
module Lf_disambiguation : sig ... endmodule Clf_disambiguation : sig ... endmodule Meta_disambiguation : sig ... endmodule Comp_disambiguation : sig ... endmodule Harpoon_disambiguation : sig ... endmodule Signature_disambiguation : sig ... endmodule type DISAMBIGUATION = sig ... endAbstract definition of disambiguation for Beluga.
module Make_disambiguation
(Disambiguation_state : DISAMBIGUATION_STATE) :
DISAMBIGUATION with type state = Disambiguation_state.stateFunctor creating a Beluga disambiguator using the given implementation of state.
module Make
(Parser_state :
PARSER_STATE
with type token = Located_token.t
and type location = Beluga_syntax.Location.t)
(Disambiguation_state : DISAMBIGUATION_STATE) :
sig ... endmodule Parser : sig ... end