Module Beluga_parser

Parsing and disambiguation of Beluga signatures.

Make sure to familiarize yourself with the beluga-language-specification.tex file which exhaustively details the functional requirements of this parser.

Parsing of Beluga signatures is implemented in three phases:

  1. Lexing/tokenization (see Token and Lexer)
  2. Context-free parsing (see Parser_combinator and modules with the suffix _parser)
  3. Context-sensitive disambiguation (see Disambiguation_state and modules with the suffix _disambiguation)

The lexing phase converts UTF-8 encoded input files to tokens using the sedlex library (see the dune file for the pre-processing configuration of the Lexer module). Tokenization works like in other programming language implementations, except for a special handling of the dot . operator with the Token.DOT, Token.DOT_IDENT s and Token.DOT_INTLIT i token kinds. See the Common_parser module for a note on how those tokens are handled.

The context-free parsing phase is a recursive descent parser implemented with parser combinators and selective unlimited backtracking. This phase handles parsing of static operators, and constructs a parse tree with ambiguous nodes. These ambiguous nodes are qualified and dotted qualified identifiers (which may represent projections instead of references to constants in namespaces), applications (the juxtaposition of terms and user-defined operators), contextual LF substitutions and contexts, and old-style LF declarations (LF type-level and term-level declarations).

The context-sensitive disambiguation phase is a stateful translation from the parser syntax to the external syntax. The disambiguation state constructs the lexical referencing environment during the traversal so that identifiers and qualified identifiers may be resolved to their binding sites to determine what kind of AST node to produce. Overloading of identifiers is not supported. The disambiguation state is mutable to ensure good performance.

Parsing Configuration Files

module Config_parser : sig ... end

Reader for configuration files for Beluga signatures.

Lexing

module Token : sig ... end

Token definition for lexing.

module Located_token : sig ... end

Lexer tokens annotated with locations.

module Lexer : sig ... end

Tokenization of UTF-8 text input.

Parsing

module type PARSER_STATE = Parser_combinator.PARSER_STATE
module Parser_combinator : sig ... end

General purpose parser combinator library.

module Common_parser : sig ... end

Common parser definitions for parsing Beluga signatures.

module Lf_parser : sig ... end
module Clf_parser : sig ... end
module Meta_parser : sig ... end
module Comp_parser : sig ... end
module Harpoon_parser : sig ... end
module Signature_parser : sig ... end
module type PARSING = sig ... end

Abstract definition of parsing for Beluga.

Functor creating a Beluga parser using the given implementation of state.

Disambiguation

module type DISAMBIGUATION_STATE = Disambiguation_state.DISAMBIGUATION_STATE
module Disambiguation_state : sig ... end

Definition and implementation of the state required to disambiguate the Beluga syntax from the parser AST to the external AST.

module Application_disambiguation : sig ... end

Disambiguation of expression applications with user-defined operators and juxtapositions.

module Lf_disambiguation : sig ... end
module Clf_disambiguation : sig ... end
module Meta_disambiguation : sig ... end
module Comp_disambiguation : sig ... end
module Harpoon_disambiguation : sig ... end
module Signature_disambiguation : sig ... end
module type DISAMBIGUATION = sig ... end

Abstract definition of disambiguation for Beluga.

Functor creating a Beluga disambiguator using the given implementation of state.

Constructors

Instances

module Parser : sig ... end