Description
Pliant is a table driven parser that implements the Earley algorithm. Two optimizations are added to handle issues with the original Earley implementation:
1) Optimization from Joop Leo to efficiently handle right recursions.
2) Bug fix from Aycock and Horspool to handle nullable predictions
Pliant produces all possible parses as a packed syntax forest which can be enumerated to produce individual syntax trees.
Pliant alternatives and similar packages
Based on the "Text Processing" category.
Alternatively, view Pliant alternatives based on common mentions on social networks and blogs.
-
Quickenshtein
Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support
Do you think we are missing an alternative of Pliant or a related project?
README
Pliant
Implementation of a modified Earley parser in C# inspired by the Marpa Parser project.
Build Status
Gitter Chat
Description
Pliant is a table driven parser that implements the Earley algorithm. Two optimizations are added to handle issues with the original Earley implementation:
- Optimization from Joop Leo to efficiently handle right recursions.
- Bug fix from Aycock and Horspool to handle nullable predictions
Using the Code
Getting Started
Add a reference to the Pliant libarary using NuGet
PM> Install-Package Pliant
Creating Grammars
Using the Grammar Expression classes
using Pliant.Builders;
using Pliant.Builders.Expressions;
using Pliant.Grammars;
using Pliant.Automata;
public static int Main(string[] args)
{
var digits = CreateDigitLexerRule();
var whitespace = CreateWhitespaceLexerRule();
ProductionExpression
Calculator = "Calculator",
Factor = "Factor",
Term = "Term",
Expression = "Expression",
Number = "Number";
Calculator.Rule
= Expression ;
Expression.Rule
= Expression + '+' + Term
| Term ;
Term.Rule
= Term + '*' + Factor
| Factor ;
Factor.Rule
= Number ;
Number.Rule
= digits;
var grammar = new GrammarExpression(
Calculator,
new []{ Calculator, Factor, Term, Expression, Number },
new []{ new LexerRuleModel(whitespace) })
.ToGrammar();
// TODO: Use the grammar in a parse.
}
private static BaseLexerRule CreateDigitLexerRule()
{
var startState = new DfaState();
var finalState = new DfaState(true);
var digitTransition = new DfaTransition(new DigitTerminal(), finalState);
startState.AddTransition(digitTransition);
finalState.AddTransition(digitTransition);
return new DfaLexerRule(startState, "[\\d]+");
}
private static ILexerRule CreateWhitespaceLexerRule()
{
var whitespaceTerminal = new WhitespaceTerminal();
var startState = new DfaState();
var finalState = new DfaState(true);
var whitespaceTransition = new DfaTransition(whitespaceTerminal, finalState);
startState.AddTransition(whitespaceTransition);
finalState.AddTransition(whitespaceTransition);
return new DfaLexerRule(startState, new TokenType("[\\s]+"));
}
Using the Ebnf Text Interface ( work in progress )
public static int Main (string[] args)
{
var grammarText = @"
Calculator
= Expression;
Expression
= Expression '+' Term
| Term;
Term
= Term '*' Factor
| Factor;
Factor
= Number ;
Number
= Digits;
Digits ~ /[0-9]+/ ;
Whitespace ~ /[\\s]+/ ;
:start = Calculator;
:ignore = Whitespace;";
var definition = new EbnfParser().Parse(grammarText);
var grammar = new EbnfGrammarGenerator().Generate(definition);
// TODO: use the grammar in a parse.
}
Recognizing and Parse Trees
Using ParseEngine, ParseRunner and Grammars to Recognize Input
Using the calculator grammar from above, we can parse input by constructing a parse engine and parse runner instance.
var input = "1 + 1 * 3 + 2";
// use the calculator grammar from above
var parseEngine = new ParseEngine(grammar);
// use the parse runner to query the parse engine for state
// and use that state to select lexer rules.
var parseRunner = new ParseRunner(parseEngine, input);
// when a parse is recognized, the parse engine is allowed to move
// forward and continues to accept symbols.
var recognized = false;
var errorPosition = 0;
while(!parseRunner.EndOfStream())
{
recognized = parseRunner.Read();
if(!recognized)
{
errorPosition = parseRunner.Position;
break;
}
}
// For a parse to be accepted, all parse rules are completed and a trace
// has been made back to the parse root.
// A parse must be recognized in order for acceptance to have meaning.
var accepted = false;
if(recognized)
{
accepted = parseRunner.ParseEngine.IsAccepted();
if(!accepted)
errorPosition = parseRunner.Position;
}
Console.WriteLine($"Recognized: {recognized}, Accepted: {accepted}");
if(!recognized || !accepted)
Console.Error.WriteLine($"Error at position {errorPosition}");
Using ParseEngine, ParseRunner, Forest API and Grammars to build a parse tree.
The process for creating a parse tree is the same as recognizing input. In fact, when running the ParseEngine, a Sparsley Packed Parse Forest (SPPF) is created in the background. The parse forest is presented in a specialized format to promote density and allow for computational complexity similar to that of running the recognizer alone.
The easiest way to use the parse forest is use a internal node tree visitor on the parse forest root with a SelectFirstChildDisambiguationAlgorithm instance controling disambiguation of thet forest.
If the parse is ambiguous, you may want to supply a custom IForestDisambiguationAlgorithm.
// get the parse forest root from the parse engine
var parseForestRoot = parseEngine.GetParseForestRootNode();
// create a internal tree node and supply the disambiguation algorithm for tree traversal.
var parseTree = new InternalTreeNode(
parseForestRoot,
new SelectFirstChildDisambiguationAlgorithm());
References
- berkeley earley cs164
- washington edu ling571
- unt cse earley
- wikipedia
- optimizing right recursion
- marpa parser
- joop leo - optimizing right recursions
- parsing techniques - a practical guide
- practical earley parsing
- detailed description of leo optimizations and internals of marpa
- theory of Marpa Algorithm
- parse tree forest creation
- cs theory stackexchange, leo optimization parse tree creation
- insights on lexer creation
- incremental reparsing
- An extension of Earley's Algorithm for extended grammars
- Finding nullable productions in a grammar
- Directly-Executable Earley Parsing (Aycock and Horspool, 2001)
- Context Senitive Earley Parsing