Description
type2 is a slow but versatile reader for arbitrary context-free languages (type 2 in the Chomsky hierarchy).
Given a JSON file that specifies the language and an input string on stdin, it analyzes the input, decomposes it, and writes the corresponding abstract syntax tree to stdout.
A typical call looks like this: `cat input.txt | type2 language.tp2.json`
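The same call can be made from a script by piping the text into the process and decoding what comes back. The following is a minimal sketch, assuming that `type2` is available on the `PATH` and that stdout contains a single JSON document; the helper name `read_tree` is hypothetical and not part of type2.

```python
import json
import subprocess


def read_tree(command, input_text):
    """Pipe input_text to command on stdin and decode its stdout as JSON."""
    completed = subprocess.run(
        command, input=input_text, capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


# Hypothetical usage, equivalent to the shell call above:
# with open("input.txt") as handle:
#     tree = read_tree(["type2", "language.tp2.json"], handle.read())
```

With `check=True`, a non-zero exit status raises `subprocess.CalledProcessError` instead of passing an empty or partial output to the JSON decoder.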
Working principle
flowchart LR
input_string[input string]
terminal_symbol_chain[terminal symbol chain]
abstract_syntax_tree[abstract syntax tree]
output_string[output string]
input_string -- lexical analysis --> terminal_symbol_chain
terminal_symbol_chain -- parsing --> abstract_syntax_tree
abstract_syntax_tree -- JSON encoding --> output_string
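The first arrow of the flowchart, lexical analysis, can be illustrated with a few lines of Python. This is only a sketch of the general technique and not the implementation used by type2: the function name `tokenize`, the `(pattern, name)` rule format and the "first matching rule wins" policy are all assumptions made for the example.

```python
import re


def tokenize(rules, text):
    """Turn text into a list of (name, matched_text) pairs.

    rules is a list of (pattern, name) pairs. A name of None means the
    match is skipped. At each position the first matching rule wins.
    """
    compiled = [(re.compile(pattern), name) for pattern, name in rules]
    tokens = []
    position = 0
    while position < len(text):
        for regex, name in compiled:
            match = regex.match(text, position)
            # Require a non-empty match so the loop always makes progress.
            if match and match.end() > position:
                if name is not None:
                    tokens.append((name, match.group(0)))
                position = match.end()
                break
        else:
            raise ValueError(f"no lexer rule matches at position {position}")
    return tokens


rules = [(r"\s+", None), (r"[0-9]+", "number"), (r"\+", "plus")]
print(tokenize(rules, "12 + 30"))
# prints [('number', '12'), ('plus', '+'), ('number', '30')]
```

The resulting list corresponds to the "terminal symbol chain" in the flowchart; the parsing step then only needs to look at the names.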
Language specification
A language is specified by a JSON file, which represents a value of the following type:
record<
  version:string,
  lexer_rules:list<
    record<
      pattern:string,
      name:(null|string),
      ?pass:boolean
    >
  >,
  parser_rules:list<
    record<
      label:string,
      premise:string,
      conclusion:list<
        record<
          type:string,
          parameters:map<string,any>
        >
      >
    >
  >,
  parser_start:string
>
where:
- `version` should be `"2"`
- `parser_start` is one of the parser rule `label` values
- `lexer_rules.*.type` is one of `ignore`, `void`, `boolean`, `int`, `float`, `string` (see section Lexer rule types)
- `parser_rules.*.conclusion.*.type` is one of `terminal`, `variable` (see section Conclusion element types)
The recommended extension for such files is .tp2.json.
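The constraints listed above can be checked before a file is handed to the reader. The sketch below is a hypothetical helper, not part of type2. It assumes the specification has already been loaded into a Python dict (for example with `json.load`), and it checks a lexer rule's `type` only when that field is present, because the record type above does not list it.

```python
LEXER_TYPES = {"ignore", "void", "boolean", "int", "float", "string"}
CONCLUSION_TYPES = {"terminal", "variable"}


def check_language(spec):
    """Return a list of problems found in a language specification dict."""
    problems = []
    if spec.get("version") != "2":
        problems.append('version should be "2"')
    labels = {rule.get("label") for rule in spec.get("parser_rules", [])}
    if spec.get("parser_start") not in labels:
        problems.append("parser_start is not a parser rule label")
    for rule in spec.get("lexer_rules", []):
        # Assumption: only validated when the field is present.
        if "type" in rule and rule["type"] not in LEXER_TYPES:
            problems.append(f"unknown lexer rule type: {rule['type']}")
    for rule in spec.get("parser_rules", []):
        for element in rule.get("conclusion", []):
            if element.get("type") not in CONCLUSION_TYPES:
                problems.append(
                    f"unknown conclusion element type: {element.get('type')}"
                )
    return problems


# A made-up specification fragment used only to exercise the checks.
example = {
    "version": "2",
    "lexer_rules": [],
    "parser_rules": [
        {
            "label": "start",
            "premise": "start",
            "conclusion": [{"type": "terminal", "parameters": {"id": "number"}}],
        }
    ],
    "parser_start": "start",
}
print(check_language(example))  # prints []
```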
Lexer rule types
| type | meaning | parameters |
|---|---|---|
| ignore | disregard matching strings | pattern |
| void | do not assign a value to matching strings | pattern, id |
| boolean | interpret matching strings as boolean values | pattern, id, value |
| int | interpret matching strings as integer number values | pattern, id |
| float | interpret matching strings as floating point number values | pattern, id |
| string | interpret matching strings as string values | pattern, id |
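One way to read the table is as a mapping from a rule type and a matched string to an optional value. The following sketch illustrates that reading; the function name `token_value` is hypothetical, and the assumption that a `boolean` rule takes its result from the `value` parameter (rather than from the matched text) is a guess based on the parameter list.

```python
def token_value(rule_type, parameters, matched):
    """Return (keep, value) for a string matched by a lexer rule.

    keep is False when the match is dropped entirely. value is None when
    the resulting terminal symbol carries no value.
    """
    if rule_type == "ignore":
        return (False, None)
    if rule_type == "void":
        return (True, None)
    if rule_type == "boolean":
        # Assumption: the rule's "value" parameter holds the boolean.
        return (True, bool(parameters["value"]))
    if rule_type == "int":
        return (True, int(matched))
    if rule_type == "float":
        return (True, float(matched))
    if rule_type == "string":
        return (True, matched)
    raise ValueError(f"unknown lexer rule type: {rule_type}")


print(token_value("int", {"id": "number"}, "42"))  # prints (True, 42)
print(token_value("ignore", {}, "   "))            # prints (False, None)
```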
Conclusion element types
| type | meaning | parameters |
|---|---|---|
| terminal | the part is a terminal symbol | id |
| variable | the part is a syntactic variable | id |
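To show how `terminal` and `variable` elements work together, here is a small backtracking parser over rules shaped like `parser_rules`. It is a sketch under several assumptions and not the algorithm used by type2: the input is a plain list of terminal ids, parsing begins with the rule whose label equals the start label, the tree layout (`label`, `children`) is made up, and left-recursive rules would make it recurse forever.

```python
def parse(rules, start_label, tokens):
    """Return a tree covering all tokens, or None if there is no parse."""
    by_premise = {}
    for rule in rules:
        by_premise.setdefault(rule["premise"], []).append(rule)

    def match(rule, index, position, children):
        """Yield (tree, next_position) for each way to finish rule from index."""
        if index == len(rule["conclusion"]):
            yield ({"label": rule["label"], "children": children}, position)
            return
        element = rule["conclusion"][index]
        identifier = element["parameters"]["id"]
        if element["type"] == "terminal":
            # A terminal must equal the next token.
            if position < len(tokens) and tokens[position] == identifier:
                yield from match(rule, index + 1, position + 1, children + [identifier])
        else:
            # A variable is expanded by every rule that has it as premise.
            for candidate in by_premise.get(identifier, []):
                for subtree, after in match(candidate, 0, position, []):
                    yield from match(rule, index + 1, after, children + [subtree])

    start_rules = [rule for rule in rules if rule["label"] == start_label]
    for rule in start_rules:
        for tree, end in match(rule, 0, 0, []):
            if end == len(tokens):
                return tree
    return None


def terminal(identifier):
    return {"type": "terminal", "parameters": {"id": identifier}}


def variable(identifier):
    return {"type": "variable", "parameters": {"id": identifier}}


# Made-up grammar: start -> sum; sum -> number plus sum | number
grammar = [
    {"label": "start", "premise": "start", "conclusion": [variable("sum")]},
    {"label": "sum_more", "premise": "sum",
     "conclusion": [terminal("number"), terminal("plus"), variable("sum")]},
    {"label": "sum_single", "premise": "sum", "conclusion": [terminal("number")]},
]
tree = parse(grammar, "start", ["number", "plus", "number"])
```

Each node of the returned tree records the label of the rule that produced it, so the alternatives `sum_more` and `sum_single` remain distinguishable even though they share the premise `sum`.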