[mod] readme

Christian Fraß 2022-03-20 14:39:50 +01:00
parent a7d1f9cb64
commit 4023a252e1


@@ -1,18 +1,34 @@
## Description
-`type2` is a slow, but versatile parser for context free languages (type 2 in the Chomsky hierarchy).
+`type2` is a slow, but versatile reader for arbitrary context free languages (Chomsky hierarchy type 2).
-Being fed with a JSON file, specifying the language and a string via `stdin`, it decomposes the input, analyzes it and spits out its corresponding abstract syntax tree on `stdout`.
+Being fed with a JSON file, which specifies the language, and a string via `stdin`, it analyzes the input, decomposes it and writes its corresponding abstract syntax tree to `stdout`.
-A typical call looks like this: `type2 language.tp2.json < input.txt`
+A typical call looks like this: `cat input.txt | type2 language.tp2.json`
## Working principle
```mermaid
flowchart LR
input_string[input string]
terminal_symbol_chain[terminal symbol chain]
abstract_syntax_tree[abstract syntax tree]
output_string[output string]
input_string -- lexical analysis --> terminal_symbol_chain
terminal_symbol_chain -- parsing --> abstract_syntax_tree
abstract_syntax_tree -- JSON encoding --> output_string
```
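The three stages of the flowchart can be illustrated with a small sketch. This is not the implementation of `type2`: the toy grammar (`sum -> number | number plus sum`), the function names and the shape of the tree are assumptions made for illustration, and the hand-written recursive parser only stands in for whatever general context free parsing method `type2` actually uses.

```python
import json
import re

NUMBER = re.compile(r"[0-9]+")


def lexical_analysis(text):
    """Stage 1: turn the input string into a chain of terminal symbols."""
    chain = []
    position = 0
    while position < len(text):
        if text[position].isspace():
            position += 1  # whitespace is skipped in this toy language
            continue
        match = NUMBER.match(text, position)
        if match:
            chain.append({"id": "number", "value": int(match.group())})
            position = match.end()
        elif text[position] == "+":
            chain.append({"id": "plus", "value": None})
            position += 1
        else:
            raise ValueError(f"unexpected character at position {position}")
    return chain


def parsing(chain):
    """Stage 2: build a tree for the grammar sum -> number | number plus sum."""
    if not chain or chain[0]["id"] != "number":
        raise ValueError("expected a number")
    leaf = {"label": "number", "value": chain[0]["value"]}
    if len(chain) == 1:
        return leaf
    if chain[1]["id"] != "plus":
        raise ValueError("expected a plus sign")
    return {"label": "sum", "children": [leaf, parsing(chain[2:])]}


def json_encoding(tree):
    """Stage 3: encode the tree as the output string."""
    return json.dumps(tree)


def run(text):
    """Chain the three stages in the order shown in the flowchart."""
    return json_encoding(parsing(lexical_analysis(text)))
```

Calling `run("1 + 2")` passes the string through all three stages and returns the JSON text of a `sum` node with two `number` leaves.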
## Language specification
-A language is specified by a JSON file containing a value of the following type:
+A language is specified by a JSON file, which represents a value of the following type:
```
record<
version:string,
lexer_rules:list<
record<
type:string,
@@ -35,6 +51,11 @@ record<
>
```
where:
- `parser_start` is one of the parser rule `label` values.
- `lexer_rules.*.type` is one of `ignore`, `void`, `boolean`, `int`, `float`, `string` (see section _Lexer rule types_)
- `parser_rules.*.conclusion.*.type` is one of `terminal`, `variable` (see section _Conclusion element types_)
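The three constraints above can be checked mechanically before a language file is used. The following is a minimal sketch, not part of `type2`: the function name `check_language` is hypothetical, and it assumes that `label` and `conclusion` are direct fields of each parser rule, which the constraints suggest but the part of the type shown here does not confirm.

```python
LEXER_RULE_TYPES = {"ignore", "void", "boolean", "int", "float", "string"}
CONCLUSION_ELEMENT_TYPES = {"terminal", "variable"}


def check_language(language):
    """Return a list of problems found in an already decoded language value."""
    problems = []
    # Assumption: every parser rule carries its label as a direct field.
    labels = {rule["label"] for rule in language["parser_rules"]}
    if language["parser_start"] not in labels:
        problems.append("parser_start is not a parser rule label")
    for index, rule in enumerate(language["lexer_rules"]):
        if rule["type"] not in LEXER_RULE_TYPES:
            problems.append(f"lexer rule {index} has an unknown type")
    for rule in language["parser_rules"]:
        # Assumption: the conclusion is a list of records with a type field.
        for element in rule["conclusion"]:
            if element["type"] not in CONCLUSION_ELEMENT_TYPES:
                problems.append(
                    f"parser rule {rule['label']} has an unknown element type"
                )
    return problems
```

An empty result means that none of the three constraints is violated. Other properties of the file are not examined.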
The recommended extension for such files is `.tp2.json`.
@@ -43,18 +64,18 @@ The recommended extension for such files is `.tp2.json`.
| type | meaning | parameters |
|-- |-- |-- |
| `ignore` | disregard matching strings | `pattern` |
-| `void` | do not assign a value to matching strings | `pattern`,`id` |
-| `boolean` | interpret matching strings as boolean values | `pattern`,`id`,`value` |
-| `int` | interpret matching strings as integer number values | `pattern`,`id` |
-| `float` | interpret matching strings as floating point number values | `pattern`,`id` |
-| `string` | interpret matching strings as string values | `pattern`,`id` |
+| `void` | do not assign a value to matching strings | `pattern`, `id` |
+| `boolean` | interpret matching strings as boolean values | `pattern`, `id`, `value` |
+| `int` | interpret matching strings as integer number values | `pattern`, `id` |
+| `float` | interpret matching strings as floating point number values | `pattern`, `id` |
+| `string` | interpret matching strings as string values | `pattern`, `id` |
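How the lexer rule types might translate matched strings into terminal symbols can be sketched as follows. This is an illustration under several assumptions, not the actual behaviour of `type2`: `pattern` is treated as a regular expression, the parameters are read as direct fields of the rule, the first matching rule wins, and the helper names `apply_lexer_rule` and `tokenize` are hypothetical.

```python
import re


def apply_lexer_rule(rule, matched):
    """Return None for ignored text, otherwise a terminal symbol as a dict."""
    kind = rule["type"]
    if kind == "ignore":
        return None  # disregard the matching string
    if kind == "void":
        return {"id": rule["id"], "value": None}  # symbol without a value
    if kind == "boolean":
        return {"id": rule["id"], "value": rule["value"]}  # value is a parameter
    if kind == "int":
        return {"id": rule["id"], "value": int(matched)}
    if kind == "float":
        return {"id": rule["id"], "value": float(matched)}
    if kind == "string":
        return {"id": rule["id"], "value": matched}
    raise ValueError(f"unknown lexer rule type: {kind}")


def tokenize(rules, text):
    """Apply the rules in order at each position (assumed strategy)."""
    chain = []
    position = 0
    while position < len(text):
        for rule in rules:
            match = re.compile(rule["pattern"]).match(text, position)
            if match and match.end() > position:
                symbol = apply_lexer_rule(rule, match.group())
                if symbol is not None:
                    chain.append(symbol)
                position = match.end()
                break
        else:
            # No rule matched a non-empty string at this position.
            raise ValueError(f"no lexer rule matches at position {position}")
    return chain
```

With this strategy the order of the rules matters: a `float` rule has to come before an `int` rule, for instance, or the integer part of `1.5` is consumed first.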
### Conclusion element types
| type | meaning | parameters |
|-- |-- |-- |
-| `terminal` | | `id` |
-| `variable` | | `id` |
+| `terminal` | the part is a terminal symbol | `id` |
+| `variable` | the part is a syntactic variable | `id` |