## Description

`type2` is a slow but versatile reader for arbitrary context-free languages (type 2 in the Chomsky hierarchy).

Given a JSON file that specifies the language and an input string on `stdin`, it analyzes and decomposes the input and writes the corresponding abstract syntax tree to `stdout`.

A typical call looks like this: `cat input.txt | type2 language.tp2.json`

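The same call can be scripted. The following is a minimal Python sketch, assuming a `type2` executable is available on `PATH`, that `stdout` holds a single JSON document, and that failures are signalled by a non-zero exit status (the exit-status behaviour is an assumption, not something stated here). `build_command` and `run_type2` are hypothetical helper names.

```python
import json
import subprocess


def build_command(spec_path: str) -> list[str]:
    """Return the argument vector for `type2 <spec_path>`."""
    return ["type2", spec_path]


def run_type2(spec_path: str, text: str):
    """Feed `text` to type2 via stdin and decode the tree written to stdout.

    Assumption: a non-zero exit status means the input could not be read,
    in which case subprocess raises CalledProcessError.
    """
    result = subprocess.run(
        build_command(spec_path),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

With a language file at hand, `run_type2("language.tp2.json", open("input.txt").read())` would be the equivalent of the shell pipeline.
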
## Working principle

```mermaid
flowchart LR
    input_string[input string]
    terminal_symbol_chain[terminal symbol chain]
    abstract_syntax_tree[abstract syntax tree]
    output_string[output string]

    input_string -- lexical analysis --> terminal_symbol_chain
    terminal_symbol_chain -- parsing --> abstract_syntax_tree
    abstract_syntax_tree -- JSON encoding --> output_string
```

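To make the three stages concrete, here is a toy Python sketch of the same pipeline for a hypothetical language of integer sums such as `1 + 22`. It is not `type2`'s actual algorithm: the naive backtracking parser below would loop forever on left-recursive rules, and the shape of the resulting tree is invented for illustration. Only the field names (`type`, `id`, `label`, `premise`, `conclusion`) are borrowed from the language specification.

```python
import json
import re

# (type, pattern, id) triples; the regex pattern syntax is an assumption.
LEXER_RULES = [
    ("ignore", r"\s+", None),
    ("int", r"[0-9]+", "number"),
    ("void", r"\+", "plus"),
]

# (label, premise, conclusion); conclusion parts are (type, id) pairs.
PARSER_RULES = [
    ("sum_more", "sum", [("terminal", "number"), ("terminal", "plus"), ("variable", "sum")]),
    ("sum_last", "sum", [("terminal", "number")]),
]


def lex(text):
    """Lexical analysis: input string -> terminal symbol chain."""
    tokens, pos = [], 0
    while pos < len(text):
        for kind, pattern, token_id in LEXER_RULES:
            match = re.compile(pattern).match(text, pos)
            if match:
                if kind == "int":
                    tokens.append({"id": token_id, "value": int(match.group())})
                elif kind == "void":
                    tokens.append({"id": token_id, "value": None})
                # "ignore" matches are skipped without emitting a token
                pos = match.end()
                break
        else:
            raise ValueError(f"no lexer rule matches at position {pos}")
    return tokens


def parse(tokens, premise, pos=0):
    """Parsing: yield (tree, next_pos) for every way `premise` matches at `pos`."""
    for label, rule_premise, conclusion in PARSER_RULES:
        if rule_premise != premise:
            continue
        # Each partial match is (children so far, position reached).
        partial = [([], pos)]
        for part_type, part_id in conclusion:
            extended = []
            for children, at in partial:
                if part_type == "terminal":
                    if at < len(tokens) and tokens[at]["id"] == part_id:
                        extended.append((children + [tokens[at]], at + 1))
                else:
                    for subtree, after in parse(tokens, part_id, at):
                        extended.append((children + [subtree], after))
            partial = extended
        for children, end in partial:
            yield {"label": label, "children": children}, end


def read(text):
    """Run the whole pipeline and return the JSON-encoded tree."""
    tokens = lex(text)
    for tree, end in parse(tokens, "sum"):
        if end == len(tokens):  # accept only parses that consume every token
            return json.dumps(tree)
    raise ValueError("input is not in the language")
```

Calling `read("1 + 22")` returns a JSON string whose root carries the label `sum_more`; an incomplete input such as `"1 +"` raises `ValueError`.
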
## Language specification

A language is specified by a JSON file, which represents a value of the following type:

```
record<
	version:string,
	lexer_rules:list<
		record<
			type:string,
			parameters:map<string,any>
		>
	>,
	parser_rules:list<
		record<
			label:string,
			premise:string,
			conclusion:list<
				record<
					type:string,
					parameters:map<string,any>
				>
			>
		>
	>,
	parser_start:string
>
```

where:
- `parser_start` is one of the parser rule `label` values
- `lexer_rules.*.type` is one of `ignore`, `void`, `boolean`, `int`, `float`, `string` (see section _Lexer rule types_)
- `parser_rules.*.conclusion.*.type` is one of `terminal`, `variable` (see section _Conclusion element types_)

The recommended extension for such files is `.tp2.json`.

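As an illustration, the following Python sketch builds a small specification of that shape and checks the three constraints listed above. The example language (a non-empty sequence of integers), the `version` value `"1"`, the regex-like `pattern` syntax, and the helper name `check_spec` are all assumptions made for this sketch.

```python
import json

# Hypothetical example language; "version" and the pattern syntax are placeholders.
SPEC = {
    "version": "1",
    "lexer_rules": [
        {"type": "ignore", "parameters": {"pattern": "\\s+"}},
        {"type": "int", "parameters": {"pattern": "[0-9]+", "id": "number"}},
    ],
    "parser_rules": [
        {
            "label": "start",
            "premise": "start",
            "conclusion": [{"type": "variable", "parameters": {"id": "list"}}],
        },
        {
            "label": "list_more",
            "premise": "list",
            "conclusion": [
                {"type": "terminal", "parameters": {"id": "number"}},
                {"type": "variable", "parameters": {"id": "list"}},
            ],
        },
        {
            "label": "list_last",
            "premise": "list",
            "conclusion": [{"type": "terminal", "parameters": {"id": "number"}}],
        },
    ],
    "parser_start": "start",
}

LEXER_TYPES = {"ignore", "void", "boolean", "int", "float", "string"}
CONCLUSION_TYPES = {"terminal", "variable"}


def check_spec(spec):
    """Return a list of constraint violations (empty if there are none)."""
    problems = []
    labels = {rule["label"] for rule in spec["parser_rules"]}
    if spec["parser_start"] not in labels:
        problems.append("parser_start is not a parser rule label")
    for index, rule in enumerate(spec["lexer_rules"]):
        if rule["type"] not in LEXER_TYPES:
            problems.append(f"lexer_rules[{index}]: unknown type {rule['type']!r}")
    for rule in spec["parser_rules"]:
        for index, part in enumerate(rule["conclusion"]):
            if part["type"] not in CONCLUSION_TYPES:
                problems.append(
                    f"{rule['label']}.conclusion[{index}]: unknown type {part['type']!r}"
                )
    return problems


if __name__ == "__main__":
    assert check_spec(SPEC) == []
    print(json.dumps(SPEC, indent=4))  # content for a file such as example.tp2.json
```
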
### Lexer rule types

| type | meaning | parameters |
| --- | --- | --- |
| `ignore` | disregard matching strings | `pattern` |
| `void` | do not assign a value to matching strings | `pattern`, `id` |
| `boolean` | interpret matching strings as boolean values | `pattern`, `id`, `value` |
| `int` | interpret matching strings as integer number values | `pattern`, `id` |
| `float` | interpret matching strings as floating point number values | `pattern`, `id` |
| `string` | interpret matching strings as string values | `pattern`, `id` |

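The table can be turned into a simple check. The Python sketch below transcribes the parameter column and reports which parameters a lexer rule lacks. It assumes that every listed parameter is required, which the table does not state explicitly; `missing_parameters` and the sample rules are hypothetical.

```python
# Parameters per lexer rule type, transcribed from the table above.
LEXER_PARAMETERS = {
    "ignore": {"pattern"},
    "void": {"pattern", "id"},
    "boolean": {"pattern", "id", "value"},
    "int": {"pattern", "id"},
    "float": {"pattern", "id"},
    "string": {"pattern", "id"},
}


def missing_parameters(rule):
    """Return the sorted names of parameters the rule's type lists but the rule lacks."""
    expected = LEXER_PARAMETERS[rule["type"]]
    return sorted(expected - set(rule["parameters"]))


# Hypothetical rules for the literals `true` and `false`;
# the pattern syntax is assumed to be regex-like.
RULES = [
    {"type": "boolean", "parameters": {"pattern": "true", "id": "bool", "value": True}},
    {"type": "boolean", "parameters": {"pattern": "false", "id": "bool"}},  # lacks "value"
]

if __name__ == "__main__":
    for rule in RULES:
        print(rule["parameters"]["pattern"], missing_parameters(rule))
```
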
### Conclusion element types

| type | meaning | parameters |
| --- | --- | --- |
| `terminal` | the part is a terminal symbol | `id` |
| `variable` | the part is a syntactic variable | `id` |
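A plausible reading, though it is only an assumption here, is that a `terminal` element's `id` names the `id` of a lexer rule and a `variable` element's `id` names the `premise` of some parser rule. Under that assumption, dangling references can be detected with a short Python sketch; `unresolved_ids` and the sample specification are hypothetical.

```python
def unresolved_ids(spec):
    """Return (label, type, id) triples whose id has no matching definition.

    Assumption: terminals resolve against lexer rule ids,
    variables resolve against parser rule premises.
    """
    terminals = {
        rule["parameters"]["id"]
        for rule in spec["lexer_rules"]
        if "id" in rule["parameters"]
    }
    variables = {rule["premise"] for rule in spec["parser_rules"]}
    known = {"terminal": terminals, "variable": variables}
    return [
        (rule["label"], part["type"], part["parameters"]["id"])
        for rule in spec["parser_rules"]
        for part in rule["conclusion"]
        if part["parameters"]["id"] not in known[part["type"]]
    ]


# Hypothetical fragment: the rule refers to a variable "tail" that is never defined.
SAMPLE = {
    "lexer_rules": [{"type": "string", "parameters": {"pattern": "[a-z]+", "id": "word"}}],
    "parser_rules": [
        {
            "label": "phrase",
            "premise": "phrase",
            "conclusion": [
                {"type": "terminal", "parameters": {"id": "word"}},
                {"type": "variable", "parameters": {"id": "tail"}},
            ],
        }
    ],
}
```

For `SAMPLE`, the function reports the single dangling reference `("phrase", "variable", "tail")`.
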