2025-09-23 21:46:56 +02:00

type2

A slow but versatile reader for arbitrary context-free languages (Chomsky hierarchy type 2).

Given a JSON file that specifies the language, and an input string via stdin, it analyzes the input, decomposes it, and writes the corresponding abstract syntax tree to stdout.

A typical call looks like this: cat input.txt | type2 language.tp2.json

Working principle

flowchart LR
	input_string[input string]
	terminal_symbol_chain[terminal symbol chain]
	abstract_syntax_tree[abstract syntax tree]
	output_string[output string]
	
	input_string -- lexical analysis --> terminal_symbol_chain
	terminal_symbol_chain -- parsing --> abstract_syntax_tree
	abstract_syntax_tree -- JSON encoding --> output_string
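
The three stages of the flowchart can be illustrated with a toy sketch. This is not the implementation used by type2: the language (`number`, `plus`, `sum`), the naive backtracking parser, the regex-style patterns, the dropping of unnamed matches, and the shape of the emitted tree are all assumptions made for illustration.

```python
import json
import re

# Hypothetical toy language: numbers separated by "+", e.g. "1 + 2".
LEXER_RULES = [(r"[0-9]+", "number"), (r"\+", "plus"), (r"\s+", None)]
# premise -> list of (label, conclusion); a conclusion element is (type, id).
PARSER_RULES = {
    "sum": [
        ("sum_more", [("terminal", "number"), ("terminal", "plus"), ("variable", "sum")]),
        ("sum_last", [("terminal", "number")]),
    ],
}


def lex(text):
    """Stage 1 (lexical analysis): input string -> terminal symbol chain."""
    tokens, position = [], 0
    while position < len(text):
        for pattern, name in LEXER_RULES:
            match = re.compile(pattern).match(text, position)
            if match and match.end() > position:
                if name is not None:  # assumption: unnamed matches are dropped
                    tokens.append({"id": name, "value": match.group()})
                position = match.end()
                break
        else:
            raise ValueError(f"no lexer rule matches at offset {position}")
    return tokens


def parse(variable, tokens, start):
    """Stage 2 (parsing): yield (tree, next_index) for each way `variable` matches."""
    for label, conclusion in PARSER_RULES[variable]:
        partial = [([], start)]  # (children so far, index reached)
        for kind, ident in conclusion:
            extended = []
            for children, index in partial:
                if kind == "terminal":
                    if index < len(tokens) and tokens[index]["id"] == ident:
                        extended.append((children + [tokens[index]], index + 1))
                else:
                    for subtree, after in parse(ident, tokens, index):
                        extended.append((children + [subtree], after))
            partial = extended
        for children, index in partial:
            yield {"label": label, "children": children}, index


def read(text):
    """Run all three stages and return the tree as a JSON string."""
    tokens = lex(text)
    for tree, index in parse("sum", tokens, 0):
        if index == len(tokens):  # accept only parses that consume everything
            return json.dumps(tree)  # Stage 3 (JSON encoding)
    raise ValueError("input is not in the language")
```

Calling `read("1 + 2")` returns a JSON string whose root node carries the label `sum_more`; an input such as `"1 +"` raises `ValueError`. Trying every alternative, as this sketch does, handles only grammars without left recursion and can be slow.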

Language specification

A language is specified by a JSON file, which represents a value of the following type:

record<
	version:string,
	lexer_rules:list<
		record<
			pattern:string,
			name:(null|string),
			?pass:boolean
		>
	>,
	parser_rules:list<
		record<
			label:string,
			premise:string,
			conclusion:list<
				record<
					type:string,
					parameters:map<string,any>
				>
			>
		>
	>,
	parser_start:string
>

where:

  • version should be "2"
  • lexer_rules.*.pass controls whether the read sequence is incorporated or not
  • parser_rules.*.conclusion.*.type is one of terminal, variable (see section Conclusion element types)
  • parser_start is one of the parser rule label values

The recommended extension for such files is .tp2.json.
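
As a rough illustration of the type above, the following sketch builds a small specification and checks its shape. The rule names (`number`, `plus`, `pair`, `start`) are hypothetical, the pattern syntax is assumed to be regex-like (the README does not state it), and `check_spec` is a helper invented here, not part of type2. The `id` parameter follows the table in the section Conclusion element types.

```python
import json

# Hypothetical language: exactly two numbers joined by "+", e.g. "1+2".
spec = {
    "version": "2",
    "lexer_rules": [
        {"pattern": "[0-9]+", "name": "number", "pass": True},
        {"pattern": "\\+", "name": "plus"},
        {"pattern": "\\s+", "name": None},
    ],
    "parser_rules": [
        {
            "label": "start",
            "premise": "start",
            "conclusion": [
                {"type": "variable", "parameters": {"id": "pair"}},
            ],
        },
        {
            "label": "pair",
            "premise": "pair",
            "conclusion": [
                {"type": "terminal", "parameters": {"id": "number"}},
                {"type": "terminal", "parameters": {"id": "plus"}},
                {"type": "terminal", "parameters": {"id": "number"}},
            ],
        },
    ],
    "parser_start": "start",
}


def check_spec(spec):
    """Return a list of shape problems; an empty list means none were found."""
    problems = []
    if spec.get("version") != "2":
        problems.append('version should be "2"')
    for i, rule in enumerate(spec.get("lexer_rules", [])):
        if not isinstance(rule.get("pattern"), str):
            problems.append(f"lexer_rules[{i}].pattern must be a string")
        if "name" not in rule or not isinstance(rule["name"], (str, type(None))):
            problems.append(f"lexer_rules[{i}].name must be a string or null")
        if "pass" in rule and not isinstance(rule["pass"], bool):
            problems.append(f"lexer_rules[{i}].pass must be a boolean")
    labels = set()
    for i, rule in enumerate(spec.get("parser_rules", [])):
        for key in ("label", "premise"):
            if not isinstance(rule.get(key), str):
                problems.append(f"parser_rules[{i}].{key} must be a string")
        labels.add(rule.get("label"))
        for j, element in enumerate(rule.get("conclusion", [])):
            if element.get("type") not in ("terminal", "variable"):
                problems.append(f"parser_rules[{i}].conclusion[{j}].type is unknown")
            if not isinstance(element.get("parameters"), dict):
                problems.append(f"parser_rules[{i}].conclusion[{j}].parameters must be a map")
    if spec.get("parser_start") not in labels:
        problems.append("parser_start must be one of the parser rule labels")
    return problems


# Serialized form, as it could be stored in a file such as language.tp2.json.
spec_text = json.dumps(spec, indent="\t")
```

Here `check_spec(spec)` returns an empty list, while a copy with a different `version` or an unknown `parser_start` yields one message per violated constraint.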

Conclusion element types

| type | meaning | parameters |
| --- | --- | --- |
| terminal | the part is a terminal symbol | id |
| variable | the part is a syntactic variable | id |
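
To make the two element types concrete, the following sketch renders a parser rule in a BNF-like notation by dispatching on `type` and reading the `id` parameter. The function `render_rule`, the notation, and the example rule are hypothetical and only serve as illustration.

```python
def render_rule(rule):
    """Render one parser rule as '<premise> ::= ...' (illustrative notation)."""
    parts = []
    for element in rule["conclusion"]:
        ident = element["parameters"]["id"]
        if element["type"] == "terminal":
            parts.append(f'"{ident}"')  # terminal symbol, quoted
        elif element["type"] == "variable":
            parts.append(f"<{ident}>")  # syntactic variable, in angle brackets
        else:
            raise ValueError(f"unknown conclusion element type: {element['type']}")
    return f"<{rule['premise']}> ::= " + " ".join(parts)


# Hypothetical rule: a sum is a number, a plus sign, and another sum.
example_rule = {
    "label": "sum_more",
    "premise": "sum",
    "conclusion": [
        {"type": "terminal", "parameters": {"id": "number"}},
        {"type": "terminal", "parameters": {"id": "plus"}},
        {"type": "variable", "parameters": {"id": "sum"}},
    ],
}
```

For the rule above, `render_rule(example_rule)` returns `<sum> ::= "number" "plus" <sum>`.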