From: Devorah
Subject: Implementation requirements
Date: 13 Shevat 5785
I propose three parsing levels. Implementations could declare which level they provide.
Level 1: Tokenization only. Input is block text. Output is a list of token strings. No decomposition, no validation beyond delimiter recognition.
Level 2: Structural parsing. Tokens are decomposed into category code, modifiers, subspecifiers, transitions, contextual variants. Unknown codes are preserved with a marker. Malformed syntax is flagged.
Level 3: Semantic annotation. Known codes are labeled with their categories from the specification. Denomination codes are identified. Practice codes are distinguished from identity codes.
Level 1 is minimal. Level 3 is maximal. An application could request the level it needs.
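To make the declare/request idea concrete, here is a rough Python sketch. It is not a proposed API. The names ParseLevel, LevelNotSupported and Level1Parser are hypothetical, and two things are assumed that the proposal does not state: that the levels are cumulative (a Level 3 implementation also provides Levels 1 and 2), and that tokens are separated by whitespace, which stands in for the real delimiter rules.

```python
from enum import IntEnum


class ParseLevel(IntEnum):
    """The three proposed levels, ordered so they can be compared."""
    TOKENIZE = 1    # Level 1: tokenization only
    STRUCTURAL = 2  # Level 2: structural parsing
    SEMANTIC = 3    # Level 3: semantic annotation


class LevelNotSupported(Exception):
    """Raised when an application requests more than is provided."""


class Level1Parser:
    """Hypothetical implementation that declares Level 1 only."""

    # The level this implementation declares it provides.
    provided_level = ParseLevel.TOKENIZE

    def parse(self, block, level=ParseLevel.TOKENIZE):
        # Assumption: levels are cumulative, so any request at or
        # below the declared level can be served.
        if level > self.provided_level:
            raise LevelNotSupported(
                f"requested level {int(level)}, "
                f"provided level {int(self.provided_level)}"
            )
        # Assumption: whitespace is the token delimiter. The real
        # delimiter rules come from the specification.
        return block.split()
```

An application that needs only token strings would call Level1Parser().parse(text). One that needs Level 2 or Level 3 output would pass the level explicitly and get LevelNotSupported from this implementation, so it can fall back or pick a different parser.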