String[Name]["whatever"] → CharacterSetName
The rules for Reckoning a String as an Expression are set by...
String[Name["charSetNam"]][RuleList]
...where...
RuleList = {Noops, Digits, Letters, ParaPuncs, Operators}.
This is called the CharacterSetAssignment.
Noops = {EscapeSequence, StartComment, EndComment,
UnrecognizedCharacter, EmptySpace, ...}
ParaPunc = {StartString, Separator, EndString, FunctionName}
ParaPuncs = {{"[", ",", "]"}, {"{", ",", "}", List}, {���, �, �, ���, PatternFormSequence}, ParaPunc...}
Operators = {ContextMark, StartGroup, EndGroup, copula...}
The Grok32` String is designed to accommodate CharacterSets of arbitrary complexity. The object is to create a programming abstraction capable of modeling (the ling) any language's CharacterSets without prejudice. In any case, the similarities among CharacterSets are sufficient that a standard RuleList can be constructed for a CharacterSet.
It is believed that the String embodies fundamental aspects of
All linguistic communication, written or spoken, is sent and received as a temporal sequence of information "chunks".
Furthermore, it is believed that all linguistic Strings employ similar mechanisms to parse and elicit meaning. For example, the String could probably be adapted to help parse phonetic linguistics or any character-expression sequence, human or otherwise. Frequently, communication is multi-channel, and a correct "parsing" requires parallel String processing of completely different kinds of "character strings". A "character string" could be a sequence of human gestures. What complex of CharacterString channels is sufficient to characterize a whale species' language?
(1) String[Name]["whatever"]
...returns the CharacterSetName(s) of the String, "whatever".
If "whatever" is a concatenation of several different Strings, then (1) will return a Sequence of CharacterSetNames. (See String Implementation Notes.)
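The behavior of (1) on a concatenated String can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the membership test (code points below 128 count as "ASCII", everything else as "Unknown") and the function names are hypothetical, and the actual mechanism is the one described in the String Implementation Notes.

```python
from itertools import groupby


def char_set_name(ch: str) -> str:
    """Toy membership test (an assumption): only ASCII is recognized by name."""
    return "ASCII" if ord(ch) < 128 else "Unknown"


def char_set_names(text: str) -> list:
    """Return one CharacterSetName per maximal run of same-set characters."""
    return [name for name, _run in groupby(text, key=char_set_name)]
```

A String drawn from a single CharacterSet yields one name, while a concatenation of Strings from different CharacterSets yields one name per run, in order.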
CharacterSet definitions are stored in Contexts bearing the CharacterSetName. These Contexts are subContexts of Construct`String`Name`. The CharacterSetName is identical to the subContext name without the ContextMarks.
For example, the Character groupings (see below) for the ASCII CharacterSet are kept in Construct`String`Name`ASCII`. If "whatever" in (1) is an ASCII String, the name returned by (1) will be "ASCII".
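The relationship between a CharacterSetName and its Context can be sketched in Python as a pair of string conversions. The helper names `context_for` and `name_from_context` are hypothetical and exist only for this illustration; the parent Context and the backquote ContextMark are taken from the text above.

```python
ROOT_CONTEXT = "Construct`String`Name`"  # parent Context named in the text


def context_for(char_set_name: str) -> str:
    """Build the subContext that would hold a CharacterSet's definitions."""
    return f"{ROOT_CONTEXT}{char_set_name}`"


def name_from_context(context: str) -> str:
    """Recover the CharacterSetName by dropping the parent and the ContextMarks."""
    if not context.startswith(ROOT_CONTEXT):
        raise ValueError(f"not a CharacterSet Context: {context!r}")
    return context[len(ROOT_CONTEXT):].strip("`")
```

Under these assumptions, `context_for("ASCII")` gives the Context shown in the example and `name_from_context` inverts it.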
Grok32` does not contain any Character glyph definitions or rendering software. The host machine's text rendering software is better suited to this task. The String object presumes that the CharacterSet and CharacterCode are the only relevant facts about a Character.
If "charSetNam" is the string-name given to a CharacterSet,
...then the character semantics whereby a String is interpreted as an Expression can be assigned with the following declaration:
(2) String[Name["charSetNam"]][RuleList]
where
RuleList = {Noops, Digits, Letters, ParaPuncs, Operators}
Each CharacterSet has its own RuleList. The five lists in RuleList assign meanings to individual Characters and thereby specify how to parse a String as an Expression. The five sublists reflect a categorical subdivision of the Characters. See RuleList.
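As a rough picture of how five sublists might partition a CharacterSet, the following Python sketch models a RuleList as a plain record and classifies single Characters. The class, its `classify` method, and every Character placed in `ascii_rules` are assumptions made for illustration; the Standard ASCII RuleList itself is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class RuleList:
    """Hypothetical stand-in for {Noops, Digits, Letters, ParaPuncs, Operators}."""
    noops: set = field(default_factory=set)
    digits: str = ""
    letters: str = ""
    para_puncs: set = field(default_factory=set)
    operators: set = field(default_factory=set)

    def classify(self, ch: str) -> str:
        """Return the name of the sublist that claims the single character ch."""
        if ch in self.digits:
            return "Digits"
        if ch in self.letters:
            return "Letters"
        if ch in self.para_puncs:
            return "ParaPuncs"
        if ch in self.operators:
            return "Operators"
        # Listed Noops and anything unclaimed both end up here.
        return "Noops"


# Assumed contents, for illustration only.
ascii_rules = RuleList(
    noops={" ", "\t", "\n"},
    digits="0123456789",
    letters="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
    para_puncs={"[", ",", "]", "{", "}"},
    operators={"`", "(", ")", "="},
)
```

Treating every unclaimed Character as a Noop is a design choice of this sketch, loosely mirroring the UnrecognizedCharacter entry in Noops.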
By combining a customized CharacterSet with appropriately designed Named Functions and procedures, it is possible to mimic (model) most languages with Grok32`.
In ASCII and other CharacterSets, the equal sign is a token for Name[_, _].
This operator is assigned in the Standard ASCII RuleList.
Thus, "a = 3" is interpreted as Name[a, 3].
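The rewriting of the equal sign can be mimicked with a toy reader in Python. This is a minimal sketch, not the Grok32` parser: the function name `read_assignment` is hypothetical, and it simply splits at the first equal sign instead of consulting a RuleList.

```python
def read_assignment(text: str) -> str:
    """Rewrite 'lhs = rhs' as the Expression string 'Name[lhs, rhs]'."""
    lhs, sep, rhs = text.partition("=")
    if not sep:
        # No equal sign: nothing to rewrite in this toy reader.
        return text.strip()
    return f"Name[{lhs.strip()}, {rhs.strip()}]"
```

With this sketch, `read_assignment("a = 3")` produces the string `Name[a, 3]`, matching the interpretation given above.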
If �els... are all Strings, then...
Sequence[{�els...}]
...sorts the elements in {�els...} using the letter and digit order established by the Digits and Letters lists.
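Sorting by a CharacterSet's own order, as opposed to raw CharacterCode order, can be sketched in Python as follows. The particular order used here (digits first, then each lowercase letter followed by its uppercase form) is an assumption chosen to differ visibly from plain ASCII order, and the names `DIGITS`, `LETTERS`, `sort_key`, and `sort_strings` are hypothetical.

```python
DIGITS = "0123456789"
# Assumed letter order: a, A, b, B, ... (not the ASCII code order).
LETTERS = "".join(c + c.upper() for c in "abcdefghijklmnopqrstuvwxyz")

# Position of each Character in the combined order; digits rank before letters.
RANK = {ch: i for i, ch in enumerate(DIGITS + LETTERS)}


def sort_key(s: str) -> list:
    """Translate a string into ranks; unlisted Characters sort last."""
    return [RANK.get(ch, len(RANK)) for ch in s]


def sort_strings(els: list) -> list:
    """Sort strings by the order that DIGITS and LETTERS establish."""
    return sorted(els, key=sort_key)
```

Under this assumed order, `sort_strings(["b2", "B1", "a10", "A"])` returns `["a10", "A", "b2", "B1"]`, whereas a plain code-order sort would place the uppercase entries first.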
ParaPuncs are sometimes called List punctuators and are three Characters named {StartString, Separator, EndString}.
CharacterCode (often abbreviated to "CC") is an integer that identifies a specific Character for Expression parsing purposes. Grok32` conforms to the Character-Glyph Model, which separates a Character's glyph (display value) from its semantic value.
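The separation of a Character from its glyph can be illustrated with a small Python record that carries only a CharacterSet name and a CharacterCode. The `Character` class and the `characters` helper are hypothetical, and using Python's `ord` as the CC is an assumption that happens to agree with ASCII codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    """A Character identified only by its CharacterSet and CharacterCode (CC)."""
    char_set: str
    cc: int


def characters(text: str, char_set: str = "ASCII") -> list:
    """Map each code point of text to a Character; no glyph data is kept."""
    return [Character(char_set, ord(ch)) for ch in text]
```

Nothing in the record describes how the Character is drawn, which is the point of leaving rendering to the host machine.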
© 2004, 2005 by John Van Wie Bergamini. All rights reserved.