Redefining the XML language

Anne Brüggemann-Klein, Derick Wood

Version 0.5, October 7, 2000

About this document

This document redefines the production rules of the XML Recommendation 1.0 (http://www.w3.org/TR/1998/REC-xml-19980210/). We provide a listing of the Recommendation's original production rules, well-formedness constraints and validating constraints at http://www11.in.tum.de/~brueggem/XML/Recommendation/xmlRules.xml (XML format) and at http://www11.in.tum.de/~brueggem/XML/Recommendation/xmlRules.htm (HTML format).

The XML Recommendation writes grammar symbols with an initial capital letter if they are "defined by a regular expression", otherwise with an initial lower-case letter. We capitalize all grammar symbols and also all semantically meaningful subwords of grammar symbols; for example, document is renamed Document and Nmtoken is renamed NmToken. We achieve this by referring to grammar symbol XXX by an entity reference &XXX; that expands to <nt def="NT-XXX">XXX</nt>, so the capitalization is easily reversed by redefining these entities.

We demonstrate that an XML processor (also called an XML parser) can be built using well-proven compiler-construction techniques, in particular by dividing the task into lexical analysis (tokenization), syntax analysis, semantic analysis and code generation.

As a first step, we assign categories to the XML production rules. The XML grammar defines a language of strings over the alphabet of Unicode characters. We categorize XML's production rules so that their role with respect to lexical and syntactic analysis becomes apparent. We use the following categories:

Character level: Rules that operate at the character level define sets (or classes) of Unicode characters.

Token level: Rules that operate at the token level define regular languages over the alphabet of Unicode characters, which we call token classes. Token-level rules may refer to character-level rules.

Syntax level: Rules that operate at the syntax level define context-free languages over the alphabet of token classes.

The XML Recommendation uses three grammar symbols as start symbols, namely Document to define the class of well-formed or valid XML documents, ExtParsedEnt to define the class of well-formed external parsed entities (fragments of document instances) and ExtPE to define the class of well-formed external parameter entities (fragments of DTDs). The "orphaned" productions [6], [8], [33], [34], [35], [36], [37], [38] of the XML Recommendation cannot be reached from any of the three start symbols. The XML Recommendation uses these rules to impose additional constraints on XML documents; these constraints belong to the semantic-analysis phase of XML processing. We keep them out of the grammar, which deals with the syntax-analysis phase of XML processors.
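The reachability claim can be checked mechanically. The following is a minimal sketch, assuming a small hand-picked excerpt of the grammar: the dictionary RULES, the helper names reachable and orphans, and the choice of rules are illustrative assumptions, not part of the rule listing. Only productions [6] and [8] are included as orphan candidates, under their names from the Recommendation.

```python
from collections import deque

# Hypothetical excerpt: each symbol maps to the symbols that occur on its
# right-hand side. Most rules are omitted to keep the sketch short.
RULES = {
    "Document": ["Prolog", "Element", "Misc"],
    "Prolog": ["XMLDecl", "Misc", "DocTypeDecl"],
    "Element": ["EmptyElemTag", "STag", "Content", "ETag"],
    "Content": ["Element", "CharData", "Reference", "CDSect", "PI", "Comment"],
    "Misc": ["Comment", "PI", "S"],
    "ExtParsedEnt": ["TextDecl", "Content"],
    "ExtPE": ["TextDecl", "ExtSubsetDecl"],
    "Names": ["Name", "S"],        # production [6] of the Recommendation
    "Nmtokens": ["Nmtoken", "S"],  # production [8] of the Recommendation
}

START_SYMBOLS = ["Document", "ExtParsedEnt", "ExtPE"]


def reachable(starts, rules):
    """Return every symbol reachable from the start symbols (breadth-first)."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        symbol = queue.popleft()
        for used in rules.get(symbol, []):
            if used not in seen:
                seen.add(used)
                queue.append(used)
    return seen


def orphans(starts, rules):
    """Return the left-hand sides that no start symbol can reach."""
    live = reachable(starts, rules)
    return sorted(lhs for lhs in rules if lhs not in live)


print(orphans(START_SYMBOLS, RULES))
```

On this excerpt the search reports Names and Nmtokens as unreachable. Running the same search over the full rule set would be needed to confirm the complete list of orphaned productions.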

We rewrite some rules of the XML grammar so that character strings in syntax-level rules are turned into tokens. This leads to additional token-level rules.

The character-level rules

No XML No Rule
(1) [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
(2) [4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
(3) [13] PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
(4) [84] Letter ::= BaseChar | Ideographic
(5) [85] BaseChar ::= [#x0041-#x005A] | [#x0061-#x007A] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x00FF] | [#x0100-#x0131] | [#x0134-#x013E] | [#x0141-#x0148] | [#x014A-#x017E] | [#x0180-#x01C3] | [#x01CD-#x01F0] | [#x01F4-#x01F5] | [#x01FA-#x0217] | [#x0250-#x02A8] | [#x02BB-#x02C1] | #x0386 | [#x0388-#x038A] | #x038C | [#x038E-#x03A1] | [#x03A3-#x03CE] | [#x03D0-#x03D6] | #x03DA | #x03DC | #x03DE | #x03E0 | [#x03E2-#x03F3] | [#x0401-#x040C] | [#x040E-#x044F] | [#x0451-#x045C] | [#x045E-#x0481] | [#x0490-#x04C4] | [#x04C7-#x04C8] | [#x04CB-#x04CC] | [#x04D0-#x04EB] | [#x04EE-#x04F5] | [#x04F8-#x04F9] | [#x0531-#x0556] | #x0559 | [#x0561-#x0586] | [#x05D0-#x05EA] | [#x05F0-#x05F2] | [#x0621-#x063A] | [#x0641-#x064A] | [#x0671-#x06B7] | [#x06BA-#x06BE] | [#x06C0-#x06CE] | [#x06D0-#x06D3] | #x06D5 | [#x06E5-#x06E6] | [#x0905-#x0939] | #x093D | [#x0958-#x0961] | [#x0985-#x098C] | [#x098F-#x0990] | [#x0993-#x09A8] | [#x09AA-#x09B0] | #x09B2 | [#x09B6-#x09B9] | [#x09DC-#x09DD] | [#x09DF-#x09E1] | [#x09F0-#x09F1] | [#x0A05-#x0A0A] | [#x0A0F-#x0A10] | [#x0A13-#x0A28] | [#x0A2A-#x0A30] | [#x0A32-#x0A33] | [#x0A35-#x0A36] | [#x0A38-#x0A39] | [#x0A59-#x0A5C] | #x0A5E | [#x0A72-#x0A74] | [#x0A85-#x0A8B] | #x0A8D | [#x0A8F-#x0A91] | [#x0A93-#x0AA8] | [#x0AAA-#x0AB0] | [#x0AB2-#x0AB3] | [#x0AB5-#x0AB9] | #x0ABD | #x0AE0 | [#x0B05-#x0B0C] | [#x0B0F-#x0B10] | [#x0B13-#x0B28] | [#x0B2A-#x0B30] | [#x0B32-#x0B33] | [#x0B36-#x0B39] | #x0B3D | [#x0B5C-#x0B5D] | [#x0B5F-#x0B61] | [#x0B85-#x0B8A] | [#x0B8E-#x0B90] | [#x0B92-#x0B95] | [#x0B99-#x0B9A] | #x0B9C | [#x0B9E-#x0B9F] | [#x0BA3-#x0BA4] | [#x0BA8-#x0BAA] | [#x0BAE-#x0BB5] | [#x0BB7-#x0BB9] | [#x0C05-#x0C0C] | [#x0C0E-#x0C10] | [#x0C12-#x0C28] | [#x0C2A-#x0C33] | [#x0C35-#x0C39] | [#x0C60-#x0C61] | [#x0C85-#x0C8C] | [#x0C8E-#x0C90] | [#x0C92-#x0CA8] | [#x0CAA-#x0CB3] | [#x0CB5-#x0CB9] | #x0CDE | [#x0CE0-#x0CE1] | [#x0D05-#x0D0C] | [#x0D0E-#x0D10] | [#x0D12-#x0D28] | [#x0D2A-#x0D39] | [#x0D60-#x0D61] | [#x0E01-#x0E2E] | #x0E30 
| [#x0E32-#x0E33] | [#x0E40-#x0E45] | [#x0E81-#x0E82] | #x0E84 | [#x0E87-#x0E88] | #x0E8A | #x0E8D | [#x0E94-#x0E97] | [#x0E99-#x0E9F] | [#x0EA1-#x0EA3] | #x0EA5 | #x0EA7 | [#x0EAA-#x0EAB] | [#x0EAD-#x0EAE] | #x0EB0 | [#x0EB2-#x0EB3] | #x0EBD | [#x0EC0-#x0EC4] | [#x0F40-#x0F47] | [#x0F49-#x0F69] | [#x10A0-#x10C5] | [#x10D0-#x10F6] | #x1100 | [#x1102-#x1103] | [#x1105-#x1107] | #x1109 | [#x110B-#x110C] | [#x110E-#x1112] | #x113C | #x113E | #x1140 | #x114C | #x114E | #x1150 | [#x1154-#x1155] | #x1159 | [#x115F-#x1161] | #x1163 | #x1165 | #x1167 | #x1169 | [#x116D-#x116E] | [#x1172-#x1173] | #x1175 | #x119E | #x11A8 | #x11AB | [#x11AE-#x11AF] | [#x11B7-#x11B8] | #x11BA | [#x11BC-#x11C2] | #x11EB | #x11F0 | #x11F9 | [#x1E00-#x1E9B] | [#x1EA0-#x1EF9] | [#x1F00-#x1F15] | [#x1F18-#x1F1D] | [#x1F20-#x1F45] | [#x1F48-#x1F4D] | [#x1F50-#x1F57] | #x1F59 | #x1F5B | #x1F5D | [#x1F5F-#x1F7D] | [#x1F80-#x1FB4] | [#x1FB6-#x1FBC] | #x1FBE | [#x1FC2-#x1FC4] | [#x1FC6-#x1FCC] | [#x1FD0-#x1FD3] | [#x1FD6-#x1FDB] | [#x1FE0-#x1FEC] | [#x1FF2-#x1FF4] | [#x1FF6-#x1FFC] | #x2126 | [#x212A-#x212B] | #x212E | [#x2180-#x2182] | [#x3041-#x3094] | [#x30A1-#x30FA] | [#x3105-#x312C] | [#xAC00-#xD7A3]
(6) [86] Ideographic ::= [#x4E00-#x9FA5] | #x3007 | [#x3021-#x3029]
(7) [87] CombiningChar ::= [#x0300-#x0345] | [#x0360-#x0361] | [#x0483-#x0486] | [#x0591-#x05A1] | [#x05A3-#x05B9] | [#x05BB-#x05BD] | #x05BF | [#x05C1-#x05C2] | #x05C4 | [#x064B-#x0652] | #x0670 | [#x06D6-#x06DC] | [#x06DD-#x06DF] | [#x06E0-#x06E4] | [#x06E7-#x06E8] | [#x06EA-#x06ED] | [#x0901-#x0903] | #x093C | [#x093E-#x094C] | #x094D | [#x0951-#x0954] | [#x0962-#x0963] | [#x0981-#x0983] | #x09BC | #x09BE | #x09BF | [#x09C0-#x09C4] | [#x09C7-#x09C8] | [#x09CB-#x09CD] | #x09D7 | [#x09E2-#x09E3] | #x0A02 | #x0A3C | #x0A3E | #x0A3F | [#x0A40-#x0A42] | [#x0A47-#x0A48] | [#x0A4B-#x0A4D] | [#x0A70-#x0A71] | [#x0A81-#x0A83] | #x0ABC | [#x0ABE-#x0AC5] | [#x0AC7-#x0AC9] | [#x0ACB-#x0ACD] | [#x0B01-#x0B03] | #x0B3C | [#x0B3E-#x0B43] | [#x0B47-#x0B48] | [#x0B4B-#x0B4D] | [#x0B56-#x0B57] | [#x0B82-#x0B83] | [#x0BBE-#x0BC2] | [#x0BC6-#x0BC8] | [#x0BCA-#x0BCD] | #x0BD7 | [#x0C01-#x0C03] | [#x0C3E-#x0C44] | [#x0C46-#x0C48] | [#x0C4A-#x0C4D] | [#x0C55-#x0C56] | [#x0C82-#x0C83] | [#x0CBE-#x0CC4] | [#x0CC6-#x0CC8] | [#x0CCA-#x0CCD] | [#x0CD5-#x0CD6] | [#x0D02-#x0D03] | [#x0D3E-#x0D43] | [#x0D46-#x0D48] | [#x0D4A-#x0D4D] | #x0D57 | #x0E31 | [#x0E34-#x0E3A] | [#x0E47-#x0E4E] | #x0EB1 | [#x0EB4-#x0EB9] | [#x0EBB-#x0EBC] | [#x0EC8-#x0ECD] | [#x0F18-#x0F19] | #x0F35 | #x0F37 | #x0F39 | #x0F3E | #x0F3F | [#x0F71-#x0F84] | [#x0F86-#x0F8B] | [#x0F90-#x0F95] | #x0F97 | [#x0F99-#x0FAD] | [#x0FB1-#x0FB7] | #x0FB9 | [#x20D0-#x20DC] | #x20E1 | [#x302A-#x302F] | #x3099 | #x309A
(8) [88] Digit ::= [#x0030-#x0039] | [#x0660-#x0669] | [#x06F0-#x06F9] | [#x0966-#x096F] | [#x09E6-#x09EF] | [#x0A66-#x0A6F] | [#x0AE6-#x0AEF] | [#x0B66-#x0B6F] | [#x0BE7-#x0BEF] | [#x0C66-#x0C6F] | [#x0CE6-#x0CEF] | [#x0D66-#x0D6F] | [#x0E50-#x0E59] | [#x0ED0-#x0ED9] | [#x0F20-#x0F29]
(9) [89] Extender ::= #x00B7 | #x02D0 | #x02D1 | #x0387 | #x0640 | #x0E46 | #x0EC6 | #x3005 | [#x3031-#x3035] | [#x309D-#x309E] | [#x30FC-#x30FE]
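Since each character-level rule denotes a set of Unicode characters, it can be implemented as a membership test on code points. The following sketch transcribes rules (1) and (9) as lists of inclusive ranges; the names CHAR_RANGES, EXTENDER_RANGES, in_class, is_char and is_extender are assumptions made for illustration.

```python
# Rule (1) Char, as inclusive (low, high) code-point ranges.
CHAR_RANGES = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]

# Rule (9) Extender, transcribed the same way.
EXTENDER_RANGES = [
    (0x00B7, 0x00B7), (0x02D0, 0x02D0), (0x02D1, 0x02D1),
    (0x0387, 0x0387), (0x0640, 0x0640), (0x0E46, 0x0E46),
    (0x0EC6, 0x0EC6), (0x3005, 0x3005),
    (0x3031, 0x3035), (0x309D, 0x309E), (0x30FC, 0x30FE),
]


def in_class(ch, ranges):
    """Return True if the single character ch falls in one of the ranges."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in ranges)


def is_char(ch):
    return in_class(ch, CHAR_RANGES)


def is_extender(ch):
    return in_class(ch, EXTENDER_RANGES)
```

For example, is_char("\t") holds while is_char("\x00") does not, because #x0 is outside every range of rule (1). A linear scan is enough for a sketch; for the long BaseChar rule a sorted range table with binary search would probably be preferable.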

The token-level rules

No XML No Rule
(10) [3] S ::= (#x20 | #x9 | #xD | #xA)+
(11) [5] Name ::= (Letter | '_' | ':') (NameChar)*
(12) [7] NmToken ::= (NameChar)+
(13) [-] DQuote ::= '"'
(14) [-] SQuote ::= "'"
(15) [-] EntCharsNoDQuotes ::= ([^%&"])*
(16) [-] EntCharsNoSQuotes ::= ([^%&'])*
(17) [-] AttCharsNoDQuotes ::= ([^<&"])*
(18) [-] AttCharsNoSQuotes ::= ([^<&'])*
(19) [11] SystemLiteral ::= ('"' [^"]* '"') | ("'" [^']* "'")
(20) [12] PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
(21) [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
(22) [-] CommentOpen ::= '<!--'
(23) [-] CommentBody ::= ((Char - '-') | ('-' (Char - '-')))*
(24) [-] CommentClose ::= '-->'
(25) [-] PIOpen ::= '<?'
(26) [-] PIBody ::= Char* - (Char* '?>' Char*)
(27) [-] PIClose ::= '?>'
(28) [17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
(29) [19] CDStart ::= '<![CDATA['
(30) [20] CData ::= (Char* - (Char* ']]>' Char*))
(31) [21] CDEnd ::= ']]>'
(32) [-] XMLDeclOpen ::= '<?xml'
(33) [-] VersionKW ::= 'version'
(34) [25] Eq ::= '='
(35) [26] VersionNum ::= ([a-zA-Z0-9_.:] | '-')+
(36) [-] DocTypeOpen ::= '<!DOCTYPE'
(37) [-] BOpen ::= '['
(38) [-] BClose ::= ']'
(39) [-] DeclClose ::= '>'
(40) [-] StandAloneKW ::= 'standalone'
(41) [-] YesOrNo ::= ("'" ('yes' | 'no') "'") | ('"' ('yes' | 'no') '"')
(42) [-] STagOpen ::= '<'
(43) [-] TagClose ::= '>'
(44) [-] ETagOpen ::= '</'
(45) [-] EmptyTagClose ::= '/>'
(46) [-] ElementDeclOpen ::= '<!ELEMENT'
(47) [-] EmptyKW ::= 'EMPTY'
(48) [-] AnyKW ::= 'ANY'
(49) [-] QMark ::= '?'
(50) [-] Star ::= '*'
(51) [-] Plus ::= '+'
(52) [-] POpen ::= '('
(53) [-] Alt ::= '|'
(54) [-] PClose ::= ')'
(55) [-] Comma ::= ','
(56) [-] PCDataKW ::= '#PCDATA'
(57) [-] AttlistDeclOpen ::= '<!ATTLIST'
(58) [55] StringType ::= 'CDATA'
(59) [56] TokenizedType ::= 'ID' | 'IDREF' | 'IDREFS' | 'ENTITY' | 'ENTITIES' | 'NMTOKEN' | 'NMTOKENS'
(60) [-] NotationKW ::= 'NOTATION'
(61) [-] RequiredKW ::= '#REQUIRED'
(62) [-] ImpliedKW ::= '#IMPLIED'
(63) [-] FixedKW ::= '#FIXED'
(64) [-] SectOpen ::= '<!['
(65) [-] IncludeKW ::= 'INCLUDE'
(66) [-] SectClose ::= ']]>'
(67) [-] IgnoreKW ::= 'IGNORE'
(68) [65] Ignore ::= Char* - (Char* ('<![' | ']]>') Char*)
(69) [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
(70) [68] EntityRef ::= '&' Name ';'
(71) [69] PEReference ::= '%' Name ';'
(72) [-] EntityDeclOpen ::= '<!ENTITY'
(73) [-] Percent ::= '%'
(74) [-] SystemKW ::= 'SYSTEM'
(75) [-] PublicKW ::= 'PUBLIC'
(76) [-] NDataKW ::= 'NDATA'
(77) [-] EncodingKW ::= 'encoding'
(78) [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
(79) [-] NotationDeclOpen ::= '<!NOTATION'
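Because the token-level rules denote regular languages, each can be written as a regular expression. The sketch below does this for a handful of rules with Python's re module. The table TOKEN_PATTERNS and the function matches are assumptions made for illustration. The difference operator in rule (26) is expressed with a negative lookahead, which is one possible encoding rather than the only one.

```python
import re

# Rule (1) Char as a regular-expression character class.
CHAR = r"[\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"

# A few token-level rules, keyed by token class name.
TOKEN_PATTERNS = {
    "S": r"[\x20\x09\x0D\x0A]+",                  # rule (10)
    "Eq": r"=",                                   # rule (34)
    "VersionNum": r"[a-zA-Z0-9_.:-]+",            # rule (35)
    "CharRef": r"&#[0-9]+;|&#x[0-9a-fA-F]+;",     # rule (69)
    "EncName": r"[A-Za-z][A-Za-z0-9._-]*",        # rule (78)
    # Rule (26): Char* - (Char* '?>' Char*), i.e. no '?>' anywhere.
    "PIBody": r"(?:(?!\?>)" + CHAR + r")*",
}

COMPILED = {name: re.compile(pattern) for name, pattern in TOKEN_PATTERNS.items()}


def matches(token_class, text):
    """Return True if text as a whole belongs to the given token class."""
    return COMPILED[token_class].fullmatch(text) is not None
```

For instance, matches("EncName", "UTF-8") holds, whereas matches("EncName", "8859") does not, because rule (78) requires a leading letter.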

The syntax-level rules

No XML No Rule
(80) [1] Document ::= Prolog Element Misc*
(81) [9] EntityValue ::= DQuote (EntCharsNoDQuotes | PEReference | Reference)* DQuote | SQuote (EntCharsNoSQuotes | PEReference | Reference)* SQuote
(82) [10] AttValue ::= DQuote (AttCharsNoDQuotes | Reference)* DQuote | SQuote (AttCharsNoSQuotes | Reference)* SQuote
(83) [15] Comment ::= CommentOpen CommentBody CommentClose
(84) [16] PI ::= PIOpen PITarget (S PIBody)? PIClose
(85) [18] CDSect ::= CDStart CData CDEnd
(86) [22] Prolog ::= XMLDecl? Misc* (DocTypeDecl Misc*)?
(87) [23] XMLDecl ::= XMLDeclOpen VersionInfo EncodingDecl? SDDecl? S? PIClose
(88) [24] VersionInfo ::= S VersionKW S? Eq S? (SQuote VersionNum SQuote | DQuote VersionNum DQuote)
(89) [27] Misc ::= Comment | PI | S
(90) [28] DocTypeDecl ::= DocTypeOpen S Name (S ExternalID)? S? (BOpen (MarkupDecl | PEReference | S)* BClose S?)? DeclClose
(91) [29] MarkupDecl ::= ElementDecl | AttlistDecl | EntityDecl | NotationDecl | PI | Comment
(92) [30] ExtSubset ::= TextDecl? ExtSubsetDecl
(93) [31] ExtSubsetDecl ::= (MarkupDecl | ConditionalSect | PEReference | S)*
(94) [32] SDDecl ::= S StandAloneKW S? Eq S? YesOrNo
(95) [39] Element ::= EmptyElemTag | STag Content ETag
(96) [40] STag ::= STagOpen Name (S Attribute)* S? TagClose
(97) [41] Attribute ::= Name S? Eq S? AttValue
(98) [42] ETag ::= ETagOpen Name S? TagClose
(99) [43] Content ::= (Element | CharData | Reference | CDSect | PI | Comment)*
(100) [44] EmptyElemTag ::= STagOpen Name (S Attribute)* S? EmptyTagClose
(101) [45] ElementDecl ::= ElementDeclOpen S Name S ContentSpec S? DeclClose
(102) [46] ContentSpec ::= EmptyKW | AnyKW | Mixed | Children
(103) [47] Children ::= (Choice | Seq) (QMark | Star | Plus)?
(104) [48] Cp ::= (Name | Choice | Seq) (QMark | Star | Plus)?
(105) [49] Choice ::= POpen S? Cp ( S? Alt S? Cp )* S? PClose
(106) [50] Seq ::= POpen S? Cp ( S? Comma S? Cp )* S? PClose
(107) [51] Mixed ::= POpen S? PCDataKW (S? Alt S? Name)* S? PClose Star | POpen S? PCDataKW S? PClose
(108) [52] AttlistDecl ::= AttlistDeclOpen S Name AttDef* S? DeclClose
(109) [53] AttDef ::= S Name S AttType S DefaultDecl
(110) [54] AttType ::= StringType | TokenizedType | EnumeratedType
(111) [57] EnumeratedType ::= NotationType | Enumeration
(112) [58] NotationType ::= NotationKW S POpen S? Name (S? Alt S? Name)* S? PClose
(113) [59] Enumeration ::= POpen S? NmToken (S? Alt S? NmToken)* S? PClose
(114) [60] DefaultDecl ::= RequiredKW | ImpliedKW | ((FixedKW S)? AttValue)
(115) [61] ConditionalSect ::= IncludeSect | IgnoreSect
(116) [62] IncludeSect ::= SectOpen S? IncludeKW S? BOpen ExtSubsetDecl SectClose
(117) [63] IgnoreSect ::= SectOpen S? IgnoreKW S? BOpen IgnoreSectContents* SectClose
(118) [64] IgnoreSectContents ::= Ignore (SectOpen IgnoreSectContents SectClose Ignore)*
(119) [67] Reference ::= EntityRef | CharRef
(120) [70] EntityDecl ::= GEDecl | PEDecl
(121) [71] GEDecl ::= EntityDeclOpen S Name S EntityDef S? DeclClose
(122) [72] PEDecl ::= EntityDeclOpen S Percent S Name S PEDef S? DeclClose
(123) [73] EntityDef ::= EntityValue | (ExternalID NDataDecl?)
(124) [74] PEDef ::= EntityValue | ExternalID
(125) [75] ExternalID ::= SystemKW S SystemLiteral | PublicKW S PubidLiteral S SystemLiteral
(126) [76] NDataDecl ::= S NDataKW S Name
(127) [77] TextDecl ::= XMLDeclOpen VersionInfo? EncodingDecl S? PIClose
(128) [78] ExtParsedEnt ::= TextDecl? Content
(129) [79] ExtPE ::= TextDecl? ExtSubsetDecl
(130) [80] EncodingDecl ::= S EncodingKW S? Eq S? (DQuote EncName DQuote | SQuote EncName SQuote)
(131) [82] NotationDecl ::= NotationDeclOpen S Name S (ExternalID | PublicID) S? DeclClose
(132) [83] PublicID ::= PublicKW S PubidLiteral
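Syntax-level rules operate on token classes, so a syntax analyzer sees a sequence of token-class names rather than characters. The following recursive-descent recognizer is a sketch for rules (103) to (106) only. It assumes the lexer has already produced a list of token-class names; the class ContentModelParser, the exception ParseError and the function is_children are hypothetical names. A parenthesized group with a single Cp matches both Choice and Seq as written, and the sketch does not distinguish the two cases.

```python
class ParseError(Exception):
    pass


class ContentModelParser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def accept(self, kind):
        if self.peek() == kind:
            self.pos += 1
            return True
        return False

    def expect(self, kind):
        if not self.accept(kind):
            raise ParseError(f"expected {kind} at token {self.pos}")

    def occurrence(self):
        # (QMark | Star | Plus)?
        for kind in ("QMark", "Star", "Plus"):
            if self.accept(kind):
                return

    def cp(self):
        # Rule (104): Cp ::= (Name | Choice | Seq) (QMark | Star | Plus)?
        if not self.accept("Name"):
            self.group()
        self.occurrence()

    def group(self):
        # Rules (105) and (106): POpen S? Cp (S? sep S? Cp)* S? PClose,
        # where sep is Alt throughout (Choice) or Comma throughout (Seq).
        self.expect("POpen")
        self.accept("S")
        self.cp()
        separator = None
        while True:
            self.accept("S")
            if self.accept("PClose"):
                return
            if separator is None:
                if self.peek() not in ("Alt", "Comma"):
                    raise ParseError(f"expected separator at token {self.pos}")
                separator = self.peek()
            self.expect(separator)
            self.accept("S")
            self.cp()

    def children(self):
        # Rule (103): Children ::= (Choice | Seq) (QMark | Star | Plus)?
        self.group()
        self.occurrence()
        if self.peek() is not None:
            raise ParseError(f"unexpected token at {self.pos}")


def is_children(tokens):
    """Return True if the token-class sequence is derivable from Children."""
    try:
        ContentModelParser(tokens).children()
        return True
    except ParseError:
        return False
```

The token sequence for a content model such as (a, b)* would be POpen Name Comma S Name PClose Star, which the recognizer accepts. A group that mixes Alt and Comma at the same nesting level is rejected.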

Discussion

It is now easy (though tedious) to verify that the character-level rules denote character sets, that the token-level rules denote regular languages over the alphabet of Unicode characters and that the syntax-level rules denote context-free languages over the alphabet of token classes. It is equally easy (though tedious) to verify that we have applied the following transformations to the original XML productions:

  1. In syntax-level rules, we have replaced some regular subparts of the right-hand sides with token classes. These token classes have token-level rules without an XML number, indicated by [-] in the second column of the rule definitions.

  2. We have rewritten the token-level rule for Eq as Eq ::= '='. The original XML production is Eq ::= S? '=' S?. Consequently, we have replaced each Eq token in a right-hand side of a production with S? Eq S?. We made this change so that white space that is represented by grammar symbol S can be used as a token separator in the lexical-analysis phase and be removed from the syntax-level rules.

These transformations preserve the languages that the rules define, since each one only names a subexpression or moves it from one rule into another.
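The Eq transformation can be illustrated on VersionInfo. The sketch below writes the production once with the original Eq ::= S? '=' S? and once with Eq ::= '=' and the white space moved into the enclosing rule, both as regular expressions. The variable names are assumptions made for illustration.

```python
import re

S = r"[\x20\x09\x0D\x0A]+"
VERSION_NUM = r"[a-zA-Z0-9_.:-]+"
QUOTED_NUM = rf"""(?:'{VERSION_NUM}'|"{VERSION_NUM}")"""

# Original Recommendation: Eq ::= S? '=' S?
# VersionInfo ::= S 'version' Eq (quoted VersionNum)
OLD_EQ = rf"(?:{S})?=(?:{S})?"
OLD_VERSION_INFO = rf"{S}version{OLD_EQ}{QUOTED_NUM}"

# Rewritten rules: Eq ::= '='
# VersionInfo ::= S VersionKW S? Eq S? (quoted VersionNum)
NEW_EQ = r"="
NEW_VERSION_INFO = rf"{S}version(?:{S})?{NEW_EQ}(?:{S})?{QUOTED_NUM}"

# Moving S? out of Eq and into the enclosing rule yields the same
# expression text, so both versions accept exactly the same strings.
assert OLD_VERSION_INFO == NEW_VERSION_INFO


def is_version_info(text):
    """Return True if text matches the rewritten VersionInfo rule."""
    return re.fullmatch(NEW_VERSION_INFO, text) is not None
```

Here is_version_info(' version = "1.0"') holds, and is_version_info('version="1.0"') does not, because both forms of the rule require the leading S.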

Analyzing the productions

No Token type Production used in
(10) S (84 PI) (87 XMLDecl) (88 VersionInfo) (89 Misc) (90 DocTypeDecl) (93 ExtSubsetDecl) (94 SDDecl) (96 STag) (97 Attribute) (98 ETag) (100 EmptyElemTag) (101 ElementDecl) (105 Choice) (106 Seq) (107 Mixed) (108 AttlistDecl) (109 AttDef) (112 NotationType) (113 Enumeration) (114 DefaultDecl) (116 IncludeSect) (117 IgnoreSect) (121 GEDecl) (122 PEDecl) (125 ExternalID) (126 NDataDecl) (127 TextDecl) (130 EncodingDecl) (131 NotationDecl) (132 PublicID)
(11) Name (28 PITarget) (70 EntityRef) (71 PEReference) (90 DocTypeDecl) (96 STag) (97 Attribute) (98 ETag) (100 EmptyElemTag) (101 ElementDecl) (104 Cp) (107 Mixed) (108 AttlistDecl) (109 AttDef) (112 NotationType) (121 GEDecl) (122 PEDecl) (126 NDataDecl) (131 NotationDecl)
(12) NmToken (113 Enumeration)
(13) DQuote (81 EntityValue) (82 AttValue) (88 VersionInfo) (130 EncodingDecl)
(14) SQuote (81 EntityValue) (82 AttValue) (88 VersionInfo) (130 EncodingDecl)
(15) EntCharsNoDQuotes (81 EntityValue)
(16) EntCharsNoSQuotes (81 EntityValue)
(17) AttCharsNoDQuotes (82 AttValue)
(18) AttCharsNoSQuotes (82 AttValue)
(19) SystemLiteral (125 ExternalID)
(20) PubidLiteral (125 ExternalID) (132 PublicID)
(21) CharData (99 Content)
(22) CommentOpen (83 Comment)
(23) CommentBody (83 Comment)
(24) CommentClose (83 Comment)
(25) PIOpen (84 PI)
(26) PIBody (84 PI)
(27) PIClose (84 PI) (87 XMLDecl) (127 TextDecl)
(28) PITarget (84 PI)
(29) CDStart (85 CDSect)
(30) CData (85 CDSect)
(31) CDEnd (85 CDSect)
(32) XMLDeclOpen (87 XMLDecl) (127 TextDecl)
(33) VersionKW (88 VersionInfo)
(34) Eq (88 VersionInfo) (94 SDDecl) (97 Attribute) (130 EncodingDecl)
(35) VersionNum (88 VersionInfo)
(36) DocTypeOpen (90 DocTypeDecl)
(37) BOpen (90 DocTypeDecl) (116 IncludeSect) (117 IgnoreSect)
(38) BClose (90 DocTypeDecl)
(39) DeclClose (90 DocTypeDecl) (101 ElementDecl) (108 AttlistDecl) (121 GEDecl) (122 PEDecl) (131 NotationDecl)
(40) StandAloneKW (94 SDDecl)
(41) YesOrNo (94 SDDecl)
(42) STagOpen (96 STag) (100 EmptyElemTag)
(43) TagClose (96 STag) (98 ETag)
(44) ETagOpen (98 ETag)
(45) EmptyTagClose (100 EmptyElemTag)
(46) ElementDeclOpen (101 ElementDecl)
(47) EmptyKW (102 ContentSpec)
(48) AnyKW (102 ContentSpec)
(49) QMark (103 Children) (104 Cp)
(50) Star (103 Children) (104 Cp) (107 Mixed)
(51) Plus (103 Children) (104 Cp)
(52) POpen (105 Choice) (106 Seq) (107 Mixed) (112 NotationType) (113 Enumeration)
(53) Alt (105 Choice) (107 Mixed) (112 NotationType) (113 Enumeration)
(54) PClose (105 Choice) (106 Seq) (107 Mixed) (112 NotationType) (113 Enumeration)
(55) Comma (106 Seq)
(56) PCDataKW (107 Mixed)
(57) AttlistDeclOpen (108 AttlistDecl)
(58) StringType (110 AttType)
(59) TokenizedType (110 AttType)
(60) NotationKW (112 NotationType)
(61) RequiredKW (114 DefaultDecl)
(62) ImpliedKW (114 DefaultDecl)
(63) FixedKW (114 DefaultDecl)
(64) SectOpen (116 IncludeSect) (117 IgnoreSect) (118 IgnoreSectContents)
(65) IncludeKW (116 IncludeSect)
(66) SectClose (116 IncludeSect) (117 IgnoreSect) (118 IgnoreSectContents)
(67) IgnoreKW (117 IgnoreSect)
(68) Ignore (118 IgnoreSectContents)
(69) CharRef (119 Reference)
(70) EntityRef (119 Reference)
(71) PEReference (81 EntityValue) (90 DocTypeDecl) (93 ExtSubsetDecl)
(72) EntityDeclOpen (121 GEDecl) (122 PEDecl)
(73) Percent (122 PEDecl)
(74) SystemKW (125 ExternalID)
(75) PublicKW (125 ExternalID) (132 PublicID)
(76) NDataKW (126 NDataDecl)
(77) EncodingKW (130 EncodingDecl)
(78) EncName (130 EncodingDecl)
(79) NotationDeclOpen (131 NotationDecl)

No Structure symbol Production used in
(80) Document
(81) EntityValue (123 EntityDef) (124 PEDef)
(82) AttValue (97 Attribute) (114 DefaultDecl)
(83) Comment (89 Misc) (91 MarkupDecl) (99 Content)
(84) PI (89 Misc) (91 MarkupDecl) (99 Content)
(85) CDSect (99 Content)
(86) Prolog (80 Document)
(87) XMLDecl (86 Prolog)
(88) VersionInfo (87 XMLDecl) (127 TextDecl)
(89) Misc (80 Document) (86 Prolog)
(90) DocTypeDecl (86 Prolog)
(91) MarkupDecl (90 DocTypeDecl) (93 ExtSubsetDecl)
(92) ExtSubset
(93) ExtSubsetDecl (92 ExtSubset) (116 IncludeSect) (129 ExtPE)
(94) SDDecl (87 XMLDecl)
(95) Element (80 Document) (99 Content)
(96) STag (95 Element)
(97) Attribute (96 STag) (100 EmptyElemTag)
(98) ETag (95 Element)
(99) Content (95 Element) (128 ExtParsedEnt)
(100) EmptyElemTag (95 Element)
(101) ElementDecl (91 MarkupDecl)
(102) ContentSpec (101 ElementDecl)
(103) Children (102 ContentSpec)
(104) Cp (105 Choice) (106 Seq)
(105) Choice (103 Children) (104 Cp)
(106) Seq (103 Children) (104 Cp)
(107) Mixed (102 ContentSpec)
(108) AttlistDecl (91 MarkupDecl)
(109) AttDef (108 AttlistDecl)
(110) AttType (109 AttDef)
(111) EnumeratedType (110 AttType)
(112) NotationType (111 EnumeratedType)
(113) Enumeration (111 EnumeratedType)
(114) DefaultDecl (109 AttDef)
(115) ConditionalSect (93 ExtSubsetDecl)
(116) IncludeSect (115 ConditionalSect)
(117) IgnoreSect (115 ConditionalSect)
(118) IgnoreSectContents (117 IgnoreSect) (118 IgnoreSectContents)
(119) Reference (81 EntityValue) (82 AttValue) (99 Content)
(120) EntityDecl (91 MarkupDecl)
(121) GEDecl (120 EntityDecl)
(122) PEDecl (120 EntityDecl)
(123) EntityDef (121 GEDecl)
(124) PEDef (122 PEDecl)
(125) ExternalID (90 DocTypeDecl) (123 EntityDef) (124 PEDef) (131 NotationDecl)
(126) NDataDecl (123 EntityDef)
(127) TextDecl (92 ExtSubset) (128 ExtParsedEnt) (129 ExtPE)
(128) ExtParsedEnt
(129) ExtPE
(130) EncodingDecl (87 XMLDecl) (127 TextDecl)
(131) NotationDecl (91 MarkupDecl)
(132) PublicID (131 NotationDecl)
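A "used in" table of this kind can be derived from the rules themselves. The following sketch does so for a small excerpt, assuming the rules are held as a dictionary from rule number to a pair of left-hand side and right-hand side text; RULES and used_in are hypothetical names. The symbol extraction relies on the excerpt containing no quoted literals.

```python
import re
from collections import defaultdict

# Excerpt of the syntax-level rules: number -> (left-hand side, right-hand side).
RULES = {
    95: ("Element", "EmptyElemTag | STag Content ETag"),
    96: ("STag", "STagOpen Name (S Attribute)* S? TagClose"),
    97: ("Attribute", "Name S? Eq S? AttValue"),
    98: ("ETag", "ETagOpen Name S? TagClose"),
    100: ("EmptyElemTag", "STagOpen Name (S Attribute)* S? EmptyTagClose"),
}


def used_in(rules):
    """Map each symbol to the (number, name) pairs of the rules that use it."""
    index = defaultdict(list)
    for number in sorted(rules):
        lhs, rhs = rules[number]
        # Each symbol is listed once per rule, even if it occurs repeatedly.
        for symbol in sorted(set(re.findall(r"[A-Za-z]+", rhs))):
            index[symbol].append((number, lhs))
    return dict(index)


for symbol, uses in sorted(used_in(RULES).items()):
    print(symbol, " ".join(f"({number} {name})" for number, name in uses))
```

On this excerpt the entry for Attribute lists rules 96 and 100, and the entry for TagClose lists rules 96 and 98, which agrees with the corresponding rows of the tables above.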