# FleXML fast XML scanner framework
# Copyright (c) 1999 Kristoffer Rose. All rights reserved.
#
# Description: Notes for FleXML scanner generator
# Author: Kristoffer Rose
# Created: August 1999
# License: NTSys proprietary
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
# $Id: NOTES,v 1.3 2006/07/18 18:21:13 mquinson Exp $
GUIDELINES FOR WRITING A flex(1) SCANNER FOR SIMPLE XML FORMATS.
We know how to handle the following DTD element declarations.
1. "Leaf" elements with the declarations
<!ELEMENT tag EMPTY>
<!ELEMENT tag (#PCDATA)>
Start conditions:
%x AL_tag
Rules:
"<tag"{s}"/>" STag(tag), ETag(tag);
"<tag"{s}">" STag(tag), ENTER(PCDATA);
"<tag"{S} STag(tag), ENTER(AL_tag);
<AL_tag>"/>" LEAVE, ETag(tag);
<AL_tag>">" LEAVE, ENTER(PCDATA);
<PCDATA>"</tag"{s}">" ETag(tag), LEAVE;
Handlers:
STag_tag(void) {...}
ETag_tag(char* pcdata) {...}
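The rules above rely on ENTER and LEAVE to move between start conditions.
As a minimal sketch, assuming ENTER pushes the current start condition and
switches to a new one while LEAVE pops back (much as flex's yy_push_state()
and yy_pop_state() do under "%option stack"), the conditions visited for a
leaf element with attributes can be modelled in plain C. The enum, the stack
and the function names here are hypothetical and not part of FleXML:

```c
#include <assert.h>

/* Hypothetical stand-ins for the scanner's start conditions. */
enum cond { INITIAL, AL_tag, PCDATA };

#define MAX_DEPTH 32
static enum cond cond_stack[MAX_DEPTH];
static int depth = 0;
static enum cond current = INITIAL;

/* ENTER: remember the current condition, then switch to the new one. */
static void enter(enum cond next) {
    assert(depth < MAX_DEPTH);
    cond_stack[depth++] = current;
    current = next;
}

/* LEAVE: return to the condition that was active before the last ENTER. */
static void leave(void) {
    assert(depth > 0);
    current = cond_stack[--depth];
}

/* Walk the conditions visited while scanning <tag attr="v">text</tag>. */
static enum cond scan_leaf_with_attributes(void) {
    enter(AL_tag);   /* "<tag"{S}             : STag(tag), ENTER(AL_tag) */
    leave();         /* <AL_tag>">"           : LEAVE, ENTER(PCDATA)     */
    enter(PCDATA);
    leave();         /* <PCDATA>"</tag"{s}">" : ETag(tag), LEAVE         */
    return current;
}
```

Each ENTER is matched by exactly one LEAVE, so the scanner is back in the
condition it started from once the end tag has been seen.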
2. "Logical" elements with unordered element contents using the declaration
<!ELEMENT tag (tag1|tag2|...|tagn)*>
Start conditions:
%x AL_tag
%x IN_tag
Rules:
"<tag"{s}"/>" STag(tag), ETag(tag);
"<tag"{s}">" STag(tag), ENTER(IN_tag);
"<tag"{S} STag(tag), ENTER(AL_tag);
<AL_tag>"/>" LEAVE, ETag(tag);
<AL_tag>">" BEGIN(IN_tag);
<IN_tag>"</tag"{s}">" ETag(tag), LEAVE;
Handlers:
STag_tag(void) {...}
ETag_tag(char* pcdata) {...}
3. "Attribute" declarations of the form
<!ATTLIST tag attribute CDATA>
Rule:
<AL_tag>"attribute"{Eq}{Q} Attribute(tag,attribute);
Handler:
Attribute_tag_attribute(char* value) {...}
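A sketch of how the three handlers for one element might cooperate, assuming
(as the rules suggest) that STag_tag runs when "<tag" is recognized, that
Attribute_tag_attribute runs once per attribute after that, and that ETag_tag
runs when the element ends. The two buffers and the report format are
hypothetical and only serve to make the call order visible:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical storage; these names are not part of FleXML. */
static char attribute_value[64];
static char last_report[128];

/* Runs when the start tag is recognized, before any attribute rule fires,
   so it is a convenient place to reset per-element state. */
void STag_tag(void) {
    attribute_value[0] = '\0';
}

/* Runs for attribute="..." inside <tag ...>; keep a private copy of the
   value in case the scanner reuses its buffer (an assumption). */
void Attribute_tag_attribute(char* value) {
    assert(value != NULL);
    strncpy(attribute_value, value, sizeof attribute_value - 1);
    attribute_value[sizeof attribute_value - 1] = '\0';
}

/* Runs when the element ends; by now the attribute value is available. */
void ETag_tag(char* pcdata) {
    snprintf(last_report, sizeof last_report, "tag attribute=%s pcdata=%s",
             attribute_value, pcdata ? pcdata : "");
}
```

Because STag_tag is invoked before the attribute list is scanned, work that
depends on attribute values has to wait until ETag_tag in this arrangement.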
That's all, for the moment.
Note: the scanner can be made (more) validating by prefixing all the rules
for tag1...tagn with the <IN_tag> start condition and giving the
top-level rules the start condition INITIAL.
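The effect of restricting rules to a start condition can be sketched in C as
a lookup table: a start tag is accepted only if some rule for that element is
active in the current condition. This models the idea only; in the scanner
itself the restriction is written as a prefix on the flex rule, for example
<IN_tag>"<tag1"{s}">". The table and the function name are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the scanner's start conditions. */
enum cond { INITIAL, IN_tag };

/* One entry per start-tag rule: the condition it is restricted to and the
   element it matches. */
struct rule { enum cond active_in; const char *element; };

static const struct rule rules[] = {
    { INITIAL, "tag"  },   /* top-level element: outermost level only */
    { IN_tag,  "tag1" },   /* children: only inside <tag>...</tag>    */
    { IN_tag,  "tag2" },
};

/* Return 1 if a rule would match a start tag for `element` while the
   scanner is in condition `current`, 0 otherwise. */
static int start_tag_allowed(enum cond current, const char *element) {
    size_t i;
    assert(element != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (rules[i].active_in == current &&
            strcmp(rules[i].element, element) == 0)
            return 1;
    }
    return 0;
}
```

With this arrangement a <tag1> appearing outside <tag> matches no rule, which
is how the scanner comes to reject documents that violate the content model.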