de.rw7.token
Class HtmlTokenizer
java.lang.Object
|
+--de.rw7.token.HtmlTokenizer
- public class HtmlTokenizer
- extends java.lang.Object
Parses an HTML stream (given as FastInput) into tokens. These tokens may be:
- Opening tags
- Closing tags
- Attributes of an opening tag
- Text between the tags, converted to strings
The parser tries to accept any dirty HTML. It does not throw exceptions for invalid HTML.
- tag ::= < [/] name param* >
- name ::= (c)*
- param ::= n* key [ n* = n* value ]
- key ::= (c)*
- value ::= (c)* | " (c-")* " | ' (c-')* '
- n ::= <a whitespace character>
- c ::= <a non-whitespace character>
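The grammar above can be read as a small hand-written scanner. The following is a minimal sketch of that reading, not the HtmlTokenizer implementation: the class TagGrammarSketch, its Tag type, the NO_VALUE placeholder, and the choice to end a key at '=' (so that key=value works without spaces) are all assumptions made for illustration. It works on the text between '<' and '>' and never throws on malformed input.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the tag grammar; not part of de.rw7.token. */
class TagGrammarSketch {

    /** Result of parsing one tag. */
    static final class Tag {
        final boolean closing;
        final String name;
        final Map<String, String> params = new LinkedHashMap<>();

        Tag(boolean closing, String name) {
            this.closing = closing;
            this.name = name;
        }
    }

    /** Value stored for a key without an "= value" part (an assumption). */
    static final String NO_VALUE = "";

    /** Parses the text between '<' and '>' (both excluded). */
    static Tag parse(String s) {
        int n = s.length();
        int i = 0;

        // tag ::= < [/] name param* >
        boolean closing = i < n && s.charAt(i) == '/';
        if (closing) i++;

        // name ::= (c)*
        int start = i;
        while (i < n && !Character.isWhitespace(s.charAt(i))) i++;
        Tag tag = new Tag(closing, s.substring(start, i));

        // param ::= n* key [ n* = n* value ]
        while (true) {
            i = skipWhitespace(s, i);
            if (i >= n) break;

            // key ::= (c)*, ended early at '=' (assumption, see lead-in)
            start = i;
            while (i < n && !Character.isWhitespace(s.charAt(i)) && s.charAt(i) != '=') i++;
            String key = s.substring(start, i);

            String value = NO_VALUE;
            int j = skipWhitespace(s, i);
            if (j < n && s.charAt(j) == '=') {
                i = skipWhitespace(s, j + 1);
                if (i < n && (s.charAt(i) == '"' || s.charAt(i) == '\'')) {
                    // value ::= " (c-")* " | ' (c-')* '
                    char quote = s.charAt(i++);
                    start = i;
                    while (i < n && s.charAt(i) != quote) i++;
                    value = s.substring(start, i);
                    if (i < n) i++; // skip the closing quote if present
                } else {
                    // value ::= (c)*
                    start = i;
                    while (i < n && !Character.isWhitespace(s.charAt(i))) i++;
                    value = s.substring(start, i);
                }
            }
            tag.params.put(key, value);
        }
        return tag;
    }

    private static int skipWhitespace(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }
}
```

For example, TagGrammarSketch.parse("a href=\"x y\" target = _blank nowrap") yields an opening tag named a with the parameters href, target and nowrap, where nowrap maps to NO_VALUE. An unterminated quote simply runs to the end of the input, in keeping with the tolerant behavior described above.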
The parser is specialized for extracting only a few interesting tags from a large amount of HTML.
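One way to picture this specialization is a registry that maps the few interesting tag names to consumer-chosen codes, so that every other tag can be skipped after a single lookup. The sketch below is an assumption-laden illustration: TagRegistrySketch, lookup, NOT_REGISTERED, and the case-insensitive matching are hypothetical, it uses a HashMap where the class itself uses TagNode, it leaves out the attr parameter, and it assumes that addTag registers a name as both an opening and a closing tag.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry sketch; the real class stores its registry in TagNode. */
class TagRegistrySketch {

    /** Returned by lookup for tags nobody registered (an assumption). */
    static final int NOT_REGISTERED = -1;

    private final Map<String, Integer> open = new HashMap<>();
    private final Map<String, Integer> clos = new HashMap<>();

    void addOpeningTag(String name, int code) {
        open.put(normalize(name), code);
    }

    void addClosingTag(String name, int code) {
        clos.put(normalize(name), code);
    }

    /** Assumed to register the name for both opening and closing tags. */
    void addTag(String name, int code) {
        addOpeningTag(name, code);
        addClosingTag(name, code);
    }

    void removeTag(String name) {
        open.remove(normalize(name));
        clos.remove(normalize(name));
    }

    /** Returns the registered code, or NOT_REGISTERED if the tag can be skipped. */
    int lookup(String name, boolean closing) {
        Integer code = (closing ? clos : open).get(normalize(name));
        return code == null ? NOT_REGISTERED : code;
    }

    // Case-insensitive matching is an assumption of this sketch.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }
}
```

With such a registry, a tokenizer only needs to build tokens for tags whose lookup result differs from NOT_REGISTERED; everything else can be scanned past without allocating anything.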
Methods inherited from class java.lang.Object:
<clinit>, clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
PRESENT
public static final java.lang.String PRESENT
MODE_ABORT
public static final int MODE_ABORT
MODE_PARSE_TEXT
public static final int MODE_PARSE_TEXT
MODE_IGNORE_COMMENTS
public static final int MODE_IGNORE_COMMENTS
MODE_STRICT_TAGS
public static final int MODE_STRICT_TAGS
MODE_STRICT_ATTR
public static final int MODE_STRICT_ATTR
initialMode
private int initialMode
open
private TagNode open
- The registry of the opening tags the consumer is interested in.
clos
private TagNode clos
- The registry of the closing tags the consumer is interested in.
HtmlTokenizer
public HtmlTokenizer(int initialMode)
addOpeningTag
public void addOpeningTag(java.lang.String name,
int code,
java.lang.String[] attr)
addClosingTag
public void addClosingTag(java.lang.String name,
int code)
addTag
public void addTag(java.lang.String name,
int code,
java.lang.String[] attr)
removeOpeningTag
public void removeOpeningTag(java.lang.String name)
removeClosingTag
public void removeClosingTag(java.lang.String name)
removeTag
public void removeTag(java.lang.String name)
printTree
public void printTree(java.io.PrintStream o)
read
public void read(FastInput in,
HtmlConsumer t)
throws java.io.IOException,
FastInput.EndException
readComments
public void readComments(FastInput in,
boolean ignoreComments)
throws java.io.IOException,
FastInput.EndException
test
public static final void test()