public final class SGMLParser extends Object
SGML parser with something like a SAX-api.

| Modifier and Type | Class and Description |
|---|---|
| (package private) class | SGMLParser.AttributesWrapper<br>A partial implementation of the SAX-interface Attributes which allows setting name-value pairs via the method SGMLParser.AttributesWrapper.addAttribute(java.lang.String, java.lang.String). |
| (package private) static class | SGMLParser.Buffer<br>Class which buffers the read stream. |
| (package private) static interface | SGMLParser.CharTester<br>Provides a single method which decides whether the given character passes a certain test. |
| (package private) static class | SGMLParser.SpecCharTester<br>A CharTester which allows specifying the character which passes the test. |
| (package private) static class | SGMLParser.TrivialContentHandler<br>A ContentHandler which simply ignores all events. |
| (package private) static interface | SGMLParser.XMLsGMLspecifica<br>Provides a set of methods for parsing, with implementations specific to xml and sgml. |

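To make the CharTester idea concrete, here is a minimal standalone sketch. The nested types are package private and this page does not show their method names, so `testChar`, `setChar`, and `readUntil` below are assumed names used for illustration only, and treating "blank" as `Character.isWhitespace` is likewise an assumption.

```java
// Hypothetical stand-ins for the package-private types described above.
// The method names testChar and setChar are assumptions, not the library's API.
interface CharTester {
    /** Decides whether the given character passes the test. */
    boolean testChar(char ch);
}

/** A tester whose single passing character can be changed at runtime. */
final class SpecCharTester implements CharTester {
    private char spec;

    void setChar(char spec) {
        this.spec = spec;
    }

    @Override
    public boolean testChar(char ch) {
        return ch == spec;
    }
}

final class CharTesterDemo {
    // Analogous to the description of TEST_BLANK_GT: passes for blank or '>'.
    static final CharTester BLANK_OR_GT =
        ch -> Character.isWhitespace(ch) || ch == '>';

    /** Returns the prefix of text before the first character passing the test. */
    static String readUntil(String text, CharTester stop) {
        int i = 0;
        while (i < text.length() && !stop.testChar(text.charAt(i))) {
            i++;
        }
        return text.substring(0, i);
    }

    public static void main(String[] args) {
        // A tag name ends at the first blank or '>'.
        System.out.println(readUntil("body class=\"x\">", BLANK_OR_GT)); // body

        // A quoted attribute value ends at the matching quote character.
        SpecCharTester quote = new SpecCharTester();
        quote.setChar('\'');
        System.out.println(readUntil("x y' rest", quote)); // x y
    }
}
```

Keeping the stop condition behind a one-method interface lets one reading loop serve tag names, attribute names, and quoted values alike.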
| Modifier and Type | Field and Description |
|---|---|
| private static String | ATTR_NAME<br>Short string representation of the object currently parsed. |
| private static String | ATTR_VALUE<br>Short string representation of the object currently parsed. |
| private SGMLParser.Buffer | buffer<br>The buffer of the input stream. |
| private static int | BUFFER_SIZE<br>The size of the buffer used internally. |
| private ContentHandler | contentHandler<br>The ContentHandler registered. |
| private int | currChar<br>The current character or -1 to signify the end of the stream. |
| private static String | END_TAG<br>Short string representation of the object currently parsed. |
| private SGMLParser.XMLsGMLspecifica | htmlAttributeParser<br>Contains the HTML-specific part of the parser. |
| private ParseExceptionHandler | parseExceptionHandler<br>The ParseExceptionHandler registered. |
| private static String | PROC_INSTR<br>Short string representation of the object currently parsed. |
| private static String | QUOTE_DOT |
| private static String | START_TAG<br>Short string representation of the object currently parsed. |
| private static char | SYMB_COMMENT |
| private static char | SYMB_EQ |
| private static char | SYMB_TAG |
| private static SGMLParser.CharTester | TEST_BLANK_EQUALS_GT<br>Tests for = and for >. |
| private static SGMLParser.CharTester | TEST_BLANK_GT<br>Tests for blank or >. |
| private static SGMLParser.CharTester | TEST_BLANK_GT_SLASH<br>Tests for blank, /, >. |
| private static SGMLParser.CharTester | TEST_END_OF_COMMENT<br>Tests for quote both for ' and for ". |
| private static SGMLParser.CharTester | TEST_GT<br>Tests for >. |
| private static SGMLParser.CharTester | TEST_LT<br>Tests for <. |
| private static SGMLParser.CharTester | TEST_NO_WHITESPACE<br>Tests for whitespace. |
| private static SGMLParser.SpecCharTester | TEST_SPEC<br>Tests for a specified character. |
| private static String | WHITESP_IN_ATTR<br>Short string representation of the object currently parsed. |
| private SGMLParser.XMLsGMLspecifica | xmlAttributeParser<br>Contains the XML-specific part of the parser. |
| private SGMLParser.XMLsGMLspecifica | xmlSgmlSpecifica<br>Contains the class with methods specific to xml and sgml, respectively. |

| Constructor and Description |
|---|
| SGMLParser()<br>Creates a new SGMLParser with the default handlers for content and exceptions. |

| Modifier and Type | Method and Description |
|---|---|
| ContentHandler | getContentHandler()<br>Returns contentHandler. |
| ParseExceptionHandler | getExceptionHandler()<br>Returns parseExceptionHandler. |
| boolean | isXMLParser() |
| (package private) void | parse(InputSource src)<br>Parses the InputSource given but delegates everything inside a tag or a processing instruction to parseTagOrPI(). |
| void | parse(Reader reader)<br>Parses the given Reader. |
| (package private) void | parseEndTag()<br>Parses an end-tag notifying the underlying handler. |
| (package private) void | parseStartOrStartEndTag()<br>Parses a start-tag or, for xml, an empty tag. |
| private int | parseTagOrPI()<br>Parses everything within a tag, a processing instruction, ... |
| private int | parseText()<br>Parses everything outside a tag, a processing instruction, ... |
| boolean | parseXML(boolean xml)<br>Sets whether this parser is used as an xml-parser. |
| void | setContentHandler(ContentHandler contentHandler)<br>Sets contentHandler. |
| void | setExceptionHandler(ParseExceptionHandler peHandler)<br>Sets parseExceptionHandler. |

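The split between parseText() (everything outside a tag) and parseTagOrPI() (everything inside one) can be pictured with a small standalone sketch. This is not the SGMLParser implementation: the class `AlternatingParseSketch`, its `RecordingHandler`, and the `trace` helper are hypothetical, the sketch scans a String with `indexOf` instead of a buffered stream, and it ignores attributes, comments, and processing instructions. Only the `org.xml.sax` types are real JDK API.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the text/tag alternation; not the library's code.
final class AlternatingParseSketch {

    /** Records events as strings so the call sequence can be inspected. */
    static final class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add("text:" + new String(ch, start, length));
        }
    }

    /** Alternates between text outside tags and the content between '<' and '>'. */
    static void parse(String doc, ContentHandler handler) throws SAXException {
        char[] chars = doc.toCharArray();
        int pos = 0;
        handler.startDocument();
        while (pos < chars.length) {
            // "Text" phase: everything up to the next '<' (or the end of input).
            int lt = doc.indexOf('<', pos);
            int textEnd = (lt < 0) ? chars.length : lt;
            if (textEnd > pos) {
                handler.characters(chars, pos, textEnd - pos);
            }
            if (lt < 0) {
                break;
            }
            // "Tag" phase: everything between '<' and the matching '>'.
            int gt = doc.indexOf('>', lt);
            if (gt < 0) {
                throw new SAXException("Unclosed tag at index " + lt);
            }
            String inside = doc.substring(lt + 1, gt);
            if (inside.startsWith("/")) {
                handler.endElement("", "", inside.substring(1));
            } else {
                handler.startElement("", "", inside, new AttributesImpl());
            }
            pos = gt + 1;
        }
        handler.endDocument();
    }

    /** Convenience wrapper returning the recorded events for a document. */
    static List<String> trace(String doc) {
        RecordingHandler handler = new RecordingHandler();
        try {
            parse(doc, handler);
        } catch (SAXException e) {
            throw new IllegalArgumentException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        System.out.println(trace("<p>hi</p>")); // [start:p, text:hi, end:p]
    }
}
```

A handler that overrides nothing from `DefaultHandler` ignores all events, which matches the description of SGMLParser.TrivialContentHandler.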
private static final String QUOTE_DOT

private static final char SYMB_EQ

private static final char SYMB_COMMENT

private static final char SYMB_TAG

private static final SGMLParser.CharTester TEST_BLANK_GT_SLASH
Tests for blank, /, >.

private static final SGMLParser.CharTester TEST_BLANK_GT
Tests for blank or >.

private static final SGMLParser.CharTester TEST_LT
Tests for <.

private static final SGMLParser.CharTester TEST_GT
Tests for >.

private static final SGMLParser.CharTester TEST_BLANK_EQUALS_GT
Tests for = and for >.

private static final SGMLParser.CharTester TEST_NO_WHITESPACE
Tests for whitespace.

private static final SGMLParser.CharTester TEST_END_OF_COMMENT
Tests for quote both for ' and for ".

private static final SGMLParser.SpecCharTester TEST_SPEC
Tests for a specified character.
… ' and ".

private final SGMLParser.XMLsGMLspecifica htmlAttributeParser
Contains the HTML-specific part of the parser.

private final SGMLParser.XMLsGMLspecifica xmlAttributeParser
Contains the XML-specific part of the parser.

private static final int BUFFER_SIZE
The size of the buffer used internally.
… 1.
I found no significant difference in speed when increasing this number.
The buffer coming from a stream from a URL seems to have a maximal size
of 1448, whereas for file streams there seems to be no bound.
In the cases considered, the file is read in as a whole.

private static final String START_TAG
Short string representation of the object currently parsed.
… SGMLParser.Buffer.readStringBuffer(eu.simuline.util.sgml.SGMLParser.CharTester, java.lang.String).

private static final String END_TAG
Short string representation of the object currently parsed.
… SGMLParser.Buffer.readStringBuffer(eu.simuline.util.sgml.SGMLParser.CharTester, java.lang.String).

private static final String PROC_INSTR
Short string representation of the object currently parsed.
… SGMLParser.Buffer.readStringBuffer(eu.simuline.util.sgml.SGMLParser.CharTester, java.lang.String).

private static final String ATTR_NAME
Short string representation of the object currently parsed.
… SGMLParser.Buffer.readStringBuffer(eu.simuline.util.sgml.SGMLParser.CharTester, java.lang.String).

private static final String WHITESP_IN_ATTR
Short string representation of the object currently parsed.
… SGMLParser.Buffer.readStringBuffer(eu.simuline.util.sgml.SGMLParser.CharTester, java.lang.String).

private static final String ATTR_VALUE
Short string representation of the object currently parsed.
… SGMLParser.Buffer.readStringBuffer(eu.simuline.util.sgml.SGMLParser.CharTester, java.lang.String).

private SGMLParser.XMLsGMLspecifica xmlSgmlSpecifica
Contains the class with methods specific to xml and sgml, respectively.

private int currChar
The current character or -1 to signify the end of the stream.

private ContentHandler contentHandler
The ContentHandler registered.

private ParseExceptionHandler parseExceptionHandler
The ParseExceptionHandler registered.

private SGMLParser.Buffer buffer
The buffer of the input stream.

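The interplay of a fixed-size buffer with a current character that becomes -1 at the end of the stream can be sketched as follows. This is a hedged illustration, not the code of SGMLParser.Buffer: the class `CharBufferSketch` and its methods `nextChar` and `drain` are hypothetical names. It also shows why the buffer size affects only how often the underlying Reader is asked for data, not the characters delivered.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of a refillable character buffer; not the library's Buffer.
final class CharBufferSketch {
    private final Reader reader;
    private final char[] chars;
    private int next = 0;   // index of the next unread char in chars
    private int filled = 0; // number of valid chars currently in chars

    CharBufferSketch(Reader reader, int bufferSize) {
        if (bufferSize < 1) {
            throw new IllegalArgumentException("bufferSize must be >= 1");
        }
        this.reader = reader;
        this.chars = new char[bufferSize];
    }

    /** Returns the next character as an int, or -1 at the end of the stream. */
    int nextChar() throws IOException {
        while (next >= filled) {
            // Buffer exhausted: refill it from the underlying reader.
            filled = reader.read(chars, 0, chars.length);
            next = 0;
            if (filled < 0) {
                filled = 0;
                return -1;
            }
        }
        return chars[next++];
    }

    /** Reads text through a buffer of the given size and returns what was read. */
    static String drain(String text, int bufferSize) {
        CharBufferSketch buf = new CharBufferSketch(new StringReader(text), bufferSize);
        StringBuilder out = new StringBuilder();
        try {
            for (int c = buf.nextChar(); c != -1; c = buf.nextChar()) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same characters arrive regardless of the buffer size.
        System.out.println(drain("<a>text</a>", 1).equals(drain("<a>text</a>", 1024))); // true
    }
}
```

Returning an int rather than a char mirrors `Reader.read()`: every char value stays representable while -1 remains free as the end-of-stream marker.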
public SGMLParser()
Creates a new SGMLParser with the default handlers for content and exceptions.

void parse(InputSource src) throws IOException, SAXException
Parses the InputSource given but delegates everything inside a tag or a processing instruction to parseTagOrPI().
Parameters:
src - an InputSource.
Throws:
IOException - if an error occurs.
SAXException - if an error occurs.

public void parse(Reader reader) throws IOException, SAXException
Parses the given Reader.
Parameters:
reader - a Reader sequentializing an SGML document.
Throws:
IOException - if an error reading the stream occurs.
SAXException - if an error with the sgml-syntax occurs.

private int parseText() throws IOException, SAXException
Parses everything outside a tag, a processing instruction, ...
… < and >.
Missing: distinction between notification of characters and whitespace.
Throws:
IOException - if an error reading the stream occurs.
SAXException - if an error with the sgml-syntax occurs.
See Also:
parseTagOrPI()

void parseEndTag() throws IOException, SAXException
Parses an end-tag notifying the underlying handler.
Throws:
IOException - if an error reading the stream occurs.
SAXException - if an error with the sgml-syntax occurs.

void parseStartOrStartEndTag() throws IOException, SAXException
Parses a start-tag or, for xml, an empty tag.
Throws:
IOException - if an error reading the stream occurs.
SAXException - if an error with the sgml-syntax occurs.

private int parseTagOrPI() throws IOException, SAXException
Parses everything within a tag, a processing instruction, ...
… < and >.
Throws:
IOException
SAXException
See Also:
parseText()

public void setContentHandler(ContentHandler contentHandler)
Sets contentHandler.
Parameters:
contentHandler - a ContentHandler.

public ContentHandler getContentHandler()
Returns contentHandler.
Returns:
the ContentHandler contentHandler.

public void setExceptionHandler(ParseExceptionHandler peHandler)
Sets parseExceptionHandler.
Parameters:
peHandler - a ParseExceptionHandler.

public ParseExceptionHandler getExceptionHandler()
Returns parseExceptionHandler.
Returns:
the ParseExceptionHandler parseExceptionHandler.

public boolean parseXML(boolean xml)
Sets whether this parser is used as an xml-parser.
Parameters:
xml - a boolean value signifying whether this parser will be used as an xml-parser in the sequel.
Returns:
a boolean value signifying whether this parser was used as an xml-parser before invoking this method.

public boolean isXMLParser()

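Because parseXML(boolean) is documented to return the previous mode, a caller can switch modes temporarily and restore the old one afterwards. The sketch below models only that contract with a hypothetical class `ModeSwitchSketch`. The initial non-XML mode and the assumption that isXMLParser() reports the current mode are guesses, since this page documents neither.

```java
// Hypothetical model of the documented parseXML contract; not the library's code.
final class ModeSwitchSketch {
    // Assumption: the parser starts in non-XML mode.
    private boolean xml = false;

    /** Sets the mode and returns the mode that was active before the call. */
    boolean parseXML(boolean xml) {
        boolean previous = this.xml;
        this.xml = xml;
        return previous;
    }

    /** Assumed to report the current mode. */
    boolean isXMLParser() {
        return xml;
    }

    public static void main(String[] args) {
        ModeSwitchSketch parser = new ModeSwitchSketch();

        // Switch to XML mode for one task, remembering the former mode.
        boolean before = parser.parseXML(true);
        System.out.println(parser.isXMLParser()); // true

        // Restore whatever mode was active earlier.
        parser.parseXML(before);
        System.out.println(parser.isXMLParser()); // false
    }
}
```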
Copyright © 2012–2018 Simuline Organization (l2r). All rights reserved.