Package Torello.HTML
Class HTMLTags
- java.lang.Object
  - Torello.HTML.HTMLTags
public class HTMLTags extends java.lang.Object
Primary "HTML-5 Tags" class. It keeps a list of all 122 tags in a TreeSet<String>, along with many accessor methods that are used by the HTML parser, or potentially by any class or function that may need this list.
The purpose of this class is to maintain the list of valid HTML tags in Java memory. There are under 200 of these, and they aid the HTMLParse class in picking valid HTML tags when scraping. This class also maintains in memory some "pre-instantiated" Java-HTML HTMLNode - TagNode instances. The class TagNode contains only "final variables" (it is immutable) because at least 80% of HTML on any given page is just a tag / element instance that never needs to change in memory. Call the method public TagNode hasTag(String, TC) to obtain a valid instance of class TagNode.
-
-
Method Summary
Basic Methods
Modifier and Type    Method
static String    getDescription(String tag)
static TagNode    hasTag(String tag, TC openOrClosed)

List Known Tags
Modifier and Type    Method
static Iterator<String>    iterator()
static Iterator<String>    iteratorAddedForHTML5()
static Iterator<String>    iteratorBlockTags()
static Iterator<String>    iteratorDeprecatedForHTML5()
static Iterator<Map.Entry<String, String>>    iteratorDescriptions()
static Iterator<String>    iteratorInlineTags()
static Iterator<String>    iteratorSingletonTags()

Check Tag Categories
Modifier and Type    Method
static boolean    deprecated(String tok)
static boolean    isBlock(String tok)
static boolean    isHTML5(String tok)
static boolean    isInline(String tok)
static boolean    isSingleton(String tok)
static boolean    isTag(String tag)

Add or Remove Tags (to/from the Internal-List)
Modifier and Type    Method
static boolean    addSingleton(String htmlTagSingleton)
static boolean    addTag(String htmlTag)
static boolean    removeSingleton(String htmlTagSingleton)
static boolean    removeTag(String htmlTag)

Print the Internal Tag List
Modifier and Type    Method
static void    printAll(Appendable a, boolean printDescriptions)
static void    printAllToTerminal(boolean printDescriptions)

Utilities
Modifier and Type    Method
static String    getTag_MEM_HEAP_CHECKOUT_COPY(String tag)
static void    loadDescriptions()
static byte    maxTokenLength()
-
-
-
Method Detail
-
printAllToTerminal
public static void printAllToTerminal(boolean printDescriptions)
This simply prints all data that is stored in the JAR file to terminal output. It uses the method with the near-same name, but utilizes 'System.out' for the Appendable instance. Because 'System.out' does not throw IOException when printing, the exception is caught here, for convenience.
- Parameters:
printDescriptions - If this is set to TRUE, the JAR Descriptions-Data-File is loaded into memory (unless it has already been loaded), and the descriptions are printed. These String's contain a one-sentence-long text-description of each HTML Element listed in this class. If this parameter is FALSE, the data-file will not be visited, and the HTML Element descriptions will not be sent to the output stream.
- See Also:
printAll(Appendable, boolean)
- Code:
- Exact Method Body:
    try
        { printAll(System.out, printDescriptions); }

    catch (IOException e) { }
-
printAll
public static void printAll(java.lang.Appendable a, boolean printDescriptions) throws java.io.IOException
This simply prints all data that is stored in the JAR data-file to a java.lang.Appendable.
- Parameters:
a - This parameter provides an instance that will receive the text output. This parameter may not be null, or a NullPointerException will throw. This expects an implementation of Java's java.lang.Appendable interface, which allows for a wide range of options when logging intermediate messages.

Class or Interface Instance    Use & Purpose
'System.out'    Sends text to the standard-out terminal
Torello.Java.StorageWriter    Sends text to System.out, and saves it, internally
FileWriter, PrintWriter, StringWriter    General-purpose Java text-output classes
PrintStream (for instance, one wrapping a FileOutputStream)    More general-purpose Java text-output classes

Checked IOException:
The Appendable interface requires that the Checked-Exception IOException be caught when using its append(...) methods.
printDescriptions - If this is set to TRUE, this method first ensures that the JAR Descriptions-Data-File has been loaded into memory. If it has not been loaded yet, the description-String's are loaded at that point. These String's contain a one-sentence-long text-description of each HTML Element listed in this class. If this parameter is FALSE, the data-file will not be visited, and the HTML Element descriptions will not be sent to the output stream.
- Throws:
java.io.IOException - The general purpose interface java.lang.Appendable requires checking for an IOException throw when printing information. If the 'Appendable' provided to this method fails, this exception shall propagate out.
- Code:
- Exact Method Body:
    a.append("TAGS: ");
    for (String tag : tags) a.append(tag + ", ");

    a.append("\n\nDEPRECATED: ");
    for (String deprecatedTag : deprecated) a.append(deprecatedTag + ", ");

    a.append("\n\nHTML5: ");
    for (String html5Tag : html5Tags) a.append(html5Tag + ", ");

    a.append("\n\nSINGLETON-TAGS: ");
    for (String selfClosingTag : singletonTags) a.append(selfClosingTag + ", ");

    a.append("\n\nBLOCK-TAGS: ");
    for (String blockTag : blockTags) a.append(blockTag + ", ");

    a.append("\n\nINLINE-TAGS: ");
    for (String inlineTag : inlineTags) a.append(inlineTag + ", ");

    a.append("\n\ntagNodesOpening: ");
    for (String s : tagNodesOpening.keySet())
        a.append(tagNodesOpening.get(s).toString() + ", ");

    a.append("\n\ntagNodesClosing: ");
    for (String s : tagNodesClosing.keySet())
        a.append(tagNodesClosing.get(s).toString() + ", ");

    a.append("\n\ntagNodesOpeningUC: ");
    for (String s : tagNodesOpeningUC.keySet())
        a.append(tagNodesOpeningUC.get(s).toString() + ", ");

    a.append("\n\ntagNodesClosingUC: ");
    for (String s : tagNodesClosingUC.keySet())
        a.append(tagNodesClosingUC.get(s).toString() + ", ");

    if (printDescriptions)
    {
        loadDescriptions(); // Will only load if descriptions have not already been loaded.

        a.append("\n\n");

        for (String s : descriptions.keySet())
            a.append(s + ((s.length() >= 7) ? ":\t" : ":\t\t") + descriptions.get(s) + "\n");
    }
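The relationship between printAll and printAllToTerminal can be illustrated with a small, stand-alone sketch. It does not use the library: the class name AppendableSketch and both of its methods are hypothetical, and the tag list is passed in explicitly. It shows how a method written against java.lang.Appendable must declare IOException, and how a convenience wrapper can swallow that exception when the target (here a StringBuilder) never actually throws it.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in (not part of Torello.HTML): prints a tag list to any Appendable.
final class AppendableSketch
{
    // Appendable.append(...) declares the checked IOException, so it propagates from here.
    static void printAll(Appendable a, Iterable<String> tags) throws IOException
    {
        a.append("TAGS: ");
        for (String tag : tags) a.append(tag).append(", ");
    }

    // Convenience wrapper: StringBuilder never throws IOException, so it is handled here.
    static String printToString(Iterable<String> tags)
    {
        StringBuilder sb = new StringBuilder();

        try
            { printAll(sb, tags); }

        catch (IOException e)
            { throw new AssertionError("StringBuilder does not throw IOException", e); }

        return sb.toString();
    }

    public static void main(String[] args)
    {
        TreeSet<String> tags = new TreeSet<String>(Arrays.asList("div", "br", "span"));
        System.out.println(printToString(tags));
    }
}
```

Any other Appendable, such as a StringWriter or System.out, could be passed to the first method in the same way.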
-
loadDescriptions
public static void loadDescriptions()
The data-structure (a java.util.TreeMap) that holds the individual text-descriptions of each HTML tag is not loaded into memory from the JAR file automatically. When the class-loader loads this class, it employs a "Lazy Loading" heuristic to prevent unnecessary memory-usage.
Instead, if a programmer decides to start printing information about HTML-Tags, and would like to include a short, one or two sentence description of the HTML Elements (using the method getDescription(String)), then and only then will this method 'loadDescriptions' be invoked to load those one-sentence HTML-Tag summaries.
As an aside, the purpose of keeping these sentences in a JAR file is that they are rather long, and really never used at all - unless you are interested in doing some reporting. By keeping them in the JAR file (unless requested), some amount of "over-head" resource usage is saved.
If the text-descriptions have already been loaded, this method will just exit and return, rather than loading them a second time.
- See Also:
LFEC.readObjectFromFile_JAR(Class, String, boolean, Class)
- Code:
- Exact Method Body:
    if (descriptions.size() == 0)

        descriptions.putAll((TreeMap<String, String>) LFEC.readObjectFromFile_JAR
            (HTMLTags.class, "data-files/HTMLTagDescriptions.tmdat", true, TreeMap.class));
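The "load once, on first request" behavior described for loadDescriptions can be sketched without the library. Everything here is an assumption made for illustration: the class LazyDescriptionsSketch, the loadCount counter, and the readFromDisk() helper, which stands in for reading the serialized TreeMap out of the JAR and returns placeholder sentences rather than the real data-file contents.

```java
import java.util.TreeMap;

// Hypothetical stand-in (not part of Torello.HTML) for a lazily loaded description table.
final class LazyDescriptionsSketch
{
    private static final TreeMap<String, String> descriptions = new TreeMap<>();

    // Counts how many times the (simulated) data-file was actually read.
    static int loadCount = 0;

    // Placeholder for the JAR read; the two sentences below are made-up sample data.
    private static TreeMap<String, String> readFromDisk()
    {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("br", "Inserts a single line break.");
        m.put("div", "Defines a section in a document.");
        return m;
    }

    // Loads the table only if it is still empty; otherwise returns immediately.
    static void loadDescriptions()
    {
        if (descriptions.size() == 0)
        {
            descriptions.putAll(readFromDisk());
            loadCount++;
        }
    }

    // Case-insensitive lookup that triggers the lazy load on first use.
    static String getDescription(String tag)
    {
        loadDescriptions();
        return descriptions.get(tag.toLowerCase());
    }

    public static void main(String[] args)
    {
        System.out.println(getDescription("BR"));
        System.out.println(getDescription("div"));
        System.out.println("Reads performed: " + loadCount);
    }
}
```

Using an empty map as the "not yet loaded" signal mirrors the size check in the method body above; a boolean flag would be needed instead if the data-file could legitimately be empty.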
-
maxTokenLength
public static byte maxTokenLength()
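This returns the String-length of the longest HTML token saved in the internal state TreeSet<String> of HTML Tokens. The value is kept up to date by addTag(String) and removeTag(String), so it is not recomputed on each call.
- Returns:
The length of the longest HTML Token String.
- Code:
- Exact Method Body: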
return MAX_TOKEN_LENGTH;
-
addTag
public static boolean addTag(java.lang.String htmlTag)
Adds a new HTML element to the list of elements that may be parsed, created and checked. This is not always advisable, as the complete list of HTML-5 tags is already internally stored, but if you would like to add or remove certain tags, there are two methods for doing this.
- Parameters:
htmlTag - Any HTML tag that you would like to see parsed by the HTML page parser. If the parser encounters a construct such as <YOUR_NEW_TAG ATTRIBUTES="...">, it will treat that as a new HTML element.
- Returns:
TRUE if the element was indeed a new element to the list, and FALSE if the HTML-tokens-list already contained this HTML element. In the latter case, this method call will just return gracefully - with no changes being made to the underlying list of acceptable HTML tokens.
- Throws:
HTMLTokException - If the String parameter 'htmlTag' contains non-alpha-numeric characters, begins with a number, or is longer than 127 characters.
- Code:
- Exact Method Body:
    Matcher m = HTML_TAG_ALPHA_NUMERIC.matcher(htmlTag);

    if ((! m.find()) || (htmlTag.length() != m.group().length()))

        throw new HTMLTokException(
            "The HTML-Tag Parameter that was passed [" + htmlTag + "] doesn't conform to the " +
            "expected requirements for HTML-Tags.  It may only contain alpha-numeric characters, " +
            "and it must not begin with a number."
        );

    String tag = htmlTag.trim().toLowerCase();

    if (tag.length() > 127) throw new HTMLTokException(
        "The (trimmed) HTML-Tag Parameter that was passed [" + tag + "] is longer than 127 " +
        "characters.  This is not allowed here."
    );

    boolean ret = tags.add(tag);

    if (ret)
    {
        // NOTE: These four private, static fields are of type TreeMap<String, TagNode>
        //       tagNodesOpening, tagNodesOpeningUC, tagNodesClosing, tagNodesClosingUC
        //
        // They can provide a significant savings for the Garbage Collector.  For any
        // HTML Element that does not have any attributes, and has a standard 'case'
        // (all upper-case, or all lower-case), the parser will "re-use" pre-existing
        // instances of class TagNode, rather than building a new one.
        //
        // FOR EXAMPLE: The parser will "re-use" the same instance of a "<BR>" TagNode, or
        //              any one, actually, as long as it does not have attributes.  Since 40%
        //              to 50% of class TagNode are "TC.ClosingTags", this can be a significant
        //              improvement

        // Build a Lower-Case, Pre-Instantiated, Zero-Attribute version of the HTML Element
        // Uses specialized package-only visible TagNode constructor.
        // Not available to the general public

        tagNodesOpening.put(tag, new TagNode(tag, TC.OpeningTags));
        tagNodesClosing.put(tag, new TagNode(tag, TC.ClosingTags));

        // Build an Upper-Case, Pre-Instantiated, Zero-Attribute version of the HTML Element

        tag = tag.toUpperCase();

        tagNodesOpeningUC.put(tag, new TagNode("<" + tag + ">"));
        tagNodesClosingUC.put(tag, new TagNode("</" + tag + ">"));

        // Update the MAX_TOKEN_LENGTH - but only if necessary.
        if (tag.length() > MAX_TOKEN_LENGTH) MAX_TOKEN_LENGTH = (byte) tag.length();
    }

    return ret;
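The validate-then-register flow of addTag can be sketched with only the JDK. This is not the library's code: the class TagRegistrySketch is hypothetical, the regular expression is an assumption (the actual HTML_TAG_ALPHA_NUMERIC pattern is not shown on this page), and IllegalArgumentException stands in for HTMLTokException.

```java
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical stand-in (not part of Torello.HTML) for a validated, case-insensitive tag set.
final class TagRegistrySketch
{
    // ASSUMPTION: letters and digits only, and the first character must be a letter.
    private static final Pattern NAME = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    private static final TreeSet<String> tags = new TreeSet<>();
    private static int maxTokenLength = 0;

    // Returns true if the tag was new, false if it was already registered.
    static boolean addTag(String htmlTag)
    {
        if (! NAME.matcher(htmlTag).matches()) throw new IllegalArgumentException(
            "Tag [" + htmlTag + "] must be alpha-numeric and must not begin with a number."
        );

        String tag = htmlTag.toLowerCase();

        if (tag.length() > 127) throw new IllegalArgumentException(
            "Tag [" + tag + "] is longer than 127 characters."
        );

        boolean added = tags.add(tag);

        // Track the longest registered token, but only when something was actually added.
        if (added && (tag.length() > maxTokenLength)) maxTokenLength = tag.length();

        return added;
    }

    static boolean isTag(String tag)
    { return tags.contains(tag.toLowerCase()); }

    static int maxTokenLength()
    { return maxTokenLength; }

    public static void main(String[] args)
    {
        System.out.println(addTag("MyWidget"));   // first registration
        System.out.println(addTag("mywidget"));   // same tag, different case
        System.out.println(isTag("MYWIDGET"));
        System.out.println(maxTokenLength());
    }
}
```

Storing every tag in lower-case is what makes the later lookups case-insensitive without a custom comparator.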
-
removeTag
public static boolean removeTag(java.lang.String htmlTag)
Removes an HTML element from the list of elements that may be parsed, created and checked. This is not always advisable, as the complete list of HTML-5 tags is already internally stored, but if you would like to add or remove certain tags, there are two methods for doing this.
- Parameters:
htmlTag - Any HTML tag that you no longer want to see parsed by the HTML page parser. HTML nodes that contain this tag as their element will cause the parser to ignore the node, and treat it like a TextNode.
- Returns:
TRUE if the element was removed, and FALSE if it was not - because it wasn't in the HTML-tokens-list in the first place.
- Code:
- Exact Method Body:
    String  tag = htmlTag.trim().toLowerCase();
    boolean ret = tags.remove(tag);

    if (ret)
    {
        // "Lower-Case" and "Pre-Instantiated" (Zero-Attributes) version of TagNode
        tagNodesOpening.remove(tag);
        tagNodesClosing.remove(tag);

        tag = tag.toUpperCase();

        // "Upper-Case", Pre-Instantiated, Zero-Attribute version of TagNode
        tagNodesOpeningUC.remove(tag);
        tagNodesClosingUC.remove(tag);

        // After removal, there is a small chance the
        // MAX_TOKEN_LENGTH is, now, shorter

        if (tag.length() == MAX_TOKEN_LENGTH) setMaxTokenLength();
    }

    return ret;
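The last step of removeTag, recomputing the maximum token length only when the removed tag was as long as the current maximum, can be sketched as follows. The class MaxLengthSketch and its methods are hypothetical; the body of the library's private setMaxTokenLength() is not shown on this page, so the full scan used here is an assumption about what it does.

```java
import java.util.TreeSet;

// Hypothetical stand-in (not part of Torello.HTML) for maintaining a cached maximum length.
final class MaxLengthSketch
{
    private static final TreeSet<String> tags = new TreeSet<>();
    private static int maxLen = 0;

    static boolean add(String tag)
    {
        String lc = tag.trim().toLowerCase();
        boolean added = tags.add(lc);
        if (added) maxLen = Math.max(maxLen, lc.length());
        return added;
    }

    static boolean remove(String tag)
    {
        String lc = tag.trim().toLowerCase();
        boolean removed = tags.remove(lc);

        // Only a tag that was as long as the current maximum can lower it, so the
        // (comparatively expensive) full scan is skipped in every other case.
        if (removed && (lc.length() == maxLen))
        {
            maxLen = 0;
            for (String t : tags) maxLen = Math.max(maxLen, t.length());
        }

        return removed;
    }

    static int maxLen()
    { return maxLen; }

    public static void main(String[] args)
    {
        add("blockquote");
        add("br");
        System.out.println(maxLen());
        remove("BLOCKQUOTE");
        System.out.println(maxLen());
    }
}
```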
-
addSingleton
public static boolean addSingleton(java.lang.String htmlTagSingleton)
Adds an HTML-element to the list of singleton HTML-elements. A singleton may only have an "opening" tag, and may not have a closing-version tag. For instance, <IMG SRC="..."> is the classic singleton; its data is all stored internally as attribute values.
- Parameters:
htmlTagSingleton - Any HTML tag that you would like to see listed as a singleton HTML-element.
- Returns:
TRUE if the element was indeed a new element to the list, and FALSE if the HTML-singleton tokens-list already contained this HTML element. In the latter case, this method call will just return gracefully - with no changes being made to the underlying list of singleton tokens.
- Throws:
java.lang.IllegalArgumentException - If you have tried to "register" a singleton tag that isn't a known HTML-tag, then this method will throw an exception directing you to first add your token to the HTML-tags/tokens internal-list.
- Code:
- Exact Method Body:
    String tag = htmlTagSingleton.trim().toLowerCase();

    if (! tags.contains(tag)) throw new IllegalArgumentException(
        "The HTML token you have attempted to add [" + tag + "] may not be added to the " +
        "singletons list, because it is not a known/registered HTML token, as of now.  " +
        "First, make sure it is listed as one of the parser's tokens by calling " +
        "'addTag(token)', and then invoking this method with that token."
    );

    // Internally, there is a private & static TreeSet<String> which saves the names
    // of all HTML 'singleton' elements.  Use Java's TreeSet.add(E) method

    return singletonTags.add(tag);
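The precondition enforced by addSingleton, namely that a tag must already be registered before it can be marked as a singleton, can be shown in isolation. The class SingletonRegistrySketch and its four methods are hypothetical stand-ins built on two plain TreeSet instances; they only mirror the shape of the method body above.

```java
import java.util.TreeSet;

// Hypothetical stand-in (not part of Torello.HTML): a singleton list that is a subset of the tag list.
final class SingletonRegistrySketch
{
    private static final TreeSet<String> tags          = new TreeSet<>();
    private static final TreeSet<String> singletonTags = new TreeSet<>();

    static boolean addTag(String tag)
    { return tags.add(tag.trim().toLowerCase()); }

    // Refuses any tag that has not first been registered through addTag(...).
    static boolean addSingleton(String htmlTagSingleton)
    {
        String tag = htmlTagSingleton.trim().toLowerCase();

        if (! tags.contains(tag)) throw new IllegalArgumentException(
            "[" + tag + "] is not a registered tag; register it before marking it a singleton."
        );

        return singletonTags.add(tag);
    }

    static boolean removeSingleton(String htmlTagSingleton)
    { return singletonTags.remove(htmlTagSingleton.trim().toLowerCase()); }

    static boolean isSingleton(String tok)
    { return singletonTags.contains(tok.toLowerCase()); }

    public static void main(String[] args)
    {
        addTag("widget");
        System.out.println(addSingleton("WIDGET"));
        System.out.println(isSingleton("Widget"));
        System.out.println(removeSingleton("widget"));
    }
}
```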
-
removeSingleton
public static boolean removeSingleton(java.lang.String htmlTagSingleton)
Removes an HTML-element from the list of singleton HTML-elements. A singleton may only have an "opening" tag, and may not have a closing-version tag. For instance, <IMG SRC="..."> is the classic singleton; its data is all stored internally as attribute values.
- Parameters:
htmlTagSingleton - Any HTML tag that you no longer want to see in the HTML-singleton tokens-list.
- Returns:
TRUE if the element was removed, and FALSE if it was not - because it wasn't in the HTML-Singleton tokens-list in the first place.
- Code:
- Exact Method Body:
    String tag = htmlTagSingleton.trim().toLowerCase();

    // Internally, there is a private & static TreeSet<String> which saves the names
    // of all HTML 'singleton' elements.  Use Java's TreeSet.remove(Object) method

    return singletonTags.remove(tag);
-
hasTag
public static TagNode hasTag(java.lang.String tag, TC openOrClosed)
The purpose of this function/method is to provide a little "optimization." Since 100% of class TagNode information is stored as constant/final, this class facilitates instantiating only one copy of each node when building HTML page node-Vectors.
Internal to this class is a 'Vector<TagNode>' of each and every HTML-Tag available - both in upper-case tag-versions, and also in lower-case tags. There must also be an opening-version of the TagNode, and also a closing-version of the same TagNode.
This does, indeed, make a total of four pre-instantiated TagNode instances per tag that are stored within the Java-HTML JAR File. There is a java.util.TreeMap that is holding these serialized TagNode instances. This TreeMap has also been serialized and saved in the Java-HTML JAR, and it is loaded into memory by the Class-Loader as soon as an invocation to an HTML Method is made.
It is not mandatory to "reuse" instantiated HTML TagNode's, but for memory management, garbage-collection efficiency, and other optimizations, the classes in this package use the pre-instantiated versions of these objects whenever possible.
- Parameters:
tag - Any valid HTML tag. If the String passed is not a valid HTML tag, then this method will return null.
openOrClosed - If TC.OpeningTags is passed, then an "open" version of the HTML tag will be returned, and if TC.ClosingTags is passed, then a closing version will be returned. TC.Both is not allowed here: as the method body shows, passing it causes an IllegalArgumentException, and passing null causes a NullPointerException.
- Returns:
An opening (or closing) TagNode - or null if the passed String tag does not represent any valid HTML-Tag. Null is also returned when the closing version of a singleton tag is requested.
- Code:
- Exact Method Body:
    // FAIL-FAST: Check Input's immediately.  Throw Exception for invalid input.
    if (openOrClosed == null) throw new NullPointerException
        ("Parameter 'openOrClosed' is null, but this is not allowed.");

    if (openOrClosed == TC.Both) throw new IllegalArgumentException
        ("Parameter 'openOrClosed' was specified as TC.Both, but this is not allowed here.");

    // IMPORTANT NOTE: For Singleton-Tags: There is no closing-version, so one SHOULD NOT be
    // requested.  (There is no '</IMG>' tag!)  However, this method DOES NOT throw
    // IllegalArgumentException in this case, but rather it just exits gracefully, and returns
    // null.

    String tagLC = tag.toLowerCase();

    if (singletonTags.contains(tagLC) && (openOrClosed == TC.ClosingTags)) return null;

    // First, Check if the 'tag' is all lower-case.  If it is, the string would be identical to
    // the 'tagLC' variable we have just created.

    if (tagLC.equals(tag))
    {
        // Debugging Information, Debug-println.  Un-comment to follow.  DO NOT DELETE THIS LINE.
        // System.out.println("Used a pre-instantiated TagNode, Lower-Case TreeMap");

        return (openOrClosed == TC.OpeningTags)
            ? tagNodesOpening.get(tag)
            : tagNodesClosing.get(tag);
    }

    // Now, here, the variable could not have been all-lower-case.  NEXT, Check if it is
    // all-upper-case
    //
    // NOTE: There are pre-defined tables that include pre-instantiated TagNode's - both for
    //       lower-case tags and for upper-case tags.

    String tagUC = tag.toUpperCase();

    if (tagUC.equals(tag))
    {
        // Debugging Information, Debug-println.  Un-comment to follow.  DO NOT DELETE THIS LINE.
        // System.out.println("Used a pre-instantiated TagNode, Upper-Case TreeMap");

        return (openOrClosed == TC.OpeningTags)
            ? tagNodesOpeningUC.get(tag)
            : tagNodesClosingUC.get(tag);
    }

    // SPECIAL CASE: (Very Rare / Unlikely, but possible) The user has created an HTML Element
    // that has some lower-case alphabet letters, and some upper-case as well.  This does not
    // guarantee that it is a valid HTML Token, though, so check
    //
    // FOR EXAMPLE: If somebody typed <SeCtIoN>, we need to preserve the case, no matter how
    //              bizarre.  In such a case, a pre-packaged TagNode cannot be used, and instead
    //              a new TagNode must be instantiated.

    if (openOrClosed == TC.OpeningTags)
        return (tagNodesOpening.get(tagLC) == null) ? null : new TagNode("<" + tag + ">");
    else
        return (tagNodesClosing.get(tagLC) == null) ? null : new TagNode("</" + tag + ">");
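The reuse strategy in hasTag is essentially a flyweight cache keyed by case. The sketch below reproduces only the opening-tag half of that logic with hypothetical names (TagNodeCacheSketch, Node, register, opening); Node is a minimal immutable placeholder, not the library's TagNode, and the closing-tag and singleton branches are left out to keep the example short.

```java
import java.util.TreeMap;

// Hypothetical stand-in (not part of Torello.HTML) for the pre-instantiated node tables.
final class TagNodeCacheSketch
{
    // Minimal immutable node: safe to share because nothing about it can change.
    static final class Node
    {
        final String str;
        Node(String str) { this.str = str; }
    }

    private static final TreeMap<String, Node> openingLC = new TreeMap<>();
    private static final TreeMap<String, Node> openingUC = new TreeMap<>();

    // Pre-instantiates one lower-case and one upper-case opening node for the tag.
    static void register(String tag)
    {
        String lc = tag.toLowerCase();
        String uc = tag.toUpperCase();

        openingLC.put(lc, new Node("<" + lc + ">"));
        openingUC.put(uc, new Node("<" + uc + ">"));
    }

    // Returns a shared node for all-lower or all-upper input, a fresh node for
    // mixed-case input, and null when the tag is not registered at all.
    static Node opening(String tag)
    {
        String lc = tag.toLowerCase();
        if (lc.equals(tag)) return openingLC.get(tag);

        String uc = tag.toUpperCase();
        if (uc.equals(tag)) return openingUC.get(tag);

        // Mixed case such as "SeCtIoN": the original spelling must be preserved.
        return openingLC.containsKey(lc) ? new Node("<" + tag + ">") : null;
    }

    public static void main(String[] args)
    {
        register("section");
        System.out.println(opening("section") == opening("section"));   // same cached instance
        System.out.println(opening("SeCtIoN").str);
        System.out.println(opening("unknown"));
    }
}
```

Sharing is only safe because the node is immutable, which is the same reason the class description gives for TagNode having only final fields.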
-
getTag_MEM_HEAP_CHECKOUT_COPY
public static java.lang.String getTag_MEM_HEAP_CHECKOUT_COPY(java.lang.String tag)
This is an optimized, internal method that is used to prevent lots of duplicate HTML token-String's from being created by the parser. Internally, there ought to be just one instance of String's like "img", "br", "div", etc. This is used by the parser to reuse an already instantiated token String.
This method probably has relatively little use outside of the internal HTML parser code.
As a concept, this method is similar to what the JDK method String.intern() does. It, essentially, makes it somewhat easier to reuse instances of the TagNode's tok String-field.
- Parameters:
tag - This is an HTML token. A String identical to this 'token' String, but possibly a different memory reference on the heap, shall be returned.
- Returns:
The returned String shall obey the following:
    assert(tag.equals(returnedString));   // Identical String is returned
    assert(! (tag == returnedString));    // Probably a different memory allocation on the
                                          // heap.  PROBABLY!
Note that Java does not make any contracts regarding String references! (This can only help...)
IMPORTANT: If the tag passed is not a valid HTML-Tag, then this method shall return null.
- Code:
- Exact Method Body:
    // Obviously, for the 200 or so "pre-instantiated" (having-no-attributes) instances of
    // class TagNode that are kept, internally, in the data-structures of this class,
    // 'HTMLTags'  We cannot retrieve a "pre-allocated" copy of the tag-as-a-string from
    // the heap, because we are building the data-file for the first time!

    if (BUILDING_DATA_FILE___SKIP_OPTIMIZATION_TEMPORARILY) return tag.toLowerCase();

    TagNode tn = tagNodesOpening.get(tag.toLowerCase());

    // If the tag isn't found, make sure not to throw NullPointerException!
    if (tn == null) return null;

    // This "version" (of the exact same html-element-name is already on the heap)
    // Obviously, because, variable 'tn' has already been instantiated and is in the TreeMap
    // If this EXACT SAME REFERENCE IS USED FOR ALL "TagNode.tok" instances, quite a bit of
    // wasted-space in the heap's lookup table will be eliminated as the same "token"
    // (which is the name of the HTML Element: "div," "img," "span," etc...) is reused over
    // and over and over again.  Helps a little bit!  Not that complicated!

    return tn.tok;
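The idea of handing out one canonical String per token, which the text compares to String.intern(), can be sketched with a plain map. The class CanonicalTokenSketch and its register and checkout methods are hypothetical; the library stores the canonical String inside its pre-instantiated TagNode objects rather than in a separate map as done here.

```java
import java.util.TreeMap;

// Hypothetical stand-in (not part of Torello.HTML) for a canonical-token lookup.
final class CanonicalTokenSketch
{
    // Maps each lower-case token to the single String instance that should be reused.
    private static final TreeMap<String, String> canonical = new TreeMap<>();

    static void register(String tok)
    {
        String lc = tok.toLowerCase();
        canonical.putIfAbsent(lc, lc);
    }

    // Returns the shared instance for a known token, or null for an unknown one.
    static String checkout(String tag)
    { return canonical.get(tag.toLowerCase()); }

    public static void main(String[] args)
    {
        register("div");

        // Two separately allocated inputs resolve to the very same String object.
        String a = checkout(new String("div"));
        String b = checkout(new String("DIV"));

        System.out.println(a == b);
        System.out.println(checkout("unknown"));
    }
}
```

Callers that keep the returned reference instead of their own copy allow the duplicate input String's to be garbage collected.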
-
isTag
public static boolean isTag(java.lang.String tag)
Checks if a String is registered as a proper HTML tag according to the internally maintained lists.
View Tags List:
The HTML Elements which are listed (in the link below) indicate exactly what may be passed to this method's 'tag' parameter, and result in a return value of TRUE. This list is the complete list of HTML Element Names that are maintained, by default, in this class' internal Lookup Table of HTML Tags.
HTML Elements
Case Insensitive:
The test performed by this method shall ignore case.
Modifying this List:
The list of HTML Elements may, in fact, be altered. To add a new Element Name to the internal lookup table of valid HTML Elements, use addTag(String). To remove an HTML Element from the internal list, use removeTag(String).
- Returns:
TRUE if this is a valid HTML tag. NOTE: All HTML-5 Element-Tag String's will return TRUE as they are contained in the default internal list.
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the standard HTML Tags.  Just uses Java's TreeSet.contains(Object) method.

    return tags.contains(tag.toLowerCase());
-
isHTML5
public static boolean isHTML5(java.lang.String tok)
Checks if a String is a proper HTML-5 (only) tag. This list is rather short, and only contains HTML Elements which were added specifically for the release of HTML 5. Any HTML Element which is valid in both HTML Release 4 (or earlier) and HTML 5 will not result in TRUE being returned by this method.
View Tags List:
The HTML Elements which are listed (in the link below) indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of HTML-5 Element Names that are maintained, by default, in this class' internal Lookup Table of HTML-5 Tags.
Elements Added for HTML-5
Case Insensitive:
The test performed by this method shall ignore case.
- Parameters:
tok - Any HTML-Tag as a String.
- Returns:
TRUE if this is a tag that was added for HTML-5, and not included in HTML 4, or earlier.
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the HTML-5 Tags.  Just uses Java's TreeSet.contains(Object) method.

    return html5Tags.contains(tok.toLowerCase());
-
deprecated
public static boolean deprecated(java.lang.String tok)
Checks if a String is listed as an HTML Element that was deprecated for HTML 5.
View Tags List:
The HTML Elements which are listed (in the link below) indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Deprecated HTML Element Names that are maintained, by default, in this class' internal Lookup Table of Deprecated HTML Tags.
Elements Deprecated for HTML-5
Case Insensitive:
The test performed by this method shall ignore case.
- Parameters:
tok - Any HTML-Tag as a String.
- Returns:
TRUE if this tag was deprecated for HTML-5.
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the deprecated-for-HTML-5 Tags.  Just uses Java's TreeSet.contains(Object)
    // method.

    return deprecated.contains(tok.toLowerCase());
-
isSingleton
public static boolean isSingleton(java.lang.String tok)
This method checks whether a specific HTML element is an "opening and closing" element, such as P, DIV, SPAN, along with myriad others, or whether it is one of the (very few) "singleton HTML elements", such as the HTML <IMG SRC="..."> element, which may not have a closing tag. Such tags are also called "Self-Closing" tags.
View Tags List:
The HTML Elements which are listed (in the link below) indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Singleton Element Names that are maintained, by default, in this class' internal Lookup Table of Singleton Tags.
Singleton Elements
Case Insensitive:
The test performed by this method shall ignore case.
Modifying this List:
The list of Singleton HTML Elements may, in fact, be altered. To add a new Singleton HTML Element Name to the internal lookup table of valid Singleton Elements, use addSingleton(String). To remove an HTML Element from the internal list, use removeSingleton(String).
- Parameters:
tok - This is the HTML element name to be tested.
- Returns:
TRUE if this is a 'singleton' HTML Element - a.k.a., only OpeningTag versions of the element exist, because singleton HTML elements don't need / may not have a closing tag. Singleton examples include: IMG, HR, INPUT, etc. FALSE is returned if the tag is not a singleton element.
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the 'singleton' HTML Tags.  Just uses Java's TreeSet.contains(Object) method.

    return singletonTags.contains(tok.toLowerCase());
-
isBlock
public static boolean isBlock(java.lang.String tok)
This method checks whether a specific HTML element is among the 'Block' Tag elements list. An explanation of what a 'block' or 'inline' tag is, is beyond the scope of this document.
View Tags List:
The HTML Elements which are listed (in the link below) indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Block Element Names that are maintained, by default, in this class' internal Lookup Table of Block Tags.
HTML Block Elements
Case Insensitive:
The test performed by this method shall ignore case.
- Parameters:
tok - This is the HTML element name to be tested.
- Returns:
TRUE if this is a 'block' HTML Element, FALSE otherwise.
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the HTML 'Block' Tags.  Just uses Java's TreeSet.contains(Object) method.

    return blockTags.contains(tok.toLowerCase());
-
isInline
public static boolean isInline(java.lang.String tok)
This method checks whether a specific HTML element is among the 'Inline' Tag elements list. An explanation of what a 'block' or 'inline' tag is, is beyond the scope of this document.
View Tags List:
The HTML Elements which are listed (in the link below) indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Inline Element Names that are maintained, by default, in this class' internal Lookup Table of Inline Tags.
HTML Inline Elements
Case Insensitive:
The test performed by this method shall ignore case.
- Parameters:
tok - This is the HTML element name to be tested.
- Returns:
TRUE if this is an 'inline' HTML Element, FALSE otherwise.
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the HTML 'Inline' Tags.  Just uses Java's TreeSet.contains(Object) method.

    return inlineTags.contains(tok.toLowerCase());
-
getDescription
public static java.lang.String getDescription(java.lang.String tag)
Returns a brief, English-language description of an HTML Tag. These descriptions are stored in a small data-file.
Loading from JAR-File:
This method will attempt to load a particular data-file from the JAR-library into memory. This file contains a one-sentence description, stored as a java.lang.String, for each of the HTML Elements known to this class. Under normal operation, these String's remain on-disk, only.
- Parameters:
tag - Any valid HTML tag.
- Returns:
A short English-Language description of the Tag in HTML, or null if this tag is unknown.
- See Also:
loadDescriptions()
- Code:
- Exact Method Body:
    // Loads the descriptions map, ONLY IF they have not already been loaded into memory from
    // the JAR data-files

    loadDescriptions();

    return descriptions.get(tag.toLowerCase());
-
iterator
public static java.util.Iterator<java.lang.String> iterator()
Internally, tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.
Remove Unsupported:
In order to prevent accidental removal of HTML-Tags via the Iterator's 'remove()' method, the returned Iterator instance has been "wrapped" in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.
Data File Contents:
The contents of this Iterator may be viewed here:
HTML Elements
- Returns:
an Iterator<String> that iterates over all the Tag-String's in alphabetical order.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
    // Internally, this class has a private & static TreeSet<String> that stores a list
    // of all the standard HTML Tags.  Just uses Java's TreeSet.iterator() method.
    //
    // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
    //       TreeSet

    return new RemoveUnsupportedIterator<String>(tags.iterator());
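A wrapper of the kind described under "Remove Unsupported" can be written in a few lines. The class below, ReadOnlyIteratorSketch, is a hypothetical illustration and not the source of the library's RemoveUnsupportedIterator; it assumes the wrapper simply delegates hasNext() and next() and rejects remove() with an UnsupportedOperationException.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical stand-in (not part of Torello.HTML) for an iterator that forbids removal.
final class ReadOnlyIteratorSketch<E> implements Iterator<E>
{
    private final Iterator<E> inner;

    ReadOnlyIteratorSketch(Iterator<E> inner)
    { this.inner = inner; }

    @Override
    public boolean hasNext()
    { return inner.hasNext(); }

    @Override
    public E next()
    { return inner.next(); }

    // The wrapped collection can never be modified through this iterator.
    @Override
    public void remove()
    { throw new UnsupportedOperationException("remove() is not supported by this Iterator."); }

    public static void main(String[] args)
    {
        TreeSet<String> tags = new TreeSet<String>(Arrays.asList("div", "br", "span"));
        Iterator<String> it = new ReadOnlyIteratorSketch<String>(tags.iterator());

        // A TreeSet iterates in sorted order, so tags print alphabetically.
        while (it.hasNext()) System.out.println(it.next());

        try
            { it.remove(); }

        catch (UnsupportedOperationException e)
            { System.out.println("Removal rejected; set still has " + tags.size() + " tags."); }
    }
}
```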
-
iteratorDescriptions
public static java.util.Iterator<java.util.Map.Entry<java.lang.String,java.lang.String>> iteratorDescriptions()
Will build an Iterator that returns HTML tags and their text-String descriptions.
Data File Contents:
The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load is only performed on request. The contents of this data-file (and the list of Map.Entry's returned by the Iterator) may be viewed, here, by clicking the link below:
HTML Elements with Descriptions
Lazy Loading:
In this class, if the methods invoked do not require the Element-Description String-Data, then this extensive text-data will not be loaded into memory from the JAR data-files.
- Returns:
an Iterator that iterates the HTML-Tag / HTML-Tag-Description key-value pairs as instances of "Map.Entry<String, String>"
- See Also:
loadDescriptions(), RemoveUnsupportedIterator
- Code:
- Exact Method Body:
    loadDescriptions(); // Will only load if descriptions have not already been loaded.

    return new RemoveUnsupportedIterator<Map.Entry<String, String>>
        (descriptions.entrySet().iterator());
-
iteratorAddedForHTML5
public static java.util.Iterator<java.lang.String> iteratorAddedForHTML5()
Internally, HTML-5 tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.
Remove Unsupported:
In order to prevent accidental removal of HTML-5-Tags via the Iterator's 'remove()' method, the returned Iterator instance has been wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.
Data File Contents:
The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. This data is loaded as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed via the link below:
Elements Added for HTML-5
- Returns:
- an Iterator<String> that cycles through the list of HTML Tag-String's that were added in HTML-5.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
// Internally, this class has a private & static TreeSet<String> that stores a list
// of all the HTML-5 Tags.  Just uses Java's TreeSet.iterator() method.
//
// NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
//       TreeSet

return new RemoveUnsupportedIterator<String>(html5Tags.iterator());
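The "Remove Unsupported" wrapping idea can be sketched as follows. This is an illustration only, not the source of `RemoveUnsupportedIterator`: the class name `ReadOnlyIteratorSketch` is hypothetical, and the choice of `UnsupportedOperationException` is an assumption, since the documentation above only says that "an exception" is thrown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical stand-in for a remove-blocking iterator wrapper.
class ReadOnlyIteratorSketch<E> implements Iterator<E>
{
    private final Iterator<E> inner;

    ReadOnlyIteratorSketch(Iterator<E> inner)
    { this.inner = inner; }

    // Reads are delegated unchanged to the wrapped iterator.
    @Override public boolean hasNext()  { return inner.hasNext(); }
    @Override public E next()           { return inner.next(); }

    // Assumption: UnsupportedOperationException is the exception used.
    @Override public void remove()
    { throw new UnsupportedOperationException("remove() is not permitted"); }

    public static void main(String[] argv)
    {
        TreeSet<String> tags = new TreeSet<>(Arrays.asList("article", "nav", "section"));
        Iterator<String> iter = new ReadOnlyIteratorSketch<>(tags.iterator());

        iter.next();

        try
            { iter.remove(); }
        catch (UnsupportedOperationException e)
            { System.out.println("remove() rejected, set size is still " + tags.size()); }
    }
}
```

The wrapper is needed because the iterator returned directly by `TreeSet.iterator()` does support `remove()`, which would otherwise let a caller delete entries from the underlying set.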
-
iteratorDeprecatedForHTML5
public static java.util.Iterator<java.lang.String> iteratorDeprecatedForHTML5()
Internally, deprecated tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.
Remove Unsupported:
In order to prevent accidental removal of Deprecated-Tags via the Iterator's 'remove()' method, the returned Iterator instance has been wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.
Data File Contents:
The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. This data is loaded as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed via the link below:
Elements Deprecated for HTML-5
- Returns:
- an Iterator<String> that cycles through the list of HTML Tag-String's that were deprecated for HTML-5.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
// Internally, this class has a private & static TreeSet<String> that stores a list
// of all the deprecated Tags.  Just uses Java's TreeSet.iterator() method.
//
// NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
//       TreeSet

return new RemoveUnsupportedIterator<String>(deprecated.iterator());
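In contrast to the lazily loaded descriptions, the tag lists are described as being loaded when the class itself is loaded. A minimal sketch of that eager pattern, using a static initializer, might look like the following. The class name `EagerTagLoadSketch`, the `DATA` string, and the six tag names are illustrative assumptions; the real data-file format and contents are not shown in this documentation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.TreeSet;

class EagerTagLoadSketch
{
    // Assumption: a newline-separated string stands in for the JAR data-file.
    private static final String DATA = "center\nfont\nframe\nframeset\nstrike\ntt\n";

    private static final TreeSet<String> DEPRECATED = new TreeSet<>();

    // Runs once, when the JVM initializes this class.
    static
    {
        try (BufferedReader br = new BufferedReader(new StringReader(DATA)))
        {
            String line;

            while ((line = br.readLine()) != null)
                if (! line.trim().isEmpty()) DEPRECATED.add(line.trim());
        }
        catch (IOException e)
            { throw new UncheckedIOException(e); }
    }

    static boolean isDeprecated(String tok)
    { return DEPRECATED.contains(tok); }

    static int count()
    { return DEPRECATED.size(); }

    public static void main(String[] argv)
    {
        System.out.println("font deprecated? " + isDeprecated("font"));
        System.out.println("div deprecated?  " + isDeprecated("div"));
    }
}
```

Eager loading suits small lists that nearly every caller needs, since the set is ready before any static method can be invoked.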
-
iteratorSingletonTags
public static java.util.Iterator<java.lang.String> iteratorSingletonTags()
Internally, singleton / self-closing tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.
Remove Unsupported:
In order to prevent accidental removal of Singleton-Tags via the Iterator's 'remove()' method, the returned Iterator instance has been wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.
Data File Contents:
The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. This data is loaded as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed via the link below:
Singleton Elements
- Returns:
- an Iterator<String> that cycles through the list of HTML Tag-String's that qualify as singleton elements, and may not have closing-tag versions.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
// Internally, this class has a private & static TreeSet<String> that stores a list
// of all the HTML 'Singleton' Tags.  Just uses Java's TreeSet.iterator() method.
//
// NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
//       TreeSet

return new RemoveUnsupportedIterator<String>(singletonTags.iterator());
-
iteratorBlockTags
public static java.util.Iterator<java.lang.String> iteratorBlockTags()
Internally, "HTML Block Tags" are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.
Remove Unsupported:
In order to prevent accidental removal of Block-Tags via the Iterator's 'remove()' method, the returned Iterator instance has been wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.
Data File Contents:
The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. This data is loaded as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed via the link below:
HTML Block Elements
- Returns:
- an Iterator<String> that cycles through the list of HTML Tag-String's that qualify as block elements.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
// Internally, this class has a private & static TreeSet<String> that stores a list
// of all the HTML 'Block' Tags.  Just uses Java's TreeSet.iterator() method.
//
// NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
//       TreeSet

return new RemoveUnsupportedIterator<String>(blockTags.iterator());
-
iteratorInlineTags
public static java.util.Iterator<java.lang.String> iteratorInlineTags()
Internally, "HTML Inline Tags" are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.
Remove Unsupported:
In order to prevent accidental removal of Inline-Tags via the Iterator's 'remove()' method, the returned Iterator instance has been wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.
Data File Contents:
The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. This data is loaded as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed via the link below:
HTML Inline Elements
- Returns:
- an Iterator<String> that cycles through the list of HTML Tag-String's that qualify as inline elements.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
// Internally, this class has a private & static TreeSet<String> that stores a list
// of all the HTML 'Inline' Tags.  Just uses Java's TreeSet.iterator() method.
//
// NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
//       TreeSet

return new RemoveUnsupportedIterator<String>(inlineTags.iterator());
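Because the iterators returned by these methods reject remove(), a caller who wants an editable list of tags would need to copy the contents into a collection of their own. The sketch below shows one way that could be done. The class name `TagCopySketch` and helper `copyOf` are hypothetical, and a plain `List` iterator stands in for a call such as `HTMLTags.iteratorInlineTags()` so that the example runs without the library.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

class TagCopySketch
{
    // Drains any iterator into a new, independent, sorted set.
    static <E extends Comparable<? super E>> TreeSet<E> copyOf(Iterator<E> iter)
    {
        TreeSet<E> copy = new TreeSet<>();
        iter.forEachRemaining(copy::add);
        return copy;
    }

    public static void main(String[] argv)
    {
        // Assumption: this list stands in for the library's inline-tag data.
        List<String> inlineTags = Arrays.asList("span", "a", "em");

        TreeSet<String> myTags = copyOf(inlineTags.iterator());

        // Changes affect only the private copy, never the source data.
        myTags.remove("em");

        System.out.println("copy: " + myTags + ", source: " + inlineTags);
    }
}
```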
-
-