Package Torello.HTML

Class HTMLTags


  • public class HTMLTags
    extends java.lang.Object
    Primary "HTML-5 Tags" class - keeps a list of all 122 Tags in a TreeSet<String>, along with many accessor methods that are used by the HTML Parser, or potentially any class or function that may need this list.

    The purpose of this class is to maintain the list of valid HTML tags in Java memory. There are under 200 of these, and they aid the HTML Parse class in picking valid HTML tags when scraping. This class also maintains in memory some "pre-instantiated" Java-HTML HTMLNode - TagNode instances. The class TagNode contains only "final variables" (is immutable) because at least 80% of HTML on any given page is just a tag / element instance that never needs to change in memory. Call the public TagNode hasTag(String, TC) to obtain a valid instance of class TagNode.
    • Method Detail

      • printAllToTerminal

        public static void printAllToTerminal​(boolean printDescriptions)
        This simply prints all data that is stored in the JAR file to terminal output. It delegates to printAll(Appendable, boolean), passing 'System.out' as the Appendable instance. Because 'System.out' does not throw IOException when printing, the checked exception is caught here, for convenience.
        Parameters:
        printDescriptions - If this is set to TRUE, then it will ensure that the JAR Descriptions-Data-File is loaded into memory. If not, then the description-String's will not be loaded. These String's contain a one-sentence-long text-description of each HTML Element listed in this class. If this parameter is FALSE the data-file will not be visited, and the HTML Element descriptions will not be sent to the output stream.
        See Also:
        printAll(Appendable, boolean)
        Code:
        Exact Method Body:
         try { printAll(System.out, printDescriptions); } catch (IOException e) { }
        
      • printAll

        public static void printAll​(java.lang.Appendable a,
                                    boolean printDescriptions)
                             throws java.io.IOException
        This simply prints all data that is stored in the JAR data-file to a java.lang.Appendable.
        Parameters:
        a - This parameter provides an instance that will receive the text output. This parameter may not be null, or a NullPointerException will throw. This expects an implementation of Java's java.lang.Appendable interface which allows for a wide range of options when logging intermediate messages.
        Class or Interface Instance              Use & Purpose
        'System.out'                             Sends text to the standard-out terminal
        Torello.Java.StorageWriter               Sends text to System.out, and saves it, internally.
        FileWriter, PrintWriter, StringWriter    General purpose java text-output classes
        FileOutputStream, PrintStream            More general-purpose java text-output classes

        Checked IOException:
        The Appendable interface requires that the Checked-Exception IOException be caught when using its append(...) methods.
        printDescriptions - If this is set to TRUE, then this method ensures that the JAR Descriptions-Data-File has been loaded into memory; if the description-String's have not already been loaded, they will be loaded now. These String's contain a one-sentence-long text-description of each HTML Element listed in this class. If this parameter is FALSE the data-file will not be visited, and the HTML Element descriptions will not be sent to the output stream.
        Throws:
        java.io.IOException - The general purpose interface java.lang.Appendable requires checking for an IOException throw when printing information. If the 'Appendable' provided to this method fails, this exception shall propagate out.
        Code:
        Exact Method Body:
         a.append("TAGS: ");
         for (String tag : tags)                     a.append(tag + ", ");
        
         a.append("\n\nDEPRECATED: ");
         for (String deprecatedTag : deprecated)     a.append(deprecatedTag + ", ");
        
         a.append("\n\nHTML5: ");
         for (String html5Tag : html5Tags)           a.append(html5Tag + ", ");
        
         a.append("\n\nSINGLETON-TAGS: ");
         for (String selfClosingTag : singletonTags) a.append(selfClosingTag + ", ");
        
         a.append("\n\nBLOCK-TAGS: ");
         for (String blockTag : blockTags)           a.append(blockTag + ", ");
        
         a.append("\n\nINLINE-TAGS: ");
         for (String inlineTag : inlineTags)         a.append(inlineTag + ", ");
        
         a.append("\n\ntagNodesOpening: ");
         for (String s : tagNodesOpening.keySet())
             a.append(tagNodesOpening.get(s).toString() + ", ");
        
         a.append("\n\ntagNodesClosing: ");
         for (String s : tagNodesClosing.keySet())
             a.append(tagNodesClosing.get(s).toString() + ", ");
        
         a.append("\n\ntagNodesOpeningUC: ");
         for (String s : tagNodesOpeningUC.keySet())
             a.append(tagNodesOpeningUC.get(s).toString() + ", ");
        
         a.append("\n\ntagNodesClosingUC: ");
         for (String s : tagNodesClosingUC.keySet())
             a.append(tagNodesClosingUC.get(s).toString() + ", ");
        
         if (printDescriptions)
         {
             loadDescriptions(); // Will only load if descriptions have not already been loaded.
        
             a.append("\n\n");
             for (String s : descriptions.keySet())
                 a.append(s + ((s.length() >= 7) ? ":\t" : ":\t\t") + descriptions.get(s) + "\n");
         }
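
        The Appendable-based design described above can be tried without a terminal. What follows is a minimal sketch, not the library's code: the class 'AppendableSketch', its three hard-coded tags, and the helper 'printAllToString' are hypothetical stand-ins. They mirror the shape of printAll(Appendable, boolean), which lets the IOException propagate, and of printAllToTerminal(boolean), which catches it.

```java
import java.io.IOException;
import java.util.TreeSet;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class AppendableSketch
{
    // A tiny substitute for the internal tag list (assumption, for illustration only).
    private static final TreeSet<String> tags = new TreeSet<>();
    static { tags.add("div"); tags.add("br"); tags.add("span"); }

    // Same shape as printAll: writes to any Appendable, and lets IOException propagate.
    static void printAll(Appendable a) throws IOException
    {
        a.append("TAGS: ");
        for (String tag : tags) a.append(tag + ", ");
    }

    // Same shape as printAllToTerminal: the checked exception is caught for convenience.
    // A StringBuilder is used as the Appendable, so the text can be inspected afterwards.
    static String printAllToString()
    {
        StringBuilder sb = new StringBuilder();
        try { printAll(sb); } catch (IOException e) { /* StringBuilder does not throw this */ }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(printAllToString());
    }
}
```

        Because the tags live in a TreeSet, they are appended in alphabetical order, each followed by a comma and a space.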
        
      • loadDescriptions

        public static void loadDescriptions()
        The data-structure (a java.util.TreeMap) that holds the individual text-descriptions of each HTML tag is not loaded into memory from the JAR file automatically. When the class-loader for this class loads this class, it employs a "Lazy Loading" Heuristic to prevent unnecessary memory-usage.

        Instead, if a programmer would like to start printing information about HTML-Tags, and would like to include a short, one or two sentence description of the HTML Elements (using the method getDescription(String)), then and only then will this method 'loadDescriptions' be invoked to load those one-sentence HTML-Tag Summaries.

        As an aside, the purpose of keeping these sentences in a jar file is that they are rather long, and are rarely used at all - unless you are interested in doing some reporting. By keeping them in the jar-file (unless requested) some amount of "over-head" resource usage is saved.

        If the text-descriptions have already loaded, this method will just exit and return, rather than loading them a second time.
        See Also:
        LFEC.readObjectFromFile_JAR(Class, String, boolean, Class)
        Code:
        Exact Method Body:
         if (descriptions.size() == 0)
        
             descriptions.putAll((TreeMap<String, String>) LFEC.readObjectFromFile_JAR
                 (HTMLTags.class, "data-files/HTMLTagDescriptions.tmdat", true, TreeMap.class));
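
        The "load only on first request" behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: 'LazyDescriptionsSketch', 'readDataFile' and 'loadCount' are hypothetical names, and the two descriptions are made-up sample text standing in for the serialized TreeMap that the real method reads from the JAR.

```java
import java.util.TreeMap;

// Hypothetical stand-in; the real class reads a serialized TreeMap from the JAR.
class LazyDescriptionsSketch
{
    // Starts out empty; filled on the first request only.
    private static final TreeMap<String, String> descriptions = new TreeMap<>();

    // Counts how many times the (simulated) data-file was read.
    static int loadCount = 0;

    // ASSUMPTION: simulates the expensive read of the data-file with sample text.
    private static TreeMap<String, String> readDataFile()
    {
        loadCount++;
        TreeMap<String, String> m = new TreeMap<>();
        m.put("br", "Inserts a single line break.");
        m.put("div", "Defines a section in a document.");
        return m;
    }

    // Loads the map only if it is still empty; otherwise returns immediately.
    static void loadDescriptions()
    {
        if (descriptions.size() == 0) descriptions.putAll(readDataFile());
    }

    // Case-insensitive lookup that triggers the lazy load when needed.
    static String getDescription(String tag)
    {
        loadDescriptions();
        return descriptions.get(tag.toLowerCase());
    }

    public static void main(String[] args)
    {
        System.out.println(getDescription("BR"));
        System.out.println(getDescription("div"));
        System.out.println("data-file reads: " + loadCount);
    }
}
```

        However many lookups are made, the simulated data-file is read at most once.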
        
      • maxTokenLength

        public static byte maxTokenLength()
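        This returns the String-length of the longest HTML token saved in the internal TreeSet<String> of HTML Tokens. The value is cached, and is updated when tags are added or removed, rather than being recomputed on each call.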
        Returns:
        The length of the longest HTML Token String.
        Code:
        Exact Method Body:
         return MAX_TOKEN_LENGTH;
        
      • addTag

        public static boolean addTag​(java.lang.String htmlTag)
        Adds a new HTML element to the list of elements that may be parsed, created and checked. This is not always advisable, as the complete list of HTML-5 tags is already internally stored, but if you would like to add or remove certain tags, this method and removeTag(String) are provided for doing so.
        Parameters:
        htmlTag - Any HTML tag that you would like to see parsed by the HTML page parser. If the parser encounters a construct such as: <YOUR_NEW_TAG ATTRIBUTES="..."> it will treat that as a new HTML element.
        Returns:
        TRUE if the element was indeed a new element to the list, and FALSE if the HTML-tokens-list already contained this HTML element. If so, this method call will just return gracefully - with no changes being made to the underlying list of acceptable HTML tokens.
        Throws:
        HTMLTokException - If the String parameter 'htmlTag' contains non-alpha-numeric characters, begins with a number, or is longer than 127 characters.
        Code:
        Exact Method Body:
         Matcher m = HTML_TAG_ALPHA_NUMERIC.matcher(htmlTag);
        
         if ((! m.find()) || (htmlTag.length() != m.group().length())) throw new HTMLTokException(
             "The HTML-Tag Parameter that was passed [" + htmlTag + "] doesn't conform to the " +
             "expected requirements for HTML-Tags.  It may only contain alpha-numeric characters, " +
             "and it must not begin with a number."
         );
        
         String tag = htmlTag.trim().toLowerCase();
        
         if (tag.length() > 127) throw new HTMLTokException(
             "The (trimmed) HTML-Tag Parameter that was passed [" + tag + "] is longer than 127 " +
             "characters.  This is not allowed here."
         );
        
         boolean ret = tags.add(tag);
        
         if (ret)
         {
             // NOTE: These four private, static fields are of type TreeMap<String, TagNode>
             //       tagNodesOpening, tagNodesOpeningUC, tagNodesClosing, tagNodesClosingUC
             //
             //       They can provide a significant savings for the Garbage Collector.  For any
             //       HTML Element that does not have any attributes, and has a standard 'case'
             //       (all upper-case, or all lower-case), the parser will "re-use" pre-existing
             //       instances of class TagNode, rather than building a new one.
             //
             // FOR EXAMPLE: The parser will "re-use" the same instance of a "<BR>" TagNode, or
             //              any one, actually, as long as it does not have attributes.  Since 40%
             //              to 50% of class TagNode are "TC.ClosingTags", this can be a significant
             //              improvement
        
             // Build a Lower-Case, Pre-Instantiated, Zero-Attribute version of the HTML Element
             // Uses specialized package-only visible TagNode constructor.
             // Not available to the general public
        
             tagNodesOpening.put(tag, new TagNode(tag, TC.OpeningTags));
             tagNodesClosing.put(tag, new TagNode(tag, TC.ClosingTags));
        
             // Build an Upper-Case, Pre-Instantiated, Zero-Attribute version of the HTML Element
             tag = tag.toUpperCase();
             tagNodesOpeningUC.put(tag, new TagNode("<" + tag + ">"));
             tagNodesClosingUC.put(tag, new TagNode("</" + tag + ">"));
        
             // Update the MAX_TOKEN_LENGTH - but only if necessary.
             if (tag.length() > MAX_TOKEN_LENGTH) MAX_TOKEN_LENGTH = (byte) tag.length();
         }
        
         return ret;
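
        The validate-then-register sequence above can be sketched in isolation. This is a hedged illustration only: 'AddTagSketch' is a hypothetical class, the regular expression is an assumption (the actual HTML_TAG_ALPHA_NUMERIC pattern is not shown on this page), and IllegalArgumentException is used in place of HTMLTokException so the example compiles on its own.

```java
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class AddTagSketch
{
    // ASSUMPTION: letters and digits only, and the first character must be a letter.
    private static final Pattern NAME = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    private static final TreeSet<String> tags = new TreeSet<>();

    // Cached length of the longest registered tag; a byte, hence the 127 limit.
    private static byte maxLen = 0;

    static boolean addTag(String htmlTag)
    {
        // The whole input must match the pattern, not just a prefix of it.
        if (! NAME.matcher(htmlTag).matches())
            throw new IllegalArgumentException("Invalid tag name: [" + htmlTag + "]");

        String tag = htmlTag.toLowerCase();

        if (tag.length() > 127)
            throw new IllegalArgumentException("Tag name is longer than 127 characters");

        // TreeSet.add returns false when the tag was already registered.
        boolean added = tags.add(tag);

        // Update the cached maximum only when a genuinely new, longer tag arrives.
        if (added && tag.length() > maxLen) maxLen = (byte) tag.length();

        return added;
    }

    static byte maxTokenLength() { return maxLen; }

    public static void main(String[] args)
    {
        System.out.println(addTag("MyWidget"));   // first registration
        System.out.println(addTag("mywidget"));   // same tag, different case
        System.out.println(maxTokenLength());
    }
}
```

        Tag names are stored in lower-case, so a second registration that differs only in case is reported as a duplicate.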
        
      • removeTag

        public static boolean removeTag​(java.lang.String htmlTag)
        Removes an HTML element from the list of elements that may be parsed, created and checked. This is not always advisable, as the complete list of HTML-5 tags is already internally stored, but if you would like to add or remove certain tags, this method and addTag(String) are provided for doing so.
        Parameters:
        htmlTag - Any HTML tag that you no longer want to see parsed by the HTML page parser. HTML nodes that contain this tag as their element will cause the parser to ignore the node, and treat it like a TextNode.
        Returns:
        TRUE if the element was removed, and FALSE if it was not - because it wasn't in the HTML-tokens-list in the first place.
        Code:
        Exact Method Body:
         String  tag = htmlTag.trim().toLowerCase();
         boolean ret = tags.remove(tag);
        
         if (ret)
         {
             // "Lower-Case" and "Pre-Instantiated" (Zero-Attributes) version of TagNode
             tagNodesOpening.remove(tag); 
             tagNodesClosing.remove(tag);
        
             tag = tag.toUpperCase();
        
             // "Upper-Case", Pre-Instantiated, Zero-Attribute version of TagNode
             tagNodesOpeningUC.remove(tag); 
             tagNodesClosingUC.remove(tag);
        
             // After removal, there is a small chance the
             // MAX_TOKEN_LENGTH is, now, shorter
        
             if (tag.length() == MAX_TOKEN_LENGTH) setMaxTokenLength();
         }
        
         return ret;
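
        The bookkeeping for the cached maximum length can be sketched as below. The class 'MaxLengthSketch' and its method 'recomputeMax' are hypothetical; in particular, the body of the library's setMaxTokenLength() is not shown on this page, so the full re-scan here is only an assumption about what such a method would do.

```java
import java.util.TreeSet;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class MaxLengthSketch
{
    private final TreeSet<String> tags = new TreeSet<>();

    // Cached length of the longest tag currently in the set.
    private int maxLen = 0;

    boolean add(String htmlTag)
    {
        String tag = htmlTag.trim().toLowerCase();
        boolean added = tags.add(tag);
        if (added && tag.length() > maxLen) maxLen = tag.length();
        return added;
    }

    boolean remove(String htmlTag)
    {
        String tag = htmlTag.trim().toLowerCase();
        boolean removed = tags.remove(tag);

        // Only removing a tag of maximal length can lower the maximum,
        // so the full scan is skipped in every other case.
        if (removed && tag.length() == maxLen) recomputeMax();

        return removed;
    }

    // ASSUMPTION: a plain scan over the remaining tags.
    private void recomputeMax()
    {
        maxLen = 0;
        for (String t : tags) if (t.length() > maxLen) maxLen = t.length();
    }

    int maxTokenLength() { return maxLen; }

    public static void main(String[] args)
    {
        MaxLengthSketch s = new MaxLengthSketch();
        s.add("blockquote"); s.add("div"); s.add("br");
        System.out.println(s.maxTokenLength());
        s.remove("BLOCKQUOTE");
        System.out.println(s.maxTokenLength());
    }
}
```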
        
      • addSingleton

        public static boolean addSingleton​(java.lang.String htmlTagSingleton)
        Adds an HTML-element to the list of singleton HTML-elements. A singleton may only have an "opening" tag, and may not have a closing-version tag. For instance the <IMG SRC="..."> is the classic-singleton; its data is all stored internally as attribute values.
        Parameters:
        htmlTagSingleton - Any HTML tag that you would like to see listed as a singleton HTML-element.
        Returns:
        TRUE if the element was indeed a new element to the list, and FALSE if the HTML-singleton tokens-list already contained this HTML element. If so, this method call will just return gracefully - with no changes being made to the underlying list of singleton tokens.
        Throws:
        java.lang.IllegalArgumentException - If you have tried to "register" a singleton tag that isn't a fundamental HTML-tag, then this method will throw an exception directing you to first add your token to the HTML-tags/tokens internal-list.
        Code:
        Exact Method Body:
         String tag = htmlTagSingleton.trim().toLowerCase();
        
         if (! tags.contains(tag)) throw new IllegalArgumentException(
             "The HTML token you have attempted to add [" + tag + "] may not be added to the " + 
             "singletons list, because it is not a known/registered HTML token, as of now.  " +
             "First, make sure it is listed as one of the parser's tokens by calling " +
             "'addTag(token)', and then invoking this method with that token."
         );
        
         // Internally, there is a private & static TreeSet<String> which saves the names
         // of all HTML 'singleton' elements.  Use Java's TreeSet.add(E) method
        
         return singletonTags.add(tag);
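
        The ordering requirement (register the tag first, then mark it as a singleton) can be illustrated with a small stand-in registry. 'SingletonRegistrySketch' and the tag name "spacer" are hypothetical, chosen only for this sketch; the real methods are static and operate on the library's internal lists.

```java
import java.util.TreeSet;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class SingletonRegistrySketch
{
    private final TreeSet<String> tags          = new TreeSet<>();
    private final TreeSet<String> singletonTags = new TreeSet<>();

    boolean addTag(String htmlTag)
    {
        return tags.add(htmlTag.trim().toLowerCase());
    }

    boolean addSingleton(String htmlTagSingleton)
    {
        String tag = htmlTagSingleton.trim().toLowerCase();

        // A singleton must already be a known tag; otherwise fail fast.
        if (! tags.contains(tag)) throw new IllegalArgumentException(
            "[" + tag + "] is not a registered tag; call addTag first."
        );

        return singletonTags.add(tag);
    }

    boolean removeSingleton(String htmlTagSingleton)
    {
        return singletonTags.remove(htmlTagSingleton.trim().toLowerCase());
    }

    boolean isSingleton(String tok)
    {
        return singletonTags.contains(tok.toLowerCase());
    }

    public static void main(String[] args)
    {
        SingletonRegistrySketch r = new SingletonRegistrySketch();
        r.addTag("spacer");                         // step 1: register the tag
        System.out.println(r.addSingleton("SPACER")); // step 2: mark it as a singleton
        System.out.println(r.isSingleton("Spacer"));
    }
}
```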
        
      • removeSingleton

        public static boolean removeSingleton​(java.lang.String htmlTagSingleton)
        Removes an HTML-element from the list of singleton HTML-elements. A singleton may only have an "opening" tag, and may not have a closing-version tag. For instance the <IMG SRC="..."> is the classic-singleton; its data is all stored internally as attribute values.
        Parameters:
        htmlTagSingleton - Any HTML tag that you no longer want to see in the HTML-singleton tokens-list.
        Returns:
        TRUE if the element was removed, and FALSE if it was not - because it wasn't in the HTML-Singleton tokens-list in the first place.
        Code:
        Exact Method Body:
         String tag = htmlTagSingleton.trim().toLowerCase();
        
         // Internally, there is a private & static TreeSet<String> which saves the names
         // of all HTML 'singleton' elements.  Use Java's TreeSet.remove(Object) method
        
         return singletonTags.remove(tag);
        
      • hasTag

        public static TagNode hasTag​(java.lang.String tag,
                                     TC openOrClosed)
        The purpose of this function/method is to provide a little "optimization." Since 100% of class TagNode information is stored as constant/final - this class facilitates instantiating only one copy of each node when building HTML page node-Vectors.

        Internal to this class are lookup tables holding a pre-instantiated TagNode for each and every HTML-Tag available - both in upper-case tag-versions, and also in lower-case. For each case there is an opening-version of the TagNode, and also a closing-version of the same TagNode.

        This does, indeed, make a total of four pre-instantiated TagNode's for each HTML-Tag that are stored within the Java-HTML JAR File. There is a java.util.TreeMap that is holding these serialized-TagNode instances. This TreeMap has also been serialized and saved in the Java-HTML JAR, and it is loaded into memory by the Class-Loader as soon as an invocation to an HTML Method is made.

        It is not mandatory to "reuse" instantiated HTML TagNode's, but for memory management, garbage-collection efficiency, and other optimizations, the classes in this package use the pre-instantiated versions of these objects whenever possible.
        Parameters:
        tag - Any valid HTML tag. If the String passed is not a valid HTML tag, then this method will return null.
        openOrClosed - If TC.OpeningTags is passed, then an "open" version of the HTML tag will be returned, and if TC.ClosingTags is passed, then a closing version will be returned. Passing TC.Both is not allowed, and causes an IllegalArgumentException to be thrown. Passing null causes a NullPointerException to be thrown.
        Returns:
        An opening (or closing) TagNode - or null if the passed String tag does not represent any valid HTML-Tag. Null is also returned when the closing version of a singleton tag is requested.
        Code:
        Exact Method Body:
         // FAIL-FAST: Check Input's immediately.  Throw Exception for invalid input.
         if (openOrClosed == null)
             throw new NullPointerException
                 ("Parameter 'openOrClosed' is null, but this is not allowed.");
        
         if (openOrClosed == TC.Both)
             throw new IllegalArgumentException
                 ("Parameter 'openOrClosed' was specified as TC.Both, but this is not allowed here.");
        
         // IMPORTANT NOTE:  For Singleton-Tags: There is no closing-version, so one SHOULD NOT be
         // requested.  (There is no '</IMG>' tag!)  However, this method DOES NOT throw
         // IllegalArgumentException in this case, but rather it just exits gracefully, and returns
         // null.
        
         String tagLC = tag.toLowerCase();
        
         if (singletonTags.contains(tagLC) && (openOrClosed == TC.ClosingTags)) return null;
        
         // First, Check if the 'tag' is all lower-case.  If it is, the string would be identical to
         // the 'tagLC' variable we have just created.
        
         if (tagLC.equals(tag)) 
         {
             // Debugging Information, Debug-println.  Un-comment to follow.  DO NOT DELETE THIS LINE.
             // System.out.println("Used a pre-instantiated TagNode, Lower-Case TreeMap");
        
             return (openOrClosed == TC.OpeningTags)
                 ? tagNodesOpening.get(tag)
                 : tagNodesClosing.get(tag);
         }
        
         // Now, here, the variable could not have been all-lower-case.  NEXT, Check if it is
         // all-upper-case
         //
         // NOTE: There are pre-defined tables that include pre-instantiated TagNode's - both for
         //       lower-case tags and for upper-case tags.
        
         String tagUC = tag.toUpperCase();
        
         if (tagUC.equals(tag)) 
         {
             // Debugging Information, Debug-println.  Un-comment to follow.  DO NOT DELETE THIS LINE.
             // System.out.println("Used a pre-instantiated TagNode, Upper-Case TreeMap");
        
             return (openOrClosed == TC.OpeningTags)
                 ? tagNodesOpeningUC.get(tag)
                 : tagNodesClosingUC.get(tag);
         }
        
         // SPECIAL CASE: (Very Rare / Unlikely, but possible)  The user has created an HTML Element
         // that has some lower-case alphabet letters, and some upper-case as well.  This does not
         // guarantee that it is a valid HTML Token, though, so check
         //
         // FOR EXAMPLE: If somebody typed <SeCtIoN>, we need to preserve the case, no matter how
         //              bizarre.  In such a case, a pre-packaged TagNode cannot be used, and instead
         //              a new TagNode must be instantiated.
                
         if (openOrClosed == TC.OpeningTags)
        
             return (tagNodesOpening.get(tagLC) == null)
                 ? null
                 : new TagNode("<" + tag + ">");
        
         else 
        
             return (tagNodesClosing.get(tagLC) == null)
                 ? null
                 : new TagNode("</" + tag + ">");
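
        The case-based lookup above can be sketched in a reduced form. This is a simplified illustration, not the library's code: 'FlyweightSketch' and its nested 'Node' class are hypothetical stand-ins for HTMLTags and TagNode, only two tags are registered, and only the opening-tag half of the logic is shown.

```java
import java.util.TreeMap;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class FlyweightSketch
{
    // Minimal immutable node, standing in for class TagNode.
    static final class Node
    {
        final String str;
        Node(String str) { this.str = str; }
    }

    // Pre-instantiated nodes: one table for lower-case, one for upper-case.
    private static final TreeMap<String, Node> openingLC = new TreeMap<>();
    private static final TreeMap<String, Node> openingUC = new TreeMap<>();

    static
    {
        for (String t : new String[] { "div", "span" })
        {
            openingLC.put(t, new Node("<" + t + ">"));
            openingUC.put(t.toUpperCase(), new Node("<" + t.toUpperCase() + ">"));
        }
    }

    static Node opening(String tag)
    {
        // All lower-case: hand out the shared lower-case instance.
        String lc = tag.toLowerCase();
        if (lc.equals(tag)) return openingLC.get(tag);

        // All upper-case: hand out the shared upper-case instance.
        String uc = tag.toUpperCase();
        if (uc.equals(tag)) return openingUC.get(tag);

        // Mixed case: preserve the spelling, so a fresh node is required,
        // but only if the lower-case form is a known tag.
        return openingLC.containsKey(lc) ? new Node("<" + tag + ">") : null;
    }

    public static void main(String[] args)
    {
        System.out.println(opening("div") == opening("div"));   // shared instance
        System.out.println(opening("DiV") == opening("DiV"));   // fresh instances
        System.out.println(opening("DiV").str);
    }
}
```

        Because the shared nodes are immutable, handing the same instance to many callers is safe; only mixed-case input costs a new allocation.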
        
      • getTag_MEM_HEAP_CHECKOUT_COPY

        public static java.lang.String getTag_MEM_HEAP_CHECKOUT_COPY​
                    (java.lang.String tag)
        
        This is an optimized, internal method that is used to prevent lots of duplicate HTML token-String's from being created by the parser. Internally, there ought to be just one-instance of String's like: "img", "br", "div", etc... This is used by the parser to reuse an already instantiated token String.

        This method probably has relatively little use outside of the internal HTML parser code.
        Parameters:
        tag - This is an HTML token. An identical String to this 'token' String, but possibly a different memory reference on the heap, shall be returned.
        Returns:
        The returned String shall satisfy the following:

        • assert(tag.equals(returnedString)); // Identical String is returned

        • assert(! (tag == returnedString)); // Probably a different memory allocation on the heap. PROBABLY!
          Note that Java does not make any contracts regarding String references! (This can only help...)


        IMPORTANT: If the tag passed is not a valid HTML-Tag, then this method shall return null.
        Code:
        Exact Method Body:
         // Obviously, for the 200 or so "pre-instantiated" (having-no-attributes) instances of
         // class TagNode that are kept, internally, in the data-structures of this class,
         // 'HTMLTags'  We cannot retrieve a "pre-allocated" copy of the tag-as-a-string from
         // the heap, because we are building the data-file for the first time!
        
         if (BUILDING_DATA_FILE___SKIP_OPTIMIZATION_TEMPORARILY) return tag.toLowerCase();
        
         TagNode tn = tagNodesOpening.get(tag.toLowerCase());
        
         // If the tag isn't found, make sure not to throw NullPointerException!
         if (tn == null) return null;
        
         // This "version" (of the exact same html-element-name is already on the heap)
         // Obviously, because, variable 'tn' has already been instantiated and is in the TreeMap
         // If this EXACT SAME REFERENCE IS USED FOR ALL "TagNode.tok" instances, quite a bit of 
         // wasted-space in the heap's lookup table will be eliminated as the same "token"
         // (which is the name of the HTML Element: "div," "img," "span," etc...) is reused over 
         // and over and over again.  Helps a little bit!  Not that complicated!
        
         return tn.tok;
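
        The idea of handing back one shared String instance per tag name can be sketched as follows. This is an illustration under assumptions: 'CanonicalTokenSketch' and 'checkout' are hypothetical names, and a map from each name to itself replaces the lookup through the pre-instantiated TagNode table that the method body above performs.

```java
import java.util.TreeMap;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class CanonicalTokenSketch
{
    // Maps each tag name to the single String instance that should be reused.
    private static final TreeMap<String, String> canonical = new TreeMap<>();

    static
    {
        for (String t : new String[] { "div", "img", "br" }) canonical.put(t, t);
    }

    // Returns the shared instance for a known tag, or null for an unknown one.
    static String checkout(String tag)
    {
        return canonical.get(tag.toLowerCase());
    }

    public static void main(String[] args)
    {
        // 'new String' forces two distinct objects with equal contents.
        String a = new String("div");
        String b = new String("div");

        System.out.println(a == b);                       // distinct objects
        System.out.println(checkout(a) == checkout(b));   // one shared object
    }
}
```

        In this sketch the shared instance is always the lower-case spelling, so an upper-case argument yields a String that differs from it in case.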
        
      • isTag

        public static boolean isTag​(java.lang.String tag)
        Checks if a String is registered as a proper HTML tag according to the internally maintained lists.

        View Tags List:
        The HTML Elements which are listed (in the link below), indicate exactly what may be passed to this method's 'tag' parameter, and result in a return value of TRUE. This list is the complete list of HTML Element Names that are maintained, by default, in this class' internal Lookup Table of HTML Tags.

        HTML Elements

        Case Insensitive:
        The test performed by this method shall ignore case.

        Modifying this List:
        The list of HTML Elements may, in fact, be altered. To add a new Element Name to the internal lookup table of valid HTML Elements, use addTag(String). To remove an HTML Element from the internal list, use removeTag(String).
        Returns:
        TRUE if this is a valid HTML tag. NOTE: All HTML-5 Element-Tag Strings will return TRUE as they are contained in the default internal list.
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the standard HTML Tags.  Just uses Java's TreeSet.contains(Object) method.
        
         return tags.contains(tag.toLowerCase());
        
      • isHTML5

        public static boolean isHTML5​(java.lang.String tok)
        Checks if a String is a proper HTML-5 (only) tag. This list is rather short, and only contains HTML Elements which were added specifically for the release of HTML 5. Any HTML Element which is both a valid HTML Release 4 (or earlier) and an HTML 5 Element will not result in TRUE being returned by this method.

        View Tags List:
        The HTML Elements which are listed (in the link below), indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of HTML-5 Element Names that are maintained, by default, in this class' internal Lookup Table of HTML-5 Tags.

        Elements Added for HTML-5

        Case Insensitive:
        The test performed by this method shall ignore case.
        Parameters:
        tok - Any HTML-Tag as a String.
        Returns:
        TRUE if this is a tag that was added for HTML-5, and not included in HTML 4, or earlier
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML-5 Tags.  Just uses Java's TreeSet.contains(Object) method.
        
         return html5Tags.contains(tok.toLowerCase());
        
      • deprecated

        public static boolean deprecated​(java.lang.String tok)
        Checks if a String is listed as an HTML Element that was deprecated for HTML 5.

        View Tags List:
        The HTML Elements which are listed (in the link below), indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Deprecated HTML Element Names that are maintained, by default, in this class' internal Lookup Table of Deprecated HTML Tags.

        Elements Deprecated for HTML-5

        Case Insensitive:
        The test performed by this method shall ignore case.
        Parameters:
        tok - Any HTML-Tag as a String.
        Returns:
        TRUE if this tag was deprecated for HTML-5
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the deprecated-for-HTML-5 Tags.  Just uses Java's TreeSet.contains(Object)
         // method.
        
         return deprecated.contains(tok.toLowerCase());
        
      • isSingleton

        public static boolean isSingleton​(java.lang.String tok)
        This method checks whether a specific HTML element is an "opening and closing" element, such as: P, DIV, SPAN, along with myriad others, or if it is one of the (very few) "singleton HTML elements", such as the HTML <IMG SRC="..."> element, which may not have a closing tag. Such tags are also called "Self-Closing" tags.

        View Tags List:
        The HTML Elements which are listed (in the link below), indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Singleton Element Names that are maintained, by default, in this class' internal Lookup Table of Singleton Tags.

        Singleton Elements

        Case Insensitive:
        The test performed by this method shall ignore case.

        Modifying this List:
        The list of Singleton HTML Elements may, in fact, be altered. To add a new Singleton HTML Element Name to the internal lookup table of valid Singleton Elements, use addSingleton(String). To remove an HTML Element from the internal list, use removeSingleton(String).
        Parameters:
        tok - This is the HTML element name to be tested.
        Returns:
        TRUE if this is a 'singleton' HTML Element - a.k.a., only OpeningTag versions of the element exist, because singleton HTML elements don't need / may not have a closing tag. Singleton examples include: IMG, HR, INPUT etc...

        FALSE is returned if the tag is not a singleton element.
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the 'singleton' HTML Tags.  Just uses Java's TreeSet.contains(Object) method.
        
         return singletonTags.contains(tok.toLowerCase());
        
      • isBlock

        public static boolean isBlock​(java.lang.String tok)
        This method checks whether specific HTML elements are among the 'Block' Tag elements list. An explanation of what a 'block' or 'inline' tag is, is beyond the scope of this document.

        View Tags List:
        The HTML Elements which are listed (in the link below), indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Block Element Names that are maintained, by default, in this class' internal Lookup Table of Block Tags.

        HTML Block Elements

        Case Insensitive:
        The test performed by this method shall ignore case.
        Parameters:
        tok - This is the HTML element name to be tested.
        Returns:
        TRUE if this is a 'block' HTML Element, FALSE otherwise.
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML 'Block' Tags.  Just uses Java's TreeSet.contains(Object) method.
        
         return blockTags.contains(tok.toLowerCase());
        
      • isInline

        public static boolean isInline​(java.lang.String tok)
        This method checks whether specific HTML elements are among the 'Inline' Tag elements list. An explanation of what a 'block' or 'inline' tag is, is beyond the scope of this document.

        View Tags List:
        The HTML Elements which are listed (in the link below), indicate exactly what may be passed to this method's 'tok' parameter, and result in a return value of TRUE. This list is the complete list of Inline Element Names that are maintained, by default, in this class' internal Lookup Table of Inline Tags.

        HTML Inline Elements

        Case Insensitive:
        The test performed by this method shall ignore case.
        Parameters:
        tok - This is the HTML element name to be tested.
        Returns:
        TRUE if this is an 'inline' HTML Element, FALSE otherwise.
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML 'Inline' Tags.  Just uses Java's TreeSet.contains(Object) method.
        
         return inlineTags.contains(tok.toLowerCase());
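
        The case-insensitive predicates isBlock(String) and isInline(String) lend themselves to a simple classification helper. The sketch below is hypothetical: 'TagKindSketch', its method 'kind', and the two three-element sets are illustrative assumptions, and the library's real lists are considerably longer.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in; it does not use any Torello.HTML classes.
class TagKindSketch
{
    // Illustrative subsets only (assumption), stored in lower-case.
    private static final TreeSet<String> blockTags =
        new TreeSet<>(Arrays.asList("div", "p", "ul"));

    private static final TreeSet<String> inlineTags =
        new TreeSet<>(Arrays.asList("span", "a", "em"));

    // The argument is lower-cased before the lookup, so the test ignores case.
    static boolean isBlock(String tok)  { return blockTags.contains(tok.toLowerCase()); }
    static boolean isInline(String tok) { return inlineTags.contains(tok.toLowerCase()); }

    // Combines both predicates into a single label.
    static String kind(String tok)
    {
        if (isBlock(tok))  return "block";
        if (isInline(tok)) return "inline";
        return "unlisted";
    }

    public static void main(String[] args)
    {
        for (String t : new String[] { "DIV", "Span", "video" })
            System.out.println(t + ": " + kind(t));
    }
}
```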
        
      • getDescription

        public static java.lang.String getDescription​(java.lang.String tag)
        Returns a brief, English Language Description, of an HTML Tag. These descriptions are stored in a small data-file inside the JAR.

        Loading from JAR-File:
        This method will attempt to load a particular data-file from the JAR-library into memory. This file contains a one-sentence description, stored as a java.lang.String, for each of the HTML Elements known to this class. Under normal operation, these String's remain on-disk, only.
        Parameters:
        tag - Any valid HTML tag.
        Returns:
        A short English-Language description of the Tag in HTML, or null if this tag is unknown.
        See Also:
        loadDescriptions()
        Code:
        Exact Method Body:
         // Loads the descriptions map, ONLY IF they have not already been loaded into memory from
         // the JAR data-files
        
         loadDescriptions();
        
         return descriptions.get(tag.toLowerCase());
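
        The load-on-first-use behavior can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the class name LazyDescriptionsSketch, the loadCount counter, and the two description strings are invented for the example, and the real loadDescriptions() reads a data-file from the JAR rather than filling a map by hand.

```java
import java.util.TreeMap;

class LazyDescriptionsSketch
{
    // Stays null until the first request; stands in for the JAR data-file contents.
    private static TreeMap<String, String> descriptions = null;

    // Counts how many times the data was actually loaded (for demonstration only).
    static int loadCount = 0;

    // Hypothetical loader: does nothing if the map is already in memory.
    private static synchronized void loadDescriptions()
    {
        if (descriptions != null) return;

        TreeMap<String, String> m = new TreeMap<>();
        m.put("a", "Defines a hyperlink.");   // placeholder text, not the library's wording
        m.put("p", "Defines a paragraph.");   // placeholder text, not the library's wording

        descriptions = m;
        loadCount++;
    }

    // Same shape as the documented method body: ensure loaded, then look up in lower-case.
    public static String getDescription(String tag)
    {
        loadDescriptions();
        return descriptions.get(tag.toLowerCase());
    }

    public static void main(String[] args)
    {
        System.out.println(getDescription("P"));       // found, regardless of case
        System.out.println(getDescription("blink"));   // unknown to this sketch: null
        System.out.println(loadCount);                 // the loader body ran only once
    }
}
```

        Unknown tags yield null because Map.get returns null for a missing key, which matches the documented return value.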
        
      • iterator

        public static java.util.Iterator<java.lang.String> iterator()
        Internally, tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.

        Remove Unsupported:
        In order to prevent accidental removal of HTML-Tags via the Iterator's 'remove()' method, the returned Iterator instance is wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.

        Data File Contents:
        The contents of this Iterator may be viewed here:

        HTML Elements
        Returns:
        an Iterator<String> that iterates over all the Tag-String's in alphabetical order.
        See Also:
        RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the standard HTML Tags.  Just uses Java's TreeSet.iterator() method.
         //
         // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
         //       TreeSet
        
         return new RemoveUnsupportedIterator<String>(tags.iterator());
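
        A wrapper of the kind described under "Remove Unsupported" might look like the sketch below. It is not the source of RemoveUnsupportedIterator: the class name ReadOnlyIteratorSketch is hypothetical, and the use of UnsupportedOperationException is an assumption, since this section only says that "an exception" is thrown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

class ReadOnlyIteratorSketch<E> implements Iterator<E>
{
    // The wrapped iterator; all reads are delegated to it.
    private final Iterator<E> inner;

    ReadOnlyIteratorSketch(Iterator<E> inner) { this.inner = inner; }

    @Override public boolean hasNext() { return inner.hasNext(); }

    @Override public E next() { return inner.next(); }

    // Assumption: the exception type used by the real wrapper is not stated in this section.
    @Override public void remove()
    { throw new UnsupportedOperationException("remove() is not supported"); }

    public static void main(String[] args)
    {
        TreeSet<String> tags = new TreeSet<>(Arrays.asList("p", "a", "div"));
        Iterator<String> iter = new ReadOnlyIteratorSketch<>(tags.iterator());

        // A TreeSet<String> iterates in natural (alphabetical) order.
        while (iter.hasNext()) System.out.println(iter.next());

        try { iter.remove(); }
        catch (UnsupportedOperationException e) { System.out.println("remove() rejected"); }

        System.out.println(tags.size());   // the underlying set is untouched
    }
}
```

        Delegating hasNext() and next() while overriding only remove() keeps the iteration order of the underlying TreeSet and leaves the set itself unmodifiable through the returned Iterator.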
        
      • iteratorDescriptions

        public static java.util.Iterator<java.util.Map.Entry<java.lang.String,java.lang.String>> iteratorDescriptions()
        
        Builds an Iterator that returns HTML tags and their text-String descriptions.

        Data File Contents:
        The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load is only performed on request. The contents of this data-file (and the list of Map.Entry's returned by the Iterator) may be viewed, here, by clicking the link below:

        HTML Elements with Descriptions

        Lazy Loading:
        In this class, if the methods invoked do not require the Tag-Description String-Data, then this extensive text-data is never loaded into memory from the JAR data-files.
        Returns:
        an Iterator that iterates the HTML-Tag / HTML-Tag-Description key-value pairs as instances of "Map.Entry<String, String>"
        See Also:
        loadDescriptions(), RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         loadDescriptions(); // Will only load if descriptions have not already been loaded.
        
         return new RemoveUnsupportedIterator<Map.Entry<String, String>>
             (descriptions.entrySet().iterator());
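
        Consuming an Iterator of Map.Entry<String, String> pairs can be sketched as shown here. The class name DescriptionEntriesSketch, the format helper, and the two placeholder descriptions are assumptions for illustration. A TreeMap is used as the backing map only so that the example has a predictable key order; this section does not state which Map type the class uses.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

class DescriptionEntriesSketch
{
    // Builds one "tag: description" line per key-value pair supplied by the iterator.
    public static String format(Iterator<Map.Entry<String, String>> iter)
    {
        StringBuilder sb = new StringBuilder();

        while (iter.hasNext())
        {
            Map.Entry<String, String> entry = iter.next();
            sb.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }

        return sb.toString();
    }

    public static void main(String[] args)
    {
        // Placeholder data; the real descriptions come from the JAR data-file.
        TreeMap<String, String> m = new TreeMap<>();
        m.put("p", "Defines a paragraph.");
        m.put("a", "Defines a hyperlink.");

        System.out.print(format(m.entrySet().iterator()));
    }
}
```

        With the library on the class-path, the value returned by iteratorDescriptions() has the same Iterator type and could be passed to a helper like format in place of the hand-built map.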
        
      • iteratorAddedForHTML5

        public static java.util.Iterator<java.lang.String> iteratorAddedForHTML5()
        Internally, HTML-5 tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.

        Remove Unsupported:
        In order to prevent accidental removal of HTML-5-Tags via the Iterator's 'remove()' method, the returned Iterator instance is wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.

        Data File Contents:
        The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load of this data is performed as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed here, by clicking the link below:

        Elements Added for HTML-5
        Returns:
        an Iterator<String> that cycles through the list of HTML Tag-String's that were added in HTML-5.
        See Also:
        RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML-5 Tags.  Just uses Java's TreeSet.iterator() method.
         //
         // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
         //       TreeSet
        
         return new RemoveUnsupportedIterator<String>(html5Tags.iterator());
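
        In contrast to the lazily loaded descriptions, the tag lists are described as being populated when the class itself is loaded. One common way to get that effect in Java is a static initializer, sketched below. The class name EagerTagSetSketch and the four-element tag list are assumptions, and Collections.unmodifiableSet is swapped in here for the RemoveUnsupportedIterator wrapper purely to keep the example self-contained.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.TreeSet;

class EagerTagSetSketch
{
    // Hypothetical stand-in for the private set of tags added in HTML-5.
    private static final TreeSet<String> HTML5_TAGS = new TreeSet<>();

    // Runs exactly once, when the JVM initializes this class.
    // The real class reads its list from a data-file in the JAR instead.
    static
    { Collections.addAll(HTML5_TAGS, "section", "article", "nav", "aside"); }

    // The unmodifiable view's iterator rejects remove(), similar in effect to the wrapper.
    public static Iterator<String> iteratorAddedForHTML5()
    { return Collections.unmodifiableSet(HTML5_TAGS).iterator(); }

    public static void main(String[] args)
    {
        Iterator<String> iter = iteratorAddedForHTML5();

        // Sorted order of the underlying TreeSet is preserved.
        while (iter.hasNext()) System.out.println(iter.next());
    }
}
```

        Eager population suits small lists that almost every caller needs, whereas the longer description text is worth deferring until it is requested.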
        
      • iteratorDeprecatedForHTML5

        public static java.util.Iterator<java.lang.String> iteratorDeprecatedForHTML5()
        
        Internally, deprecated tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.

        Remove Unsupported:
        In order to prevent accidental removal of Deprecated-Tags via the Iterator's 'remove()' method, the returned Iterator instance is wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.

        Data File Contents:
        The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load of this data is performed as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed here, by clicking the link below:

        Elements Deprecated for HTML-5
        Returns:
        an Iterator<String> that cycles through the list of HTML Tag-String's that were deprecated for HTML-5.
        See Also:
        RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the deprecated Tags.  Just uses Java's TreeSet.iterator() method.
         //
         // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
         //       TreeSet
        
         return new RemoveUnsupportedIterator<String>(deprecated.iterator());
        
      • iteratorSingletonTags

        public static java.util.Iterator<java.lang.String> iteratorSingletonTags()
        Internally, singleton / self-closing tags are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.

        Remove Unsupported:
        In order to prevent accidental removal of Singleton-Tags via the Iterator's 'remove()' method, the returned Iterator instance is wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.

        Data File Contents:
        The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load of this data is performed as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed here, by clicking the link below:

        Singleton Elements
        Returns:
        an Iterator<String> that cycles through the list of HTML Tag-String's that qualify as singleton elements, meaning elements that have no closing-tag version.
        See Also:
        RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML 'Singleton' Tags.  Just uses Java's TreeSet.iterator() method.
         //
         // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
         //       TreeSet
        
         return new RemoveUnsupportedIterator<String>(singletonTags.iterator());
        
      • iteratorBlockTags

        public static java.util.Iterator<java.lang.String> iteratorBlockTags()
        Internally, "HTML Block Tags" are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.

        Remove Unsupported:
        In order to prevent accidental removal of Block-Tags via the Iterator's 'remove()' method, the returned Iterator instance is wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.

        Data File Contents:
        The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load of this data is performed as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed here, by clicking the link below:

        HTML Block Elements
        Returns:
        an Iterator<String> that cycles through the list of HTML Tag-String's that qualify as block elements.
        See Also:
        RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML 'Block' Tags.  Just uses Java's TreeSet.iterator() method.
         //
         // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
         //       TreeSet
        
         return new RemoveUnsupportedIterator<String>(blockTags.iterator());
        
      • iteratorInlineTags

        public static java.util.Iterator<java.lang.String> iteratorInlineTags()
        Internally, "HTML Inline Tags" are stored in a Java java.util.TreeSet<String>. This method invokes the iterator() method on that TreeSet.

        Remove Unsupported:
        In order to prevent accidental removal of Inline-Tags via the Iterator's 'remove()' method, the returned Iterator instance is wrapped in a simple class that throws an exception if remove() is invoked. The purpose is to prevent a user from accidentally destroying a member of this class' vital data-structures.

        Data File Contents:
        The contents of this Iterator are loaded from a (small) internal data-file stored in the JAR Distribution for this Java HTML Package. Load of this data is performed as soon as this class is loaded by the Class-Loader. The Data-File (Iterator) contents may be viewed here, by clicking the link below:

        HTML Inline Elements
        Returns:
        an Iterator<String> that cycles through the list of HTML Tag-String's that qualify as inline elements.
        See Also:
        RemoveUnsupportedIterator
        Code:
        Exact Method Body:
         // Internally, this class has a private & static TreeSet<String> that stores a list
         // of all the HTML 'Inline' Tags.  Just uses Java's TreeSet.iterator() method.
         //
         // NOTE: The 'RemoveUnsupportedIterator' wrapper class prohibits modifications to this
         //       TreeSet
        
         return new RemoveUnsupportedIterator<String>(inlineTags.iterator());