Package Torello.HTML

Class Elements


  • public class Elements
    extends java.lang.Object
    A simple, demonstrative set of functions for retrieving HTMLNode's from a web-page (a 'Workbook Class').

    "Legacy Class," used early-on to help explain HTML Vectors

    This was a preliminary demonstration-version of how to use the NodeSearch Package.
    The exact reason for including this class is not obvious. Yes, it is useful to traverse HTML tables in Java. However, for the novice user who doesn't quite understand how the words "Find" and "Get" relate to HTMLNode Vectors, using these higher-level search functions might make things easier. If the term "TagNode" (which is, sort of, the opposite of a "TextNode") still doesn't make much sense, all a programmer really needs to do here is download an HTML page into a page-Vector of type Vector<HTMLNode>, and then try searching for any of the commonly found HTML elements in that page.

    The actual purpose of this class is to show how to use the classes in Node-Search with ease. There are only a few methods (about 10), and they show the uses of the node-search operations by listing the code of each method body alongside its method declaration on this Javadoc page. Think of this as a "work-book."

    JavaScript:
    // NOTE: Mostly, if you are familiar with JavaScript, this will make sense:
    // Java-Script for obtaining the HTML-Content of a divider "<DIV>" element.
    var html    = document.getElementById("main-content").innerHTML; // for-example
    var nodes   = document.getElementsByClassName("article-footer");
    


    The script, as above, will essentially translate to calls such as:
    // Java-HTML Scrape Package means of doing the same thing (almost, but not identical)
    Vector<HTMLNode>    subPage = InnerTagGetInclusive.first(some_page, "id", TextComparitor.EQ_CI_TRM, "main-content");
    Vector<TagNode>     tn      = InnerTagGet.all(some_page, "class", TextComparitor.C, "article-footer");
    


    ALWAYS: Node-search methods that use the term "Find" retrieve the node's integer position inside the page Vector, while methods that use the term "Get" return the node itself. There is no CSS-selector corollary to this difference, primarily because Java-Script's Document Object Model (a.k.a. "the DOM-Tree") is, well, a tree! This package uses array-like Java Vector's instead of Tree's. Java Vector's provide a great deal of simplicity when dealing with web-pages that have any readable text. Because humans generally think in terms of "sentences" rather than "trees," looking through, parsing and even translating content is much easier this way.
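The "Find" versus "Get" distinction can be sketched without the package itself. The example below is only an illustration under stated assumptions: the page is modeled as a plain Vector<String> of tokens rather than a Vector<HTMLNode>, and the class FindVersusGetSketch and its methods are hypothetical names, not part of the NodeSearch package.

```java
import java.util.Vector;

// Hypothetical stand-in: a "page" is a Vector<String> of tokens,
// not the package's real Vector<HTMLNode>.
class FindVersusGetSketch
{
    // "Find" style: returns the Vector index of the first matching token, or -1.
    static int findFirst(Vector<String> page, String token)
    {
        for (int i = 0; i < page.size(); i++)
            if (page.get(i).equalsIgnoreCase(token)) return i;

        return -1;
    }

    // "Get" style: returns the matching token itself, or null if there is none.
    static String getFirst(Vector<String> page, String token)
    {
        int pos = findFirst(page, token);
        return (pos == -1) ? null : page.get(pos);
    }

    // A tiny, made-up page used only for this demonstration.
    static Vector<String> samplePage()
    {
        Vector<String> page = new Vector<>();
        page.add("<html>");
        page.add("<body>");
        page.add("<p>");
        page.add("Hello");
        page.add("</p>");
        page.add("</body>");
        page.add("</html>");
        return page;
    }

    public static void main(String[] args)
    {
        Vector<String> page = samplePage();

        System.out.println(findFirst(page, "<P>"));  // a position inside the Vector
        System.out.println(getFirst(page, "<P>"));   // the element stored at that position
    }
}
```

Because the page is a flat sequence, a "Find" result can always be turned into a "Get" result with a single page.get(index) call.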

    FURTHERMORE: Node-search methods that use the term "Inclusive" retrieve the entire list of nodes (or integer node-pointers) between the opening and closing versions of the tag and attributes that you are searching for. They are roughly equivalent to Java-Script's "someElement.innerHTML". If the term "Inclusive" is not present, only the opening-TagNode itself, or the opening-TagNode's index in the HTML Page-Vector, will be returned.
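The idea behind an "Inclusive" search can be sketched as follows. This is a simplified illustration, not the package's implementation: tokens are plain Strings, tags carry no attributes, the end index is treated as inclusive (an assumption of this sketch), and the names InclusiveSketch, findInclusive and getInclusive are hypothetical.

```java
import java.util.Vector;

// Hypothetical sketch of an "Inclusive" search over a token Vector.
class InclusiveSketch
{
    // Returns { start, end } (both inclusive here) of the first <tag> ... </tag>
    // range, counting nested tags of the same name. Returns null if not found.
    static int[] findInclusive(Vector<String> page, String tag)
    {
        String open  = "<" + tag + ">";
        String close = "</" + tag + ">";

        int start = page.indexOf(open);
        if (start == -1) return null;

        int depth = 0;

        for (int i = start; i < page.size(); i++)
        {
            String tok = page.get(i);

            if (tok.equals(open)) depth++;
            else if (tok.equals(close) && (--depth == 0)) return new int[] { start, i };
        }

        return null; // an opening tag without a matching closing tag
    }

    // "Get" flavor: copies the tokens of that range into a new Vector.
    static Vector<String> getInclusive(Vector<String> page, String tag)
    {
        int[] r = findInclusive(page, tag);
        return (r == null) ? null : new Vector<>(page.subList(r[0], r[1] + 1));
    }

    // A made-up page containing a nested list.
    static Vector<String> samplePage()
    {
        Vector<String> page = new Vector<>();
        String[] tokens = {
            "<div>", "intro", "<ul>", "<li>", "one", "</li>",
            "<ul>", "</ul>", "</ul>", "</div>"
        };
        for (String t : tokens) page.add(t);
        return page;
    }

    public static void main(String[] args)
    {
        Vector<String> page = samplePage();
        int[] r = findInclusive(page, "ul");

        System.out.println(r[0] + " .. " + r[1]);
        System.out.println(getInclusive(page, "ul"));
    }
}
```

The depth counter is what keeps the nested inner list from ending the range too early.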

    If one calls DotPair dp = Elements.findTable(someHTMLPage); the DotPair variable that is returned from this function will delineate / demarcate the starting and ending positions within the Vector<HTMLNode> that constitute the first HTML-'Table' structure found on the web-page.

    If one calls Vector<HTMLNode> list = Elements.getOL(someHTMLPage); the Vector that is returned will be the entire sub-set of HTMLNode's copied from the original Vector someHTMLPage that comprise the very first HTML 'OL' (Ordered List) Element found on this page.
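Since a "Get" result is described as a sub-set copied from the original Vector, changing the returned Vector should not change the original page. The sketch below illustrates that point with plain Strings. The class CopiedSubListSketch and the method copyRange are hypothetical, and whether the real package copies in exactly this way is an assumption.

```java
import java.util.Vector;

// Hypothetical sketch: a copied sub-Vector is independent of the original page Vector.
class CopiedSubListSketch
{
    // Copies positions start..end (both inclusive in this sketch) into a new Vector.
    static Vector<String> copyRange(Vector<String> page, int start, int end)
    {
        return new Vector<>(page.subList(start, end + 1));
    }

    // A made-up page containing one ordered list.
    static Vector<String> samplePage()
    {
        Vector<String> page = new Vector<>();
        String[] tokens = { "<body>", "<ol>", "<li>", "first", "</li>", "</ol>", "</body>" };
        for (String t : tokens) page.add(t);
        return page;
    }

    public static void main(String[] args)
    {
        Vector<String> page = samplePage();
        Vector<String> list = copyRange(page, 1, 5);

        list.set(2, "CHANGED");     // modifies only the copy

        System.out.println(list);
        System.out.println(page);   // the original page is left as it was
    }
}
```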


Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is very similar to the Java-Bean @Stateless Annotation.
  • 1 Constructor(s), 1 declared private, zero-argument constructor
  • 35 Method(s), 35 declared static
  • 0 Field(s)


    • Method Detail

      • findBody

        public static DotPair findBody​(java.util.Vector<? extends HTMLNode> html)
        Retrieves the start and end points of the web-page body in the underlying HTML page-Vector. All nodes between <BODY> ... </BODY> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        InnerTagFindInclusive
        Code:
        Exact Method Body:
         return InnerTagFindInclusive.first(html, "body");
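The wildcard parameter type '? extends HTMLNode' described above is ordinary Java generics. The sketch below shows why a method declared this way accepts Vectors of any sub-type. The classes Node, Tag and Text are hypothetical miniatures standing in for HTMLNode, TagNode and TextNode; they are not the package's classes.

```java
import java.util.Vector;

// Hypothetical sketch of an upper-bounded wildcard parameter.
class WildcardSketch
{
    // Miniature node hierarchy, standing in for HTMLNode / TagNode / TextNode.
    static class Node { final String str; Node(String s) { str = s; } }
    static class Tag  extends Node { Tag(String s)  { super(s); } }
    static class Text extends Node { Text(String s) { super(s); } }

    // Accepts Vector<Node>, Vector<Tag> or Vector<Text>. Elements can be read
    // as Node, but the compiler will not allow adding elements through 'html'.
    static int totalLength(Vector<? extends Node> html)
    {
        int n = 0;
        for (Node node : html) n += node.str.length();
        return n;
    }

    public static void main(String[] args)
    {
        Vector<Tag> tags = new Vector<>();
        tags.add(new Tag("<p>"));
        tags.add(new Tag("</p>"));

        Vector<Text> texts = new Vector<>();
        texts.add(new Text("Hello"));

        System.out.println(totalLength(tags));   // a Vector<Tag> is accepted
        System.out.println(totalLength(texts));  // so is a Vector<Text>
    }
}
```

Had the parameter been declared as Vector<Node>, neither call in main would compile, because a Vector<Tag> is not a Vector<Node>.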
        
      • getBody

        public static java.util.Vector<HTMLNode> getBody
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets the nodes of the web-page body. All nodes between <BODY> ... </BODY> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        InnerTagGetInclusive
        Code:
        Exact Method Body:
         return InnerTagGetInclusive.first(html, "body");
        
      • findHead

        public static DotPair findHead​(java.util.Vector<? extends HTMLNode> html)
        Retrieves the start and end points of the web-page header in the underlying HTML page-Vector. All nodes between <HEAD> ... </HEAD> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        InnerTagFindInclusive
        Code:
        Exact Method Body:
         return InnerTagFindInclusive.first(html, "head");
        
      • getHead

        public static java.util.Vector<HTMLNode> getHead
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets the nodes of the web-page header. All nodes between <HEAD> ... </HEAD> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        InnerTagGetInclusive
        Code:
        Exact Method Body:
         return InnerTagGetInclusive.first(html, "head");
        
      • findMeta

        public static int[] findMeta​(java.util.Vector<? extends HTMLNode> html)
        Gets all <META NAME="..." CONTENT="..."> (or <META CHARSET="..."> and <META HTTP-EQUIV="...">) elements in a web-page header - returned via their position in the page-Vector.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as an integer-array list of index-pointers to the underlying Vector.
        See Also:
        TagNodeFind
        Code:
        Exact Method Body:
         return TagNodeFind.all(html, TC.OpeningTags, "meta");
        
      • getMeta

        public static java.util.Vector<TagNode> getMeta
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets all <META NAME="..." CONTENT="..."> (or <META CHARSET="..."> and <META HTTP-EQUIV="...">) elements in a web-page header.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as TagNode's, in a return Vector.
        See Also:
        TagNodeGet
        Code:
        Exact Method Body:
         return TagNodeGet.all(html, TC.OpeningTags, "meta");
        
      • findLink

        public static int[] findLink​(java.util.Vector<? extends HTMLNode> html)
        Gets all <LINK REL="..." HREF="..."> elements in a web-page header - returned via their position in the page-Vector.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as an integer-array list of index-pointers to the underlying Vector.
        See Also:
        TagNodeFind
        Code:
        Exact Method Body:
         return TagNodeFind.all(html, TC.OpeningTags, "link");
        
      • getLink

        public static java.util.Vector<TagNode> getLink
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets all <LINK REL="..." HREF="..."> elements in a web-page header.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as TagNode's, in a return Vector.
        See Also:
        TagNodeGet
        Code:
        Exact Method Body:
         return TagNodeGet.all(html, TC.OpeningTags, "link");
        
      • findTitle

        public static DotPair findTitle​(java.util.Vector<? extends HTMLNode> html)
        Returns the start and end positions in the page-Vector of the HTML <TITLE>...</TITLE> elements.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "title");
        
      • getTitle

        public static java.util.Vector<HTMLNode> getTitle
                    (java.util.Vector<? extends HTMLNode> html)
        
        Returns the <TITLE>...</TITLE> elements sub-list from the HTML page-Vector.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "title");
        
      • titleString

        public static java.lang.String titleString​
                    (java.util.Vector<? extends HTMLNode> html)
        
        Retrieves the 'TITLE' of an HTML page by getting the String-text between the 'TITLE' elements. Returns the String encapsulated by the HTML 'HEAD'-section's "<TITLE>...</TITLE>" element, if there is such an element. If there is no such element, null is returned. If there is a 'TITLE' element, but it contains the empty String (zero-length String), an empty String is returned.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The title string
        Code:
        Exact Method Body:
         Vector<HTMLNode> title = getTitle(html);
        
         if (title == null) return null;
                
         return Util.textNodesString(title);
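The three possible outcomes described above (null, an empty String, or the title text) can be illustrated with a self-contained sketch. It models the page as a Vector<String> of tokens and does not use the package's getTitle or Util.textNodesString. The class TitleStringSketch and the method titleText are hypothetical.

```java
import java.util.Vector;

// Hypothetical sketch of the null / empty-String / text outcomes of a title lookup.
class TitleStringSketch
{
    // Returns null when there is no <title> ... </title> pair, otherwise the
    // concatenation of the non-tag tokens between them (possibly "").
    static String titleText(Vector<String> page)
    {
        int start = page.indexOf("<title>");
        if (start == -1) return null;

        int end = page.indexOf("</title>", start + 1);
        if (end == -1) return null;

        StringBuilder sb = new StringBuilder();

        for (int i = start + 1; i < end; i++)
            if (!page.get(i).startsWith("<")) sb.append(page.get(i));

        return sb.toString();
    }

    // Builds a token Vector from the given tokens.
    static Vector<String> pageOf(String... tokens)
    {
        Vector<String> page = new Vector<>();
        for (String t : tokens) page.add(t);
        return page;
    }

    public static void main(String[] args)
    {
        System.out.println(titleText(pageOf("<head>", "<title>", "My Page", "</title>", "</head>")));
        System.out.println(titleText(pageOf("<head>", "<title>", "</title>", "</head>")).isEmpty());
        System.out.println(titleText(pageOf("<head>", "</head>")) == null);
    }
}
```

Callers that only want to know whether a usable title exists should therefore test for both null and the empty String.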
        
      • findTable

        public static DotPair findTable​(java.util.Vector<? extends HTMLNode> html)
        This method will find the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set. This returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "table");
        
      • findTable

        public static DotPair findTable​(java.util.Vector<? extends HTMLNode> html,
                                        int sPos,
                                        int ePos)
        This method will find the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set. This returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "table");
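The 'sPos' and 'ePos' rules listed under "Throws" above can be written out as a small range check. This is a sketch of the documented rules only, in the order they are listed. The class RangeCheckSketch and the method normalizeEnd are hypothetical and are not the package's own validation code.

```java
// Hypothetical sketch of the documented sPos / ePos rules.
class RangeCheckSketch
{
    // Validates the range against a Vector of the given size and returns the
    // effective (exclusive) end position, after a negative ePos has been reset.
    static int normalizeEnd(int size, int sPos, int ePos)
    {
        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos is out of range: " + sPos);

        // A negative ePos means "search to the end of the Vector".
        if (ePos < 0) ePos = size;

        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos is out of range: " + ePos);

        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos is larger than ePos");

        return ePos;
    }

    public static void main(String[] args)
    {
        System.out.println(normalizeEnd(10, 2, -1));  // negative ePos is reset to the size
        System.out.println(normalizeEnd(10, 2, 5));   // a valid ePos is kept unchanged

        try
        {
            normalizeEnd(10, 10, -1);                 // sPos equal to the size is rejected
        }
        catch (IndexOutOfBoundsException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```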
        
      • getTable

        public static java.util.Vector<HTMLNode> getTable
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will get the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set. This returns a sub-Vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "table");
        
      • getTable

        public static java.util.Vector<HTMLNode> getTable
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will get the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set. This returns a sub-Vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "table");
        
      • findSelect

        public static DotPair findSelect​
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will find the very first HTML 'SELECT-OPTION' (<SELECT> ... <OPTION> ... </OPTION> .. </SELECT>) element set. This returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "select");
        
      • findSelect

        public static DotPair findSelect​
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will find the very first HTML 'SELECT-OPTION' (<SELECT> ... <OPTION> ... </OPTION> .. </SELECT>) element set. This returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "select");
        
      • getSelect

        public static java.util.Vector<HTMLNode> getSelect
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will get the very first HTML 'SELECT-OPTION' (<SELECT> ... <OPTION> ... </OPTION> .. </SELECT>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "select");
        
      • getSelect

        public static java.util.Vector<HTMLNode> getSelect
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will get the very first HTML 'SELECT-OPTION' (<SELECT> ... <OPTION> ... </OPTION> .. </SELECT>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "select");
        
      • findUL

        public static DotPair findUL​(java.util.Vector<? extends HTMLNode> html)
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "ul");
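
The wildcard '? extends HTMLNode' described for the 'html' parameter is ordinary Java generics. The sketch below illustrates it with hypothetical stand-in classes (Node, Tag and Text are assumptions made for this example, and are not the library's HTMLNode, TagNode and TextNode).

```java
import java.util.Vector;

// Hypothetical stand-ins for HTMLNode / TagNode / TextNode, for illustration only.
class WildcardSketch
{
    static class Node { final String str; Node(String str) { this.str = str; } }
    static class Tag  extends Node { Tag(String s)  { super(s); } }
    static class Text extends Node { Text(String s) { super(s); } }

    // Accepts a Vector<Node>, Vector<Tag> or Vector<Text>; each element is read as a Node.
    static int totalLength(Vector<? extends Node> html)
    {
        int total = 0;
        for (Node n : html) total += n.str.length();
        return total;
    }

    public static void main(String[] args)
    {
        Vector<Tag> tags = new Vector<>();
        tags.add(new Tag("<ul>"));
        tags.add(new Tag("</ul>"));

        // Compiles because of the wildcard; a parameter typed Vector<Node> would reject it.
        System.out.println(totalLength(tags));
    }
}
```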
        
      • findUL

        public static DotPair findUL​(java.util.Vector<? extends HTMLNode> html,
                                     int sPos,
                                     int ePos)
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "ul");
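
What a 'FIND' method hands back (a pair of Vector positions rather than the nodes themselves) can be sketched without the library. In this sketch plain Strings stand in for HTMLNode's and an int[] { start, end } stands in for a DotPair; the class FindSketch, the method findFirst, the inclusive 'end' index and the null return value are all assumptions of the example, not statements about TagNodeFindInclusive.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical stand-in: Strings play the role of HTMLNode's, int[] the role of DotPair.
class FindSketch
{
    // Returns the { start, end } positions (both inclusive) of the first block, or null.
    static int[] findFirst(Vector<String> page, String tag)
    {
        String open = "<" + tag + ">", close = "</" + tag + ">";

        for (int start = 0; start < page.size(); start++)
        {
            if (! page.elementAt(start).equalsIgnoreCase(open)) continue;

            int depth = 0;

            // Walk forward, counting nested opening tags, until the matching close is reached.
            for (int i = start; i < page.size(); i++)
            {
                String s = page.elementAt(i);

                if (s.equalsIgnoreCase(open)) depth++;
                else if (s.equalsIgnoreCase(close) && (--depth == 0)) return new int[] { start, i };
            }

            return null; // the first opening tag was never closed
        }

        return null; // no opening tag was found
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<p>", "intro", "</p>", "<ul>", "<li>", "one", "</li>", "</ul>", "tail"
        ));

        int[] dp = findFirst(page, "ul");
        System.out.println("start=" + dp[0] + ", end=" + dp[1]);
    }
}
```

Because only positions are returned, the caller can still insert, remove or replace nodes in the original Vector at those positions.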
        
      • getUL

        public static java.util.Vector<HTMLNode> getUL
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "ul");
        
      • getUL

        public static java.util.Vector<HTMLNode> getUL
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "ul");
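
A 'GET' call restricted to the range [sPos, ePos) can be pictured as "search only inside the range, then copy the block into a new Vector". The sketch below uses Strings in place of HTMLNode's; GetSketch and getFirst are hypothetical names, and returning null when nothing is found is a choice made for this example rather than documented behaviour of TagNodeGetInclusive.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical stand-in: Strings play the role of HTMLNode's.
class GetSketch
{
    // Returns a copy of the first complete <tag> ... </tag> block inside [sPos, ePos), or null.
    static Vector<String> getFirst(Vector<String> page, int sPos, int ePos, String tag)
    {
        if (ePos < 0) ePos = page.size(); // a negative 'ePos' means "up to the end"

        String open = "<" + tag + ">", close = "</" + tag + ">";
        int depth = 0, start = -1;

        for (int i = sPos; i < ePos; i++) // 'sPos' is inclusive, 'ePos' is exclusive
        {
            String s = page.elementAt(i);

            if (s.equalsIgnoreCase(open)) { if (depth++ == 0) start = i; }

            else if (s.equalsIgnoreCase(close) && (depth > 0) && (--depth == 0))
                return new Vector<>(page.subList(start, i + 1)); // a new Vector, not indices
        }

        return null;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<ul>", "a", "</ul>", "<ul>", "b", "</ul>"
        ));

        System.out.println(getFirst(page, 0, -1, "ul")); // the first list
        System.out.println(getFirst(page, 1, -1, "ul")); // the search starts after the first <ul>
    }
}
```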
        
      • findOL

        public static DotPair findOL​(java.util.Vector<? extends HTMLNode> html)
        This method will find the very first HTML Ordered List (<OL> ..<LI>...</LI> ... </OL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "ol");
        
      • findOL

        public static DotPair findOL​(java.util.Vector<? extends HTMLNode> html,
                                     int sPos,
                                     int ePos)
        This method will find the very first HTML Ordered List (<OL> ..<LI>...</LI> ... </OL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "ol");
        
      • getOL

        public static java.util.Vector<HTMLNode> getOL
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will find the very first HTML Ordered List (<OL> ..<LI>...</LI> ... </OL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "ol");
        
      • getOL

        public static java.util.Vector<HTMLNode> getOL
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will find the very first HTML Ordered List (<OL> ..<LI>...</LI> ... </OL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "ol");
        
      • findAllOption

        public static java.util.Vector<DotPair> findAllOption
                    (java.util.Vector<? extends HTMLNode> selectList)
                throws MalformedHTMLException
        
        This will use the "L1 Inclusive" concept defined in this HTML package to provide a list (returned using the type: java.util.Vector<DotPair>) of each element that fits the <OPTION> ... </OPTION> HTML "select-option element" structure.
        Parameters:
        selectList - An HTML list of TagNode's and TextNode's that constitute a select-option drop-down menu. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close "select" HTML drop-down menu Tags.
        Returns:
        A "list of lists" - specifically, a list of Torello.HTML.DotPair, each of which delineates one complete <OPTION> ... </OPTION> sub-list that is present within this HTML "select" drop-down-menu structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However, the rules related to the HTML "Select Option" elements <SELECT>...<OPTION> ... </OPTION> ... </SELECT> are inspected.

        • If the passed list parameter does not start and end with the exact HTML elements <SELECT> and </SELECT>, then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OPTION> ... </OPTION> or <SELECT> ... </SELECT> list-start and list-end demarcated HTML TagNode's, then the Torello.HTML.MalformedHTMLException will, again, be thrown.
        See Also:
        checkEndPoints(Vector, String[]), checkL1(Vector, Vector), TagNodeFindL1Inclusive
        Code:
        Exact Method Body:
         checkEndPoints(selectList, "select");
        
         Vector<DotPair> ret = TagNodeFindL1Inclusive.all(selectList, "option");
        
         checkL1(selectList, ret);
        
         return ret;
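
The idea of collecting every first-level <OPTION> ... </OPTION> sublist can be sketched as follows. This is only an illustration, assuming that "L1 Inclusive" refers to matching sublists that are not nested inside another matching sublist; L1Sketch and findAllL1 are hypothetical names, Strings stand in for HTMLNode's, and int[] { start, end } (both inclusive) stands in for DotPair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for illustration only; not the code of TagNodeFindL1Inclusive.
class L1Sketch
{
    static List<int[]> findAllL1(Vector<String> list, String tag)
    {
        String open = "<" + tag + ">", close = "</" + tag + ">";
        List<int[]> ret = new ArrayList<>();
        int depth = 0, start = -1;

        // Position 0 and the last position hold the enclosing open / close tags.
        for (int i = 1; i < list.size() - 1; i++)
        {
            String s = list.elementAt(i);

            if (s.equalsIgnoreCase(open)) { if (depth++ == 0) start = i; }

            else if (s.equalsIgnoreCase(close) && (depth > 0) && (--depth == 0))
                ret.add(new int[] { start, i });
        }

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> select = new Vector<>(Arrays.asList(
            "<select>", "<option>", "A", "</option>", "<option>", "B", "</option>", "</select>"
        ));

        for (int[] dp : findAllL1(select, "option")) System.out.println(dp[0] + " .. " + dp[1]);
    }
}
```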
        
      • getAllOption

        public static java.util.Vector<java.util.Vector<HTMLNode>> getAllOption​
                    (java.util.Vector<? extends HTMLNode> selectList)
                throws MalformedHTMLException
        
        This does the exact same thing as findAllOption(Vector), but the returned value is converted from "sublist endpoints" (a vector of start/end pairs) into a "List of Sub-Lists", which is specifically a list (java.util.Vector<>) containing sub-lists (also: java.util.Vector<HTMLNode>).

        NOTE: All of the rules and conditions explained in the comments for method findAllOption(Vector) apply to this method as well.
        Parameters:
        selectList - An HTML list of TagNode's and TextNode's that constitute a select-option drop-down menu. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close "select" HTML drop-down menu Tags.
        Returns:
        A "list of lists" - specifically, a list of java.util.Vector<HTMLNode> (sublists), each of which holds one complete <OPTION> ... </OPTION> sub-list that is present within this HTML "select" drop-down-menu structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However, the rules related to the HTML "Select Option" elements <SELECT>...<OPTION> ... </OPTION> ... </SELECT> are inspected.

        • If the passed list parameter does not start and end with the exact HTML elements <SELECT> and </SELECT>, then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OPTION> ... </OPTION> or <SELECT> ... </SELECT> list-start and list-end demarcated HTML TagNode's, then the Torello.HTML.MalformedHTMLException will, again, be thrown.
        See Also:
        DotPair.toVectors(Vector, Iterable)
        Code:
        Exact Method Body:
         return DotPair.toVectors(selectList, findAllOption(selectList));
        
      • findAllLI

        public static java.util.Vector<DotPair> findAllLI
                    (java.util.Vector<? extends HTMLNode> list)
                throws MalformedHTMLException
        
        This will use the "L1 Inclusive" concept defined in this HTML package to provide a list (returned using the type: java.util.Vector<DotPair>) of each element that fits the <LI> ... </LI> HTML "list element" structure.
        Parameters:
        list - An HTML list of TagNode's and TextNode's that constitute an ordered or unordered list. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close list Tags.
        Returns:
        A "list of lists" - specifically, a list of Torello.HTML.DotPair, each of which delineates one complete <LI> ... </LI> sub-list that is present within this HTML list structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However, the rules related to the HTML elements "Ordered List" (<OL>...</OL>) and "Unordered List" (<UL>...</UL>) are inspected.

        • If the passed list parameter does not start and end with the same HTML element - specifically <OL> or <UL> - then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OL> or <UL> ... </OL> or </UL> list-start and list-end demarcated HTML TagNode's, then the Torello.HTML.MalformedHTMLException will, again, be thrown.
        See Also:
        checkEndPoints(Vector, String[]), checkL1(Vector, Vector), TagNodeFindL1Inclusive
        Code:
        Exact Method Body:
         checkEndPoints(list, "ol", "ul");
        
         Vector<DotPair> ret = TagNodeFindL1Inclusive.all(list, "li");
        
         checkL1(list, ret);
        
         return ret;
        
      • getAllLI

        public static java.util.Vector<java.util.Vector<HTMLNode>> getAllLI​
                    (java.util.Vector<? extends HTMLNode> list)
                throws MalformedHTMLException
        
        This does the exact same thing as findAllLI(Vector), but the returned value is converted from "sublist endpoints" (a vector of start/end pairs) into a "List of Sub-Lists", which is specifically a list (java.util.Vector<>) containing sub-lists (also: java.util.Vector<HTMLNode>).

        NOTE: All of the rules and conditions explained in the comments for method findAllLI(Vector) apply to this method as well.
        Parameters:
        list - An HTML list of TagNode's and TextNode's that constitute an ordered or unordered list. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close list Tags.
        Returns:
        A "list of lists" - specifically, a list of java.util.Vector<HTMLNode> (sublists), each of which holds one complete <LI> ... </LI> sub-list that is present within this HTML list structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However, the rules related to the HTML elements "Ordered List" (<OL>...</OL>) and "Unordered List" (<UL>...</UL>) are inspected.

        • If the passed list parameter does not start and end with the same HTML element - specifically <OL> or <UL> - then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OL> or <UL> ... </OL> or </UL> list-start and list-end demarcated HTML TagNode's, then the Torello.HTML.MalformedHTMLException will, again, be thrown.
        See Also:
        DotPair.toVectors(Vector, Iterable)
        Code:
        Exact Method Body:
         return DotPair.toVectors(list, findAllLI(list));
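
The conversion from "sublist endpoints" to a "List of Sub-Lists" can be pictured with the sketch below. It is not the code of DotPair.toVectors, whose internals are not shown on this page; ToVectorsSketch is a hypothetical name, Strings stand in for HTMLNode's, and each int[] { start, end } pair is assumed to have an inclusive 'end'.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for illustration only.
class ToVectorsSketch
{
    // Copies every { start, end } range (both inclusive) of 'list' into its own Vector.
    static Vector<Vector<String>> toVectors(Vector<String> list, Iterable<int[]> pairs)
    {
        Vector<Vector<String>> ret = new Vector<>();

        for (int[] p : pairs) ret.add(new Vector<>(list.subList(p[0], p[1] + 1)));

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> list = new Vector<>(Arrays.asList(
            "<ul>", "<li>", "a", "</li>", "<li>", "b", "</li>", "</ul>"
        ));

        List<int[]> pairs = Arrays.asList(new int[] { 1, 3 }, new int[] { 4, 6 });

        for (Vector<String> li : toVectors(list, pairs)) System.out.println(li);
    }
}
```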
        
      • checkEndPoints

        protected static java.lang.String checkEndPoints​
                    (java.util.Vector<? extends HTMLNode> list,
                     java.lang.String... tokList)
                throws MalformedHTMLException
        
        This method is used to verify precisely two conditions on the passed HTML Tag list.

        • Condition 1: The Vector<HTMLNode> list parameter begins and ends with the exact same HTML Tag (for instance: <H1> ... </H1>, or perhaps <LI> ... </LI>).
        • Condition 2: The HTML-Tag that is found at the start and end of this list is one contained within the 'tokList' variable-length String-array parameter. (If the 'tokList' parameter were java.lang.String[] tokList = { "th", "tr" }, then the passed "HTMLNode list" (Vector) parameter would have to begin and end with either <TH> ... </TH> or <TR> ... </TR>.)

        Much of the java code in this method is used to provide some explanatory Exception message information.
        Parameters:
        list - This is supposed to be a typical "open" and "close" HTML TagNode structure. It may be anything including: <DIV ID="..."> ... </DIV> , or <TABLE ...> ... </TABLE> , or even <BODY> ... </BODY>
        tokList - This is expected to be the set of possible tokens with which this HTML list may begin and end.
        Returns:
        If the passed list parameter passes both the conditions specified above, then the token from the list of tokens that were provided is returned.

        NOTE: If the list does not meet these conditions, a Torello.HTML.MalformedHTMLException will be thrown with an explanatory exception-message (and, obviously, the method will not return anything!)
        Throws:
        MalformedHTMLException - Some explanatory information is provided to the coder for what has failed with the input list.
        Code:
        Exact Method Body:
         return checkEndPoints(list, 0, list.size()-1, tokList);
        
      • checkEndPoints

        protected static java.lang.String checkEndPoints​
                    (java.util.Vector<? extends HTMLNode> list,
                     int sPos,
                     int ePos,
                     java.lang.String... tokList)
                throws MalformedHTMLException
        
        This method does, functionally, the exact same thing as checkEndPoints(Vector, String[]), but with the endpoints specified by the caller: it inspects list.elementAt(sPos) and list.elementAt(ePos) rather than list.elementAt(0) and list.elementAt(list.size()-1). It is being kept with protected access since it might be unclear what endpoints are being checked. The explanatory exception-message strings are typed out only once, in this method's body; rather than retype them, checkEndPoints(Vector, String[]) simply calls this method.
        Parameters:
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index of the right-most Vector-position to inspect. In this method the value is 'inclusive': as the method body below shows, the HTMLNode at this Vector-index is read directly with list.elementAt(ePos), and it is expected to be the closing TagNode of the list.

        NOTE: The method body below does not reset a negative value, and does not range-check this parameter itself. If the value is negative, or greater-than-or-equal-to the size of the input Vector-parameter, the call to Vector.elementAt will throw an ArrayIndexOutOfBoundsException.
        tokList - The list of valid HTML Element names (tokens).
        Throws:
        MalformedHTMLException
        See Also:
        checkEndPoints(Vector, String[])
        Code:
        Exact Method Body:
         HTMLNode n = null;		String tok = null;
                
         if ((n = list.elementAt(sPos)).isTagNode())
             tok = ((TagNode) n).tok;
        
         else throw new MalformedHTMLException(
             "This list does not begin an HTML TagNode, but rather a: " +
             n.getClass().getName() + "\n" + n.str
         );
                
         if (! (n = list.elementAt(ePos)).isTagNode())
        
             throw new MalformedHTMLException(
                 "This list does not end with an HTML TagNode, but rather a : " +
                 n.getClass().getName() + "\n" + n.str
             );
        
         if (! ((TagNode) n).tok.equals(tok))
        
             throw new MalformedHTMLException(
                 "This list does not begin and end with the same HTML TagNode:\n" +
                 "[OpeningTag: " + tok + "]\t[ClosingTag: " + ((TagNode) n).tok + "]"
             );
        
         for (String t : tokList) if (t.equals(tok)) return tok;
        
         String expectedTokList = "";
        
         for (String t: tokList) expectedTokList += " " + t;
        
         throw new MalformedHTMLException(
             "The opening and closing HTML Tag tokens for this list are not members of the " +
             "tokList parameter set...\n" +
             "Expected HTML Tag List: " + expectedTokList + "\nFound Tag: " + tok
         );
        
      • checkL1

        protected static void checkL1​(java.util.Vector<? extends HTMLNode> list,
                                      java.util.Vector<DotPair> sublists)
                               throws MalformedHTMLException
        This checks that the sublists demarcated by the Vector<DotPair> 'sublists' parameter are properly formatted HTML. It is easier to show an example of "proper HTML formatting" and "improper HTML formatting" here than to explain the rule in English.

        PROPER HTML:

        HTML Elements:
         <UL>
         	<LI> This is a list element.</LI>
         	<LI> This is another list element.</LI>
         	<LI> This list element contains <B><I> extra-tags</I></B> like "bold", "italics", and
               even a <A HREF="http://Torello.Directory">link!</A></LI>
         </UL>
        

        IMPROPER HTML:

        HTML Elements:
         <UL>
         This text should not be here, and constitutes "malformed HTML"
         <LI> This LI element is just fine.</LI>
         <A HREF="http://ChineseNewsBoard.com">This link</A> should be between LI elements
         <LI> This LI element is also just fine!</LI>
         </UL> 
        

        Of the two lists above, the latter would generate a MalformedHTMLException.
        Throws:
        MalformedHTMLException - whenever improper HTML is presented to this function
        Code:
        Exact Method Body:
         checkL1(list, 0, list.size()-1, sublists);
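
The rule behind the PROPER / IMPROPER examples above is that nothing except blank text may sit in the gaps between the first-level sublists. A minimal sketch of such a gap check follows; GapCheckSketch and firstSpurious are hypothetical names, Strings stand in for HTMLNode's, the int[] { start, end } pairs (both inclusive) are assumed to be sorted and non-overlapping, and the sketch returns a position instead of throwing the library's MalformedHTMLException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for illustration only; not the code of checkL1.
class GapCheckSketch
{
    // Returns the position of the first spurious node, or -1 if every gap holds only blanks.
    static int firstSpurious(Vector<String> list, List<int[]> sublists)
    {
        int next = 1; // position 0 holds the opening list tag

        for (int[] sub : sublists)
        {
            // Inspect the gap in front of this sublist.
            for (int i = next; i < sub[0]; i++)
                if (! list.elementAt(i).trim().isEmpty()) return i;

            next = sub[1] + 1;
        }

        // Inspect the gap between the last sublist and the closing list tag.
        for (int i = next; i < list.size() - 1; i++)
            if (! list.elementAt(i).trim().isEmpty()) return i;

        return -1;
    }

    public static void main(String[] args)
    {
        Vector<String> bad = new Vector<>(Arrays.asList(
            "<ul>", "stray text", "<li>", "fine", "</li>", "</ul>"
        ));

        System.out.println(firstSpurious(bad, Arrays.asList(new int[] { 2, 4 }, new int[] { 2, 4 }).subList(0, 1)));
    }
}
```

Tags and text inside a sublist (such as the bold, italics and link tags of the PROPER example) are never visited by this check, since only the gaps between sublists are scanned.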
        
      • checkL1

        protected static void checkL1​(java.util.Vector<? extends HTMLNode> list,
                                      int sPos,
                                      int ePos,
                                      java.util.Vector<DotPair> sublists)
                               throws MalformedHTMLException
        This method does, functionally, the exact same thing as checkL1(Vector, Vector), but with the endpoints specified by the caller rather than taken from position 0 and position list.size()-1. It is being kept with protected access since it might be unclear what endpoints are being checked. The exception-message String's are typed out only once, in this method's body; rather than retype them, checkL1(Vector, Vector) simply calls this method.
        Parameters:
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Throws:
        MalformedHTMLException
        See Also:
        checkL1(Vector, Vector)
        Code:
        Exact Method Body:
         int         last    = sPos;
         int         t       = ePos - 1;
         HTMLNode    n       = null;
        
         for (DotPair sublist : sublists)
        
             if (sublist.start == (last+1)) last = sublist.end;
        
             else
             {
                 if ((sublist.start < (last+1)) || (sublist.start >= t))
        
                     throw new IllegalArgumentException(
                         "The provided subLists parameter does not contain subLists that are in " +
                         "order of the original list.  The 'list of sublists' must contain " +
                         "sublists that are in increasing sorted order.\n" +
                         "Specifically, each sublist must contain start and end points that are " +
                         "sequentially increasing.  Also, they may not overlap."
                     );
        
                 else
                 {
                     for (int i=(last+1); i < sublist.start; i++)
        
                         if ((n = list.elementAt(i)).isTagNode())
        
                             throw new MalformedHTMLException(
                                 "There is a spurious HTML-Tag element at Vector position: " + i +
                                 "\n=>\t" + n.str
                             );
        
                         else if (n.isTextNode() && (n.str.trim().length() > 0))
        
                             throw new MalformedHTMLException(
                                 "There is a spurious Text-Node element at Vector position: " + i +
                                 "\n=>\t" + n.str
                             );
                 }
             }