Package Torello.HTML

Class Util.Inclusive

  • Enclosing class:
    Util

    public static class Util.Inclusive
    extends java.lang.Object
    Tools for finding the matching-closing tag of any open TagNode.

    The methods in this class search for the inclusive match of an input, opening TagNode. The user must provide the HTML-Vector containing the opening TagNode. Each of the six search variants (Count, Find, Get, Peek, Poll, and Remove) has a method in this class for retrieving the type requested.


Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. The annotation is very similar to the Java-Bean @Stateless Annotation.
  • 1 Constructor(s), 1 declared private, zero-argument constructor
  • 11 Method(s), 11 declared static
  • 0 Field(s)


    • Method Detail

      • find

        public static int find​(java.util.Vector<? extends HTMLNode> html,
                               int nodeIndex)
        This finds the closing HTML 'TagNode' match for a given opening 'TagNode' in a given-input html page or sub-section.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        nodeIndex - An index into that Vector. This index must point to an HTMLNode element that is:

        1. An instance of TagNode
        2. A TagNode whose 'isClosing' field is FALSE
        3. A TagNode that is not a 'singleton' HTML element-token (e.g. <IMG>, <BR>, <HR>, etc.)
        Returns:
        An "inclusive search" finds OpeningTag and ClosingTag pairs, and returns all the elements between them as the contents of a return-Vector, or as a Vector DotPair end-point value. This method takes a particular node of a Vector and, as long as it has a match, finds its closing TagNode match. The integer returned is the index into this page of the closing, matching TagNode, or -1 if no match is found.
        Throws:
        TagNodeExpectedException - If the node in the Vector-parameter 'html' contained at index 'nodeIndex' is not an instance of TagNode, then this exception is thrown.
        OpeningTagNodeExpectedException - If the node in the Vector-parameter 'html' at index 'nodeIndex' is a closing version of the HTML element, then this exception is thrown.
        InclusiveException - If the node in the Vector-parameter 'html', pointed to by index 'nodeIndex', is an HTML 'Singleton' / Self-Closing Tag, then this exception is thrown.
        See Also:
        TagNode, TagNode.tok, TagNode.isClosing, HTMLNode
        Code:
        Exact Method Body:
         TagNode     tn  = null;
         HTMLNode    n   = null;
         String      tok = null;
        
         if (! html.elementAt(nodeIndex).isTagNode())
        
             throw new TagNodeExpectedException (
                 "You have attempted to find a closing tag to match an opening one, " +
                 "but the 'nodeIndex' (" + nodeIndex + ") you have passed doesn't contain " +
                 "an instance of TagNode."
             );
        
         else tn = (TagNode) html.elementAt(nodeIndex);
        
         if (tn.isClosing) throw new OpeningTagNodeExpectedException(
             "The TagNode indicated by 'nodeIndex' = " + nodeIndex + " has its 'isClosing' " +
             "boolean as TRUE - this is not an opening TagNode, but it must be to continue."
         );
        
         // Checks to ensure this token is not a 'self-closing' or 'singleton' tag.
         // If it is, an exception is thrown.
         InclusiveException.check(tok = tn.tok);
        
         int end         = html.size();
         int openCount   = 1;
        
         for (int pos = (nodeIndex+1); pos < end; pos++)
        
             if ((n = html.elementAt(pos)).isTagNode())
                 if ((tn = ((TagNode) n)).tok.equals(tok))
                 {
                     // This keeps a "Depth Count" - where "depth" is just the number of 
                     // opened tags, for which a matching, closing tag hasn't been found yet.
        
                     openCount += (tn.isClosing ? -1 : 1);
        
                     // When all open-tags of the specified HTML Element 'tok' have been
                     // found, search has finished.
        
                     if (openCount == 0) return pos;
                 }
        
         // The closing-matching tag was not found
         return -1;
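
The depth count used by the method body above can be tried in isolation. The following is a minimal sketch, not the library's implementation: the `Node` class is a hypothetical stand-in for `TagNode` that carries only `tok` and `isClosing`, a null `tok` models a text or comment node, and none of the three exception checks are reproduced.

```java
import java.util.List;

class InclusiveFindSketch
{
    // Hypothetical stand-in for TagNode: only the two fields the search reads.
    // A null 'tok' models a non-tag node (text or comment).
    static final class Node
    {
        final String tok;
        final boolean isClosing;

        Node(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    static Node open(String tok)  { return new Node(tok, false); }
    static Node close(String tok) { return new Node(tok, true); }
    static Node text()            { return new Node(null, false); }

    // Returns the index of the tag that closes the opening tag at 'nodeIndex', or -1.
    static int find(List<Node> page, int nodeIndex)
    {
        String tok = page.get(nodeIndex).tok;
        int openCount = 1;

        for (int pos = nodeIndex + 1; pos < page.size(); pos++)
        {
            Node n = page.get(pos);

            // Skips text nodes and tags of other elements
            if (! tok.equals(n.tok)) continue;

            // Nested openings of the same element raise the depth, closings lower it
            openCount += n.isClosing ? -1 : 1;

            if (openCount == 0) return pos;
        }

        return -1;
    }

    public static void main(String[] args)
    {
        // Models: <div> text <div> text </div> </div>
        List<Node> page = List.of(
            open("div"), text(), open("div"), text(), close("div"), close("div")
        );

        System.out.println(find(page, 0));  // outer <div>, prints 5
        System.out.println(find(page, 2));  // inner <div>, prints 4
    }
}
```

Because only tags with the same token change the count, the nested inner pair does not end the search for the outer element early.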
        
      • peek

        public static SubSection peek​(java.util.Vector<? extends HTMLNode> html,
                                      int nodeIndex)
        Convenience Method
        Invokes: find(Vector, int)
        Converts: output to 'PEEK' format (SubSection)
        Using: Util.cloneRange(Vector, int, int)
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
        
         return (endPos == -1) ? null : new SubSection(
             new DotPair(nodeIndex, endPos),
             cloneRange(html, nodeIndex, endPos + 1)
         );
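
The `endPos + 1` in the body above suggests that `cloneRange` takes an exclusive upper bound, while `find` returns the inclusive position of the closing tag. That reading is an assumption based only on the listing. The sketch below shows the same inclusive-to-exclusive conversion with `java.util.List.subList`; the helper name `copyInclusive` is hypothetical, and plain strings stand in for HTML nodes.

```java
import java.util.ArrayList;
import java.util.List;

class PeekRangeSketch
{
    // Copies positions start..endInclusive into a new list; the source is left untouched.
    static <T> List<T> copyInclusive(List<T> src, int start, int endInclusive)
    {
        // subList takes an exclusive upper bound, hence the + 1
        return new ArrayList<>(src.subList(start, endInclusive + 1));
    }

    public static void main(String[] args)
    {
        List<String> page = List.of("<p>", "<b>", "bold", "</b>", "</p>");

        // Positions 1 through 3 hold the <b> element, tags included
        List<String> copy = copyInclusive(page, 1, 3);

        System.out.println(copy);
        System.out.println(page.size());  // the source still holds all five nodes
    }
}
```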
        
      • poll

        public static java.util.Vector<HTMLNode> poll
                    (java.util.Vector<? extends HTMLNode> html,
                     int nodeIndex)
        
        Convenience Method
        Invokes: find(Vector, int)
        Converts: output to 'POLL' format (Vector-sublist)
        Using: Util.pollRange(Vector, int, int)
        Removes: The requested Sub-List
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
        
         return (endPos == -1) ? null : pollRange(html, nodeIndex, endPos + 1);
        
      • remove

        public static int remove​(java.util.Vector<? extends HTMLNode> html,
                                 int nodeIndex)
        Convenience Method
        Invokes: find(Vector, int)
        Converts: output to 'REMOVE' format (int - number of nodes removed)
        Using: Util.removeRange(Vector, int, int)
        Removes: The requested Sub-List
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
        
         return (endPos == -1) ? 0 : removeRange(html, nodeIndex, endPos + 1);
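
The 'POLL' and 'REMOVE' variants both delete the matched range from the input Vector and differ only in what they hand back. A minimal sketch of that difference follows, assuming (as `endPos + 1` suggests) an exclusive upper bound; the helper names are hypothetical, and `java.util.List` with strings is used in place of the library's Vector of nodes.

```java
import java.util.ArrayList;
import java.util.List;

class PollRemoveSketch
{
    // POLL: removes start..endInclusive from 'src' and returns the removed elements.
    static <T> List<T> poll(List<T> src, int start, int endInclusive)
    {
        List<T> range = src.subList(start, endInclusive + 1);
        List<T> removed = new ArrayList<>(range);

        range.clear();  // clearing the view deletes the range from 'src'
        return removed;
    }

    // REMOVE: removes the same range, but reports only how many elements were deleted.
    static <T> int remove(List<T> src, int start, int endInclusive)
    {
        List<T> range = src.subList(start, endInclusive + 1);
        int count = range.size();

        range.clear();
        return count;
    }

    public static void main(String[] args)
    {
        List<String> a = new ArrayList<>(List.of("<p>", "<b>", "bold", "</b>", "</p>"));
        System.out.println(poll(a, 1, 3));    // the three removed nodes
        System.out.println(a);                // what is left of the page

        List<String> b = new ArrayList<>(List.of("<p>", "<b>", "bold", "</b>", "</p>"));
        System.out.println(remove(b, 1, 3));  // number of nodes removed
    }
}
```

In the actual methods, a failed search (`find` returning -1) leaves the Vector unmodified: `poll` returns null and `remove` returns 0.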
        
      • vectorOPT

        public static java.util.Vector<HTMLNode> vectorOPT
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos)
        
        Convenience Method
        Invokes: dotPairOPT(Vector, int)
        Converts: output to Vector<HTMLNode>
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos);
        
         if (dp == null) return null;
         else            return Util.cloneRange(html, dp.start, dp.end + 1);
        
      • subSectionOPT

        public static SubSection subSectionOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos)
        
        Convenience Method
        Invokes: dotPairOPT(Vector, int)
        Converts: output to SubSection
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos);
        
         if (dp == null) return null;
         else            return new SubSection(dp, Util.cloneRange(html, dp.start, dp.end + 1));
        
      • dotPairOPT

        public static DotPair dotPairOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos)
        
        OPT: Optimized, which means this method expects that any parameter-error checking has already been performed.

        There are no error-checks, nor validity-checks performed on the input to this method. This is a heavily-used, internally-used method for this package. Originally, this was included in the internal-helper set of classes for the Node-Search package.

        PURPOSE AND USE: This method expects to receive a vectorized-html page, or sub-page, along with a valid index into that page pointing to an instance of a TagNode. The TagNode instance is expected to be BOTH an OpeningTag, and a non-singleton (non-self-closing) HTML Element. This method finds the corresponding "closing, matching, paired" TagNode HTML Element. For instance, a "<DIV ...>" HTML element is matched to its corresponding "</DIV>" element, and an "<A ...>" element to its closing "</A>" element.

        This method is heavily used in any class in the Node-Search Package that contains or uses the word 'inclusive.' This is because 'inclusive' is closely similar to the JavaScript property '.innerHTML'. All three of the following optimization methods perform identical tasks, but have different return types (of similar / identical data):

        • public static DotPair dotPairOPT(Vector, int, int) - which returns the matching 'innerHTML' as an index-pointer pair.
        • public static Vector<HTMLNode> vectorOPT(Vector, int, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist, in a new instance of 'Vector<HTMLNode>'.
        • public static SubSection subSectionOPT(Vector, int, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist combined with its DotPair (both the 'Vector' clone and the 'DotPair' index-pointers are returned, together, as an instance of SubSection).
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        tagPos - This may be any valid position within this html-Vector, so it must be both non-negative and less than the size of the Vector. It must also point to a valid Object-reference to an instance of class TagNode.
        Returns:
        A 'DotPair' version of an inclusive, end-to-end HTML tag-element.

        Again, there is a strong similarity between the term "inclusive-match" and the JavaScript Object-field 'innerHTML.' Both of these terms essentially refer to a block of HTML code that begins with a non-singleton HTML element (like a <DIV> - divider) that has an opening-tag: <DIV> and a closing-tag </DIV> - and includes all HTMLNodes between these.
        See Also:
        TagNode, TagNode.isClosing, TagNode.tok, DotPair
        Code:
        Exact Method Body:
         // Temp Variables
         HTMLNode n;		TagNode tn;		int openCount = 1;
        
         int len = html.size();
        
         // This is the name (token) of the "Opening HTML Element", we are searching for
         // the matching, closing element
        
         String tok = ((TagNode) html.elementAt(tagPos)).tok;
        
         for (int i = (tagPos+1); i < len; i++)
        
             if ((n = html.elementAt(i)).isTagNode())
                 if ((tn = (TagNode) n).tok.equals(tok))
                 {
                     // This keeps a "Depth Count" - where "depth" is just the number of 
                     // opened tags, for which a matching, closing tag hasn't been found yet.
        
                     openCount += (tn.isClosing ? -1 : 1);
        
                     // When all open-tags of the specified HTML Element 'tok' have been
                     // found, search has finished.
        
                     if (openCount == 0) return new DotPair(tagPos, i);
                 }
        
         // Was not found
         return null;
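
Since the method performs no validity checks, the caller is responsible for passing the index of an opening tag. The sketch below illustrates what can go wrong otherwise. It is an illustration under stated assumptions, not the library's code: `Tag` is a hypothetical stand-in for `TagNode`, an `int[]` pair stands in for `DotPair`, and `isValidStart` is a hypothetical guard a caller could apply first.

```java
import java.util.List;

class UncheckedMatchSketch
{
    // Hypothetical stand-in for TagNode
    static final class Tag
    {
        final String tok;
        final boolean isClosing;

        Tag(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    // Same depth count as the method body above, with no validation of 'tagPos'.
    // Returns {start, end} (both inclusive), or null if no match is found.
    static int[] match(List<Tag> page, int tagPos)
    {
        String tok = page.get(tagPos).tok;
        int openCount = 1;

        for (int i = tagPos + 1; i < page.size(); i++)
        {
            Tag t = page.get(i);
            if (! t.tok.equals(tok)) continue;

            openCount += t.isClosing ? -1 : 1;
            if (openCount == 0) return new int[] { tagPos, i };
        }

        return null;
    }

    // Hypothetical guard: the index is in range and points to an opening tag
    static boolean isValidStart(List<Tag> page, int tagPos)
    {
        return tagPos >= 0 && tagPos < page.size() && ! page.get(tagPos).isClosing;
    }

    public static void main(String[] args)
    {
        // Models: <div> <div> </div> </div>
        List<Tag> page = List.of(
            new Tag("div", false), new Tag("div", false),
            new Tag("div", true),  new Tag("div", true)
        );

        int[] good = match(page, 0);
        System.out.println(good[0] + ".." + good[1]);  // the outer element

        // Index 2 is a closing tag: the search still "succeeds", pairing two closing tags
        int[] bogus = match(page, 2);
        System.out.println(bogus[0] + ".." + bogus[1]);
        System.out.println(isValidStart(page, 2));
    }
}
```

The second call returns a pair without any error, which is why the checked `find` method, or an equivalent guard, belongs in front of the optimized variants whenever the input has not already been validated.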
        
      • vectorOPT

        public static java.util.Vector<HTMLNode> vectorOPT
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos,
                     int end)
        
        Convenience Method
        Invokes: dotPairOPT(Vector, int, int)
        Converts: output to Vector<HTMLNode>
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos, end);
        
         if (dp == null) return null;
         else            return Util.cloneRange(html, dp.start, dp.end + 1);
        
      • subSectionOPT

        public static SubSection subSectionOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos,
                     int end)
        
        Convenience Method
        Invokes: dotPairOPT(Vector, int, int)
        Converts: output to SubSection
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos, end);
        
         if (dp == null) return null;
         else            return new SubSection(dp, Util.cloneRange(html, dp.start, dp.end + 1));
        
      • dotPairOPT

        public static DotPair dotPairOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos,
                     int end)
        
        OPT: Optimized, which means this method expects that any parameter-error checking has already been performed.

        There are no error-checks, nor validity-checks performed on the input to this method. This is a heavily-used, internally-used method for this package. Originally, this was included in the internal-helper set of classes for the Node-Search package.

        PURPOSE AND USE: This method expects to receive a vectorized-html page, or sub-page, along with a valid index into that page pointing to an instance of a TagNode. The TagNode instance is expected to be BOTH an OpeningTag, and a non-singleton (non-self-closing) HTML Element. This method finds the corresponding "closing, matching, paired" TagNode HTML Element. For instance, a "<DIV ...>" HTML element is matched to its corresponding "</DIV>" element, and an "<A ...>" element to its closing "</A>" element.

        This method is heavily used in any class in the Node-Search Package that contains or uses the word 'inclusive.' This is because 'inclusive' is closely similar to the JavaScript property '.innerHTML'. All three of the following optimization methods perform identical tasks, but have different return types (of similar / identical data):

        • public static DotPair dotPairOPT(Vector, int, int) - which returns the matching 'innerHTML' as an index-pointer pair.
        • public static Vector<HTMLNode> vectorOPT(Vector, int, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist, in a new instance of 'Vector<HTMLNode>'.
        • public static SubSection subSectionOPT(Vector, int, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist combined with its DotPair (both the 'Vector' clone and the 'DotPair' index-pointers are returned, together, as an instance of SubSection).
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        tagPos - This may be any valid position within this html-Vector, so it must be both non-negative and less than the size of the Vector. It must also point to a valid Object-reference to an instance of class TagNode.
        end - This is a "loop-variable" instance that establishes an ending-perimeter around the search-location for finding an inclusive-match. The bound is exclusive: only positions strictly less than 'end' are examined. (As an aside, it essentially maps to 'int ePos' in all of the node-search methods.) If a complete end-to-end open-and-close "inclusive-match" is not found within the perimeter of 'tagPos' and 'end', then 'null' is returned.
        Returns:
        A 'DotPair' version of an inclusive, end-to-end HTML tag-element.

        Again, there is a strong similarity between the term "inclusive-match" and the JavaScript Object-field 'innerHTML.' Both of these terms essentially refer to a block of HTML code that begins with a non-singleton HTML element (like a <DIV> - divider) that has an opening-tag: <DIV> and a closing-tag </DIV> - and includes all HTMLNodes between these.
        See Also:
        TagNode, TagNode.isClosing, TagNode.tok, DotPair
        Code:
        Exact Method Body:
         // Temp Variables
         HTMLNode n;		TagNode tn;		int openCount = 1;		int endPos;
        
         // This is the name (token) of the "Opening HTML Element", we are searching for
         // the matching, closing element
         String tok = ((TagNode) html.elementAt(tagPos)).tok;
        
         for (endPos = (tagPos+1); endPos < end; endPos++)
        
             if ((n = html.elementAt(endPos)).isTagNode())
                 if ((tn = (TagNode) n).tok.equals(tok))
                 {
                     // This keeps a "Depth Count" - where "depth" is just the number of
                     // opened tags, for which a matching, closing tag hasn't been found yet.
                     openCount += (tn.isClosing ? -1 : 1);
        
                     // When all open-tags of the specified HTML Element 'tok' have been
                     // found, search has finished.
                     if (openCount == 0) return new DotPair(tagPos, endPos);
                 }
        
         // The search boundary 'end' was reached, but the matching-closing element
         // was not found.
         return null; // assert(endPos >= end);
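
The effect of the 'end' boundary can be seen in a small, self-contained sketch. As before, this is an illustration under assumptions rather than the library's code: `Tag` is a hypothetical stand-in for `TagNode`, and an `int[]` pair stands in for `DotPair`. The loop condition mirrors the listing above, so 'end' is treated as exclusive.

```java
import java.util.List;

class BoundedMatchSketch
{
    // Hypothetical stand-in for TagNode
    static final class Tag
    {
        final String tok;
        final boolean isClosing;

        Tag(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    // Searches positions tagPos+1 .. end-1 only. Returns {start, end} or null.
    static int[] match(List<Tag> page, int tagPos, int end)
    {
        String tok = page.get(tagPos).tok;
        int openCount = 1;

        for (int pos = tagPos + 1; pos < end; pos++)
        {
            Tag t = page.get(pos);
            if (! t.tok.equals(tok)) continue;

            openCount += t.isClosing ? -1 : 1;
            if (openCount == 0) return new int[] { tagPos, pos };
        }

        // The boundary was reached before the closing tag was seen
        return null;
    }

    public static void main(String[] args)
    {
        // Models: <div> <p> </p> </div>
        List<Tag> page = List.of(
            new Tag("div", false), new Tag("p", false),
            new Tag("p", true),    new Tag("div", true)
        );

        System.out.println(match(page, 0, 4) != null);  // </div> at index 3 is inside the bound
        System.out.println(match(page, 0, 3) != null);  // index 3 is excluded, so no match
        System.out.println(match(page, 1, 3) != null);  // <p>..</p> fits inside the bound
    }
}
```

A caller that wants the closing tag at position `p` to be found therefore has to pass an 'end' of at least `p + 1`.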