Class TextNodePeek


  • public class TextNodePeek
    extends java.lang.Object
    "Peeks" into Vectorized-HTML for text matching a search-criteria and returns the Vector-index where matches are found, and the TextNode at that Vector-location, as an instance of TextNodeIndex.

    TextNodePeek =>

    1. TextNode: This implies that TagNode elements are ignored completely in this search, and instead, the "text" (a.k.a. 'page-content'), represented as instances of TextNode, are searched.
    2. Peek: This implies that BOTH the Vector-index / indices where a match occurred AND the HTMLNode at that index are SIMULTANEOUSLY returned by these methods, using the data-type class TextNodeIndex.

    Methods Available

    Method Explanation
    first (...) This will retrieve the first TextNode match found in the vectorized-page parameter 'html', along with the underlying HTMLNode Vector-Index where this match was found. This object-reference and int are returned as the result, wrapped in the Wrapper-Class TextNodeIndex.
    nth (...) This will retrieve the nth TextNode match found in the vectorized-page parameter 'html', along with the underlying HTMLNode Vector-Index where this match was found. This object-reference and int are returned as the result, wrapped in the Wrapper-Class TextNodeIndex.
    last (...) This will retrieve the last TextNode match found in the vectorized-page parameter 'html', along with the underlying HTMLNode Vector-Index where this match was found. This object-reference and int are returned as the result, wrapped in the Wrapper-Class TextNodeIndex.
    nthFromEnd (...) This will retrieve the nth-from-last TextNode match found in the vectorized-page parameter 'html', along with the underlying HTMLNode Vector-Index where this match was found. This object-reference and int are returned as the result, wrapped in the Wrapper-Class TextNodeIndex.
    all (...) This will return a Vector<TextNodeIndex> that contains every TextNode that matches the specified search-criteria found inside the vectorized-page parameter 'html'. Each match is wrapped in the Wrapper-Class TextNodeIndex that stores (encapsulates) both the TextNode object-reference as well as the int index to the HTML Vector where the match was found. These results are then added into the return Vector<TextNodeIndex>.
    allExcept (...) This will return a Vector<TextNodeIndex> that contains every TextNode that does not match the specified search-criteria found inside the vectorized-page parameter 'html'. Each non-matching TextNode is wrapped in the Wrapper-Class TextNodeIndex that stores (encapsulates) both the TextNode object-reference as well as the int index to the HTML Vector where it was found. These results are then added into the return Vector<TextNodeIndex>.
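    The method families above can be illustrated with a small, self-contained sketch. It does not use the library itself: the nested classes Node, Tag, Text and Hit are hypothetical stand-ins for HTMLNode, TagNode, TextNode and TextNodeIndex, and a plain Predicate<String> stands in for the search-criteria. The point is only to show how a match is paired with its Vector-index and how tag elements are skipped.

```java
import java.util.Vector;
import java.util.function.Predicate;

final class PeekSketch {
    private PeekSketch() { }

    // Hypothetical stand-ins for HTMLNode, TagNode and TextNode; only 'str' is modeled.
    static class Node { final String str; Node(String str) { this.str = str; } }
    static final class Tag extends Node { Tag(String str) { super(str); } }
    static final class Text extends Node { Text(String str) { super(str); } }

    // Hypothetical stand-in for TextNodeIndex: the matched node plus its Vector index.
    static final class Hit {
        final int index;
        final Text node;
        Hit(int index, Text node) { this.index = index; this.node = node; }
    }

    // Every Text node whose 'str' passes the test, paired with its index. Tags are skipped.
    static Vector<Hit> all(Vector<? extends Node> html, Predicate<String> p) {
        Vector<Hit> ret = new Vector<>();
        for (int i = 0; i < html.size(); i++) {
            Node n = html.elementAt(i);
            if ((n instanceof Text) && p.test(n.str)) ret.add(new Hit(i, (Text) n));
        }
        return ret;
    }

    // Every Text node whose 'str' fails the test.
    static Vector<Hit> allExcept(Vector<? extends Node> html, Predicate<String> p) {
        return all(html, p.negate());
    }

    // First match, or null when nothing matches. Built on all() for brevity, not speed.
    static Hit first(Vector<? extends Node> html, Predicate<String> p) {
        Vector<Hit> v = all(html, p);
        return v.isEmpty() ? null : v.firstElement();
    }

    // Last match, or null when nothing matches.
    static Hit last(Vector<? extends Node> html, Predicate<String> p) {
        Vector<Hit> v = all(html, p);
        return v.isEmpty() ? null : v.lastElement();
    }

    public static void main(String[] args) {
        Vector<Node> html = new Vector<>();
        html.add(new Tag("<p>"));
        html.add(new Text("Hello world"));
        html.add(new Tag("<br>"));
        html.add(new Text("Goodbye world"));
        html.add(new Tag("</p>"));

        Predicate<String> p = s -> s.contains("world");
        System.out.println(first(html, p).index);        // 1
        System.out.println(last(html, p).index);         // 3
        System.out.println(all(html, p).size());         // 2
        System.out.println(allExcept(html, p).size());   // 0
    }
}
```

    The library's own methods may well stop at the first match rather than collecting every match first; the sketch trades that efficiency for brevity.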

    Method Parameters

    Parameter Explanation
    Vector<? extends HTMLNode> html This represents any vectorized HTML page, sub-page, or list of partial-elements.
    int nth This represents the 'nth' match of a comparison for-loop. When the method-signature used includes the parameter 'nth', the first n-1 matches that are found will be skipped, and the 'nth' match is returned instead.

    EXCEPTIONS: An NException will be thrown if the value of parameter 'nth' is zero, negative, or larger than the size of the input html-Vector.
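    The skip-and-count behavior of 'nth' can be sketched as follows. This is only an illustration under stated assumptions: a List<String> stands in for the html-Vector, IllegalArgumentException stands in for the library's NException, and the hypothetical helper returns -1 (rather than null) when fewer than 'nth' matches exist.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class NthSketch {
    private NthSketch() { }

    // Returns the index of the nth element passing the test, or -1 if there are fewer.
    static int nthIndex(List<String> items, int nth, Predicate<String> p) {
        // Mirrors the documented rule: zero, negative, or larger than the input size.
        if (nth < 1 || nth > items.size())
            throw new IllegalArgumentException("nth = " + nth);   // stand-in for NException

        int remaining = nth;
        for (int i = 0; i < items.size(); i++)
            if (p.test(items.get(i)) && (--remaining == 0)) return i;   // first n-1 skipped

        return -1;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("alpha", "beta", "gamma", "delta");
        System.out.println(nthIndex(items, 1, s -> s.contains("l")));   // 0
        System.out.println(nthIndex(items, 2, s -> s.contains("l")));   // 3
        System.out.println(nthIndex(items, 3, s -> s.contains("l")));   // -1
    }
}
```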
    int sPos, int ePos When these parameters are present, only HTMLNodes that are found between the specified Vector indices will be considered for matching with the search criteria.

    NOTE: In every situation where the parameters int sPos, int ePos are used, parameter 'ePos' will accept a negative value, but parameter 'sPos' will not. When 'ePos' is passed a negative value, the internal LV ('Loop Variable Counter') will have its public final int end field set to the length of the vectorized-html page that was passed (html.size() of parameter Vector<HTMLNode> html).

    EXCEPTIONS: An IndexOutOfBoundsException will be thrown if:

    • sPos is negative, or sPos is greater-than or equal-to the size of the input Vector
    • ePos is zero, or greater than the size of the input Vector
    • sPos is a larger integer than ePos
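    The range rules for 'sPos' and 'ePos' can be written out as a short check. The helper below is hypothetical and is not the library's LV class; it also assumes that 'ePos' is an exclusive bound, which the text suggests (since 'ePos' may equal the Vector size) but does not state outright.

```java
final class RangeSketch {
    private RangeSketch() { }

    // Returns {start, end} after applying the documented rules; 'end' is assumed exclusive.
    static int[] normalize(int size, int sPos, int ePos) {
        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos = " + sPos + ", size = " + size);

        // A negative ePos means "search to the end of the Vector".
        int end = (ePos < 0) ? size : ePos;

        if (end == 0 || end > size)
            throw new IndexOutOfBoundsException("ePos = " + ePos + ", size = " + size);

        if (sPos > end)
            throw new IndexOutOfBoundsException("sPos = " + sPos + " exceeds ePos = " + ePos);

        return new int[] { sPos, end };
    }

    public static void main(String[] args) {
        int[] r = normalize(10, 2, -1);
        System.out.println(r[0] + ".." + r[1]);   // 2..10
    }
}
```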
    TextComparitor tc WORKS WITH: This parameter works with parameter String... compareStr to perform the requested comparisons. The comparisons are computed using the TextNode.str String-field of a TextNode.

    When this parameter is present in the method-signature parameter-list, the decision of whether a TextNode is to be included in the search result-set is made by this parameter's 'test' method. TextComparitor is a Java BiPredicate<String, String[]> which compares its first String-parameter against the Strings in its second.
    Pattern p This parameter references Java's "Regular Expression" processing engine. If the method-signature includes the java.util.regex.Pattern parameter, the search-loops will use the standard Regular-Expression pattern matching routine: p.asPredicate().test(text_node.str) when deciding which TextNodes "match" this search-criteria.
    Predicate<String> p When this parameter is present in the method-signature parameter-list, the decision of whether a TextNode is to be included in the search result-set is made by the result of the Java Predicate.test(String) method.

    Specifically: p.test(text_node.str)
    String... compareStr WORKS WITH: This parameter works in coordination with TextComparitor tc. It supplies the Strings against which the TextNode.str field is compared.
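    All three kinds of search-criteria described above reduce to a boolean test on the TextNode.str String. The sketch below shows this using only standard Java classes. CONTAINS_ANY and bind are hypothetical names invented for the illustration; they follow the BiPredicate<String, String[]> shape given for TextComparitor but are not part of the library. Note that Pattern.asPredicate() tests whether the pattern is found anywhere in the input, not whether it matches the whole String.

```java
import java.util.function.BiPredicate;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class CriteriaSketch {
    private CriteriaSketch() { }

    // Hypothetical comparator in the shape described for TextComparitor:
    // true when the text contains at least one of the compare-strings.
    static final BiPredicate<String, String[]> CONTAINS_ANY = (text, compareStr) -> {
        for (String s : compareStr) if (text.contains(s)) return true;
        return false;
    };

    // Binds a comparator to its compare-strings, yielding a test on a single String.
    static Predicate<String> bind(BiPredicate<String, String[]> tc, String... compareStr) {
        return text -> tc.test(text, compareStr);
    }

    public static void main(String[] args) {
        String str = "Total: 42 items";   // stands in for TextNode.str

        Predicate<String> byComparitor = bind(CONTAINS_ANY, "items", "units");
        Predicate<String> byPattern    = Pattern.compile("\\d+").asPredicate();
        Predicate<String> byPredicate  = s -> s.startsWith("Total");

        System.out.println(byComparitor.test(str));   // true
        System.out.println(byPattern.test(str));      // true
        System.out.println(byPredicate.test(str));    // true
    }
}
```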

    Return Values:

    1. TextNodeIndex represents a matched TextNode, and its position/vector-index, from the vectorized-page parameter 'html'
    2. A return value of null implies no matches were found.
    3. Vector<TextNodeIndex> - A vector of TextNodeIndex represents all matches from the vectorized-page parameter 'html'. The class TextNodeIndex encapsulates both the TextNode and its index into the underlying HTMLNode Vector.
    4. A zero-length Vector<TextNodeIndex> means no matches were found on the page or sub-page. Empty Vectors are returned from any method where the possibility existed for multiple-matches being provided as a result-set.
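    From the caller's side, the two "no match" conventions listed above are handled differently: a single-result method needs a null check, while a multi-result method can be iterated or measured without one. The sketch below illustrates this with hypothetical stand-ins (a nested Hit class in place of TextNodeIndex, and a Vector<String> in place of the vectorized page).

```java
import java.util.Vector;

final class ReturnValueSketch {
    private ReturnValueSketch() { }

    // Hypothetical stand-in for TextNodeIndex.
    static final class Hit {
        final int index;
        final String str;
        Hit(int index, String str) { this.index = index; this.str = str; }
    }

    // Single-result lookup: null signals "no match".
    static Hit first(Vector<String> texts, String needle) {
        for (int i = 0; i < texts.size(); i++)
            if (texts.elementAt(i).contains(needle)) return new Hit(i, texts.elementAt(i));
        return null;
    }

    // Multi-result lookup: a zero-length Vector signals "no match".
    static Vector<Hit> all(Vector<String> texts, String needle) {
        Vector<Hit> ret = new Vector<>();
        for (int i = 0; i < texts.size(); i++)
            if (texts.elementAt(i).contains(needle)) ret.add(new Hit(i, texts.elementAt(i)));
        return ret;
    }

    // Caller side: guard the single result against null; use the Vector unconditionally.
    static String describe(Vector<String> texts, String needle) {
        Hit h = first(texts, needle);
        String single = (h == null) ? "none" : ("index " + h.index);
        return single + ", total " + all(texts, needle).size();
    }
}
```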


Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any public constructors or non-static member fields. This is very similar to the Java-Bean @Stateless Annotation.
  • 1 Constructor(s), 1 declared private, zero-argument constructor
  • 36 Method(s), 36 declared static
  • 0 Field(s)
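
The shape summarized above (one private zero-argument constructor, only static methods, no fields) can be sketched in plain Java. The class and method names below are hypothetical and chosen only for illustration; the @StaticFunctional annotation itself is not reproduced here.

```java
final class StatelessSketch {
    // The only constructor is private and takes no arguments, so no instance
    // can be created from outside the class.
    private StatelessSketch() { }

    // All behavior lives in static methods whose results depend only on their arguments.
    static int countOccurrences(String text, char c) {
        int count = 0;
        for (int i = 0; i < text.length(); i++)
            if (text.charAt(i) == c) count++;
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("vectorized", 'e'));   // 2
    }
}
```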