Package Torello.HTML

Class DotPair

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, java.lang.Comparable<DotPair>, java.lang.Iterable<java.lang.Integer>

    public final class DotPair
    extends java.lang.Object
    implements java.io.Serializable, java.lang.Comparable<DotPair>, java.lang.Cloneable, java.lang.Iterable<java.lang.Integer>
    A simple utility class, used ubiquitously throughout Java HTML, that maintains two integer fields, DotPair.start and DotPair.end, for demarcating the beginning and ending of a sub-list within an HTML web-page.

    The purpose of this class is to keep the starting and ending points of an array sub-list together. In a much older computer language (LISP/Scheme) a 'dotted pair' is just two values that are glued to each other. Here, the two values are integers, intended to represent Array Start and Array End Position values for the sub-list of a Vector.

    NOTE: Calling this class "ArraySubListEndPoints" would be a lot more descriptive, but that name would be so long to type that the class is called 'DotPair' instead.

    IMPORTANT NOTE: For every one of the Find, Get and Remove node methods, the input parameters sPos, ePos are designed such that:

    • the "sPos" is inclusive, meaning that the Vector index denoted by the value of this parameter is included in the sub-list.
    • the "ePos" is exclusive, meaning that the Vector index denoted by the value of this parameter is NOT included in the sub-list.

    HOWEVER, HERE: in class DotPair

    • the "start" is inclusive, meaning that the Vector index denoted by the value of this class field is included in the sub-list.
    • the "end" is ALSO inclusive, meaning that the Vector index denoted by the value of this class field is ALSO included in the sub-list.

    Generally the "sPos, ePos" method parameters and a DotPair.start or DotPair.end field have exactly identical meanings - EXCEPT for the above noted difference.
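
The inclusive 'end' matters whenever a range is handed to code that expects an exclusive upper bound. The following is a minimal sketch, not library code: `InclusiveRange` is a hypothetical stand-in that mirrors only the two fields described above, and it relies on `java.util.Vector.subList`, whose upper bound is exclusive.

```java
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in that mirrors only DotPair's two inclusive fields.
final class InclusiveRange
{
    final int start;    // inclusive
    final int end;      // inclusive (unlike an 'ePos' parameter)

    InclusiveRange(int start, int end) { this.start = start; this.end = end; }

    // The equivalent exclusive end position: one past the inclusive 'end'.
    int ePos() { return end + 1; }

    // subList's upper bound is exclusive, so the inclusive 'end' needs the + 1.
    static <T> List<T> slice(Vector<T> v, InclusiveRange r)
    { return v.subList(r.start, r.ePos()); }

    public static void main(String[] args)
    {
        Vector<String> v = new Vector<>(List.of("a", "b", "c", "d", "e"));
        System.out.println(slice(v, new InclusiveRange(1, 3)));   // prints [b, c, d]
    }
}
```

Going the other way, an exclusive 'ePos' would become an inclusive 'end' by subtracting one.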
    See Also:
    NodeIndex, SubSection, Serialized Form


    • Field Detail

      • serialVersionUID

        public static final long serialVersionUID
        This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
        See Also:
        Constant Field Values
        Code:
        Exact Field Declaration Expression:
        public static final long serialVersionUID = 1;
        
      • start

        public final int start
        This is intended to be the "starting index" into a sub-array of an HTML Vector of HTMLNode elements.
        Code:
        Exact Field Declaration Expression:
        public final int start;
        
      • end

        public final int end
        This is intended to be the "ending index" into a sub-array of an HTML Vector of HTMLNode elements.
        Code:
        Exact Field Declaration Expression:
        public final int end;
        
      • comp2

        public static java.util.Comparator<DotPair> comp2
        This is an "alternative Comparator" that can be used for sorting instances of this class. It should work with the Collections.sort(List, Comparator) method in the standard JDK package java.util.*;

        NOTE: This simply compares the size of one DotPair to a second. The smaller shall be sorted first, and the larger (longer-in-length) DotPair shall be sorted later. If they are of equal size, whichever of the two has an earlier 'start' position in the Vector is considered first.
        See Also:
        CommentNode.body
        Code:
        Exact Field Declaration Expression:
        public static Comparator<DotPair> comp2 = (DotPair dp1, DotPair dp2) ->
            {
                int ret = dp1.size() - dp2.size();
        
                return (ret != 0) ? ret : (dp1.start - dp2.start);
            };
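
To illustrate the ordering rule described for comp2 (smaller size first, ties broken by the earlier start), here is a minimal sketch. The `Range` class and the `BY_SIZE_THEN_START` name are hypothetical stand-ins, not part of the library; the comparator is built with the standard `Comparator.comparingInt` helpers rather than the subtraction shown above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SizeFirstDemo
{
    // Hypothetical stand-in for DotPair: inclusive start/end only.
    static final class Range
    {
        final int start, end;
        Range(int start, int end) { this.start = start; this.end = end; }
        int size() { return end - start + 1; }
        @Override public String toString() { return "[" + start + ", " + end + "]"; }
    }

    // Same rule as described for comp2: smaller size first, then earlier start.
    static final Comparator<Range> BY_SIZE_THEN_START =
        Comparator.comparingInt(Range::size).thenComparingInt(r -> r.start);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Range> sorted(List<Range> in)
    {
        List<Range> out = new ArrayList<>(in);
        out.sort(BY_SIZE_THEN_START);
        return out;
    }

    public static void main(String[] args)
    {
        List<Range> l = List.of(new Range(0, 9), new Range(20, 22), new Range(5, 7));
        System.out.println(sorted(l));   // prints [[5, 7], [20, 22], [0, 9]]
    }
}
```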
        
    • Constructor Detail

      • DotPair

        public DotPair​(int start,
                       int end)
        This constructor takes two integers and saves them into the public member fields.
        Parameters:
        start - This is intended to store the starting position of a vectorized-webpage sub-list or subpage.
        end - This will store the ending position of a vectorized-html webpage sub-list or subpage.
        Throws:
        java.lang.IndexOutOfBoundsException - A negative 'start' or 'end' parameter-value will cause this exception to be thrown.
        java.lang.IllegalArgumentException - A 'start' parameter-value that is larger than the 'end' parameter will cause this exception to be thrown.
        See Also:
        NodeIndex, SubSection
        Code:
        Exact Constructor Body:
         if (start < 0) throw new IndexOutOfBoundsException
             ("Negative start value passed to DotPair constructor: start = " + start);
        
         if (end < 0) throw new IndexOutOfBoundsException
             ("Negative ending value passed to DotPair constructor: end = " + end);
        
         if (end < start) throw new IllegalArgumentException(
             "Start-parameter value passed to constructor is greater than ending-parameter: " +
             "start: [" + start + "], end: [" + end + ']'
         );
        
         this.start  = start;
         this.end    = end;
        
    • Method Detail

      • shift

        public DotPair shift​(int delta)
        Creates a new instance that has been shifted by 'delta'.
        Parameters:
        delta - The number of array indices to shift 'this' instance. This parameter may be negative, and if so, 'this' will be shifted left, instead of right.
        Returns:
        A new, shifted, instance of 'this'
        Code:
        Exact Method Body:
         return new DotPair(this.start + delta, this.end + delta);
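
Because shift builds its result through the constructor, a negative 'delta' that pushes 'start' below zero would be rejected by the constructor's checks. The sketch below assumes a hypothetical stand-in class `Range` that copies those checks in condensed form; it is an illustration, not the library class.

```java
final class ShiftDemo
{
    // Hypothetical stand-in that copies the constructor checks in condensed form.
    static final class Range
    {
        final int start, end;

        Range(int start, int end)
        {
            if (start < 0 || end < 0) throw new IndexOutOfBoundsException("negative bound");
            if (end < start) throw new IllegalArgumentException("end < start");
            this.start = start;
            this.end = end;
        }

        // A new instance; 'this' is never modified.
        Range shift(int delta) { return new Range(start + delta, end + delta); }
    }

    // Returns true when shifting by 'delta' is rejected by the constructor.
    static boolean shiftFails(Range r, int delta)
    {
        try { r.shift(delta); return false; }
        catch (IndexOutOfBoundsException e) { return true; }
    }

    public static void main(String[] args)
    {
        Range r = new Range(3, 8);
        Range right = r.shift(2);
        System.out.println(right.start + ".." + right.end);   // prints 5..10
        System.out.println(shiftFails(r, -4));                // prints true
    }
}
```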
        
      • hashCode

        public int hashCode()
        Implements the standard Java 'hashCode()' method. This will provide a hash-code that is likely to avoid collisions.
        Overrides:
        hashCode in class java.lang.Object
        Returns:
        A hash-code that may be used for inserting 'this' instance into a hashed table, map or list.
        Code:
        Exact Method Body:
         return this.start + (1000 * this.end);
        
      • size

        public int size()
        The purpose of this is to remind the user that the array bounds are inclusive at BOTH ends of the sub-list. Often, in many java.lang.String operations, the start-position is included in the results, but the end position is not.

        NOTICE: For an instance of 'DotPair', both the start and ending positions are INCLUSIVE, meaning they are both included in the sub-list.
        Returns:
        The length of a sub-array that would be indicated by this dotted pair.
        Code:
        Exact Method Body:
         return this.end - this.start + 1;
        
      • toString

        public java.lang.String toString()
        Java's toString() requirement.
        Overrides:
        toString in class java.lang.Object
        Returns:
        A string representing 'this' instance of DotPair.
        Code:
        Exact Method Body:
         return "[" + start + ", " + end + "]";
        
      • equals

        public boolean equals​(java.lang.Object o)
        Java's public boolean equals(Object o) requirements.
        Overrides:
        equals in class java.lang.Object
        Parameters:
        o - This may be any Java Object, but only ones of 'this' type whose internal-values are identical will force this method to return TRUE.
        Returns:
        TRUE if (and only if) parameter 'o' is an instanceof DotPair and, also, both have equal start and ending field values.
        Code:
        Exact Method Body:
         if (o instanceof DotPair)
         {
             DotPair dp = (DotPair) o;
             return (this.start == dp.start) && (this.end == dp.end);
         }
        
         else return false;
        
      • clone

        public DotPair clone()
        Java's interface Cloneable requirements. This instantiates a new DotPair with identical 'start', 'end' fields.
        Overrides:
        clone in class java.lang.Object
        Returns:
        A new DotPair whose internal fields are identical to this one.
        Code:
        Exact Method Body:
         return new DotPair(this.start, this.end);
        
      • compareTo

        public int compareTo​(DotPair other)
        Java's interface Comparable<T> requirements. This is not the only comparison operation possible, but it does satisfy one reasonable requirement - SPECIFICALLY: which of two separate instances of DotPair starts first.

        NOTE: If two DotPair instances begin at the same Vector-index, then the shorter of the two shall come first.
        Specified by:
        compareTo in interface java.lang.Comparable<DotPair>
        Parameters:
        other - Any other DotPair to be compared to 'this' DotPair
        Returns:
        An integer that fulfils Java's interface Comparable<T> public int compareTo(T t) method requirements.
        Code:
        Exact Method Body:
         int ret = this.start - other.start;
        
         return (ret != 0) ? ret : (this.size() - other.size());
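
The natural ordering described above (earlier start first, and for equal starts the shorter range first) is what a `TreeSet` would use. A minimal sketch follows; `Range` is a hypothetical stand-in for DotPair, and `Integer.compare` is used in place of subtraction as a matter of style.

```java
import java.util.TreeSet;

final class StartFirstDemo
{
    // Hypothetical stand-in for DotPair, ordered the way compareTo is described above.
    static final class Range implements Comparable<Range>
    {
        final int start, end;
        Range(int start, int end) { this.start = start; this.end = end; }
        int size() { return end - start + 1; }

        @Override public int compareTo(Range other)
        {
            // Earlier start first; for equal starts, the shorter range first.
            int ret = Integer.compare(this.start, other.start);
            return (ret != 0) ? ret : Integer.compare(this.size(), other.size());
        }

        @Override public String toString() { return "[" + start + ", " + end + "]"; }
    }

    // Adds the ranges to a TreeSet and reports the resulting order.
    static String order(Range... ranges)
    {
        TreeSet<Range> ts = new TreeSet<>();
        for (Range r : ranges) ts.add(r);
        return ts.toString();
    }

    public static void main(String[] args)
    {
        // Two ranges start at index 4; the shorter one sorts first.
        System.out.println(order(new Range(4, 9), new Range(1, 30), new Range(4, 5)));
        // prints [[1, 30], [4, 5], [4, 9]]
    }
}
```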
        
      • iterator

        public java.util.PrimitiveIterator.OfInt iterator()
        This shall return an int Iterator (which is properly named class java.util.PrimitiveIterator.OfInt) that iterates integers beginning with the value in this.start and ending with the value in this.end.
        Specified by:
        iterator in interface java.lang.Iterable<java.lang.Integer>
        Returns:
        An Iterator that iterates 'this' instance of DotPair from the beginning of the range, to the end of the range. The Iterator returned will produce Java's primitive type int.

        NOTE: The elements returned by the Iterator are integers, and this is, in effect, nothing more than one which counts from start to end.
        Code:
        Exact Method Body:
         return new PrimitiveIterator.OfInt()
         {
             private int cursor = start;
        
             public boolean hasNext()    { return this.cursor <= end; }
        
             public int nextInt()
             {
                 if (cursor > end) throw new NoSuchElementException
                     ("Cursor has passed the value stored in 'end' [" + end + "]");
        
                 return cursor++;
             }
         };
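
For reference, a standalone sketch of an inclusive int-range iterator is given here. It is independent of the library source, and the class name `InclusiveIntIterator` is hypothetical. The detail worth noting is that nextInt() must still return the value equal to 'end', so the exhaustion check is `cursor > end`.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;

final class InclusiveIntIterator
{
    // Counts from 'start' to 'end', both inclusive.
    static PrimitiveIterator.OfInt of(int start, int end)
    {
        return new PrimitiveIterator.OfInt()
        {
            private int cursor = start;

            @Override public boolean hasNext() { return cursor <= end; }

            @Override public int nextInt()
            {
                // Throw only once the cursor has moved past 'end', so that
                // 'end' itself is still returned.
                if (cursor > end) throw new NoSuchElementException();
                return cursor++;
            }
        };
    }

    // Adds up every value the iterator produces.
    static int sum(int start, int end)
    {
        int total = 0;
        for (PrimitiveIterator.OfInt it = of(start, end); it.hasNext(); ) total += it.nextInt();
        return total;
    }

    public static void main(String[] args)
    {
        System.out.println(sum(3, 6));   // prints 18  (3 + 4 + 5 + 6)
    }
}
```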
        
      • iterator

        public <T extends HTMLNode> java.util.Iterator<T> iterator​
                    (java.util.Vector<T> page)
        
        A simple Iterator that will iterate elements on an input page, using 'this' instance of DotPair's indices, start, and end.
        Parameters:
        page - This may be any HTML page or sub-page. This page should correspond to 'this' instance of DotPair.
        Returns:
        An Iterator that will iterate each node in the page, beginning with the node at page.elementAt(this.start), and ending with page.elementAt(this.end)
        Throws:
        java.lang.IndexOutOfBoundsException - This throws if 'this' instance does not have a range that adheres to the size of the input 'page' parameter.
        Code:
        Exact Method Body:
         if (this.start >= page.size()) throw new IndexOutOfBoundsException(
             "This instance of DotPair points to elements that are outside of the range of the " +
             "input 'page' Vector.\n" +
             "'page' parameter size: " + page.size() + ", this.start: [" + this.start + "]"
         );
        
         if (this.end >= page.size()) throw new IndexOutOfBoundsException(
             "This instance of DotPair points to elements that are outside of the range of the " +
             "input 'page' Vector.\n" +
             "'page' parameter size: " + page.size() + ", this.end: [" + this.end + "]"
         );
        
         return new Iterator<T>()
         {
             private int cursor          = start - 1; // one before 'this.start'; next() pre-increments
             private int expectedSize    = page.size();
             private int last            = end;      // a.k.a. 'this.end'
        
             public boolean hasNext() { return cursor < last; }
        
             public T next()
             {
                 if (++cursor > last) throw new NoSuchElementException(
                     "This iterator's cursor has run past the end of the DotPair instance that " +
                     "formed this Iterator.  No more elements to iterate.  Did you call hasNext() ?"
                 );
        
                 if (page.size() != expectedSize) throw new ConcurrentModificationException(
                     "The expected size of the underlying vector has changed." +
                     "\nCurrent-Size " +
                     "[" + page.size() + "], Expected-Size [" + expectedSize + "]\n" +
                     "\nCursor location: [" + cursor + "]"
                 );
        
                 return page.elementAt(cursor);
             }
        
             // Removes the node from the underlying {@code Vector} at the cursor's location.
             public void remove()
             { page.removeElementAt(cursor); expectedSize--; cursor--; last--; }
         };
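
The bookkeeping needed for remove() on a range iterator can be sketched in isolation. This is a simplified, hypothetical version (the names `RangeIteratorDemo`, `over` and `removeInRange` are not from the library) that leaves out the size check for concurrent modification and adds a guard so that remove() is only legal directly after next().

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Vector;

final class RangeIteratorDemo
{
    // Iterates v[start..end], both inclusive. remove() deletes the element last returned.
    static <T> Iterator<T> over(Vector<T> v, int start, int end)
    {
        return new Iterator<T>()
        {
            private int     cursor    = start - 1;  // index of the element last returned
            private int     last      = end;        // shrinks by one on every remove()
            private boolean canRemove = false;      // true only right after next()

            @Override public boolean hasNext() { return cursor < last; }

            @Override public T next()
            {
                if (cursor >= last) throw new NoSuchElementException();
                canRemove = true;
                return v.elementAt(++cursor);
            }

            @Override public void remove()
            {
                if (!canRemove) throw new IllegalStateException();
                v.removeElementAt(cursor);
                cursor--; last--; canRemove = false;
            }
        };
    }

    // Removes every occurrence of 'unwanted' found inside v[start..end].
    static <T> void removeInRange(Vector<T> v, int start, int end, T unwanted)
    {
        for (Iterator<T> it = over(v, start, end); it.hasNext(); )
            if (it.next().equals(unwanted)) it.remove();
    }

    public static void main(String[] args)
    {
        Vector<String> v = new Vector<>(List.of("a", "x", "b", "x", "x", "c"));
        removeInRange(v, 1, 4, "x");
        System.out.println(v);   // prints [a, b, c]
    }
}
```

After each removal both the cursor and the last index move down by one, which keeps the iterator inside the (now shorter) range.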
        
      • isInside

        public boolean isInside​(int index)
        This will test whether a specific index is contained between this.start and this.end, inclusively.
        Parameters:
        index - This is any integer index value. It must not be negative.
        Returns:
        TRUE If the value of index is greater-than-or-equal-to the value stored in field 'start' and furthermore is less-than-or-equal-to the value of field 'end'
        Throws:
        java.lang.IndexOutOfBoundsException - If the value is negative, this exception will throw.
        Code:
        Exact Method Body:
         if (index < 0) throw new IndexOutOfBoundsException
             ("You have passed a negative index [" + index + "] here, but this is not allowed.");
        
         return (index >= start) && (index <= end);
        
      • enclosedBy

        public boolean enclosedBy​(DotPair other)
        Tests whether 'this' DotPair is fully enclosed by DotPair parameter 'other'
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE If (and only if) parameter 'other' encloses 'this'.
        Code:
        Exact Method Body:
         return (other.start <= this.start) && (other.end >= this.end);
        
      • encloses

        public boolean encloses​(DotPair other)
        Tests whether 'this' DotPair completely encloses DotPair parameter 'other'
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE If (and only if) parameter 'other' is enclosed completely by 'this'.
        Code:
        Exact Method Body:
         return (this.start <= other.start) && (this.end >= other.end);
        
      • overlaps

        public boolean overlaps​(DotPair other)
        Tests whether parameter 'other' has any overlapping Vector-indices with 'this' DotPair
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE If (and only if) parameter 'other' and 'this' have any overlap.
        Code:
        Exact Method Body:
         return (this.start <= other.end) && (other.start <= this.end);
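
A general overlap test for two inclusive ranges can be written as a single condition: each range must start no later than the other one ends. The sketch below uses plain int parameters rather than DotPair instances, and the class name `OverlapDemo` is hypothetical. It also covers the case where one range fully encloses the other.

```java
final class OverlapDemo
{
    // Two inclusive ranges [s1, e1] and [s2, e2] share at least one index exactly
    // when each one starts no later than the other one ends.
    static boolean overlaps(int s1, int e1, int s2, int e2)
    { return (s1 <= e2) && (s2 <= e1); }

    public static void main(String[] args)
    {
        System.out.println(overlaps(2, 5, 5, 9));    // prints true  (index 5 is shared)
        System.out.println(overlaps(0, 20, 5, 9));   // prints true  (first encloses second)
        System.out.println(overlaps(2, 4, 5, 9));    // prints false (adjacent, nothing shared)
    }
}
```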
        
      • isBefore

        public boolean isBefore​(DotPair other)
        Tests whether 'this' lies, completely, before DotPair parameter 'other'.
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE if every index of 'this' has a value that is less than every index of 'other'
        Code:
        Exact Method Body:
         return this.end < other.start;
        
      • startsBefore

        public boolean startsBefore​(DotPair other)
        Tests whether 'this' begins before DotPair parameter 'other'.
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE if this.start is less than other.start, and FALSE otherwise.
        Code:
        Exact Method Body:
         return this.start < other.start;
        
      • isAfter

        public boolean isAfter​(DotPair other)
        Tests whether 'this' lies, completely, after DotPair parameter 'other'.
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE if every index of 'this' has a value that is greater than every index of 'other'
        Code:
        Exact Method Body:
         return this.start > other.end;
        
      • endsAfter

        public boolean endsAfter​(DotPair other)
        Tests whether 'this' ends after DotPair parameter 'other'.
        Parameters:
        other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
        Returns:
        TRUE if this.end is greater than other.end, and FALSE otherwise.
        Code:
        Exact Method Body:
         return this.end > other.end;
        
      • toVector

        public static java.util.Vector<HTMLNode> toVector
                    (java.util.Vector<? extends HTMLNode> html,
                     DotPair dp)
        
        This method takes a sublist, represented by a "dotted pair", and converts it into a Vector of HTMLNode.

        NOTE: The DotPair dp parameter contains fields start, end, which simply represent the starting and ending indices into the HTML page Vector. This method cycles through that Vector, beginning with the dp.start field, and ending with the dp.end field. Each HTMLNode reference within the sublist is inserted into the returned Vector.
        Parameters:
        html - Any Vectorized-HTML Web-Page, or sub-page
        dp - Any sublist within that HTML page.
        Returns:
        A Vector version of the original sublist that was represented by passed parameter 'dp'
        Code:
        Exact Method Body:
         Vector<HTMLNode> ret = new Vector<>();
        
         // NOTE: Cannot return the result of "subList" because it is linked/mapped to the original
         //       Vector.  If changes are made to "subList", those changes will be reflected in the
         //       original HTML Vector.
        
         ret.addAll(html.subList(dp.start, dp.end + 1));
        
         return ret;
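
The reason for copying, noted in the comment above, is that `subList` returns a view backed by the original Vector. A minimal sketch of the difference, using `String` elements in place of HTMLNode; the class name `SubListCopyDemo` and the helper `copyRange` are hypothetical.

```java
import java.util.List;
import java.util.Vector;

final class SubListCopyDemo
{
    // Copies v[start..end] (inclusive) into a new Vector, detached from 'v'.
    static <T> Vector<T> copyRange(Vector<? extends T> v, int start, int end)
    { return new Vector<>(v.subList(start, end + 1)); }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(List.of("<p>", "text", "</p>", "<br>"));

        Vector<String> copy = copyRange(page, 0, 2);
        copy.clear();                        // affects only the copy
        System.out.println(page.size());     // prints 4

        page.subList(0, 3).clear();          // a subList view writes through to 'page'
        System.out.println(page);            // prints [<br>]
    }
}
```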
        
      • toVectors

        public static java.util.Vector<java.util.Vector<HTMLNode>> toVectors​
                    (java.util.Vector<? extends HTMLNode> html,
                     java.lang.Iterable<DotPair> sublists)
        
        This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Those sublists will be collected into another Vector and returned.
        Parameters:
        html - Any Vectorized-HTML Web-Page, or sub-page
        sublists - A "List of sublists" within that HTML page.
        Returns:
        This method shall return a Vector containing vectors as sublists.
        Code:
        Exact Method Body:
         Vector<Vector<HTMLNode>> ret = new Vector<>();
        
         for (DotPair sublist : sublists) ret.addElement(toVector(html, sublist));
        
         return ret;
        
      • toSubSections

        public static java.util.Vector<SubSection> toSubSections
                    (java.util.Vector<? extends HTMLNode> html,
                     java.lang.Iterable<DotPair> sublists)
        
        This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Each resulting Vector is paired with its DotPair in a new SubSection instance, and those SubSection instances are collected into another Vector and returned.
        Parameters:
        html - Any Vectorized-HTML Web-Page, or sub-page
        sublists - A "List of sublists" within that HTML page.
        Returns:
        This method shall return a Vector of SubSection instances, one for each sublist.
        Code:
        Exact Method Body:
         Vector<SubSection> ret = new Vector<>();
        
         for (DotPair sublist : sublists)
             ret.addElement(new SubSection(sublist, toVector(html, sublist)));
        
         return ret;
        
      • iterator

        public static java.util.PrimitiveIterator.OfInt iterator​
                    (java.lang.Iterable<DotPair> dpi,
                     boolean leastToGreatest)
        
        Convenience Method
        Invokes: toStream(Iterable, boolean)
        Converts output to an Iterator
        Code:
        Exact Method Body:
         return toStream(dpi, leastToGreatest).iterator();
        
      • toPosArray

        public static int[] toPosArray​(java.lang.Iterable<DotPair> dpi,
                                       boolean leastToGreatest)
        Convenience Method
        Invokes: toStream(Iterable, boolean)
        Converts output to an int[] array.
        Code:
        Exact Method Body:
         return toStream(dpi, leastToGreatest).toArray();
        
      • toStream

        public static java.util.stream.IntStream toStream​
                    (java.lang.Iterable<DotPair> dpi,
                     boolean leastToGreatest)
        
        This method will convert a list of DotPair instances to a Java java.util.stream.IntStream. The generated IntStream shall contain all Vector-indices (integers) that are within the bounds of any of the DotPair's listed by parameter 'dpi'.

        Stating this a second time, this position-index list (IntStream) is built out of the contents of the 'dpi' parameter. The returned index-list that's created will have all indices that are "inside" (as in isInside(int)) any of the 'DotPair's' within parameter 'dpi'.

        HINT: Many of the "Find" Methods available in the HTML.NodeSearch package return instances of Vector<DotPair>. These Vectors of DotPair are to be thought of as "lists of sub-lists" of a vectorized-html web-page. This method can help identify each and every integer-index that is "inside any of these passed sublists."

        SUBTLE POINT: The sublists (The DotPair's of input-parameter 'dpi') might overlap. Furthermore, others might have spaces/gaps between them. This method shall return an 'IntStream' of integer-indices, all of which are guaranteed to be members of at least one (but possibly many of) the 'dpi' DotPair sublists.

        NOTE ABOUT STALE-DATA: Try to keep in mind, always, that when writing code that modifies vectorized-HTML, the moment any node is inserted or deleted, all Vector indices in your memory and data-structures may become stale or "invalid."
        There are myriad ways to handle this issue, many of which are beyond the scope of this Documentation Entry. Generally, the best suggestion to keep in mind, is that if you are modifying a vectorized-html page, perform your updates or removals in reverse order, and your Vector index-list pointers will not become stale pointers.
        Parameters:
        dpi - This may be any source of 'DotPair' instances that implements the interface java.lang.Iterable<DotPair>.
        leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
        Returns:
        A java.util.stream.IntStream of the integers that are members of this Iterable<DotPair>
        Code:
        Exact Method Body:
         Iterator<DotPair>   iter    = dpi.iterator();
         TreeSet<DotPair>    ts      = new TreeSet<>();
        
         // The tree-set will add the "DotPair" to the tree - and keep them sorted,
         // since that's what "TreeSet" does.
        
         while (iter.hasNext()) ts.add(iter.next());
        
         Iterator<DotPair>   tsIter  = leastToGreatest ? ts.iterator() : ts.descendingIterator();
         IntStream.Builder   builder = IntStream.builder();
         DotPair             dp      = null;
        
         if (leastToGreatest)
        
             // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
             while (tsIter.hasNext())
                 for (int i=(dp=tsIter.next()).start; i <= dp.end; i++)
                     builder.add(i);
        
         else
        
             // we are building a "reverse-index" stream... Make sure to add the sub-lists in
             // reverse-order.
        
             while (tsIter.hasNext())
                 for (int i=(dp=tsIter.next()).end; i >= dp.start; i--)
                     builder.add(i);
        
         if (leastToGreatest)
        
             // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point... 
             // MULTIPLE, OVERLAPPING DOTPAIRS
             // We need to sort because the DotPair sublists have been added in "sorted order" but
             // the overall list is not (necessarily, but possibly) sorted!
        
             return builder.build().sorted().distinct();
        
         else
        
             // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
             // around with the fact that Java's 'IntStream' class does not have a specialized
             // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
        
             return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
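
The behaviour described for this method, together with the advice about removing in reverse order, can be sketched without the library. In this illustration each sublist is a plain `int[] {start, end}` pair (an assumption made to keep the example self-contained), the names `IndexStreamDemo` and `indices` are hypothetical, and the descending order is produced with a boxed stream and `Comparator.reverseOrder()` instead of the negate-sort-negate approach used above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Vector;
import java.util.stream.IntStream;

final class IndexStreamDemo
{
    // Expands inclusive {start, end} pairs into distinct indices. Overlapping pairs
    // contribute each shared index only once.
    static int[] indices(List<int[]> pairs, boolean leastToGreatest)
    {
        IntStream all = pairs.stream()
            .flatMapToInt(p -> IntStream.rangeClosed(p[0], p[1]))
            .distinct();

        return leastToGreatest
            ? all.sorted().toArray()
            : all.boxed().sorted(Comparator.reverseOrder()).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(List.of("n0", "n1", "n2", "n3", "n4", "n5", "n6"));

        // [1, 3] and [2, 4] overlap; the descending result is 4, 3, 2, 1.
        int[] desc = indices(List.of(new int[] {1, 3}, new int[] {2, 4}), false);

        // Removing from the highest index down leaves the remaining indices valid.
        for (int i : desc) page.removeElementAt(i);

        System.out.println(page);   // prints [n0, n5, n6]
    }
}
```

Had the same indices been removed in ascending order, every removal would shift the later elements left and the remaining indices would point at the wrong nodes.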
        
      • endPointsToStream

        public static java.util.stream.IntStream endPointsToStream​
                    (java.lang.Iterable<DotPair> dpi,
                     boolean leastToGreatest)
        
        Collates a list of dotted-pairs into an IntStream. Here, only the starting and ending values of the DotPair's are inserted into the returned IntStream. Any indices that lie between DotPair.start and DotPair.end are not placed into the output-IntStream.

        All other behaviors of this method are the same as toStream(Iterable, boolean).
        Parameters:
        dpi - This may be any source of 'DotPair' instances that implements the interface java.lang.Iterable<DotPair>.
        leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
        Returns:
        A java.util.stream.IntStream of the integers that are members of this Iterable<DotPair>. Only the values DotPair.start and DotPair.end are included in the output-IntStream. This is unlike the method toStream(Iterable, boolean) in that, here, only the starting and ending points of each dotted-pair are placed into the result. In the other method, the start-index, end-index and all indices in between them are placed into the returned Stream.
        See Also:
        toStream(java.lang.Iterable<Torello.HTML.DotPair>,boolean)
        Code:
        Exact Method Body:
         Iterator<DotPair>   iter    = dpi.iterator();
         TreeSet<DotPair>    ts      = new TreeSet<>();
        
         // The tree-set will add the "DotPair" to the tree - and keep them sorted,
         // since that's what "TreeSet" does.
        
         while (iter.hasNext()) ts.add(iter.next());
        
         Iterator<DotPair>   tsIter  = leastToGreatest ? ts.iterator() : ts.descendingIterator();
         IntStream.Builder   builder = IntStream.builder();
         DotPair             dp      = null;
        
         if (leastToGreatest)
        
             // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
             while (tsIter.hasNext())
             {
                 dp = tsIter.next();
        
                 // In this method, only the start/end are placed into the IntStream
                 builder.add(dp.start);
        
                 // The indices BETWEEN start/end ARE NOT appended to the IntStream
                 builder.add(dp.end);
             }
        
         else
        
             // we are building a "reverse-index" stream... Make sure to add the sub-lists in
             // reverse-order.
        
             while (tsIter.hasNext())
             {
                 dp = tsIter.next();
        
                 // Only start/end are appended.
                 builder.add(dp.end);
        
                 // NOTE: This is a "reverse order" IntStream
                 builder.add(dp.start);
             }
        
         if (leastToGreatest)
        
             // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point...
             // MULTIPLE, OVERLAPPING DOTPAIRS
             // We need to sort because the DotPair sublists have been added in "sorted order" but
             // the overall list is not (necessarily, but possibly) sorted!
        
             return builder.build().sorted().distinct();
        
         else
        
             // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
             // around with the fact that Java's 'IntStream' class does not have a specialized
             // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
        
             return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
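
A compact way to see the difference from toStream is to collect only the two end points of each pair. This sketch again models each sublist as a plain `int[] {start, end}` pair, which is an assumption for illustration; `EndPointsDemo` and `endPoints` are hypothetical names, and only the ascending case is shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

final class EndPointsDemo
{
    // Keeps only the two end points of each inclusive {start, end} pair,
    // sorted ascending with duplicates removed.
    static int[] endPoints(List<int[]> pairs)
    {
        return pairs.stream()
            .flatMapToInt(p -> IntStream.of(p[0], p[1]))
            .sorted()
            .distinct()
            .toArray();
    }

    public static void main(String[] args)
    {
        int[] pts = endPoints(List.of(new int[] {2, 6}, new int[] {6, 9}, new int[] {4, 4}));
        System.out.println(Arrays.toString(pts));   // prints [2, 4, 6, 9]
    }
}
```

A single-index pair such as {4, 4} contributes one value, and an end point shared by two pairs (6 here) appears once.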
        
      • exceptionCheck

        public void exceptionCheck​(java.util.Vector<HTMLNode> page,
                                   java.lang.String... possibleTokens)
        A method that will do a fast check that 'this' instance holds index-pointers to an opening and closing HTML-Tag pair. Note that, though these mistakes may seem trivial, when parsing Internet Web-Pages, these are exactly the type of basic mistakes that users will make when their level of 'concentration' is low. This is no different than checking an array-index or String-index for an IndexOutOfBoundsException.

        This type of detailed exception message can make analyzing web-pages more direct and less error-prone. The 'cost' incurred includes only a few if-statement comparisons, and this check should be performed immediately before a loop is entered.
        Parameters:
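        page - Any web-page, or sub-page. It needs to be the page from which 'this' instance of DotPair was retrieved.
        possibleTokens - The HTML tokens that the opening and closing TagNode pair is allowed to have. Each must be a non-null, valid, non-singleton HTML token.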
        Throws:
        TagNodeExpectedException - If the start or end fields of 'this' instance do not point to TagNode elements on the 'page'.
        HTMLTokException - If the TagNode.tok fields of the TagNode's at start and end do not match each other, or if that token is not among the String's passed to var-args parameter 'possibleTokens'.
        OpeningTagNodeExpectedException - If start does not point to an opening TagNode.
        ClosingTagNodeExpectedException - If end does not point to a closing TagNode.
        java.lang.NullPointerException - If the 'page' parameter is null.
        java.lang.IndexOutOfBoundsException - If start or end is greater than or equal to page.size().
        ExceptionCheckError - IMPORTANT: Since this method is, indubitably, a method for performing error checking, the presumption is that the programmer is trying to check the user's input. If, in the process of checking for user error, another mistake is made that would generate an exception, this must be thought of as a more serious error.

        The purpose of the 'possibleTokens' array is to check that those tokens match the tokens that are contained by the TagNode's on the page at index this.start and this.end. If invalid HTML tokens, null tokens, or even HTML Singleton tokens are passed, this exception-check itself is flawed! If there are problems with this var-args array, this error is thrown.

        It is more serious because it indicates that the programmer has made a mistake in attempting to check for user-errors.
        Code:
        Exact Method Body:
         if (page == null) throw new NullPointerException
             ("HTML-Vector parameter was passed a null reference.");
        
         if (possibleTokens == null) throw new ExceptionCheckError
             ("HTML tags string-list was passed a null reference.");
        
         for (String token : possibleTokens)
         {
             if (token == null) throw new ExceptionCheckError
                 ("One of the HTML Tag's in the tag-list String-array was null.");
        
             if (! HTMLTags.isTag(token)) throw new ExceptionCheckError
                 ("One of the passed tokens [" + token +"] is not a valid HTML token.");
        
             if (HTMLTags.isSingleton(token)) throw new ExceptionCheckError
                 ("One of the passed tokens [" + token +"] is an HTML Singleton.");
         }
        
        
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
         // Check the DotPair.start
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        
         if (this.start >= page.size()) throw new IndexOutOfBoundsException(
             "DotPair's 'start' field [" + this.start + "], is greater than or equal to the " +
             "size of the HTML-Vector [" + page.size() + "]."
         );
        
         if (! (page.elementAt(this.start) instanceof TagNode))
             throw new TagNodeExpectedException(this.start);
        
         TagNode t1 = (TagNode) page.elementAt(this.start);
        
         if (t1.isClosing) throw new OpeningTagNodeExpectedException(
             "The TagNode at index [" + this.start + "] was a closing " +
             "</" + t1.tok.toUpperCase() + ">, but an opening tag was expected here."
         );
        
        
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
         // Now Check the DotPair.end
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        
         if (this.end >= page.size()) throw new IndexOutOfBoundsException(
             "DotPair's 'end' field [" + this.end + "], is greater than or equal to the " +
             "size of the HTML-Vector [" + page.size() + "]."
         );
        
         if (! (page.elementAt(this.end) instanceof TagNode))
             throw new TagNodeExpectedException(this.end);
        
         TagNode t2 = (TagNode) page.elementAt(this.end);
        
         if (! t2.isClosing) throw new ClosingTagNodeExpectedException(
             "The TagNode at index [" + this.end + "] was an opening " +
             "<" + t2.tok.toUpperCase() + ">, but a closing tag was expected here."
         );
        
        
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
         // Token Check
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        
         if (! t1.tok.equalsIgnoreCase(t2.tok)) throw new HTMLTokException(
             "The opening TagNode was the [" + t1.tok.toLowerCase() + "] HTML Tag, while the " +
             "closing Tag was the [" + t2.tok.toLowerCase() + "].  These two tag's must be an " +
             "opening and closing pair, and therefore must match each-other."
         );
        
         for (String possibleToken : possibleTokens)
             if (possibleToken.equalsIgnoreCase(t1.tok))
                 return;
        
         String t = t1.tok.toUpperCase();
        
         throw new HTMLTokException(
             "The opening and closing tags were: <" + t + ">, and </" + t + ">, but " +
             "unfortunately this Tag is not included among the list of expected tags:\n" +
             "    [" + StrCSV.toCSV(possibleTokens, false, false, 60) + "]."
         );
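
The order of the checks in the method body above can be sketched without the library. The following is only an illustration under stated assumptions: Tag is a hypothetical stand-in for TagNode, the page is a plain List, and the method returns a description of the first failed check instead of throwing the library's exceptions. The validation of the 'possibleTokens' array itself is omitted.

```java
import java.util.Arrays;
import java.util.List;

class PairCheckSketch
{
    // Hypothetical stand-in for the library's TagNode: a token plus a closing flag.
    static final class Tag
    {
        final String tok;
        final boolean isClosing;

        Tag(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    // Runs the checks in the same order as the method body above, and returns a
    // description of the first one that fails, or "ok" if all of them pass.
    static String check(List<Object> page, int start, int end, String... possibleTokens)
    {
        if (start >= page.size()) return "start out of bounds";
        if (!(page.get(start) instanceof Tag)) return "tag expected at start";
        Tag t1 = (Tag) page.get(start);
        if (t1.isClosing) return "opening tag expected at start";

        if (end >= page.size()) return "end out of bounds";
        if (!(page.get(end) instanceof Tag)) return "tag expected at end";
        Tag t2 = (Tag) page.get(end);
        if (!t2.isClosing) return "closing tag expected at end";

        if (!t1.tok.equalsIgnoreCase(t2.tok)) return "tokens do not match";

        for (String token : possibleTokens)
            if (token.equalsIgnoreCase(t1.tok)) return "ok";

        return "token not among expected tokens";
    }

    public static void main(String[] args)
    {
        // A tiny "page": <div>, some text, </div>
        List<Object> page = Arrays.asList(new Tag("div", false), "Hello", new Tag("div", true));

        System.out.println(check(page, 0, 2, "div", "span"));  // prints ok
        System.out.println(check(page, 0, 1, "div"));          // prints tag expected at end
        System.out.println(check(page, 0, 2, "p"));            // prints token not among expected tokens
    }
}
```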
        
      • exceptionCheck

        public void exceptionCheck​(java.util.Vector<HTMLNode> page)
        Performs an exception check, using 'this' instance of DotPair, and throws an IndexOutOfBoundsException if 'this' contains end-points that do not fit inside the 'page' Vector Parameter.
        Parameters:
        page - Any HTML Page, or subpage. page.size() must return a value that is larger than BOTH start AND end.
        Throws:
        java.lang.IndexOutOfBoundsException - If start or end is greater than or equal to the size of the Vector parameter 'page'.
        Code:
        Exact Method Body:
         if (this.end >= page.size()) throw new IndexOutOfBoundsException(
             "The value of this.end [" + this.end + "] is greater than or equal to the size of " +
             "Vector parameter 'page' [" + page.size() + "]"
         );
        
         // This is actually unnecessary.  If 'end' is fine, then 'start' must be fine.  If 'end' is
         // out of bounds, then it is irrelevant whether 'start' is out of bounds.  "They" play with
         // your brain when you are coding.
        
         /*
         if (this.start >= page.size()) throw new IndexOutOfBoundsException(
             "The value of this.start [" + this.start + "] is greater than the size of Vector " +
             "parameter 'page' [" + page.size() + "]"
         );
         */
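
Because DotPair.end is inclusive, the bounds check above requires 'end' to be strictly less than page.size(), and converting to an exclusive-end API means adding one. A minimal sketch of that relationship, using a hypothetical helper on a plain List rather than the library's own classes, and assuming 0 <= start <= end:

```java
import java.util.Arrays;
import java.util.List;

class InclusiveEndSketch
{
    // Hypothetical helper (not part of the library): returns the elements at indices
    // start..end, where BOTH end-points are inclusive.
    static <T> List<T> slice(List<T> list, int start, int end)
    {
        // Because 'end' is inclusive, it must be strictly less than list.size().
        if (end >= list.size()) throw new IndexOutOfBoundsException(
            "end [" + end + "] is greater than or equal to the list size [" + list.size() + "]"
        );

        // List.subList takes an exclusive upper bound, hence 'end + 1'.
        return list.subList(start, end + 1);
    }

    public static void main(String[] args)
    {
        List<String> page = Arrays.asList("a", "b", "c", "d");

        System.out.println(slice(page, 1, 2));  // prints [b, c]
        System.out.println(slice(page, 3, 3));  // prints [d]
    }
}
```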