Package Torello.HTML

Class DPUtil

    • Constructor Summary

      Constructors 
      Constructor Description
      DPUtil()  
    • Method Summary

       
      Collate Multiple DotPair's to a Single Index-List
      Modifier and Type Method
      static PrimitiveIterator.OfInt iterator​(Iterable<DotPair> dpi, boolean leastToGreatest)
      static int[] toPosArray​(Iterable<DotPair> dpi, boolean leastToGreatest)
      static IntStream toStream​(Iterable<DotPair> dpi, boolean leastToGreatest)
       
      Collate Multiple DotPair's to a Single Index-List, Include End-Points Only
      Modifier and Type Method
      static PrimitiveIterator.OfInt endPointsIterator​(Iterable<DotPair> dpi, boolean leastToGreatest)
      static int[] endPointsToPosArray​(Iterable<DotPair> dpi, boolean leastToGreatest)
      static IntStream endPointsToStream​(Iterable<DotPair> dpi, boolean leastToGreatest)
       
      Retrieve HTMLNode's from an HTML-Vector, using one or more DotPair's
      Modifier and Type Method
      static Vector<SubSection> toSubSections​(Vector<? extends HTMLNode> html, Iterable<DotPair> sublists)
      static Vector<HTMLNode> toVector​(Vector<? extends HTMLNode> html, DotPair dp)
      static Vector<Vector<HTMLNode>> toVectors​(Vector<? extends HTMLNode> html, Iterable<DotPair> sublists)
       
      Collate the Mirror-Inverse (a.k.a. "The Split") of Multiple DotPair's
      Modifier and Type Method
      static PrimitiveIterator.OfInt excludedToIterator​(Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
      static int[] excludedToPosArray​(Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
      static IntStream excludedToStream​(Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • toVector

        public static java.util.Vector<HTMLNode> toVector
                    (java.util.Vector<? extends HTMLNode> html,
                     DotPair dp)
        
        This method converts a sublist, represented by a "dotted pair", into a Vector of HTMLNode.

        Retrieval Heuristic:
        The parameter 'dp' contains the fields DotPair.start and DotPair.end. These two fields are the starting and ending indices of a sublist within an HTML page Vector. In this method, they are indices into the HTML-Vector parameter named 'html'.

        This method cycles through that Vector, beginning with the field dp.start (inclusive) and ending with dp.end (inclusive, as well). Each HTMLNode reference within this sublist is inserted into the Vector that is ultimately returned.
        Parameters:
        html - Any Vectorized-HTML Web-Page, or sub-page
        dp - Any sublist within that HTML page.
        Returns:
        A Vector version of the original sublist that was represented by the parameter 'dp'
        Code:
        Exact Method Body:
         Vector<HTMLNode> ret = new Vector<>();
        
         // NOTE: Cannot return the result of "subList" because it is linked/mapped to the original
         //       Vector.  If changes are made to "subList", those changes will be reflected in the
         //       original HTML Vector.
        
         ret.addAll(html.subList(dp.start, dp.end + 1));
        
         return ret;
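
        The difference between copying a range and returning a "subList" view can be sketched as follows. This is a minimal, stand-alone illustration and not the library's code: the class 'SubListCopySketch', the stand-in class 'Pair' (playing the role of DotPair) and the method 'copyRange' are all hypothetical names, and plain Strings are used in place of HTMLNode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class SubListCopySketch
{
    // Hypothetical stand-in for DotPair: two inclusive indices
    static final class Pair
    {
        final int start, end;
        Pair(int start, int end) { this.start = start; this.end = end; }
    }

    // Copies elements start..end (both inclusive) into a new, independent Vector
    static <T> Vector<T> copyRange(Vector<? extends T> v, Pair p)
    {
        // The upper bound of subList is exclusive, hence "end + 1"
        return new Vector<T>(v.subList(p.start, p.end + 1));
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("<p>", "Hello", "</p>", "<br>"));

        // Clearing the copy leaves the original page untouched
        Vector<String> copy = copyRange(page, new Pair(0, 2));
        copy.clear();
        System.out.println(page.size());    // prints 4

        // Clearing a subList view removes the same range from the page itself
        List<String> view = page.subList(0, 3);
        view.clear();
        System.out.println(page.size());    // prints 1
    }
}
```

        The second half of 'main' shows why a view is risky to hand back to a caller: any structural change made through it is applied to the original Vector.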
        
      • toVectors

        public static java.util.Vector<java.util.Vector<HTMLNode>> toVectors​
                    (java.util.Vector<? extends HTMLNode> html,
                     java.lang.Iterable<DotPair> sublists)
        
        This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Those sublists will be collected into another Vector and returned.
        Parameters:
        html - Any Vectorized-HTML Web-Page, or sub-page
        sublists - A "List of sublists" within that HTML page.
        Returns:
        This method shall return a Vector containing vectors as sublists.
        Code:
        Exact Method Body:
         Vector<Vector<HTMLNode>> ret = new Vector<>();
        
         for (DotPair sublist : sublists) ret.addElement(toVector(html, sublist));
        
         return ret;
        
      • toSubSections

        public static java.util.Vector<SubSection> toSubSections
                    (java.util.Vector<? extends HTMLNode> html,
                     java.lang.Iterable<DotPair> sublists)
        
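        This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Each resulting Vector is wrapped, together with the DotPair that produced it, in a new SubSection instance. Those SubSection instances will be collected into another Vector and returned.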
        Parameters:
        html - Any Vectorized-HTML Web-Page, or sub-page
        sublists - A "List of sublists" within that HTML page.
        Returns:
        This method shall return a Vector of SubSection instances, one for each DotPair in 'sublists'.
        Code:
        Exact Method Body:
         Vector<SubSection> ret = new Vector<>();
        
         for (DotPair sublist : sublists)
             ret.addElement(new SubSection(sublist, toVector(html, sublist)));
        
         return ret;
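
        The idea of keeping each sublist together with the location it was copied from can be sketched as below. Everything here is an assumption made for illustration: 'Pair' stands in for DotPair, 'Section' stands in for SubSection (whose actual fields are not shown on this page), and Strings stand in for HTMLNode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class SubSectionSketch
{
    // Hypothetical stand-in for DotPair: two inclusive indices
    static final class Pair
    {
        final int start, end;
        Pair(int start, int end) { this.start = start; this.end = end; }
    }

    // Hypothetical stand-in for SubSection: a location plus the nodes copied from it
    static final class Section<T>
    {
        final Pair location;
        final Vector<T> nodes;
        Section(Pair location, Vector<T> nodes) { this.location = location; this.nodes = nodes; }
    }

    // Builds one Section per Pair, copying the inclusive range start..end
    static <T> Vector<Section<T>> toSections(Vector<? extends T> page, Iterable<Pair> pairs)
    {
        Vector<Section<T>> ret = new Vector<>();

        for (Pair p : pairs)
            ret.add(new Section<T>(p, new Vector<T>(page.subList(p.start, p.end + 1))));

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> page =
            new Vector<>(Arrays.asList("<b>", "x", "</b>", " ", "<i>", "y", "</i>"));

        List<Pair> pairs = Arrays.asList(new Pair(0, 2), new Pair(4, 6));
        Vector<Section<String>> sections = toSections(page, pairs);

        for (Section<String> s : sections)
            System.out.println(s.location.start + ".." + s.location.end + " " + s.nodes);

        // prints:
        // 0..2 [<b>, x, </b>]
        // 4..6 [<i>, y, </i>]
    }
}
```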
        
      • toStream

        public static java.util.stream.IntStream toStream​
                    (java.lang.Iterable<DotPair> dpi,
                     boolean leastToGreatest)
        
        This method will convert a list of DotPair instances to a Java java.util.stream.IntStream. The generated IntStream shall contain all Vector-indices (integers) that are within the bounds of any of the DotPair's listed by parameter 'dpi'.

        Stating this a second time, the returned position-index list (which is of Java Type IntStream) is built out of the contents of each and every one of the DotPair's in Iterable-parameter 'dpi'. The index-list that is ultimately returned from this method will contain all indices that are "inside" at least one of those DotPair's.

        Repeated Indices & Gaps:
        This method will never include an index more than once, and all integers in the returned IntStream will be unique.

        The sublists (the DotPair's of input-parameter 'dpi') may overlap each other, and there may also be gaps between them. This method shall return an IntStream of unique, integer indices, all of which are guaranteed to be inside at least one of dpi's sublists.

        NodeSearch Find 'all' Methods:
        Many of the "Find" Methods available in the Torello.HTML.NodeSearch package return instances of Vector<DotPair>. These DotPair lists are to be thought-of as "lists of sub-lists" for a Vectorized-HTML web-page. This method can help identify each and every integer-index that is inside at least one of these returned sublists.

        Stale Data Reminder:
        Keep in mind that when writing code that modifies Vectorized-HTML, the moment any node is inserted into or deleted from a Vector, every previously saved index at or beyond the point of modification becomes invalid (unless special care is taken to update those indices by the number of nodes that were inserted or removed!)

        There are myriad ways to handle this issue, many of which are beyond the scope of this Documentation Entry. Generally, one of the better suggestions is that if you are modifying a Vectorized-HTML page, perform your updates or removals in reverse order (from the highest index to the lowest), and the indices you have not yet processed will not become stale.

        The interface Replaceable is also another way to avoid making elementary mistakes involving stale Vector-indices.
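
        The reverse-order suggestion above can be sketched in a few lines. This is only an illustration under stated assumptions: the class 'ReverseRemovalSketch' and its method 'removeAll' are hypothetical, Strings stand in for HTMLNode, and the position array is assumed to be already sorted greatest to least (for instance, the kind of list produced when 'leastToGreatest' is FALSE).

```java
import java.util.Arrays;
import java.util.Vector;

class ReverseRemovalSketch
{
    // Removes every listed position; 'positions' is applied in the order given
    static <T> void removeAll(Vector<T> v, int[] positions)
    {
        for (int pos : positions) v.removeElementAt(pos);
    }

    public static void main(String[] args)
    {
        // Goal: remove "b" (index 1) and "d" (index 3)
        Vector<String> good = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));
        removeAll(good, new int[] { 3, 1 });    // greatest to least
        System.out.println(good);               // prints [a, c, e]

        // Least to greatest: after removing index 1, "d" has shifted to index 2,
        // so the stale index 3 now removes "e" instead
        Vector<String> stale = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));
        removeAll(stale, new int[] { 1, 3 });
        System.out.println(stale);              // prints [a, c, d]
    }
}
```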
        Parameters:
        dpi - This may be any source of 'DotPair' instances that implements the public interface java.lang.Iterable.
        leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
        Returns:
        A java.util.stream.IntStream of the integers that are inside at least one member of this Iterable<DotPair>
        Code:
        Exact Method Body:
         Iterator<DotPair>   iter    = dpi.iterator();
         TreeSet<DotPair>    ts      = new TreeSet<>();
        
         // The tree-set will add the "DotPair" to the tree - and keep them sorted,
         // since that's what "TreeSet" does.
        
         while (iter.hasNext()) ts.add(iter.next());
        
         Iterator<DotPair>   tsIter  = leastToGreatest ? ts.iterator() : ts.descendingIterator();
         IntStream.Builder   builder = IntStream.builder();
         DotPair             dp      = null;
        
         if (leastToGreatest)
        
             // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
             while (tsIter.hasNext())
                 for (int i=(dp=tsIter.next()).start; i <= dp.end; i++)
                     builder.add(i);
        
         else
        
             // we are building a "reverse-index" stream... Make sure to add the sub-lists in
             // reverse-order.
        
             while (tsIter.hasNext())
                 for (int i=(dp=tsIter.next()).end; i >= dp.start; i--)
                     builder.add(i);
        
         if (leastToGreatest)
        
             // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point... 
             // MULTIPLE, OVERLAPPING DOTPAIRS
             // We need to sort because the DotPair sublists have been added in "sorted order" but
             // the overall list is not (necessarily, but possibly) sorted!
        
             return builder.build().sorted().distinct();
        
         else
        
             // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
             // around with the fact that Java's 'IntStream' class does not have a specialized
             // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
        
             return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
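
        The overlap and gap behavior described for this method can be reproduced with a small stand-alone sketch. It is not the method body above: the class 'CollateSketch', the stand-in class 'Pair' and the method 'collate' are hypothetical, and the descending case uses a boxed stream with Comparator.reverseOrder() in place of the negate-sort-negate approach.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

class CollateSketch
{
    // Hypothetical stand-in for DotPair: two inclusive indices
    static final class Pair
    {
        final int start, end;
        Pair(int start, int end) { this.start = start; this.end = end; }
    }

    // Every index inside at least one Pair, each index listed once, in the requested order
    static int[] collate(Iterable<Pair> pairs, boolean leastToGreatest)
    {
        IntStream.Builder b = IntStream.builder();

        for (Pair p : pairs)
            for (int i = p.start; i <= p.end; i++) b.add(i);

        // Overlapping pairs contribute the same index more than once
        IntStream unique = b.build().distinct();

        return leastToGreatest
            ? unique.sorted().toArray()
            : unique.boxed().sorted(Comparator.reverseOrder())
                .mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args)
    {
        // [2,5] and [4,7] overlap; there is a gap before [10,11]
        List<Pair> pairs = Arrays.asList(new Pair(4, 7), new Pair(10, 11), new Pair(2, 5));

        System.out.println(Arrays.toString(collate(pairs, true)));
        // prints [2, 3, 4, 5, 6, 7, 10, 11]

        System.out.println(Arrays.toString(collate(pairs, false)));
        // prints [11, 10, 7, 6, 5, 4, 3, 2]
    }
}
```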
        
      • endPointsToStream

        public static java.util.stream.IntStream endPointsToStream​
                    (java.lang.Iterable<DotPair> dpi,
                     boolean leastToGreatest)
        
        Collates a list of dotted-pairs into an IntStream. Here, only the starting and ending values of the DotPair's are inserted into the returned IntStream. Any indices that lie between DotPair.start and DotPair.end are not placed into the output-IntStream.

        All other behaviors of this method are the same as toStream(Iterable, boolean).
        Parameters:
        dpi - This may be any source of 'DotPair' instances that implements the public interface java.lang.Iterable.
        leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
        Returns:
        A java.util.stream.IntStream containing only the values DotPair.start and DotPair.end of each member of this Iterable<DotPair>. This is unlike the method toStream(Iterable, boolean), where the start-index, the end-index and all indices in between them are placed into the returned Stream.
        See Also:
        toStream(Iterable, boolean)
        Code:
        Exact Method Body:
         Iterator<DotPair>   iter    = dpi.iterator();
         TreeSet<DotPair>    ts      = new TreeSet<>();
        
         // The tree-set will add the "DotPair" to the tree - and keep them sorted,
         // since that's what "TreeSet" does.
        
         while (iter.hasNext()) ts.add(iter.next());
        
         Iterator<DotPair>   tsIter  = leastToGreatest ? ts.iterator() : ts.descendingIterator();
         IntStream.Builder   builder = IntStream.builder();
         DotPair             dp      = null;
        
         if (leastToGreatest)
        
             // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
             while (tsIter.hasNext())
             {
                 dp = tsIter.next();
        
                 // In this method, only the start/end are placed into the IntStream
                 builder.add(dp.start);
        
                 // The indices BETWEEN start/end ARE NOT appended to the IntStream
                 builder.add(dp.end);
             }
        
         else
        
             // we are building a "reverse-index" stream... Make sure to add the sub-lists in
             // reverse-order.
        
             while (tsIter.hasNext())
             {
                 dp = tsIter.next();
        
                 // Only start/end are appended.
                 builder.add(dp.end);
        
                 // NOTE: This is a "reverse order" IntStream
                 builder.add(dp.start);
             }
        
         if (leastToGreatest)
        
             // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point...
             // MULTIPLE, OVERLAPPING DOTPAIRS
             // We need to sort because the DotPair sublists have been added in "sorted order" but
             // the overall list is not (necessarily, but possibly) sorted!
        
             return builder.build().sorted().distinct();
        
         else
        
             // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
             // around with the fact that Java's 'IntStream' class does not have a specialized
             // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
        
             return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
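
        The end-points-only behavior can be illustrated with the sketch below. It is an assumption-laden illustration and not the method body above: 'EndPointsSketch', 'Pair' and 'endPoints' are hypothetical names, and a TreeSet<Integer> is used here to obtain both the ordering and the uniqueness.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class EndPointsSketch
{
    // Hypothetical stand-in for DotPair: two inclusive indices
    static final class Pair
    {
        final int start, end;
        Pair(int start, int end) { this.start = start; this.end = end; }
    }

    // Only 'start' and 'end' of each Pair are kept; interior indices are skipped
    static int[] endPoints(Iterable<Pair> pairs, boolean leastToGreatest)
    {
        TreeSet<Integer> set = new TreeSet<>();

        for (Pair p : pairs) { set.add(p.start); set.add(p.end); }

        // A TreeSet is sorted ascending; descendingSet() gives the reverse view
        NavigableSet<Integer> ordered = leastToGreatest ? set : set.descendingSet();

        return ordered.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args)
    {
        // 5 is shared by two pairs; [12,12] has identical end-points
        List<Pair> pairs = Arrays.asList(new Pair(5, 9), new Pair(2, 5), new Pair(12, 12));

        System.out.println(Arrays.toString(endPoints(pairs, true)));   // prints [2, 5, 9, 12]
        System.out.println(Arrays.toString(endPoints(pairs, false)));  // prints [12, 9, 5, 2]
    }
}
```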
        
      • excludedToStream

        public static java.util.stream.IntStream excludedToStream​
                    (java.lang.Iterable<DotPair> dpi,
                     int vectorSize,
                     boolean leastToGreatest)
        
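        This method will first collate and sort a list of input 'DotPair' instances. It then returns every index from 0 to 'vectorSize' - 1 that is not inside any of those DotPair's (the "mirror-inverse" of the list).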
        Parameters:
        dpi - This may be any source of 'DotPair' instances that implements the public interface java.lang.Iterable.
        vectorSize - This method's internal loop will begin at index '0' and proceed until 'vectorSize' - 1. Any value in this range that is found not to be inside any of the provided DotPair's will be included in the returned IntStream. This value must be at least 1.
        leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
        Returns:
        An IntStream of the indices in the range 0 to 'vectorSize' - 1 that are not inside any of the DotPair's in 'dpi'
        Code:
        Exact Method Body:
         Iterator<DotPair>   iter    = dpi.iterator();
         TreeSet<DotPair>    ts      = new TreeSet<>();
        
         if (vectorSize < 1) throw new NException(
             "You have passed " + vectorSize + " to parameter vectorSize, but this value must be " +
             "at least 1, or greater."
         );
        
         // All this is going to do is MAKE SURE that the DotPair's are ordered, least to greatest
         while (iter.hasNext()) ts.add(iter.next());
        
         iter = leastToGreatest ? ts.iterator() : ts.descendingIterator();
        
         DotPair             dp  = iter.hasNext() ? iter.next() : null;
         IntStream.Builder   b   = IntStream.builder();
        
         if (leastToGreatest)
        
             for (int i=0; i < vectorSize;)
        
                 if (dp == null)             b.accept(i++);
                 else if (i < dp.start)      b.accept(i++);
                 else if (dp.isInside(i))    { i++; continue; }
                 else if (iter.hasNext())    Objects.requireNonNull(dp = iter.next());
                 else                        dp = null;
        
         else 
        
             for (int i=(vectorSize-1); i >= 0;)
        
                 if (dp == null)             b.accept(i--);
                 else if (i < dp.start)      b.accept(i--);
                 else if (dp.isInside(i))    { i--; continue; }
                 else if (iter.hasNext())    Objects.requireNonNull(dp = iter.next());
                 else                        dp = null;
        
         return b.build();
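
        The intended "mirror-inverse" result can be illustrated with a stand-alone sketch. This is not the method body above, which walks the sorted DotPair's alongside the index counter; the sketch below swaps in a simpler mark-and-filter approach using a boolean array. The names 'ExcludedSketch', 'Pair' and 'excluded' are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class ExcludedSketch
{
    // Hypothetical stand-in for DotPair: two inclusive indices
    static final class Pair
    {
        final int start, end;
        Pair(int start, int end) { this.start = start; this.end = end; }
    }

    // Every index in 0..size-1 that is NOT inside any Pair, in the requested order
    static int[] excluded(Iterable<Pair> pairs, int size, boolean leastToGreatest)
    {
        if (size < 1) throw new IllegalArgumentException("size must be at least 1");

        // Mark each index that is covered by some Pair (clipped to the valid range)
        boolean[] covered = new boolean[size];

        for (Pair p : pairs)
            for (int i = Math.max(p.start, 0); i <= p.end && i < size; i++)
                covered[i] = true;

        // Visit 0..size-1 ascending, or size-1..0 descending
        IntStream positions = leastToGreatest
            ? IntStream.range(0, size)
            : IntStream.range(0, size).map(i -> size - 1 - i);

        return positions.filter(i -> !covered[i]).toArray();
    }

    public static void main(String[] args)
    {
        List<Pair> pairs = Arrays.asList(new Pair(7, 8), new Pair(2, 4));

        System.out.println(Arrays.toString(excluded(pairs, 10, true)));
        // prints [0, 1, 5, 6, 9]

        System.out.println(Arrays.toString(excluded(pairs, 10, false)));
        // prints [9, 6, 5, 1, 0]
    }
}
```

        Taken together with the collated index-list of the same DotPair's, these indices account for every position from 0 to 'size' - 1 exactly once.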