Package Torello.HTML
Class DPUtil
- java.lang.Object
-
- Torello.HTML.DPUtil
-
public class DPUtil extends java.lang.Object
Source-Code: Torello/HTML/DPUtil.java
File Size: 19,318 Bytes, Line Count: 425 '\n' Characters Found
-
-
Constructor Summary
Constructors
Constructor    Description
DPUtil()
-
Method Summary
Collate Multiple DotPair's to a Single Index-List
Modifier and Type                  Method
static PrimitiveIterator.OfInt     iterator(Iterable<DotPair> dpi, boolean leastToGreatest)
static int[]                       toPosArray(Iterable<DotPair> dpi, boolean leastToGreatest)
static IntStream                   toStream(Iterable<DotPair> dpi, boolean leastToGreatest)

Collate Multiple DotPair's to a Single Index-List, Include End-Points Only
Modifier and Type                  Method
static PrimitiveIterator.OfInt     endPointsIterator(Iterable<DotPair> dpi, boolean leastToGreatest)
static int[]                       endPointsToPosArray(Iterable<DotPair> dpi, boolean leastToGreatest)
static IntStream                   endPointsToStream(Iterable<DotPair> dpi, boolean leastToGreatest)

Retrieve HTMLNode's from an HTML-Vector, using one or more DotPair's
Modifier and Type                  Method
static Vector<SubSection>          toSubSections(Vector<? extends HTMLNode> html, Iterable<DotPair> sublists)
static Vector<HTMLNode>            toVector(Vector<? extends HTMLNode> html, DotPair dp)
static Vector<Vector<HTMLNode>>    toVectors(Vector<? extends HTMLNode> html, Iterable<DotPair> sublists)

Collate the Mirror-Inverse (a.k.a. "The Split") of Multiple DotPair's
Modifier and Type                  Method
static PrimitiveIterator.OfInt     excludedToIterator(Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
static int[]                       excludedToPosArray(Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
static IntStream                   excludedToStream(Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
-
-
-
Constructor Detail
-
DPUtil
public DPUtil()
-
-
Method Detail
-
toVector
public static java.util.Vector<HTMLNode> toVector (java.util.Vector<? extends HTMLNode> html, DotPair dp)
This method takes a sublist, represented by a "dotted pair", and converts it into a Vector of HTMLNode.
Retrieval Heuristic:
The parameter 'dp' contains fields named DotPair.start and DotPair.end. These two fields represent the starting and ending indices into an HTML page Vector. In this method, they are treated as indices into the HTML-Vector parameter named 'html'.
This method cycles through that Vector, beginning with the field dp.start (inclusive) and ending with dp.end (inclusive, as well). Each HTMLNode reference within this sublist is inserted into the Vector that is ultimately returned.
- Parameters:
html - Any Vectorized-HTML Web-Page, or sub-page
dp - Any sublist within that HTML page.
- Returns:
A Vector version of the original sublist that was represented by the passed parameter 'dp'
- Code:
- Exact Method Body:
    Vector<HTMLNode> ret = new Vector<>();

    // NOTE: Cannot return the result of "subList" because it is linked/mapped to the original
    // Vector. If changes are made to "subList", those changes will be reflected in the
    // original HTML Vector.
    ret.addAll(html.subList(dp.start, dp.end + 1));

    return ret;
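The note about "subList" in the method body can be illustrated with a minimal sketch. It uses a Vector of String as a stand-in for a vectorized HTML page; the class name SubListCopySketch and the helper copyRange are hypothetical and are not part of this library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class; Strings stand in for HTMLNode instances.
class SubListCopySketch
{
    // Copies the inclusive range [start, end] into a new, independent Vector.
    static <T> Vector<T> copyRange(Vector<? extends T> v, int start, int end)
    {
        // The upper bound of subList is exclusive, hence 'end + 1'
        return new Vector<>(v.subList(start, end + 1));
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(
            Arrays.asList("<div>", "Hello", "<b>", "World", "</b>", "</div>"));

        // Changing the copy leaves the original page untouched
        Vector<String> copy = copyRange(page, 2, 4);
        copy.set(0, "<i>");
        System.out.println(page.get(2));    // prints <b>

        // Changing the subList view writes through to the original page
        List<String> view = page.subList(2, 5);
        view.set(0, "<i>");
        System.out.println(page.get(2));    // prints <i>
    }
}
```

The copy costs one extra allocation per sublist, but it keeps later edits to the returned Vector from silently altering the source page.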
-
toVectors
public static java.util.Vector<java.util.Vector<HTMLNode>> toVectors (java.util.Vector<? extends HTMLNode> html, java.lang.Iterable<DotPair> sublists)
This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Those sublists will be collected into another Vector and returned.
- Parameters:
html - Any Vectorized-HTML Web-Page, or sub-page
sublists - A "List of sublists" within that HTML page.
- Returns:
This method shall return a Vector containing vectors as sublists.
- Code:
- Exact Method Body:
    Vector<Vector<HTMLNode>> ret = new Vector<>();

    for (DotPair sublist : sublists) ret.addElement(toVector(html, sublist));

    return ret;
-
toSubSections
public static java.util.Vector<SubSection> toSubSections (java.util.Vector<? extends HTMLNode> html, java.lang.Iterable<DotPair> sublists)
This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Each resulting Vector is paired with its DotPair in a new SubSection instance, and those SubSection instances are collected into another Vector and returned.
- Parameters:
html - Any Vectorized-HTML Web-Page, or sub-page
sublists - A "List of sublists" within that HTML page.
- Returns:
This method shall return a Vector containing one SubSection for each sublist.
- Code:
- Exact Method Body:
    Vector<SubSection> ret = new Vector<>();

    for (DotPair sublist : sublists)
        ret.addElement(new SubSection(sublist, toVector(html, sublist)));

    return ret;
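As a rough illustration of the pairing performed above, the following sketch keeps each index range together with a copy of the nodes it covers. The types Range and Section are hypothetical stand-ins for DotPair and SubSection (their field names are assumptions), and Strings stand in for HTMLNode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class, not part of the library.
class SubSectionSketch
{
    // Stand-in for DotPair: an inclusive [start, end] index range
    static final class Range
    {
        final int start, end;
        Range(int start, int end) { this.start = start; this.end = end; }
    }

    // Stand-in for SubSection: a range plus a copy of the nodes inside it
    static final class Section
    {
        final Range location;
        final Vector<String> nodes;

        Section(Range location, Vector<String> nodes)
        { this.location = location; this.nodes = nodes; }
    }

    static Vector<Section> toSections(Vector<String> page, Iterable<Range> ranges)
    {
        Vector<Section> ret = new Vector<>();

        // One Section per range; the nodes are copied, not viewed
        for (Range r : ranges)
            ret.add(new Section(r, new Vector<>(page.subList(r.start, r.end + 1))));

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));
        List<Range> ranges = Arrays.asList(new Range(0, 1), new Range(3, 4));

        for (Section s : toSections(page, ranges))
            System.out.println(s.location.start + ".." + s.location.end + " " + s.nodes);
    }
}
```

Keeping the range alongside the copied nodes means the caller still knows where in the page each group of nodes came from.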
-
iterator
-
toPosArray
public static int[] toPosArray(java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
-
toStream
public static java.util.stream.IntStream toStream (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
This method will convert a list of DotPair instances to a Java java.util.stream.IntStream. The generated IntStream shall contain all Vector-indices (integers) that are within the bounds of any of the DotPair's listed by parameter 'dpi'.
Stating this a second time, the returned position-index list (which is of Java Type IntStream) is built out of the contents of each and every one of the DotPair's in Iterable-parameter 'dpi'. The index-list that is ultimately returned from this method will contain all indices that are "inside" at least one of those DotPair's.
Repeated Indices & Gaps:
This method will never include an index more than once, and all integers in the returned IntStream will be unique.
The sublists (the DotPair's of input-parameter 'dpi') may overlap each other, and are even likely to. Furthermore, other DotPair sublists may have several gaps between them. This method shall return an IntStream of unique, integer indices, all of which are guaranteed to be inside at least one of dpi's sublists.
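The effect of overlaps and gaps can be seen in a small sketch. It represents each dotted pair as a two-element int[] of {start, end}, which is an assumption made to keep the example self-contained; the class CollateSketch and its method collate are hypothetical and use a plain stream pipeline rather than the library's own code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the library.
class CollateSketch
{
    // Each range is an int[] {start, end}, both ends inclusive
    static int[] collate(List<int[]> ranges)
    {
        return ranges.stream()
            .flatMapToInt(r -> IntStream.rangeClosed(r[0], r[1]))   // expand every range
            .sorted()                                               // order across ranges
            .distinct()                                             // drop overlap repeats
            .toArray();
    }

    public static void main(String[] args)
    {
        // [4,7] and [2,5] overlap; indices 8 and 9 fall in a gap before [10,11]
        List<int[]> ranges = Arrays.asList(
            new int[] { 4, 7 }, new int[] { 2, 5 }, new int[] { 10, 11 });

        System.out.println(Arrays.toString(collate(ranges)));
        // prints [2, 3, 4, 5, 6, 7, 10, 11]
    }
}
```

Indices 4 and 5 belong to two ranges but appear once, and the gap indices 8 and 9 do not appear at all.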
NodeSearch Find 'all' Methods:
Many of the "Find" Methods available in the Torello.HTML.NodeSearch package return instances of Vector<DotPair>. These DotPair lists are to be thought of as "lists of sub-lists" for a Vectorized-HTML web-page. This method can help identify each and every integer-index that is inside at least one of these returned sublists.
Stale Data Reminder:
Keep in mind that when writing code that modifies Vectorized-HTML, the moment any node is inserted into or deleted from a Vector, previously computed indices into that exact Vector may become invalid (unless special care is taken to adjust those indices by the number of nodes that were inserted or removed).
There are myriad ways to handle this issue, many of which are beyond the scope of this Documentation Entry. Generally, one of the better suggestions to keep in mind is that if you are modifying a Vectorized-HTML page, perform your updates or removals in reverse order (greatest index first), and your Vector index-list pointers will not become stale.
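The reverse-order advice can be sketched as follows. The example assumes the positions have already been collated greatest to least with no repeats (for instance by passing FALSE to 'leastToGreatest'); the class ReverseRemovalSketch and the method removeDescending are hypothetical, and Strings stand in for HTMLNode.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical demo class, not part of the library.
class ReverseRemovalSketch
{
    // 'descendingPositions' must be sorted greatest to least, with no repeats
    static <T> void removeDescending(Vector<T> v, int[] descendingPositions)
    {
        // Removing the highest index first never shifts a node that still has
        // to be removed, so the remaining positions stay valid.
        for (int pos : descendingPositions) v.remove(pos);
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("a", "b", "c", "d", "e", "f"));

        removeDescending(page, new int[] { 4, 3, 1 });

        System.out.println(page);   // prints [a, c, f]
    }
}
```

Removing the same positions in ascending order would shift the later nodes to the left after the first removal, so the remaining indices would point at the wrong nodes.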
The interface Replaceable is another way to avoid making elementary mistakes involving stale Vector-indices.
- Parameters:
dpi - This may be any source of class 'DotPair' instances which implements the public interface java.lang.Iterable.
leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
- Returns:
A java.util.stream.IntStream of the integers that are members of this Iterable<DotPair>
- Code:
- Exact Method Body:
    Iterator<DotPair> iter = dpi.iterator();
    TreeSet<DotPair> ts = new TreeSet<>();

    // The tree-set will add the "DotPair" to the tree - and keep them sorted,
    // since that's what "TreeSet" does.
    while (iter.hasNext()) ts.add(iter.next());

    Iterator<DotPair> tsIter = leastToGreatest ? ts.iterator() : ts.descendingIterator();
    IntStream.Builder builder = IntStream.builder();
    DotPair dp = null;

    if (leastToGreatest)

        // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
        while (tsIter.hasNext())
            for (int i=(dp=tsIter.next()).start; i <= dp.end; i++) builder.add(i);

    else

        // we are building a "reverse-index" stream... Make sure to add the sub-lists in
        // reverse-order.
        while (tsIter.hasNext())
            for (int i=(dp=tsIter.next()).end; i >= dp.start; i--) builder.add(i);

    if (leastToGreatest)

        // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point...
        // MULTIPLE, OVERLAPPING DOTPAIRS
        // We need to sort because the DotPair sublists have been added in "sorted order" but
        // the overall list is not (necessarily, but possibly) sorted!
        return builder.build().sorted().distinct();

    else

        // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
        // around with the fact that Java's 'IntStream' class does not have a specialized
        // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
        return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
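The negate, sort, negate idiom used in the last line of the method body can be tried in isolation. The sketch below compares it with a boxed alternative that uses Comparator.reverseOrder(); the class DescendingSortSketch and both method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the library.
class DescendingSortSketch
{
    // Same idiom as the method body: sort the negated values, then negate back
    static int[] byNegation(int[] values)
    {
        return IntStream.of(values)
            .map(i -> -i).sorted().map(i -> -i)
            .distinct()
            .toArray();
    }

    // Alternative: box to Integer and sort with a reversed comparator
    static int[] byBoxing(int[] values)
    {
        return IntStream.of(values)
            .boxed()
            .sorted(Comparator.reverseOrder())
            .mapToInt(Integer::intValue)
            .distinct()
            .toArray();
    }

    public static void main(String[] args)
    {
        int[] in = { 3, 9, 4, 9, 1 };

        System.out.println(Arrays.toString(byNegation(in)));   // prints [9, 4, 3, 1]
        System.out.println(Arrays.toString(byBoxing(in)));     // prints [9, 4, 3, 1]
    }
}
```

The negation approach avoids boxing but would misbehave for Integer.MIN_VALUE, whose negation overflows; that value cannot occur among Vector indices, which are never negative.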
-
endPointsIterator
public static java.util.PrimitiveIterator.OfInt endPointsIterator (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
-
endPointsToPosArray
public static int[] endPointsToPosArray(java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
Convenience Method
Invokes: endPointsToStream(Iterable, boolean)
Converts: output to an int[] array.
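The conversions performed by the convenience methods in this class correspond to two standard IntStream operations. A minimal sketch, in which the class StreamConversionSketch and its method names are hypothetical:

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the library.
class StreamConversionSketch
{
    // IntStream.toArray() collects the stream into an int[]
    static int[] toPosArray(IntStream s)
    { return s.toArray(); }

    // IntStream.iterator() exposes the stream as a PrimitiveIterator.OfInt
    static PrimitiveIterator.OfInt toIterator(IntStream s)
    { return s.iterator(); }

    public static void main(String[] args)
    {
        System.out.println(Arrays.toString(toPosArray(IntStream.of(2, 4, 5, 7))));
        // prints [2, 4, 5, 7]

        PrimitiveIterator.OfInt it = toIterator(IntStream.of(2, 4, 5, 7));
        while (it.hasNext()) System.out.println(it.nextInt());
    }
}
```

A stream can be consumed only once, so each conversion needs its own stream instance.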
-
endPointsToStream
public static java.util.stream.IntStream endPointsToStream (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
Collates a list of dotted-pairs into an IntStream. Here, only the starting and ending values of the DotPair's are inserted into the returned IntStream. Any indices that lie between DotPair.start and DotPair.end are not placed into the output IntStream.
All other behaviors of this method are the same as toStream(Iterable, boolean).
- Parameters:
dpi - This may be any source of class 'DotPair' instances which implements the public interface java.lang.Iterable.
leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
- Returns:
A java.util.stream.IntStream of the integers that are members of this Iterable<DotPair>. Only the values DotPair.start and DotPair.end are included in the output IntStream. This is unlike the method toStream(Iterable, boolean) in that, here, only the starting and ending points of each dotted-pair are placed into the result. In the other method, the start-index, end-index and all indices in between them are placed into the returned Stream.
- See Also:
toStream(Iterable, boolean)
- Code:
- Exact Method Body:
    Iterator<DotPair> iter = dpi.iterator();
    TreeSet<DotPair> ts = new TreeSet<>();

    // The tree-set will add the "DotPair" to the tree - and keep them sorted,
    // since that's what "TreeSet" does.
    while (iter.hasNext()) ts.add(iter.next());

    Iterator<DotPair> tsIter = leastToGreatest ? ts.iterator() : ts.descendingIterator();
    IntStream.Builder builder = IntStream.builder();
    DotPair dp = null;

    if (leastToGreatest)

        // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
        while (tsIter.hasNext())
        {
            dp = tsIter.next();

            // In this method, only the start/end are placed into the IntStream
            builder.add(dp.start);

            // The indices BETWEEN start/end ARE NOT appended to the IntStream
            builder.add(dp.end);
        }

    else

        // we are building a "reverse-index" stream... Make sure to add the sub-lists in
        // reverse-order.
        while (tsIter.hasNext())
        {
            dp = tsIter.next();

            // Only start/end are appended.
            builder.add(dp.end);

            // NOTE: This is a "reverse order" IntStream
            builder.add(dp.start);
        }

    if (leastToGreatest)

        // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point...
        // MULTIPLE, OVERLAPPING DOTPAIRS
        // We need to sort because the DotPair sublists have been added in "sorted order" but
        // the overall list is not (necessarily, but possibly) sorted!
        return builder.build().sorted().distinct();

    else

        // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
        // around with the fact that Java's 'IntStream' class does not have a specialized
        // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
        return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
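The difference from toStream(Iterable, boolean) can be shown with a short sketch that keeps only the end-points. Ranges are modeled as int[] {start, end}, which is an assumption for the sake of a self-contained example; the class EndPointsSketch and its method endPoints are hypothetical and do not reproduce the library's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the library.
class EndPointsSketch
{
    // Each range is an int[] {start, end}, both ends inclusive
    static int[] endPoints(List<int[]> ranges, boolean leastToGreatest)
    {
        int[] ascending = ranges.stream()
            .flatMapToInt(r -> IntStream.of(r[0], r[1]))    // end-points only
            .sorted()
            .distinct()
            .toArray();

        if (leastToGreatest) return ascending;

        // Reverse the ascending result for greatest-to-least order
        int[] descending = new int[ascending.length];

        for (int i = 0; i < ascending.length; i++)
            descending[i] = ascending[ascending.length - 1 - i];

        return descending;
    }

    public static void main(String[] args)
    {
        List<int[]> ranges = Arrays.asList(
            new int[] { 2, 5 }, new int[] { 4, 7 }, new int[] { 7, 7 });

        System.out.println(Arrays.toString(endPoints(ranges, true)));
        // prints [2, 4, 5, 7]

        System.out.println(Arrays.toString(endPoints(ranges, false)));
        // prints [7, 5, 4, 2]
    }
}
```

Interior indices such as 3 and 6 are absent, and the shared end-point 7 appears only once.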
-
excludedToIterator
public static java.util.PrimitiveIterator.OfInt excludedToIterator (java.lang.Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
Convenience Method
Invokes: excludedToStream(Iterable, int, boolean)
Converts: output to an Iterator
-
excludedToPosArray
public static int[] excludedToPosArray(java.lang.Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
Convenience Method
Invokes: excludedToStream(Iterable, int, boolean)
Converts: output to an int[] array.
-
excludedToStream
public static java.util.stream.IntStream excludedToStream (java.lang.Iterable<DotPair> dpi, int vectorSize, boolean leastToGreatest)
This method will first collate and sort a list of input 'DotPair' instances. It then collects every index in the range 0 to 'vectorSize' - 1 that is not inside any of those DotPair's.
- Parameters:
dpi - This may be any source of class 'DotPair' instances which implements the public interface java.lang.Iterable.
vectorSize - This method's internal loop will begin at index '0' and proceed until 'vectorSize' - 1. Any value in this range that is found not to be inside any of the provided DotPair's will be included inside the returned IntStream.
leastToGreatest - When this parameter receives a TRUE value, the results that are returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results that are sorted from greatest to least, pass FALSE to this parameter.
- Returns:
An IntStream of the indices in the range 0 to 'vectorSize' - 1 that are not inside any of the provided DotPair's.
- Code:
- Exact Method Body:
    Iterator<DotPair> iter = dpi.iterator();
    TreeSet<DotPair> ts = new TreeSet<>();

    if (vectorSize < 1) throw new NException(
        "You have passed " + vectorSize + " to parameter vectorSize, but this value must be " +
        "at least 1, or greater."
    );

    // All this is going to do is MAKE SURE that the DotPair's are ordered, least to greatest
    while (iter.hasNext()) ts.add(iter.next());

    iter = leastToGreatest ? ts.iterator() : ts.descendingIterator();

    DotPair dp = iter.hasNext() ? iter.next() : null;
    IntStream.Builder b = IntStream.builder();

    if (leastToGreatest)

        for (int i=0; i < vectorSize;)

            if (dp == null)             b.accept(i++);
            else if (i < dp.start)      b.accept(i++);
            else if (dp.isInside(i))    { i++; continue; }
            else if (iter.hasNext())    Objects.requireNonNull(dp = iter.next());
            else                        dp = null;

    else

        for (int i=(vectorSize-1); i >= 0;)

            if (dp == null)             b.accept(i--);
            else if (i < dp.start)      b.accept(i--);
            else if (dp.isInside(i))    { i--; continue; }
            else if (iter.hasNext())    Objects.requireNonNull(dp = iter.next());
            else                        dp = null;

    return b.build();
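The intended result, as described by the 'vectorSize' parameter, can be sketched with a simple filter over every position. This is not the library's single-pass algorithm: it checks each index against every range, models ranges as int[] {start, end}, and throws IllegalArgumentException in place of NException. The class ExcludedSketch and its method excluded are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the library.
class ExcludedSketch
{
    // Each range is an int[] {start, end}, both ends inclusive
    static int[] excluded(List<int[]> ranges, int vectorSize, boolean leastToGreatest)
    {
        if (vectorSize < 1)
            throw new IllegalArgumentException("vectorSize must be at least 1");

        // Walk the positions 0 .. vectorSize-1 in the requested direction
        IntStream positions = leastToGreatest
            ? IntStream.range(0, vectorSize)
            : IntStream.iterate(vectorSize - 1, i -> i - 1).limit(vectorSize);

        // Keep a position only if no range contains it
        return positions
            .filter(i -> ranges.stream().noneMatch(r -> r[0] <= i && i <= r[1]))
            .toArray();
    }

    public static void main(String[] args)
    {
        List<int[]> ranges = Arrays.asList(new int[] { 2, 4 }, new int[] { 7, 8 });

        System.out.println(Arrays.toString(excluded(ranges, 10, true)));
        // prints [0, 1, 5, 6, 9]

        System.out.println(Arrays.toString(excluded(ranges, 10, false)));
        // prints [9, 6, 5, 1, 0]
    }
}
```

The filter costs time proportional to the vector size multiplied by the number of ranges, which is acceptable for a demonstration but slower than walking a sorted set of ranges once.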
-
-