java.lang.Object
- Torello.HTML.Util

```
public class Util
extends java.lang.Object
```
A long list of utilities for searching, finding, extracting and removing HTML from Vectorized-HTML.

This is a list of common "helper routines" that I occasionally need. They are not in any particular order. Almost all of these routines are used internally, either in the NodeSearch search-loops and iterators, or in parts of package "Tools." The possibilities for expanding a class like this are probably "boundless"; however, keep in mind that classes like public class 'SubSection', public class 'NodeIndex', and both of its sub-classes, 'TagNodeIndex' and 'TextNodeIndex', make some of the short, for-loop-driven helper routines seem a little superfluous.

The most complicated and error-prone code is in the for-loops and iterators of the node-search package. Those have been solidly tested for over a year, and the helper routines that build those for-loops are included in this class. Extending the utility and modification tools for vectorized-html pages might be the subject of future development work, but easily the most complicated parts, search and iterate, have been handled. The methods here might be useful, but deciding what belongs in a usable class is not a "precise science." Please remember that methods ending in "OPT" (meaning optimized) simply omit a couple of the exception-throw checks. Those checks do not need to be repeated on each iteration of a node-search for-loop, because the loop criteria are specified in the method signature and so need to be checked only once, not on each loop iteration.
Hi-Lited Source-Code: Torello/HTML/Util.java
File Size: 88,298 Bytes
Line Count: 2,066 '\n' Characters Found
Stateless Class:
This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 36 Method(s), 36 declared static
- 0 Field(s)

Nested Class Summary

Nested Classes
Modifier and Type	Class
`static class`	`Util.Count`
`static class`	`Util.Inclusive`
`static class`	`Util.Remove`

Method Summary

Convert Vectorized-HTML to a String

Modifier and Type	Method
`static String`	`pageToString(Vector<? extends HTMLNode> html)`
`static String`	`rangeToString(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static String`	`rangeToString(Vector<? extends HTMLNode> html, DotPair dp)`

Compact Multiple, Contiguous TextNodes to one TextNode
Modifier and Type	Method
`static int`	`compactTextNodes(Vector<HTMLNode> html)`
`static int`	`compactTextNodes(Vector<HTMLNode> html, int sPos, int ePos)`
`static int`	`compactTextNodes(Vector<HTMLNode> html, DotPair dp)`

Convert all TextNode's to a Single-String
Modifier and Type	Method
`static String`	`textNodesString(Vector<? extends HTMLNode> html)`
`static String`	`textNodesString(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static String`	`textNodesString(Vector<? extends HTMLNode> html, DotPair dp)`

Invoke String.trim() on all TextNode instances
Modifier and Type	Method
`static int`	`trimTextNodes(Vector<HTMLNode> page, boolean deleteZeroLengthStrings)`
`static int`	`trimTextNodes(Vector<HTMLNode> page, int sPos, int ePos, boolean deleteZeroLengthStrings)`
`static int`	`trimTextNodes(Vector<HTMLNode> page, DotPair dp, boolean deleteZeroLengthStrings)`

Replace 'escapable' Text, with HTML Escape-Strings
Modifier and Type	Method
`static int`	`escapeTextNodes(Vector<HTMLNode> html)`
`static int`	`escapeTextNodes(Vector<HTMLNode> html, int sPos, int ePos)`
`static int`	`escapeTextNodes(Vector<HTMLNode> html, DotPair dp)`

Total String.length() for all HTMLNode.str
Modifier and Type	Method
`static int`	`strLength(Vector<? extends HTMLNode> html)`
`static int`	`strLength(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static int`	`strLength(Vector<? extends HTMLNode> html, DotPair dp)`

Total String.length() for all TextNode.str
Modifier and Type	Method
`static int`	`textStrLength(Vector<? extends HTMLNode> html)`
`static int`	`textStrLength(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static int`	`textStrLength(Vector<? extends HTMLNode> html, DotPair dp)`

Retrieve In-Line JSON Script
Modifier and Type	Method
`static Stream<String>`	`getJSONScriptBlocks(Vector<HTMLNode> html)`
`static Stream<String>`	`getJSONScriptBlocks(Vector<HTMLNode> html, int sPos, int ePos)`
`static Stream<String>`	`getJSONScriptBlocks(Vector<HTMLNode> html, DotPair dp)`

java.util.Vector Improvements: Clone Elements
Modifier and Type	Method
`static Vector<HTMLNode>`	`clone(Vector<? extends HTMLNode> html)`
`static Vector<HTMLNode>`	`cloneRange(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static Vector<HTMLNode>`	`cloneRange(Vector<? extends HTMLNode> html, DotPair dp)`

java.util.Vector Improvements: Insert Elements
Modifier and Type	Method
`static void`	`insertNodes(Vector<HTMLNode> html, int pos, HTMLNode... nodes)`

java.util.Vector Improvements: Poll (Remove & Return) Elements
Modifier and Type	Method
`static Vector<HTMLNode>`	`pollRange(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static Vector<HTMLNode>`	`pollRange(Vector<? extends HTMLNode> html, DotPair dp)`

java.util.Vector Improvements: Replace Elements
Modifier and Type	Method
`static void`	`replaceRange(Vector<HTMLNode> page, int sPos, int ePos, Vector<HTMLNode> newNodes)`
`static void`	`replaceRange(Vector<HTMLNode> page, DotPair range, Vector<HTMLNode> newNodes)`

Hash Code
Modifier and Type	Method
`static int`	`hashCode(Vector<? extends HTMLNode> html)`
`static int`	`hashCode(Vector<? extends HTMLNode> html, int sPos, int ePos)`
`static int`	`hashCode(Vector<? extends HTMLNode> html, DotPair dp)`

More Functions
Modifier and Type	Method
`static Vector<HTMLNode>`	`split(Vector<? extends HTMLNode> html, int pos)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Detail

trimTextNodes

public static int trimTextNodes(java.util.Vector<HTMLNode> page,
                                boolean deleteZeroLengthStrings)

Convenience Method
Invokes: trimTextNodes(Vector, int, int, boolean)

trimTextNodes

public static int trimTextNodes(java.util.Vector<HTMLNode> page,
                                DotPair dp,
                                boolean deleteZeroLengthStrings)

Convenience Method
Receives: DotPair
Invokes: trimTextNodes(Vector, int, int, boolean)

trimTextNodes

public static int trimTextNodes(java.util.Vector<HTMLNode> page,
                                int sPos,
                                int ePos,
                                boolean deleteZeroLengthStrings)

This will iterate through the Vector<HTMLNode> between 'sPos' (inclusive) and 'ePos' (exclusive), and invoke java.lang.String.trim() on each TextNode in that range. If this invocation results in a reduction of String.length(), a new TextNode is instantiated whose TextNode.str field is set to the result of old_node.str.trim(), and it replaces the old node in the Vector.

Parameters:

deleteZeroLengthStrings - When this parameter is TRUE, any TextNode whose length is zero (before or after trim() is called) is removed from the Vector.

Returns:

Any node that is trimmed or deleted increments a counter. The final value of this counter is returned.

Code:

Exact Method Body:

 int                 counter = 0;
 IntStream.Builder   b       = deleteZeroLengthStrings ? IntStream.builder() : null;
 HTMLNode            n       = null;
 LV                  l       = new LV(page, sPos, ePos);

 for (int i=l.start; i < l.end; i++)

     if ((n = page.elementAt(i)).isTextNode())
     {
         String  trimmed         = n.str.trim();
         int     trimmedLength   = trimmed.length();

         if ((trimmedLength == 0) && deleteZeroLengthStrings)
             { b.add(i); counter++; }

         else if (trimmedLength < n.str.length())
             { page.setElementAt(new TextNode(trimmed), i); counter++; }
     }

 if (deleteZeroLengthStrings) Util.Remove.nodesOPT(page, b.build().toArray());

 return counter;
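
The trim-or-delete behavior described above can be sketched without the library. This is a minimal illustration, not the library's implementation: `TrimSketch` and its nested `Node` class are hypothetical stand-ins for `HTMLNode` / `TextNode`, and the range is assumed to be valid already. The method body above collects the indices of empty nodes and removes them in one call at the end; the sketch swaps in a simpler backwards walk, which keeps unvisited indices stable while nodes are removed.

```java
import java.util.Vector;

class TrimSketch
{
    // Hypothetical stand-in for the library's node types (NOT the real HTMLNode).
    static final class Node
    {
        final String  str;
        final boolean isText;

        Node(String str, boolean isText) { this.str = str; this.isText = isText; }
    }

    // Trims every text node in [sPos, ePos); optionally deletes nodes that trim to "".
    // Returns the number of nodes that were replaced or deleted.
    static int trimTextNodes(Vector<Node> page, int sPos, int ePos, boolean deleteEmpty)
    {
        int counter = 0;

        // Walk backwards so that a removal never shifts an index not yet visited.
        for (int i = ePos - 1; i >= sPos; i--)
        {
            Node n = page.elementAt(i);
            if (! n.isText) continue;

            String trimmed = n.str.trim();

            if (trimmed.isEmpty() && deleteEmpty)
                { page.removeElementAt(i); counter++; }

            else if (trimmed.length() < n.str.length())
                { page.setElementAt(new Node(trimmed, true), i); counter++; }
        }

        return counter;
    }

    public static void main(String[] args)
    {
        Vector<Node> page = new Vector<>();
        page.add(new Node("<p>", false));
        page.add(new Node("  Hello  ", true));
        page.add(new Node("</p>", false));
        page.add(new Node("   ", true));

        int changed = trimTextNodes(page, 0, page.size(), true);

        // One node is trimmed to "Hello", and the whitespace-only node is deleted.
        System.out.println(changed + " changed, " + page.size() + " nodes remain");
    }
}
```

As in the method above, a node counts toward the return value whether it was shortened or removed.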

pageToString

public static java.lang.String pageToString
            (java.util.Vector<? extends HTMLNode> html)

Convenience Method
Invokes: rangeToString(Vector, int, int)

rangeToString

public static java.lang.String rangeToString
            (java.util.Vector<? extends HTMLNode> html,
             DotPair dp)

Convenience Method
Receives: DotPair
Invokes: rangeToString(Vector, int, int)

rangeToString

```
public static java.lang.String rangeToString
            (java.util.Vector<? extends HTMLNode> html,
             int sPos,
             int ePos)
```
The purpose of this method is to convert a portion of an HTML page, currently represented as a Vector of HTMLNode's, into a String. Two 'int' parameters in this method's signature define the sub-list of the page to be converted to a java.lang.String.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

The Vector converted into a String.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

pageToString(Vector), rangeToString(Vector, DotPair)

Code:
Exact Method Body:

 StringBuilder   ret = new StringBuilder();
 LV              l   = new LV(html, sPos, ePos);

 for (int i=l.start; i < l.end; i++) ret.append(html.elementAt(i).str);

 return ret.toString();
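
The rules for 'sPos' and 'ePos' listed above (inclusive start, exclusive end, a negative 'ePos' meaning "through the end") can be sketched as follows. This is only an illustration of the documented rules: `RangeSketch` and `effectiveEnd` are hypothetical names, a `Vector<String>` stands in for the node Vector (each element plays the role of `HTMLNode.str`), and the library's own `LV` class is not reproduced here.

```java
import java.util.Vector;

class RangeSketch
{
    // Hypothetical helper that mirrors the documented rules for 'sPos' and 'ePos'.
    // Returns the effective (exclusive) end index, or throws if the range is invalid.
    static int effectiveEnd(int size, int sPos, int ePos)
    {
        if (ePos < 0) ePos = size;  // a negative 'ePos' is reset to the Vector size

        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);

        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);

        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos [" + sPos + "] > ePos [" + ePos + "]");

        return ePos;
    }

    // Concatenates the strings at indices sPos (inclusive) through ePos (exclusive).
    static String rangeToString(Vector<String> strs, int sPos, int ePos)
    {
        int end = effectiveEnd(strs.size(), sPos, ePos);

        StringBuilder sb = new StringBuilder();
        for (int i = sPos; i < end; i++) sb.append(strs.elementAt(i));

        return sb.toString();
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>();
        page.add("<b>"); page.add("bold"); page.add("</b>"); page.add(" tail");

        System.out.println(rangeToString(page, 0, 3));   // indices 0, 1 and 2
        System.out.println(rangeToString(page, 1, -1));  // index 1 through the end
    }
}
```

Because 'ePos' is exclusive, passing `page.size()` and passing any negative value select the same end point.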

textNodesString

public static java.lang.String textNodesString
            (java.util.Vector<? extends HTMLNode> html)

Convenience Method
Invokes: textNodesString(Vector, int, int)

textNodesString

public static java.lang.String textNodesString
            (java.util.Vector<? extends HTMLNode> html,
             DotPair dp)

Convenience Method
Receives: DotPair
Invokes: textNodesString(Vector, int, int)

textNodesString

```
public static java.lang.String textNodesString
            (java.util.Vector<? extends HTMLNode> html,
             int sPos,
             int ePos)
```
This will return a String that is comprised of ONLY the TextNode's contained within the input Vector, and furthermore, only nodes that are situated between index 'sPos' (inclusive) and index 'ePos' (exclusive) in that Vector.

The for-loop that iterates the input-Vector parameter simply skips every instance of 'TagNode' and 'CommentNode' when building the output String.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

This will return a String that is comprised of the text-only elements in the web-page or sub-page. Only text between the requested Vector-indices is included.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

textNodesString(Vector, DotPair), textNodesString(Vector)

Code:
Exact Method Body:

 StringBuilder   sb  = new StringBuilder();
 LV              l   = new LV(html, sPos, ePos);
 HTMLNode        n;

 for (int i=l.start; i < l.end; i++)
     if ((n = html.elementAt(i)).isTextNode())
         sb.append(n.str);

 return sb.toString();

escapeTextNodes

```
public static int escapeTextNodes(java.util.Vector<HTMLNode> html)
```
Convenience Method
Invokes: escapeTextNodes(Vector, int, int)

escapeTextNodes

public static int escapeTextNodes(java.util.Vector<HTMLNode> html,
                                  DotPair dp)

Convenience Method
Receives: DotPair
Invokes: escapeTextNodes(Vector, int, int)

escapeTextNodes

```
public static int escapeTextNodes(java.util.Vector<HTMLNode> html,
                                  int sPos,
                                  int ePos)
```
Will call HTML.Escape.replaceAll on each TextNode in the range of sPos ... ePos
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

The number of TextNode's that changed as a result of the Escape.replaceAll(n.str) loop.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

Escape.replaceAll(String)

Code:
Exact Method Body:

 LV          l       = new LV(html, sPos, ePos);
 HTMLNode    n       = null;
 String      s       = null;
 int         counter = 0;

 for (int i=l.start; i < l.end; i++)

     if ((n = html.elementAt(i)).isTextNode())
         if (! (s = Escape.replace(n.str)).equals(n.str))
         {
             html.setElementAt(new TextNode(s), i);
             counter++;
         }

 return counter;
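
The "replace a node only when escaping changed its text" pattern can be sketched as below. Everything named here is an assumption made for illustration: `EscapeSketch`, its nested `Node` class, and especially `escape`, which handles only three characters and does not reproduce the behavior of the library's `Escape` class.

```java
import java.util.Vector;

class EscapeSketch
{
    // Hypothetical stand-in for the library's node types (NOT the real HTMLNode).
    static final class Node
    {
        final String  str;
        final boolean isText;

        Node(String str, boolean isText) { this.str = str; this.isText = isText; }
    }

    // Hypothetical escaper covering only '&', '<' and '>'.
    static String escape(String s)
    {
        // '&' is handled first, or the ampersands produced below would be escaped again.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Escapes each text node in [sPos, ePos); returns how many nodes actually changed.
    static int escapeTextNodes(Vector<Node> html, int sPos, int ePos)
    {
        int counter = 0;

        for (int i = sPos; i < ePos; i++)
        {
            Node n = html.elementAt(i);
            if (! n.isText) continue;       // tags and comments are left untouched

            String s = escape(n.str);

            // Only swap in a new node when escaping made a difference.
            if (! s.equals(n.str)) { html.setElementAt(new Node(s, true), i); counter++; }
        }

        return counter;
    }

    public static void main(String[] args)
    {
        Vector<Node> page = new Vector<>();
        page.add(new Node("<p>", false));
        page.add(new Node("a < b & c", true));
        page.add(new Node("plain", true));
        page.add(new Node("</p>", false));

        System.out.println(escapeTextNodes(page, 0, page.size()));
        System.out.println(page.elementAt(1).str);
    }
}
```

Text nodes that contain nothing to escape are left in place, so they do not count toward the return value.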

clone

public static java.util.Vector<HTMLNode> clone
            (java.util.Vector<? extends HTMLNode> html)

Convenience Method
Invokes: cloneRange(Vector, int, int)

cloneRange

public static java.util.Vector<HTMLNode> cloneRange
            (java.util.Vector<? extends HTMLNode> html,
             DotPair dp)

Convenience Method
Receives: DotPair
Invokes: cloneRange(Vector, int, int)

cloneRange

```
public static java.util.Vector<HTMLNode> cloneRange
            (java.util.Vector<? extends HTMLNode> html,
             int sPos,
             int ePos)
```
Copies (clones!) a sub-range of the HTML page, stores the results in a Vector, and returns it.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

The "cloned" (copied) sub-range specified by 'sPos' and 'ePos'.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

cloneRange(Vector, DotPair)

Code:
Exact Method Body:

 LV                  l   = new LV(html, sPos, ePos);
 Vector<HTMLNode>    ret = new Vector<>(l.size());

 // Copy the range specified into the return vector
 //
 // HOW THIS WAS DONE BEFORE NOTICING Vector.subList
 //
 // for (int i = l.start; i < l.end; i++) ret.addElement(html.elementAt(i));

 ret.addAll(html.subList(l.start, l.end));

 return ret;
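
Since the body above fills the result with `addAll` on a `subList`, the returned Vector is a new container holding the same element references. A small JDK-only sketch of that behavior follows; `CloneRangeSketch` is a hypothetical name, a `Vector<String>` stands in for the node Vector, and the bounds are assumed to be valid already.

```java
import java.util.Vector;

class CloneRangeSketch
{
    // Copies the elements at [sPos, ePos) into a new Vector. The element references
    // are shared with the source Vector; only the container itself is new.
    static <T> Vector<T> cloneRange(Vector<? extends T> src, int sPos, int ePos)
    {
        Vector<T> ret = new Vector<>(ePos - sPos);
        ret.addAll(src.subList(sPos, ePos));
        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>();
        page.add("<p>"); page.add("text"); page.add("</p>");

        Vector<String> copy = cloneRange(page, 0, 2);

        // Replacing a slot in the copy does not alter the original Vector.
        copy.setElementAt("<div>", 0);

        System.out.println(page);
        System.out.println(copy);
    }
}
```

Structural changes to the copy (set, insert, remove) therefore leave the source page as it was.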

textStrLength

public static int textStrLength(java.util.Vector<? extends HTMLNode> html,
                                DotPair dp)

Convenience Method
Receives: DotPair
Invokes: textStrLength(Vector, int, int)

textStrLength

```
public static int textStrLength(java.util.Vector<? extends HTMLNode> html)
```
Convenience Method
Invokes: textStrLength(Vector, int, int)

textStrLength

```
public static int textStrLength(java.util.Vector<? extends HTMLNode> html,
                                int sPos,
                                int ePos)
```
This method returns the combined length of the strings contained by all instances of 'TextNode', and only those instances, among the nodes of the input HTML-Vector. It behaves identically to textStrLength(Vector), but accepts starting and ending bounds on the html Vector: 'sPos' and 'ePos'.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

The sum of the lengths of the text contained by text-nodes in the Vector between 'sPos' and 'ePos'.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
Code:
Exact Method Body:

 HTMLNode    n;
 int         sum = 0;
 LV          l   = new LV(html, sPos, ePos);

 // Counts the length of each "String" in a "TextNode" between sPos and ePos
 for (int i=l.start; i < l.end; i++)
     if ((n = html.elementAt(i)).isTextNode())
         sum += n.str.length();

 return sum;

compactTextNodes

```
public static int compactTextNodes(java.util.Vector<HTMLNode> html)
```
Convenience Method
Invokes: compactTextNodes(Vector, int, int)

compactTextNodes

public static int compactTextNodes(java.util.Vector<HTMLNode> html,
                                   DotPair dp)

Convenience Method
Receives: DotPair
Invokes: compactTextNodes(Vector, int, int)

compactTextNodes

public static int compactTextNodes(java.util.Vector<HTMLNode> html,
                                   int sPos,
                                   int ePos)

Occasionally, when removing instances of TagNode from a vectorized-html page, instances of TextNode that were not adjacent in the Vector suddenly become adjacent. Contiguous instances of TextNode cause no major problems from the search algorithm's perspective, but for programmers it can be befuddling when text that appears in the output of Util.pageToString(html) cannot be found, because that text is split among multiple adjacent TextNode's.

This method merely combines adjacent instances of class TextNode in the Vector into single instances of class TextNode.

Parameters:

html - Any vectorized-html web-page. If this page contains any contiguously placed TextNode's, the extras will be eliminated, and the internal strings inside the nodes (TextNode.str) will be combined. This action will reduce the size of the html-Vector.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

The number of nodes that were eliminated after being combined, or 0 if there were no text-nodes that were removed.

Throws:

java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
If 'ePos' is zero, or greater than the size of the Vector
If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.

See Also:

HTMLNode.str, TextNode

Code:

Exact Method Body:

 LV      l           = new LV(html, sPos, ePos);
 boolean compacting  = false;
 int     firstPos    = -1;
 int     delta       = 0;

 for (int i=l.start; i < (l.end - delta); i++)

     if (html.elementAt(i).isTextNode())
     {
         if (compacting) continue;   // Already in "Compacting Mode"
         compacting  = true;         // Start "Compacting Mode" - this is a TextNode
         firstPos    = i;
     }

     else if (compacting && (firstPos < (i-1)))  // Else - Must be a TagNode or CommentNode
     {
         // Save compacted TextNode String's into this StringBuilder
         StringBuilder compacted = new StringBuilder();

         // Iterate all TextNodes that were adjacent, put them together into StringBuilder
         for (int j=firstPos; j < i; j++) compacted.append(html.elementAt(j).str);

         // Place this new "aggregate TextNode" at location of the first TextNode that
         // was compacted into this StringBuilder

         html.setElementAt(new TextNode(compacted.toString()), firstPos);

         // Remove the rest of the positions in the Vector that had TextNode's.  These have
         // all been put together into the "Aggregate TextNode" at position "firstPos"

         Util.Remove.range(html, firstPos + 1, i);

         // The change in the size of the Vector needs to be accounted for.
         delta += (i - firstPos - 1);

         // Change the loop-counter variable, too, since the size of the Vector has changed.
         i = firstPos + 1;

         // Since we just hit a CommentNode, or TagNode, exit "Compacting Mode."
         compacting = false;

     }

     // NOTE: This, ALSO, MUST BE a TagNode or CommentNode (just like the previous
     //       if-else branch !)
     // TRICKY: Don't forget this 'else' !

     else compacting = false;

 // Added - Don't forget the case where the Vector ends with a series of TextNodes
 // TRICKY TOO! (Same as the HTML Parser... The ending or 'trailing' nodes must be parsed

 int lastNodePos = html.size() - 1;

 if (html.elementAt(lastNodePos).isTextNode()) if (compacting && (firstPos < lastNodePos))
 {
     StringBuilder compacted = new StringBuilder();

     // Compact the TextNodes that were identified at the end of the Vector range.
     for (int j=firstPos; j <= lastNodePos; j++) compacted.append(html.elementAt(j).str);

     // Replace the group of TextNode's at the end of the Vector, with the single, aggregate
     html.setElementAt(new TextNode(compacted.toString()), firstPos);
     Util.Remove.range(html, firstPos + 1, lastNodePos + 1);
 }

 return delta;
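
The effect of merging adjacent text nodes can be sketched with a simpler loop than the one above. This is a hedged illustration only: `CompactSketch` and its nested `Node` class are hypothetical stand-ins, the whole Vector is processed rather than an 'sPos' / 'ePos' range, and a run-scanning loop replaces the "compacting mode" flag. In this sketch the return value counts every removed node, including those from a run at the end of the Vector.

```java
import java.util.Vector;

class CompactSketch
{
    // Hypothetical stand-in for the library's node types (NOT the real HTMLNode).
    static final class Node
    {
        final String  str;
        final boolean isText;

        Node(String str, boolean isText) { this.str = str; this.isText = isText; }
    }

    // Merges each run of adjacent text nodes into one node; returns how many were removed.
    static int compact(Vector<Node> html)
    {
        int removed = 0;
        int i       = 0;

        while (i < html.size())
        {
            if (! html.elementAt(i).isText) { i++; continue; }

            // Find the end (exclusive) of the run of text nodes that starts at 'i'.
            int j = i + 1;
            while (j < html.size() && html.elementAt(j).isText) j++;

            if (j - i > 1)
            {
                StringBuilder sb = new StringBuilder();
                for (int k = i; k < j; k++) sb.append(html.elementAt(k).str);

                // The aggregate node takes the first slot; the rest of the run is dropped.
                html.setElementAt(new Node(sb.toString(), true), i);
                html.subList(i + 1, j).clear();
                removed += (j - i - 1);
            }

            i++;  // step past the (possibly merged) text node
        }

        return removed;
    }

    public static void main(String[] args)
    {
        Vector<Node> page = new Vector<>();
        page.add(new Node("<p>", false));
        page.add(new Node("Hello, ", true));
        page.add(new Node("world", true));
        page.add(new Node("!", true));
        page.add(new Node("</p>", false));
        page.add(new Node("a", true));
        page.add(new Node("b", true));

        int removed = compact(page);
        System.out.println(removed + " removed, " + page.size() + " nodes remain");
    }
}
```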

strLength

```
public static int strLength(java.util.Vector<? extends HTMLNode> html)
```
Convenience Method
Invokes: strLength(Vector, int, int)

strLength

public static int strLength(java.util.Vector<? extends HTMLNode> html,
                            DotPair dp)

Convenience Method
Receives: DotPair
Invokes: strLength(Vector, int, int)

strLength

```
public static int strLength(java.util.Vector<? extends HTMLNode> html,
                            int sPos,
                            int ePos)
```
This method sums the String-length of every HTMLNode.str field in the passed page-Vector. It only counts nodes between parameters sPos (inclusive) and ePos (exclusive).
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

The total length - in characters - of the sub-page of HTML between 'sPos' and 'ePos'

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

strLength(Vector)

Code:
Exact Method Body:

 int ret = 0;
 LV  l   = new LV(html, sPos, ePos);

 for (int i=l.start; i < l.end; i++) ret += html.elementAt(i).str.length();

 return ret;

hashCode

```
public static int hashCode(java.util.Vector<? extends HTMLNode> html)
```
Convenience Method
Invokes: hashCode(Vector, int, int)

hashCode

public static int hashCode(java.util.Vector<? extends HTMLNode> html,
                           DotPair dp)

Convenience Method
Receives: DotPair
Invokes: hashCode(Vector, int, int)

hashCode

```
public static int hashCode(java.util.Vector<? extends HTMLNode> html,
                           int sPos,
                           int ePos)
```
Generates a hash-code for a vectorized html page-Vector.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

Returns the String.hashCode() of the partial HTML-page as if it were not being stored as a Vector, but rather as HTML inside of a Java-String.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

hashCode(Vector)

Code:
Exact Method Body:

 int h  = 0;
 LV  lv = new LV(html, sPos, ePos);

 for (int j=lv.start; j < lv.end; j++)
 {
     String  s = html.elementAt(j).str;
     int     l = s.length();

     // This line has been copied from the jdk8/jdk8 "String.hashCode()" method.
     // The difference is that it iterates over the entire vector
     for (int i=0; i < l; i++) h = 31 * h + s.charAt(i);
 }

 return h;
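
The claim that this value matches the String.hashCode() of the same HTML stored as one Java-String follows from the recurrence h = 31 * h + c never being reset between nodes. A minimal sketch of that equivalence is below; it assumes a plain Vector<String> of node strings in place of Vector<HTMLNode>, and the class name VectorHashSketch is hypothetical.

```java
import java.util.Vector;

final class VectorHashSketch
{
    // Runs the String.hashCode() recurrence across every fragment, in order,
    // without resetting 'h' between fragments.
    static int hashCode(Vector<String> fragments)
    {
        int h = 0;

        for (String s : fragments)
            for (int i = 0; i < s.length(); i++)
                h = 31 * h + s.charAt(i);

        return h;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>();
        page.add("<b>");
        page.add("bold");
        page.add("</b>");

        // The concatenated page, as a single String
        String asString = String.join("", page);

        // Both hash-codes are computed from the same character sequence
        System.out.println(hashCode(page) == asString.hashCode());
    }
}
```
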

getJSONScriptBlocks

public static java.util.stream.Stream<java.lang.String> getJSONScriptBlocks
            (java.util.Vector<HTMLNode> html)

Convenience Method
Invokes: getJSONScriptBlocks(Vector, int, int)

getJSONScriptBlocks

public static java.util.stream.Stream<java.lang.String> getJSONScriptBlocks
            (java.util.Vector<HTMLNode> html,
             DotPair dp)

Convenience Method
Receives: DotPair.
Invokes: getJSONScriptBlocks(Vector, int, int)

getJSONScriptBlocks

public static java.util.stream.Stream<java.lang.String> getJSONScriptBlocks
            (java.util.Vector<HTMLNode> html,
             int sPos,
             int ePos)

This method shall search for any and all <SCRIPT TYPE="json"> JSON TEXT </SCRIPT> blocks present in a range of Vectorized HTML. The search simply looks for the token "JSON" in the TYPE attribute of each and every <SCRIPT> TagNode found in that range. The contents of such blocks are not validated, nor are they even guaranteed to be JSON data!

Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

This will return a java.util.stream.Stream<String> of each of the JSON elements present in the specified range of the Vectorized HTML passed to parameter 'html'.

Conversion-Target	Stream-Method Invocation
`String[]`	`Stream.toArray(String[]::new);`
`List<String>`	`Stream.collect(Collectors.toList());`
`Vector<String>`	`Stream.collect(Collectors.toCollection(Vector::new));`
`TreeSet<String>`	`Stream.collect(Collectors.toCollection(TreeSet::new));`
`Iterator<String>`	`Stream.iterator();`
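
The conversions in the table above can be tried on any Stream<String>. The sketch below is only an illustration: sampleBlocks() is a hypothetical stand-in for a call to getJSONScriptBlocks, and the two strings it returns are made-up sample data.

```java
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;
import java.util.Vector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StreamConversionSketch
{
    // Hypothetical stand-in for getJSONScriptBlocks(...), returning sample data
    static Stream<String> sampleBlocks()
    { return Stream.of("{\"b\": 2}", "{\"a\": 1}"); }

    public static void main(String[] args)
    {
        // A Stream may be consumed only once, so each conversion asks for a fresh one.
        String[]         arr  = sampleBlocks().toArray(String[]::new);
        List<String>     list = sampleBlocks().collect(Collectors.toList());
        Vector<String>   vec  = sampleBlocks().collect(Collectors.toCollection(Vector::new));
        TreeSet<String>  set  = sampleBlocks().collect(Collectors.toCollection(TreeSet::new));
        Iterator<String> iter = sampleBlocks().iterator();

        System.out.println(arr.length + ", " + list.size() + ", " + vec.size());

        // A TreeSet keeps its elements sorted; the other targets keep stream order.
        System.out.println(set.first());
        System.out.println(iter.next());
    }
}
```
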

Code:

Exact Method Body:

 // Whenever building lists, it is usually easiest to use a Stream.Builder
 Stream.Builder<String> b = Stream.builder();

 // This Predicate tests whether the token "json" (CASE INSENSITIVE) is found in the TYPE
 // attribute of a <SCRIPT TYPE=...> node as a complete word, not as a substring of some
 // other word.  For instance: TYPE="json" would PASS, but TYPE="rajsong" would FAIL,
 // because the token is not delimited by non-alphanumeric characters.

 final Predicate<String> tester = (String s) ->
     StrTokCmpr.containsIgnoreCase
         (s, (Character c) -> ! Character.isLetterOrDigit(c), "json");

 // Find all <SCRIPT> node-blocks whose "TYPE" attribute abides by the tester
 // String-Predicate named above.

 Vector<DotPair> jsonDPList = InnerTagFindInclusive.all
     (html, sPos, ePos, "script", "type", tester);

 // Convert each of these DotPair element into a java.lang.String
 // Add the String to the Stream.Builder<String>

 for (DotPair jsonDP : jsonDPList)
     if (jsonDP.size() > 2)
         b.accept(Util.rangeToString(html, jsonDP.start + 1, jsonDP.end));

 // Build the Stream, and return it.
 return b.build();
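
The token test used by the Predicate above can be approximated in plain Java. The following sketch is based only on the comments in the method body and is not the implementation of StrTokCmpr.containsIgnoreCase; the class TokenMatchSketch and the method containsToken are hypothetical names.

```java
import java.util.Locale;

final class TokenMatchSketch
{
    // Returns true if 'token' occurs in 's' (ignoring case) with no letter or digit
    // directly before it or directly after it.
    static boolean containsToken(String s, String token)
    {
        String hay = s.toLowerCase(Locale.ROOT);
        String tok = token.toLowerCase(Locale.ROOT);
        int    pos = -1;

        while ((pos = hay.indexOf(tok, pos + 1)) != -1)
        {
            int after = pos + tok.length();

            boolean leftOK  = (pos == 0)
                || ! Character.isLetterOrDigit(hay.charAt(pos - 1));

            boolean rightOK = (after == hay.length())
                || ! Character.isLetterOrDigit(hay.charAt(after));

            if (leftOK && rightOK) return true;
        }

        return false;
    }

    public static void main(String[] args)
    {
        System.out.println(containsToken("application/ld+json", "json"));
        System.out.println(containsToken("rajsong", "json"));
    }
}
```

Under these assumptions a TYPE value such as "application/ld+json" is accepted, because the '+' character delimits the token, while "rajsong" is rejected.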

insertNodes

public static void insertNodes(java.util.Vector<HTMLNode> html,
                               int pos,
                               HTMLNode... nodes)

Inserts one or more nodes into the Vector at position 'pos'. The nodes are passed as a 'varargs' parameter.

Parameters:

html - Any HTML Page

pos - The position in the original Vector where the nodes shall be inserted.

nodes - A list of nodes to insert.

Code:

Exact Method Body:

 Vector<HTMLNode> nodesVec = new Vector<>(nodes.length);
 for (HTMLNode node : nodes) nodesVec.addElement(node);
 html.addAll(pos, nodesVec);
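
The same 'varargs' insertion can be tried on any Vector. The sketch below is a generic illustration that assumes a Vector<String> in place of Vector<HTMLNode>; the names InsertSketch and insertAll are hypothetical, and Arrays.asList is used here instead of the intermediate Vector built by the method body above.

```java
import java.util.Arrays;
import java.util.Vector;

final class InsertSketch
{
    // Inserts all 'items' into 'vec', starting at index 'pos', keeping their order
    @SafeVarargs
    static <T> void insertAll(Vector<T> vec, int pos, T... items)
    { vec.addAll(pos, Arrays.asList(items)); }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("<div>", "</div>"));

        insertAll(page, 1, "<b>", "text", "</b>");

        System.out.println(page);  // [<div>, <b>, text, </b>, </div>]
    }
}
```
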

replaceRange

public static void replaceRange(java.util.Vector<HTMLNode> page,
                                DotPair range,
                                java.util.Vector<HTMLNode> newNodes)

Convenience Method
Invokes: replaceRange(Vector, int, int, Vector)

replaceRange

```
public static void replaceRange(java.util.Vector<HTMLNode> page,
                                int sPos,
                                int ePos,
                                java.util.Vector<HTMLNode> newNodes)
```
Replaces any and all HTMLNode's located between the Vector locations 'sPos' (inclusive) and 'ePos' (exclusive). By exclusive, this means that the HTMLNode located at position 'ePos' will not be replaced, but the one at 'sPos' is replaced.

The size of the Vector will change by newNodes.size() - (ePos - sPos). The contents situated between Vector locations sPos (inclusive) and sPos + newNodes.size() (exclusive) will, indeed, be the contents of the 'newNodes' parameter.
Parameters:

page - Any Java HTML page, constructed of HTMLNode (TagNode & TextNode)

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

newNodes - Any Java HTML page-Vector of HTMLNode.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

pollRange(Vector, int, int), Util.Remove.range(Vector, int, int), replaceRange(Vector, DotPair, Vector)

Code:
Exact Method Body:

 // Torello.Java.LV
 LV l = new LV(sPos, ePos, page);

 int oldSize     = ePos - sPos;
 int newSize     = newNodes.size();
 int insertPos   = sPos;
 int i           = 0;

 while ((i < newSize) && (i < oldSize))
     page.setElementAt(newNodes.elementAt(i++), insertPos++);

 // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
 // CASE ONE:
 // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

 if (newSize == oldSize) return;

 // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
 // CASE TWO:
 // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
 //
 // The new Vector is SMALLER than the old sub-range
 // The rest of the nodes just need to be trashed
 //
 // OLD-WAY: (Before realizing what Vector.subList is actually doing)
 // Util.removeRange(page, insertPos, ePos);

 if (newSize < oldSize) page.subList(insertPos, ePos).clear();

 // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
 // CASE THREE:
 // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
 //
 // The new Vector is BIGGER than the old sub-range
 // There are still more nodes to insert.

 else page.addAll(ePos, newNodes.subList(i, newSize));
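
The net effect of a range replacement can be sketched with standard java.util calls. This is a simplified illustration and not the three-case algorithm shown above: it clears the sub-list view and then inserts the new elements. It assumes a Vector<String> in place of Vector<HTMLNode>, does not handle a negative 'ePos', and the names ReplaceRangeSketch and replaceRange are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

final class ReplaceRangeSketch
{
    // Replaces the elements in [sPos, ePos) with the contents of 'newNodes'
    static <T> void replaceRange(Vector<T> page, int sPos, int ePos, List<T> newNodes)
    {
        // Clearing the sub-list view removes the range from 'page' itself
        page.subList(sPos, ePos).clear();

        // Insert the replacement where the old range began
        page.addAll(sPos, newNodes);
    }

    public static void main(String[] args)
    {
        Vector<String> page    = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));
        int            oldSize = page.size();

        replaceRange(page, 1, 4, Arrays.asList("X"));

        System.out.println(page);

        // Size change is newNodes.size() - (ePos - sPos), here 1 - (4 - 1)
        System.out.println(page.size() - oldSize);
    }
}
```
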

pollRange

```
public static java.util.Vector<HTMLNode> pollRange
            (java.util.Vector<? extends HTMLNode> html,
             int sPos,
             int ePos)
```
Java's java.util.Vector class does not allow public access to the removeRange(start, end) function; it is listed as 'protected' in Java's documentation for class Vector. This method works around that restriction and performs a 'Poll' operation, where the nodes are first removed, stored, and then returned as the function result.

Poll a Range:
The nodes that are removed are placed in a separate return Vector, and returned as a result to this method.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.

ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.

This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.

Returns:

A complete list (Vector<HTMLNode>) of the nodes that were removed.

Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector

If 'ePos' is zero, or greater than the size of the Vector

If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
See Also:

Util.Remove.range(Vector, int, int), Util.Remove.range(Vector, DotPair), pollRange(Vector, DotPair)

Code:
Exact Method Body:

 // The original version of this method is preserved inside comments at the bottom of this
 // method.  Prior to seeing the Sun-Oracle Docs explaining that the return from the SubList
 // operation "mirrors changes" back to the original vector, the code in the comments is
 // how this method was accomplished.

 LV                          l       = new LV(html, sPos, ePos);
 Vector<HTMLNode>            ret     = new Vector<HTMLNode>(l.end - l.start);
 List<? extends HTMLNode>    list    = html.subList(l.start, l.end);

 // Copy the Nodes into the return Vector that the end-user receives
 ret.addAll(list);

 // Clear the nodes out of the original Vector.  The Sun-Oracle Docs
 // state that the returned sub-list is "mirrored back into" the original
 list.clear();

 // Return the Vector to the user.  Note that the List<HTMLNode> CANNOT be returned,
 // because of its mirror-qualities, and because this method expects a vector.
 return ret;

 /*
 // BEFORE READING ABOUT Vector.subList(...), this is how this was accomplished:
 // NOTE: It isn't so clear how the List<HTMLNode> works - likely it doesn't actually
 //       create any new memory-allocated arrays, it is just an "overlay"

 // Copy the elements from the input vector into the return vector
 for (int i=l.start; i < l.end; i++) ret.add(html.elementAt(i));

 // Remove the range from the input vector (this is the meaning of 'poll')
 Util.removeRange(html, sPos, ePos);

 return ret;
 */
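
The "mirroring" behavior of subList that the comments above rely on is part of the standard java.util.List contract, and can be seen in isolation. The sketch below assumes a Vector<String> in place of Vector<HTMLNode> and omits the 'sPos' / 'ePos' checks; the names PollRangeSketch and pollRange are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

final class PollRangeSketch
{
    // Removes the elements in [sPos, ePos) from 'vec' and returns them in a new Vector
    static <T> Vector<T> pollRange(Vector<T> vec, int sPos, int ePos)
    {
        // A view onto 'vec'; no elements are copied by this call
        List<T> view = vec.subList(sPos, ePos);

        // Copy the elements out before they are removed
        Vector<T> ret = new Vector<>(view);

        // Clearing the view removes the same elements from 'vec'
        view.clear();

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> page   = new Vector<>(Arrays.asList("a", "b", "c", "d"));
        Vector<String> polled = pollRange(page, 1, 3);

        System.out.println(polled);  // [b, c]
        System.out.println(page);    // [a, d]
    }
}
```
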

pollRange

public static java.util.Vector<HTMLNode> pollRange
            (java.util.Vector<? extends HTMLNode> html,
             DotPair dp)

Convenience Method
Receives: DotPair
Invokes: pollRange(Vector, int, int).

split

```
public static java.util.Vector<HTMLNode> split
            (java.util.Vector<? extends HTMLNode> html,
             int pos)
```
This removes every element from the Vector beginning at position 0, all the way to position 'pos' (exclusive). The elementAt(pos) remains in the original page input-Vector. This is the definition of 'exclusive'.
Parameters:

html - This may be any Vectorized-HTML Web-Page (or sub-page).

The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.

These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.

pos - Any position within the range of the input Vector.

Returns:

The elements in the Vector from position 0 ('zero') all the way to position 'pos' (exclusive). These elements are removed from the input Vector.

Code:
Exact Method Body:

return pollRange(html, 0, pos);

Modifier and Type	Class
`static class`	`Util.Count`
`static class`	`Util.Inclusive`
`static class`	`Util.Remove`
