Package Torello.HTML
Class NodeIndex<NODE extends HTMLNode>
- java.lang.Object
-
- Torello.HTML.NodeIndex<NODE>
-
- Type Parameters:
NODE - The class of HTMLNode represented by this NodeIndex instance.
- All Implemented Interfaces:
java.io.Serializable, java.lang.CharSequence, java.lang.Cloneable, java.lang.Comparable<Replaceable>, Replaceable
- Direct Known Subclasses:
CommentNodeIndex, TagNodeIndex, TextNodeIndex
public abstract class NodeIndex<NODE extends HTMLNode> extends java.lang.Object implements java.lang.CharSequence, java.io.Serializable, java.lang.Cloneable, Replaceable
The abstract parent class of all three NodeIndex classes: TagNodeIndex, TextNodeIndex and CommentNodeIndex.
NodeIndex: HTMLNode 'Plus' the Vector-Index
This class is a very simple data-structure class, generally used to return both the Vector-index of an HTMLNode inside a vectorized-HTML web-page and the node itself. It is the parent class of TextNodeIndex, TagNodeIndex and CommentNodeIndex. Instances are returned by all PEEK operations in the node-search package. The constructor of this class accepts an index and a node, and saves them as public fields of this class.
STALE DATA NOTE:
If a vectorized-HTML web-page is modified after any of these Node + Index classes are instantiated, and nodes are added to or removed from the page, then the (integer) index data inside these classes will be stale when it is next accessed.
It is important to remember that the Vector-position (a.k.a. "Vector-index") information stored inside instances of these (extremely simple) classes becomes stale immediately if nodes are ever added to or removed from the underlying Vector from which these Node + Index objects were created.
- See Also:
HTMLNode, CommentNodeIndex, TagNodeIndex, TextNodeIndex, Serialized Form
Source file: Torello/HTML/NodeIndex.java (File Size: 14,489 Bytes, Line Count: 372)
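The stale-data behaviour described above can be reproduced without the library. The following is a minimal sketch, not the library's code: `Entry` is a hypothetical stand-in for a NodeIndex, and plain strings stand in for HTMLNode instances.

```java
import java.util.Arrays;
import java.util.Vector;

public class StaleIndexDemo {
    /** Hypothetical stand-in for a NodeIndex: a node plus the position where it was found. */
    static final class Entry {
        final int index;
        final String n;
        Entry(int index, String n) { this.index = index; this.n = n; }
    }

    /** True when the vector still holds the recorded node at the recorded index. */
    static boolean isFresh(Vector<String> page, Entry e) {
        return e.index < page.size() && page.elementAt(e.index).equals(e.n);
    }

    public static void main(String[] args) {
        Vector<String> page = new Vector<>(Arrays.asList("<p>", "Hello", "</p>"));

        // Record the node at position 1, as a PEEK-style search would.
        Entry e = new Entry(1, page.elementAt(1));
        System.out.println(isFresh(page, e));   // true

        // Inserting a node before it shifts everything right by one.
        page.insertElementAt("<div>", 0);
        System.out.println(isFresh(page, e));   // false
    }
}
```

A check like `isFresh` is only an illustration; the class itself performs no such validation after construction.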
-
-
Field Summary
Columns: Modifier and Type, Field
Serializable ID
- static long serialVersionUID
Primary NodeIndex Fields
- int index
- NODE n
Alternative Comparators
- static Comparator<TextNodeIndex> comp2
- static Comparator<TextNodeIndex> comp3
-
Method Summary
Columns: Modifier and Type, Method
Methods: interface Torello.HTML.Replaceable
- boolean addAllInto(int index, Vector<HTMLNode> fileVec)
- boolean addAllInto(Vector<HTMLNode> fileVec)
- Vector<HTMLNode> currentNodes()
- int currentSize()
- HTMLNode firstCurrentNode()
- HTMLNode lastCurrentNode()
- int originalLocationEnd()
- int originalLocationStart()
- int originalSize()
- int update(Vector<HTMLNode> fileVec)
Simple Static Builder
- static NodeIndex<?> newNodeIndex(int index, HTMLNode n)
Methods: interface java.lang.CharSequence
- char charAt(int index)
- int length()
- CharSequence subSequence(int start, int end)
- String toString()
Methods: class java.lang.Object
- boolean equals(Object o)
- int hashCode()
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface Torello.HTML.Replaceable
clearHTML, compareTo, isSynthetic, moveAndUpdate, setHTML, setHTML, summarize
-
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
- See Also:
- Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
-
index
public final int index
An index to a node from a web-page. This index must point to the exact same node inside of a vectorized-HTML page as the node stored in member-field HTMLNode 'n'.
- Code:
- Exact Field Declaration Expression:
public final int index;
-
n
-
comp2
public static final java.util.Comparator<TextNodeIndex> comp2
This is an "alternative Comparator" that can be used for sorting instances of this class. It should work with the Collections.sort(List, Comparator) method in the standard JDK package java.util.
Comparator Heuristic:
This version utilizes the standard JDK method String.compareTo(String).
- See Also:
HTMLNode.str
- Code:
- Exact Field Declaration Expression:
public static final Comparator<TextNodeIndex> comp2 = (TextNodeIndex txni1, TextNodeIndex txni2) -> txni1.n.str.compareTo(txni2.n.str);
-
comp3
public static final java.util.Comparator<TextNodeIndex> comp3
This is an "alternative Comparator" that can be used for sorting instances of this class. It should work with the Collections.sort(List, Comparator) method in the standard JDK package java.util.
Comparator Heuristic:
This version utilizes the standard JDK method String.compareToIgnoreCase(String).
- See Also:
HTMLNode.str
- Code:
- Exact Field Declaration Expression:
public static final Comparator<TextNodeIndex> comp3 = (TextNodeIndex txni1, TextNodeIndex txni2) -> txni1.n.str.compareToIgnoreCase(txni2.n.str);
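The difference between the two heuristics shows up as soon as the text mixes upper and lower case. The sketch below is an assumption-laden illustration: it applies the same two JDK methods to plain strings rather than to TextNodeIndex instances, and the names `CASE_SENSITIVE` and `CASE_INSENSITIVE` are invented here as stand-ins for comp2 and comp3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ComparatorDemo {
    // Hypothetical stand-ins for comp2 and comp3, applied directly to strings.
    static final Comparator<String> CASE_SENSITIVE   = (a, b) -> a.compareTo(b);
    static final Comparator<String> CASE_INSENSITIVE = (a, b) -> a.compareToIgnoreCase(b);

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<String> sorted(List<String> in, Comparator<String> c) {
        List<String> out = new ArrayList<>(in);
        Collections.sort(out, c);
        return out;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("banana", "Cherry", "apple");

        // Upper-case letters have lower code points than lower-case letters.
        System.out.println(sorted(text, CASE_SENSITIVE));   // [Cherry, apple, banana]
        System.out.println(sorted(text, CASE_INSENSITIVE)); // [apple, banana, Cherry]
    }
}
```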
-
-
Constructor Detail
-
NodeIndex
protected NodeIndex(int index, NODE n)
Constructor. This assigns values to the index and n fields.
- Parameters:
index - This is the Vector-index location of HTMLNode 'n' inside of a vectorized-HTML web-page.
Stale Data Note:
This class is a less commonly used class, rather than one of the primary data classes. Instances of this class become 'useless' the moment the Vector that was used to instantiate this class is modified, since the node 'n' may no longer be at Vector-index 'index'.
These "NodeIndex" classes are retained, not deprecated, because they are fundamental to using the classes of the NodeSearch Package. Data is easily made stale. Generally, when modifying HTML Vectors, the easiest thing to do is to modify a Vector<HTMLNode> at specific locations by iterating from the end of the Vector, in reverse order, back to the beginning.
Such a practice will generally prevent stale Vector-indices from causing problems.
n - An HTMLNode that needs to be the node found in the underlying Vector at Vector-index 'index'.
- Throws:
java.lang.IndexOutOfBoundsException - if index is negative.
java.lang.NullPointerException - if n is null.
- Code:
- Exact Constructor Body:
    this.index = index;
    this.n = n;

    if (n == null) throw new NullPointerException(
        "HTMLNode parameter 'n' to this constructor was passed a null value, but this " +
        "is not allowed here."
    );

    if (index < 0) throw new IndexOutOfBoundsException(
        "Integer parameter 'index' to this constructor was passed a negative value: " + index
    );
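The reverse-iteration advice in the Stale Data Note can be sketched as follows. This is an illustration under stated assumptions, not library code: strings stand in for HTMLNode instances, the `insertBefore` helper is hypothetical, and the index array is assumed to be in ascending order.

```java
import java.util.Arrays;
import java.util.Vector;

public class ReverseEditDemo {
    /**
     * Inserts 'marker' before each position listed in 'indices' (ascending order).
     * Working from the last index back to the first means each insertion only
     * shifts positions that have already been handled.
     */
    static Vector<String> insertBefore(Vector<String> page, int[] indices, String marker) {
        Vector<String> out = new Vector<>(page);
        for (int i = indices.length - 1; i >= 0; i--)
            out.insertElementAt(marker, indices[i]);
        return out;
    }

    public static void main(String[] args) {
        Vector<String> page = new Vector<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(insertBefore(page, new int[] {1, 3}, "*")); // [a, *, b, c, *, d]
    }
}
```

Iterating forward over the same indices would place the second marker one position too early, because the first insertion has already shifted the remaining nodes.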
-
-
Method Detail
-
newNodeIndex
public static final NodeIndex<?> newNodeIndex(int index, HTMLNode n)
Simple dispatch method helper that switches on the class of input parameter 'n'.
- Parameters:
n - Any of the three Java HTML defined HTMLNode subclasses: TagNode, TextNode or CommentNode
- Returns:
- A NodeIndex inheriting class that is appropriate to 'n'.
- Throws:
java.lang.IllegalArgumentException - If the user has extended class HTMLNode, and passed this unrecognized HTMLNode type.
- Code:
- Exact Method Body:
    Class<?> newNodeClass = n.getClass();

    if (TagNode.class.isAssignableFrom(newNodeClass))
        return new TagNodeIndex(index, (TagNode) n);

    if (TextNode.class.isAssignableFrom(newNodeClass))
        return new TextNodeIndex(index, (TextNode) n);

    if (CommentNode.class.isAssignableFrom(newNodeClass))
        return new CommentNodeIndex(index, (CommentNode) n);

    throw new IllegalArgumentException
        ("Parameter 'n' has a Type that is an Unrecognized HTMLNode-SubClass Type");
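The dispatch pattern used by this builder can be tried in isolation. In the sketch below every class is a hypothetical stand-in (`Node`, `Tag`, `Text`, `Other`), and the method returns the name of the wrapper it would pick rather than constructing one.

```java
public class DispatchDemo {
    // Hypothetical stand-ins for HTMLNode and its subclasses.
    static abstract class Node { }
    static class Tag extends Node { }
    static class Text extends Node { }
    static class Other extends Node { }   // a user-defined subclass the dispatcher does not know

    /** Picks a wrapper name from the runtime class of 'n', rejecting unknown subclasses. */
    static String wrapperFor(Node n) {
        Class<?> c = n.getClass();
        if (Tag.class.isAssignableFrom(c))  return "TagNodeIndex";
        if (Text.class.isAssignableFrom(c)) return "TextNodeIndex";
        throw new IllegalArgumentException("Unrecognized node subclass: " + c.getName());
    }

    public static void main(String[] args) {
        System.out.println(wrapperFor(new Tag()));    // TagNodeIndex
        System.out.println(wrapperFor(new Text()));   // TextNodeIndex
        try {
            wrapperFor(new Other());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");           // rejected
        }
    }
}
```

Because `isAssignableFrom` is used instead of an exact class comparison, a subclass of `Tag` would also be accepted by the first branch.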
-
equals
public final boolean equals(java.lang.Object o)
Java's public boolean equals(Object o) requirements.
Final Method:
This method is final, and cannot be modified by sub-classes.
- Overrides:
equals in class java.lang.Object
- Parameters:
o - This may be any Java Object, but only ones of 'this' type whose internal values are identical will cause this method to return true.
- Returns:
TRUE if 'o' is an instance of the same class as 'this', with an identical index and an identical node string (n.str).
- Code:
- Exact Method Body:
    if (o == null) return false;
    if (o == this) return true;
    if (! this.getClass().equals(o.getClass())) return false;

    NodeIndex<?> other = (NodeIndex) o;
    return other.n.str.equals(this.n.str) && (other.index == this.index);
-
hashCode
-
charAt
public final char charAt(int index)
Returns the char value at the specified index of the public final String str field of 'this' field public final HTMLNode n. An index ranges from zero to length() - 1. The first char value of the sequence is at index zero, the next at index one, and so on, as for array indexing.
If the char value specified by the index is a surrogate, the surrogate value is returned.
Final Method:
This method is final, and cannot be modified by sub-classes.
- Specified by:
charAt in interface java.lang.CharSequence
- Parameters:
index - The index of the char value to be returned
- Returns:
- The specified char value
- Code:
- Exact Method Body:
return n.str.charAt(index);
-
length
public final int length()
Returns the length of the public final String str field of 'this' field public final HTMLNode n. The length is the number of 16-bit chars in the sequence.
Final Method:
This method is final, and cannot be modified by sub-classes.
- Specified by:
length in interface java.lang.CharSequence
- Returns:
- The number of chars in this.n.str
- Exact Method Body:
return n.str.length();
-
subSequence
public final java.lang.CharSequence subSequence(int start, int end)
Returns a CharSequence that is a subsequence of the public final String str field of 'this' field public final HTMLNode n. The subsequence starts with the char value at the specified index and ends with the char value at index end - 1. The length (in chars) of the returned sequence is end - start, so if start == end then an empty sequence is returned.
Final Method:
This method is final, and cannot be modified by sub-classes.
- Specified by:
subSequence in interface java.lang.CharSequence
- Parameters:
start - The start index, inclusive
end - The end index, exclusive
- Returns:
- The specified subsequence
- Code:
- Exact Method Body:
return n.str.substring(start, end);
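One practical consequence of implementing java.lang.CharSequence by delegation is that an instance can be handed directly to JDK APIs that accept a CharSequence, such as java.util.regex. The sketch below assumes a hypothetical `IndexedText` class that mirrors the delegation shown in the method bodies above; it is not the library's class.

```java
import java.util.regex.Pattern;

public class CharSequenceDemo {
    /** Hypothetical stand-in: an index plus a string, exposed as a CharSequence by delegation. */
    static final class IndexedText implements CharSequence {
        final int index;
        final String str;
        IndexedText(int index, String str) { this.index = index; this.str = str; }

        @Override public char charAt(int i)                   { return str.charAt(i); }
        @Override public int length()                         { return str.length(); }
        @Override public CharSequence subSequence(int s, int e) { return str.substring(s, e); }
        @Override public String toString()                    { return str; }
    }

    /** Works on any CharSequence, so no conversion to String is needed at the call site. */
    static boolean containsDigits(CharSequence cs) {
        return Pattern.compile("\\d+").matcher(cs).find();
    }

    public static void main(String[] args) {
        IndexedText t = new IndexedText(7, "Order 66 shipped");
        System.out.println(containsDigits(t));      // true
        System.out.println(t.subSequence(0, 5));    // Order
        System.out.println(t.length());             // 16
    }
}
```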
-
toString
public final java.lang.String toString()
Returns the public final String str field of 'this' field public final HTMLNode n.
Final Method:
This method is final, and cannot be modified by sub-classes.
- Specified by:
toString in interface java.lang.CharSequence
- Overrides:
toString in class java.lang.Object
- Returns:
- A string consisting of exactly this sequence of characters.
- See Also:
HTMLNode.str
- Code:
- Exact Method Body:
return n.str;
-
originalSize
public int originalSize()
Description copied from interface: Replaceable
Reports how many nodes were copied into this instance. For implementing classes that inherit NodeIndex, this value will always be one. For others, it should report exactly how many HTMLNode's were copied.
- Specified by:
originalSize in interface Replaceable
- Returns:
- Number of nodes originally contained by this instance.
The purpose of Replaceable's is to allow a user to modify HTML using a smaller sub-list, without having to operate on the entire HTML-Vector. Since adding & removing nodes is one variant of Vector-modification, the original-size may often differ from the current-size.
When modifying HTML, if a web-page is broken into smaller pieces, and changes are restricted to those smaller sub-lists (and the original page is rebuilt, all at once, after all changes have been made), then those modifications should require far fewer time-consuming list-shift operations, which can substantially improve performance on large pages.
- Code:
- Exact Method Body:
return 1;
-
currentSize
public int currentSize()
Description copied from interface: Replaceable
Returns how many nodes are currently in this instance.
- Specified by:
currentSize in interface Replaceable
- Returns:
- Number of nodes. See the explanation of the original size versus the current size under originalSize().
- Code:
- Exact Method Body:
return vectorAssigned ? CURRENT_NODES.size() : 1;
-
originalLocationStart
public int originalLocationStart()
Description copied from interface: Replaceable
Returns the start-location within the original page-Vector from whence the HTML contents of this instance were retrieved.
Start is Inclusive:
The returned value is inclusive of the actual, original-range of this instance. This means the first HTMLNode copied into this instance's internal data-structure was at originalLocationStart().
Implementations of Replaceable:
The two concrete implementations of this interface (NodeIndex and SubSection) both enforce the 'final' modifier on their location-fields. (See: index and SubSection.location).
- Specified by:
originalLocationStart in interface Replaceable
- Returns:
- The Vector start-index from whence this HTML was copied.
- Code:
- Exact Method Body:
return index;
-
originalLocationEnd
public int originalLocationEnd()
Description copied from interface: Replaceable
Returns the end-location within the original page-Vector from whence the HTML contents of this instance were retrieved.
End is Exclusive:
The returned value is exclusive of the actual, original-range of this instance. This means the last HTMLNode copied into this instance's internal data-structure was at originalLocationEnd() - 1.
Implementations of Replaceable:
The two concrete implementations of this interface (NodeIndex and SubSection) both enforce the 'final' modifier on their location-fields. (See: index and SubSection.location).
- Specified by:
originalLocationEnd in interface Replaceable
- Returns:
- The Vector end-index from whence this HTML was copied.
- Code:
- Exact Method Body:
return index + 1;
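The inclusive-start, exclusive-end convention means that end minus start equals the original size, which is one for a single node. The sketch below illustrates this with a hypothetical `copyRange` helper and strings standing in for HTMLNode instances; it relies on the fact that java.util.List.subList uses the same half-open convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

public class HalfOpenRangeDemo {
    /** Copies the nodes in the half-open range [start, end) out of the page. */
    static List<String> copyRange(Vector<String> page, int start, int end) {
        return new ArrayList<>(page.subList(start, end));
    }

    public static void main(String[] args) {
        Vector<String> page = new Vector<>(Arrays.asList("<p>", "Hello", "</p>"));

        int index = 1;            // position of the single node
        int start = index;        // inclusive, as in originalLocationStart()
        int end   = index + 1;    // exclusive, as in originalLocationEnd()

        System.out.println(end - start);                  // 1
        System.out.println(copyRange(page, start, end));  // [Hello]
    }
}
```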
-
firstCurrentNode
public HTMLNode firstCurrentNode()
Description copied from interface: Replaceable
The first node currently contained by this Replaceable.
- Specified by:
firstCurrentNode in interface Replaceable
- Returns:
- The first node
- Code:
- Exact Method Body:
    if (vectorAssigned)
        return (this.CURRENT_NODES.size() > 0) ? this.CURRENT_NODES.elementAt(0) : null;
    else
        return this.n;
-
lastCurrentNode
public HTMLNode lastCurrentNode()
Description copied from interface: Replaceable
The last node currently contained by this Replaceable.
- Specified by:
lastCurrentNode in interface Replaceable
- Returns:
- The last node
- Code:
- Exact Method Body:
    if (vectorAssigned)
    {
        final int S = this.CURRENT_NODES.size();
        return (S > 0) ? this.CURRENT_NODES.elementAt(S-1) : null;
    }
    else
        return this.n;
-
currentNodes
public java.util.Vector<HTMLNode> currentNodes()
Description copied from interface: Replaceable
All nodes currently contained by this Replaceable. The concrete classes which implement Replaceable (SubSection & TagNodeIndex) allow for the HTML they hold to be modified. The modification to a Replaceable happens independently from the original HTML page out of which it was copied.
Replaceable's are, in a sense, the exact opposite of Java's List method 'subList'. According to the Sun / Oracle documentation for java.util.List.subList(int fromIndex, int toIndex), any changes made to an instance of a 'subList' are immediately reflected back into the original List from which it was created.
The List.subList operation has the advantage of being extremely easy to work with. However, an HTML-page Vector has the potential of being hundreds of nodes long, so operations that involve insertion or deletion will likely be terribly inefficient.
When the HTML inside of a Replaceable is modified, nothing happens to the original Vector whatsoever. Until a user requests that the original HTML-Vector be updated to reflect all changes that he or she has made, the original HTML remains untouched. When an update request is finally issued, all changes are made at once.
Again, see Replacement.run to understand how quick updates on HTML pages are done using the Replaceable interface.
- Specified by:
currentNodes in interface Replaceable
- Returns:
- An HTML-Vector of the nodes.
The HTML-Vector which is returned by this method may be modified in any way that is necessary. If or when a user requests that the original HTML-Vector be updated to accommodate the changes that have been made, the contents of the Vector which is returned by this method will be used to replace the original HTML.
If this method is invoked more than once, the exact same Vector will be returned each time that the Current-Nodes are requested. The internal "Current-Nodes" HTML-Vector is a "per instance" singleton Vector.
- Code:
- Exact Method Body:
    if (! this.vectorAssigned)
    {
        this.CURRENT_NODES = new Vector<>(1);
        this.CURRENT_NODES.add(n);
        this.vectorAssigned = true;
    }

    return CURRENT_NODES;
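The contrast with List.subList drawn in the description above can be seen in a few lines of plain JDK code. This is a sketch of the two behaviours only, with strings standing in for HTMLNode instances and hypothetical helper names; it does not use the Replaceable interface itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CopyVersusViewDemo {
    /** Edits a subList view; the change writes through to the backing list. */
    static List<String> editThroughView(List<String> page) {
        List<String> view = page.subList(1, 3);
        view.set(0, "B");
        return page;
    }

    /** Edits an independent copy of the same range; the backing list is untouched. */
    static List<String> editCopy(List<String> page) {
        List<String> copy = new ArrayList<>(page.subList(1, 3));
        copy.set(0, "B");
        copy.add("X");
        return copy;
    }

    public static void main(String[] args) {
        List<String> page1 = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(editThroughView(page1)); // [a, B, c, d]

        List<String> page2 = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(editCopy(page2));        // [B, c, X]
        System.out.println(page2);                  // [a, b, c, d]
    }
}
```

The second behaviour is the one the description attributes to a Replaceable: edits accumulate in a separate container until an explicit update is requested.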
-
addAllInto
public boolean addAllInto(java.util.Vector<HTMLNode> fileVec)
Description copied from interface: Replaceable
Add all nodes currently retained in this instance into the HTML-Vector parameter html (named fileVec in this class). The nodes are appended to the end of 'html'. Implementing classes NodeIndex and SubSection simply use the Java Vector methods add (for NodeIndex) and addAll (for SubSection).
- Specified by:
addAllInto in interface Replaceable
- Parameters:
fileVec - The HTML-Vector into which the nodes will be appended (to the end of this Vector, using Vector methods add or addAll, dependent upon whether one or more-than-one nodes are being inserted).
- Returns:
- The result of Vector method add, or method addAll
- Code:
- Exact Method Body:
    if (this.vectorAssigned)
        return fileVec.addAll(CURRENT_NODES);
    else
        return fileVec.add(n);
-
addAllInto
public boolean addAllInto(int index, java.util.Vector<HTMLNode> fileVec)
Description copied from interface: Replaceable
Add all nodes currently retained in this instance into the HTML-Vector parameter html (named fileVec in this class).
- Specified by:
addAllInto in interface Replaceable
- Parameters:
index - The 'html' parameter's Vector-index where these nodes are to be inserted
fileVec - The HTML-Vector into which the nodes will be inserted at 'index' (using Vector methods insertElementAt or addAll, dependent upon whether one or more-than-one nodes are being inserted).
- Returns:
- The result of Vector method addAll, or true when a single node is inserted
- Code:
- Exact Method Body:
    if (this.vectorAssigned)
        return fileVec.addAll(index, this.CURRENT_NODES);
    else
        fileVec.insertElementAt(this.n, index);

    return true;
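The method body returns a literal `true` in the single-node branch because Vector.insertElementAt returns void, whereas Vector.addAll(int, Collection) reports whether the Vector changed. A small sketch of the two JDK calls follows, using strings as stand-ins for HTMLNode instances and hypothetical helper names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

public class InsertAtIndexDemo {
    /** Single-node insert: insertElementAt has no return value, so report true explicitly. */
    static boolean insertOne(Vector<String> v, int index, String node) {
        v.insertElementAt(node, index);
        return true;
    }

    /** Multi-node insert: addAll returns false when the supplied collection is empty. */
    static boolean insertMany(Vector<String> v, int index, List<String> nodes) {
        return v.addAll(index, nodes);
    }

    public static void main(String[] args) {
        Vector<String> v = new Vector<>(Arrays.asList("a", "d"));

        System.out.println(insertMany(v, 1, Arrays.asList("b", "c")));         // true
        System.out.println(insertOne(v, 0, "^"));                              // true
        System.out.println(insertMany(v, 0, Collections.<String>emptyList())); // false
        System.out.println(v);                                                 // [^, a, b, c, d]
    }
}
```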
-
update
public int update(java.util.Vector<HTMLNode> fileVec)
Description copied from interface: Replaceable
Replaces the original range of nodes inside originalHTML (named fileVec in this class) with the current-nodes of this instance, using the original-location of the node(s).
Replaceable's Primary Value:
The main value of using the Replaceable interface is to allow for more expedient replacing / modifying of HTML pages. If many changes need to be made to a page, first extracting and copying the sub-sections that need changing into Replaceable instances (using the Peek operations in package NodeSearch), and then re-copying those sections back into the original page-Vector after changing them, avoids the cost that would be incurred from repeatedly inserting and shifting a long list of nodes in a large HTML page.
Therefore, this method is probably best avoided, as it defeats the entire purpose of a Replaceable. This method will update the nodes at the location in the original Vector, which is fine, but if more than one update / change is needed, using this method over and over again will re-introduce the exact shifting that using Replaceable's was supposed to avoid in the first place.
The following example should make this clear:
Example:
    Vector<HTMLNode> page = HTMLPage.getPageTokens(new URL("http://some.url.com/"), false);
    Vector<SubSection> myTableRows = TagNodePeekInclusive.all(page, "tr");

    TagNode OPEN_SPAN = HTMLTags.hasTag("SPAN", TC.OpeningTags);
    TagNode CLOSE_SPAN = HTMLTags.hasTag("SPAN", TC.ClosingTags);
    int counter = 1;

    for (SubSection tableRow : myTableRows)
    {
        // Retrieve the <TR> Tag & Give it a CSS-ID
        TagNode tr = tableRow.html.elementAt(0).asTagNode().setID("ROW" + counter++, null);

        // Put the newly created <TR ID=..> into the vector. It was the first-element in the SubSection
        tableRow.html.setElementAt(tr, 0);

        // Add a <SPAN>...</SPAN> surrounding the first line of text
        // NOTE: This assumes that tableRow[1] (second SubSection node) is a TextNode with text
        tableRow.html.insertElementAt(OPEN_SPAN, 1);
        tableRow.html.insertElementAt(CLOSE_SPAN, 3);
    }

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // This version DESTROYS THE BENEFIT of using TagNodePeekInclusive
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    //
    // Here, if the original html-page was thousands of nodes long, every table-row
    // update will force thousands of nodes to be shifted to the right over-and-over
    // again!

    for (SubSection tableRow : myTableRows) tableRow.update(page);

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // This builds a new Vector much more efficiently, avoiding costly node-shifting
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    page = ReplaceNodes.r(page, myTableRows, false).a;
- Specified by:
update in interface Replaceable
- Parameters:
fileVec - The original page-Vector from which the nodes in this instance were retrieved
- Returns:
- The change in the size of the Vector
- See Also:
Replacement.run(Vector, Iterable, boolean)
- Code:
- Exact Method Body:
    if (! this.vectorAssigned)
    {
        fileVec.setElementAt(n, index);
        return 0;
    }
    else
    {
        ReplaceNodes.r(fileVec, this.index, this.CURRENT_NODES);
        return this.CURRENT_NODES.size() - 1;
    }
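The single-pass rebuild that the description recommends over repeated update calls can be sketched as below. This is not the implementation of ReplaceNodes.r or Replacement.run, whose internals are not shown here; `Patch` and `rebuild` are hypothetical names, strings stand in for HTMLNode instances, and the patches are assumed to be sorted by start position and non-overlapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

public class BatchRebuildDemo {
    /** Hypothetical stand-in for a Replaceable: an original half-open range plus its current nodes. */
    static final class Patch {
        final int start, end;        // original range [start, end) in the page
        final List<String> nodes;    // replacement contents, possibly of a different size
        Patch(int start, int end, List<String> nodes) {
            this.start = start; this.end = end; this.nodes = nodes;
        }
    }

    /**
     * Builds a new page in one left-to-right pass. Every node is copied once,
     * so no insertion ever shifts the tail of a long Vector.
     * Assumes 'patches' is sorted by start and the ranges do not overlap.
     */
    static Vector<String> rebuild(Vector<String> page, List<Patch> patches) {
        Vector<String> out = new Vector<>(page.size());
        int pos = 0;
        for (Patch p : patches) {
            out.addAll(page.subList(pos, p.start));  // untouched nodes before the patch
            out.addAll(p.nodes);                     // replacement nodes
            pos = p.end;                             // skip the original range
        }
        out.addAll(page.subList(pos, page.size()));  // untouched tail
        return out;
    }

    public static void main(String[] args) {
        Vector<String> page = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));
        List<Patch> patches = new ArrayList<>();
        patches.add(new Patch(1, 2, Arrays.asList("B1", "B2")));   // grow "b" into two nodes
        patches.add(new Patch(3, 4, new ArrayList<String>()));     // delete "d"
        System.out.println(rebuild(page, patches));                // [a, B1, B2, c, e]
    }
}
```

Because the patch ranges refer to positions in the original page and the output is a fresh Vector, the recorded indices never go stale during the rebuild.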
-
-