Package Torello.HTML
Class DotPair
- java.lang.Object
-
- Torello.HTML.DotPair
-
- All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.lang.Comparable<DotPair>, java.lang.Iterable<java.lang.Integer>
public final class DotPair extends java.lang.Object implements java.io.Serializable, java.lang.Comparable<DotPair>, java.lang.Cloneable, java.lang.Iterable<java.lang.Integer>
A simple utility class, used ubiquitously throughout Java HTML, which maintains two integer fields, DotPair.start and DotPair.end, for demarcating the beginning and ending of a sub-list within an HTML web-page.
The purpose of this class is to keep the starting and ending points of an array sub-list together. In a much older computer language (LISP/Scheme) a 'dotted pair' is just two values (here, two integers) that are glued to each other. Here, the two numbers are intended to represent the List-Start and List-End Index-Position values for the sub-list of a Vector.
The Name 'DotPair':
Calling this class "Arraysub-listEndPoints" would be a lot more descriptive, but that name would be too long to type, so the class is called 'DotPair' instead.
Important Note:
For every one of the Find, Get and Remove methods in the NodeSearch package, the input parameters sPos, ePos are designed such that:
- the "sPos" is inclusive, meaning that the Vector index denoted by the value of this parameter is included in the sub-list.
- the "ePos" is exclusive, meaning that the Vector index denoted by the value of this parameter is NOT included in the sub-list.
However, in class DotPair:
- the "start" is inclusive, meaning that the Vector index denoted by the value of this class field is included in the sub-list.
- the "end" is ALSO inclusive, meaning that the Vector index denoted by the value of this class field is ALSO included in the sub-list.
Generally the "sPos, ePos" method parameters and a DotPair.start or DotPair.end field have identical meanings, EXCEPT for the difference noted above.
- See Also:
NodeIndex, SubSection, Serialized Form
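The inclusive/exclusive difference described in the Important Note amounts to an off-by-one conversion. The following is a minimal sketch of that arithmetic only. The class InclusiveRange and its methods fromHalfOpen and toEPos are hypothetical stand-ins invented for this illustration; they are not part of Torello.HTML.

```java
// Hypothetical stand-in for illustration; this is NOT the Torello.HTML.DotPair class.
final class InclusiveRange {
    final int start; // inclusive
    final int end;   // inclusive, like DotPair.end

    InclusiveRange(int start, int end) {
        if (start < 0 || end < start)
            throw new IllegalArgumentException("bad range [" + start + ", " + end + "]");
        this.start = start;
        this.end = end;
    }

    // Converts half-open bounds (sPos inclusive, ePos exclusive) to inclusive end-points.
    static InclusiveRange fromHalfOpen(int sPos, int ePos) {
        return new InclusiveRange(sPos, ePos - 1);
    }

    // The exclusive bound to supply where an 'ePos'-style parameter is expected.
    int toEPos() { return end + 1; }

    // Both end-points count, hence the "+ 1".
    int size() { return end - start + 1; }

    public static void main(String[] args) {
        InclusiveRange r = InclusiveRange.fromHalfOpen(3, 8); // covers indices 3..7
        System.out.println("[" + r.start + ", " + r.end + "] size=" + r.size() + " ePos=" + r.toEPos());
        // prints [3, 7] size=5 ePos=8
    }
}
```

One consequence of two inclusive end-points is that an empty range cannot be represented: in this sketch, fromHalfOpen(4, 4) is rejected because the computed end would be smaller than start.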
Hi-Lited Source-Code:- View Here: Torello/HTML/DotPair.java
File Size: 26,248 Bytes Line Count: 594 '\n' Characters Found
-
-
Field Summary
Serializable ID
  Modifier and Type            Field
  static long                  serialVersionUID
Start & End Field
  Modifier and Type            Field
  int                          end
  int                          start
Alternate Sort Comparator
  Modifier and Type            Field
  static Comparator<DotPair>   comp2
-
Constructor Summary
Constructors
  Constructor                  Description
  DotPair(int start, int end)  This constructor takes two integers and saves them into the public member fields.
-
Method Summary
Basic Methods
  Modifier and Type            Method
  boolean                      isInside(int index)
  DotPair                      shift(int delta)
  int                          size()
Comparison Methods that accept another DotPair instance
  Modifier and Type            Method
  boolean                      enclosedBy(DotPair other)
  boolean                      encloses(DotPair other)
  boolean                      endsAfter(DotPair other)
  boolean                      isAfter(DotPair other)
  boolean                      isBefore(DotPair other)
  boolean                      overlaps(DotPair other)
  boolean                      startsBefore(DotPair other)
Methods: class java.lang.Object
  Modifier and Type            Method
  DotPair                      clone()
  boolean                      equals(Object o)
  int                          hashCode()
  String                       toString()
Methods: interface java.lang.Iterable
  Modifier and Type            Method
  PrimitiveIterator.OfInt      iterator()
  <T extends HTMLNode> Iterator<T>  iterator(Vector<T> page)
Methods: interface java.lang.Comparable
  Modifier and Type            Method
  int                          compareTo(DotPair other)
Exception Check Helper Methods
  Modifier and Type            Method
  void                         exceptionCheck(Vector<HTMLNode> page)
  void                         exceptionCheck(Vector<HTMLNode> page, String... possibleTokens)
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
- See Also:
- Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
-
start
public final int start
This is intended to be the "starting index" into a sub-array of an HTML Vector of HTMLNode elements.
-
end
public final int end
This is intended to be the "ending index" into a sub-array of an HTML Vector of HTMLNode elements.
-
comp2
public static java.util.Comparator<DotPair> comp2
This is an "alternative Comparator" that can be used for sorting instances of this class. It should work with the Collections.sort(List, Comparator) method in the standard JDK package java.util.*;
Comparator Heuristic:
This "extra Comparator" simply compares the size of one DotPair to a second. The smaller shall be sorted first, and the larger (longer-in-length) DotPair shall be sorted later. If they are of equal size, whichever of the two has an earlier start position in the Vector is considered first.
- See Also:
CommentNode.body
- Code:
- Exact Field Declaration Expression:
public static Comparator<DotPair> comp2 = (DotPair dp1, DotPair dp2) ->
{
    int ret = dp1.size() - dp2.size();
    return (ret != 0) ? ret : (dp1.start - dp2.start);
};
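To illustrate the ordering rule described for comp2 (smaller size first, earlier start as the tie-breaker), here is a minimal sketch that applies the same rule to a stand-in type. The classes SizeFirstSortSketch and Range, and the constant BY_SIZE_THEN_START, are hypothetical names used only for this example; the comparator is built with java.util.Comparator helpers rather than the subtraction shown in the declaration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SizeFirstSortSketch {
    // Hypothetical stand-in for DotPair: two inclusive int end-points.
    static final class Range {
        final int start, end;
        Range(int start, int end) { this.start = start; this.end = end; }
        int size() { return end - start + 1; }
        @Override public String toString() { return "[" + start + ", " + end + "]"; }
    }

    // Same ordering rule as described for comp2: smaller size first, then earlier start.
    static final Comparator<Range> BY_SIZE_THEN_START =
        Comparator.comparingInt(Range::size).thenComparingInt(r -> r.start);

    // Returns a sorted copy, leaving the input list untouched.
    static List<Range> sorted(List<Range> in) {
        List<Range> out = new ArrayList<>(in);
        out.sort(BY_SIZE_THEN_START);
        return out;
    }

    public static void main(String[] args) {
        List<Range> ranges = List.of(new Range(10, 20), new Range(5, 6), new Range(0, 1));
        System.out.println(sorted(ranges)); // prints [[0, 1], [5, 6], [10, 20]]
    }
}
```

The two ranges of size 2 are ordered by their start position, and the range of size 11 sorts last even though other orderings would place it by position.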
-
-
Constructor Detail
-
DotPair
public DotPair(int start, int end)
This constructor takes two integers and saves them into the public member fields.
- Parameters:
start - This is intended to store the starting position of a vectorized-webpage sub-list or subpage.
end - This will store the ending position of a vectorized-html webpage or subpage.
- Throws:
java.lang.IndexOutOfBoundsException - A negative 'start' or 'end' parameter-value will cause this exception to be thrown.
java.lang.IllegalArgumentException - A 'start' parameter-value that is larger than the 'end' parameter will cause this exception to be thrown.
- See Also:
NodeIndex, SubSection
- Code:
- Exact Constructor Body:
if (start < 0) throw new IndexOutOfBoundsException
    ("Negative start value passed to DotPair constructor: start = " + start);

if (end < 0) throw new IndexOutOfBoundsException
    ("Negative ending value passed to DotPair constructor: end = " + end);

if (end < start) throw new IllegalArgumentException(
    "Start-parameter value passed to constructor is greater than ending-parameter: " +
    "start: [" + start + "], end: [" + end + ']'
);

this.start = start;
this.end = end;
-
-
Method Detail
-
shift
public DotPair shift(int delta)
Creates a new instance that has been shifted by 'delta'.
- Parameters:
delta - The number of array indices to shift 'this' instance. This parameter may be negative, and if so, 'this' will be shifted left, instead of right.
- Returns:
- A new, shifted, instance of 'this'
- Code:
- Exact Method Body:
return new DotPair(this.start + delta, this.end + delta);
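Because the method body delegates to the constructor, a shift is subject to the constructor's checks. The sketch below mirrors the constructor checks and the shift body listed on this page in a stand-in class; ShiftSketch, Range and shiftFails are hypothetical names introduced for illustration only.

```java
class ShiftSketch {
    // Hypothetical stand-in mirroring the listed constructor checks and shift(...) body.
    static final class Range {
        final int start, end;

        Range(int start, int end) {
            if (start < 0 || end < 0) throw new IndexOutOfBoundsException("negative end-point");
            if (end < start) throw new IllegalArgumentException("end is less than start");
            this.start = start;
            this.end = end;
        }

        // Returns a new instance; the receiver is never modified.
        Range shift(int delta) { return new Range(start + delta, end + delta); }
    }

    // Returns true when shifting by delta is rejected by the constructor checks.
    static boolean shiftFails(Range r, int delta) {
        try { r.shift(delta); return false; }
        catch (IndexOutOfBoundsException e) { return true; }
    }

    public static void main(String[] args) {
        Range r = new Range(4, 9);
        Range left = r.shift(-4);
        System.out.println(left.start + ".." + left.end); // prints 0..5
        System.out.println(shiftFails(r, -5));             // prints true
    }
}
```

Under these assumptions, a negative delta whose magnitude exceeds start produces a negative index and is rejected with an IndexOutOfBoundsException rather than being clamped to zero.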
-
hashCode
public int hashCode()
Implements the standard Java 'hashCode()' method. This will provide a hash-code that is likely to avoid collisions.
- Overrides:
hashCode in class java.lang.Object
- Returns:
- A hash-code that may be used for inserting 'this' instance into a hashed table, map or list.
- Code:
- Exact Method Body:
return this.start + (1000 * this.end);
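Collisions are unlikely for typical page sizes but still possible with this formula, which is acceptable as long as equals(...) tells the instances apart. The sketch below copies the hashCode() and equals(...) bodies listed on this page into a stand-in class; HashCollisionSketch and Range are hypothetical names used only here.

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionSketch {
    // Hypothetical stand-in that copies the listed equals(...) and hashCode() bodies.
    static final class Range {
        final int start, end;
        Range(int start, int end) { this.start = start; this.end = end; }

        @Override public int hashCode() { return start + (1000 * end); }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Range)) return false;
            Range r = (Range) o;
            return (start == r.start) && (end == r.end);
        }
    }

    public static void main(String[] args) {
        Range a = new Range(1000, 2000); // 1000 + 1000 * 2000 = 2,001,000
        Range b = new Range(0, 2001);    //    0 + 1000 * 2001 = 2,001,000
        System.out.println(a.hashCode() == b.hashCode()); // prints true

        Set<Range> set = new HashSet<>();
        set.add(a);
        set.add(b);
        System.out.println(set.size()); // prints 2
    }
}
```

The two ranges share a hash-code, yet a HashSet keeps both because equals(...) compares the actual field values.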
-
size
public int size()
The purpose of this is to remind the user that the array bounds are inclusive at BOTH ends of the sub-list.
Inclusive & Exclusive:
For an instance of 'DotPair', the intention is to include both the node located at Vector-index position start and the one at end. Specifically (and unlike many of the NodeSearch package methods), both of the internal fields of this class are inclusive, rather than exclusive.
For the search methods in package Torello.HTML.NodeSearch, the 'ePos' parameters are always exclusive, meaning the node at Vector-index 'ePos' is not included in the search.
- Returns:
- The length of a sub-array that would be indicated by this dotted pair.
- Code:
- Exact Method Body:
return this.end - this.start + 1;
-
toString
public java.lang.String toString()
Java's toString() requirement.
- Overrides:
toString in class java.lang.Object
- Returns:
- A string representing 'this' instance of DotPair.
- Code:
- Exact Method Body:
return "[" + start + ", " + end + "]";
-
equals
public boolean equals(java.lang.Object o)
Java's public boolean equals(Object o) requirements.
- Overrides:
equals in class java.lang.Object
- Parameters:
o - This may be any Java Object, but only ones of 'this' type whose internal-values are identical will force this method to return TRUE.
- Returns:
TRUE if (and only if) parameter 'o' is an instanceof DotPair and, also, both have equal start and ending field values.
- Code:
- Exact Method Body:
if (o instanceof DotPair)
{
    DotPair dp = (DotPair) o;
    return (this.start == dp.start) && (this.end == dp.end);
}
else return false;
-
clone
public DotPair clone()
Java's interface Cloneable requirements. This instantiates a new DotPair with identical 'start', 'end' fields.
- Overrides:
clone in class java.lang.Object
- Returns:
- A new DotPair whose internal fields are identical to this one.
- Code:
- Exact Method Body:
return new DotPair(this.start, this.end);
-
compareTo
public int compareTo(DotPair other)
Java's interface Comparable<T> requirements. This is not the only comparison operation possible, but it does satisfy one reasonable requirement, SPECIFICALLY: which of two separate instances of DotPair starts first.
Comparator Heuristic:
If two DotPair instances begin at the same Vector-index, then the shorter of the two shall come first.
- Specified by:
compareTo in interface java.lang.Comparable<DotPair>
- Parameters:
other - Any other DotPair to be compared to 'this' DotPair
- Returns:
- An integer that fulfils Java's interface Comparable<T> public int compareTo(T t) method requirements.
- Code:
- Exact Method Body:
int ret = this.start - other.start;
return (ret != 0) ? ret : (this.size() - other.size());
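As an illustration of this natural ordering (earlier start first, shorter first on a tie), the sketch below copies the listed compareTo body into a stand-in class and sorts a small list with Collections.sort. NaturalOrderSketch and Range are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NaturalOrderSketch {
    // Hypothetical stand-in whose compareTo copies the body listed above.
    static final class Range implements Comparable<Range> {
        final int start, end;
        Range(int start, int end) { this.start = start; this.end = end; }
        int size() { return end - start + 1; }

        @Override public int compareTo(Range other) {
            int ret = this.start - other.start;
            return (ret != 0) ? ret : (this.size() - other.size());
        }

        @Override public String toString() { return "[" + start + ", " + end + "]"; }
    }

    // Returns a copy sorted by natural ordering.
    static List<Range> sorted(List<Range> in) {
        List<Range> out = new ArrayList<>(in);
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        List<Range> ranges = List.of(new Range(5, 9), new Range(0, 7), new Range(5, 6));
        System.out.println(sorted(ranges)); // prints [[0, 7], [5, 6], [5, 9]]
    }
}
```

Two ranges compare as 0 only when both start and size match, which means both fields match, so this ordering agrees with the equals(...) body shown on this page.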
-
iterator
public java.util.PrimitiveIterator.OfInt iterator()
This shall return an int Iterator (which is properly named class java.util.PrimitiveIterator.OfInt) that iterates integers beginning with the value in this.start and ending with the value in this.end.
- Specified by:
iterator in interface java.lang.Iterable<java.lang.Integer>
- Returns:
- An Iterator that iterates 'this' instance of DotPair from the beginning of the range to the end of the range. The Iterator returned will produce Java's primitive type int.
NOTE: The elements returned by the Iterator are integers; in effect, this is nothing more than an Iterator that counts from start to end.
- Code:
- Exact Method Body:
return new PrimitiveIterator.OfInt()
{
    private int cursor = start;

    public boolean hasNext() { return this.cursor <= end; }

    public int nextInt()
    {
        if (cursor == end) throw new NoSuchElementException
            ("Cursor has reached the value stored in 'end' [" + end + "]");

        return cursor++;
    }
};
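As listed, nextInt() throws once the cursor equals end, while hasNext() still reports TRUE at that point, so the value end itself would not be returned. The following is a minimal sketch of a counting iterator that matches the documented start-to-end (inclusive) contract. It is an illustration under that assumption, not the library's implementation; InclusiveIntIteratorSketch, inclusive and sum are hypothetical names.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;

class InclusiveIntIteratorSketch {
    // Counts from start to end, both inclusive; the parameters play the role of the DotPair fields.
    static PrimitiveIterator.OfInt inclusive(final int start, final int end) {
        return new PrimitiveIterator.OfInt() {
            private int cursor = start;

            @Override public boolean hasNext() { return cursor <= end; }

            @Override public int nextInt() {
                // Throw only after 'end' itself has been handed out.
                if (cursor > end)
                    throw new NoSuchElementException("Cursor has passed 'end' [" + end + "]");
                return cursor++;
            }
        };
    }

    // Adds up every value the iterator yields.
    static int sum(int start, int end) {
        int total = 0;
        for (PrimitiveIterator.OfInt it = inclusive(start, end); it.hasNext(); )
            total += it.nextInt();
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(3, 5)); // prints 12
    }
}
```

The guard in nextInt() uses the exact negation of the hasNext() condition, which keeps the two methods consistent for a single-element range such as [4, 4].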
-
iterator
public <T extends HTMLNode> java.util.Iterator<T> iterator (java.util.Vector<T> page)
A simple Iterator that will iterate elements on an input page, using the indices start and end of 'this' instance of DotPair.
- Parameters:
page - This may be any HTML page or sub-page. This page should correspond to 'this' instance of DotPair.
- Returns:
- An Iterator that will iterate each node in the page, beginning with the node at page.elementAt(this.start), and ending with page.elementAt(this.end)
- Throws:
java.lang.IndexOutOfBoundsException - This throws if the range of 'this' instance does not fit within the size of the input 'page' parameter.
- Code:
- Exact Method Body:
if (this.start >= page.size()) throw new IndexOutOfBoundsException(
    "This instance of DotPair points to elements that are outside of the range of the" +
    "input 'page' Vector.\n" +
    "'page' parameter size: " + page.size() + ", this.start: [" + this.start + "]"
);

if (this.end >= page.size()) throw new IndexOutOfBoundsException(
    "This instance of DotPair points to elements that are outside of the range of the" +
    "input 'page' Vector.\n" +
    "'page' parameter size: " + page.size() + ", this.end: [" + this.end + "]"
);

return new Iterator<T>()
{
    private int cursor = start; // a.k.a. 'this.start'
    private int expectedSize = page.size();
    private int last = end; // a.k.a. 'this.end'

    public boolean hasNext() { return cursor < last; }

    public T next()
    {
        if (++cursor > last) throw new NoSuchElementException(
            "This iterator's cursor has run past the end of the DotPaiar instance that " +
            "formed this Iterator. No more elements to iterate. Did you call hasNext() ?"
        );

        if (page.size() != expectedSize) throw new ConcurrentModificationException(
            "The expected size of the underlying vector has changed." +
            "\nCurrent-Size " + "[" + page.size() + "], Expected-Size [" + expectedSize + "]\n" +
            "\nCursor location: [" + cursor + "]"
        );

        return page.elementAt(cursor);
    }

    // Removes the node from the underlying {@code Vector at the cursor's location.
    public void remove()
    { page.removeElementAt(cursor); expectedSize--; cursor--; last--; }
};
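As listed, the cursor begins at start and next() increments it before reading, so the first node returned appears to be the one at start + 1 rather than at start. Below is a minimal sketch of a sub-range iterator that returns every element from start through end inclusive and supports remove(). It uses a Vector of String in place of HTMLNode and omits the concurrent-modification check. SubRangeIteratorSketch and over are hypothetical names, and this is not the library's implementation.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Vector;

class SubRangeIteratorSketch {
    // Iterates page[start..end], both inclusive; remove() deletes the element last returned.
    static <T> Iterator<T> over(final Vector<T> page, final int start, final int end) {
        if (start < 0 || end < start || end >= page.size())
            throw new IndexOutOfBoundsException(
                "range [" + start + ", " + end + "] does not fit size " + page.size());

        return new Iterator<T>() {
            private int next = start;          // index of the element next() will return
            private int last = end;            // inclusive upper bound, shrinks on remove()
            private boolean canRemove = false; // true only right after a call to next()

            @Override public boolean hasNext() { return next <= last; }

            @Override public T next() {
                if (next > last) throw new NoSuchElementException("no more elements in range");
                canRemove = true;
                return page.elementAt(next++);
            }

            @Override public void remove() {
                if (!canRemove) throw new IllegalStateException("call next() before remove()");
                page.removeElementAt(--next);  // step back onto the element just returned
                last--;                        // the range is now one element shorter
                canRemove = false;
            }
        };
    }

    public static void main(String[] args) {
        Vector<String> page = new Vector<>(List.of("a", "b", "c", "d", "e"));
        Iterator<String> it = over(page, 1, 3);
        while (it.hasNext())
            if (it.next().equals("c")) it.remove();
        System.out.println(page); // prints [a, b, d, e]
    }
}
```

Tracking the index of the next element, rather than the last one returned, lets hasNext() and next() share a single comparison against the inclusive bound.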
-
isInside
public boolean isInside(int index)
This will test whether a specific index is contained between this.start and this.end, inclusively.
- Parameters:
index - This is any integer index value. It must not be negative.
- Returns:
TRUE if the value of index is greater-than-or-equal-to the value stored in field 'start' and, furthermore, is less-than-or-equal-to the value of field 'end'.
- Throws:
java.lang.IndexOutOfBoundsException - If the value of index is negative, this exception will be thrown.
- Code:
- Exact Method Body:
if (index < 0) throw new IndexOutOfBoundsException
    ("You have passed a negative index [" + index + "] here, but this is not allowed.");

return (index >= start) && (index <= end);
-
enclosedBy
public boolean enclosedBy(DotPair other)
Tests whether 'this' DotPair is fully enclosed by DotPair parameter 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if (and only if) parameter 'other' encloses 'this'.
- Code:
- Exact Method Body:
return (other.start <= this.start) && (other.end >= this.end);
-
encloses
public boolean encloses(DotPair other)
Tests whether 'this' DotPair completely encloses DotPair parameter 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if (and only if) parameter 'other' is enclosed completely by 'this'.
- Code:
- Exact Method Body:
return (this.start <= other.start) && (this.end >= other.end);
-
overlaps
public boolean overlaps(DotPair other)
Tests whether parameter 'other' has any overlapping Vector-indices with 'this' DotPair.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if (and only if) parameter 'other' and 'this' have any overlap.
- Code:
- Exact Method Body:
return ((this.start >= other.start) && (this.start <= other.end)) || ((this.end >= other.start) && (this.end <= other.end));
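The listed expression asks whether either end-point of 'this' falls inside 'other'. When 'this' strictly encloses 'other', neither end-point does, so the expression evaluates to FALSE even though the two ranges share indices. The sketch below compares the listed expression with a symmetric test on plain ints; OverlapSketch, listed and symmetric are hypothetical names, and the symmetric form is offered as an assumption about the intended meaning of "any overlap".

```java
class OverlapSketch {
    // The expression listed above, with 'this' written as (s1, e1) and 'other' as (s2, e2).
    static boolean listed(int s1, int e1, int s2, int e2) {
        return ((s1 >= s2) && (s1 <= e2)) || ((e1 >= s2) && (e1 <= e2));
    }

    // Two inclusive ranges share an index exactly when each one starts no later than the other ends.
    static boolean symmetric(int s1, int e1, int s2, int e2) {
        return (s1 <= e2) && (s2 <= e1);
    }

    public static void main(String[] args) {
        // 'this' = [0, 10] strictly encloses 'other' = [3, 5]
        System.out.println(listed(0, 10, 3, 5));    // prints false
        System.out.println(symmetric(0, 10, 3, 5)); // prints true

        // With the roles swapped, both forms agree
        System.out.println(listed(3, 5, 0, 10));    // prints true
        System.out.println(symmetric(3, 5, 0, 10)); // prints true
    }
}
```

With the listed expression, a.overlaps(b) and b.overlaps(a) can therefore disagree; the symmetric form gives the same answer in both directions.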
-
isBefore
public boolean isBefore(DotPair other)
Tests whether 'this' lies, completely, before DotPair parameter 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if every index of 'this' has a value that is less than every index of 'other'
- Code:
- Exact Method Body:
return this.end < other.start;
-
startsBefore
public boolean startsBefore(DotPair other)
Tests whether 'this' begins before DotPair parameter 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if this.start is less than other.start, and FALSE otherwise.
- Code:
- Exact Method Body:
return this.start < other.start;
-
isAfter
public boolean isAfter(DotPair other)
Tests whether 'this' lies, completely, after DotPair parameter 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if every index of 'this' has a value that is greater than every index of 'other'
- Code:
- Exact Method Body:
return this.start > other.end;
-
endsAfter
public boolean endsAfter(DotPair other)
Tests whether 'this' ends after DotPair parameter 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
TRUE if this.end is greater than other.end, and FALSE otherwise.
- Code:
- Exact Method Body:
return this.end > other.end;
-
exceptionCheck
public void exceptionCheck(java.util.Vector<HTMLNode> page, java.lang.String... possibleTokens)
A method that will do a fast check that 'this' instance holds index-pointers to an opening and closing HTML-Tag pair. Though these mistakes may seem trivial, when parsing Internet Web-Pages they are exactly the type of basic mistakes that users will make when their level of 'concentration' is low. This is no different than checking an array-index or String-index for an IndexOutOfBoundsException.
This type of detailed exception message can make analyzing web-pages more direct and less error-prone. The 'cost' incurred includes only a few if-statement comparisons, and this check should be performed immediately before a loop is entered.
- Parameters:
page - Any web-page, or sub-page. It needs to be the page from which 'this' instance of DotPair was retrieved.
possibleTokens - The HTML tokens that are acceptable for the TagNode's at indices this.start and this.end.
- Throws:
TagNodeExpectedException - If the start or end fields of 'this' instance do not point to TagNode elements on the 'page'.
HTMLTokException - If the TagNode's at start and end do not have matching TagNode.tok fields, or if that tok does not equal (ignoring case) any of the Strings in parameter 'possibleTokens'.
OpeningTagNodeExpectedException - If start does not point to an opening TagNode.
ClosingTagNodeExpectedException - If end does not point to a closing TagNode.
java.lang.NullPointerException - If the 'page' parameter is null.
java.lang.IndexOutOfBoundsException - If start or end is greater than or equal to page.size().
ExceptionCheckError - IMPORTANT: Since this method is, indubitably, a method for performing error checking, the presumption is that the programmer is trying to check for his user's input. If, in the process of checking for user error, another mistake is made that would generate an exception, this must be thought of as a more serious error.
The purpose of the 'possibleTokens' array is to check that those tokens match the tokens that are contained by the TagNode's on the page at indices this.start and this.end. If invalid HTML tokens, null tokens, or even HTML Singleton tokens are passed, this exception-check, itself, is flawed! If there are problems with this var-args array, this error is thrown.
It is more serious because it indicates that the programmer has made a mistake in attempting to check for user-errors.
- Code:
- Exact Method Body:
if (page == null) throw new NullPointerException
    ("HTML-Vector parameter was passed a null reference.");

if (possibleTokens == null) throw new ExceptionCheckError
    ("HTML tags string-list was passed a null reference.");

for (String token : possibleTokens)
{
    if (token == null) throw new ExceptionCheckError
        ("One of the HTML Tag's in the tag-list String-array was null.");

    if (! HTMLTags.isTag(token)) throw new ExceptionCheckError
        ("One of the passed tokens [" + token +"] is not a valid HTML token.");

    if (HTMLTags.isSingleton(token)) throw new ExceptionCheckError
        ("One of the passed tokens [" + token +"] is an HTML Singleton.");
}

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// Check the DotPair.start
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

if (this.start >= page.size()) throw new IndexOutOfBoundsException(
    "DotPair's 'start' field [" + this.start + "], is greater than or equal to the " +
    "size of the HTML-Vector [" + page.size() + "]."
);

if (! (page.elementAt(this.start) instanceof TagNode))
    throw new TagNodeExpectedException(this.start);

TagNode t1 = (TagNode) page.elementAt(this.start);

if (t1.isClosing) throw new OpeningTagNodeExpectedException(
    "The TagNode at index [" + this.start + "] was a closing " +
    "</" + t1.tok.toUpperCase() + ">, but an opening tag was expected here."
);

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// Now Check the DotPair.end
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

if (this.end >= page.size()) throw new IndexOutOfBoundsException(
    "DotPair's 'end' field [" + this.end + "], is greater than or equal to the " +
    "size of the HTML-Vector [" + page.size() + "]."
);

if (! (page.elementAt(this.end) instanceof TagNode))
    throw new TagNodeExpectedException(this.end);

TagNode t2 = (TagNode) page.elementAt(this.end);

if (! t2.isClosing) throw new ClosingTagNodeExpectedException(
    "The TagNode at index [" + this.start + "] was an opening " +
    "<" + t2.tok.toUpperCase() + ">, but a closing tag was expected here."
);

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// Token Check
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

if (! t1.tok.equalsIgnoreCase(t2.tok)) throw new HTMLTokException(
    "The opening TagNode was the [" + t1.tok.toLowerCase() + "] HTML Tag, while the " +
    "closing Tag was the [" + t2.tok.toLowerCase() + "]. These two tag's must be an " +
    "opening and closing pair, and therefore must match each-other."
);

for (String possibleToken : possibleTokens)
    if (possibleToken.equalsIgnoreCase(t1.tok))
        return;

String t = t1.tok.toUpperCase();

throw new HTMLTokException(
    "The opening and closing tags were: <" + t + ">, and </" + t + ">, but " +
    "unfortunately this Tag is not included among the list of expected tags:\n" +
    " [" + StrCSV.toCSV(possibleTokens, false, false, 60) + "]."
);
-
exceptionCheck
public void exceptionCheck(java.util.Vector<HTMLNode> page)
Performs an exception check, using 'this' instance of DotPair, and throws an IndexOutOfBoundsException if 'this' contains end-points that do not fit inside the 'page' Vector parameter.
- Parameters:
page - Any HTML Page, or subpage. page.size() must return a value that is larger than BOTH start AND end.
- Throws:
java.lang.IndexOutOfBoundsException - A value for start or end that is greater than or equal to the size of the Vector parameter 'page' will cause this exception to be thrown.
- Code:
- Exact Method Body:
if (this.end >= page.size()) throw new IndexOutOfBoundsException(
    "The value of this.end [" + this.end + "] is greater than the size of Vector " +
    "parameter 'page' [" + page.size() + "]"
);

// This is actually unnecessary. If 'end' is fine, then 'start' must be fine. If 'end' is
// out of bounds, then it is irrelevant whether 'start' is out of bounds. "They" play with
// your brain when you are coding.

/*
if (this.start >= page.size()) throw new IndexOutOfBoundsException(
    "The value of this.start [" + this.start + "] is greater than the size of Vector " +
    "parameter 'page' [" + page.size() + "]"
);
*/
-
-