Package Torello.HTML
Class Util.Count
- java.lang.Object
-
- Torello.HTML.Util.Count
-
- Enclosing class:
- Util
public static class Util.Count extends java.lang.Object
Source File: Torello/HTML/Util.java
File Size: 13,283 Bytes. Line Count: 320 '\n' characters found.
Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 15 Method(s), 15 declared static
- 0 Field(s)
-
-
Method Summary
Count CommentNode instances
- static int commentNodes(Vector<HTMLNode> page)
- static int commentNodes(Vector<HTMLNode> page, int sPos, int ePos)
- static int commentNodes(Vector<HTMLNode> page, DotPair dp)
Count TagNode instances
- static int tagNodes(Vector<HTMLNode> page)
- static int tagNodes(Vector<HTMLNode> page, int sPos, int ePos)
- static int tagNodes(Vector<HTMLNode> page, DotPair dp)
Count TextNode instances
- static int textNodes(Vector<HTMLNode> page)
- static int textNodes(Vector<HTMLNode> page, int sPos, int ePos)
- static int textNodes(Vector<HTMLNode> page, DotPair dp)
Count all New-Lines
- static int newLines(Vector<? extends HTMLNode> html)
- static int newLines(Vector<? extends HTMLNode> html, int sPos, int ePos)
- static int newLines(Vector<? extends HTMLNode> html, DotPair dp)
Count TagNode Tokens
- static Ret2<Hashtable<String, Integer>, Hashtable<String, Integer>> tagNodesToTable(Vector<HTMLNode> page)
- static Ret2<Hashtable<String, Integer>, Hashtable<String, Integer>> tagNodesToTable(Vector<HTMLNode> page, int sPos, int ePos)
- static Ret2<Hashtable<String, Integer>, Hashtable<String, Integer>> tagNodesToTable(Vector<HTMLNode> page, DotPair dp)
-
-
-
Method Detail
-
textNodes
public static int textNodes(java.util.Vector<HTMLNode> page, int sPos, int ePos)
Counts the number of TextNode's in a Vector<HTMLNode> between the demarcated Vector positions, 'sPos' and 'ePos'.
- Parameters:
page - Any HTML page.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
The number of TextNode's in the Vector between the demarcated indices.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if 'sPos' is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- Code:
- Exact Method Body:
    int counter = 0;
    LV l = new LV(page, sPos, ePos);

    // Iterates the entire page between sPos and ePos, incrementing the count for every
    // instance of text-node.
    for (int i=l.start; i < l.end; i++)
        if (page.elementAt(i).isTextNode())
            counter++;

    return counter;
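The index rules described above (inclusive 'sPos', exclusive 'ePos', and a negative 'ePos' meaning "to the end of the Vector") can be illustrated with a minimal, self-contained sketch. The class RangeCountSketch and its count method are hypothetical stand-ins written for this illustration; they are not part of Torello.HTML, and the library's own LV class may differ in its details. Plain String elements are used in place of HTMLNode.

```java
import java.util.List;
import java.util.Vector;
import java.util.function.Predicate;

// Hypothetical illustration only; not the Torello.HTML implementation.
final class RangeCountSketch
{
    // Counts the elements in the half-open range [sPos, ePos) that satisfy 'test'.
    // A negative ePos is first reset to v.size(), as the documentation describes.
    static <T> int count(Vector<T> v, int sPos, int ePos, Predicate<T> test)
    {
        if (ePos < 0) ePos = v.size();

        // The three documented failure conditions, checked after the reset above.
        if (sPos < 0 || sPos >= v.size())
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);
        if (ePos == 0 || ePos > v.size())
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);
        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos is larger than ePos");

        int counter = 0;
        for (int i = sPos; i < ePos; i++)
            if (test.test(v.elementAt(i))) counter++;

        return counter;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(List.of("<p>", "hello", "</p>", "world"));
        Predicate<String> isText = s -> !s.startsWith("<");

        System.out.println(count(page, 0, -1, isText)); // whole Vector: prints 2
        System.out.println(count(page, 0, 2, isText));  // indices 0 and 1 only: prints 1
    }
}
```

Under these assumptions, passing (0, -1) covers the whole Vector, which is the same argument pair that the single-parameter overloads in this class forward to their range-based counterparts.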
-
commentNodes
public static int commentNodes(java.util.Vector<HTMLNode> page)
- Code:
- Exact Method Body:
return commentNodes(page, 0, -1);
-
commentNodes
public static int commentNodes(java.util.Vector<HTMLNode> page, DotPair dp)
- Code:
- Exact Method Body:
return commentNodes(page, dp.start, dp.end + 1);
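The 'dp.end + 1' in the method body above suggests that a DotPair stores an inclusive end index, which must be converted to the exclusive 'ePos' expected by the range-based overload. The sketch below illustrates that conversion; the Span class is a hypothetical stand-in for DotPair, and countNonBlank is an invented example method, neither of which belongs to the library.

```java
import java.util.List;
import java.util.Vector;

// Hypothetical illustration only; 'Span' stands in for DotPair.
final class InclusiveRangeSketch
{
    // Assumption: both 'start' and 'end' are inclusive indices.
    static final class Span
    {
        final int start, end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    // Counts non-blank strings in the half-open range [sPos, ePos).
    static int countNonBlank(Vector<String> v, int sPos, int ePos)
    {
        int counter = 0;
        for (int i = sPos; i < ePos; i++)
            if (!v.elementAt(i).trim().isEmpty()) counter++;
        return counter;
    }

    // Inclusive overload: 'end + 1' turns the inclusive end into an exclusive one.
    static int countNonBlank(Vector<String> v, Span span)
    { return countNonBlank(v, span.start, span.end + 1); }

    public static void main(String[] args)
    {
        Vector<String> v = new Vector<>(List.of("a", " ", "b", "c"));

        // Span [1, 2] covers indices 1 and 2: one blank and one non-blank element.
        System.out.println(countNonBlank(v, new Span(1, 2))); // prints 1
    }
}
```

Without the '+ 1', the element at the Span's last index would be silently skipped, which is the usual off-by-one hazard when mixing inclusive and exclusive range conventions.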
-
commentNodes
public static int commentNodes(java.util.Vector<HTMLNode> page, int sPos, int ePos)
Counts the number of CommentNode's in a Vector<HTMLNode> between the demarcated Vector positions, 'sPos' and 'ePos'.
- Parameters:
page - Any HTML page.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
The number of CommentNode's in the Vector between the demarcated indices.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if 'sPos' is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- Code:
- Exact Method Body:
    int counter = 0;
    LV l = new LV(page, sPos, ePos);

    // Iterates the entire page between sPos and ePos, incrementing the count for every
    // instance of comment-node.
    for (int i=l.start; i < l.end; i++)
        if (page.elementAt(i).isCommentNode())
            counter++;

    return counter;
-
tagNodes
public static int tagNodes(java.util.Vector<HTMLNode> page, int sPos, int ePos)
Counts the number of TagNode's in a Vector<HTMLNode> between the demarcated Vector positions, 'sPos' and 'ePos'.
- Parameters:
page - Any HTML page.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
The number of TagNode's in the Vector between the demarcated indices.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if 'sPos' is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- Code:
- Exact Method Body:
    int counter = 0;
    LV l = new LV(page, sPos, ePos);

    // Iterates the entire page between sPos and ePos, incrementing the count for every
    // instance of TagNode.
    for (int i=l.start; i < l.end; i++)
        if (page.elementAt(i).isTagNode())
            counter++;

    return counter;
-
tagNodesToTable
public static Ret2<java.util.Hashtable<java.lang.String,java.lang.Integer>,java.util.Hashtable<java.lang.String,java.lang.Integer>> tagNodesToTable (java.util.Vector<HTMLNode> page)
- Code:
- Exact Method Body:
return tagNodesToTable(page, 0, -1);
-
tagNodesToTable
public static Ret2<java.util.Hashtable<java.lang.String,java.lang.Integer>,java.util.Hashtable<java.lang.String,java.lang.Integer>> tagNodesToTable (java.util.Vector<HTMLNode> page, DotPair dp)
- Code:
- Exact Method Body:
return tagNodesToTable(page, dp.start, dp.end + 1);
-
tagNodesToTable
public static Ret2<java.util.Hashtable<java.lang.String,java.lang.Integer>,java.util.Hashtable<java.lang.String,java.lang.Integer>> tagNodesToTable (java.util.Vector<HTMLNode> page, int sPos, int ePos)
For each tag in HTML-5 (according to class HTMLTags), this method counts the number of instances of each TagNode contained by a Vector<HTMLNode>. The count is performed on nodes between the parameter-provided Vector-indices, and the results are placed into two Hashtable's: one for opening tags and one for closing tags.
- Parameters:
page - Any HTML page.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
The returned Ret2 instance contains the following data:
- ret2.a: A java.util.Hashtable that contains one entry for each HTML tag that occurs as an opening tag between the page's demarcated indices, 'sPos' and 'ePos'. The keys in this table are Java String's that contain a lower-case tag token (such as "div", "p", "span", etc.). The values in this table contain a count of the number of opening tags that were identified within the page.
- ret2.b: A java.util.Hashtable with counts for each closing tag on the page, in a manner identical to that described above for ret2.a, except that the counts in this table are for closing tags rather than opening tags: </div> tags, rather than <DIV ...> tags.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if 'sPos' is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- Code:
- Exact Method Body:
    LV l = new LV(page, sPos, ePos);
    TagNode tn = null;

    Hashtable<String, Integer> openTags = new Hashtable<>();
    Hashtable<String, Integer> closedTags = new Hashtable<>();

    // Iterates the entire page between sPos and ePos, incrementing the count for every
    // instance of TagNode.
    for (int i=l.start; i < l.end; i++)
    {
        if ((tn = page.elementAt(i).ifTagNode()) == null) continue;

        Hashtable<String, Integer> ht = tn.isClosing ? closedTags : openTags;
        Integer count = ht.get(tn.tok);

        if (count == null) count = 1;
        else count = count + 1;

        ht.put(tn.tok, count);
    }

    return new Ret2<>(openTags, closedTags);
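The two-table tally performed by the method body above can be sketched in isolation. In the example below, Tag and Tables are hypothetical stand-ins for TagNode and Ret2 (they are not library classes), and the get / null-check / put sequence is condensed into the standard java.util.Hashtable.merge call, which is an equivalent idiom rather than the library's own code.

```java
import java.util.Hashtable;
import java.util.List;

// Hypothetical illustration only; 'Tag' and 'Tables' stand in for TagNode and Ret2.
final class TagTallySketch
{
    // A minimal tag: a lower-case token plus a flag for closing tags such as </div>.
    static final class Tag
    {
        final String tok;
        final boolean isClosing;
        Tag(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    // Holds the two result tables, playing the role of 'ret2.a' and 'ret2.b'.
    static final class Tables
    {
        final Hashtable<String, Integer> open   = new Hashtable<>();
        final Hashtable<String, Integer> closed = new Hashtable<>();
    }

    static Tables tally(List<Tag> tags)
    {
        Tables t = new Tables();

        // merge(...) stores 1 for a token seen for the first time, and otherwise
        // adds 1 to the count already stored for that token.
        for (Tag tag : tags)
            (tag.isClosing ? t.closed : t.open).merge(tag.tok, 1, Integer::sum);

        return t;
    }

    public static void main(String[] args)
    {
        Tables t = tally(List.of(
            new Tag("div", false), new Tag("p", false),
            new Tag("div", false), new Tag("div", true)
        ));

        System.out.println(t.open.get("div"));   // prints 2
        System.out.println(t.closed.get("div")); // prints 1
        System.out.println(t.closed.get("p"));   // prints null: no </p> was seen
    }
}
```

As the last line shows, a token that never occurs in a given role has no entry at all in that table, so callers of a tally like this one should expect null rather than zero from get.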
-
newLines
public static int newLines(java.util.Vector<? extends HTMLNode> html, int sPos, int ePos)
This will count the number of new-line symbols present on the partial HTML page. The count is the sum, over every HTMLNode.str in the range, of the standard new-line sequences \r\n, \r and \n, meaning that UNIX, Microsoft, Apple, etc. forms of line termination are all treated equally.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
The number of new-line sequences in all of the HTMLNode's that occur between vectorized-page positions 'sPos' and 'ePos'. A \r\n pair is matched as a single new-line.
NOTE: The regular-expression used here, 'NEWLINEP', is as follows:
private static final Pattern NEWLINEP = Pattern.compile("\\r\\n|\\r|\\n");
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if 'sPos' is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
StringParse.NEWLINEP
- Code:
- Exact Method Body:
    int newLineCount = 0;
    LV l = new LV(html, sPos, ePos);

    for (int i=l.start; i < l.end; i++)

        // Uses the Torello.Java.StringParse "New Line RegEx"
        for (   Matcher m = StringParse.NEWLINEP.matcher(html.elementAt(i).str);
                m.find();
                newLineCount++);

    return newLineCount;
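The matching behavior of the regular expression quoted above can be checked on plain strings with only the standard java.util.regex package. The class NewLineCountSketch below is a hypothetical helper written for this illustration; the pattern is copied from the documentation, but the field here is a local copy and not the library's StringParse.NEWLINEP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Torello.HTML or Torello.Java.
final class NewLineCountSketch
{
    // Same expression as quoted in the documentation. Because "\r\n" is the first
    // alternative, a Windows line ending is matched once rather than as two new-lines.
    private static final Pattern NEWLINEP = Pattern.compile("\\r\\n|\\r|\\n");

    // Counts the new-line sequences in a single string.
    static int count(String s)
    {
        int n = 0;
        Matcher m = NEWLINEP.matcher(s);
        while (m.find()) n++;
        return n;
    }

    public static void main(String[] args)
    {
        System.out.println(count("a\r\nb\rc\nd")); // prints 3: one of each form
        System.out.println(count("\r\n\r\n"));     // prints 2: each pair counts once
        System.out.println(count("no breaks"));    // prints 0
    }
}
```

The order of the alternatives matters: with "\\r|\\n|\\r\\n" the first alternative would consume the '\r' alone, and every "\r\n" pair would be counted twice.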
-
-