Package Torello.HTML
Class Listeners
- java.lang.Object
-
- Torello.HTML.Listeners
-
public class Listeners extends java.lang.Object
A basic tool for finding JavaScript Listener Attributes in the TagNode elements of a Vectorized-HTML Web-Page.
This class allows a user to search for listeners in a page or sub-page. It uses the same hierarchy of programmer-call options to decide where to look; the search parameters are expressed as different method overloads with different argument lists.
Use of JavaScript Listeners:
Quite a number of large web-sites no longer place JavaScript directly in the page itself. Searching through the content served by a major hub web-site and looking for JavaScript Listener-Attributes will often return 0 results.
There are often JavaScript files downloaded from the <HEAD> ... <SCRIPT></SCRIPT> tags, but generally, if there is scripted content, the script will operate on the CSS class, the id, or a React-JS tag.
In this way, inserting script directly into the body-text of the HTML page is avoided. If you are scraping a page you have written yourself, and it does have JavaScript, then by all means test it out. However, do not be surprised if these methods return 0 results: for many of the large news web-sites and search-engines that were tested, listeners inside HTML Elements seemed uncommon.
Find & Get:
FIND implies that an 'int' position within the Vector (a pointer) will be returned as the search result(s) from the method.
GET implies that the actual TagNode itself (not a pointer to its index) shall be returned from the method.
Method Parameters:
int sPos, int ePos: When these parameters are present, only HTMLNode's between these specified Vector indices will be considered for matching the search criteria.
String htmlTags: When this parameter is present, only HTML TagNode's whose "primary tag" matches this string will be considered.
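The difference between FIND and GET, and the effect of the sPos / ePos window, can be illustrated with a small self-contained sketch. It does not use the library: plain Strings stand in for the HTMLNode's, and the class name FindVersusGetSketch, the helper looksLikeListener and the sample page are all hypothetical names invented for this illustration.

```java
import java.util.Arrays;
import java.util.Vector;
import java.util.stream.IntStream;

// Hypothetical stand-in: plain Strings play the role of the HTMLNode's in a page-Vector.
final class FindVersusGetSketch
{
    // Assumption: a naive test for a single listener name, not the library's hasListener.
    static boolean looksLikeListener(String node)
    { return node.toLowerCase().contains(" onclick="); }

    // FIND: returns the Vector-indices of the matches inside [sPos, ePos)
    static int[] find(Vector<String> page, int sPos, int ePos)
    {
        IntStream.Builder b = IntStream.builder();

        for (int i = sPos; i < ePos; i++)
            if (looksLikeListener(page.elementAt(i))) b.add(i);

        return b.build().toArray();
    }

    // GET: returns the matching nodes themselves, inside [sPos, ePos)
    static Vector<String> get(Vector<String> page, int sPos, int ePos)
    {
        Vector<String> ret = new Vector<>();

        for (int i = sPos; i < ePos; i++)
            if (looksLikeListener(page.elementAt(i))) ret.add(page.elementAt(i));

        return ret;
    }

    // A tiny invented "page" with two elements that carry an onclick attribute
    static Vector<String> samplePage()
    {
        return new Vector<>(Arrays.asList(
            "<html>",
            "<a href=\"#\" onclick=\"go()\">",
            "link text",
            "<div onclick=\"hide()\">",
            "</div>"
        ));
    }

    public static void main(String[] args)
    {
        Vector<String> page = samplePage();

        // Indices of every match in the whole page
        System.out.println(Arrays.toString(find(page, 0, page.size())));

        // The matching nodes in the window [0, 3)
        System.out.println(get(page, 0, 3));
    }
}
```

Returning indices lets the caller modify or replace nodes in the original Vector afterwards, while returning the nodes is more convenient when only their contents matter.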
Hi-Lited Source-Code: Torello/HTML/Listeners.java
File Size: 18,217 Bytes Line Count: 448 '\n' Characters Found
Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 20 Method(s), 20 declared static
- 1 Field(s), 1 declared static, 1 declared final
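The shape described by these counts (one private zero-argument constructor, only static methods, one static final field) can be sketched as follows. This is a generic illustration of the pattern, not the source of this class; the class name StatelessSketch, the field NAMES and the method isKnown are hypothetical.

```java
import java.util.Collections;
import java.util.TreeSet;

// A minimal "static-functional" class: it cannot be instantiated or sub-classed,
// and every member is static.
final class StatelessSketch
{
    // The single field: static and final (the reference never changes)
    private static final TreeSet<String> NAMES = new TreeSet<>();

    static { Collections.addAll(NAMES, "onclick", "onload"); }

    // The single constructor: private and zero-argument, so 'new' is impossible
    // from outside this class
    private StatelessSketch() { }

    // Callers use the class name directly: StatelessSketch.isKnown("onclick")
    static boolean isKnown(String name)
    { return NAMES.contains(name.toLowerCase()); }

    public static void main(String[] args)
    { System.out.println(isKnown("OnClick")); }
}
```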
-
-
Method Summary
Basic Methods
Modifier and Type           Method
static Properties           extract(TagNode tn)
static Properties[]         extractAll(Vector<TagNode> list)
static boolean              hasListener(TagNode tn)

Find: Vector-indices having TagNode's with Listeners
Modifier and Type           Method
static int[]                find(Vector<? extends HTMLNode> html)
static int[]                find(Vector<? extends HTMLNode> html, int sPos, int ePos)
static int[]                find(Vector<? extends HTMLNode> html, int sPos, int ePos, String... htmlTags)
static int[]                find(Vector<? extends HTMLNode> html, String... htmlTags)
static int[]                find(Vector<? extends HTMLNode> html, DotPair dp)
static int[]                find(Vector<? extends HTMLNode> html, DotPair dp, String... htmlTags)

Get: TagNode's that have Listeners
Modifier and Type           Method
static Vector<TagNode>      get(Vector<? extends HTMLNode> html)
static Vector<TagNode>      get(Vector<? extends HTMLNode> html, int sPos, int ePos)
static Vector<TagNode>      get(Vector<? extends HTMLNode> html, int sPos, int ePos, String... htmlTags)
static Vector<TagNode>      get(Vector<? extends HTMLNode> html, String... htmlTags)
static Vector<TagNode>      get(Vector<? extends HTMLNode> html, DotPair dp)
static Vector<TagNode>      get(Vector<? extends HTMLNode> html, DotPair dp, String... htmlTags)

Listeners-List: Modify & Review the Internal-List of Listeners
Modifier and Type           Method
static boolean              addNewListenerName(String listenerName)
static Iterator<String>     listAllAvailable()

Protected Methods
Modifier and Type           Method
protected static boolean    HAS_TOK_MATCH(String htmlTag, String... htmlTags)
static void                 main(String[] argv)
protected static String[]   toLowerCase(String[] tags)
-
-
-
Method Detail
-
main
public static void main(java.lang.String[] argv)
- Code:
- Exact Method Body:
for (String s : l) System.out.print(s + ", ");
-
listAllAvailable
public static java.util.Iterator<java.lang.String> listAllAvailable()
This will return an Iterator of the listed JavaScript listeners available in this class.
- Code:
- Exact Method Body:
return new RemoveUnsupportedIterator<String>(l.iterator());
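The name RemoveUnsupportedIterator suggests a wrapper that forwards hasNext() and next() but refuses remove(), so that callers cannot delete names from the internal list through the returned Iterator. That reading is an assumption based on the name alone. Under that assumption, such a wrapper might look like the sketch below; ReadOnlyIteratorSketch is a hypothetical class, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper: iteration is forwarded, removal is refused.
final class ReadOnlyIteratorSketch<E> implements Iterator<E>
{
    private final Iterator<E> inner;

    ReadOnlyIteratorSketch(Iterator<E> inner) { this.inner = inner; }

    @Override
    public boolean hasNext() { return inner.hasNext(); }

    @Override
    public E next() { return inner.next(); }

    // The backing collection can never be modified through this iterator
    @Override
    public void remove()
    { throw new UnsupportedOperationException("remove() is not supported"); }

    public static void main(String[] args)
    {
        List<String> names = new ArrayList<>(Arrays.asList("onclick", "onload"));
        Iterator<String> it = new ReadOnlyIteratorSketch<>(names.iterator());

        while (it.hasNext()) System.out.print(it.next() + ", ");
        System.out.println();
    }
}
```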
-
addNewListenerName
public static boolean addNewListenerName(java.lang.String listenerName)
This allows the user to add the name of a new listener that is not already stored in the internal set of known JavaScript listeners. When searching a page for listeners, this class will only be able to find those whose names are known.
- Parameters:
listenerName - The name of a listener that is not already known to this class
- Returns:
TRUE if the listener name was not already stored in the internal set, FALSE if attempting to add a listener that is already in the set.
- Code:
- Exact Method Body:
return l.add(listenerName.toLowerCase());
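The return value follows the usual java.util.Set.add contract, applied after lower-casing the name. A minimal sketch of that behavior is shown below; it assumes the internal collection behaves like a java.util.Set (the documentation calls it a "set", but its concrete type is not shown here), and the class name ListenerNameSetSketch and its initial contents are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

final class ListenerNameSetSketch
{
    // Assumption: a sorted set of lower-case listener names
    private static final Set<String> KNOWN = new TreeSet<>();

    static { KNOWN.add("onclick"); KNOWN.add("onload"); }

    private ListenerNameSetSketch() { }

    // Set.add returns true only when the element was not already present.
    // Lower-casing first makes "OnClick" and "onclick" count as the same name.
    static boolean addNewListenerName(String listenerName)
    { return KNOWN.add(listenerName.toLowerCase()); }

    public static void main(String[] args)
    {
        System.out.println(addNewListenerName("onCustomEvent"));  // a name not yet stored
        System.out.println(addNewListenerName("ONCLICK"));        // already stored as "onclick"
    }
}
```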
-
extract
public static java.util.Properties extract(TagNode tn)
This will test whether listeners are present in the TagNode, and if so, return them.

Input TagNode: <frameset cols="20%,80%" title="Documentation frame" onload="top.loadFrames()">
Output Properties: onload: top.loadFrames()

Input TagNode: <a href="javascript:void(0);" onclick="return j2gb('http://www.gov.cn');">
Output Properties: onclick: return j2gb('http://www.gov.cn');
- Parameters:
tn - This may be any TagNode, but it will be tested for JavaScript listeners.
- Returns:
A java.util.Properties object that contains a key-value table of any/all listeners present in the TagNode. If there are no listeners, this method will not return null; it will return an empty Properties object.
- See Also:
TagNode.AV(String), StrCmpr.containsIgnoreCase(String, String)
- Code:
- Exact Method Body:
Properties p = new Properties();
String s;

for (String listener : l)
    if (StrCmpr.containsIgnoreCase(tn.str, listener))
        if ((s = tn.AV(listener)) != null)

            // This **may** seem redundant, but it is not, because what if it was phony?
            // What if the "listener" key-word was actually buried in some "ALT=..." text?
            // The initial "StrCmpr.contains..." is an optimization
            p.put(listener, s);

return p;
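The two-stage test in this method body (a cheap substring check, then a real attribute lookup) can be reproduced without the library. The sketch below is only an approximation: the tag is a plain String, attributeValue is a hypothetical regular-expression stand-in for TagNode.AV that handles double-quoted values only, and LISTENERS is a short invented list rather than the class's internal one.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractSketch
{
    // Assumption: a short sample of listener names
    static final String[] LISTENERS = { "onclick", "onload", "onmouseover" };

    // Hypothetical stand-in for TagNode.AV(String): reads name="value" (double quotes only)
    static String attributeValue(String tag, String name)
    {
        Matcher m = Pattern
            .compile("\\s" + Pattern.quote(name) + "\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE)
            .matcher(tag);

        return m.find() ? m.group(1) : null;
    }

    static Properties extract(String tag)
    {
        Properties p     = new Properties();
        String     lower = tag.toLowerCase();

        for (String listener : LISTENERS)

            // Stage 1: cheap substring test, which skips most names immediately
            if (lower.contains(listener))
            {
                // Stage 2: confirm that the name really is an attribute of the tag
                String value = attributeValue(tag, listener);
                if (value != null) p.put(listener, value);
            }

        return p;
    }

    public static void main(String[] args)
    {
        System.out.println(extract(
            "<frameset cols=\"20%,80%\" title=\"Documentation frame\" onload=\"top.loadFrames()\">"
        ));

        // The word "onclick" appears only inside the alt text, so nothing is extracted
        System.out.println(extract("<img src=\"a.png\" alt=\"no onclick here\">"));
    }
}
```

The second stage is what rejects tags where a listener name merely appears inside another attribute's text.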
-
extractAll
public static java.util.Properties[] extractAll (java.util.Vector<TagNode> list)
If you have performed a JavaScript Listener Get, this method will cycle through the list that was returned and generate an identical-length Properties[] array by calling extract(tn) for each element in the parameter 'list'.
- Parameters:
list - A list of TagNode's that are expected to contain JavaScript listeners. If some of the members of this input Vector are TagNode's with no listeners, the return array will still remain a parallel (same-size) array; however, some of its elements will be Properties with no key/value pairs in them (zero-size).
- Returns:
A list of Properties, one for each element in this 'list'.
- See Also:
extract(TagNode)
- Code:
- Exact Method Body:
Properties[] ret = new Properties[list.size()];

for (int i=0; i < list.size(); i++)
    ret[i] = extract(list.elementAt(i));

return ret;
-
find
-
find
-
find
public static int[] find(java.util.Vector<? extends HTMLNode> html, int sPos, int ePos)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
A list of index-pointers into the underlying parameter 'html', where each node pointed to by the list contains a TagNode element with a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
hasListener(TagNode), LV
- Code:
- Exact Method Body:
// Java Streams to keep lists of int's
IntStream.Builder b = IntStream.builder();

LV l = new LV(html, sPos, ePos);
TagNode tn;

for (int i=l.start; i < l.end; i++)

    // Only check Opening TagNode's, long enough to have attributes, and then only
    // retain TagNode's that have a listener attribute.
    if (((tn = html.elementAt(i).openTagPWA()) != null) && hasListener(tn))
        b.add(i);

return b.build().toArray();
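The sPos / ePos rules listed under "Throws" can be written down as a small range-normalizing helper. The sketch below follows only the three documented conditions and the documented reset of a negative ePos; it is not the source of the LV class, and the name RangeSketch and its fields are hypothetical.

```java
// Hypothetical helper that turns (size, sPos, ePos) into a checked half-open range
final class RangeSketch
{
    final int start;    // inclusive
    final int end;      // exclusive

    RangeSketch(int size, int sPos, int ePos)
    {
        // A negative ePos means "search to the end of the Vector"
        if (ePos < 0) ePos = size;

        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos [" + sPos + "] is outside the Vector");

        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos [" + ePos + "] is outside the Vector");

        // Checked only after a negative ePos has been reset
        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos [" + sPos + "] is larger than ePos [" + ePos + "]");

        this.start = sPos;
        this.end   = ePos;
    }

    public static void main(String[] args)
    {
        RangeSketch r = new RangeSketch(10, 2, -1);

        // Visits the indices start, start+1, ... end-1
        for (int i = r.start; i < r.end; i++) System.out.print(i + " ");
        System.out.println();
    }
}
```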
-
find
-
find
-
find
public static int[] find(java.util.Vector<? extends HTMLNode> html, int sPos, int ePos, java.lang.String... htmlTags)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page, and also limit the search to only allow matches where the HTML Element is among the list of elements in parameter 'htmlTags'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
htmlTags - A list of HTML Elements, as a varargs String... Array, that constitute a match. Any HTML Element in the web-page that has a listener attribute, but whose HTML tag/token is not present in this list, will not be considered a match and will not be returned in this method's search results.
- Returns:
A list of index-pointers into the underlying parameter 'html', where each node pointed to by the list contains a TagNode element with a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos, and also limited to HTML Elements in parameter 'htmlTags'.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
HAS_TOK_MATCH(String, String[]), hasListener(TagNode), LV
- Code:
- Exact Method Body:
// Java Streams can keep lists of int's
IntStream.Builder b = IntStream.builder();

LV l = new LV(html, sPos, ePos);
TagNode tn;

htmlTags = toLowerCase(htmlTags);

for (int i=l.start; i < l.end; i++)
    if (
        // Only Match Opening-Tags with internal-string's long enough to contain Attributes
        ((tn = html.elementAt(i).openTagPWA()) != null)

        // Make sure the HTML Element (.tok field) is among the user-requested 'htmlTags'
        && HAS_TOK_MATCH(tn.tok, htmlTags)

        // Check whether or not the TagNode has a listener attribute (if yes, save it)
        && hasListener(tn)
    )
        // Save the array-index
        b.add(i);

return b.build().toArray();
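The method body lower-cases the requested tags once and then compares each element's token with a case-sensitive equals. A stripped-down sketch of that filter follows. The Tag class is a hypothetical stand-in for TagNode with a precomputed hasListener flag, and it assumes (as the lower-casing step suggests, though this page does not state it) that the .tok field is stored in lower case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

final class TagFilterSketch
{
    // Hypothetical stand-in for TagNode
    static final class Tag
    {
        final String  tok;           // assumed to be lower case, e.g. "a", "div"
        final boolean hasListener;   // stands in for Listeners.hasListener(tn)

        Tag(String tok, boolean hasListener)
        { this.tok = tok; this.hasListener = hasListener; }
    }

    // Case-sensitive membership test, like HAS_TOK_MATCH
    static boolean hasTokMatch(String tok, String... tags)
    {
        for (String s : tags) if (s.equals(tok)) return true;
        return false;
    }

    static int[] find(List<Tag> page, String... htmlTags)
    {
        // Lower-case the request once, so that "A" and "IMG" match "a" and "img"
        String[] wanted = new String[htmlTags.length];
        for (int i = 0; i < htmlTags.length; i++) wanted[i] = htmlTags[i].toLowerCase();

        IntStream.Builder b = IntStream.builder();

        for (int i = 0; i < page.size(); i++)
        {
            Tag t = page.get(i);
            if (hasTokMatch(t.tok, wanted) && t.hasListener) b.add(i);
        }

        return b.build().toArray();
    }

    public static void main(String[] args)
    {
        List<Tag> page = Arrays.asList(
            new Tag("a", true), new Tag("div", true), new Tag("a", false), new Tag("img", true)
        );

        System.out.println(Arrays.toString(find(page, "A", "IMG")));
    }
}
```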
-
get
-
get
-
get
public static java.util.Vector<TagNode> get (java.util.Vector<? extends HTMLNode> html, int sPos, int ePos)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
A list of TagNode elements that have a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
hasListener(TagNode), LV
- Code:
- Exact Method Body:
Vector<TagNode> ret = new Vector<>();

LV l = new LV(html, sPos, ePos);
TagNode tn;

for (int i=l.start; i < l.end; i++)

    // Only check Opening TagNode's, long enough to have attributes, and then only
    // retain TagNode's that have a listener attribute. If this TagNode does have a
    // listener, place it in the return vector.
    if (((tn = html.elementAt(i).openTagPWA()) != null) && hasListener(tn))
        ret.add(tn);

return ret;
-
get
-
get
public static java.util.Vector<TagNode> get (java.util.Vector<? extends HTMLNode> html, DotPair dp, java.lang.String... htmlTags)
Convenience Method (Range-Limited Method)
Receives: DotPair
Invokes: get(Vector, int, int, String[])
- Code:
- Exact Method Body:
return get(html, dp.start, dp.end + 1, htmlTags);
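The '+ 1' in this one-line body implies that a DotPair's 'end' is inclusive, while the 'ePos' parameter of the range-based overload is exclusive. That inclusive reading is an inference from the arithmetic, not something stated on this page. The sketch below shows the conversion with a hypothetical Pair class standing in for DotPair.

```java
import java.util.Arrays;

final class InclusiveRangeSketch
{
    // Hypothetical stand-in for DotPair; both 'start' and 'end' are assumed inclusive
    static final class Pair
    {
        final int start, end;
        Pair(int start, int end) { this.start = start; this.end = end; }
    }

    // Converts the inclusive pair into the { sPos, ePos } form, where ePos is exclusive
    static int[] toHalfOpen(Pair dp)
    { return new int[] { dp.start, dp.end + 1 }; }

    // Lists the indices a loop of the form 'for (i = sPos; i < ePos; i++)' would visit
    static int[] visited(Pair dp)
    {
        int[] range = toHalfOpen(dp);
        int[] ret   = new int[range[1] - range[0]];

        for (int i = range[0]; i < range[1]; i++) ret[i - range[0]] = i;
        return ret;
    }

    public static void main(String[] args)
    { System.out.println(Arrays.toString(visited(new Pair(3, 5)))); }
}
```

Without the '+ 1', the node at index dp.end would be silently skipped by the exclusive upper bound.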
-
get
public static java.util.Vector<TagNode> get (java.util.Vector<? extends HTMLNode> html, int sPos, int ePos, java.lang.String... htmlTags)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page, and also limit the search to only allow matches where the HTML Element is among the list of elements in parameter 'htmlTags'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method. If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method. If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
htmlTags - A list of HTML Elements, as a varargs String... Array, that constitute a match. Any HTML Element in the web-page that has a listener attribute, but whose HTML tag/token is not present in this list, will not be considered a match and will not be returned in this method's search results.
- Returns:
A list of TagNode elements that have a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos, and also limited to HTML Elements in parameter 'htmlTags'.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
HAS_TOK_MATCH(String, String[]), hasListener(TagNode), LV
- Code:
- Exact Method Body:
Vector<TagNode> ret = new Vector<>();

LV l = new LV(html, sPos, ePos);
TagNode tn;

htmlTags = toLowerCase(htmlTags);

for (int i=l.start; i < l.end; i++)
    if (
        // Only Match Opening-Tags with internal-string's long enough to contain Attributes
        ((tn = html.elementAt(i).openTagPWA()) != null)

        // Make sure the HTML Element (.tok field) is among the user-requested 'htmlTags'
        && HAS_TOK_MATCH(tn.tok, htmlTags)

        // Check whether or not the TagNode has a listener attribute (if yes, save it)
        && hasListener(tn)
    )
        // All requirements have been affirmed, save this node in the return vector.
        ret.add(tn);

return ret;
-
hasListener
public static boolean hasListener(TagNode tn)
Checks whether a TagNode has a listener inner-tag / attribute.
- Parameters:
tn - Any HTML Element TagNode
- Returns:
TRUE if this TagNode has a listener, and FALSE otherwise.
- See Also:
StrCmpr.containsIgnoreCase(String, String)
- Code:
- Exact Method Body:
Properties p = new Properties();

for (String listener : l)

    // This is a simple string-comparison - with no reg-ex involved
    if (StrCmpr.containsIgnoreCase(tn.str, listener))

        // Slightly slower: TagNode.AV(attribute) uses a Regular-Expression
        if (tn.AV(listener) != null)

            // This **may** seem redundant, but it is not, because what if it was phony?
            // What if the "listener" key-word was actually buried in some "ALT=..." text?
            return true;

return false;
-
toLowerCase
protected static java.lang.String[] toLowerCase(java.lang.String[] tags)
Converts the varargs parameter to lower-case Strings.
Note that this is "Varargs Safe", because a new String-Array is created that has new String-pointers.
- Parameters:
tags - The varargs String parameter acquired from the search-methods in this class.
- Returns:
A lower-case version of the input.
- Code:
- Exact Method Body:
String[] ret = new String[tags.length];

for (int i=0; i < tags.length; i++)
    if (tags[i] != null)
        ret[i] = tags[i].toLowerCase();
    else
        throw new HTMLTokException(
            "One of the HTML tokens you have passed to the variable-length parameter " +
            "'htmlTags' was null."
        );

return ret;
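"Varargs Safe" matters because a caller may pass an existing String[] where a String... parameter is expected, and lower-casing the elements in place would then change the caller's own array. The sketch below demonstrates the copy-based approach in isolation. The class name VarargsCopySketch is hypothetical, and IllegalArgumentException is used as a stand-in because HTMLTokException belongs to the library.

```java
import java.util.Arrays;

final class VarargsCopySketch
{
    // Returns a NEW array; the array passed by the caller is never written to
    static String[] toLowerCase(String... tags)
    {
        String[] ret = new String[tags.length];

        for (int i = 0; i < tags.length; i++)
        {
            // Stand-in for the library's HTMLTokException
            if (tags[i] == null)
                throw new IllegalArgumentException("tag at index " + i + " was null");

            ret[i] = tags[i].toLowerCase();
        }

        return ret;
    }

    public static void main(String[] args)
    {
        String[] original = { "DIV", "Span" };
        String[] lowered  = toLowerCase(original);

        // The caller's array still holds its original Strings
        System.out.println(Arrays.toString(original));
        System.out.println(Arrays.toString(lowered));
    }
}
```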
-
HAS_TOK_MATCH
protected static boolean HAS_TOK_MATCH(java.lang.String htmlTag, java.lang.String... htmlTags)
Checks whether a particular token is among the var-args parameter String... htmlTags.
- Parameters:
htmlTag - The token to be checked against the user's requested 'htmlTags' list parameter
htmlTags - The list of acceptable HTML Tag Elements. This is a search specification parameter used by some of the search-methods in this class.
- Returns:
TRUE if the tested token parameter 'htmlTag' is a member of the list parameter 'htmlTags', and FALSE otherwise.
- Code:
- Exact Method Body:
for (String s : htmlTags)
    if (s.equals(htmlTag))
        return true;

return false;
-
-