Package Torello.HTML
Class Listeners
- java.lang.Object
-
- Torello.HTML.Listeners
-
public class Listeners extends java.lang.Object
A basic tool for finding JavaScript Listener Attributes in the TagNode elements of a Vectorized-HTML Web-Page.
This class allows a user to search for listeners in a page or sub-page. It uses the exact same hierarchy of programmer-call options to decide what to look for. Search parameters are expressed as overloaded methods that accept differing argument lists.
Use of JavaScript Listeners:
Quite a number of large web-sites no longer place JavaScript in the page itself. Searching through the content served by a major hub web-site and looking for JavaScript Listener-Attributes will often return 0 results.
There are often JavaScript files downloaded from the <HEAD>...<SCRIPT></SCRIPT> tags, but generally, if there is scripted content, the script will operate on the CSS class, the id, or a React-JS tag.
In this way, inserting script directly into the body-text of the HTML page is avoided. If you are scraping a page you have written yourself, and it does have JavaScript, then by all means test it out. However, if these methods are returning '0' results, that is not unusual: for many of the large news web-sites and search-engines which were tested, listeners inside HTML Elements seemed uncommon.
Find & Get:
FIND implies that an 'int' position within the Vector (a pointer) will be returned as the search result(s) from the method.
GET implies that the actual TagNode itself (not a pointer to its index) shall be returned from the method.
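The difference between the two styles can be illustrated with a minimal, self-contained sketch. It does not use this library: plain strings stand in for HTML nodes, and the class FindVsGetSketch and its naive predicate looksLikeListenerTag are hypothetical names invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.IntStream;

// Hypothetical stand-in: plain strings play the role of HTML nodes.
final class FindVsGetSketch
{
    // Naive stand-in predicate (an assumption, NOT the library's hasListener)
    public static boolean looksLikeListenerTag(String node)
    {
        return node.startsWith("<")
            && node.toLowerCase(Locale.ROOT).contains(" onclick=");
    }

    // FIND style: return the positions (indices) of the matching nodes
    public static int[] find(List<String> page)
    {
        return IntStream.range(0, page.size())
            .filter(i -> looksLikeListenerTag(page.get(i)))
            .toArray();
    }

    // GET style: return the matching nodes themselves
    public static List<String> get(List<String> page)
    {
        List<String> ret = new ArrayList<>();
        for (String node : page) if (looksLikeListenerTag(node)) ret.add(node);
        return ret;
    }

    public static void main(String[] argv)
    {
        List<String> page = Arrays.asList(
            "<div>", "<a href=\"#\" onclick=\"go()\">", "Click", "</a>", "</div>"
        );

        System.out.println(Arrays.toString(find(page))); // [1]
        System.out.println(get(page));                   // [<a href="#" onclick="go()">]
    }
}
```

An index result is useful when the caller intends to modify or replace the node inside the original Vector; a node result is sufficient when only the node's content is needed.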
Method Parameters:
int sPos, int ePos: When these parameters are present, only HTMLNode's between these specified Vector indices will be considered for matching the search criteria.
String htmlTags: When this parameter is present, only HTML TagNode's whose "primary tag" matches this string will be considered.
Source-Code: Torello/HTML/Listeners.java
File Size: 18,206 Bytes, Line Count: 449 '\n' Characters Found
Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 20 Method(s), 20 declared static
- 1 Field(s), 1 declared static, 1 declared final
-
-
Method Summary
Basic Methods
Modifier and Type            Method
static Properties            extract(TagNode tn)
static Properties[]          extractAll(Vector<TagNode> list)
static boolean               hasListener(TagNode tn)

Find: Vector-indices having TagNode's with Listeners
Modifier and Type            Method
static int[]                 find(Vector<? extends HTMLNode> html)
static int[]                 find(Vector<? extends HTMLNode> html, int sPos, int ePos)
static int[]                 find(Vector<? extends HTMLNode> html, int sPos, int ePos, String... htmlTags)
static int[]                 find(Vector<? extends HTMLNode> html, String... htmlTags)
static int[]                 find(Vector<? extends HTMLNode> html, DotPair dp)
static int[]                 find(Vector<? extends HTMLNode> html, DotPair dp, String... htmlTags)

Get: TagNode's that have Listeners
Modifier and Type            Method
static Vector<TagNode>       get(Vector<? extends HTMLNode> html)
static Vector<TagNode>       get(Vector<? extends HTMLNode> html, int sPos, int ePos)
static Vector<TagNode>       get(Vector<? extends HTMLNode> html, int sPos, int ePos, String... htmlTags)
static Vector<TagNode>       get(Vector<? extends HTMLNode> html, String... htmlTags)
static Vector<TagNode>       get(Vector<? extends HTMLNode> html, DotPair dp)
static Vector<TagNode>       get(Vector<? extends HTMLNode> html, DotPair dp, String... htmlTags)

Listeners-List: Modify & Review the Internal-List of Listeners
Modifier and Type            Method
static boolean               addNewListenerName(String listenerName)
static Iterator<String>      listAllAvailable()

Protected Methods
Modifier and Type            Method
protected static boolean     HAS_TOK_MATCH(String htmlTag, String... htmlTags)
static void                  main(String[] argv)
protected static String[]    toLowerCase(String[] tags)
-
-
-
Method Detail
-
main
public static void main(java.lang.String[] argv)
-
listAllAvailable
public static java.util.Iterator<java.lang.String> listAllAvailable()
This will return an Iterator of the listed JavaScript listeners available in this class.
-
addNewListenerName
public static boolean addNewListenerName(java.lang.String listenerName)
This allows the user to add the name of a new listener that is not already stored in the internal set of known JavaScript listeners. When searching a page for listeners, this class can only find listeners whose names are known.
- Parameters:
listenerName - The name of a listener that is not already known to this class
- Returns:
TRUE if the listener name was not already stored in the set, FALSE if attempting to add a listener that is already in the set.
- Code:
- Exact Method Body:
return l.add(listenerName.toLowerCase());
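The TRUE / FALSE return value follows directly from the contract of java.util.Set.add. Below is a minimal sketch of that behavior. The class ListenerNameSetSketch, its TreeSet, and its two initial entries are assumptions made for the illustration; the actual type and contents of the internal field 'l' are not shown on this page. Locale.ROOT is also an addition here, used so the sketch behaves the same under every default locale.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

final class ListenerNameSetSketch
{
    // Hypothetical stand-in for the internal set of known listener names
    private static final Set<String> names = new TreeSet<>();

    static { names.add("onclick"); names.add("onload"); }

    // Same shape as the method body above: lower-case the name, then add it.
    // Set.add returns true only if the element was not already present.
    public static boolean addNewListenerName(String listenerName)
    { return names.add(listenerName.toLowerCase(Locale.ROOT)); }

    public static void main(String[] argv)
    {
        System.out.println(addNewListenerName("onCustomEvent")); // true  (new name)
        System.out.println(addNewListenerName("ONCLICK"));       // false (already known)
    }
}
```

Because names are lower-cased before being stored, adding "ONCLICK" and adding "onclick" are treated as the same request.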
-
extract
public static java.util.Properties extract(TagNode tn)
This will test whether listeners are present in the TagNode, and if so, return them.
Input TagNode: <frameset cols="20%,80%" title="Documentation frame" onload="top.loadFrames()">
Output Properties: onload: top.loadFrames()
Input TagNode: <a href="javascript:void(0);" onclick="return j2gb('http://www.gov.cn');">
Output Properties: onclick: return j2gb('http://www.gov.cn');
- Parameters:
tn - This may be any TagNode, but it will be tested for JavaScript listeners.
- Returns:
- Will return a java.util.Properties object that contains a key-value table of any/all listeners present in the TagNode. If there are no listeners, this method will not return null; it will return an empty Properties object.
- See Also:
TagNode.AV(String), StrCmpr.containsIgnoreCase(String, String)
- Code:
- Exact Method Body:
Properties p = new Properties();
String s;

for (String listener : l)
    if (StrCmpr.containsIgnoreCase(tn.str, listener))
        if ((s = tn.AV(listener)) != null)
            // This **may** seem redundant, but it is not, because what if it was phony?
            // What if the "listener" key-word was actually buried in some "ALT=..." text?
            // The initial "StrCmpr.contains..." is an optimization
            p.put(listener, s);

return p;
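The two-step check in the method body (a cheap substring test first, then a real attribute lookup) can be sketched without this library. In the sketch below, the class ExtractSketch, the short LISTENERS array, and the regex-based attributeValue helper are all hypothetical; the helper is a deliberately simple stand-in that handles only double-quoted values and is not the implementation of TagNode.AV.

```java
import java.util.Locale;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractSketch
{
    // Hypothetical, abbreviated list of listener names
    private static final String[] LISTENERS = { "onclick", "onload", "onchange" };

    // Simplified attribute lookup (an assumption): finds  name="value"  and
    // returns the value, or null if no such attribute is present.
    public static String attributeValue(String tag, String name)
    {
        Matcher m = Pattern
            .compile("\\s" + Pattern.quote(name) + "\\s*=\\s*\"([^\"]*)\"",
                     Pattern.CASE_INSENSITIVE)
            .matcher(tag);

        return m.find() ? m.group(1) : null;
    }

    public static Properties extract(String tag)
    {
        Properties p     = new Properties();
        String     lower = tag.toLowerCase(Locale.ROOT);

        for (String listener : LISTENERS)

            // Step 1: cheap substring test, skips most listener names quickly
            if (lower.contains(listener))
            {
                // Step 2: confirm the name is a real attribute, not just text
                // that happens to appear inside another attribute's value
                String value = attributeValue(tag, listener);
                if (value != null) p.setProperty(listener, value);
            }

        return p;
    }

    public static void main(String[] argv)
    {
        // A real listener attribute is extracted
        System.out.println(extract(
            "<a href=\"javascript:void(0);\" onclick=\"return j2gb('http://www.gov.cn');\">"
        ));

        // The word "onclick" inside ALT text passes step 1 but fails step 2
        System.out.println(extract("<img alt=\"the onclick handler\" src=\"x.png\">"));
    }
}
```

The second call returns an empty Properties object rather than null, which matches the return contract described above.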
-
extractAll
public static java.util.Properties[] extractAll (java.util.Vector<TagNode> list)
If you have performed a JavaScript Listener Get, this method will cycle through the list that was returned and generate an identical-length Properties[] return array by calling extract(tn) for each element in the parameter 'list'.
- Parameters:
list - A list of TagNode's that are expected to contain JavaScript listeners. If some members of this input Vector are TagNode's with no listeners, the return array will still be a parallel (same-size) array; however, some of its elements will be Properties with no key/value pairs in them (zero-size).
- Returns:
- A list of Properties for each element in this 'list'.
- See Also:
extract(TagNode)
- Code:
- Exact Method Body:
Properties[] ret = new Properties[list.size()];

for (int i=0; i < list.size(); i++) ret[i] = extract(list.elementAt(i));

return ret;
-
find
public static int[] find(java.util.Vector<? extends HTMLNode> html, int sPos, int ePos)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page). The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw. These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
- A list of index-pointers into the underlying parameter 'html', where each node pointed to by the list is a TagNode element with a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
hasListener(TagNode)
,LV
- Code:
- Exact Method Body:
// Java Streams to keep lists of int's
IntStream.Builder b = IntStream.builder();
LV l = new LV(html, sPos, ePos);
TagNode tn;

for (int i=l.start; i < l.end; i++)

    // Only check Opening TagNode's, long enough to have attributes, and then only
    // retain TagNode's that have a listener attribute.
    if (((tn = html.elementAt(i).openTagPWA()) != null) && hasListener(tn))
        b.add(i);

return b.build().toArray();
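The sPos / ePos rules listed for this method can be written out as a small validation routine. The sketch below is only a reading of the documented rules; the class RangeSketch and its normalize method are hypothetical and are not the source of the LV class used in the method body.

```java
import java.util.Arrays;

final class RangeSketch
{
    // Returns { start (inclusive), end (exclusive) } for a Vector of the given
    // size, applying the rules from the "Throws" list in the order stated there.
    public static int[] normalize(int size, int sPos, int ePos)
    {
        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);

        // A negative ePos is first reset to the size of the Vector
        if (ePos < 0) ePos = size;

        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);

        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos is larger than ePos");

        return new int[] { sPos, ePos };
    }

    public static void main(String[] argv)
    {
        System.out.println(Arrays.toString(normalize(10, 2, 5)));  // [2, 5]
        System.out.println(Arrays.toString(normalize(10, 2, -1))); // [2, 10]
    }
}
```

Passing -1 for ePos is therefore a convenient way to say "search through the end of the Vector" without first looking up its size.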
-
find
-
find
-
find
public static int[] find(java.util.Vector<? extends HTMLNode> html, int sPos, int ePos, java.lang.String... htmlTags)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page, and also limit the search to only allow matches where the HTML Element is among the list of elements in parameter 'htmlTags'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page). The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw. These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
htmlTags - A list of HTML Elements, as a varargs String... Array, that constitute a match. Any HTML Element in the web-page that has a listener attribute, but whose HTML tag/token is not present in this list, will not be considered a match, and will not be returned in this method's search results.
- Returns:
- A list of index-pointers into the underlying parameter 'html', where each node pointed to by the list is a TagNode element with a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos, and also limited to HTML Elements in parameter 'htmlTags'.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
HAS_TOK_MATCH(String, String[])
,hasListener(TagNode)
,LV
- Code:
- Exact Method Body:
// Java Streams can keep lists of int's
IntStream.Builder b = IntStream.builder();
LV l = new LV(html, sPos, ePos);
TagNode tn;

htmlTags = toLowerCase(htmlTags);

for (int i=l.start; i < l.end; i++)

    if (
        // Only Match Opening-Tags with internal-string's long enough to contain Attributes
        ((tn = html.elementAt(i).openTagPWA()) != null)

        // Make sure the HTML Element (.tok field) is among the user-requested 'htmlTags'
        && HAS_TOK_MATCH(tn.tok, htmlTags)

        // Check whether or not the TagNode has a listener attribute (if yes, save it)
        && hasListener(tn)
    )
        // Save the array-index
        b.add(i);

return b.build().toArray();
-
get
-
get
public static java.util.Vector<TagNode> get (java.util.Vector<? extends HTMLNode> html, int sPos, int ePos)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page). The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw. These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
- A list of TagNode elements that have a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
hasListener(TagNode)
,LV
- Code:
- Exact Method Body:
Vector<TagNode> ret = new Vector<>();
LV l = new LV(html, sPos, ePos);
TagNode tn;

for (int i=l.start; i < l.end; i++)

    // Only check Opening TagNode's, long enough to have attributes, and then only
    // retain TagNode's that have a listener attribute. If this TagNode does have a
    // listener, place it in the return vector.
    if (((tn = html.elementAt(i).openTagPWA()) != null) && hasListener(tn))
        ret.add(tn);

return ret;
-
get
-
get
public static java.util.Vector<TagNode> get (java.util.Vector<? extends HTMLNode> html, DotPair dp, java.lang.String... htmlTags)
Convenience Method (Range-Limited Method)
Receives: DotPair
Invokes: get(Vector, int, int, String[])
-
get
public static java.util.Vector<TagNode> get (java.util.Vector<? extends HTMLNode> html, int sPos, int ePos, java.lang.String... htmlTags)
Find all HTML Elements (TagNode elements) that have listeners. Limit the search to a sub-list of the page, and also limit the search to only allow matches where the HTML Element is among the list of elements in parameter 'htmlTags'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page). The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw. These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
htmlTags - A list of HTML Elements, as a varargs String... Array, that constitute a match. Any HTML Element in the web-page that has a listener attribute, but whose HTML tag/token is not present in this list, will not be considered a match, and will not be returned in this method's search results.
- Returns:
- A list of TagNode elements that have a listener attribute / inner-tag. Search results shall be limited to only considering elements between sPos ... ePos, and also limited to HTML Elements in parameter 'htmlTags'.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
HAS_TOK_MATCH(String, String[])
,hasListener(TagNode)
,LV
- Code:
- Exact Method Body:
Vector<TagNode> ret = new Vector<>();
LV l = new LV(html, sPos, ePos);
TagNode tn;

htmlTags = toLowerCase(htmlTags);

for (int i=l.start; i < l.end; i++)

    if (
        // Only Match Opening-Tags with internal-string's long enough to contain Attributes
        ((tn = html.elementAt(i).openTagPWA()) != null)

        // Make sure the HTML Element (.tok field) is among the user-requested 'htmlTags'
        && HAS_TOK_MATCH(tn.tok, htmlTags)

        // Check whether or not the TagNode has a listener attribute (if yes, save it)
        && hasListener(tn)
    )
        // All requirements have been affirmed, save this node in the return vector.
        ret.add(tn);

return ret;
-
hasListener
public static boolean hasListener(TagNode tn)
Checks if a certain TagNode has a listener inner-tag / attribute.
- Parameters:
tn - Any HTML Element TagNode
- Returns:
TRUE if this TagNode has a listener, and FALSE otherwise.
- See Also:
StrCmpr.containsIgnoreCase(String, String)
- Code:
- Exact Method Body:
Properties p = new Properties();

for (String listener : l)

    // This is a simple string-comparison - with no reg-ex involved
    if (StrCmpr.containsIgnoreCase(tn.str, listener))

        // Slightly slower - TagNode.AV(attribute) uses a Regular-Expression
        if (tn.AV(listener) != null)

            // This **may** seem redundant, but it is not, because what if it was phony?
            // What if the "listener" key-word was actually buried in some "ALT=..." text?
            return true;

return false;
-
toLowerCase
protected static java.lang.String[] toLowerCase(java.lang.String[] tags)
Converts the varargs parameter to lower-case Strings. Note that this is "Varargs Safe", because a new String-Array is created that has new String-pointers.
- Parameters:
tags - The varargs String parameter acquired from the search-methods in this class.
- Returns:
- a lower-case version of the input.
- Code:
- Exact Method Body:
String[] ret = new String[tags.length];

for (int i=0; i < tags.length; i++)

    if (tags[i] != null) ret[i] = tags[i].toLowerCase();

    else throw new HTMLTokException(
        "One of the HTML tokens you have passed to the variable-length parameter " +
        "'htmlTags' was null."
    );

return ret;
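The "Varargs Safe" property means that the caller's own array is never modified. A minimal sketch of that idea follows. The class LowerCaseCopySketch is hypothetical, IllegalArgumentException is substituted for the library's HTMLTokException so that the sketch compiles on its own, and Locale.ROOT is added so the result does not depend on the default locale.

```java
import java.util.Arrays;
import java.util.Locale;

final class LowerCaseCopySketch
{
    // Builds a NEW array; the array passed by the caller is left untouched
    public static String[] toLowerCaseCopy(String... tags)
    {
        String[] ret = new String[tags.length];

        for (int i = 0; i < tags.length; i++)
        {
            // Stand-in exception type (assumption): the library throws HTMLTokException
            if (tags[i] == null)
                throw new IllegalArgumentException("tag at index " + i + " was null");

            ret[i] = tags[i].toLowerCase(Locale.ROOT);
        }

        return ret;
    }

    public static void main(String[] argv)
    {
        String[] original = { "DIV", "Span" };
        String[] lowered  = toLowerCaseCopy(original);

        System.out.println(Arrays.toString(original)); // [DIV, Span]
        System.out.println(Arrays.toString(lowered));  // [div, span]
    }
}
```

Had the method written the lower-case values back into 'tags', a caller who passed an explicit String[] would see its contents change after the search returned.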
-
HAS_TOK_MATCH
protected static boolean HAS_TOK_MATCH(java.lang.String htmlTag, java.lang.String... htmlTags)
Checks whether a particular token matches one of the elements of the var-args parameter String... htmlTags
- Parameters:
htmlTag - The token to be checked against the user's requested 'htmlTags' list parameter
htmlTags - The list of acceptable HTML Tag Elements. This is a search specification parameter used by some of the search-methods in this class.
- Returns:
TRUE if the tested token parameter 'htmlTag' is a member of the elements in list parameter 'htmlTags', and FALSE otherwise.
- Code:
- Exact Method Body:
for (String s : htmlTags) if (s.equals(htmlTag)) return true;
return false;
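Since the comparison uses String.equals, the match is case-sensitive, which is why the search methods in this class lower-case 'htmlTags' before calling it. The sketch below demonstrates the effect. The class TokMatchSketch is hypothetical, and the sketch assumes that the token being tested is already lower-case, something this page does not state explicitly.

```java
import java.util.Locale;

final class TokMatchSketch
{
    // Exact, case-sensitive membership test, mirroring the loop above
    public static boolean hasTokMatch(String htmlTag, String... htmlTags)
    {
        for (String s : htmlTags) if (s.equals(htmlTag)) return true;
        return false;
    }

    public static void main(String[] argv)
    {
        // Upper-case request, lower-case token: no match
        System.out.println(hasTokMatch("div", "DIV", "span")); // false

        // Same request after lower-casing: match
        System.out.println(hasTokMatch("div", "DIV".toLowerCase(Locale.ROOT), "span")); // true
    }
}
```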
-
-