Package Torello.HTML.NodeSearch
Interface AVT
- All Superinterfaces:
java.util.function.Predicate<TagNode>, java.io.Serializable
- Functional Interface:
This is a functional interface and can therefore be used as the assignment target for a lambda expression or method reference.
@FunctionalInterface public interface AVT extends java.util.function.Predicate<TagNode>, java.io.Serializable
A functional interface (lambda target), together with several static builders for generating instances of it, which extends java.util.function.Predicate and encapsulates search criteria in a Predicate<TagNode>.
AVT: Attribute Value Test
Note: The words 'Attribute' and 'Inner-Tag' are used interchangeably in these search classes, and packages.
The primary impetus for writing this class is to allow for multiple requirements, or multiple search criteria, when looking for an HTML TagNode. Most searches will not need this class; however, when multiple requirements are needed, 'AVT' will do the trick. The most common searches on HTML TagNode's are likely to be by the CSS class='...' attribute-value, or by the id='...' inner-tag. However, there will be many instances when data that is not identified by any class or id is needed to specify the search criteria, in addition to class or id.
IMPORTANT:
The best way to provide multiple search-specifications for an HTML TagNode is to "chain" AVT (Functional-Interface) Predicate's using the Java 8 boolean chaining methods: and(), or(), negate(). Documentation for Java 8's lambdas and functional interfaces is plentiful on the internet, and this interface takes care to make sure that the usual and, or, not routines work well.
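The chaining idea can be illustrated without the library. The sketch below is a stand-in, not the AVT source: `Tag`, `AttrTest`, and `attr` are hypothetical names invented for this illustration, and a tag is modeled as a plain attribute map. It shows one way a functional interface that extends `Predicate` might redeclare `and`, `or`, and `negate` as default methods so that chained results keep the narrower type.

```java
import java.io.Serializable;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class ChainingSketch
{
    // Hypothetical stand-in for an HTML element: a tag name plus its attributes.
    static final class Tag
    {
        final String name;
        final Map<String, String> attrs;

        Tag(String name, Map<String, String> attrs)
        { this.name = name; this.attrs = attrs; }
    }

    // Hypothetical stand-in for AVT: a serializable Predicate<Tag> whose
    // composition methods return AttrTest rather than a plain Predicate<Tag>.
    @FunctionalInterface
    interface AttrTest extends Predicate<Tag>, Serializable
    {
        default AttrTest and(AttrTest other)
        { return tag -> test(tag) && other.test(tag); }

        default AttrTest or(AttrTest other)
        { return tag -> test(tag) || other.test(tag); }

        @Override
        default AttrTest negate()
        { return tag -> !test(tag); }

        // Factory: passes when the attribute is present and its value satisfies valueTest.
        static AttrTest attr(String attrName, Predicate<String> valueTest)
        {
            return tag -> {
                String value = tag.attrs.get(attrName);
                return (value != null) && valueTest.test(value);
            };
        }
    }

    static List<Tag> sampleImages()
    {
        return List.of(
            new Tag("img", Map.of("alt", "Business Trip 2019", "class", "CompanyDiv1")),
            new Tag("img", Map.of("alt", "Business Trip 2020", "class", "CompanyDiv2")),
            new Tag("img", Map.of("class", "CompanyDiv1"))
        );
    }

    // Two single-attribute tests combined into one requirement.
    static AttrTest businessTripInDiv1()
    {
        return AttrTest
            .attr("alt", value -> value.contains("Business Trip"))
            .and(AttrTest.attr("class", "CompanyDiv1"::equals));
    }

    static long countMatches(List<Tag> tags, Predicate<Tag> test)
    { return tags.stream().filter(test).count(); }

    public static void main(String[] args)
    {
        List<Tag> images = sampleImages();
        System.out.println("and:    " + countMatches(images, businessTripInDiv1()));
        System.out.println("negate: " + countMatches(images, businessTripInDiv1().negate()));
    }
}
```

Because `and` and `or` accept the narrower type here, each call in a chain returns another `AttrTest`, so a combined test can itself be stored and combined again.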
This class is an extension of the Java functional interface Predicate. It is a TagNode-Predicate, and can be used to keep or store a "TagNode test" when searching for TagNode's inside an HTML page-Vector. The static methods in this class are "factory methods" that generate an instance of 'AVT' via the usual list of options available from this Search-Package. The four primary options for finding a TagNode are usually:
- Using a TextComparitor on a particular attribute-name, and comparing the retrieved innerTag value with a list of "CompareString's"
- Using Java's Regular Expression Processor on any retrieved attribute value.
- Building your own custom Predicate<String>, and passing that to the search-engine. This Predicate will be used on each and every TagNode within the page-Vector, and the results will contain the TagNode's whose attribute-values received a TRUE boolean response from this Predicate<String>
- Passing an innerTag as a String, with the expectation that each and every TagNode that has this attribute-name (regardless of its value) will be included in the result set.
Following are two example uses of this Functional Interface. In both of them, presume that the variable 'page' has been loaded from some web-site URL, and parsed using HTMLPage.getPageTokens(...)
Example:
// In this example a sub-set of images from a web-site should be downloaded.
Vector<TagNode> images = TagNodeGet.all(page, TC.OpeningTags, "img"); // gets HTML <IMG> elements
Vector<TagNode> businessTripImages = InnerTagGet.all(images, "alt", TextComparitor.CONTAINS, "Business Trip");
Vector<TagNode> deptImages = InnerTagGet.all(businessTripImages, "class", TextComparitor.EQ, "CompanyDiv1");

// These THREE statements could be converted and condensed to ONE as follows:
Vector<TagNode> deptImages = InnerTagGet.all(
    page,
    AVT .cmp("alt", TextComparitor.CONTAINS, "Business Trip")
        .and(AVT.cmp("class", TextComparitor.EQ, "CompanyDiv1")),
    "img"
);

// And to download the images - use ImageScraper
new ImageScraper(deptImages, url, "myUserID/mySaveDirectory").download();
In this example, URL links are retrieved and resolved.
Example:
// Here, a list of URL Links are retrieved.
Pattern urlPattern = Pattern.compile("company-ABC\\/department-XYZ\\/catalog\\d+");
Vector<TagNode> links = InnerTagGet.all(page, "a", "href", urlPattern);
Vector<TagNode> localLinks = InnerTagGet.all(links, "href", TextComparitor.CN_CI, "My-Office");

// The above two lines could, more easily, be accomplished with:
Vector<TagNode> localLinks = InnerTagGet.all(
    page,
    AVT .cmp("href", urlPattern)
        .and(AVT.cmp("href", TextComparitor.CN_CI, "My-Office")),
    "a"
);

// And... now get the links
Vector<URL> urls = Links.resolveSRCs(localLinks, url);
Hi-Lited Source-Code: Torello/HTML/NodeSearch/AVT.java
File Size: 39,003 Bytes, Line Count: 786 '\n' Characters Found
Field Summary
Fields
Modifier and Type    Field
static long          serialVersionUID
-
Method Summary
@FunctionalInterface: (Lambda) Method
Modifier and Type    Method
boolean              test(TagNode tn)

Static-Factory Builder: Attribute-Value Tests
Modifier and Type    Method
static AVT           cmp(String innerTag)
static AVT           cmp(String innerTag, Predicate<String> innerTagValueTest)
static AVT           cmp(String innerTag, Pattern p)
static AVT           cmp(String innerTag, Pattern p, boolean keepOnMatch, boolean keepOnNull)
static AVT           cmp(String innerTag, TextComparitor tc, String... compareStr)
static AVT           cmp(String innerTag, StrFilter innerTagValueTest)

Static-Factory Builder: Keep if Inner-Tag is Not Found
Modifier and Type    Method
static AVT           cmpKIITNF(String innerTag, Predicate<String> innerTagValueTest)
static AVT           cmpKIITNF(String innerTag, Pattern p)
static AVT           cmpKIITNF(String innerTag, TextComparitor tc, String... compareStr)
static AVT           cmpKIITNF(String innerTag, StrFilter innerTagValueTest)

Static-Factory Builder: TagNode Equality Tests
Modifier and Type    Method
static AVT           isEqualKEEP(TagNode expectedTN)
static AVT           isEqualREJECT(TagNode expectedTN)

Default Composition & Builder Methods
Modifier and Type    Method
default AVT          and(AVT additionalTest)
default AVT          negate()
default AVT          or(AVT additionalTest)
Field Detail
-
serialVersionUID
static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
Functional Interfaces are usually not thought of as Data Objects that need to be saved, stored and retrieved; however, having the ability to store intermediate results along with the lambda-functions that helped get those results can make debugging easier.
- See Also:
Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
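As an illustration of storing a lambda, the sketch below round-trips a predicate through Java's standard object streams. It is a sketch only, using hypothetical names (`StringTest`, `startsWith`, `roundTrip`) rather than anything from this package; the point it relies on is that a lambda whose target interface extends `Serializable` can be written with `ObjectOutputStream`, provided everything it captures is serializable too.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

class SerializableLambdaSketch
{
    // Hypothetical stand-in for a serializable predicate type.
    @FunctionalInterface
    interface StringTest extends Predicate<String>, Serializable { }

    // The lambda captures 'prefix', which is a String and therefore serializable.
    static StringTest startsWith(String prefix)
    { return value -> value.startsWith(prefix); }

    // Writes the predicate to a byte array, then reads it back as a new object.
    static StringTest roundTrip(StringTest test)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();

            try (ObjectOutputStream out = new ObjectOutputStream(bytes))
            { out.writeObject(test); }

            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())))
            { return (StringTest) in.readObject(); }
        }
        catch (IOException | ClassNotFoundException e)
        { throw new IllegalStateException("Round trip failed", e); }
    }

    public static void main(String[] args)
    {
        StringTest restored = roundTrip(startsWith("http"));
        System.out.println(restored.test("http://example.com"));
        System.out.println(restored.test("javascript:void(0)"));
    }
}
```

Reading a saved lambda back generally requires the same compiled class that created it, so this approach suits short-lived debugging snapshots better than long-term storage.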
-
-
Method Detail
-
test
boolean test(TagNode tn)
FUNCTIONAL-INTERFACE BOOLEAN METHOD: This is the method that fulfils this functional-interface's 'test' method.
- Specified by:
test in interface java.util.function.Predicate<TagNode>
- Parameters:
tn - This method will be called once for each TagNode found inside of a vectorized HTML page.
- Returns:
If the TagNode meets the test's "inclusion requirements", then this method should return TRUE.
-
cmp
static AVT cmp(java.lang.String innerTag, TextComparitor tc, java.lang.String... compareStr)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check as well. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be retrieved (if the TagNode contains this attribute), and the parameter 'tc' will be used to compare this value against the Compare-Strings.
tc - This may be any of the TextComparitor's listed in that class. There are quite a few "pre-defined" static members in the TextComparitor class. Many have both long names and abbreviated names, which can be used interchangeably for readability purposes.
compareStr - These are passed to the 'TextComparitor' when it performs tests on the attribute value.
- Returns:
An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accept a Predicate<TagNode> as a parameter to the search criteria.
EQUIVALENCE NOTE: The two versions of code in the sample below will return equivalent results. It is important to remember that the primary use of class AVT is not for simple, single-line-searches, but rather to allow a programmer to "chain" search parameters using the standard lambda functions and(...), or(...), and negate().
Example:
// Version 1:
int[] posArr = InnerTagFind.all
    (page, "div", "class", TextComparitor.CONTAINS_CASE_INSENSITIVE, "departmentXYZ");

// Version 2:
AVT specifier = AVT.cmp("class", TextComparitor.CONTAINS_CASE_INSENSITIVE, "departmentXYZ");
int[] posArr = InnerTagFind.all(page, specifier, "div");

// Both of the versions above will result in identical 'posArr' values and lengths.
// NOTE: The specifier can be saved, stored, and specifically 'chained' with other specifiers
// using and(...), or(...), negate()
- Throws:
InnerTagKeyException - This exception will be thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
ARGCHECK.innerTag(String), ARGCHECK.TC(TextComparitor, String[]), TagNode.AV(String), TextComparitor.test(String, String[])
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);
ARGCHECK.TC(tc, compareStr);

// Minimum length for field TagNode.str to have before it could possible contain the attribute
// Obviously, the TagNode would have to have a min-length that includes the
// attribute-name length + '< ' and '>'
final int MIN_LEN = innerTag.length() + 3;

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain the
    // specified attribute. (an optimization)
    if (tn.isClosing || (tn.str.length() <= (tn.tok.length() + MIN_LEN))) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    String itv = tn.AV(innerTagLC); // REG-EX MATCHER, MORE EXPENSIVE

    // If the innerTag's value is null, then the inner-tag was not a key-value
    // found inside the TagNode: return false.
    // Otherwise return the 'tc' test-results on that value using the named 'tc'
    // comparison on the compare-strings.
    return (itv == null) ? false : tc.test(itv, compareStr);
};
-
cmpKIITNF
static AVT cmpKIITNF(java.lang.String innerTag, TextComparitor tc, java.lang.String... compareStr)
cmpKIITNF: Compare, and Keep If Inner-Tag is Not Found
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check as well. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
This method differs from the standard cmp factory method with identical parameters in that it allows the programmer to build an Attribute Value Tester that "passes" the test (filters to include, not exclude) when the Inner Tag (Attribute) of an HTML Element is not found inside that Element.
Default Behavior:
By default, the search classes and search methods of the Node-Search Package exclude an HTML Element from the search results when the inner-tag is simply not present within that Element. By using this particular AVT, a programmer can by-pass that default behavior.
Example:
Below are three HTML 'TagNode' elements, and one 'AVT' filter Predicate (visible inside the comments). If the filter listed were applied, the first two elements pass the predicate, while the last one fails!
- Filter out all HTML Anchor Elements whose 'HREF' Inner-Tag value was set to 'javascript:void'
- Keep all HTML Anchor Elements that have typical URL's set with the 'HREF' Inner-Tag
- Also... Filter in (KEEP) any HTML Anchor Elements that simply did not have an 'HREF' Attribute
HTML Elements:
<!--
    For filter: AVT.cmpKIITNF("href", TextComparitor.DOES_NOT_START_WITH, "javascript:void");
    The two HTML Anchor Elements, below, would PASS the filter criteria.
-->
<A ID="MyAnchor" CLASS="CompanyXYZ" ALT="This Anchor does not have an HREF Attribute."> ... </A>
<A HREF="http://my.company.com/LocalFolder/OurPicNic.html" ALT="This does have a value for HREF"> ... </A>

<!-- The following HTML Anchor Element would NOT BE included in the search results. -->
<A HREF="javascript:void(0)"> ...</A>
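The difference between the default rule and the "keep if not found" rule comes down to how a missing value is treated. The sketch below is a simplified illustration with hypothetical helper names (`rejectIfMissing`, `keepIfMissing`, `notJavascriptVoid`); it models a missing attribute as a null String and is not the package's own code.

```java
import java.util.function.Predicate;

class KeepIfMissingSketch
{
    // Default rule: a missing (null) attribute value fails the test.
    static Predicate<String> rejectIfMissing(Predicate<String> valueTest)
    { return value -> (value != null) && valueTest.test(value); }

    // "Keep if not found" rule: a missing (null) attribute value passes the test.
    static Predicate<String> keepIfMissing(Predicate<String> valueTest)
    { return value -> (value == null) || valueTest.test(value); }

    // Hypothetical value test: rejects href values that start with "javascript:void".
    static Predicate<String> notJavascriptVoid()
    { return value -> !value.startsWith("javascript:void"); }

    public static void main(String[] args)
    {
        String[] hrefs =
        { null, "http://my.company.com/LocalFolder/OurPicNic.html", "javascript:void(0)" };

        for (String href : hrefs)
            System.out.println(
                href
                + " | default: " + rejectIfMissing(notJavascriptVoid()).test(href)
                + " | keep-if-missing: " + keepIfMissing(notJavascriptVoid()).test(href)
            );
    }
}
```

Both rules agree whenever the attribute is present; they differ only for the element that has no value at all.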
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be retrieved (if the TagNode contains this attribute), and the parameter 'tc' will be used to compare this value against the Compare-Strings.
tc - This may be any of the TextComparitor's listed in that class. There are quite a few "pre-defined" static members in the TextComparitor class. Many have both long names and abbreviated names, which can be used interchangeably for readability purposes.
compareStr - These are passed to the 'TextComparitor' to perform tests on the attribute value.
- Returns:
An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accept a Predicate<TagNode> as a parameter in the search criteria.
- Throws:
InnerTagKeyException - This exception will be thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
cmp(String, TextComparitor, String[])
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);
ARGCHECK.TC(tc, compareStr);

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain
    // attributes. (an optimization)
    //
    // KIITNF -> Empty Opening HTML TagNode Elements cannot be eliminated!
    // HOWEVER, Closing TagNodes are never included
    if (tn.isClosing) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    String itv = tn.AV(innerTagLC); // REG-EX MATCHER, MORE EXPENSIVE

    // If the innerTag's value is null, then the inner-tag was not a key-value pair
    // found inside the TagNode.
    //
    // BECAUSE the user requested to "Keep If Inner-Tag Not Found", we must return TRUE
    // in that case.
    // In Java '||' uses short-circuit boolean-evaluation, while '|' requires
    // full-evaluation.
    //
    // OTHERWISE return the 'tc' test-results on that value using the named 'tc' comparison
    // on the compare-strings.
    return (itv == null) || tc.test(itv, compareStr);
};
-
cmp
static AVT cmp(java.lang.String innerTag, java.util.regex.Pattern p)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check as well. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be retrieved (if the TagNode contains this attribute), and then tested using the Regular-Expression p.matcher(tag_value).find() method.
p - This may be any regular expression Pattern. This Pattern will be executed against the value of the inner-tag specified by parameter 'innerTag'.
- Returns:
An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accept a Predicate<TagNode> as a parameter in the search criteria.
EQUIVALENCE NOTE: The two versions of code in the sample below will return equivalent results. It is important to remember that the primary use of class AVT is not for simple, single-line-searches, but rather to allow a programmer to "chain" search parameters using the standard lambda functions and(...), or(...), and negate().
Example:
// Version 1:
int[] posArr = InnerTagFind.all(page, "div", "class", Pattern.compile("^DepartmentXYZ.*$"));

// Version 2:
AVT specifier = AVT.cmp("class", Pattern.compile("^DepartmentXYZ.*$"));
int[] posArr = InnerTagFind.all(page, specifier, "div");

// Both of the versions above will result in identical 'posArr' values and lengths.
// NOTE: The specifier can be saved, stored, and specifically 'chained' with other
// specifiers using and(...), or(...), negate()
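The parameter notes for this method state that the attribute value is tested with p.matcher(tag_value).find(). As a reminder of what that implies, the sketch below contrasts `find()` with `matches()` using only java.util.regex; the class and helper names are hypothetical and the sample values are invented for illustration.

```java
import java.util.regex.Pattern;

class FindVersusMatchesSketch
{
    // True if the pattern occurs anywhere inside the value.
    static boolean foundAnywhere(Pattern p, String value)
    { return p.matcher(value).find(); }

    // True only if the pattern accounts for the entire value.
    static boolean matchesWhole(Pattern p, String value)
    { return p.matcher(value).matches(); }

    public static void main(String[] args)
    {
        Pattern unanchored = Pattern.compile("DepartmentXYZ");
        Pattern anchored   = Pattern.compile("^DepartmentXYZ.*$");
        String  value      = "Sales DepartmentXYZ-2";

        System.out.println(foundAnywhere(unanchored, value));
        System.out.println(matchesWhole(unanchored, value));
        System.out.println(foundAnywhere(anchored, value));
    }
}
```

With `find()` semantics, a pattern that should apply to the whole attribute value needs explicit `^` and `$` anchors, as in the example above.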
- Throws:
InnerTagKeyException - This exception will be thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
ARGCHECK.innerTag(String), ARGCHECK.REGEX(Pattern), TagNode.AV(String)
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);
final Predicate<String> pred = ARGCHECK.REGEX(p);

// Minimum length for field TagNode.str to have before it could possible contain the attribute
// Obviously, the TagNode would have to have a min-length that includes the attribute-name
// length + '< ' and '>'
final int MIN_LEN = innerTag.length() + 3;

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain the
    // attribute. (an optimization)
    if (tn.isClosing || (tn.str.length() <= (tn.tok.length() + MIN_LEN))) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    //
    // REG-EX MATCHER, MORE EXPENSIVE
    String itv = tn.AV(innerTagLC);

    // If the innerTag's value is null, then the inner-tag was not a key-value pair
    // found inside the TagNode: return false.
    // Otherwise return the results of running the Regular-Expression matcher using the
    // input 'Pattern' instance.
    return (itv == null) ? false : pred.test(itv);
};
-
cmp
static AVT cmp(java.lang.String innerTag, java.util.regex.Pattern p, boolean keepOnMatch, boolean keepOnNull)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check as well. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be retrieved (if the TagNode contains this attribute), and then tested using the Regular-Expression p.matcher(tag_value).find() method.
p - This may be any regular expression Pattern. This Pattern will be executed against the value of the inner-tag specified by parameter 'innerTag'.
keepOnMatch - There may be times when it is necessary to specify that a Regular-Expression match should cause the search-filter to reject a TagNode, rather than keeping it as a search-result match. In this case, the programmer can use this parameter to indicate whether matches should cause the generated predicate to return TRUE or FALSE. If this parameter is set to FALSE, then the Predicate<TagNode> that is generated will return FALSE whenever the regular-expression matches the Attribute-Value.
DEFAULT BEHAVIOR NOTE: The classes and methods in this Node Search Package that accept regular-expressions as search-parameters will always treat a match as indicating that the TagNode (or TextNode) in question has passed the search-filter criteria. This method, therefore, provides a way to bypass this default behavior.
keepOnNull - This parameter allows the user to specify whether the absence of an HTML Inner-Tag should indicate that the TagNode being tested should pass or fail (keep or reject) the search-filter criteria.
DEFAULT BEHAVIOR NOTE: By default, the search classes and search methods of the Node-Search Package exclude an HTML Element from the search results when the inner-tag is simply not present within that Element. By using this particular AVT factory-method, a programmer can by-pass that default behavior.
- Returns:
An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accept a Predicate<TagNode> as a parameter in the search criteria.
- Throws:
InnerTagKeyException - This exception will be thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
ARGCHECK.innerTag(String), ARGCHECK.REGEX(Pattern), TagNode.AV(String)
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);
final Predicate<String> pred = ARGCHECK.REGEX(p);

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain
    // attributes. (an optimization)
    //
    // keepOnNull -> Empty Opening HTML TagNode Elements cannot be eliminated!
    // HOWEVER, Closing TagNodes are never included
    if (tn.isClosing) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    //
    // REG-EX MATCHER, MORE EXPENSIVE
    String itv = tn.AV(innerTagLC);

    // If the Attribute is simply not present in the HTML Element
    if (itv == null) return keepOnNull;

    if (pred.test(itv)) return keepOnMatch;   // if the Regular-Expression succeeded
    else                return ! keepOnMatch; // If the Regular-Expression failed
};
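The keepOnMatch and keepOnNull flags described for this method reduce to a small decision rule. The sketch below restates that rule on a plain, possibly-null String so that it can be run on its own; `decide` is a hypothetical helper, and the use of `find()` is taken from the parameter notes rather than from the ARGCHECK source.

```java
import java.util.regex.Pattern;

class KeepFlagsSketch
{
    // value == null models an element that does not have the attribute at all.
    static boolean decide(String value, Pattern p, boolean keepOnMatch, boolean keepOnNull)
    {
        // A missing attribute is decided by keepOnNull alone.
        if (value == null) return keepOnNull;

        // A match returns keepOnMatch; a non-match returns its opposite.
        return p.matcher(value).find() == keepOnMatch;
    }

    public static void main(String[] args)
    {
        Pattern script = Pattern.compile("^javascript:");

        // Reject matches, keep elements that have no such attribute.
        System.out.println(decide("javascript:void(0)", script, false, true));
        System.out.println(decide("http://example.com", script, false, true));
        System.out.println(decide(null, script, false, true));
    }
}
```

Setting keepOnMatch to true and keepOnNull to false yields the default behavior described above, while true and true corresponds to the "keep if inner-tag not found" variants.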
-
cmpKIITNF
static AVT cmpKIITNF(java.lang.String innerTag, java.util.regex.Pattern p)
cmpKIITNF: Compare, and Keep If Inner-Tag is Not Found
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check as well. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
This method differs from the standard 'cmp' factory method with identical parameters in that it allows the programmer to build an Attribute Value Tester that "passes" the test (filters to include, not exclude) when the Inner Tag (Attribute) of an HTML Element is not found inside that Element.
Default Behavior:
By default, the search classes and search methods of the Node-Search Package exclude an HTML Element from the search results when the inner-tag is simply not present within that Element. By using this particular AVT, a programmer can by-pass that default behavior.
Example:
Below are three HTML 'TagNode' elements, and one 'AVT' filter Predicate (visible inside the comments). If the filter listed were applied, the first two elements pass the predicate, while the last one fails!
- Retain all HTML Divider Elements whose 'CLASS' attribute value was set to some derivative of 'CompanyABC'
- Filter out all HTML Divider Elements that have other values for the 'CLASS' attribute
- Also KEEP any HTML Divider Elements that simply did not have a 'CLASS' attribute
HTML Elements:
<!--
    For filter: AVT.cmpKIITNF("class", Pattern.compile(".*CompanyABC.*"));
    The two HTML Divider Elements, below, would PASS the filter criteria.
-->
<DIV ID="MyAnchor" ALT="This Divider does not have a CLASS Inner-Tag."> ... </DIV>
<DIV CLASS="CompanyABC-Department2"> ...</DIV>

<!-- The following HTML Divider Element would NOT BE included in the search results -->
<DIV CLASS="CompanyXYZ"> ... </DIV>
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be retrieved (if the TagNode contains this attribute), and then tested using the Regular-Expression p.matcher(tag_value).find() method.
p - This may be any regular expression Pattern. This Pattern will be executed against the value of the inner-tag specified by parameter 'innerTag'.
- Returns:
An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accept a Predicate<TagNode> as a parameter in the search parameter-list.
- Throws:
InnerTagKeyException - This exception will be thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
cmp(String, Pattern)
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);
final Predicate<String> pred = ARGCHECK.REGEX(p);

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain
    // attributes. (an optimization)
    //
    // KIITNF -> Empty Opening HTML TagNode Elements cannot be eliminated!
    // HOWEVER, Closing TagNodes are never included
    if (tn.isClosing) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    String itv = tn.AV(innerTagLC); // REG-EX MATCHER, MORE EXPENSIVE

    // If the innerTag's value is null, then the inner-tag was not a key-value pair
    // found inside the TagNode.
    //
    // BECAUSE the user requested to "Keep If Inner-Tag Not Found", we must return
    // in that case.
    // In Java '||' uses short-circuit boolean-evaluation, while '|' requires
    // full-evaluation.
    //
    // OTHERWISE return the results of running the Regular-Expression matcher using the
    // input 'Pattern' instance.
    return (itv == null) || pred.test(itv);
};
-
cmp
static AVT cmp (java.lang.String innerTag, java.util.function.Predicate<java.lang.String> innerTagValueTest)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check as well. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
NOTE: The astute observer might wonder why a String-Predicate is converted to a TagNode-Predicate. The answer is that predicate-chaining across multiple, different inner-tags (and their values) can only be accomplished by using a TagNode-Predicate, rather than a String-Predicate.
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be retrieved (if the TagNode contains this attribute), and then tested against the String-Predicate in parameter 'innerTagValueTest'.
innerTagValueTest - This may be any Java String-Predicate with a test(...) method. It will be used to accept or reject the inner-tag's value.
- Returns:
An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accept a Predicate<TagNode> as a parameter in the search criteria.
EQUIVALENCE NOTE: The two versions of code in the sample below will return equivalent results. It is important to remember that the primary use of class AVT is not for simple, single-line-searches, but rather to allow a programmer to "chain" search parameters using the standard lambda functions and(...), or(...), and negate().
Example:
// Version 1:
int[] posArr = InnerTagFind.all(
    page, "div", "class",
    (String innerTagValue) -> innerTagValue.length() > 4 && innerTagValue.charAt(4) == '$'
);

// Version 2:
AVT specifier = AVT.cmp(
    "class",
    (String innerTagValue) -> innerTagValue.length() > 4 && innerTagValue.charAt(4) == '$'
);
int[] posArr = InnerTagFind.all(page, specifier, "div");

// Both of the versions above will result in identical 'posArr' values and lengths.
// NOTE: The specifier can be saved, stored, and specifically 'chained' with other specifiers using
// and(...), or(...), negate()
- Throws:
InnerTagKeyException - This exception will be thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
InnerTagFind, ARGCHECK.innerTag(String), TagNode.AV(String)
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);

if (innerTagValueTest == null) throw new NullPointerException
    ("Parameter innerTagValueTest was passed null, but this is not allowed here.");

// Minimum length for field TagNode.str to have before it could possible contain the attribute
// Obviously, the TagNode would have to have a min-length that includes the attribute-name
// length + '< ' and '>'
final int MIN_LEN = innerTag.length() + 3;

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain the
    // attribute. (an optimization)
    if (tn.isClosing || (tn.str.length() <= (tn.tok.length() + MIN_LEN))) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    String itv = tn.AV(innerTagLC); // REG-EX MATCHER, MORE EXPENSIVE

    // If the innerTag's value is null, then the inner-tag was not a key-value pair
    // found inside the TagNode: return false.
    // Otherwise return the results of the Predicate<String> provided on that
    // attribute-value.
    return (itv == null) ? false : innerTagValueTest.test(itv);
};
-
cmpKIITNF
static AVT cmpKIITNF (java.lang.String innerTag, java.util.function.Predicate<java.lang.String> innerTagValueTest)
cmpKIITNF: Compare, and Keep If Inner-Tag is Not Found
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check too. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
This method differs from the standard 'cmp' factory method with identical parameters in that it builds an Attribute Value Tester that "passes" the test (filters to include, not exclude) when the Inner Tag (Attribute) of an HTML Element is not found inside that Element.
Default Behavior:
By default, the search classes and search methods of the Node-Search Package exclude an HTML Element from the search results when the requested inner-tag is not present within that Element. By using this particular AVT, a programmer can bypass that default behavior.
Example:
Below are three HTML 'TagNode' elements, and one 'AVT' filter Predicate (visible inside the comments). If the filter listed were applied, the first two elements pass the predicate, while the last one fails. The filter will:
- Filter out all HTML Image Elements whose "ALT" Text contains the words Associated Press
- Keep all other HTML Image Elements whose "ALT" Text does not contain those words.
- Also KEEP any HTML Image Elements that simply do not have an "ALT" Attribute
HTML Elements:
<!--
    For filter:
    AVT.cmpKIITNF("alt", (String itv) -> StrCmpr.containsNAND(itv, "Associated", "Press"));

    The two HTML Image Elements, below, would PASS the filter criteria.
-->
<IMG SRC="img01.jpg" ALT="Company Photo, Department Picture">
<IMG SRC="img02.jpg">

<!-- The following HTML Image Element would NOT BE included in the search results. -->
<IMG SRC="img03.jpg" ALT="Photo by: Journalist Ben Bitdiddle, Associated Press">
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. The value will be found (if the TagNode contains this attribute), and then tested against the String-Predicate parameter 'innerTagValueTest'.
innerTagValueTest - This may be any Java String-Predicate with a test(...) method. It will be used to accept or reject the inner-tag's value.
- Returns:
- An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accepts a Predicate<TagNode> as a parameter in the search criteria.
- Throws:
InnerTagKeyException - Thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects an attribute-name to conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
cmp(String, Predicate)
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If these tests fail, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);

if (innerTagValueTest == null) throw new NullPointerException
    ("Parameter innerTagValueTest was passed null, but this is not allowed here.");

// Java's "Lambda-Expression" Syntax (like an "anonymous method").
// AVT extends functional-interface Predicate<TagNode>
return (TagNode tn) ->
{
    // This eliminates testing any TagNode that simply COULD NOT contain
    // attributes. (an optimization)
    //
    // KIITNF -> Empty Opening HTML TagNode Elements cannot be eliminated!
    // HOWEVER, Closing TagNodes are never included
    if (tn.isClosing) return false;

    // Retrieve the value of the requested "inner-tag" (HTML Attribute) Key-Value Pair
    // from the input HTML-Element (TagNode)
    //
    // REG-EX MATCHER, MORE EXPENSIVE
    String itv = tn.AV(innerTagLC);

    // If the innerTag's value is null, then the inner-tag was not a key-value pair
    // found inside the TagNode.
    //
    // BECAUSE the user requested to "Keep If Inner-Tag Not Found", we must return TRUE
    // in that case.
    // In Java '||' uses short-circuit boolean-evaluation, while '|' requires
    // full-evaluation.
    //
    // OTHERWISE return the results of the Predicate<String> provided on that
    // attribute-value.
    return (itv == null) || innerTagValueTest.test(itv);
};
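The difference between the 'cmp' and 'cmpKIITNF' behaviors comes down to how a missing (null) attribute value is treated. The sketch below illustrates this without the library: a Map<String, String> of attributes stands in for a TagNode, and cmpLike, cmpKiitnfLike, and the notAP value test are hypothetical names introduced only for this example.

```java
import java.util.Map;
import java.util.function.Predicate;

class KeepIfNotFoundSketch
{
    // Stand-in for cmp(...): an element without the attribute fails the test.
    static Predicate<Map<String, String>> cmpLike(String attr, Predicate<String> valueTest)
    {
        return (Map<String, String> attrs) ->
        {
            String value = attrs.get(attr);
            return (value != null) && valueTest.test(value);
        };
    }

    // Stand-in for cmpKIITNF(...): an element without the attribute passes the test.
    static Predicate<Map<String, String>> cmpKiitnfLike(String attr, Predicate<String> valueTest)
    {
        return (Map<String, String> attrs) ->
        {
            String value = attrs.get(attr);
            return (value == null) || valueTest.test(value);
        };
    }

    public static void main(String[] args)
    {
        // Hypothetical value test: reject ALT text that mentions "Associated Press"
        Predicate<String> notAP = (String alt) -> ! alt.contains("Associated Press");

        Map<String, String> img01 = Map.of("src", "img01.jpg", "alt", "Company Photo");
        Map<String, String> img02 = Map.of("src", "img02.jpg");
        Map<String, String> img03 = Map.of("src", "img03.jpg", "alt", "Photo: Associated Press");

        System.out.println(cmpLike("alt", notAP).test(img02));        // false
        System.out.println(cmpKiitnfLike("alt", notAP).test(img01));  // true
        System.out.println(cmpKiitnfLike("alt", notAP).test(img02));  // true
        System.out.println(cmpKiitnfLike("alt", notAP).test(img03));  // false
    }
}
```

The only element on which the two stand-ins disagree is img02, the one with no "alt" attribute at all.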
-
cmp
static AVT cmp(java.lang.String innerTag)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check too. The primary use of this class is that the results of one factory method may be "AND-chained" or "OR-chained" with another to make search requirements more specific.
- Parameters:
innerTag - This also goes by the term "attribute" in many HTML specifications. It is the name of the attribute, not its value. If this attribute is found, this Predicate will always return TRUE regardless of its value, so long as it is not null.
IMPORTANT NOTE: There is a subtle difference between inner-tags whose value is the empty string (a zero-length string), and attributes that are "null" or not found at all. Though rare, an HTML Attribute may have an empty value, as in <SOME-TAG SOME-INNER-TAG="">. Other versions leave the quotes off entirely, such as <OTHER-ELEMENT OTHER-ATTRIBUTE=>. If the attribute is found with an equals sign, it will evaluate to the zero-length string and this Predicate will return TRUE; but if the attribute is not found at all, searching for it will return null, and this Predicate will return FALSE.
- Returns:
- An instance of 'AVT' that can be passed to the NodeSearch classes' search-methods via any one of the methods that accepts a Predicate<TagNode> as a parameter in the search criteria.
EQUIVALENCE NOTE: The two versions in the sample below will return equivalent results. It is important to remember that the primary use of class AVT is not for simple, single-line searches, but rather to allow a programmer to "chain" search parameters using the standard lambda functions and(...), or(...), negate().
Example:
// Version 1:
int[] posArr = InnerTagFind.all(page, "div", "data-ABC-XYZ");

// If you are unfamiliar with the HTML Attribute prefix "data-", this HTML Search Request is asking for all divider
// elements that have a "data-ABC-XYZ" attribute. Attributes that begin with "data-" are used to do just
// that, store data-information inside the HTML divider element itself.

// Version 2:
AVT specifier = AVT.cmp("data-ABC-XYZ");
int[] posArr = InnerTagFind.all(page, specifier, "div");

// Both of the versions above will result in identical 'posArr' values and lengths.
// NOTE: The specifier can be saved, stored, and specifically 'chained' with other specifiers using
// and(...), or(...), negate()
- Throws:
InnerTagKeyException - Thrown if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects an attribute-name to conform to a set of rules in order to be processed by a browser.
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
InnerTagFind, ARGCHECK.innerTag(String), TagNode.AV(String)
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If this test fails, the returned predicate would absolutely fail.
final String innerTagLC = ARGCHECK.innerTag(innerTag);

// SIMPLIFIED LAMBDA: The contents of this "anonymous method" can be expressed in a
// single-statement. No need for 'return' or 'curly-braces'
// Returns TRUE if the HTML Element contained a copy of the named inner-tag, and false
// otherwise.
return (TagNode tn) -> tn.AV(innerTagLC) != null;
-
and
default AVT and(AVT additionalTest)
Generates a new 'AVT' predicate test that logically-AND's the results of 'this' Predicate with the results of the additional parameter Predicate 'additionalTest'.
- Parameters:
additionalTest - This is an additional test of the inner-tag key-value pair (also known as the "attribute-value pair") of HTML TagNode's.
- Returns:
- A new Predicate<TagNode> that will use two tests, 'this' and 'additionalTest', and perform a logical-AND on the results. Short-circuit evaluation is used (specifically, the '&&' operator, rather than the '&' operator). The Predicate that is returned will perform the 'this' test first and, only if it passes, the 'additionalTest' second. The returned Predicate will return the 'AND' of both of them.
- Throws:
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
TagNode
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If this test fails, the returned predicate would absolutely fail.
if (additionalTest == null) throw new NullPointerException
    ("The parameter 'additionalTest' passed to method 'AVT.and(additionalTest)' was null");

// SIMPLIFIED LAMBDA: The contents of this "anonymous method" can be expressed in a
// single-statement. No need for 'return' or 'curly-braces'
// Returns TRUE if both 'this' evaluates to true on an input HTML Element,
// and 'additionalTest' also evaluates to true for the same element.
return (TagNode tn) -> this.test(tn) && additionalTest.test(tn);
-
or
default AVT or(AVT additionalTest)
Generates a new 'AVT' predicate test that logically-OR's the results of 'this' Predicate with the results of the additional parameter Predicate 'additionalTest'.
- Parameters:
additionalTest - This is an additional test of the inner-tag key-value pair (also known as the "attribute-value pair") of HTML TagNode's.
- Returns:
- A new Predicate<TagNode> that will use two tests, 'this' and 'additionalTest', and perform a logical-OR on the results. Short-circuit evaluation is used (specifically, the '||' operator, rather than the '|' operator). The Predicate that is returned will perform the 'this' test first and, only if it fails, the 'additionalTest' second. The returned Predicate will return the 'OR' of both of them.
- Throws:
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
TagNode
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If this test fails, the returned predicate would absolutely fail.
if (additionalTest == null) throw new NullPointerException
    ("The parameter 'additionalTest' passed to method 'AVT.or(additionalTest)' was null");

// SIMPLIFIED LAMBDA: The contents of this "anonymous method" can be expressed in a
// single-statement. No need for 'return' or 'curly-braces'
// Returns TRUE if either 'this' evaluates to true on an input HTML Element,
// or 'additionalTest' evaluates to true for the same element.
return (TagNode tn) -> this.test(tn) || additionalTest.test(tn);
-
negate
default AVT negate()
Generates a new 'AVT' predicate test that is the logical-NOT of 'this' Predicate.
- Specified by:
negate in interface java.util.function.Predicate<TagNode>
- Returns:
- A new Predicate<TagNode> that simply calls 'this' Predicate and applies the logical 'NOT' operator (an exclamation point) to the result.
- See Also:
TagNode
- Code:
- Exact Method Body:
// SIMPLIFIED LAMBDA: The contents of this "anonymous method" can be expressed in a
// single-statement. No need for 'return' or 'curly-braces'
// Returns the opposite of whatever result 'this' evaluates using the input HTML Element.
return (TagNode tn) -> ! this.test(tn);
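The three combinators above (and, or, negate) share one design: each returns the sub-interface type rather than a plain Predicate, so chained calls keep that type. The sketch below illustrates the pattern and its short-circuit behavior. It is not the library's code: StrTest is a hypothetical stand-in for AVT that tests String's instead of TagNode's, and the counter exists only to show when the second test is skipped.

```java
import java.util.Objects;
import java.util.function.Predicate;

class ChainSketch
{
    // Hypothetical stand-in for AVT: a Predicate sub-interface whose combinators
    // return the sub-interface type, so that chained calls stay of that type.
    @FunctionalInterface
    interface StrTest extends Predicate<String>
    {
        default StrTest and(StrTest other)
        {
            Objects.requireNonNull(other, "other");
            return (String s) -> this.test(s) && other.test(s);
        }

        default StrTest or(StrTest other)
        {
            Objects.requireNonNull(other, "other");
            return (String s) -> this.test(s) || other.test(s);
        }

        @Override
        default StrTest negate()
        { return (String s) -> ! this.test(s); }
    }

    public static void main(String[] args)
    {
        int[] calls = { 0 };   // counts how often the second test runs

        StrTest isEmpty  = (String s) -> s.isEmpty();
        StrTest counting = (String s) -> { calls[0]++; return true; };

        System.out.println(isEmpty.and(counting).test("abc"));           // false
        System.out.println(calls[0]);                                    // 0
        System.out.println(isEmpty.or(counting).test(""));               // true
        System.out.println(calls[0]);                                    // 0
        System.out.println(isEmpty.negate().and(counting).test("abc"));  // true
        System.out.println(calls[0]);                                    // 1
    }
}
```

In the first two calls the result is already decided by the first test, so the counting test never runs.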
-
isEqualKEEP
static AVT isEqualKEEP(TagNode expectedTN)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check too.
If tn.equals(expectedTN) succeeds, specifically using the java built-in equality-test method 'equals(...)', then the returned Predicate returns TRUE, and the TagNode in question is included in the results. Any TagNode that is not equal to 'expectedTN' is filtered from the results.
- Parameters:
expectedTN - This is compared against TagNode's found in the page-Vector for equality.
- Returns:
- A Predicate<TagNode> that returns TRUE for any TagNode equal to parameter 'expectedTN'
- Throws:
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
TagNode
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If this test fails, the returned predicate would absolutely fail.
if (expectedTN == null) throw new NullPointerException
    ("The parameter 'expectedTN' passed to method 'AVT.isEqualKEEP(expectedTN)' was null");

// SIMPLIFIED LAMBDA: The contents of this "anonymous method" can be expressed in a
// single-statement. No need for 'return' or 'curly-braces'
// Returns true if the HTML Element passed to this (anonymous) method is the same as the
// one passed to 'isEqualKEEP'
// Identical to: (TagNode tn) -> tn.str.equals(expectedTN.str);
return (TagNode tn) -> tn.equals(expectedTN);
-
isEqualREJECT
static AVT isEqualREJECT(TagNode expectedTN)
This is a static factory method that generates AVT-Predicate's (Predicate<TagNode>). It saves the user from typing the lambda by hand, and performs a validation check too.
If tn.equals(expectedTN) succeeds, specifically using the java built-in equality-test method 'equals(...)', then the returned Predicate returns FALSE, and the TagNode in question is filtered from the results. Any TagNode that is not equal to 'expectedTN' is kept.
- Parameters:
expectedTN - This is compared against TagNode's found in the page-Vector for equality.
- Returns:
- A Predicate<TagNode> that returns FALSE for any TagNode equal to parameter 'expectedTN'
- Throws:
java.lang.NullPointerException - If any of the provided input reference parameters are null.
- See Also:
TagNode
- Code:
- Exact Method Body:
// FAIL-FAST: It is helpful for the user to test the data before building the Predicate.
// If this test fails, the returned predicate would absolutely fail.
if (expectedTN == null) throw new NullPointerException(
    "The parameter 'expectedTN' passed to method 'AVT.isEqualREJECT(expectedTN)' " +
    "was null"
);

// SIMPLIFIED LAMBDA: The contents of this "anonymous method" can be expressed in a
// single-statement. No need for 'return' or 'curly-braces'
// Returns TRUE if the HTML Element passed to this (anonymous) method is NOT the same as
// the one passed to 'isEqualREJECT'
// Identical to: (TagNode tn) -> ! tn.str.equals(expectedTN.str);
return (TagNode tn) -> ! tn.equals(expectedTN);
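The two equality factories are logical complements of one another. The sketch below shows that relationship using only the standard library, with String's standing in for TagNode's; keepEqual and rejectEqual are hypothetical names, and the use of Predicate.isEqual is a substitution made for this example, not the library's implementation.

```java
import java.util.Objects;
import java.util.function.Predicate;

class EqualitySketch
{
    // Stand-in for isEqualKEEP: passes only values equal to 'expected'
    static Predicate<String> keepEqual(String expected)
    {
        // Fail fast on null, as the factory methods above do
        Objects.requireNonNull(expected, "expected");
        return Predicate.isEqual(expected);
    }

    // Stand-in for isEqualREJECT: the logical complement of keepEqual
    static Predicate<String> rejectEqual(String expected)
    { return keepEqual(expected).negate(); }

    public static void main(String[] args)
    {
        System.out.println(keepEqual("<br>").test("<br>"));     // true
        System.out.println(keepEqual("<br>").test("<hr>"));     // false
        System.out.println(rejectEqual("<br>").test("<br>"));   // false
        System.out.println(rejectEqual("<br>").test("<hr>"));   // true
    }
}
```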
-
-