Package Torello.HTML
Class Attributes
- java.lang.Object
-
- Torello.HTML.Attributes
-
public class Attributes extends java.lang.Object
Utilities for getting, setting and removing attributes from the TagNode elements in a Web-Page Vector.
This class is used to perform iteration-loops over HTML Element Vectors, where each and every TagNode Attribute can be updated, modified, added or removed with just a single method invocation. This class can be used in conjunction with the 'AUM' enumerated-type class, where the type of update is refined/specified.
It is important to note that these methods are really just for-loops that update an html-page with nodes whose attributes have changed. Generally, the methods in this class will not save a lot of typing, since the for-loop is not very long and replacing old HTML Elements with new ones in a Vector should be easy. However, with the error checking, exception reporting and String-concatenation provided by the enum 'AUM' (Attribute Update Mode), the value of using this class over a simple for-loop becomes more apparent: less error-prone and simpler code.
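To make the "really just for-loops" remark concrete, here is a minimal sketch of the kind of loop being described. It does not use the library at all: the nested class Tag is a hypothetical stand-in for TagNode, the method name setOnAll is invented for illustration, and none of the library's error checking is modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Vector;

final class ForLoopSketch {

    // Hypothetical stand-in for an opening HTML tag (NOT the library's TagNode).
    static final class Tag {
        final String name;
        final Map<String, String> attrs;

        Tag(String name, Map<String, String> attrs) {
            this.name = name;
            this.attrs = new LinkedHashMap<>(attrs);
        }
    }

    // Sets 'key' to 'value' on every Tag in the vector and returns the indices replaced.
    static int[] setOnAll(Vector<Object> html, String key, String value) {
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < html.size(); i++) {
            Object node = html.get(i);
            if (!(node instanceof Tag)) continue;   // skip text and comment stand-ins
            Tag old = (Tag) node;
            Map<String, String> attrs = new LinkedHashMap<>(old.attrs);
            attrs.put(key, value);
            html.set(i, new Tag(old.name, attrs));  // replace the node rather than mutate it
            changed.add(i);
        }
        return changed.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        Vector<Object> html = new Vector<>();
        html.add(new Tag("p", new LinkedHashMap<>()));
        html.add("some text");
        html.add(new Tag("div", new LinkedHashMap<>()));
        System.out.println(Arrays.toString(setOnAll(html, "class", "MyClass")));
    }
}
```

The loop itself is short; what it leaves out (quote validation, attribute-name checks, range checks) is the part the class description says the real methods add.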
'Re-Inventing the Wheel' is something that happens pretty easily in Computer-Programming circles. With that in mind, it should be pointed out that this class (class Attributes) along with enum AUM, when working together, behave similarly to the pair: class ReplaceNodes and class ReplaceFunction. Both of these are generally used to replace HTML TagNode's in a vectorized-html web-page with ones that have updated attributes.
- See Also:
AUM
Hi-Lited Source-Code: Torello/HTML/Attributes.java
File Size: 50,236 Bytes
Line Count: 989 '\n' Characters Found
Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 36 Method(s), 36 declared static
- 0 Field(s)
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Class:
static interface Attributes.Filter
-
Method Summary
Retrieve the Attributes inside all of the TagNode's, Contained by a Vector
Modifier and Type, Method:
static Ret2<int[],String[]> retrieve(Vector<? super TagNode> html, String attribute)

Retrieve the Attributes of a Vector's TagNode's, Range-Limited
Modifier and Type, Method:
static String[] retrieve(Vector<? super TagNode> html, int[] posArr, String attribute)
static Ret2<int[],String[]> retrieve(Vector<? super TagNode> html, int sPos, int ePos, String attribute)
static Ret2<int[],String[]> retrieve(Vector<? super TagNode> html, DotPair dp, String attribute)

Remove some, or all, of the Attributes from all of the TagNode's, Contained by a Vector
Modifier and Type, Method:
static int[] filter(Vector<? super TagNode> html, String... innerTagWhiteList)
static int[] filter(Vector<? super TagNode> html, StrFilter filter)
static int[] remove(Vector<? super TagNode> html, String... innerTags)
static int[] removeAll(Vector<? super TagNode> html)
static int[] removeData(Vector<? super TagNode> html)

Remove some, or all, of the Attributes of a Vector's TagNode's, Range-Limited: sPos & ePos
Modifier and Type, Method:
static int[] filter(Vector<? super TagNode> html, int sPos, int ePos, String... innerTagWhiteList)
static int[] filter(Vector<? super TagNode> html, int sPos, int ePos, StrFilter filter)
static int[] remove(Vector<? super TagNode> html, int sPos, int ePos, String... innerTags)
static int[] removeAll(Vector<? super TagNode> html, int sPos, int ePos)
static int[] removeData(Vector<? super TagNode> html, int sPos, int ePos)

Remove some, or all, of the Attributes of a Vector's TagNode's, Range-Limited: DotPair
Modifier and Type, Method:
static int[] filter(Vector<? super TagNode> html, DotPair dp, String... innerTagWhiteList)
static int[] filter(Vector<? super TagNode> html, DotPair dp, StrFilter filter)
static int[] remove(Vector<? super TagNode> html, DotPair dp, String... innerTags)
static int[] removeAll(Vector<? super TagNode> html, DotPair dp)
static int[] removeData(Vector<? super TagNode> html, DotPair dp)

Remove some, or all, of the Attributes of a Vector's TagNode's, Range-Limited: Index-Position Array
Modifier and Type, Method:
static int[] filter(Vector<? super TagNode> html, int[] posArr, String... innerTagWhiteList)
static int[] filter(Vector<? super TagNode> html, int[] posArr, StrFilter filter)
static int[] remove(Vector<? super TagNode> html, int[] posArr, String... innerTags)
static int[] removeAll(Vector<? super TagNode> html, int[] posArr)
static int[] removeData(Vector<? super TagNode> html, int[] posArr)

Modify the Attributes inside all of the TagNode's, Contained by a Vector
Modifier and Type, Method:
static int[] update(Vector<? super TagNode> html, Attributes.Filter f)
static int[] update(Vector<? super TagNode> html, AUM mode, String innerTag, String itValue, SD quote)
static int[] update(Vector<? super TagNode> html, AUM mode, String innerTag, IntTFunction<TagNode,String> newITValueStrGetter, SD quote)

Modify the Attributes of a Vector's TagNode's, Range-Limited: sPos & ePos
Modifier and Type, Method:
static int[] update(Vector<? super TagNode> html, int sPos, int ePos, Attributes.Filter f)
static int[] update(Vector<? super TagNode> html, AUM mode, int sPos, int ePos, String innerTag, String itValue, SD quote)
static int[] update(Vector<? super TagNode> html, AUM mode, int sPos, int ePos, String innerTag, IntTFunction<TagNode,String> newITValueStrGetter, SD quote)

Modify the Attributes of a Vector's TagNode's, Range-Limited: DotPair
Modifier and Type, Method:
static int[] update(Vector<? super TagNode> html, AUM mode, DotPair dp, String innerTag, String itValue, SD quote)
static int[] update(Vector<? super TagNode> html, AUM mode, DotPair dp, String innerTag, IntTFunction<TagNode,String> newITValueStrGetter, SD quote)
static int[] update(Vector<? super TagNode> html, DotPair dp, Attributes.Filter f)

Modify the Attributes of a Vector's TagNode's, Range-Limited: Index-Position Array
Modifier and Type, Method:
static int[] update(Vector<? super TagNode> html, int[] posArr, Attributes.Filter f)
static int[] update(Vector<? super TagNode> html, AUM mode, int[] posArr, String innerTag, String itValue, SD quote)
static int[] update(Vector<? super TagNode> html, AUM mode, int[] posArr, String innerTag, IntTFunction<TagNode,String> newITValueStrGetter, SD quote)
-
-
-
Method Detail
-
update
public static int[] update(java.util.Vector<? super TagNode> html, AUM mode, java.lang.String innerTag, java.lang.String itValue, SD quote)
Convenience Method
Passes: Simple Update Lambda that always assigns 'itValue' to the Attribute
Iterates: The entire html-page, Passes 0, -1 to sPos, ePos
See Documentation: update(Vector, AUM, int, int, String, IntTFunction, SD)
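The delegation pattern named here (wrap a constant in a lambda, pass 0 and -1 for the range) can be sketched without the library. In the sketch below, IndexedFunction is a hypothetical stand-in for IntTFunction, a String array stands in for the html Vector, and the method name valuesFor is invented; only the shape of the overload pair is the point.

```java
import java.util.Arrays;

final class ConvenienceSketch {

    // Hypothetical (int index, T node) -> R function; NOT the library's IntTFunction.
    interface IndexedFunction<T, R> {
        R apply(int index, T node);
    }

    // General form: asks the getter for a value at each index in [sPos, ePos).
    // A negative ePos is treated as "up to the end", as the documentation describes.
    static String[] valuesFor
        (String[] nodes, int sPos, int ePos, IndexedFunction<String, String> getter)
    {
        int end = (ePos < 0) ? nodes.length : ePos;
        String[] out = new String[end - sPos];
        for (int i = sPos; i < end; i++) out[i - sPos] = getter.apply(i, nodes[i]);
        return out;
    }

    // Convenience form: whole array (0, -1) and a lambda that always returns 'constant'.
    static String[] valuesFor(String[] nodes, String constant) {
        return valuesFor(nodes, 0, -1, (i, node) -> constant);
    }

    public static void main(String[] args) {
        String[] nodes = { "<p>", "<div>", "<span>" };
        System.out.println(Arrays.toString(valuesFor(nodes, "MyClass")));
    }
}
```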
-
update
public static int[] update(java.util.Vector<? super TagNode> html, AUM mode, DotPair dp, java.lang.String innerTag, java.lang.String itValue, SD quote)
Convenience Method
Receives: DotPair
Passes: Simple Update Lambda that always assigns 'itValue' to the Attribute
Iterates: The html-page from dp.start (inclusive) to dp.end (also inclusive)
See Documentation: update(Vector, AUM, int, int, String, IntTFunction, SD)
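Because dp.end is inclusive while ePos (in the sPos, ePos overloads) is documented as exclusive, a DotPair range has to be shifted by one at its right end when expressed as sPos, ePos. The sketch below shows only that arithmetic. The class Pair is a hypothetical stand-in for DotPair, and whether the library performs the conversion this way internally is an assumption.

```java
import java.util.Arrays;

final class DotPairSketch {

    // Hypothetical stand-in for DotPair: both 'start' and 'end' are inclusive.
    static final class Pair {
        final int start;
        final int end;

        Pair(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Converts an inclusive pair into { sPos, ePos }, where ePos is exclusive.
    static int[] toHalfOpen(Pair dp) {
        return new int[] { dp.start, dp.end + 1 };
    }

    public static void main(String[] args) {
        // Visiting indices 3, 4 and 5 corresponds to sPos = 3, ePos = 6.
        System.out.println(Arrays.toString(toHalfOpen(new Pair(3, 5))));
    }
}
```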
-
update
public static int[] update (java.util.Vector<? super TagNode> html, AUM mode, java.lang.String innerTag, IntTFunction<TagNode,java.lang.String> newITValueStrGetter, SD quote)
Convenience Method
Receives: An Attribute-Update Lambda-Function 'newITValueStrGetter'
Iterates: The entire html-page, Passes 0, -1 to sPos, ePos
See Documentation: update(Vector, AUM, int, int, String, IntTFunction, SD)
-
update
public static int[] update (java.util.Vector<? super TagNode> html, AUM mode, DotPair dp, java.lang.String innerTag, IntTFunction<TagNode,java.lang.String> newITValueStrGetter, SD quote)
Convenience Method
Receives: DotPair
And-Receives: An Attribute-Update Lambda-Function 'newITValueStrGetter'
Iterates: The html-page from dp.start (inclusive) to dp.end (also inclusive)
See Documentation: update(Vector, AUM, int, int, String, IntTFunction, SD)
-
update
public static int[] update(java.util.Vector<? super TagNode> html, AUM mode, int sPos, int ePos, java.lang.String innerTag, java.lang.String itValue, SD quote)
Convenience Method
Receives: HTML-Vector starting & ending indices (sPos and ePos).
Passes: Simple Update Lambda that always assigns 'itValue' to the Attribute
Iterates: The html-page from sPos (inclusive) to ePos (exclusive)
See Documentation: update(Vector, AUM, int, int, String, IntTFunction, SD)
-
update
public static int[] update (java.util.Vector<? super TagNode> html, AUM mode, int sPos, int ePos, java.lang.String innerTag, IntTFunction<TagNode,java.lang.String> newITValueStrGetter, SD quote)
Will update any HTML TagNode's present in the vector-parameter 'html' according to the passed AUM mode and the 'innerTag' parameter.
Range-Restriction - sPos, ePos:
This method restricts the update process to the specified subrange sPos ... ePos for the Vector-parameter 'html'.
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that a Vector<TagNode> or a Vector<HTMLNode> are both accepted by this parameter. They will not cause an exception throw.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
mode
- Since the purpose of this class is to update, modify, or remove HTML Element Inner-Tag key-value pairs, the mechanism (or the desired behavior) of the update process needs to be specified. Use the enumerated type enum 'AUM' for choosing what update type is needed.
sPos
- This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
ePos
- This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'exclusive', meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
innerTag
- This is the name of the HTML attribute that needs to be changed, added, or removed.
newITValueStrGetter
- This function accepts the Vector-index and the TagNode located at that index, and is expected to provide the new value. Depending on which AUM-Mode was selected, this new value will be assigned, appended or used to replace the current value (if there is one) of the Attribute whose name equals the passed parameter 'innerTag'.
The actual action that is taken using the String returned by this function is decided by the AUM mode. If AUM.RemoveSubString were chosen, then all HTML Elements within the specified Vector-range would have the first sub-string-copy of the String-value returned by user-function 'newITValueStrGetter' removed from the Attribute-value of any TagNode's that actually possessed such an Attribute named 'innerTag'.
quote
- The programmer is expected to select either SD.Single-Quote or SD.Double-Quote.
The updated Attribute / Inner-Tag key-value pairs will be surrounded by the selected quote. Always remember that the class 'TagNode' checks for quotes-within-quotes, and will throw an exception if a double-quoted attribute-value also contains a double-quote, or vice-versa (a single-quote within a single-quoted attribute-value).
- Returns:
- This method shall return an integer-array index-list whose values identify which HTML Vector Elements were changed as a result of this method invocation.
NOTE: One minor subtlety: there could be cases where a new HTML Element 'TagNode' reference / object was instantiated or 'created,' even though the actual String that comprised the HTMLNode itself was identical to the original HTMLNode.str String. In the 'AUM' enumerated-type, when AUM.Set is invoked, the original String data for an attribute is always clobbered, even in cases where an identical version of the String is replaced or substituted.
- Throws:
QuotesException
- If there are "quotes within quotes" problems when invoking the TagNode constructor, this exception will throw. The problem occurs when one or more of the Attribute Key-Value Pairs have a quotation-choice such that the chosen quotation-mark is also found within the Attribute-Value. QuotesException will also throw in the case that an Attribute Key-Value Pair has elected to use the "No Quotes" option, but the attribute-value contains white-space.
InnerTagKeyException
- This exception will throw if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
java.lang.IndexOutOfBoundsException
- This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
AUM.update(TagNode, String, String, SD), LV, TagNode.isTagNode(), TagNode.isClosing
- Code:
- Exact Method Body:
return Update.update(html, mode, sPos, ePos, innerTag, newITValueStrGetter, quote);
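The parameter notes above describe three kinds of action (assign, append, remove the first matching sub-string). A minimal sketch of how such modes could act on a single attribute value follows. The enum Mode and its three constants are hypothetical names chosen for illustration; they are not the actual constants of enum AUM, and the behavior for a missing attribute is an assumption.

```java
final class ModeSketch {

    // Hypothetical modes, named for illustration only (NOT the constants of enum AUM).
    enum Mode { SET, APPEND, REMOVE_SUBSTRING }

    // 'current' is the existing attribute value, or null when the attribute is absent.
    // Returns the new value, or null when the attribute stays absent.
    static String apply(Mode mode, String current, String s) {
        switch (mode) {
            case SET:
                return s;                                   // always overwrite
            case APPEND:
                return (current == null) ? s : current + s; // assumed: absent means "start fresh"
            case REMOVE_SUBSTRING:
                if (current == null) return null;           // nothing to remove from
                int pos = current.indexOf(s);
                if (pos < 0) return current;                // sub-string not present
                // remove only the FIRST copy of 's'
                return current.substring(0, pos) + current.substring(pos + s.length());
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply(Mode.REMOVE_SUBSTRING, "big red big", "big"));
    }
}
```

Note that the SET case returns the new value even when it equals the old one, which matches the subtlety described under Returns: a node can be reported as changed although its text is identical.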
-
update
public static int[] update(java.util.Vector<? super TagNode> html, AUM mode, int[] posArr, java.lang.String innerTag, java.lang.String itValue, SD quote)
Convenience Method
Receives: An int[]-Array which identifies which nodes in the Vector to update.
Passes: Simple Update Lambda that always assigns 'itValue' to the Attribute
Iterates: All Vector-indices pointed to by the values in 'posArr'
See Documentation: update(Vector, AUM, int, int, String, IntTFunction, SD)
-
update
public static int[] update (java.util.Vector<? super TagNode> html, AUM mode, int[] posArr, java.lang.String innerTag, IntTFunction<TagNode,java.lang.String> newITValueStrGetter, SD quote)
Will update any HTML TagNode's present in the vector-parameter 'html' according to a passed 'AUM' mode and the 'innerTag' parameter.
Range-Restriction - int[] posArr:
This method restricts the update process to only nodes specified by the Vector-index int[]-Array parameter 'posArr'.
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that a Vector<TagNode> or a Vector<HTMLNode> are both accepted by this parameter. They will not cause an exception throw.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
mode
- Since the purpose of this class is to update, modify, or remove HTML Element Inner-Tag key-value pairs, the mechanism (or the desired behavior) of the update process needs to be specified. Use the enumerated type enum 'AUM' for choosing what update type is needed.
posArr
- This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch 'Find' classes, and are simply lists of index-pointers into a Vectorized HTML Web-Page Vector. The int[] array passed to this parameter will specify the TagNode's in the Vector whose attributes will be partially removed via a call to TagNode.removeAV(...) and replaced.
For Example:
// This line will retrieve an array "index-pointer" to every HTML Paragraph Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "p");

// This line will ensure that every HTML Paragraph Element that was found on the HTML
// page in the previous line of code - shall have a CSS class='MyClass' key-value
// Inner-Tag. The returned array will contain a list of pointers to HTML Paragraph
// Elements that were changed.
int[] changedPosArr = Attributes.update
    (htmlPage, AUM.Set, posArr, "class", (i, tn) -> "MyClass", SD.SingleQuote);
innerTag
- This is the name of the HTML attribute that needs to be changed, added, or removed.
newITValueStrGetter
- This function accepts the Vector-index and the TagNode located at that index, and is expected to provide the new value. Depending on which AUM-Mode was selected, this new value will be assigned, appended or used to replace the current value (if there is one) of the Attribute whose name equals the passed parameter 'innerTag'.
The actual action that is taken using the String returned by this function is decided by the AUM mode. If AUM.RemoveSubString were chosen, then all HTML Elements identified by 'posArr' would have the first sub-string-copy of the String-value returned by user-function 'newITValueStrGetter' removed from the Attribute-value of any TagNode's that actually possessed such an Attribute named 'innerTag'.
quote
- The programmer is expected to select either SD.Single-Quote or SD.Double-Quote.
The updated Attribute / Inner-Tag key-value pairs will be surrounded by the selected quote. Always remember that the class 'TagNode' checks for quotes-within-quotes, and will throw an exception if a double-quoted attribute-value also contains a double-quote, or vice-versa (a single-quote within a single-quoted attribute-value).
- Returns:
- This method shall return an integer-array index-list whose values identify which HTML Vector Elements were changed as a result of this method invocation.
NOTE: One minor subtlety: there could be cases where a new HTML Element 'TagNode' reference / object was instantiated or 'created,' even though the actual String that comprised the HTMLNode itself was identical to the original HTMLNode.str String. In the 'AUM' enumerated-type, when AUM.Set is invoked, the original String data for an attribute is always clobbered, even in cases where an identical version of the String is replaced or substituted.
- Throws:
QuotesException
- If there are "quotes within quotes" problems when invoking the TagNode constructor, this exception will throw. The problem occurs when one or more of the Attribute Key-Value Pairs have a quotation-choice such that the chosen quotation-mark is also found within the Attribute-Value. QuotesException will also throw in the case that an Attribute Key-Value Pair has elected to use the "No Quotes" option, but the attribute-value contains white-space.
InnerTagKeyException
- This exception will throw if a non-standard String-value is passed to parameter String 'innerTag'. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.
TagNodeExpectedException
- This exception shall throw if an identified Vector-index must point-to an instance of TagNode, but that index instead holds some other HTMLNode instance (either CommentNode or TextNode). If an integer-position array (int[] posArr) is passed, but that array has an index pointing-to something besides a TagNode, then this exception will be thrown.
OpeningTagNodeExpectedException
- When a Vector position-index holds an instance of TagNode, but that TagNode is one in which its isClosing-Field is set to TRUE, then this exception shall throw.
When passing int[]-Array parameter 'posArr', that array should contain a list of Vector-indices. The code which checks for this exception checks to ensure that each of the locations in that array point to Opening TagNode's, and if or when they don't, this exception throws.
java.lang.ArrayIndexOutOfBoundsException
- If any of the elements in 'posArr' contain index-pointers that are out of range of Vector-parameter 'html', then java will, naturally, throw this exception.
- See Also:
AUM.update(TagNode, String, String, SD), TagNode.isTagNode(), TagNode.isClosing
- Code:
- Exact Method Body:
return Update.update(html, mode, posArr, innerTag, newITValueStrGetter, quote);
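The two QuotesException conditions described above (the chosen quotation-mark appears inside the value, or the "No Quotes" option is used with a value containing white-space) amount to a simple check. The sketch below models that check in isolation. The enum Quote and the method isSafe are hypothetical names, and the real TagNode constructor may enforce further rules not shown here.

```java
final class QuoteSketch {

    // Hypothetical quotation choices (NOT the library's SD enum).
    enum Quote { SINGLE, DOUBLE, NONE }

    // Returns true when 'value' can be wrapped with the chosen quote without ambiguity.
    static boolean isSafe(String value, Quote quote) {
        switch (quote) {
            case SINGLE:
                return value.indexOf('\'') < 0;   // no single-quote inside single-quotes
            case DOUBLE:
                return value.indexOf('"') < 0;    // no double-quote inside double-quotes
            case NONE:
                // an unquoted value must not contain any white-space
                return value.chars().noneMatch(Character::isWhitespace);
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        String value = "say \"hi\"";
        System.out.println(isSafe(value, Quote.DOUBLE));
        System.out.println(isSafe(value, Quote.SINGLE));
    }
}
```

Running such a check before building the new value lets calling code pick the other quote style instead of handling an exception afterwards.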
-
removeAll
public static int[] removeAll(java.util.Vector<? super TagNode> html, int sPos, int ePos)
The purpose of this method is to remove all attributes / Inner-Tag key-value pairs from each and every non-'TextNode' and non-'CommentNode' HTML Element found on the vectorized-html page parameter 'html'. The removal process is limited to the range specified by method-parameters sPos, ePos.
Attribute Removal Specifics:
This method will remove each and every class=... id=... src=... alt=... href=... onclick=... etc... attribute from all TagNode-instances whose Vector-index location inside 'html' falls between 'sPos' and 'ePos'.
When this method exits, all TagNode instances inside 'html' that fall within the specified sub-range will be attribute-free.
Range-Restriction - sPos, ePos:
This method restricts the removal process to the specified subrange sPos ... ePos for the Vector-parameter 'html'.
Example:
// Retrieve the contents from the foreign news source "https://www.gov.cn" - pick an article
URL url = new URL("http://www.gov.cn/premier/2020-xx/xx/content_55267.htm");
Vector<HTMLNode> news = HTMLPage.getPageTokens(url, false);

// Now retrieve the "article body"
Vector<HTMLNode> body = InnerTagGetInclusive.first
    (news, "div", "class", TextComparitor.C, "article");

// To view a "pared down" version - with all CSS class, id information removed - call this
// method, and only the raw HTML tags will remain... <P>, <DIV>, <B>... etc.
// Passing 0 and -1 means the 'entire-page' is processed.
Attributes.removeAll(body, 0, -1);

// Print the updated "article body" Vector using Util.pageToString. It should be MUCH EASIER
// to read.
//
// AGAIN: all long-winded "class" "ID" and other common HTML clutter has been removed.
System.out.println(Util.pageToString(body));
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that a Vector<TagNode> or a Vector<HTMLNode> are both accepted by this parameter. They will not cause an exception throw.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
sPos
- This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
ePos
- This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'exclusive', meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
- An integer array of 'Vector'-index positions / locations for each and every HTML 'TagNode' whose attributes have been removed.
- Throws:
java.lang.IndexOutOfBoundsException
- This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
TagNode.removeAllAV(), TagNode.isTagNode(), TagNode.isClosing, LV
- Code:
- Exact Method Body:
return RemoveAll.removeAll(html, sPos, ePos);
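The sPos, ePos rules listed under Throws can be read as a small normalization step. The sketch below is one direct transcription of those three bullet points, written as a standalone helper. The class and method names are invented for illustration, and this is not the library's own range-checking code (which the See Also list suggests lives in class LV).

```java
import java.util.Arrays;

final class RangeSketch {

    // Returns { sPos, ePos } with a negative ePos replaced by 'size'.
    // Throws IndexOutOfBoundsException under the three documented conditions.
    static int[] normalize(int size, int sPos, int ePos) {
        if (ePos < 0) ePos = size;   // negative ePos means "to the end of the Vector"

        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);

        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);

        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos " + sPos + " exceeds ePos " + ePos);

        return new int[] { sPos, ePos };
    }

    public static void main(String[] args) {
        // Whole-page request on a 10-element Vector: sPos = 0, ePos = -1.
        System.out.println(Arrays.toString(normalize(10, 0, -1)));
    }
}
```

Under these rules an empty Vector always fails the first check, since sPos = 0 is already greater-than-or-equal-to a size of zero.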
-
removeAll
public static int[] removeAll(java.util.Vector<? super TagNode> html, int[] posArr)
The purpose of this method is to remove all attributes / Inner-Tag key-value pairs from each and every non-'TextNode' and non-'CommentNode' HTML Element found on the vectorized-html page parameter 'html'. The removal process is limited to removing attributes only from elements pointed to by the contents of passed-parameter 'posArr'.
Attribute Removal Specifics:
This method will remove each and every class=... id=... src=... alt=... href=... onclick=... etc... attribute from all TagNode-instances whose Vector-index location within 'html' is among the indices listed by the index-list int[]-Array 'posArr'.
When this method exits, all TagNode instances inside 'html' specified by 'posArr' will be attribute-free.
Range-Restriction - int[] posArr:
This method restricts the removal process to only nodes specified by the Vector-index int[]-Array parameter 'posArr'.
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that a Vector<TagNode> or a Vector<HTMLNode> are both accepted by this parameter. They will not cause an exception throw.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
posArr
- This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch 'Find' classes, and are simply lists of index-pointers into a Vectorized HTML Web-Page Vector. The int[] array passed to this parameter will specify the TagNode's in the Vector whose attributes will be partially removed via a call to TagNode.removeAV(...) and replaced.
For Example:
// This line will retrieve an array "index-pointer" to every HTML Paragraph Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "p");

// This line will remove every attribute key-value pair from every HTML Paragraph
// Element on the vectorized-html page 'htmlPage'
// The returned array will contain a list of pointers to HTML Paragraph Elements that
// were changed. Paragraph Elements that were already empty of Inner-Tag key-value pairs
// will not have a pointer in this index-array.
int[] changedPosArr = Attributes.removeAll(htmlPage, posArr);
- Returns:
- An integer array of 'Vector'-index positions / locations for each and every HTML 'TagNode' whose attributes have been removed.
- Throws:
java.lang.ArrayIndexOutOfBoundsException
- If any of the elements in 'posArr' contain index-pointers that are out of range of Vector-parameter 'html', then java will, naturally, throw this exception.
OpeningTagNodeExpectedException
- When a Vector position-index holds an instance of TagNode, but that TagNode is one in which its isClosing-Field is set to TRUE, then this exception shall throw.
When passing int[]-Array parameter 'posArr', that array should contain a list of Vector-indices. The code which checks for this exception checks to ensure that each of the locations in that array point to Opening TagNode's, and if or when they don't, this exception throws.
TagNodeExpectedException
- This exception shall throw if an identified Vector-index must point-to an instance of TagNode, but that index instead holds some other HTMLNode instance (either CommentNode or TextNode). If an integer-position array (int[] posArr) is passed, but that array has an index pointing-to something besides a TagNode, then this exception will be thrown.
- See Also:
TagNode.removeAllAV(), TagNode.isTagNode(), TagNode.isClosing
- Code:
- Exact Method Body:
return RemoveAll.removeAll(html, posArr);
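The three posArr failure cases under Throws can be pictured as one validation pass over the index array. In the sketch below, Tag is a hypothetical stand-in for TagNode, and the standard exceptions IllegalArgumentException and IllegalStateException stand in for TagNodeExpectedException and OpeningTagNodeExpectedException respectively. The out-of-range case relies on java.util.Vector.get, which throws ArrayIndexOutOfBoundsException on its own.

```java
import java.util.Vector;

final class PosArrSketch {

    // Hypothetical stand-in for a tag node (NOT the library's TagNode class).
    static final class Tag {
        final String name;
        final boolean isClosing;

        Tag(String name, boolean isClosing) {
            this.name = name;
            this.isClosing = isClosing;
        }
    }

    // Verifies that every index in 'posArr' points at an opening tag.
    static void check(Vector<Object> html, int[] posArr) {
        for (int pos : posArr) {
            // Vector.get throws ArrayIndexOutOfBoundsException for an out-of-range index
            Object node = html.get(pos);

            // stands in for TagNodeExpectedException
            if (!(node instanceof Tag))
                throw new IllegalArgumentException("index " + pos + " does not hold a tag");

            // stands in for OpeningTagNodeExpectedException
            if (((Tag) node).isClosing)
                throw new IllegalStateException("index " + pos + " holds a closing tag");
        }
    }

    public static void main(String[] args) {
        Vector<Object> html = new Vector<>();
        html.add(new Tag("p", false));
        html.add("Hello");
        html.add(new Tag("p", true));
        check(html, new int[] { 0 });
        System.out.println("index 0 accepted");
    }
}
```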
-
removeData
public static int[] removeData(java.util.Vector<? super TagNode> html)
-
removeData
public static int[] removeData(java.util.Vector<? super TagNode> html, DotPair dp)
-
removeData
public static int[] removeData(java.util.Vector<? super TagNode> html, int sPos, int ePos)
The purpose of this method is to remove all HTML data-attribute key-value pairs from 'TagNode' Elements contained inside parameter 'html'.
Range-Restriction - sPos, ePos:
This method restricts the removal process to the specified subrange sPos ... ePos for the Vector-parameter 'html'.
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that a Vector<TagNode> or a Vector<HTMLNode> are both accepted by this parameter. They will not cause an exception throw.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
sPos
- This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
ePos
- This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'exclusive', meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
- A Java int[]-Array whose integer-elements are pointers into the input Vector-parameter 'html'. This array of Vector-indices will always point to TagNode-instances, and specifically to TagNode's that were modified during this method's processing.
When this method has completed, each modified element of 'html' will have been replaced by a new TagNode in which all data-attributes have been removed.
NOTE: It is altogether possible that not all of the TagNode's within the range sPos ... ePos will actually be modified by this method! In such cases, the returned int[]-Array will list only those TagNode's that were changed, and will omit the TagNode's that remained unchanged.
- Throws:
java.lang.IndexOutOfBoundsException
- This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
TagNode.removeDataAttributes(), TagNode.isTagNode(), TagNode.isClosing, LV
- Code:
- Exact Method Body:
return RemoveData.removeData(html, sPos, ePos);
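The Returns note above says that only modified nodes are reported. A minimal sketch of that behavior follows, using plain attribute maps in place of TagNode's. The names withoutData and removeData here belong to the sketch class only, and the rule "any attribute whose name starts with data-" is a simplifying assumption about what counts as a data-attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class DataAttrSketch {

    // Returns a copy of 'attrs' without any key that starts with "data-".
    static Map<String, String> withoutData(Map<String, String> attrs) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attrs.entrySet())
            if (!e.getKey().toLowerCase(Locale.ROOT).startsWith("data-"))
                out.put(e.getKey(), e.getValue());
        return out;
    }

    // Replaces each attribute map that held data-attributes; returns the indices replaced.
    static int[] removeData(List<Map<String, String>> tags) {
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < tags.size(); i++) {
            Map<String, String> stripped = withoutData(tags.get(i));
            if (stripped.size() != tags.get(i).size()) {   // something was actually removed
                tags.set(i, stripped);
                changed.add(i);
            }
        }
        return changed.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        Map<String, String> img = new LinkedHashMap<>();
        img.put("src", "a.png");
        img.put("data-id", "17");
        Map<String, String> plain = new LinkedHashMap<>();
        plain.put("alt", "b");

        List<Map<String, String>> tags = new ArrayList<>(Arrays.asList(img, plain));
        System.out.println(Arrays.toString(removeData(tags)));
    }
}
```

The second map has no data-attributes, so its index is left out of the returned array.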
-
removeData
public static int[] removeData(java.util.Vector<? super TagNode> html, int[] posArr)
The purpose of this method is to remove all HTML data-attribute key-value pairs from 'TagNode' Elements contained inside parameter 'html'.
Range-Restriction - int[] posArr:
This method restricts the removal process to only nodes specified by the Vector-index int[]-Array parameter 'posArr'.
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression'? super TagNode'
means that aVector<TagNode>
or aVector<HTMLNode>
are both accepted by this parameter. They will not cause an exception throw.
Note that if aVector<Object>
is passed, and there are no instances ofclass TagNode
contained by that Vector, then this method will simply exit gracefully.posArr
- This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch'Find'
classes, and are simply lists of index-pointers into a Vectorized HTML Web-PageVector
. Theint[]
array passed to this parameter will specify theTagNode's
in theVector
whose attributes will be partially removed via a call toTagNode.removeAV(...)
and replaced.
For Example:
// This line will retrieve an array "index-pointer" to every HTML Image Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "img");

// This line will remove every "data-attribute" key-value pair from every HTML Image
// Element on the vectorized-html page 'htmlPage'.
// The returned array will contain a list of pointers to HTML Image Elements that
// were changed. Image Elements that did not have "data-" HTML Inner-Tags
// will not have a pointer in this index-array.
int[] changedPosArr = Attributes.removeData(htmlPage, posArr);
- Returns:
- A Java
int[]
-Array whose integer-elements are pointers into the inputVector
-parameter'html'
. This array ofVector
-indices will always point toTagNode
-instances, and specifically toTagNode's
that were modified during this method's processing.
When this method has completed, each modified element of'html'
will have been replaced by a newTagNode
in which all data-attributes have been removed.
NOTE: It is altogether possible that the nodes listed by the parameter'posArr'
, will actually not all be modified by this method! In such cases, it is (hopefully) obvious that the returnedint[]
-Array will be shorter than the supplied'posArr'
by the exact number ofTagNode's
that remained unchanged. - Throws:
java.lang.ArrayIndexOutOfBoundsException
- If any of the elements in'posArr'
contain index-pointers that are out of range ofVector
-parameter'html'
, then java will, naturally, throw this exception.OpeningTagNodeExpectedException
- When aVector
position-index holds an instance ofTagNode
, but thatTagNode
is one in which itsisClosing
-Field is set toTRUE
, then this exception shall throw.
When passingint[]
-Array parameter'posArr'
, that array should contain a list ofVector
-indices. The code which checks for this exception checks to ensure that each of the locations in that array point to Opening TagNode's, and if or when they don't, this exception throws.TagNodeExpectedException
- This exception shall throw if an identifiedVector
-index must point-to an instance ofTagNode
, but that index instead holds some otherHTMLNode
instance (eitherCommentNode
orTextNode
). If an integer-position array (int[] posArr
) is passed, but that array has an index pointing-to - something besides aTagNode
- then this exception will be thrown.- See Also:
TagNode.removeDataAttributes()
,TagNode.isTagNode()
,TagNode.isClosing
- Code:
- Exact Method Body:
return RemoveData.removeData(html, posArr);
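Why the returned array can be shorter than `posArr` is easiest to see on a simplified model. The sketch below is an assumption-laden stand-in, not the library's implementation: each "tag" is modeled as a mutable `Map` of attribute names to values, the class `RemoveDataSketch` is hypothetical, and the prefix test is case-sensitive. The real method replaces `TagNode` instances rather than mutating them in place.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in model: each "tag" is just a mutable Map of attribute names to values.
final class RemoveDataSketch
{
    // Removes every attribute whose name starts with "data-" from the maps at the
    // listed positions, and returns only the positions that actually changed.
    static int[] removeData(List<Map<String, String>> page, int[] posArr)
    {
        List<Integer> changed = new ArrayList<>();

        for (int pos : posArr)
            // removeIf returns true only when at least one key was removed
            if (page.get(pos).keySet().removeIf(name -> name.startsWith("data-")))
                changed.add(pos);

        return changed.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args)
    {
        List<Map<String, String>> page = new ArrayList<>();
        page.add(new HashMap<>(Map.of("src", "a.png", "data-id", "7")));
        page.add(new HashMap<>(Map.of("src", "b.png")));

        int[] changed = removeData(page, new int[] { 0, 1 });
        System.out.println(changed.length);   // only the first map held a "data-" key
    }
}
```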
-
remove
-
remove
-
remove
public static int[] remove(java.util.Vector<? super TagNode> html, int sPos, int ePos, java.lang.String... innerTags)
This will remove all copies of the attributes whose names are listed in the String[]
array parameter'innerTags'
from the vectorized-html web-page parameter'html'
.
Range-Restriction -sPos, ePos
:
This method restricts the removal process to the specified subrangesPos ... ePos
for theVector
-parameter'html'
.- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression'? super TagNode'
means that aVector<TagNode>
or aVector<HTMLNode>
are both accepted by this parameter. They will not cause an exception throw.
Note that if aVector<Object>
is passed, and there are no instances ofclass TagNode
contained by that Vector, then this method will simply exit gracefully.innerTags
- ThisString
, or list ofString's
, should contain valid HTML Element inner-tag names that need to be removed.
Any instances of these Black-Listed Attributes that are found insideTagNode
instances will cause this method to extract thatTagNode
, and rebuild a new instance whereby all unwanted attributes have been removed, leaving only the attributes that weren't mentioned by the Var-Args Array.
AGAIN: This method shall only modifyTagNode's
if theirVector
-index locations in'html'
fall betweensPos
(inclusively) andePos
(exclusively).sPos
- This is the (integer)Vector
-index that sets a limit for the left-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'inclusive' meaning that theHTMLNode
at thisVector
-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector
, an exception will be thrown.ePos
- This is the (integer)Vector
-index that sets a limit for the right-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'exclusive' meaning that the'HTMLNode'
at thisVector
-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector
-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter,'ePos'
, will cause its value to be reset to the size of the inputVector
-parameter.- Returns:
- A Java
int[]
-Array whose integer-elements are pointers into the inputVector
-parameter'html'
. This array ofVector
-indices will always point toTagNode
-instances, and specifically toTagNode's
that were modified during this method's processing.
When this method has completed, each modified element of'html'
will have been replaced by a newTagNode
in which all key-value pairs named (black-listed) by 'innerTags'
have been removed. - Throws:
InnerTagKeyException
- This exception will throw if a non-standardString
-value is passed to Var-Args parameter 'innerTags'
. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.java.lang.IndexOutOfBoundsException
- This exception shall be thrown if any of the following are true:- If
'sPos'
is negative, or ifsPos
is greater-than-or-equal-to thesize
of theVector
- If
'ePos'
is zero, or greater than the size of theVector
- If the value of
'sPos'
is a larger integer than'ePos'
. If'ePos'
was negative, it is first reset toVector.size()
, before this check is done.
java.lang.IllegalArgumentException
- If parameter'innerTags'
has zero elements.- See Also:
TagNode.removeAttributes(String[])
,LV
,TagNode.hasOR(boolean, String[])
,TagNode.isTagNode()
,TagNode.isClosing
,InnerTagKeyException.check(String[])
- Code:
- Exact Method Body:
return Remove.remove(html, sPos, ePos, innerTags);
-
remove
public static int[] remove(java.util.Vector<? super TagNode> html, int[] posArr, java.lang.String... innerTags)
This will remove all copies of the attributes whose names are listed in the String[]
array parameter'innerTags'
from the vectorized-html web-page parameter'html'
.
Range-Restriction -int[] posArr
:
This method restricts the removal process to only nodes specified by theVector
-indexint[]
-Array parameter'posArr'
.- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression'? super TagNode'
means that aVector<TagNode>
or aVector<HTMLNode>
are both accepted by this parameter. They will not cause an exception throw.
Note that if aVector<Object>
is passed, and there are no instances ofclass TagNode
contained by that Vector, then this method will simply exit gracefully.innerTags
- ThisString
, or list ofString's
, should contain valid HTML Element inner-tag names that need to be removed.
Any instances of these Black-Listed Attributes that are found insideTagNode
instances will cause this method to extract thatTagNode
, and rebuild a new instance whereby all unwanted attributes have been removed, leaving only the attributes that weren't mentioned by the Var-Args Array.
AGAIN: This method shall only modifyTagNode's
if theirVector
-index locations in'html'
are listed in'posArr'
.posArr
- This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch'Find'
classes, and are simply lists of index-pointers into a Vectorized HTML Web-PageVector
. Theint[]
array passed to this parameter will specify theTagNode's
in theVector
whose attributes will be partially removed via a call toTagNode.removeAV(...)
and replaced.
For Example:
// This line will retrieve an array "index-pointer" to every HTML Paragraph Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "p");

// This line will remove attribute key-value pairs for 'class' and 'id' from every HTML
// Paragraph Element on the vectorized-html page 'htmlPage'. The returned array will
// contain a list of pointers to HTML Paragraph Elements that were changed. Paragraph
// Elements that contained neither a 'class' nor an 'id' inner-tag will not have a pointer
// in the returned index-array, and therefore will not have been modified.
int[] changedPosArr = Attributes.remove(htmlPage, posArr, "class", "id");
- Returns:
- A Java
int[]
-Array whose integer-elements are pointers into the inputVector
-parameter'html'
. This array ofVector
-indices will always point toTagNode
-instances, and specifically toTagNode's
that were modified during this method's processing.
When this method has completed, each modified element of'html'
will have been replaced by a newTagNode
in which all key-value pairs named (black-listed) by 'innerTags'
have been removed.
NOTE: It is altogether possible that the nodes listed by the parameter'posArr'
, will actually not all be modified by this method! In such cases, it is (hopefully) obvious that the returnedint[]
-Array will be shorter than the supplied'posArr'
by the exact number ofTagNode's
that remained unchanged. - Throws:
InnerTagKeyException
- This exception will throw if a non-standardString
-value is passed to Var-Args parameter 'innerTags'
. HTML expects that an attribute-name conform to a set of rules in order to be processed by a browser.java.lang.ArrayIndexOutOfBoundsException
- If any of the elements in'posArr'
contain index-pointers that are out of range ofVector
-parameter'html'
, then java will, naturally, throw this exception.OpeningTagNodeExpectedException
- When aVector
position-index holds an instance ofTagNode
, but thatTagNode
is one in which itsisClosing
-Field is set toTRUE
, then this exception shall throw.
When passingint[]
-Array parameter'posArr'
, that array should contain a list ofVector
-indices. The code which checks for this exception checks to ensure that each of the locations in that array point to Opening TagNode's, and if or when they don't, this exception throws.TagNodeExpectedException
- This exception shall throw if an identifiedVector
-index must point-to an instance ofTagNode
, but that index instead holds some otherHTMLNode
instance (eitherCommentNode
orTextNode
). If an integer-position array (int[] posArr
) is passed, but that array has an index pointing-to - something besides aTagNode
- then this exception will be thrown.java.lang.IllegalArgumentException
- If parameter'innerTags'
has zero elements.- See Also:
TagNode.removeAttributes(String[])
,TagNode.hasOR(boolean, String[])
,TagNode.isTagNode()
,TagNode.isClosing
,InnerTagKeyException.check(String[])
- Code:
- Exact Method Body:
return Remove.remove(html, posArr, innerTags);
-
retrieve
-
retrieve
-
retrieve
public static Ret2<int[],java.lang.String[]> retrieve (java.util.Vector<? super TagNode> html, int sPos, int ePos, java.lang.String attribute)
The purpose of this method is to retrieve the value of each attribute in eachTagNode
in an HTMLVector
(or sub-Vector
) that contained such an attribute.
Range-Restriction -sPos, ePos
:
This method restricts the retrieval process to the specified subrangesPos ... ePos
for theVector
-parameter'html'
.- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode'
means that a Vector<TagNode>
or a Vector<HTMLNode>
are both accepted by this parameter without causing an exception throw.
Vectors of these types are often returned as search results from the classes in the 'NodeSearch'
package.sPos
- This is the (integer)Vector
-index that sets a limit for the left-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'inclusive' meaning that theHTMLNode
at thisVector
-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector
, an exception will be thrown.ePos
- This is the (integer)Vector
-index that sets a limit for the right-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'exclusive' meaning that the'HTMLNode'
at thisVector
-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector
-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter,'ePos'
, will cause its value to be reset to the size of the inputVector
-parameter.attribute
- This is the name of the HTML attribute whose value is being retrieved. Each instance of TagNode
in the input'html'
parameterVector
shall be searched for this attribute-name. If this attribute is present in any of theTagNode's
in HTML, then thatTagNode's
location (Vector
-index) shall be returned in theint[]
position array, and the value of that attribute shall be returned in theString[]
array.- Returns:
- An instance of
Ret2
where the tworeturn
-fields are as follows:-
Ret2.a (int[])
This is an integer-array int[]
containing the indices of each instance ofTagNode
that contained a non-null attribute matching parameter'attribute'
.
-
Ret2.b (String[])
This is a String-array String[]
containing the values of the attributes in theTagNode's
that contained the named'attribute'
.
NOTE: Thesearrays
should be considered parallelarrays
.
ALSO: This method shall never return null. If there are no matches, an instance of Ret2
shall be returned, containing zero lengtharrays
. -
- Throws:
InnerTagKeyException
- If the attribute name passed to this parameter does not contain the name of a valid HTML5 attribute, then this exception shall throw.java.lang.IndexOutOfBoundsException
- This exception shall be thrown if any of the following are true:- If
'sPos'
is negative, or ifsPos
is greater-than-or-equal-to thesize
of theVector
- If
'ePos'
is zero, or greater than the size of theVector
- If the value of
'sPos'
is a larger integer than'ePos'
. If'ePos'
was negative, it is first reset toVector.size()
, before this check is done.
- See Also:
TagNode.AV(String)
,TagNode.isTagNode()
,TagNode.isClosing
,InnerTagKeyException.check(String[])
,LV
- Code:
- Exact Method Body:
return Retrieve.retrieve(html, sPos, ePos, attribute);
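The parallel-array contract of the returned `Ret2` can be illustrated on a simplified model. The following is a sketch under stated assumptions rather than the library's code: tags are modeled as `Map` instances (with `null` standing in for non-tag nodes), and `RetrieveSketch` and its nested `Pair` class are hypothetical stand-ins for the real method and for `Ret2<int[], String[]>`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class RetrieveSketch
{
    // Hypothetical stand-in for Ret2<int[], String[]>: 'a' holds positions, 'b' holds values.
    static final class Pair
    {
        final int[] a;
        final String[] b;
        Pair(int[] a, String[] b) { this.a = a; this.b = b; }
    }

    // Scans [sPos, ePos) and records each position whose map has a value for 'attribute'.
    static Pair retrieve(List<Map<String, String>> page, int sPos, int ePos, String attribute)
    {
        if (ePos < 0) ePos = page.size();   // negative ePos means "to the end"

        List<Integer> positions = new ArrayList<>();
        List<String> values = new ArrayList<>();

        for (int i = sPos; i < ePos; i++)
        {
            Map<String, String> attrs = page.get(i);
            if (attrs == null) continue;            // models a text or comment node
            String value = attrs.get(attribute);
            if (value != null) { positions.add(i); values.add(value); }
        }

        return new Pair(
            positions.stream().mapToInt(Integer::intValue).toArray(),
            values.toArray(new String[0]));
    }

    public static void main(String[] args)
    {
        List<Map<String, String>> page = Arrays.asList(
            Map.of("href", "a"), null, Map.of("id", "x"), Map.of("href", "b"));

        Pair r = retrieve(page, 0, -1, "href");
        System.out.println(Arrays.toString(r.a) + " " + Arrays.toString(r.b));
    }
}
```

Because both lists grow together inside the same branch, `a[i]` and `b[i]` always describe the same node, and a page with no matches yields two zero-length arrays rather than `null`.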
-
retrieve
public static java.lang.String[] retrieve (java.util.Vector<? super TagNode> html, int[] posArr, java.lang.String attribute)
This shall visit eachTagNode
indicated by theint[]
-Array parameter'posArr'
, and then query those TagNode's
for the Attribute-value of the attribute named byString
-Parameter'attribute'
. The value of each of these attributes will be recorded to a parallel String
-array and returned. ThisString[]
array shall be parallel to the inputVector
-index'posArr'
parameter.
Range-Restriction -int[] posArr
:
This method restricts the retrieval process to only nodes specified by theVector
-indexint[]
-Array parameter'posArr'
.- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode'
means that a Vector<TagNode>
or a Vector<HTMLNode>
are both accepted by this parameter without causing an exception throw.
Vectors of these types are often returned as search results from the classes in the 'NodeSearch'
package.posArr
- This shall be a list ofVector
-indices that contain openingTagNode
elements. The value of the attribute provided by parameter'attribute'
will be returned in a parallelString[]
array for eachTagNode
identified by'posArr'
.attribute
- This is the name of the HTML attribute that is being retrieved. EachTagNode
element at the locations specified by input parameter'posArr'
shall be searched for this attribute (name), and the value of that attribute shall be placed in the returnedString[]
array.
If any of theTagNode
instances listed by theVector
-index array do not have that attribute, then a 'null' shall be placed in the returnedString[]
array at the index-location parallel to its position in'posArr'
- Returns:
- This returns a
String[]
array that shall be parallel to the input-parameterint[] posArr
. Each location in thisString
-array shall correspond to the attribute-value returned by a call toTagNode.AV(String)
on theTagNode
that is located at theVector
-index identified by the value at'posArr'
. - Throws:
InnerTagKeyException
- If theString
provided to parameter'attribute'
is not a valid HTML-5 attribute-name, then this exception shall throw.java.lang.ArrayIndexOutOfBoundsException
- If any of the elements in'posArr'
contain index-pointers that are out of range ofVector
-parameter'html'
, then java will, naturally, throw this exception.OpeningTagNodeExpectedException
- When aVector
position-index holds an instance ofTagNode
, but thatTagNode
is one in which itsisClosing
-Field is set toTRUE
, then this exception shall throw.
When passingint[]
-Array parameter'posArr'
, that array should contain a list ofVector
-indices. The code which checks for this exception checks to ensure that each of the locations in that array point to Opening TagNode's, and if or when they don't, this exception throws.TagNodeExpectedException
- This exception shall throw if an identifiedVector
-index must point-to an instance ofTagNode
, but that index instead holds some otherHTMLNode
instance (eitherCommentNode
orTextNode
). If an integer-position array (int[] posArr
) is passed, but that array has an index pointing-to - something besides aTagNode
- then this exception will be thrown.- See Also:
InnerTagKeyException.check(String[])
,TagNode.AV(String)
,TagNode.isTagNode()
,TagNode.isClosing
- Code:
- Exact Method Body:
return Retrieve.retrieve(html, posArr, attribute);
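Unlike the range-based overload, this variant returns an array of the same length as `posArr`, with `null` marking the nodes that lack the attribute. A minimal sketch of that contract follows, again on a simplified model in which each tag is a `Map`; the class `RetrieveByPosSketch` is a hypothetical name and the code is not taken from the library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class RetrieveByPosSketch
{
    // Returns one entry per element of 'posArr'; an entry is null when the
    // map at that position has no value for 'attribute'.
    static String[] retrieve(List<Map<String, String>> page, int[] posArr, String attribute)
    {
        String[] ret = new String[posArr.length];

        for (int i = 0; i < posArr.length; i++)
            ret[i] = page.get(posArr[i]).get(attribute);

        return ret;
    }

    public static void main(String[] args)
    {
        List<Map<String, String>> page = Arrays.asList(
            Map.of("href", "a"), Map.of("id", "x"), Map.of("href", "b"));

        // Positions 0 and 1 are queried; only position 0 has an 'href'.
        System.out.println(Arrays.toString(retrieve(page, new int[] { 0, 1 }, "href")));
    }
}
```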
-
update
public static int[] update(java.util.Vector<? super TagNode> html, Attributes.Filter f)
-
update
public static int[] update(java.util.Vector<? super TagNode> html, DotPair dp, Attributes.Filter f)
-
update
public static int[] update(java.util.Vector<? super TagNode> html, int sPos, int ePos, Attributes.Filter f)
Modifies the contents of each instance of a'TC.OpeningTags'
element found in the inputVector
. The type of update that's performed is defined by the parameterFilter 'f'
. Each time aTagNode
found in the input vectorized-html web-page, or html sub-list, is changed or modified, the original TagNode
will be removed and replaced by a new, modifiedTagNode
instance.
Range-Restriction -sPos, ePos
:
This method restricts the filtering process to the specified subrangesPos ... ePos
for theVector
-parameter'html'
.- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression'? super TagNode'
means that aVector<TagNode>
or aVector<HTMLNode>
are both accepted by this parameter. They will not cause an exception throw.
Note that if aVector<Object>
is passed, and there are no instances ofclass TagNode
contained by that Vector, then this method will simply exit gracefully.f
- This is a'functional-interface'
instance. It may be implemented by a lambda expression, or with an assignment to a (C Styled) function pointer. This interface is defined here inclass 'Attributes'
. It needs to implement a method that receives aString
and ajava.util.Properties
, and removes all attribute key-value pairs that need to be removed from thoseProperties
. If any changes have been made to theProperties
, this must be indicated by returning TRUE as the result of this method.
By implementing an instance of'Filter'
, a programmer may selectively choose which attributes to modify in each and every TagNode
of a web-page, or sub-list, using a single lambda-expression.
AGAIN: This method shall only modifyTagNode's
if theirVector
-index locations in'html'
fall betweensPos
(inclusively) andePos
(exclusively).sPos
- This is the (integer)Vector
-index that sets a limit for the left-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'inclusive' meaning that theHTMLNode
at thisVector
-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector
, an exception will be thrown.ePos
- This is the (integer)Vector
-index that sets a limit for the right-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'exclusive' meaning that the'HTMLNode'
at thisVector
-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector
-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter,'ePos'
, will cause its value to be reset to the size of the inputVector
-parameter.- Returns:
- A Java
int[]
-Array whose integer-elements are pointers into the inputVector
-parameter'html'
. This array ofVector
-indices will always point toTagNode
-instances, and specifically toTagNode's
that were modified during this method's processing.
When this method has completed, each modified element of'html'
will have been replaced by a newTagNode
built from the attributes updated by Filter-parameter 'f'
. - Throws:
InnerTagKeyException
- TheTagNode
constructor that is used here when replacingTagNode
instances will automatically check each attribute that is being inserted into theTagNode
. If the user has added inner-tags whose names do not meet the requirements of the inner-tag naming conventions, this exception will throw.java.lang.IndexOutOfBoundsException
- This exception shall be thrown if any of the following are true:- If
'sPos'
is negative, or ifsPos
is greater-than-or-equal-to thesize
of theVector
- If
'ePos'
is zero, or greater than the size of theVector
- If the value of
'sPos'
is a larger integer than'ePos'
. If'ePos'
was negative, it is first reset toVector.size()
, before this check is done.
QuotesException
- If there are "quotes within quotes" problems when invoking theTagNode
constructor, this exception will throw. The problem occurs when one or more of the Attribute Key-Value Pairs have a quotation-choice such that the chosen quotation-mark is also found within the Attribute-Value.QuotesException
will also throw in the case that an Attribute Key-Value Pair has elected to use the "No Quotes" option, but the attribute-value contains white-space.- See Also:
TagNode.allAV(boolean, boolean)
,TagNode.isTagNode()
,TagNode.isClosing
,LV
- Code:
- Exact Method Body:
return UpdateWithFilter.update(html, sPos, ePos, f);
-
update
public static int[] update(java.util.Vector<? super TagNode> html, int[] posArr, Attributes.Filter f)
Filters the contents of each instance of a'TC.OpeningTags'
element in the inputVector
. The type of filter performed is defined by the parameterFilter 'f'
. Each time aTagNode
in the input vectorized-html web-page, or html sub-list, is changed or modified, the original TagNode
will be removed and replaced by a new, updated or modifiedTagNode
instance.
Range-Restriction -int[] posArr
:
This method restricts the filtering process to only nodes specified by theVector
-indexint[]
-Array parameter'posArr'
.
In the example below, there are several explanations regarding use of this class.
Example:
// This line will retrieve an array "index-pointer" to every HTML Section Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "section");

// This line uses a lambda-expression to implement a simple Attributes Filter. This filter
// removes any 'class' information found in the element, and then adds a 'title' attribute
// if the TagNode does not already have a 'title' inner-tag.
//
// NOTE: This filter operation will only be applied to the TagNode's that were identified
//       by the search operation in the previous line. Specifically, only TagNode's
//       whose indices are in the integer-array 'posArr' will be checked against this
//       filter lambda expression.
//
// ALSO: This 'Counter' class simply 'counts' and returns successive integers, beginning
//       at one. It is used to 'bypass' the compiler's 'Effectively Final' rule.
//
// RETURNS: The returned array will contain a list of pointers to HTML SECTION Elements that
//          were changed. SECTION Elements that were not updated by the Attributes.Filter
//          lambda-expression will not have a pointer in this index-array.
Counter c = new Counter(1);

int[] changedPosArr = Attributes.update(htmlPage, posArr, (String htmlTag, Properties av) ->
{
    boolean ret = false;

    // 'containsKey' tests attribute names; 'contains' would test attribute values instead.
    if (av.containsKey("class")) { ret = true; av.remove("class"); }

    if (! av.containsKey("title"))
    { ret = true; av.put("title", "Article Section Page #" + c.next()); }

    return ret;
});
- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression'? super TagNode'
means that aVector<TagNode>
or aVector<HTMLNode>
are both accepted by this parameter. They will not cause an exception throw.
Note that if aVector<Object>
is passed, and there are no instances ofclass TagNode
contained by that Vector, then this method will simply exit gracefully.f
- This is a'functional-interface'
instance. It may be implemented by a lambda expression, or with an assignment to a (C Styled) function pointer. This interface is defined here inclass 'Attributes'
. It needs to implement a method that receives aString
and ajava.util.Properties
, and removes all attribute key-value pairs that need to be removed from thoseProperties
. If any changes have been made to theProperties
, this must be indicated by returning TRUE as the result of this method.
By implementing an instance of'Filter'
, a programmer may selectively choose which attributes to modify in each and every TagNode
of a web-page, or sub-list, using a single lambda-expression.
AGAIN: This method shall only modifyTagNode's
if theirVector
-index locations in'html'
are listed in'posArr'
.posArr
- This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch'Find'
classes, and are simply lists of index-pointers into a Vectorized HTML Web-PageVector
. Theint[]
array passed to this parameter will specify theTagNode's
in theVector
whose attributes will be partially removed via a call toTagNode.removeAV(...)
and replaced.- Returns:
- A Java
int[]
-Array whose integer-elements are pointers into the inputVector
-parameter'html'
. This array ofVector
-indices will always point toTagNode
-instances, and specifically toTagNode's
that were modified during this method's processing.
When this method has completed, each modified element of'html'
will have been replaced by a newTagNode
built from the attributes updated by Filter-parameter 'f'
.
NOTE: It is altogether possible that the nodes listed by the parameter'posArr'
, will actually not all be modified by this method! In such cases, it is (hopefully) obvious that the returnedint[]
-Array will be shorter than the supplied'posArr'
by the exact number ofTagNode's
that remained unchanged. - Throws:
java.lang.ArrayIndexOutOfBoundsException
- If any of the elements in'posArr'
contain index-pointers that are out of range ofVector
-parameter'html'
, then java will, naturally, throw this exception.OpeningTagNodeExpectedException
- When aVector
position-index holds an instance ofTagNode
, but thatTagNode
is one in which itsisClosing
-Field is set toTRUE
, then this exception shall throw.
When passingint[]
-Array parameter'posArr'
, that array should contain a list ofVector
-indices. The code which checks for this exception checks to ensure that each of the locations in that array point to Opening TagNode's, and if or when they don't, this exception throws.InnerTagKeyException
- TheTagNode
constructor that is used here when replacingTagNode
instances will automatically check each attribute that is being inserted into theTagNode
. If the user has added inner-tags whose names do not meet the requirements of the inner-tag naming conventions, this exception will throw.QuotesException
- If there are "quotes within quotes" problems when invoking theTagNode
constructor, this exception will throw. The problem occurs when one or more of the Attribute Key-Value Pairs have a quotation-choice such that the chosen quotation-mark is also found within the Attribute-Value.QuotesException
will also throw in the case that an Attribute Key-Value Pair has elected to use the "No Quotes" option, but the attribute-value contains white-space.TagNodeExpectedException
- This exception shall throw if an identifiedVector
-index must point-to an instance ofTagNode
, but that index instead holds some otherHTMLNode
instance (eitherCommentNode
orTextNode
). If an integer-position array (int[] posArr
) is passed, but that array has an index pointing-to - something besides aTagNode
- then this exception will be thrown.- See Also:
TagNode.allAV(boolean, boolean)
,TagNode.isTagNode()
,TagNode.isClosing
- Code:
- Exact Method Body:
return UpdateWithFilter.update(html, posArr, f);
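The contract described for parameter 'f' (receive a tag name and a `java.util.Properties`, mutate the properties, return TRUE only if something changed) can be tried in isolation. The sketch below assumes a hypothetical interface `AttrFilter` with a method named `filter`; the actual method name and signature of `Attributes.Filter` are not shown in this section, so treat these names as placeholders. Key tests use `containsKey`, because `Properties.contains`, inherited from `Hashtable`, looks for a matching value rather than a key.

```java
import java.util.Properties;

final class FilterSketch
{
    // Hypothetical stand-in for Attributes.Filter; the real method name may differ.
    @FunctionalInterface
    interface AttrFilter
    {
        boolean filter(String htmlTag, Properties av);
    }

    // Drops any 'class' attribute and adds a 'title' when none is present.
    // Returns true only when the Properties instance was actually changed.
    static final AttrFilter DROP_CLASS_ADD_TITLE = (String htmlTag, Properties av) ->
    {
        boolean changed = false;

        if (av.containsKey("class")) { av.remove("class"); changed = true; }

        if (! av.containsKey("title"))
        { av.setProperty("title", "Untitled " + htmlTag); changed = true; }

        return changed;
    };

    public static void main(String[] args)
    {
        Properties av = new Properties();
        av.setProperty("class", "wide");

        System.out.println(av.contains("class"));      // false: "class" is a key, not a value
        System.out.println(av.containsKey("class"));   // true
        System.out.println(DROP_CLASS_ADD_TITLE.filter("section", av));   // true: 'av' was changed
        System.out.println(DROP_CLASS_ADD_TITLE.filter("section", av));   // false: nothing left to change
    }
}
```

Returning an accurate changed-flag matters because, per the Returns description, only modified nodes are replaced and reported in the result array.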
-
filter
-
filter
-
filter
public static int[] filter(java.util.Vector<? super TagNode> html, int sPos, int ePos, java.lang.String... innerTagWhiteList)
Filters the contents of each instance of a'TC.OpeningTags'
element in the inputVector
using an attribute'white-list'
. All input-Vector TagNode's
that have attributes whose names are not members of the inner-tagwhite-list
will be removed, and a newTagNode
whose only attributes are members of the innerTagwhite-list
will replace the oldTagNode
.
Range-Restriction -sPos, ePos
:
This method restricts the removal process to the specified subrangesPos ... ePos
for theVector
-parameter'html'
.- Parameters:
html
- This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression'? super TagNode'
means that aVector<TagNode>
or aVector<HTMLNode>
are both accepted by this parameter. They will not cause an exception throw.
Note that if aVector<Object>
is passed, and there are no instances ofclass TagNode
contained by that Vector, then this method will simply exit gracefully.innerTagWhiteList
- This should be a list of attribute names that'white-list'
selected attributes forTagNode's
inVector
-Parameter'html'
.
The concept of an Attribute White-list means that any attribute inside a TagNode
whose name is not on the list shall be removed when theTagNode
is reinstantiated.
AGAIN: This method shall only modifyTagNode's
if theirVector
-index locations in'html'
fall betweensPos
(inclusively) andePos
(exclusively).sPos
- This is the (integer)Vector
-index that sets a limit for the left-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'inclusive' meaning that theHTMLNode
at thisVector
-index will be visited by this method.
NOTE: If this value is negative, or larger than the length of the input-Vector
, an exception will be thrown.ePos
- This is the (integer)Vector
-index that sets a limit for the right-mostVector
-position to inspect/search inside the inputVector
-parameter.
This value is considered 'exclusive' meaning that the'HTMLNode'
at thisVector
-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector
-parameter, an exception will throw.
ALSO: Passing a negative value to this parameter,'ePos'
, will cause its value to be reset to the size of the inputVector
-parameter.- Returns:
- A Java
int[]
-Array whose integer-elements are pointers into the inputVector
-parameter'html'
. This array ofVector
-indices will always point toTagNode
-instances, and specifically toTagNode's
that were modified during this method's processing.
When this method has completed, each modified element of'html'
will have been replaced by a newTagNode
in which all attribute key-value pairs not explicitly mentioned by 'innerTagWhiteList'
have been removed. - Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
TagNode.allAN(boolean, boolean), TagNode.isTagNode(), TagNode.removeAttributes(String[]), TagNode.isClosing, LV
- Code:
- Exact Method Body:
return WhiteListFilter.filter(html, sPos, ePos, innerTagWhiteList);
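The sPos / ePos rules listed under "Throws" can be sketched in plain Java as follows. This is only an illustration of the documented rules, not the library's own code: the class name RangeSketch and the method normalize are hypothetical, and the order in which the checks are made is an assumption.

```java
final class RangeSketch
{
    /**
     * Hypothetical helper: applies the documented sPos / ePos rules to a
     * Vector of the given size and returns { sPos, ePos }, or throws.
     */
    static int[] normalize(int size, int sPos, int ePos)
    {
        // A negative ePos is first reset to the size of the Vector.
        if (ePos < 0) ePos = size;

        // sPos must be a valid index into the Vector.
        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos=" + sPos + ", size=" + size);

        // ePos may not be zero, and may not exceed the size of the Vector.
        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos=" + ePos + ", size=" + size);

        // sPos may not be larger than ePos (checked after the reset above).
        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos=" + sPos + " > ePos=" + ePos);

        return new int[] { sPos, ePos };
    }

    public static void main(String[] args)
    {
        int[] r = normalize(10, 2, -1);
        System.out.println(r[0] + " .. " + r[1]);   // prints "2 .. 10"
    }
}
```

Passing -1 for ePos is therefore a convenient way to say "through the end of the page".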
-
filter
public static int[] filter(java.util.Vector<? super TagNode> html, int[] posArr, java.lang.String... innerTagWhiteList)
Filters the contents of each instance of a 'TC.OpeningTags' element in the input Vector using an attribute 'white-list'. Every input-Vector TagNode that has attributes whose names are not members of the inner-tag white-list will be removed, and a new TagNode whose only attributes are members of the inner-tag white-list will replace the old TagNode.
Range-Restriction - int[] posArr:
This method restricts the removal process to only nodes specified by the Vector-index int[]-Array parameter 'posArr'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that both a Vector<TagNode> and a Vector<HTMLNode> are accepted by this parameter. Neither will cause an exception to be thrown.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
innerTagWhiteList - This should be a list of attribute names that 'white-list' selected attributes for TagNode's in Vector-parameter 'html'.
The concept of an attribute white-list means that any attribute inside a TagNode whose name is not on the list shall be removed when the TagNode is re-instantiated.
AGAIN: This method shall only modify TagNode's if their Vector-index locations in 'html' are listed in 'posArr'.
posArr - This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch 'Find' classes, and are simply lists of index-pointers into a Vectorized HTML Web-Page Vector. The int[] array passed to this parameter will specify the TagNode's in the Vector whose attributes will be partially removed via a call to TagNode.removeAV(...) and replaced.
For Example:
// This line will retrieve an array "index-pointer" to every HTML Image Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "img");

// This line will "clean up" any HTML "<IMG>" elements.  If these elements are 'cluttered',
// after the filter operation, only the 'src' and 'alt' attributes will remain.
Attributes.filter(htmlPage, posArr, "src", "alt");
- Returns:
- A Java int[]-Array whose integer-elements are pointers into the input Vector-parameter 'html'. This array of Vector-indices will always point to TagNode-instances, and specifically to TagNode's that were modified during this method's processing.
When this method has completed, each modified element of 'html' will have been replaced by a new TagNode in which all attribute key-value pairs not explicitly mentioned by 'innerTagWhiteList' have been removed.
NOTE: It is altogether possible that not all of the nodes listed by the parameter 'posArr' will actually be modified by this method. In such cases, the returned int[]-Array will be shorter than the supplied 'posArr' by the exact number of TagNode's that remained unchanged.
- Throws:
java.lang.ArrayIndexOutOfBoundsException - If any of the elements in 'posArr' are index-pointers that are out of range of Vector-parameter 'html', then Java will, naturally, throw this exception.
TagNodeExpectedException - This exception shall throw if an identified Vector-index must point to an instance of TagNode, but that index instead holds some other HTMLNode instance (either CommentNode or TextNode). If an integer-position array (int[] posArr) is passed, but that array has an index pointing to something besides a TagNode, then this exception will be thrown.
OpeningTagNodeExpectedException - When a Vector position-index holds an instance of TagNode, but that TagNode is one whose isClosing-Field is set to TRUE, then this exception shall throw.
When passing int[]-Array parameter 'posArr', that array should contain a list of Vector-indices. The code which checks for this exception ensures that each of the locations in that array points to an Opening TagNode, and when one does not, this exception throws.
- See Also:
TagNode.allAN(boolean, boolean), TagNode.removeAttributes(String[]), TagNode.isTagNode(), TagNode.isClosing
- Code:
- Exact Method Body:
return WhiteListFilter.filter(html, posArr, innerTagWhiteList);
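The white-list behavior and the "shorter than posArr" return value described above can be modeled without the library. In the sketch below each tag is stood in for by a Map of attribute names to values; the class WhiteListSketch is hypothetical, and the exact (case-sensitive) name comparison is an assumption, since this page does not say how the real method compares names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class WhiteListSketch
{
    /**
     * Hypothetical stand-in: each "tag" is a map of attribute name to value.
     * Returns the positions (taken from posArr) of the tags that were replaced.
     */
    static int[] filter(List<Map<String, String>> tags, int[] posArr, String... whiteList)
    {
        Set<String> keep = new HashSet<>(Arrays.asList(whiteList));
        List<Integer> modified = new ArrayList<>();

        for (int pos : posArr)
        {
            Map<String, String> oldTag = tags.get(pos);

            // Build the replacement, keeping only white-listed attribute names.
            Map<String, String> newTag = new LinkedHashMap<>(oldTag);
            newTag.keySet().retainAll(keep);

            // Replace the tag only when at least one attribute was dropped.
            if (newTag.size() != oldTag.size())
            {
                tags.set(pos, newTag);
                modified.add(pos);
            }
        }

        return modified.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args)
    {
        Map<String, String> cluttered = new LinkedHashMap<>();
        cluttered.put("src", "a.png");
        cluttered.put("onclick", "x()");

        Map<String, String> clean = new LinkedHashMap<>();
        clean.put("src", "b.png");

        List<Map<String, String>> tags = new ArrayList<>(Arrays.asList(cluttered, clean));

        int[] changed = filter(tags, new int[] { 0, 1 }, "src", "alt");
        System.out.println(Arrays.toString(changed));   // prints "[0]"
    }
}
```

Only the first tag lost an attribute, so only its position is reported, even though both positions were passed in.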
-
filter
public static int[] filter(java.util.Vector<? super TagNode> html, int sPos, int ePos, StrFilter filter)
Filters the contents of each instance of a 'TC.OpeningTags' element in the input Vector using a StrFilter. All input-Vector TagNode's which have attributes will have the list of attribute-names tested against the provided StrFilter.test(attribute) predicate.
If any attribute's name fails the Predicate test, then that attribute will be removed. After testing all of a TagNode's inner-tags, if any of those attributes did fail the StrFilter.test(...) method, a new TagNode will be constructed leaving those out. Finally, the old TagNode will be removed from the input HTML Vector, and replaced with the new one.
Range-Restriction - sPos, ePos:
This method restricts the filtering process to the specified subrange sPos ... ePos for the Vector-parameter 'html'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that both a Vector<TagNode> and a Vector<HTMLNode> are accepted by this parameter. Neither will cause an exception to be thrown.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
filter - A plethora of "automatically built" String filters are available through the static-member methods of interface StrFilter, which build a StrFilter to use here. One may also write a Java lambda-expression here to implement the java.util.function.Predicate.
IMPORTANT: The StrFilter functional-interface extends Predicate<Object>. Perhaps it may seem counter-intuitive that it does not extend Predicate<String>; however, since StrFilter is a general purpose Predicate (used in numerous locations in this JAR-File 'Java-HTML' library distribution), this allows non-string objects (like the myriad classes which implement the interface CharSequence) to simply have Java's Object.toString() method invoked, with the result used as input to the filter-test.
Sadly, this means that in writing a custom-made lambda-expression for this Predicate, it is mandatory to call Java's (Object class) 'toString()' method on the input 'innerTags', even though the input parameter 'innerTags' is already a String.
The rationale for this inconvenience is that interface 'StrFilter' has quite a few general-purpose, statically-invoked, factory-builder routines. Reusing those methods outweighs the benefit of having these methods, here, accept a 'Predicate<String>' instead of a 'Predicate<Object>' as an input parameter. (Again, noting that StrFilter extends 'Predicate<Object>'.)
AGAIN: This method shall only modify TagNode's if their Vector-index locations in 'html' fall between sPos (inclusive) and ePos (exclusive).
sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'inclusive', meaning that the HTMLNode at this Vector-index will be visited by this method.
NOTE: If this value is negative, or greater than or equal to the size of the input Vector, an exception will be thrown.
ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter.
This value is considered 'exclusive', meaning that the HTMLNode at this Vector-index will not be visited by this method.
NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.
ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
- Returns:
- A Java int[]-Array whose integer-elements are pointers into the input Vector-parameter 'html'. This array of Vector-indices will always point to TagNode-instances, and specifically to TagNode's that were modified during this method's processing.
When this method has completed, each modified element of 'html' will have been replaced by a new TagNode from which all attribute key-value pairs rejected by the StrFilter parameter 'filter' have been removed.
- Throws:
java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:
- If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
- If 'ePos' is zero, or greater than the size of the Vector
- If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
- See Also:
TagNode.allAN(), TagNode.isTagNode(), TagNode.isClosing, TagNode.removeAttributes(String[]), LV
- Code:
- Exact Method Body:
return UsingStrFilter.filter(html, sPos, ePos, filter);
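The remark about Predicate<Object> in the 'filter' parameter description is easier to see in code. The sketch below uses a plain java.util.function.Predicate<Object> as a stand-in for StrFilter, which this page says extends it; the class PredicateSketch and the constant KEEP_SRC_AND_ALT are hypothetical names, and lower-casing the name is just one possible way to compare.

```java
import java.util.Locale;
import java.util.function.Predicate;

final class PredicateSketch
{
    // Because the predicate's input type is Object, the lambda must call
    // toString() before it can treat the attribute name as a String.
    static final Predicate<Object> KEEP_SRC_AND_ALT = (Object attributeName) ->
    {
        String name = attributeName.toString().toLowerCase(Locale.ROOT);
        return name.equals("src") || name.equals("alt");
    };

    public static void main(String[] args)
    {
        System.out.println(KEEP_SRC_AND_ALT.test("SRC"));                      // prints "true"
        System.out.println(KEEP_SRC_AND_ALT.test(new StringBuilder("alt")));   // prints "true"
        System.out.println(KEEP_SRC_AND_ALT.test("onclick"));                  // prints "false"
    }
}
```

The StringBuilder call shows the upside of the Object input type: any CharSequence can be tested without converting it first.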
-
filter
public static int[] filter(java.util.Vector<? super TagNode> html, int[] posArr, StrFilter filter)
Filters the contents of each instance of a 'TC.OpeningTags' element in the input Vector using a StrFilter. All input-Vector TagNode's which have attributes will have the list of attribute-names tested against the provided StrFilter.test(attribute) predicate.
If any attribute's name fails the Predicate test, then that attribute will be removed. After testing all of a TagNode's inner-tags, if any of those attributes did fail the StrFilter.test(...) method, a new TagNode will be constructed leaving those out. Finally, the old TagNode will be removed from the input HTML Vector, and replaced with the new one.
Range-Restriction - int[] posArr:
This method restricts the filtering process to only nodes specified by the Vector-index int[]-Array parameter 'posArr'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? super TagNode' means that both a Vector<TagNode> and a Vector<HTMLNode> are accepted by this parameter. Neither will cause an exception to be thrown.
Note that if a Vector<Object> is passed, and there are no instances of class TagNode contained by that Vector, then this method will simply exit gracefully.
filter - A plethora of "automatically built" String filters are available through the static-member methods of interface StrFilter, which build a StrFilter to use here. One may also write a Java lambda-expression here to implement the java.util.function.Predicate.
IMPORTANT: The StrFilter functional-interface extends Predicate<Object>. Perhaps it may seem counter-intuitive that it does not extend Predicate<String>; however, since StrFilter is a general purpose Predicate (used in numerous locations in this JAR-File 'Java-HTML' library distribution), this allows non-string objects (like the myriad classes which implement the interface CharSequence) to simply have Java's Object.toString() method invoked, with the result used as input to the filter-test.
Sadly, this means that in writing a custom-made lambda-expression for this Predicate, it is mandatory to call Java's (Object class) 'toString()' method on the input 'innerTags', even though the input parameter 'innerTags' is already a String.
The rationale for this inconvenience is that interface 'StrFilter' has quite a few general-purpose, statically-invoked, factory-builder routines. Reusing those methods outweighs the benefit of having these methods, here, accept a 'Predicate<String>' instead of a 'Predicate<Object>' as an input parameter. (Again, noting that StrFilter extends 'Predicate<Object>'.)
AGAIN: This method shall only modify TagNode's if their Vector-index locations in 'html' are listed in 'posArr'.
posArr - This integer-array is expected to receive a "Pointer-Integer Array." These are usually generated by the NodeSearch 'Find' classes, and are simply lists of index-pointers into a Vectorized HTML Web-Page Vector. The int[] array passed to this parameter will specify the TagNode's in the Vector whose attributes will be partially removed via a call to TagNode.removeAV(...) and replaced.
For Example:
// This line will retrieve an array "index-pointer" to every HTML Image Element.
int[] posArr = TagNodeFind.all(htmlPage, TC.OpeningTags, "img");

// Build an instance of StrFilter
//
// NOTE: The 'true' parameter indicates that the attribute name should be
//       compared in a CASE INSENSITIVE fashion.
StrFilter f = StrFilter.strListKEEP(true, "src");

// This line will "clean up" any HTML "<IMG>" elements.  If these elements are 'cluttered',
// after the filter operation, only the 'src' attribute will remain.
Attributes.filter(htmlPage, posArr, f);
- Returns:
- A Java int[]-Array whose integer-elements are pointers into the input Vector-parameter 'html'. This array of Vector-indices will always point to TagNode-instances, and specifically to TagNode's that were modified during this method's processing.
When this method has completed, each modified element of 'html' will have been replaced by a new TagNode from which all attribute key-value pairs rejected by the StrFilter parameter 'filter' have been removed.
NOTE: It is altogether possible that not all of the nodes listed by the parameter 'posArr' will actually be modified by this method. In such cases, the returned int[]-Array will be shorter than the supplied 'posArr' by the exact number of TagNode's that remained unchanged.
- Throws:
java.lang.ArrayIndexOutOfBoundsException - If any of the elements in 'posArr' are index-pointers that are out of range of Vector-parameter 'html', then Java will, naturally, throw this exception.
OpeningTagNodeExpectedException - When a Vector position-index holds an instance of TagNode, but that TagNode is one whose isClosing-Field is set to TRUE, then this exception shall throw.
When passing int[]-Array parameter 'posArr', that array should contain a list of Vector-indices. The code which checks for this exception ensures that each of the locations in that array points to an Opening TagNode, and when one does not, this exception throws.
TagNodeExpectedException - This exception shall throw if an identified Vector-index must point to an instance of TagNode, but that index instead holds some other HTMLNode instance (either CommentNode or TextNode). If an integer-position array (int[] posArr) is passed, but that array has an index pointing to something besides a TagNode, then this exception will be thrown.
- See Also:
TagNode.allAN(), TagNode.isTagNode(), TagNode.isClosing, TagNode.removeAttributes(String[])
- Code:
- Exact Method Body:
return UsingStrFilter.filter(html, posArr, filter);
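The two posArr checks described under "Throws" might look roughly like the sketch below. It is a model only: node types are reduced to a hypothetical enum Kind, the class PosArrCheckSketch is invented for illustration, and IllegalArgumentException stands in for the library's TagNodeExpectedException and OpeningTagNodeExpectedException. The order of the two checks is also an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class PosArrCheckSketch
{
    // Hypothetical stand-in for the kinds of node a page Vector can hold.
    enum Kind { OPENING_TAG, CLOSING_TAG, TEXT, COMMENT }

    /** Verifies that every position in posArr refers to an opening tag. */
    static void check(List<Kind> page, int[] posArr)
    {
        for (int pos : posArr)
        {
            // An out-of-range position fails here, inside List.get(...).
            Kind kind = page.get(pos);

            // Stand-in for TagNodeExpectedException.
            if (kind == Kind.TEXT || kind == Kind.COMMENT)
                throw new IllegalArgumentException("index " + pos + " is not a tag");

            // Stand-in for OpeningTagNodeExpectedException.
            if (kind == Kind.CLOSING_TAG)
                throw new IllegalArgumentException("index " + pos + " is a closing tag");
        }
    }

    public static void main(String[] args)
    {
        List<Kind> page = Arrays.asList(Kind.OPENING_TAG, Kind.TEXT, Kind.CLOSING_TAG);

        check(page, new int[] { 0 });   // passes: index 0 holds an opening tag
        System.out.println("ok");       // prints "ok"
    }
}
```

A position array produced by a search restricted to opening tags, as in the 'posArr' example above, should pass both checks.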
-
-