Package Torello.HTML.Tools.Images
Class Request
- java.lang.Object
  - Torello.HTML.Tools.Images.Request
- All Implemented Interfaces: java.io.Serializable, java.lang.Cloneable
public class Request extends java.lang.Object implements java.lang.Cloneable, java.io.Serializable
ImageScraper-Suite Class
The ImageScraper Tool itself includes three 'Helper-Classes' that facilitate its operations. These three Helpers include: Request, Results and ImageInfo.
Building a Request:
Building an Image-Download Request instance should be easy, and there is an example of doing just that at the top of the Request class. Properly configuring the class to handle the errors and exceptions that might occur when downloading images from a web-server requires a little reading of the JavaDoc pages provided by these tools.
The Request class includes several booleans for suppressing / skipping exceptions if they occur during the download loop. If an exception is thrown and suppressed, it will simply be logged to the Results class.
Once a Request object has been built, simply pass that object-instance to the ImageScraper method download, and a download-process will begin.
Request's Lambda-Targets:
If the Request object contains any Lambda-Targets / Function-Pointers, then those Lambda-Methods will be passed instances of the 'Helper-Class' ImageInfo when they are invoked by the download-loop. These Function-Pointers allow a programmer to filter out certain Image-URL's and to decide where a downloaded image is ultimately stored.
Finally, when the download-loop has run to completion, it will return an instance of class Results.
Getting Results:
After the ImageScraper.download(...) loop has run to completion, an instance of class Results will be returned to the user. It contains several parallel-arrays that store data about what transpired when trying to download each of the Image-URL's which were passed to the Request-Object.
For instance, the 'skipped' array will indicate which pictures didn't download. The 'fileNames' array will hold the name of the file of each image that was successfully downloaded. And the 'imageFormats' array will identify which format was ultimately decided upon when saving the image.
Remember that these return-arrays are parallel to each other, and (of course) will be identical in length. Furthermore, as per the definition of "Parallel-Arrays", the element residing at any index will always correspond to the same image in any one of the other arrays.
Holds all relevant configurations and parameters needed to run the primary download-loop of class ImageScraper
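The parallel-array layout described under 'Getting Results' can be read with a single index loop. The following is a minimal, self-contained sketch rather than the library's own code: the three arrays are hypothetical stand-ins that borrow the names 'skipped', 'fileNames' and 'imageFormats' from the text, and their element types (boolean, String, String) are assumptions that may differ from the actual Results fields.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelArraysSketch
{
    // Builds one summary line per image by reading the same index from each array.
    // The element types are assumptions; only the array names come from the text above.
    static List<String> summarize(boolean[] skipped, String[] fileNames, String[] imageFormats)
    {
        if (skipped.length != fileNames.length || skipped.length != imageFormats.length)
            throw new IllegalArgumentException("Parallel arrays must have identical lengths");

        List<String> lines = new ArrayList<>();

        for (int i = 0; i < skipped.length; i++)
            lines.add(skipped[i]
                ? ("image " + i + ": skipped")
                : ("image " + i + ": saved as " + fileNames[i] + " (" + imageFormats[i] + ")"));

        return lines;
    }

    public static void main(String[] args)
    {
        // Hypothetical data for three images, the second of which failed to download
        boolean[] skipped      = { false, true, false };
        String[]  fileNames    = { "img0001.jpg", null, "img0003.png" };
        String[]  imageFormats = { "jpg", null, "png" };

        summarize(skipped, fileNames, imageFormats).forEach(System.out::println);
    }
}
```

Because index i refers to the same image in every array, one loop is enough to correlate a skipped image with its (missing) file-name and format.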
The class holds numerous configuration parameters that are provided to the ImageScraper when initiating an image-download. The available configurations include:
- Several Exception-Suppression 'skipOn' booleans that allow a user to instruct the ImageScraper's download-loop not to throw exceptions, but rather suppress them and save them to the Results instance.
- Two Time-Out Fields that may be configured which instruct the downloader on the maximum wait time that an image should be allowed before halting the process. When this time-limit is reached, the loop will either throw an exception, or suppress the exception and move on to the next image, dependent upon the value assigned to 'skipOnTimeOutException'.
- Possibly configuring an HTTP 'User-Agent' when downloading images. The 'User-Agent' is an HTTP request header that allows a Web-Browser to identify itself when communicating with a server.
- Several configurations are offered that allow a user to specify where and how an image is saved to disk.
- Finally, there are two Java Predicate's (Lambda-Targets) that allow a user to specify whether an image is saved or even downloaded. The Predicate 'skipURL' is a way to inform the main download-loop that an image should not even be downloaded in the first place. The Predicate 'keeperPredicate' allows a user to specify whether or not an image should be saved to disk after it has been downloaded, and all of the information about the image is known.
Example:
URL url = new URL("http://news.yahoo.com/");
Vector<HTMLNode> page = HTMLPage.getPageTokens(url, false);

// Download a Web-Page, and extract all Image-Elements
List<String> images = InnerTagGet
    .all(page, "img", "src")
    .stream()
    .map((TagNode tn) -> tn.AV("src"))
    .filter((String src) -> src.length() > 0)
    .toList();

// Build a Request Object using the Images-as-String's List
Request req = Request.buildFromStrIter(images, url, true);

// Add a few more Scraper-Configurations
req.targetDirectory = "data/MyWebPages/Page01/";
req.useDefaultCounterForImageFileNames = true;
req.skipOnDownloadException = true;
req.verbosity = Verbosity.Verbose;

// Run the scraper, Send all Text-Output to 'System.out'
Results results = ImageScraper.download(req, System.out);
- See Also: Serialized Form
- Source-Code: Torello/HTML/Tools/Images/Request.java
  File Size: 58,433 Bytes; Line Count: 1,325 '\n' Characters Found
Field Summary

Serializable ID
  Modifier and Type             Field
  static long                   serialVersionUID

Primary / Core Download Fields
  Modifier and Type             Field
  URL                           originalPageURL
  int                           size
  Function<URL,URL>             urlPreProcessor
  Verbosity                     verbosity

Location-Decisions for Saving an Image File
  Modifier and Type             Field
  Consumer<ImageInfo>           imageReceiver
  String                        targetDirectory
  Function<ImageInfo,File>      targetDirectoryRetriever

File-Name given to an Image File
  Modifier and Type             Field
  String                        fileNamePrefix
  Function<ImageInfo,String>    getImageFileSaveName
  boolean                       useDefaultCounterForImageFileNames

Booleans for Deciding When to Continue on Exception / Failure or Throw
  Modifier and Type             Field
  boolean                       skipBase64EncodedImages
  boolean                       skipOnB64DecodeException
  boolean                       skipOnDownloadException
  boolean                       skipOnImageWritingFail
  boolean                       skipOnNullImageException
  boolean                       skipOnTimeOutException
  boolean                       skipOnUserLambdaException

Predicates for Deciding Which Image Files to Save, and Which to Skip
  Modifier and Type             Field
  Predicate<ImageInfo>          keeperPredicate
  Predicate<URL>                skipURL

Avoiding Hangs & Locks with a TimeOut
  Modifier and Type             Field
  static long                   MAX_WAIT_TIME
  static TimeUnit               MAX_WAIT_TIME_UNIT
  long                          maxDownloadWaitTime
  TimeUnit                      waitTimeUnits

Applying a User-Agent when Downloading Images
  Modifier and Type             Field
  boolean                       alwaysUseUserAgent
  static String                 DEFAULT_USER_AGENT
  boolean                       retryWithUserAgent
  String                        userAgent
-
Method Summary

Static-Builders for URL-as-String Iterable
  Modifier and Type    Method
  static Request       buildFromStrIter(Iterable<String> source)
  static Request       buildFromStrIter(Iterable<String> source, URL originalPageURL, boolean skipDontThrowIfBadStr)

Static-Builders for <IMG SRC=...>-TagNode Iterable
  Modifier and Type    Method
  static Request       buildFromTagNodeIter(Iterable<TagNode> source, boolean skipDontThrowIfBadSRCAttr)
  static Request       buildFromTagNodeIter(Iterable<TagNode> source, URL originalPageURL, boolean skipDontThrowIfBadSRCAttr)

Static-Builder for URL Iterable
  Modifier and Type    Method
  static Request       buildFromURLIter(Iterable<URL> source)

Simple Convenience-Method Setter
  Modifier and Type    Method
  void                 skipOnAllExceptions()

Methods: interface java.lang.Cloneable
  Modifier and Type    Method
  Request              clone()

Methods: class java.lang.Object
  Modifier and Type    Method
  String               toString()
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data as well.
- See Also: Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
-
originalPageURL
public final java.net.URL originalPageURL
The URL from which this page was downloaded
-
size
public final int size
The number of Image-URL's identified inside the 'source' Iterable.
-
verbosity
public Verbosity verbosity
Allows a user of this utility to specify the Level of Verbosity (or silence) that is applied to the output mechanism while the tool is running.
Note that the Java enum Verbosity provides four distinct levels, and that the class ImageScraper does indeed implement all four variants of textual output.
NullPointerException:
This field may not be null, or a NullPointerException will throw!
-
urlPreProcessor
public java.util.function.Function<java.net.URL,java.net.URL> urlPreProcessor
When non-null, this allows a user to modify any image-URL immediately prior to the ImageScraper beginning the download process for that image. This is likely of limited use, but there are certainly situations where (for example) escaped-characters need to be un-escaped prior to starting the download.
In such cases, just write a lambda-target that accepts a URL, and processes it (in some way, of your choosing), and the downloader will use that updated URL-instance for making the HTTP-Connection to download the picture.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
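As an illustration of the un-escaping case mentioned for 'urlPreProcessor', the sketch below shows one possible Function<URL, URL> that replaces the HTML escape "&amp;" with a plain "&". It uses only the Java standard library; the class name, the constant name and the 'apply' helper are hypothetical, and the fall-back of returning the original URL on failure is a design assumption, not documented library behaviour.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.function.Function;

class UrlPreProcessorSketch
{
    // A Function<URL, URL> that replaces the HTML escape "&amp;" with a plain "&".
    // If the cleaned-up text is not a valid URL, the original URL is returned unchanged.
    static final Function<URL, URL> UNESCAPE_AMPERSANDS = (URL url) ->
    {
        String fixed = url.toString().replace("&amp;", "&");

        try
            { return URI.create(fixed).toURL(); }

        catch (IllegalArgumentException | MalformedURLException e)
            { return url; }
    };

    // Hypothetical convenience wrapper for trying the function on a String
    static String apply(String urlStr)
    {
        try
            { return UNESCAPE_AMPERSANDS.apply(URI.create(urlStr).toURL()).toString(); }

        catch (MalformedURLException e)
            { throw new IllegalArgumentException(e); }
    }

    public static void main(String[] args)
    {
        System.out.println(apply("https://example.com/img.jpg?w=100&amp;h=50"));
    }
}
```

Since 'urlPreProcessor' is declared as a public field of type Function<URL, URL>, a function of this shape could presumably be assigned to it directly on a Request instance.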
-
targetDirectoryRetriever
public java.util.function.Function<ImageInfo,java.io.File> targetDirectoryRetriever
Allows a user to specify where to save an Image-File after being downloaded.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
-
imageReceiver
public java.util.function.Consumer<ImageInfo> imageReceiver
A functional-interface that allows a user to save an image-file to a location of his or her choosing. Provide this Consumer if saving image files to a target-directory on the file-system is not acceptable, and the programmer wishes to do something else with the downloaded images.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
-
targetDirectory
public java.lang.String targetDirectory
When this configuration-field is non-null, this String parameter is used to identify the file-system directory where downloaded image-files are to be saved.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
-
fileNamePrefix
public java.lang.String fileNamePrefix
When this field is non-null, this String will be prepended to each image file-name that is saved or stored to the file-system.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
-
useDefaultCounterForImageFileNames
public boolean useDefaultCounterForImageFileNames
When TRUE, images will be saved according to a counter; when this is FALSE, the software will attempt to save these images using their original filenames, picked from the URL. Saving using a counter is the default behaviour for this class.
-
getImageFileSaveName
public java.util.function.Function<ImageInfo,java.lang.String> getImageFileSaveName
When this field is non-null, each time an image is written to the file-system, this function will be queried for a file-name before writing the image-file.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
-
skipOnDownloadException
public boolean skipOnDownloadException
Requests that the downloader-logic catch any & all exceptions that are thrown when downloading images from an Internet-URL. The failed download is simply reflected in the Results output arrays, and the download-process moves on to the next Iterable-Element.
Exception's Skipped:
This particular configuration-boolean allows a user to focus on exceptions that are thrown when Java's ImageIO class is downloading an image, and suddenly fails.
-
skipOnB64DecodeException
public boolean skipOnB64DecodeException
Requests that the downloader-logic catch any & all exceptions that are thrown when decoding Base-64 Encoded Images. The failed conversion is simply reflected in the Results output arrays, and the download-process moves on to the next Iterable-Element.
Exception's Skipped:
This particular configuration-boolean allows a user to focus on exceptions that are thrown by Java's Base-64 Image-Decoder.
-
skipOnTimeOutException
public boolean skipOnTimeOutException
Requests that the downloader-logic catch any & all exceptions that are thrown while waiting for an image to finish downloading from an Internet-URL. The failed download is simply reflected in the Results output arrays, and the download-process moves on to the next Iterable-Element.
Exception's Skipped:
This particular configuration-boolean allows a user to focus on exceptions that are thrown when the Monitor-Thread has timed-out.
-
skipOnNullImageException
public boolean skipOnNullImageException
There are occasions when Java's ImageIO class returns a null image, rather than throwing an exception at all. In these cases, the ImageScraper class throws its own exception, unless this boolean has expressly requested to skip-and-move-on when ImageIO returns null from downloading a URL.
Exception's Skipped:
This particular configuration-boolean allows a user to focus on exceptions that are thrown by the ImageScraper when a downloaded image is null.
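The null-return behaviour of ImageIO mentioned above can be reproduced with the standard library alone. The sketch below is only an illustration of that behaviour (the class and method names are hypothetical, and it does not show how ImageScraper itself reacts): javax.imageio.ImageIO.read returns null, without throwing, when no registered reader recognizes the data it is given.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

class NullImageSketch
{
    // Keep ImageIO from creating temporary cache files; everything stays in memory
    static { ImageIO.setUseCache(false); }

    // Returns true only if ImageIO can decode the bytes into an image.
    // ImageIO.read returns null (no exception) when no registered reader recognizes the data.
    static boolean isReadableImage(byte[] data)
    {
        try
            { return ImageIO.read(new ByteArrayInputStream(data)) != null; }

        catch (IOException e)
            { return false; }
    }

    // Encodes a tiny 2x2 image as PNG, so that there is something valid to compare against
    static byte[] tinyPng()
    {
        try
        {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB), "png", out);
            return out.toByteArray();
        }

        catch (IOException e)
            { throw new IllegalStateException(e); }
    }

    public static void main(String[] args)
    {
        System.out.println("valid PNG readable:  " + isReadableImage(tinyPng()));
        System.out.println("HTML text readable:  " + isReadableImage("<html>404</html>".getBytes()));
    }
}
```

A server that answers an image request with an HTML error page is one plausible way for a download to end in a null image rather than an exception.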
-
skipOnImageWritingFail
public boolean skipOnImageWritingFail
If an attempt is made to write an Image to the File-System, and an exception is thrown, this boolean requests that rather than throwing the exception, the downloader make a note in the log that a failure occurred, and move on to the next image.
Exception's Skipped:
This particular configuration-boolean allows a user to focus on exceptions that are thrown when writing an already downloaded and converted image to the file-system.
-
skipOnUserLambdaException
public boolean skipOnUserLambdaException
This can be helpful if there are any "doubts" about the quality of the Functional-Interfaces that have been provided to this Request-instance.
Exception's Skipped:
This particular configuration-boolean allows a user to focus on exceptions that are thrown by any of the Lambda-Target / Functional-Interfaces that are provided by the user via this Request-instance.
-
skipURL
public java.util.function.Predicate<java.net.URL> skipURL
If this field is non-null, then before any URL is connected for a download, the download mechanism will ask this URL-Predicate for permission first. If this Predicate returns FALSE for a given URL, then that image will not be downloaded, but rather skipped.
Setting to null:
This field may be null, and when it is, it shall be ignored. Upon construction, this class initializes this field to null.
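A Predicate<URL> of the kind accepted by 'skipURL' can be written with the standard library alone. The sketch below defines a hypothetical predicate that recognizes GIF files by their path; the class name, constant name and 'isGif' helper are illustrative only. Which return value causes an image to be skipped should be checked against the description above before assigning such a predicate: read literally, a FALSE result means "skip", so skipping GIF files would call for IS_GIF.negate().

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Locale;
import java.util.function.Predicate;

class SkipUrlSketch
{
    // TRUE when the URL's path ends with ".gif" (case-insensitive); the query-string is ignored
    static final Predicate<URL> IS_GIF =
        (URL url) -> url.getPath().toLowerCase(Locale.ROOT).endsWith(".gif");

    // Hypothetical helper for trying the predicate on a String
    static boolean isGif(String urlStr)
    {
        try
            { return IS_GIF.test(URI.create(urlStr).toURL()); }

        catch (MalformedURLException e)
            { throw new IllegalArgumentException(e); }
    }

    public static void main(String[] args)
    {
        System.out.println(isGif("https://example.com/a/spacer.GIF?v=2"));
        System.out.println(isGif("https://example.com/photo.jpg"));
    }
}
```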
-
skipBase64EncodedImages
public boolean skipBase64EncodedImages
This scraper has the ability to decode and save Base-64 Images, and they may be downloaded or skipped based on this boolean. If an Iterable<TagNode> is passed to the constructor, and one of those TagNode's contains an Image Element (<IMG SRC="data:image/jpeg;base64,...data">), this class has the ability to interpret and save the image to a regular image file. By default, Base-64 images are skipped, but they can also be downloaded.
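To make the Base-64 case concrete, the following sketch splits a "data:image/...;base64,..." SRC value into its format and payload and decodes the payload with java.util.Base64. It is a stand-alone approximation: the regular expression is hypothetical and is not the library's own pattern, and the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DataUriSketch
{
    // Hypothetical pattern: group 1 is the image format, group 2 is the Base-64 payload
    private static final Pattern DATA_URI =
        Pattern.compile("^data:image/([A-Za-z0-9.+-]+);base64,(.*)$", Pattern.DOTALL);

    // Returns the image format named in the data-URI, or null if 'src' is not a data-URI
    static String format(String src)
    {
        Matcher m = DATA_URI.matcher(src);
        return m.find() ? m.group(1) : null;
    }

    // Returns the decoded bytes, or null if 'src' is not a data-URI.
    // Base64.Decoder.decode throws IllegalArgumentException if the payload is not valid Base-64.
    static byte[] decode(String src)
    {
        Matcher m = DATA_URI.matcher(src);
        return m.find() ? Base64.getDecoder().decode(m.group(2)) : null;
    }

    public static void main(String[] args)
    {
        // The payload here is the text "hello", used in place of real image bytes
        String src = "data:image/png;base64,aGVsbG8=";

        System.out.println(format(src));
        System.out.println(new String(decode(src), StandardCharsets.US_ASCII));
    }
}
```

The IllegalArgumentException thrown on a corrupt payload is the sort of failure that a flag such as 'skipOnB64DecodeException' is described as suppressing.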
-
keeperPredicate
public java.util.function.Predicate<ImageInfo> keeperPredicate
Allows for a user-provided decision-predicate about whether to retain & save, or discard, an image after downloading. All information available in data-flow class ImageInfo is provided to this predicate, and ought to be enough to decide whether to save or reject any of the downloaded images.
-
MAX_WAIT_TIME
public static final long MAX_WAIT_TIME
This is the default maximum wait time for an image to download (10L). This value may be reset or modified by instantiating an ImageScraper.AdditionalParameters class, and passing the desired values to the constructor. This value is measured in units of public static final java.util.concurrent.TimeUnit MAX_WAIT_TIME_UNIT.
- See Also: MAX_WAIT_TIME_UNIT, maxDownloadWaitTime, Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long MAX_WAIT_TIME = 10;
-
MAX_WAIT_TIME_UNIT
public static final java.util.concurrent.TimeUnit MAX_WAIT_TIME_UNIT
This is the default measuring unit for the static final long MAX_WAIT_TIME member. This value may be reset or modified by instantiating an ImageScraper.AdditionalParameters class, and passing the desired values to the constructor.
- See Also: MAX_WAIT_TIME, waitTimeUnits
- Code:
- Exact Field Declaration Expression:
public static final TimeUnit MAX_WAIT_TIME_UNIT = TimeUnit.SECONDS;
-
maxDownloadWaitTime
public long maxDownloadWaitTime
If you do not want the downloader to hang on an image, which is sometimes an issue depending upon the site from which you are making a request, set this parameter, and the downloader will not wait past that amount of time to download an image. The default value for this parameter is 10 seconds. If you do not wish to set the max-wait-time (the download time-out counter), then leave the parameter "waitTimeUnits" set to null, and this parameter will be ignored.
-
waitTimeUnits
public java.util.concurrent.TimeUnit waitTimeUnits
This is the "unit of measurement" for the field long maxDownloadWaitTime.
NOTE: This parameter may be null, and if it is, both this parameter and the parameter long maxDownloadWaitTime will be ignored, and the default maximum-wait-time (download time-out settings) will be used instead.
READ: The java.util.concurrent.* package, and the class java.util.concurrent.TimeUnit, for more information.
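The interplay of a maximum wait time, a TimeUnit, and a skip-or-throw decision can be sketched with java.util.concurrent. The example below is an assumption-laden illustration, not the ImageScraper implementation (whose monitor-thread logic is not shown in this section): it runs a task on a worker thread and uses Future.get(timeout, unit) to give up after the allotted time. The class name, method name and 'skipOnTimeOut' parameter are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeOutSketch
{
    // Runs 'task' on a worker thread and waits at most 'maxWait' 'units' for it to finish.
    // On time-out: returns null when 'skipOnTimeOut' is true, otherwise throws.
    static <T> T runWithTimeOut(Callable<T> task, long maxWait, TimeUnit units, boolean skipOnTimeOut)
    {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);

        try
            { return future.get(maxWait, units); }

        catch (TimeoutException e)
        {
            future.cancel(true);    // interrupt the worker, it is no longer needed
            if (skipOnTimeOut) return null;
            throw new IllegalStateException("Timed out after " + maxWait + " " + units, e);
        }

        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        catch (ExecutionException e)
            { throw new IllegalStateException(e.getCause()); }

        finally
            { executor.shutdownNow(); }
    }

    public static void main(String[] args)
    {
        // A fast task finishes within the limit; a slow one is abandoned and reported as null
        System.out.println(runWithTimeOut(() -> "fast", 1, TimeUnit.SECONDS, true));
        System.out.println(runWithTimeOut(
            () -> { Thread.sleep(2000); return "slow"; }, 100, TimeUnit.MILLISECONDS, true
        ));
    }
}
```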
-
DEFAULT_USER_AGENT
public static final java.lang.String DEFAULT_USER_AGENT
There are web-sites that expect a User-Agent to be defined before allowing an image download to progress. There are even web-sites and servers that simply will not connect to a scraper unless a User-Agent is defined.
This is the default User-Agent that is defined inside of class Scrape.
-
userAgent
public java.lang.String userAgent
This
-
alwaysUseUserAgent
public boolean alwaysUseUserAgent
-
retryWithUserAgent
public boolean retryWithUserAgent
-
-
Method Detail
-
buildFromStrIter
public static Request buildFromStrIter (java.lang.Iterable<java.lang.String> source)
Builds an instance of this class from a list of URL's as String's
- Parameters:
source - Accepts any Java Iterable containing String's. Note that if any of these String's are malformed URL's, then this method will throw an IllegalArgumentException, with a Java MalformedURLException as its cause.
Furthermore, if any of these String's contain partially resolved URL's, this will also force this method to throw an IllegalArgumentException.
- Returns:
A 'Request' instance. This may be further configured by assigning values to any / all fields (which will still have their initialized / default-values)
- Throws:
java.lang.NullPointerException - If any of the String's in the Iterable are null
java.lang.IllegalArgumentException - If any of the URL's are String's which begin with neither 'http://' nor 'https://'. Since this method doesn't accept the parameter 'originalPageURL', each and every URL in the 'source' iterable must be a full & complete URL.
This exception will also throw if there are any URL's in the String-List that cause a MalformedURLException to throw when constructing an instance of java.net.URL from the String. In these cases, the original MalformedURLException will be assigned to the 'cause', and may be retrieved using the exception's getCause() method.
- Code:
- Exact Method Body:
Vector<URL> temp = new Vector<>();
int count = 0;

for (String urlStr : source)
{
    count++;

    if (urlStr == null) throw new NullPointerException(
        "The " + count + StringParse.ordinalIndicator(count) + "th element of " +
        "Iterable-Parameter 'source' is null"
    );

    if (StrCmpr.startsWithNAND(urlStr, "http://", "https://"))

        throw new IllegalArgumentException(
            "The " + count + StringParse.ordinalIndicator(count) + "th element of " +
            "Iterable-Parameter 'source' did neither begin with 'http://' nor " +
            "'https://'. If there are any partial URL's in you Iterable, you must " +
            "re-build using an 'originalPageURL' parameter in order to resolve any and " +
            "all partial URL's."
        );

    try
        { temp.add(new URL(urlStr)); }

    catch (MalformedURLException e)
    {
        throw new IllegalArgumentException(
            "When attempting to build the " + count + StringParse.ordinalIndicator(count) +
            " element of Iterable-Parameter 'source', a MalformedURLException threw. " +
            "Please see e.getCause() for details.",
            e
        );
    }
}

return new Request(temp, count, null);
-
buildFromStrIter
public static Request buildFromStrIter (java.lang.Iterable<java.lang.String> source, java.net.URL originalPageURL, boolean skipDontThrowIfBadStr)
Builds an instance of this class from a list of URL's as String's
- Parameters:
source - Accepts any Java Iterable containing String's. Partial URL's are resolved against 'originalPageURL'. If any of these String's cannot be resolved into a valid URL, then this method will either skip that String, or throw an IllegalArgumentException with a Java MalformedURLException as its cause, depending upon the value of 'skipDontThrowIfBadStr'.
originalPageURL - This parameter may not be null, or a NullPointerException will throw. This parameter is used to help resolve any partially resolved-URL's.
skipDontThrowIfBadStr - If an exception is thrown when attempting to resolve a partial-URL, and this parameter is TRUE, then that exception is suppressed and logged, and the builder-loop continues to the next URL-as-a-String.
When this parameter is passed FALSE, unresolvable URL's will generate an IllegalArgumentException-throw.
Note that the presence of a null in the Iterable 'source' parameter will always force this method to throw NullPointerException.
- Returns:
A 'Request' instance. This may be further configured by assigning values to any / all fields (which will still have their initialized / default-values)
- Throws:
java.lang.NullPointerException - If any of the String's in the Iterable are null, or if 'originalPageURL' is null
java.lang.IllegalArgumentException - If there are any URL's in the String-List that cause a MalformedURLException to throw when constructing an instance of java.net.URL from the String, and 'skipDontThrowIfBadStr' is FALSE. In these cases, the generated MalformedURLException will be assigned to the exception's 'cause', and may therefore be retrieved using this exception's getCause() method.
- Code:
- Exact Method Body:
if (originalPageURL == null) throw new NullPointerException("'originalPageURL' is null.");

Vector<URL> temp = new Vector<>();
Vector<Exception> strExceptions = new Vector<>();
Vector<String[]> b64Dummy = new Vector<>();
int count = 0;

for (String urlStr : source)
{
    count++;

    // This isn't used here, but the (private) Request-Constructor needs an empty Vector if
    // there aren't any b64-Images.
    b64Dummy.add(null);

    if (urlStr == null) throw new NullPointerException(
        "The " + count + StringParse.ordinalIndicator(count) + " element of " +
        "Iterable-Parameter 'source' is null"
    );

    Ret2<URL, MalformedURLException> r2 = Links.resolve_KE(urlStr, originalPageURL);

    if (r2.b == null) temp.add(r2.a);

    else
    {
        if (skipDontThrowIfBadStr)
        {
            temp.add(null);
            strExceptions.add(r2.b);
        }

        else throw new IllegalArgumentException(
            "When attempting to resolve the " + count +
            StringParse.ordinalIndicator(count) + " TagNode of Iterable-Parameter " +
            "'source' an Exception was thrown. See 'getCause' for details.",
            r2.b
        );
    }
}

return new Request(temp, count, originalPageURL, b64Dummy, strExceptions);
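What "resolving a partial URL against the original page" means can be shown with the standard library. The sketch below is a stand-in that uses java.net.URI.resolve instead of the library's own Links.resolve_KE helper, whose exact behaviour is not documented in this section; the class name, method name and the choice of URI over URL are assumptions made for the illustration. It mirrors the skip-or-throw choice described for 'skipDontThrowIfBadStr'.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResolveSketch
{
    // Resolves each String against 'base'.  An entry that cannot be parsed either becomes a
    // null place-holder (skipDontThrow == true) or triggers an IllegalArgumentException.
    static List<URI> resolveAll(Iterable<String> source, URI base, boolean skipDontThrow)
    {
        List<URI> ret = new ArrayList<>();

        for (String s : source)
        {
            if (s == null) throw new NullPointerException("An element of 'source' is null");

            try
                { ret.add(base.resolve(s)); }

            catch (IllegalArgumentException e)
            {
                if (skipDontThrow) ret.add(null);
                else throw new IllegalArgumentException("Could not resolve: " + s, e);
            }
        }

        return ret;
    }

    public static void main(String[] args)
    {
        URI page = URI.create("http://news.yahoo.com/section/page.html");

        // A root-relative path, a page-relative path, a complete URL, and an invalid String
        List<String> srcs = Arrays.asList(
            "/img/a.png", "b.png", "https://cdn.example.com/c.png", "has space.png"
        );

        resolveAll(srcs, page, true).forEach(System.out::println);
    }
}
```

Keeping a null place-holder for a skipped entry, rather than dropping it, preserves the index of every other element, which matches the way the builder above adds null to its 'temp' Vector.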
-
buildFromTagNodeIter
public static Request buildFromTagNodeIter (java.lang.Iterable<TagNode> source, java.net.URL originalPageURL, boolean skipDontThrowIfBadSRCAttr)
Builds an instance of this class using the SRC-Attribute from a list of TagNode's.
- Parameters:
source - Accepts any Java Iterable containing TagNode instances. If any of these TagNode's do not have an HTML-SRC Attribute, then this builder method will immediately throw a SRCException (unless 'skipDontThrowIfBadSRCAttr' is TRUE).
originalPageURL - This parameter may not be null, or a NullPointerException will throw. This parameter is used to help resolve any partially resolved-URL's.
skipDontThrowIfBadSRCAttr - When this parameter is passed TRUE, if any of the TagNode elements inside 'source' either do not have a SRC-Attribute, or have such an attribute containing an invalid URL that cannot be instantiated, then this element of the iterable will simply be skipped, gracefully.
When this parameter is passed FALSE, if either of the above two conditions / situations arise, then an exception will be thrown (this builder method will not run until completion).
Note that the presence of a null inside the Iterable 'source' parameter will always force this method to throw NullPointerException.
- Returns:
A 'Request' instance. This may be further configured by assigning values to any / all fields (which will still have their initialized / default-values)
- Throws:
java.lang.NullPointerException - If any of the TagNode's in the Iterable are null
SRCException - If any of the TagNode's in the list do not have a 'SRC' Attribute, and 'skipDontThrowIfBadSRCAttr' is FALSE.
This exception will also throw if there are any URL's in the TagNode-List that cause a MalformedURLException to throw when constructing an instance of java.net.URL (from the TagNode's SRC-Attribute). In these cases, the generated MalformedURLException will be assigned to the exception's 'cause', and may therefore be retrieved using the exception's getCause() method.
If 'skipDontThrowIfBadSRCAttr' is TRUE, then this exception will not throw, and a null will be placed in the query-list.
- Code:
- Exact Method Body:
if (originalPageURL == null) throw new NullPointerException("'originalPageURL' is null.");

Vector<URL> temp = new Vector<>();
int count = 0;
Vector<String[]> b64Images = new Vector<>();
Vector<Exception> tagNodeSRCExceptions = new Vector<>();

for (TagNode tn : source)
{
    count++;

    if (tn == null) throw new NullPointerException(
        "The " + count + StringParse.ordinalIndicator(count) + " element of " +
        "Iterable-Parameter 'source' is null"
    );

    String urlStr = tn.AV("src");

    if (urlStr == null)
    {
        SRCException e = new SRCException(
            "The " + count + StringParse.ordinalIndicator(count) + " TagNode of " +
            "Iterable-Parameter 'source' does not have a SRC-Attribute"
        );

        if (skipDontThrowIfBadSRCAttr)
        {
            temp.add(null);
            b64Images.add(null);
            tagNodeSRCExceptions.add(e);
            continue;
        }

        else throw e;
    }

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Handle the Base-64 Stuff, here. 08.31.2023
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    Matcher m = IF.B64_INIT_STRING.matcher(urlStr);

    if (m.find())
    {
        temp.add(null);
        b64Images.add(new String[] { m.group(1), m.group(2)});
        continue;
    }

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Handle a normal URL. 08.31.2023
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    Ret2<URL, MalformedURLException> r2 = Links.resolve_KE(urlStr, originalPageURL);

    if (r2.b != null)
    {
        SRCException e = new SRCException(
            "Attempting to resolve the " + count + StringParse.ordinalIndicator(count) +
            " TagNode of Iterable-Parameter 'source' there was an Exception. See " +
            "'getCause' for details.",
            r2.b
        );

        if (skipDontThrowIfBadSRCAttr)
        {
            temp.add(null);
            b64Images.add(null);
            tagNodeSRCExceptions.add(e);
            continue;
        }

        else throw e;
    }

    temp.add(r2.a);
}

return new Request(temp, count, originalPageURL, b64Images, tagNodeSRCExceptions);
-
buildFromTagNodeIter
public static Request buildFromTagNodeIter (java.lang.Iterable<TagNode> source, boolean skipDontThrowIfBadSRCAttr)
Builds an instance of this class using the SRC-Attribute from a list of TagNode's.
- Parameters:
source - Accepts any Java Iterable containing TagNode instances. If any of these TagNode's do not have an HTML-SRC Attribute, then this builder method will immediately throw a SRCException (unless 'skipDontThrowIfBadSRCAttr' is TRUE).
skipDontThrowIfBadSRCAttr - When this parameter is passed TRUE, if any of the TagNode elements inside 'source' either do not have a SRC-Attribute, or have such an attribute containing an invalid URL that cannot be instantiated, then this element of the iterable will simply be skipped, gracefully.
When this parameter is passed FALSE, if either of the above two conditions / situations arise, then an exception will be thrown (this builder method will not run until completion).
Note that the presence of a null inside the Iterable 'source' parameter will always force this method to throw NullPointerException.
- Returns:
A 'Request' instance. This may be further configured by assigning values to any / all fields (which will still have their initialized / default-values)
- Throws:
java.lang.NullPointerException - If any of the TagNode's in the Iterable are null
SRCException - If any of the TagNode's in the list do not have a 'SRC'-Attribute, and 'skipDontThrowIfBadSRCAttr' is FALSE.
This exception will also throw if any of the URL's assigned to a 'SRC'-Attribute are partial-URL's which do not begin with 'http://' (or 'https://'), and 'skipDontThrowIfBadSRCAttr' is FALSE.
Finally, if any of the URL's inside a TagNode's 'SRC'-Attribute cause a MalformedURLException, that exception will be assigned to the cause of a SRCException, and thrown (unless 'skipDontThrowIfBadSRCAttr' is TRUE).
- Code:
- Exact Method Body:
Vector<URL> temp = new Vector<>();
int count = 0;
Vector<String[]> b64Images = new Vector<>();
Vector<Exception> tagNodeSRCExceptions = new Vector<>();

for (TagNode tn : source)
{
    count++;

    if (tn == null) throw new NullPointerException(
        "The " + count + StringParse.ordinalIndicator(count) + " element of " +
        "Iterable-Parameter 'source' is null"
    );

    String urlStr = tn.AV("src");

    if (urlStr == null)
    {
        SRCException e = new SRCException(
            "The " + count + StringParse.ordinalIndicator(count) + " TagNode of " +
            "Iterable-Parameter 'source' does not have a SRC-Attribute"
        );

        if (skipDontThrowIfBadSRCAttr)
        {
            temp.add(null);
            b64Images.add(null);
            tagNodeSRCExceptions.add(e);
            continue;
        }

        else throw e;
    }

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Handle the Base-64 Stuff, here. 08.31.2023
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    Matcher m = IF.B64_INIT_STRING.matcher(urlStr);

    if (m.find())
    {
        temp.add(null);
        b64Images.add(new String[] { m.group(1), m.group(2)});
        continue;
    }

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Handle the Base-64 Stuff, here. 08.31.2023
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    if (StrCmpr.startsWithNAND(urlStr, "http://", "https://"))
    {
        SRCException e = new SRCException(
            "The " + count + StringParse.ordinalIndicator(count) + " TagNode of " +
            "Iterable-Parameter 'source' did neither begin with 'http://' nor " +
            "'https://'. Please build using an 'originalPageURL' parameter to resolve " +
            "any and all partial URL's."
        );

        if (skipDontThrowIfBadSRCAttr)
        {
            temp.add(null);
            b64Images.add(null);
            tagNodeSRCExceptions.add(e);
            continue;
        }

        else throw e;
    }

    try
        { temp.add(new URL(urlStr)); }

    catch (MalformedURLException e)
    {
        // variable 'e' has already been defined above, use 'e2'
        SRCException e2 = new SRCException(
            "When attempting to build the " + count + StringParse.ordinalIndicator(count) +
            " element of Iterable-Parameter 'source', a MalformedURLException threw. " +
            "Please see getCause for details.",
            e
        );

        if (skipDontThrowIfBadSRCAttr)
        {
            temp.add(null);
            b64Images.add(null);
            tagNodeSRCExceptions.add(e2);
            continue;
        }

        else throw e2;
    }
}

return new Request(temp, count, null, b64Images, tagNodeSRCExceptions);
-
buildFromURLIter
public static Request buildFromURLIter (java.lang.Iterable<java.net.URL> source)
Builds an instance of this class using a list of already-prepared URL's.
- Parameters:
source - Accepts any Java Iterable containing URL's. Note that if any of these URL's are not fully resolved, the unresolved ones will cause an exception to be thrown when downloading begins.
- Returns:
A 'Request' instance. This may be further configured by assigning values to any / all fields (which will still have their initialized / default values).
- Throws:
java.lang.NullPointerException - If any of the URL's in the Iterable are null
- Code:
- Exact Method Body:
Vector<URL> temp    = new Vector<>();
int         count   = 0;

for (URL url : source)
{
    count++;

    if (url == null) throw new NullPointerException(
        "The " + count + StringParse.ordinalIndicator(count) + " element of " +
        "Iterable-Parameter 'source' is null"
    );

    temp.add(url);
}

return new Request(temp, count, null);
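The null-check in the method body above reports the position of the offending element using an ordinal suffix ("1st", "2nd", "3rd", ...). The following self-contained sketch reproduces that behavior without the library. The class 'UrlCollector' and its 'ordinal' helper are hypothetical; 'ordinal' is only an assumed approximation of what 'StringParse.ordinalIndicator' returns.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of the Torello library. */
final class UrlCollector
{
    // ASSUMPTION: approximates StringParse.ordinalIndicator (st, nd, rd, th)
    static String ordinal(int n)
    {
        int mod100 = n % 100;

        // 11th, 12th and 13th are the exceptions to the last-digit rule
        if (mod100 >= 11 && mod100 <= 13) return "th";

        switch (n % 10)
        {
            case 1:  return "st";
            case 2:  return "nd";
            case 3:  return "rd";
            default: return "th";
        }
    }

    /** Copies the URL's into a list, failing fast on the first null element. */
    static List<URL> collect(Iterable<URL> source)
    {
        List<URL>   out     = new ArrayList<>();
        int         count   = 0;

        for (URL url : source)
        {
            count++;

            if (url == null) throw new NullPointerException(
                "The " + count + ordinal(count) + " element of " +
                "Iterable-Parameter 'source' is null"
            );

            out.add(url);
        }

        return out;
    }
}
```

Because the check runs while iterating, a null in the middle of the Iterable causes the exception to be thrown before anything is returned, so no partially built result ever reaches the caller.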
-
skipOnAllExceptions
public void skipOnAllExceptions()
This allows a user to quickly / easily set all 'skipOn' flags in one method call.
-
toString
public java.lang.String toString()
Converts 'this' instance into a simple Java-String.
- Overrides:
toString in class java.lang.Object
- Returns:
A String where each field has had a 'best efforts' String-Conversion
- Code:
- Exact Method Body:
return
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Primary Request Fields
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "Primary Request Fields:\n" +

    // public final URL originalPageURL;
    I4 + "originalPageURL: " + this.originalPageURL + '\n' +

    // private final int size;
    I4 + "size: " + this.size + '\n' +

    // public Verbosity = Verbosity.Normal;
    I4 + "Verbosity: " + this.verbosity + '\n' +

    '\n' +

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Location-Decisions for Saving an Image File
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "Location-Decisions for Saving an Image File:\n" +

    // public Function<ImageInfo, File> targetDirectoryRetriever = null;
    I4 + "targetDirectoryRetriever: " + OBJ_TO_CLASS_NAME(this.targetDirectoryRetriever) +

    // public Consumer<ImageInfo> imageReceiver = null;
    I4 + "imageReceiver: " + OBJ_TO_CLASS_NAME(this.imageReceiver) +

    // public String targetDirectory = null;
    I4 + "targetDirectory: " +
        ((this.targetDirectory != null) ? ("[\"" + this.targetDirectory + "\"]") : "null") +
        '\n' +

    '\n' +

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // File-Name given to an Image File
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "File-Name given to an Image File:\n" +

    // public String fileNamePrefix = null;
    I4 + "fileNamePrefix: " +
        ((this.fileNamePrefix != null) ? ("[\"" + this.fileNamePrefix + "\"]") : "null") +
        '\n' +

    // public boolean useDefaultCounterForImageFileNames = true;
    I4 + "useDefaultCounterForImageFileNames: " + this.useDefaultCounterForImageFileNames + '\n' +

    // public Function<ImageInfo, String> getImageFileSaveName = null;
    I4 + "getImageFileSaveName: " + OBJ_TO_CLASS_NAME(this.getImageFileSaveName) + '\n' +

    '\n' +

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // BOOLEAN'S: Continuing or Throwing on Failure & Exception
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "Boolean-Flags: Continuing or Throwing on Failure & Exception:\n" +

    // public boolean skipOnDownloadException = false;
    I4 + "skipOnDownloadException: " + this.skipOnDownloadException + '\n' +

    // public boolean skipOnB64DecodeException = false;
    I4 + "skipOnB64DecodeException: " + this.skipOnB64DecodeException + '\n' +

    // public boolean skipOnTimeOutException = false;
    I4 + "skipOnTimeOutException: " + this.skipOnTimeOutException + '\n' +

    // public boolean skipOnNullImageException = false;
    I4 + "skipOnNullImageException: " + this.skipOnNullImageException + '\n' +

    // public boolean skipOnImageWritingFail = false;
    I4 + "skipOnImageWritingFail: " + this.skipOnImageWritingFail + '\n' +

    // public boolean skipOnUserLambdaException = false;
    I4 + "skipOnUserLambdaException: " + this.skipOnUserLambdaException + '\n' +

    '\n' +

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // USER-PREDICATE'S & BOOLEAN'S: Which Image Files to Save, and Which to Skip
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "User-Predicate's: Which Image-Files to Save, and Which to Skip:\n" +

    // public Predicate<URL> skipURL = null;
    I4 + "skipURL: " + OBJ_TO_CLASS_NAME(this.skipURL) +

    // public boolean skipBase64EncodedImages = false;
    I4 + "skipBase64EncodedImages: " + this.skipBase64EncodedImages + '\n' +

    // public Predicate<ImageInfo> keeperPredicate = null;
    I4 + "keeperPredicate: " + OBJ_TO_CLASS_NAME(this.keeperPredicate) + '\n' +

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // Avoiding Hangs and Locks with a TimeOut
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "Avoiding Hangs and Locks with a TimeOut:\n" +

    // public long maxDownloadWaitTime = 0;
    I4 + "maxDownloadWaitTime: " + this.maxDownloadWaitTime + '\n' +

    // public TimeUnit waitTimeUnits = null;
    I4 + "waitTimeUnits: " + this.waitTimeUnits + '\n' +

    '\n' +

    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    // USER AGENT
    // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

    "Using a User-Agent:\n" +

    // public String userAgent = DEFAULT_USER_AGENT;
    I4 + "userAgent: " + this.userAgent + '\n' +

    // public boolean alwaysUseUserAgent = false;
    I4 + "alwaysUseUserAgent: " + this.alwaysUseUserAgent + '\n' +

    // public boolean retryWithUserAgent = true;
    I4 + "retryWithUserAgent: " + this.retryWithUserAgent + '\n';
-
clone
public Request clone()
Builds a clone of 'this' instance.
- Overrides:
clone in class java.lang.Object
- Returns:
The copied instance. Note that this is a shallow clone, rather than a deep clone. The references within the returned instance are the exact same references as are in 'this' instance.
- Code:
- Exact Method Body:
return new Request(this);
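The practical consequence of a shallow clone can be sketched with a small stand-in class. 'ShallowDemo' below is hypothetical and is not the library's 'Request' class; it only mirrors the copy-constructor approach shown in the method body above. Reference fields are shared between the original and the copy, while primitive fields are copied by value.

```java
import java.util.List;

/** Hypothetical stand-in for illustration; not the library's Request class. */
final class ShallowDemo implements Cloneable
{
    final List<String> names;   // reference field: shared by original and clone
    int limit;                  // primitive field: copied by value

    ShallowDemo(List<String> names, int limit)
    {
        this.names = names;
        this.limit = limit;
    }

    // Copy-constructor: copies the references themselves, not what they point to
    private ShallowDemo(ShallowDemo other)
    {
        this.names = other.names;
        this.limit = other.limit;
    }

    @Override
    public ShallowDemo clone()
    { return new ShallowDemo(this); }
}
```

With such a class, adding an element to the list held by the clone is also visible through the original, whereas assigning a new value to 'limit' on the clone leaves the original untouched. The same caution would apply to any mutable object referenced by both a 'Request' and its clone.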
-
-