Package Torello.HTML.Tools.Images
Class Results
- java.lang.Object
-
- Torello.HTML.Tools.Images.Results
-
- All Implemented Interfaces:
java.io.Serializable,java.lang.Cloneable
public class Results extends java.lang.Object implements java.io.Serializable, java.lang.Cloneable
ImageScraper-Suite Class
The ImageScraper Tool itself includes three 'Helper-Classes' that facilitate its operations. These three Helpers are: Request, Results and ImageInfo.
Building a Request:
Building an Image-Download Request instance should be extremely easy, and there is an example of doing just that at the top of the Request class. Properly configuring the class to handle any / all possible errors or exceptions that might occur when downloading images from a web-server requires a little reading of the JavaDoc pages provided by these tools.
The Request class includes several booleans for suppressing / skipping exceptions if they occur during the download loop. If an exception is thrown and suppressed, it will simply be logged to the Results class.
Once a Request object has been built, simply pass that object-instance to the ImageScraper method download and a download-process will begin.
Request's Lambda-Targets:
If the Request object contained any Lambda-Targets / Function-Pointers, then those Lambda-Methods will be passed instances of the 'Helper-Class' ImageInfo when they are invoked by the download-loop. These Function-Pointers allow a programmer to do things like filter out certain Image-URL's and decide where a downloaded image is ultimately stored.
Finally, when the download-loop has run to completion, it will return an instance of class Results.
Getting Results:
After the ImageScraper.download(...) loop has run to completion, an instance of class Results will be returned to the user. It simply contains several parallel-arrays that store data about what transpired when trying to download each of the Image-URL's which were passed to the Request-Object.
For instance, the 'skipped' array will indicate which pictures didn't download. The 'fileNames' array will hold the name of the file of each image that was successfully downloaded. And the 'imageFormats' array will identify which format was ultimately decided-upon when saving the image.
Remember that each of these return-arrays is parallel to the others, and (of course) they will all be identical in length. Furthermore, as per the definition of "Parallel-Arrays", the element residing at any index will always correspond to the same image in any one of the other arrays.
After downloading all of the user's requested images, the class ImageScraper returns an instance of this class.
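The parallel-array layout described under "Getting Results" can be sketched as follows. This is only an illustration, not the library's code: the class ParallelArraysSketch and its hand-filled arrays are hypothetical stand-ins that merely borrow the field names skipped, fileNames and imageFormats, and the image format is held as a plain String here rather than the library's IF type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: these arrays only mimic the shape of the Results fields.
class ParallelArraysSketch
{
    // Builds one summary line per index; index i refers to the same image in every array.
    static List<String> summarize(boolean[] skipped, String[] fileNames, String[] imageFormats)
    {
        if ((skipped.length != fileNames.length) || (skipped.length != imageFormats.length))
            throw new IllegalArgumentException("Parallel arrays must have identical lengths");

        List<String> lines = new ArrayList<>();

        for (int i = 0; i < skipped.length; i++)
            lines.add(
                skipped[i]
                    ? ("#" + i + ": skipped")
                    : ("#" + i + ": saved " + fileNames[i] + " as " + imageFormats[i])
            );

        return lines;
    }

    public static void main(String[] args)
    {
        // Invented sample data for three images; the second one was skipped.
        boolean[] skipped      = { false, true, false };
        String[]  fileNames    = { "img0.jpg", null, "img2.png" };
        String[]  imageFormats = { "JPG", null, "PNG" };

        summarize(skipped, fileNames, imageFormats).forEach(System.out::println);
    }
}
```

Because every array shares the same index space, a single loop counter is enough to read all facts about one image.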
When a download request has completed, this class will be instantiated and returned. Care has been taken to ensure this class neither freezes nor fails when a particular image fails to download. This level of control is customizable, so if the programmer would prefer download execution to halt immediately upon exception, there are several settings for this in the class Request.
The link below contains the output of invoking the toString method on a 'Results' instance after downloading all of the HTML <IMG SRC=...> tags that were scraped from the Web-Server news.yahoo.com.
results.txt
Initializations:
Many of the values in the arrays for class 'Results' (this class) may contain null-values, or a '-1'. This happens if an exception is thrown while downloading or saving any one particular image, which prevents the process from running to completion.
To view the exact initialization-value for elements of any of these arrays, simply click on the Hi-Lited Source-Code link, and scroll down to the Package-Private Results class constructor.
The 'skipped' Array:
There is an easy way to check which URL's have failed or were skipped (due to user-request), and which URL's were successfully obtained. One of several arrays that are public in this class is the skipped boolean[]-array.
Whenever a particular URL-index corresponds to a skipped-index that contains a TRUE boolean-value, it indicates that that particular URL ultimately was not properly saved or re-transmitted.
See Also:
ImageScraper, Serialized Form
Hi-Lited Source-Code:
- View Here: Torello/HTML/Tools/Images/Results.java
- Open New Browser-Tab: Torello/HTML/Tools/Images/Results.java
- File Size: 13,556 Bytes
- Line Count: 364 '\n' Characters Found
-
-
Field Summary
Serializable ID
Modifier and Type    Field
static long          serialVersionUID
Parallel Result-Arrays: Original Source-Location of the Image
Modifier and Type    Field
boolean[]            b64EncodedImg
URL[]                urls
Parallel Result-Arrays: Image-Retrieval Success or Failure
Modifier and Type    Field
Exception[]          exceptions
boolean[]            skipped
Parallel Result-Arrays: Image-File Write Name & Location
Modifier and Type    Field
String[]             fileNames
String[]             saveDirectories
Parallel Result-Arrays: Image Specs
Modifier and Type    Field
int[]                heights
IF[]                 imageFormats
long[]               sizes
int[]                widths
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
See Also:
- Constant Field Values
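As a rough illustration of saving program state with Java serialization, the sketch below writes an object to a byte-array and reads it back. The nested class Snapshot is a hypothetical stand-in (it is not the Results class), and its serialVersionUID value of 1L is an arbitrary choice made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch
{
    // Hypothetical stand-in for a Serializable holder of parallel arrays
    static class Snapshot implements Serializable
    {
        private static final long serialVersionUID = 1L; // arbitrary value for this sketch

        final boolean[] skipped;
        final long[]    sizes;

        Snapshot(boolean[] skipped, long[] sizes)
        { this.skipped = skipped; this.sizes = sizes; }
    }

    // Serializes 'original' into memory, then deserializes a fresh copy of it.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original)
    {
        try
        {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();

            try (ObjectOutputStream oos = new ObjectOutputStream(baos))
            { oos.writeObject(original); }

            try (ObjectInputStream ois =
                    new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray())))
            { return (T) ois.readObject(); }
        }
        catch (IOException | ClassNotFoundException e)
        { throw new IllegalStateException("Serialization round-trip failed", e); }
    }

    public static void main(String[] args)
    {
        Snapshot before = new Snapshot(new boolean[] { false, true }, new long[] { 2048L, -1L });
        Snapshot after  = roundTrip(before);

        System.out.println(after.sizes[0] + ", " + after.skipped[1]);
    }
}
```

Swapping the in-memory streams for a FileOutputStream and FileInputStream would persist the same bytes to disk.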
-
urls
public final java.net.URL[] urls
This will contain a complete list of the URL's that were retrieved (or generated, if partially-resolved 'relative' URL's were provided) from the Request-instance's static-builder. Every image downloaded (or attempted for download) will have its URL saved here, in this array.
Nulls in the urls Array
An array-element of the urls-array will contain null under the following two circumstances:
- No image-URL was provided, because the picture was a Base-64 Encoded Image, and instead was a String that had been retrieved from a TagNode's SRC-Attribute.
- The user provided a String to the Request class builder, but that String caused a MalformedURLException, and no URL-instance was ever built. (Note that in this scenario, the exceptions-array would be storing the URL-Exception that was thrown.)
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
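The second circumstance above (a String that cannot be turned into a URL) can be sketched like this. The class UrlParseSketch is hypothetical and only imitates the described outcome, where the urls slot stays null and the exception is recorded at the same index. How the real Request builder performs this step is not shown on this page, so the use of the java.net.URL constructor is an assumption.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical stand-in that records per-index outcomes in three parallel arrays
class UrlParseSketch
{
    final URL[]       urls;
    final Exception[] exceptions;
    final boolean[]   skipped;

    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs, but still available
    UrlParseSketch(String[] inputs)
    {
        urls       = new URL[inputs.length];        // elements default to null
        exceptions = new Exception[inputs.length];  // elements default to null
        skipped    = new boolean[inputs.length];    // elements default to false

        for (int i = 0; i < inputs.length; i++)
            try
                { urls[i] = new URL(inputs[i]); }
            catch (MalformedURLException e)
            {
                // urls[i] remains null; the failure is recorded at the same index
                exceptions[i] = e;
                skipped[i]    = true;
            }
    }

    public static void main(String[] args)
    {
        UrlParseSketch r = new UrlParseSketch(
            new String[] { "https://example.com/a.png", "no-protocol/picture.png" });

        for (int i = 0; i < r.urls.length; i++)
            System.out.println(i + ": url=" + r.urls[i] + ", exception=" + r.exceptions[i]);
    }
}
```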
-
b64EncodedImg
public final boolean[] b64EncodedImg
When constructing an ImageScraper's Request object, one of the options for building the instance is to pass a list of HTML TagNode instances containing HTML '<IMG SRC=...>' tags.
HTML TagNode elements will occasionally have a variant of an image source known as the Base-64 Encoded Image. These are images where the actual picture is fully stored & encoded inside the SRC-Attribute of the HTML TagNode.
Base-64 Images are just pictures that have been converted into actual character data, in the form of a simple String, and saved into the <IMG> tag's SRC-Attribute - instead of an actual HTTP URL being saved there. Note that this practice is generally used for much smaller pictures, thumbnails or logos (images that wouldn't use up a lot of data).
A full explanation of HTML's Base-64 Image-Encoding is beyond the scope of this Java-Doc Comment.
If any image was "converted" from a B-64 Image-Encoding (rather than downloaded from a URL), then the boolean for the image's index will be TRUE rather than FALSE. The default value for all elements of this array is FALSE.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
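For readers unfamiliar with Base-64 images, here is a minimal sketch of recognizing and decoding one with the JDK's java.util.Base64. The class DataUriSketch and its two methods are hypothetical helpers written for this example; they are not part of the ImageScraper tools, and the "data:image/...;base64," prefix check is a simplification of the general data-URI syntax.

```java
import java.util.Base64;

// Hypothetical helper: detects and decodes a Base-64 encoded SRC-Attribute value
class DataUriSketch
{
    private static final String MARKER = ";base64,";

    // TRUE if the SRC value looks like an embedded Base-64 image rather than an HTTP URL
    static boolean isBase64Image(String src)
    { return src.startsWith("data:image/") && src.contains(MARKER); }

    // Returns the raw image bytes stored after the ";base64," marker
    static byte[] decode(String src)
    {
        if (!isBase64Image(src))
            throw new IllegalArgumentException("Not a Base-64 image SRC: " + src);

        String payload = src.substring(src.indexOf(MARKER) + MARKER.length());

        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args)
    {
        // The first four bytes of the PNG file signature, used as tiny sample data
        byte[] sample = { (byte) 0x89, 'P', 'N', 'G' };
        String src    = "data:image/png;base64," + Base64.getEncoder().encodeToString(sample);

        System.out.println(src);
        System.out.println("decoded byte count: " + decode(src).length);
        System.out.println("is Base-64 image:   " + isBase64Image("https://example.com/a.png"));
    }
}
```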
-
skipped
public final boolean[] skipped
An image's Results-data in this particular parallel-array will be TRUE under any of the following circumstances:
- If the user provided a Request.skipURL-Predicate, and that predicate rejected the image (telling the downloader not to download the picture).
- If the user provided a Request.keeperPredicate, and that predicate, after image-download completion, rejected the image (telling the downloader not to save the picture to disk).
- If there were any exceptions thrown while downloading the image that forced the downloader-logic to abandon the image (and either throw the exception, or skip-and-move-on to the next image).
- If the original Iterable that was provided to the Request-instance had entries that caused exceptions to be thrown while building the Request-instance.
Under any / all other circumstances, if an image was successfully downloaded and written to disk, then the corresponding element in this array will contain FALSE!
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
-
fileNames
public final java.lang.String[] fileNames
The names of the files that were retrieved and/or stored will be in this array. If an image was skipped or an exception occurred, the array position for that URL will contain 'null'.
It is important to note that if an element of this array contains a valid, non-null, file-name, it does not guarantee that the image was properly saved. The value stored in the corresponding (parallel) skipped-array index is the only way to ascertain whether an image was ultimately written to disk (or transmitted to a User-Provided Request.imageReceiver).
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
-
saveDirectories
public final java.lang.String[] saveDirectories
The directory in which each image-file was saved. If an image did not successfully save to the File-System, or if an imageReceiver was used, then the array-location will contain 'null'.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
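Putting the fileNames, saveDirectories and skipped descriptions together, a caller who wants the on-disk paths of the saved images might proceed roughly as below. The class SavedPathsSketch is hypothetical, and joining directory and name with java.io.File is an assumption, since this page does not say whether a saveDirectories entry already ends with a separator.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper operating on arrays shaped like the Results fields
class SavedPathsSketch
{
    // Returns a path for every index that was not skipped and has both a directory and a name.
    static List<String> savedPaths(boolean[] skipped, String[] saveDirectories, String[] fileNames)
    {
        List<String> paths = new ArrayList<>();

        for (int i = 0; i < skipped.length; i++)
        {
            // The skipped flag decides; a non-null file-name alone proves nothing
            if (skipped[i]) continue;

            // A null here means nothing was written to the File-System for this index
            if ((saveDirectories[i] == null) || (fileNames[i] == null)) continue;

            paths.add(new File(saveDirectories[i], fileNames[i]).getPath());
        }

        return paths;
    }

    public static void main(String[] args)
    {
        // Invented sample data: index 1 has a file-name but was skipped anyway
        boolean[] skipped         = { false, true, false };
        String[]  saveDirectories = { "out", "out", null };
        String[]  fileNames       = { "a.jpg", "b.jpg", "c.png" };

        savedPaths(skipped, saveDirectories, fileNames).forEach(System.out::println);
    }
}
```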
-
imageFormats
public final IF[] imageFormats
The image type of the files that were retrieved will be stored in this array.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then the imageFormats-array at that image's index would contain null (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain null.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's image-format anyway! This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's image-format, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
-
exceptions
public final java.lang.Exception[] exceptions
If any stage of the image download, conversion or disk-write fails, then this array will store a record of the exception that was thrown.
If the download succeeded, then the 'exceptions'-array element at that index would contain 'null'. Any exceptions-array index that contains a non-null Exception will be an index for which the skipped-array contains a TRUE-value stored at the same location.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
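One way to get an overview of what went wrong in a download is to tally the non-null entries of an exceptions-style array by exception type. The helper class ExceptionTallySketch below is hypothetical and works on any Exception[]; it is offered as a sketch, not as a feature of the Results class.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts recorded exceptions, grouped by their simple class name
class ExceptionTallySketch
{
    static Map<String, Integer> tally(Exception[] exceptions)
    {
        // TreeMap keeps the report sorted alphabetically by exception name
        Map<String, Integer> counts = new TreeMap<>();

        for (Exception e : exceptions)
            if (e != null) // null means nothing was recorded at this index
                counts.merge(e.getClass().getSimpleName(), 1, Integer::sum);

        return counts;
    }

    public static void main(String[] args)
    {
        // Invented sample data
        Exception[] exceptions = {
            null,
            new java.io.IOException("connection reset"),
            new java.io.IOException("timed out"),
            new IllegalStateException("bad image data")
        };

        tally(exceptions).forEach((name, count) -> System.out.println(name + ": " + count));
    }
}
```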
-
sizes
public final long[] sizes
This will contain a list of long-integers, each of which will have the file-size of the downloaded image.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then the sizes-array at that image's index would contain -1 (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain -1.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's size anyway! This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's size, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
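Because -1 marks an index with no size information, any arithmetic over a sizes-style array has to filter that sentinel out first. The sketch below shows one way to total the known sizes; SizeTotalsSketch is a hypothetical helper, not part of the library.

```java
// Hypothetical helper: sums the known sizes while ignoring the -1 sentinel
class SizeTotalsSketch
{
    static long totalBytes(long[] sizes)
    {
        long total = 0;

        for (long size : sizes)
            if (size >= 0) total += size; // -1 means "no size was recorded"

        return total;
    }

    public static void main(String[] args)
    {
        long[] sizes = { 1024L, -1L, 2048L }; // invented sample data

        System.out.println(totalBytes(sizes)); // prints 3072
    }
}
```

Remember that a non-negative size does not imply the image reached the disk; only the skipped-array says so.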
-
widths
public final int[] widths
This will contain a list of integers, each of which is the image-width of a downloaded image.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then the widths-array at that image's index would contain -1 (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain -1.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's width anyway! This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's width, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
-
heights
public final int[] heights
This will contain a list of integers, each of which is the image-height of a downloaded image.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then the heights-array at that image's index would contain -1 (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain -1.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's height anyway! This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's height, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays: The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results class.
-
-
Method Detail
-
clone
-
equals
public boolean equals(java.lang.Object other)
Checks whether 'this' instance is equal to another instance of class Results. This method performs a Deep-Equals comparison using the equals(...) method suite found in class java.util.Arrays.
- Overrides:
equals in class java.lang.Object
- Parameters:
other - This may be any Java Object, but only an instance of class 'Results' has any chance of being marked as equal to this instance.
- Returns:
TRUE if and only if 'other' has a type that's assignable to Results, and if each of the internal arrays in this instance is equal to the corresponding array in parameter 'other'.
- Code:
- Exact Method Body:
return ResultsEquals.equals(this, other);
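The body of ResultsEquals is not shown on this page, so the following is only a sketch of what a Deep-Equals over parallel arrays built on java.util.Arrays can look like. The class DeepEqualsSketch and its three fields are hypothetical; the real class compares more arrays than these.

```java
import java.util.Arrays;

// Hypothetical stand-in with three of the parallel arrays
class DeepEqualsSketch
{
    final boolean[] skipped;
    final String[]  fileNames;
    final long[]    sizes;

    DeepEqualsSketch(boolean[] skipped, String[] fileNames, long[] sizes)
    { this.skipped = skipped; this.fileNames = fileNames; this.sizes = sizes; }

    @Override
    public boolean equals(Object other)
    {
        if (this == other) return true;
        if (!(other instanceof DeepEqualsSketch)) return false;

        DeepEqualsSketch o = (DeepEqualsSketch) other;

        // Arrays.equals compares element-by-element; array1.equals(array2) would only compare references
        return  Arrays.equals(this.skipped, o.skipped)
            &&  Arrays.equals(this.fileNames, o.fileNames)
            &&  Arrays.equals(this.sizes, o.sizes);
    }

    // Kept consistent with equals: equal arrays always produce equal hash-codes
    @Override
    public int hashCode()
    { return Arrays.hashCode(sizes); }

    public static void main(String[] args)
    {
        DeepEqualsSketch a = new DeepEqualsSketch(
            new boolean[] { false }, new String[] { "a.jpg" }, new long[] { 10L });

        DeepEqualsSketch b = new DeepEqualsSketch(
            new boolean[] { false }, new String[] { "a.jpg" }, new long[] { 10L });

        System.out.println(a.equals(b)); // prints true: same contents, different array objects
    }
}
```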
-
toString
public java.lang.String toString()
Returns a java.lang.String representation of 'this' instance.
- Overrides:
toString in class java.lang.Object
- Returns:
A Java String containing the data inside this class.
- Code:
- Exact Method Body:
return ResultsToString.run(this);
-
hashCode
public int hashCode()
Java's hash-code requirement. The code is computed by summing up to the first 10 elements of the sizes array.
- Overrides:
hashCode in class java.lang.Object
- Returns:
A hash-code that may be used when storing this instance in a Java hash-based collection, such as a HashMap or HashSet.
- Code:
- Exact Method Body:
int sum = 0;
for (int i=0; (i < 10) && (i < sizes.length); i++) sum += sizes[i];
return sum;
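To see what the method body above produces, the same loop is copied below into a standalone static method that takes the sizes array as a parameter. The class HashCodeSketch, the method name hashOf and the sample values are hypothetical and exist only for this illustration.

```java
// Hypothetical wrapper around the loop shown in the Exact Method Body
class HashCodeSketch
{
    static int hashOf(long[] sizes)
    {
        int sum = 0;

        // The compound assignment narrows each long to int, exactly as in the original body
        for (int i = 0; (i < 10) && (i < sizes.length); i++) sum += sizes[i];

        return sum;
    }

    public static void main(String[] args)
    {
        System.out.println(hashOf(new long[] { 100L, 200L, 300L })); // prints 600
        System.out.println(hashOf(new long[] { 100L, -1L }));        // prints 99: a -1 entry lowers the sum

        long[] twelveOnes = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };
        System.out.println(hashOf(twelveOnes));                      // prints 10: only the first ten count
    }
}
```

Two instances whose first ten sizes add up to the same total share a hash-code, which is permitted; equals(...) is what finally tells them apart.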
-
-