Package Torello.HTML.Tools.Images
Class Results
- java.lang.Object
  - Torello.HTML.Tools.Images.Results
- All Implemented Interfaces: java.io.Serializable, java.lang.Cloneable
public class Results extends java.lang.Object implements java.io.Serializable, java.lang.Cloneable
ImageScraper-Suite Class
The ImageScraper Tool itself includes three 'Helper-Classes' that facilitate its operations: Request, Results and ImageInfo.
Building a Request:
Building an Image-Download Request instance should be straightforward, and there is an example of doing just that at the top of the Request class. Properly configuring the class to handle the errors and exceptions that might occur when downloading images from a web-server requires a little reading of the JavaDoc pages provided by these tools.
The Request class includes several booleans for suppressing / skipping exceptions if they occur during the download loop. If an exception is thrown and suppressed, it will simply be logged to the Results class.
Once a Request object has been built, simply pass that object-instance to the ImageScraper method download and a download-process will begin.
Request's Lambda-Targets:
If the Request object contained any Lambda-Targets / Function-Pointers, then those Lambda-Methods will be passed instances of the 'Helper-Class' ImageInfo when they are invoked by the download-loop. These Function-Pointers allow a programmer to do things like filter out certain Image-URLs and decide where a downloaded image is ultimately stored.
Finally, when the download-loop has run to completion, it will return an instance of class Results.
Getting Results:
After the ImageScraper.download(...) loop has run to completion, an instance of class Results will be returned to the user. It simply contains several parallel arrays that store data about what transpired when trying to download each of the Image-URLs which were passed to the Request object.
For instance, the 'skipped' array will indicate which pictures didn't download. The 'fileNames' array will hold the file name of each image that was successfully downloaded. And the 'imageFormats' array will identify which format was ultimately decided upon when saving the image.
Remember that each of these return-arrays is parallel to the others, and (of course) all will be identical in length. Furthermore, as per the definition of "Parallel-Arrays", the element residing at any index will always correspond to the same image in every one of the other arrays.
After downloading all of the user's requested images, the class ImageScraper returns an instance of this class. Care has been taken to ensure this class does not freeze nor fail when a particular image fails to download. This level of control is customizable, so if the programmer would prefer download execution to halt immediately upon exception, there are several settings for this in the class Request.
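The parallel-array layout described above can be sketched as follows. This is only an illustration: the three local arrays are stand-ins for the public fields of a Results instance (which is built by the downloader, not by user code), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: plain local arrays play the role of the public
// parallel-array fields of a Results instance (an assumption for illustration).
class ParallelArraysSketch
{
    // Builds one summary line per image by reading index i of every array.
    static List<String> summarize(boolean[] skipped, String[] fileNames, long[] sizes)
    {
        List<String> lines = new ArrayList<>();

        for (int i = 0; i < skipped.length; i++)
        {
            if (skipped[i]) lines.add(i + ": skipped");
            else lines.add(i + ": saved " + fileNames[i] + " (" + sizes[i] + " bytes)");
        }

        return lines;
    }

    public static void main(String[] args)
    {
        // Index 1 models an image that was skipped: null name, size of -1.
        boolean[] skipped   = { false, true, false };
        String[]  fileNames = { "img0.jpg", null, "img2.png" };
        long[]    sizes     = { 1024L, -1L, 2048L };

        summarize(skipped, fileNames, sizes).forEach(System.out::println);
    }
}
```

With a real Results instance, the same loop would presumably read results.skipped, results.fileNames and results.sizes instead of the local stand-ins.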
The link below contains the output of invoking the toString() method on a 'Results' instance after downloading all of the HTML <IMG SRC=...> tags that were scraped from the Web-Server news.yahoo.com.
results.txt
Initializations:
Many of the elements in the arrays of class 'Results' (this class) may contain null, or '-1'. This happens if an exception is thrown while downloading or saving a particular image, which prevents the process from running to completion for that image.
To view the exact initialization-value for the elements of any of these arrays, simply click on the Hi-Lited Source-Code link, and scroll down to the Package-Private Results class constructor.
The 'skipped' Array:
There is an easy way to check which URLs have failed or were skipped (due to user-request), and which URLs were successfully obtained. One of several arrays that are public in this class is the skipped boolean[]-array.
Whenever a particular URL-index corresponds to a skipped-index that contains TRUE, it indicates that that particular URL ultimately was not properly saved or re-transmitted.
- See Also: ImageScraper, Serialized Form
Hi-Lited Source-Code:
- View Here: Torello/HTML/Tools/Images/Results.java
- Open New Browser-Tab: Torello/HTML/Tools/Images/Results.java
File Size: 24,502 Bytes
Line Count: 604 '\n' Characters Found
-
-
Field Summary
Serializable ID
  Modifier and Type    Field
  static long          serialVersionUID
Parallel Result-Arrays: Original Source-Location of the Image
  Modifier and Type    Field
  boolean[]            b64EncodedImg
  URL[]                urls
Parallel Result-Arrays: Image-Retrieval Success or Failure
  Modifier and Type    Field
  Exception[]          exceptions
  boolean[]            skipped
Parallel Result-Arrays: Image-File Write Name & Location
  Modifier and Type    Field
  String[]             fileNames
  String[]             saveDirectories
Parallel Result-Arrays: Image Specs
  Modifier and Type    Field
  int[]                heights
  IF[]                 imageFormats
  long[]               sizes
  int[]                widths
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
- See Also: Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
-
urls
public final java.net.URL[] urls
This will contain a complete list of the URLs that were retrieved (or generated, if partially-resolved 'relative' URLs were provided) from the Request-instance's static-builder. Every image downloaded (or attempted for download) will have its URL saved here, in this array.
Nulls in the urls Array
An element of the urls-array will contain null under the following two circumstances:
- No image-URL was provided, because the picture was a Base-64 Encoded Image, and instead was a String that had been retrieved from a TagNode's SRC-Attribute.
- The user provided a String to the Request Class builder, but that String caused a MalformedURLException, and no URL-instance was ever built. (Note that in this scenario, the exceptions array would be storing the MalformedURLException that was thrown.)
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
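The second null case above (a String that cannot be turned into a URL) can be sketched with the standard java.net.URL constructor. This is not the Request builder's actual code, which is not shown on this page; the class name, method name and arrays below are hypothetical stand-ins that only mirror the described outcome: a null URL paired with a recorded exception at the same index.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical sketch of how a malformed String could leave a null in a
// 'urls' array while the parallel 'exceptions' array records the cause.
class UrlParseSketch
{
    // For each input, fills either urls[i] or exceptions[i], never both.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs
    static void parseAll(String[] inputs, URL[] urls, Exception[] exceptions)
    {
        for (int i = 0; i < inputs.length; i++)
        {
            try
                { urls[i] = new URL(inputs[i]); }
            catch (MalformedURLException e)
                { exceptions[i] = e; } // urls[i] keeps its default value: null
        }
    }

    public static void main(String[] args)
    {
        String[]    inputs     = { "https://example.com/a.png", "no-protocol-here" };
        URL[]       urls       = new URL[inputs.length];
        Exception[] exceptions = new Exception[inputs.length];

        parseAll(inputs, urls, exceptions);

        for (int i = 0; i < inputs.length; i++)
            System.out.println(i + ": url=" + urls[i] + ", exception=" + exceptions[i]);
    }
}
```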
-
b64EncodedImg
public final boolean[] b64EncodedImg
When constructing an ImageScraper's Request object, one of the options for building the instance is to pass a list of HTML TagNode instances containing HTML '<IMG SRC=...>' tags.
HTML TagNode elements will occasionally have a variant of an image source known as the Base-64 Encoded Image. These are images where the actual picture is fully stored & encoded inside the SRC-Attribute of the HTML TagNode.
Base-64 Images are just pictures that have been converted into character data, in the form of a simple String, and saved into the <IMG> tag's SRC-Attribute instead of an actual HTTP URL being saved there. Note that this practice is generally used for much smaller pictures, thumbnails or logo signs (images that wouldn't use up a lot of data).
A full explanation of HTML's Base-64 Image-Encoding is beyond the scope of this Java-Doc Comment.
If any image was "converted" from a B-64 Image-Encoding (rather than downloaded from a URL), then the boolean for the image's index will be TRUE rather than FALSE. The default value for all elements of this array is FALSE.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
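For readers unfamiliar with Base-64 image sources, the following minimal sketch shows what such an SRC value looks like and how the standard java.util.Base64 decoder turns it back into bytes. It is not the ImageScraper's own conversion logic; the class and helper names are hypothetical, and the sample string encodes only the 8-byte PNG signature rather than a full picture.

```java
import java.util.Base64;

// Hypothetical sketch: recognising and decoding a Base-64 'data:' image source.
class DataUriSketch
{
    static final String MARKER = ";base64,";

    // True when the SRC value looks like an inline Base-64 image rather than a URL.
    static boolean isBase64DataUri(String src)
    { return (src != null) && src.startsWith("data:image/") && src.contains(MARKER); }

    // Returns the raw image bytes that follow the ";base64," marker.
    static byte[] decode(String src)
    {
        String payload = src.substring(src.indexOf(MARKER) + MARKER.length());
        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args)
    {
        // The payload below is just the PNG file signature, Base-64 encoded.
        String src = "data:image/png;base64,iVBORw0KGgo=";

        if (isBase64DataUri(src))
            System.out.println("Decoded " + decode(src).length + " bytes");
    }
}
```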
-
skipped
public final boolean[] skipped
An image's Results-data in this particular parallel-array will be TRUE under any of the following circumstances:
- If the user provided a Request.skipURL-Predicate, and that predicate rejected the image (telling the downloader not to download the picture).
- If the user provided a Request.keeperPredicate, and that predicate, after image-download completion, rejected the image (telling the downloader not to save the picture to disk).
- If there were any exceptions thrown while downloading the image that forced the downloader-logic to abandon the image (and either throw the exception, or skip-and-move-on to the next image).
- If the original Iterable that was provided to the Request-instance had entries that had caused exceptions to be thrown while building the Request-instance.
Under all other circumstances, if an image was successfully downloaded and written to disk, then the corresponding element in this array will contain FALSE.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
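One way to tell the circumstances listed above apart is to read the skipped array together with the parallel exceptions array. The sketch below is an assumption-laden illustration, not part of the library: the class and method are hypothetical, and the arrays are local stand-ins for the Results fields.

```java
// Hypothetical sketch: describing why an index was skipped by combining
// the 'skipped' flag with the parallel 'exceptions' entry.
class SkipReasonSketch
{
    static String describe(boolean[] skipped, Exception[] exceptions, int i)
    {
        if (!skipped[i]) return "saved";

        // A recorded exception suggests a download, conversion or write failure.
        if (exceptions[i] != null)
            return "skipped, exception: " + exceptions[i].getClass().getSimpleName();

        // Otherwise the skip was presumably requested, e.g. by a user predicate.
        return "skipped, no exception recorded";
    }

    public static void main(String[] args)
    {
        boolean[]   skipped    = { false, true, true };
        Exception[] exceptions = { null, new java.io.IOException("timeout"), null };

        for (int i = 0; i < skipped.length; i++)
            System.out.println(i + ": " + describe(skipped, exceptions, i));
    }
}
```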
-
fileNames
public final java.lang.String[] fileNames
The names of the files that were retrieved and/or stored will be in this array. If an image was skipped or an exception occurred, the array position for that URL would contain 'null'.
It is important to note that if an element of this array contains a valid, non-null file-name, it does not guarantee that the image was properly saved. The value stored in the corresponding (parallel) skipped-array index is the only way to ascertain whether an image was ultimately written to disk (or transmitted to a User-Provided Request.imageReceiver).
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
-
saveDirectories
public final java.lang.String[] saveDirectories
The directory into which each image file was saved. If an image did not successfully save to the file system, or if an imageReceiver was used, then the array-location would contain 'null'.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
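Putting the fileNames and saveDirectories arrays together, a caller might rebuild the on-disk paths of the images that were actually saved. The sketch below is hypothetical (class, method and sample values are invented for illustration) and treats the skipped array as the source of truth. It also assumes that a directory entry and a file name can simply be joined with java.nio.file.Paths, which this page does not confirm.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: listing the paths of images that were written to disk.
class SavedPathsSketch
{
    static List<String> savedPaths
        (boolean[] skipped, String[] saveDirectories, String[] fileNames)
    {
        List<String> paths = new ArrayList<>();

        for (int i = 0; i < skipped.length; i++)
        {
            // 'skipped' decides whether the image was ultimately saved.
            if (skipped[i]) continue;

            // A null directory may mean the image went to an imageReceiver instead.
            if ((saveDirectories[i] == null) || (fileNames[i] == null)) continue;

            paths.add(Paths.get(saveDirectories[i], fileNames[i]).toString());
        }

        return paths;
    }

    public static void main(String[] args)
    {
        boolean[] skipped         = { false, true, false };
        String[]  saveDirectories = { "out", "out", null };
        String[]  fileNames       = { "a.jpg", "b.jpg", "c.jpg" };

        savedPaths(skipped, saveDirectories, fileNames).forEach(System.out::println);
    }
}
```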
-
imageFormats
public final IF[] imageFormats
The image type of the files that were retrieved will be stored in this array.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then this array at that image's index would contain null (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain null.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's image-format anyway. This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's image-format, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
-
exceptions
public final java.lang.Exception[] exceptions
If any stage of the image download, conversion or disk-write fails, then this array will store a record of the exception that was thrown.
If the download succeeded, then the 'exceptions'-array element at that index would contain 'null'. Any exceptions-array index that contains a non-null Exception will be an index for which the skipped-array contains a TRUE value stored at the same location.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
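After a large download, it can help to see which kinds of exceptions occurred and how often. The following is a small sketch under the assumption that the caller has an Exception[] shaped like the exceptions field; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: tallying the recorded exceptions by class name.
class ExceptionTallySketch
{
    static Map<String, Integer> tally(Exception[] exceptions)
    {
        // TreeMap keeps the class names in a stable, sorted order.
        Map<String, Integer> counts = new TreeMap<>();

        for (Exception e : exceptions)
            if (e != null) // null means no exception was recorded at that index
                counts.merge(e.getClass().getName(), 1, Integer::sum);

        return counts;
    }

    public static void main(String[] args)
    {
        Exception[] exceptions =
        {
            null,
            new java.io.IOException("timeout"),
            new java.net.MalformedURLException("no protocol"),
            new java.io.IOException("reset")
        };

        tally(exceptions).forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```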
-
sizes
public final long[] sizes
This will contain a list of long-integers, each of which will have the file-size of the downloaded image.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then the sizes-array at that image's index would contain -1 (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain -1.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's size anyway. This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's size, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
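Because a size may be recorded even for an image that was never saved, "bytes downloaded" and "bytes saved" can differ. The sketch below illustrates that distinction with hypothetical helper methods over stand-in arrays. It assumes, as described above, that -1 marks an image whose size was never obtained.

```java
// Hypothetical sketch: totalling sizes while respecting the -1 placeholder.
class SizeTotalsSketch
{
    // Sum of every size that was actually obtained (entries of -1 are ignored).
    static long totalDownloaded(long[] sizes)
    {
        long total = 0;
        for (long s : sizes) if (s >= 0) total += s;
        return total;
    }

    // Sum of the sizes of images that were not skipped, i.e. ultimately saved.
    static long totalSaved(long[] sizes, boolean[] skipped)
    {
        long total = 0;
        for (int i = 0; i < sizes.length; i++)
            if (!skipped[i] && (sizes[i] >= 0)) total += sizes[i];
        return total;
    }

    public static void main(String[] args)
    {
        // Index 1: never downloaded. Index 2: downloaded, but rejected afterwards.
        long[]    sizes   = { 1000L, -1L, 500L };
        boolean[] skipped = { false, true, true };

        System.out.println("Downloaded: " + totalDownloaded(sizes));
        System.out.println("Saved:      " + totalSaved(sizes, skipped));
    }
}
```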
-
widths
public final int[] widths
This will contain a list of integers, each of which will have the image-width of the downloaded image.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then this array at that image's index would contain -1 (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain -1.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's width anyway. This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's width, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
-
heights
public final int[] heights
This will contain a list of integers, each of which will have the image-height of the downloaded image.
Image-Download Skipped:
If the image-downloader threw an exception while attempting to download a particular image, then this array at that image's index would contain -1 (and the skipped-array would contain TRUE).
If the user-provided skipURL-Predicate asked that a URL be skipped, then the element at that array-index would also contain -1.
Image-Download Succeeded:
If a particular Image-URL was successfully downloaded, but the User-Provided keeperPredicate has asked that the downloader not save the image to disk, then the value in this array would contain that image's height anyway. This would be despite the fact that the image wasn't actually written to disk.
If, after downloading, an image failed to write to disk due to an exception, then the value in this array would, again, contain that image's height, even though the image itself hadn't properly saved. (The skipped-array would still have a TRUE value stored in it.)
In Summary: When an image is skipped due to an exception, or some other issue that occurred during the downloader-loop, partially-acquired information about the picture will still be written to the class Results-arrays, whether or not the image has been saved to the file-system.
Parallel Arrays:
The index of this array will be parallel to the input-source Iterable<URL> retrieval order. All array-indices in this array contain elements that are parallel to the corresponding elements in the other 9 array-fields of this Results Class.
-
-
Method Detail
-
clone
-
equals
public boolean equals(java.lang.Object other)
Checks whether 'this' instance is equal to another instance of class Results. This method performs a Deep Equals comparison using the equals(...) method suite found in class java.util.Arrays.
- Overrides: equals in class java.lang.Object
- Parameters: other - This may be any Java Object, but only an instance of class 'Results' has any chance of being marked as equal to this instance.
- Returns: TRUE if and only if 'other' has a type that's assignable to Results, and if each of the internal arrays in this instance is equal to the corresponding array in parameter 'other'.
- Code:
- Exact Method Body:
if (other == null) return false;
if (! Results.class.isAssignableFrom(other.getClass())) return false;

Results r = (Results) other;

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// NOTE: These arrays cannot ever be null, that is an "Unreachable Situation"
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

return
        Arrays.equals(this.urls, r.urls)
    &&  Arrays.equals(this.b64EncodedImg, r.b64EncodedImg)
    &&  Arrays.equals(this.skipped, r.skipped)
    &&  Arrays.equals(this.fileNames, r.fileNames)
    &&  Arrays.equals(this.saveDirectories, r.saveDirectories)
    &&  Arrays.equals(this.imageFormats, r.imageFormats)
    &&  Arrays.equals(this.exceptions, r.exceptions)
    &&  Arrays.equals(this.sizes, r.sizes)
    &&  Arrays.equals(this.widths, r.widths)
    &&  Arrays.equals(this.heights, r.heights);
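A consequence of comparing with java.util.Arrays.equals is worth spelling out: object arrays are compared element by element using each element's own equals(...) method, and the standard exception classes do not override it. The sketch below uses only standard-library behavior on stand-in arrays (the class and method names are hypothetical). It suggests that two result sets holding distinct exception instances would likely not compare as equal, even if the exceptions carry the same message.

```java
import java.io.IOException;
import java.util.Arrays;

// Hypothetical sketch: how Arrays.equals treats primitive and Exception arrays.
class ArraysEqualsSketch
{
    // Primitive arrays are equal when lengths and all values match.
    static boolean sameSizes(long[] a, long[] b)
    { return Arrays.equals(a, b); }

    // Object arrays delegate to each element's equals(...); for exceptions
    // that is identity, because Throwable does not override equals.
    static boolean sameExceptions(Exception[] a, Exception[] b)
    { return Arrays.equals(a, b); }

    public static void main(String[] args)
    {
        System.out.println(sameSizes(new long[] { 1, 2 }, new long[] { 1, 2 }));

        Exception shared = new IOException("timeout");
        System.out.println(sameExceptions(
            new Exception[] { shared }, new Exception[] { shared }));

        System.out.println(sameExceptions(
            new Exception[] { new IOException("timeout") },
            new Exception[] { new IOException("timeout") }));
    }
}
```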
-
toString
public java.lang.String toString()
Returns a java.lang.String representation of 'this' instance.
- Overrides: toString in class java.lang.Object
- Returns: A Java String containing the data inside this class.
- Code:
- Exact Method Body:
StringBuilder sb = new StringBuilder();

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// NOTE: These arrays, themselves can never be null - BUT THEIR CONTENTS ARE OFTEN NULL
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

for (int i=0; i < urls.length; i++)
{
    if (b64EncodedImg[i]) sb.append("Base 64 Encoded Image\n");

    else sb.append(
        "URL: " +
        ((urls[i] == null)
            ? "null"
            : StrPrint.abbrev(urls[i].toString(), 50, true, " ... ", 100)) +
        '\n'
    );

    boolean comma = false;

    if (skipped[i] == true)
    {
        sb.append(" SKIPPED");
        comma = true;
    }

    if (imageFormats[i] != null)
    {
        sb.append(comma ? COMMA : " ");
        sb.append("Format: " + imageFormats[i]);
        comma = true;
    }

    if (sizes[i] > 0)
    {
        sb.append(comma ? COMMA : " ");
        sb.append("Size: " + StringParse.commas(sizes[i]));
        comma = true;
    }

    if (widths[i] > 0)
    {
        sb.append(comma ? COMMA : " ");
        sb.append("W: " + StringParse.commas(widths[i]));
        comma = true;
    }

    if (heights[i] > 0)
    {
        sb.append(comma ? COMMA : " ");
        sb.append("H: " + StringParse.commas(heights[i]));
        comma = true;
    }

    if (comma) sb.append('\n');

    comma = false;

    if (fileNames[i] != null)
    {
        sb.append(" FileName: [" + fileNames[i] + ']');
        comma = true;
    }

    if (saveDirectories[i] != null)
    {
        sb.append(comma ? COMMA : " ");
        sb.append("Dir: [" + StrPrint.abbrev(saveDirectories[i], 30, true, null, 60) + ']');
        comma = true;
    }

    if (comma) sb.append('\n');

    if (exceptions[i] != null)
        sb.append(" Thrown: " + exceptions[i].getClass().getName() + '\n');

    if (i < (urls.length - 1)) sb.append('\n');
}

return sb.toString();
-
hashCode
public int hashCode()
Java's hash-code requirement. The code is computed by summing the first 10 sizes array elements (or all of them, if there are fewer than 10).
- Overrides: hashCode in class java.lang.Object
- Returns: A hash-code that may be used when storing this instance in a Java hash-based collection.
- Code:
- Exact Method Body:
int sum = 0;

for (int i=0; (i < 10) && (i < sizes.length); i++) sum += sizes[i];

return sum;
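The method body above adds long values into an int using the compound-assignment operator, which in Java narrows the result silently. The sketch below repeats the same loop over a stand-in array (the class and method names are hypothetical) to show two consequences: placeholder values of -1 take part in the sum, and very large sizes wrap around. Neither matters for correctness of a hash-code, which only needs to be consistent with equals.

```java
// Hypothetical sketch: the same summing loop, applied to stand-in 'sizes' arrays.
class HashSumSketch
{
    static int hash(long[] sizes)
    {
        int sum = 0;

        // 'sum += long' compiles because compound assignment casts back to int.
        for (int i = 0; (i < 10) && (i < sizes.length); i++) sum += sizes[i];

        return sum;
    }

    public static void main(String[] args)
    {
        // Includes a -1 placeholder for an image whose size was never obtained.
        System.out.println(hash(new long[] { 100L, 200L, -1L }));

        // A value beyond the int range wraps around when narrowed.
        System.out.println(hash(new long[] { 3_000_000_000L }));
    }
}
```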
-
-