HTML Parse, Search, Update and Scrape, Version 1.7

Initial Release: January 2022
Latest Release: November 2022
This JAR Library is an efficient HTML Parser, and provides a powerful API in comparison to other HTML Tools available. The parser itself produces a greatly simplified, flat Java Vector of HTMLNode. These nodes are little more than wrapped instances of java.lang.String. Because Java Lists are produced, rather than JavaScript-style DOM Trees, modifying, interpreting and understanding HTML pages in Java becomes much easier. The library provides myriad tools for finding attributes inside HTML Tags, and for changing, adding or removing them. There are many classes of static methods for finding sublists, sub-sections and tables, and for iterating through the contents of those entities using enhanced Java Iterators.
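To illustrate why a flat list of string-backed nodes is easy to work with, here is a minimal, self-contained sketch. It does not use this library's API: the class FlatHtmlSketch, its nested Node class and the tokenize method are hypothetical stand-ins written only for this illustration, and the tokenizer is deliberately naive (it ignores comments, scripts and a '>' inside attribute values).

```java
import java.util.ArrayList;
import java.util.List;

public class FlatHtmlSketch {

    // Hypothetical stand-in for a string-backed node; NOT the library's HTMLNode.
    static final class Node {
        final String str;
        Node(String str) { this.str = str; }
        boolean isTag() { return str.startsWith("<"); }
        @Override public String toString() { return str; }
    }

    // Naive tokenizer: each "<...>" becomes one tag node, and the text
    // between two tags becomes one text node. The result is a flat list.
    static List<Node> tokenize(String html) {
        List<Node> out = new ArrayList<>();
        int i = 0;
        while (i < html.length()) {
            if (html.charAt(i) == '<') {
                int end = html.indexOf('>', i);
                if (end < 0) end = html.length() - 1;   // unterminated tag: take the rest
                out.add(new Node(html.substring(i, end + 1)));
                i = end + 1;
            } else {
                int end = html.indexOf('<', i);
                if (end < 0) end = html.length();       // trailing text: take the rest
                out.add(new Node(html.substring(i, end)));
                i = end;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Node> page = tokenize("<div><a href=\"x.html\">Link</a> and text</div>");

        // Editing is ordinary list manipulation: find a node by index, replace it.
        for (int k = 0; k < page.size(); k++) {
            Node n = page.get(k);
            if (n.isTag() && n.str.startsWith("<a "))
                page.set(k, new Node(n.str.replace("x.html", "y.html")));
        }

        // Re-serializing the page is just concatenating the nodes in order.
        StringBuilder sb = new StringBuilder();
        for (Node n : page) sb.append(n);
        System.out.println(sb);
    }
}
```

Because every node sits at an integer index, searching, slicing and replacing reduce to standard java.util.List operations, with no tree traversal involved.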

The power of the HTML Utilities provided can best be seen through the Java Doc Upgrader Tool, which has been written, in its entirety, using the HTML Parser inside this JAR Library. The documentation linked below has been run through and upgraded using the Java Doc Upgrader Utility. Though the primary impetus for writing a Java Doc Tool was to showcase how the Vectorized HTML actually works, the upgrader also provides myriad ways to enhance your Java program's documentation.

Documentation

View Docs
Download Docs  [~105 MB]

Code as a JAR

Download JavaHTML-1.7.jar  [~5.2 MB]
Jar-File built using Java 11 (Alternate compilations available)

External JARs

Java Parser JAR - If you wish to use the Java Doc Upgrader section of Java HTML, you will need the Java Parser JAR File in your class path. This is only needed if using the classes in the Torello.HTML.Tools.JavaDoc sub-package.
GCS.jar - If you wish to use the class Torello.Java.LFEC.GCSSB, you will need to download one of Google's Java JAR Libraries. This JAR is only needed for that single class (GCSSB).
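
Since both external JARs are optional, a program may want to confirm that one is present before touching the classes that depend on it. The following is a generic sketch using only the standard Java reflection API; the class ClasspathCheck and its isOnClasspath method are hypothetical names invented for this example, and the class name probed in main is a placeholder that should be replaced with a class actually contained in the JAR of interest.

```java
public class ClasspathCheck {

    // Returns true if the named class can be loaded by this class's loader.
    // Passing 'false' for initialize avoids running the class's static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or something it links against) is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder name: substitute a class that lives inside the optional JAR.
        String probe = args.length > 0 ? args[0] : "com.example.SomeClassInOptionalJar";

        if (isOnClasspath(probe))
            System.out.println(probe + " found; optional features can be used.");
        else
            System.out.println(probe + " not found; add the JAR to the class path first.");
    }
}
```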