Package Torello.HTML

Interface Replaceable

  • All Superinterfaces:
    java.lang.Comparable<Replaceable>
    All Known Implementing Classes:
    CommentNodeIndex, NodeIndex, SubSection, TagNodeIndex, TextNodeIndex

    public interface Replaceable
    extends java.lang.Comparable<Replaceable>
    The class ReplaceNodes offers an efficiency optimization for modifying vectorized-HTML. HTML Pages can be very long, and the insertion or removal of a snippet of HTML may result in the shifting of hundreds (or even thousands) of HTMLNode's. This can incur a non-trivial performance cost if there are many updates and changes to be made to a page.



    Excerpt from currentNodes():
    Replaceable's are, sort-of, the exact opposite of Java's List method 'subList'. According to the Sun / Oracle Documentation for java.util.List.subList(int fromIndex, int toIndex), any changes made to an instance of a 'subList' are immediately reflected back into the original List from where they were created.

    The List.subList operation has the advantage of being extremely easy to work with - however, an HTML-Page Vector has the potential of being hundreds of nodes long. Any operations that involve insertion or deletion will likely be terribly inefficient.

    When the HTML inside of a Replaceable is modified, nothing happens to the original Vector whatsoever. Until a user requests that the original HTML-Vector be updated to reflect all changes that he or she has made, the original HTML remains untouched. When an update request is finally issued, all changes are made at once.

    Again - see ReplaceNodes.r(Vector, Iterable, boolean) to understand how quick updates on HTML-Pages are done using the Replaceable interface.
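
The contrast between a write-through view and an independent copy can be illustrated with plain java.util collections. What follows is a minimal sketch rather than code from this library: it assumes Vector<String> as a stand-in for Vector<HTMLNode>, and the class name SubListVersusCopy and its two helper methods are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class SubListVersusCopy
{
    // Returns a live VIEW of page[start, end): edits write through to 'page' immediately.
    static List<String> view(Vector<String> page, int start, int end)
    { return page.subList(start, end); }

    // Returns an independent COPY of page[start, end): edits leave 'page' untouched.
    // (Assumption: this mimics how a Replaceable holds its nodes apart from the page.)
    static Vector<String> copy(Vector<String> page, int start, int end)
    { return new Vector<>(page.subList(start, end)); }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("<p>", "Hello", "</p>", "<hr>"));

        // Editing the copy does not change the page
        Vector<String> snippet = copy(page, 0, 3);
        snippet.set(1, "Goodbye");
        System.out.println(page);   // prints [<p>, Hello, </p>, <hr>]

        // Editing the view changes the page at once
        List<String> live = view(page, 0, 3);
        live.set(1, "Goodbye");
        System.out.println(page);   // prints [<p>, Goodbye, </p>, <hr>]
    }
}
```

With the copy, the page only changes when the caller decides to write the snippet back, which is the behavior this interface describes.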



    Utilizing class ReplaceNodes:
    Class ReplaceNodes offers three methods for performing these optimized replacements. These methods are listed below. The optimization that is utilized there is to first calculate the size / length of the updated Vector, and then do the entire update all at once. This eliminates the need to do any shifting and only performs a single resizing of the Vector.

    These methods work in lock-step with interface 'Replaceable' to actually perform the update once the programmer deems that the Vectorized-HTML has been sufficiently changed:



    The Java Doc Upgrader Tool in this JAR Library heavily relies on using instances of Replaceable to update and modify Java Doc HTML in a fast, simple & efficient manner.

    Though this class may look somewhat complicated to understand, it is actually very simple. Load a web-page from disk (or download one from the Internet) and run it through the parser (class HTMLPage) to make a Vectorized-HTML Page. Next, build a few instances of SubSection which hold both the location of an HTML snippet and the HTML itself.

    Finally, make whatever modifications you want to those HTML snippets, and call the ReplaceNodes method listed first in the list above! The page should be updated quickly with little cost overhead.
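
The "compute the final size first, then build once" idea described in this section can be sketched with plain java.util types. This is an illustration under stated assumptions, not the actual body of ReplaceNodes.r: Vector<String> stands in for Vector<HTMLNode>, and the class SinglePassRebuild, its nested Edit type and the rebuild method are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class SinglePassRebuild
{
    // Hypothetical stand-in for a Replaceable:
    // the original range [start, end) of the page, plus the nodes that now replace it.
    static final class Edit
    {
        final int start, end;
        final List<String> current;

        Edit(int start, int end, List<String> current)
        { this.start = start; this.end = end; this.current = current; }
    }

    // Assumes 'edits' is sorted by start and that no two edits overlap.
    static Vector<String> rebuild(Vector<String> page, List<Edit> edits)
    {
        // Step 1: compute the size of the updated page before copying anything
        int newSize = page.size();
        for (Edit e : edits) newSize += e.current.size() - (e.end - e.start);

        // Step 2: allocate once, then copy untouched runs and new contents in order.
        // Every node is written exactly once; nothing is shifted.
        Vector<String> out = new Vector<>(newSize);
        int pos = 0;

        for (Edit e : edits)
        {
            out.addAll(page.subList(pos, e.start));   // unchanged nodes before this edit
            out.addAll(e.current);                    // replacement nodes
            pos = e.end;                              // skip the replaced original range
        }

        out.addAll(page.subList(pos, page.size()));   // unchanged tail of the page
        return out;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));

        List<Edit> edits = Arrays.asList(
            new Edit(1, 2, Arrays.asList("X", "Y")),   // replace "b" with two nodes
            new Edit(3, 5, Arrays.<String>asList())    // remove "d" and "e"
        );

        System.out.println(rebuild(page, edits));      // prints [a, X, Y, c]
    }
}
```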



    Peek Operation Replaceables
    The classes InnerTagPeekInclusive and TagNodePeekInclusive always return instances that implement the Replaceable interface, and those instances are always sorted and non-overlapping.

    This means that if a set or collection of Replaceable's were created using the NodeSearch 'Peek' Search-Classes, the ReplaceNodes.r(Vector, Iterable, boolean) requirements that the Replaceable's be ordered, sorted and non-overlapping would be automatically met.

    This interface is implemented by all return-values for the NodeSearch Peek operations.
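
For Replaceable's that were not produced by a Peek operation, a caller may want to verify the ordering requirement before requesting an update. The following is only a sketch: it assumes each range is given as a {start, end} pair with an exclusive end, and the class RangeOrderCheck and its method are hypothetical. How ReplaceNodes.r itself reacts to unordered input is not covered here.

```java
class RangeOrderCheck
{
    // Each element of 'ranges' is {start, endExclusive}.
    // Returns true when every range begins at or after the end of the previous one,
    // which means the ranges are both sorted and non-overlapping.
    static boolean isSortedAndDisjoint(int[][] ranges)
    {
        for (int i = 1; i < ranges.length; i++)
            if (ranges[i][0] < ranges[i - 1][1]) return false;

        return true;
    }

    public static void main(String[] args)
    {
        System.out.println(isSortedAndDisjoint(new int[][] { {0, 2}, {2, 5}, {7, 8} })); // prints true
        System.out.println(isSortedAndDisjoint(new int[][] { {0, 3}, {2, 5} }));         // prints false
    }
}
```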



    • Method Detail

      • compareTo

        default int compareTo​(Replaceable other)
        Satisfies the requirement of Java's Comparable interface.
        Specified by:
        compareTo in interface java.lang.Comparable<Replaceable>
        Returns:
        An integer based on comparing the starting locations for two Replaceable instances.
        Code:
        Exact Method Body:
         return this.originalLocationStart() - other.originalLocationStart();
        
      • originalSize

        int originalSize()
        Reports how many nodes were copied into this instance. For implementing classes that inherit NodeIndex, this value will always be one. For others, it should report exactly how many HTMLNode's were copied.
        Returns:
        Number of nodes originally contained by this instance.

        The purpose of Replaceable's is to allow a user to modify HTML using a smaller sub-list, without having to operate on the entire HTML-Vector. Since adding & removing nodes is one variant of Vector-modification, the original-size may often differ from the current-size.

        When modifying HTML, if a web-page is broken into smaller pieces, and changes are restricted to those smaller sub-lists (and the original page is rebuilt, all at once, after all changes have been made), then those modifications should require far fewer time-consuming list-shift operations, improving the performance of the code.
      • currentSize

        int currentSize()
        Returns how many nodes are currently in this instance.
        Returns:
        Number of nodes. See the explanation of the original size versus the current size in originalSize().
      • originalLocationStart

        int originalLocationStart()
        Returns the start-location within the original page-Vector from whence the HTML contents of this instance were retrieved.

        Start is Inclusive:
        The returned value is inclusive of the actual, original-range of this instance. This means the first HTMLNode copied into this instance's internal data-structure was at originalLocationStart().

        Implementations of Replaceable:
        The two concrete implementations of this interface (NodeIndex and SubSection) both enforce the 'final' modifier on their location-fields. (See: NodeIndex.index and SubSection.location).
        Returns:
        The Vector start-index from whence this HTML was copied.
      • originalLocationEnd

        int originalLocationEnd()
        Returns the end-location within the original page-Vector from whence the HTML contents of this instance were retrieved.

        End is Exclusive:
        The returned value is exclusive of the actual, original-range of this instance. This means the last HTMLNode copied into this instance's internal data-structure was at originalLocationEnd() - 1.

        Implementations of Replaceable:
        The two concrete implementations of this interface (NodeIndex and SubSection) both enforce the 'final' modifier on their location-fields. (See: NodeIndex.index and SubSection.location).
        Returns:
        The Vector end-index from whence this HTML was copied.
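
The inclusive-start / exclusive-end convention leads to some simple index arithmetic, sketched below. The class HalfOpenRange and its methods are hypothetical helpers written for illustration. The only library detail relied upon is that DotPair.end is inclusive, as the method bodies shown on this page indicate.

```java
class HalfOpenRange
{
    // Number of nodes in the range [start, endExclusive)
    static int size(int start, int endExclusive)
    { return endExclusive - start; }

    // Index of the last node that belongs to the range
    static int lastIndex(int endExclusive)
    { return endExclusive - 1; }

    // Converts an inclusive end (the DotPair convention) to an exclusive end
    static int toExclusiveEnd(int inclusiveEnd)
    { return inclusiveEnd + 1; }

    public static void main(String[] args)
    {
        // A range copied from page indices 10, 11 and 12
        System.out.println(size(10, 13));         // prints 3
        System.out.println(lastIndex(13));        // prints 12
        System.out.println(toExclusiveEnd(12));   // prints 13

        // A zero-length range (start == end) holds no original nodes
        System.out.println(size(7, 7));           // prints 0
    }
}
```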
      • currentNodes

        java.util.Vector<HTMLNode> currentNodes()
        All nodes currently contained by this Replaceable. The concrete-classes which implement Replaceable (SubSection & TagNodeIndex) allow for the html they hold to be modified. The modification to a Replaceable happens independently from the original HTML Page out of which it was copied.

        Replaceable's are, sort-of, the exact opposite of Java's List method 'subList'. According to the Sun / Oracle Documentation for java.util.List.subList(int fromIndex, int toIndex), any changes made to an instance of a 'subList' are immediately reflected back into the original List from where they were created.

        The List.subList operation has the advantage of being extremely easy to work with - however, an HTML-Page Vector has the potential of being hundreds of nodes long. Any operations that involve insertion or deletion will likely be terribly inefficient.

        When the HTML inside of a Replaceable is modified, nothing happens to the original Vector whatsoever. Until a user requests that the original HTML-Vector be updated to reflect all changes that he or she has made, the original HTML remains untouched. When an update request is finally issued, all changes are made at once.

        Again - see ReplaceNodes.r(Vector, Iterable, boolean) to understand how quick updates on HTML-Pages are done using the Replaceable interface.
        Returns:
        An HTML-Vector of the nodes.
      • addAllInto

        boolean addAllInto​(java.util.Vector<HTMLNode> html)
        Adds all nodes currently retained in this instance into the HTML-Vector parameter html. The nodes are appended to the end of 'html'. Implementing classes NodeIndex and SubSection simply use the Java Vector methods add (for NodeIndex) and addAll (for SubSection).
        Parameters:
        html - The HTML-Vector into which the nodes will be appended (to the end of this Vector, using Vector methods add or addAll dependent upon whether one or more-than-one nodes are being inserted).
        Returns:
        The result of Vector method add, or method addAll
      • addAllInto

        boolean addAllInto​(int index,
                           java.util.Vector<HTMLNode> html)
        Add all nodes currently retained in this instance into the HTML-Vector parameter html.
        Parameters:
        index - The 'html' parameter's Vector-index where these nodes are to be inserted
        html - The HTML-Vector into which the nodes will be inserted (at position 'index', using Vector methods add or addAll dependent upon whether one or more-than-one nodes are being inserted).
        Returns:
        The result of Vector method add, or method addAll
      • update

        int update​(java.util.Vector<HTMLNode> originalHTML)
        Replaces the original range of nodes inside originalHTML with the current-nodes of this instance, using the original-location of the node(s).

        Replaceable's Primary Value:
        The main value of the Replaceable interface is that it allows HTML Pages to be modified more quickly. If many changes need to be made to a page, first copy the sub-sections that need changing into Replaceable instances (using the Peek operations in package NodeSearch), change them, and then copy those sections back into the original page-Vector. This avoids the cost that would be incurred from repeatedly inserting and shifting a long list of nodes in a large HTML Page.

        Therefore, this method is probably best avoided, as it defeats the entire purpose of a Replaceable. This method will update the nodes at the location in the original-Vector, which is fine, but if more than one update / change is needed, using this method over-and-over again will re-introduce the exact shifting that Replaceable's were supposed to avoid in the first place!

        The following example should make this clear:

        Example:
        Vector<HTMLNode>    page        = HTMLPage.getPageTokens(new URL("http://some.url.com/"), false);
        Vector<SubSection>  myTableRows = TagNodePeekInclusive.all(page, "tr");
        TagNode             OPEN_SPAN   = HTMLTags.hasTag("SPAN", TC.OpeningTags);
        TagNode             CLOSE_SPAN  = HTMLTags.hasTag("SPAN", TC.ClosingTags);
        int                 counter     = 1;
        
        for (SubSection tableRow : myTableRows)
        {
            // Retrieve the <TR> Tag & Give it a CSS-ID
            TagNode tr = tableRow.html.elementAt(0).asTagNode().setID("ROW" + counter++, null);
        
            // Put the newly created <TR ID=..> into the vector.  It was the first-element in the SubSection
            tableRow.html.setElementAt(tr, 0);
        
            // Add a <SPAN>...</SPAN> surrounding the first line of text
            // NOTE: This assumes that tableRow[1] (second SubSection node) is a TextNode with text
        
            tableRow.html.insertElementAt(OPEN_SPAN, 1);
            tableRow.html.insertElementAt(CLOSE_SPAN, 3);
        }
        
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        // This version DESTROYS THE BENEFIT of using TagNodePeekInclusive
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        //
        // Here, if the original html-page was thousands of nodes long, every table-row
        // update will force thousands of nodes to be shifted to the right over-and-over
        // again!
        
        for (SubSection tableRow : myTableRows) tableRow.update(page);
        
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        // This builds a new Vector much more efficiently, avoiding costly node-shifting
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        
        page = ReplaceNodes.r(page, myTableRows, false).a;
        
        Parameters:
        originalHTML - The original page-Vector where the nodes in this instance were retrieved
        Returns:
        The change in the size of the Vector
        Throws:
        java.lang.IndexOutOfBoundsException - If originalLocationStart() or originalLocationEnd() is not within the bounds of the input html-page.
        See Also:
        ReplaceNodes.r(Vector, Iterable, boolean)
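
To see why a single in-place update is costly on a long page, consider the sketch below. It is not the implementation used by NodeIndex or SubSection. It is a minimal stand-in that assumes Vector<String> in place of Vector<HTMLNode>, and the class InPlaceUpdate and its update method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class InPlaceUpdate
{
    // Replaces page[start, end) with 'current' and returns the change in page size.
    // Both calls below move every node located after the edited range, which is
    // the shifting cost that a one-pass rebuild avoids.
    static int update(Vector<String> page, int start, int end, List<String> current)
    {
        page.subList(start, end).clear();   // remove the original range (later nodes shift left)
        page.addAll(start, current);        // insert the new nodes (later nodes shift right)

        return current.size() - (end - start);
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("a", "b", "c", "d"));

        int delta = update(page, 1, 3, Arrays.asList("X"));

        System.out.println(page);    // prints [a, X, d]
        System.out.println(delta);   // prints -1
    }
}
```

Note that after one such update, the original locations stored by any later ranges no longer line up with the page unless the returned size change is taken into account.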
      • setHTML

        default Replaceable setHTML​(java.util.Vector<HTMLNode> newHTML)
        This method may be used for arbitrary replacements. An instance of NodeIndex (or one of its sub-classes) only contains a single HTMLNode. To change that to a list, or to remove that node altogether, invoke this method, and a new instance of Replaceable will be automatically created and returned.

        This may be a little tricky at first, but the primary reason for using this method is that size-changes that would make a single-node (NodeIndex instance) into a list (SubSection instance), or vice-versa, would require building a different type of Replaceable instance. This method will automatically build that instance into a Replaceable that retains its original location, but reflects its new contents and size.

        Once again, the primary impetus for this method is using it with an in-place page update having multiple-replacements, vis-a-vis a call to ReplaceNodes.r(Vector, Iterable, boolean).
        Parameters:
        newHTML - The contents of 'this' replaceable will be assigned to the html in this parameter.
        Returns:
        a new replaceable whose location has not changed, but whose contents are the contents of newHTML.
        Code:
        Exact Method Body:
         final int oldSize   = this.originalSize();
         final int newSize   = newHTML.size();
         final int sPos      = this.originalLocationStart();
         final int ePos      = this.originalLocationEnd();
        
         // SubSection ==> SubSection
         if ((oldSize > 1) && (newSize > 1))
             return new SubSection(new DotPair(sPos, ePos - 1), newHTML);
        
         // NodeIndex ==> NodeIndex
         if ((oldSize == 1) && (newSize == 1))
             return NodeIndex.newNodeIndex(sPos, newHTML.elementAt(0));
        
         // Empty ==> Empty
         if ((oldSize == 0) && (newSize == 0))
             return empty(sPos);
        
         return new ReplaceableAdapter(sPos, ePos, newHTML);
        
      • setHTML

        default Replaceable setHTML​(HTMLNode newHTML)
        See the description in setHTML(Vector) to understand when to use setHTML. This method is identical, but accepts a single HTMLNode instance, instead of an html list.
        Parameters:
        newHTML - The contents of 'this' replaceable will be assigned to the html contained by newHTML. (The returned instance will have the same location values)
        Returns:
        a new replaceable whose location has not changed, but whose contents are newHTML.
        See Also:
        setHTML(Vector)
        Code:
        Exact Method Body:
         // NodeIndex ==> NodeIndex
         if (this.originalSize() == 1)
             return NodeIndex.newNodeIndex(this.originalLocationStart(), newHTML);
        
         Vector<HTMLNode> v = new Vector<>();
         v.add(newHTML);
        
         return new ReplaceableAdapter
             (this.originalLocationStart(), this.originalLocationEnd(), v);
        
      • clearHTML

        default Replaceable clearHTML()
        Removes all HTML from this Replaceable, such that currentNodes() would return an empty HTML list.
        Returns:
        a new replaceable whose original location has not changed, but whose contents are empty.
        See Also:
        setHTML(Vector)
        Code:
        Exact Method Body:
         if (currentSize() == 0) return Replaceable.empty(originalLocationStart());
        
         return new ReplaceableAdapter
             (originalLocationStart(), originalLocationEnd(), new Vector<>());
        
      • create

        static Replaceable create​(DotPair location,
                                  java.util.Vector<HTMLNode> html)
        Provides a mechanism for creating a SubSection instance whose html does not match the size of the location where that html is to be placed.
        Parameters:
        location - The range in any HTML Page by which the new html will be replaced.
        html - The html that will ultimately be used to replace the current-html, on a web-page, at the specified location.
        Returns:
        An instance of a Replaceable, that is, in-effect, a SubSection, but one whose location/bounds do not match the size of the new-html.

        NOTE: This method allows a user to bypass the exception-check that class SubSection performs when building an instance of that class.
        Code:
        Exact Method Body:
         return new ReplaceableAdapter(location.start, location.end + 1, html);
        
      • create

        static Replaceable create​(int location,
                                  java.util.Vector<HTMLNode> html)
        Creates a new Replaceable instance whose original-location is just a single-node, but whose new html may be an arbitrarily-sized html Vector.
        Parameters:
        location - The node in any HTML Page which shall be replaced by 'html'
        html - The html that will replace the node on an HTML page located at 'location'
        Returns:
        An instance of a Replaceable that is, in effect, a SubSection, but one whose location/bounds are not (necessarily) a single page-index.

        NOTE: This method allows a user to bypass the requirement that a NodeIndex occupy only a single-node.
        Code:
        Exact Method Body:
         return new ReplaceableAdapter(location, location + 1, html);
        
      • createInsertion

        static Replaceable createInsertion​(int location,
                                           java.util.Vector<HTMLNode> html)
        Creates a new Replaceable instance whose original-location had zero-length
        Parameters:
        location - The location in any HTML Page into which the 'html' shall be inserted
        html - The html that will be inserted into an HTML Page at index 'location'
        Returns:
        An instance of a Replaceable - whose original-location had a zero-length
        Code:
        Exact Method Body:
         return new ReplaceableAdapter(location, location, html);
        
      • moveAndUpdate

        default Replaceable moveAndUpdate​(int sPos)
        This method is mostly of internal-use, mainly by ReplaceNodes.r(Vector, Iterable, boolean)
        Parameters:
        sPos - The new location in an html page-Vector where the contents of this Replaceable are now located.
        Returns:
        A new instance, whose html-contents are identical, but is located at 'sPos' (and having an ending-location of sPos + currentSize()).
        Code:
        Exact Method Body:
         // IMPORTANT: This method is extremely un-important!  It looks kind of unreadable.
         //            All it is doing is REGISTERING the changes to this SubSection or NodeIndex
         //            by building a new SubSection or new NodeIndex.
         //
         // PRIMARILY: Since the *WHOLE POINT* is to make all of the changes to an HTML Page, first,
         //            before doing an update ... Having updated Replaceable's is mostly a waste.
         //            Specifically, after the page has been updated, keeping the sub-parts of the 
         //            page would no longer be necessary!
         //
         // ReplaceNodes: This class offers the option to 'updateReplaceablesAfterBuild' in case
         //            (for whatever reason) the user has decided another round of page updates is
         //            needed.
        
         final int size = currentSize();
        
         switch (size)
         {
             case 0: return Replaceable.empty(sPos);
             case 1: return NodeIndex.newNodeIndex(sPos, firstCurrentNode());
        
             default:
                 return new SubSection(
                     // DotPair.end is inclusive, so subtract 1
                     new DotPair(sPos, sPos + size - 1),
        
                     // The current HTML Vector
                     currentNodes()
                 );
         }
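
The new start position passed to this method can be derived from the original locations and the size changes of all earlier Replaceable's. The sketch below shows that bookkeeping under stated assumptions: the ranges are sorted and non-overlapping, ends are exclusive, and the class RelocatedStarts with its newStarts method is hypothetical rather than part of ReplaceNodes.

```java
import java.util.Arrays;

class RelocatedStarts
{
    // For the i-th range: originalStart[i] (inclusive), originalEnd[i] (exclusive), and
    // currentSize[i] (how many nodes it holds now). Returns where each range begins in
    // the rebuilt page.
    static int[] newStarts(int[] originalStart, int[] originalEnd, int[] currentSize)
    {
        int[] result = new int[originalStart.length];
        int   delta  = 0;   // net growth (or shrinkage) caused by all earlier ranges

        for (int i = 0; i < originalStart.length; i++)
        {
            result[i] = originalStart[i] + delta;
            delta += currentSize[i] - (originalEnd[i] - originalStart[i]);
        }

        return result;
    }

    public static void main(String[] args)
    {
        // Range 0: one node at [1, 2) that now holds two nodes (page grows by one)
        // Range 1: two nodes at [3, 5) that now holds zero nodes
        int[] starts = newStarts(new int[] {1, 3}, new int[] {2, 5}, new int[] {2, 0});

        System.out.println(Arrays.toString(starts));   // prints [1, 4]
    }
}
```

The ending-location of each moved range is then its new start plus its current size, as the 'Returns' description of this method states.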
        
      • empty

        static Replaceable empty​(int sPos)
        Returns an empty Replaceable (an instance having 0 HTMLNode's) located at sPos.

        NoSuchElementException:
        Attempting to retrieve nodes from the returned-instance will generate a Java NoSuchElementException.
        Parameters:
        sPos - The location of this zero-element Replaceable
        Returns:
        The new instance.
        Code:
        Exact Method Body:
         return new ReplaceableAdapter(sPos, sPos, new Vector<>());
        
      • isSynthetic

        default boolean isSynthetic()
        Identifies whether or not 'this' instance is an anonymous class, that was built from the (internal) ReplaceableAdapter.
        Returns:
        TRUE if 'this' is neither an instance that inherits NodeIndex nor inherits SubSection. Such instances are built from an internal ReplaceableAdapter, and are produced by the methods: setHTML(Vector), setHTML(HTMLNode), clearHTML(), and empty(int).
        Code:
        Exact Method Body:
         return false;