Package Torello.HTML
Class Features
- java.lang.Object
-
- Torello.HTML.Features
-
public class Features extends java.lang.Object
Tools to retrieve and insert tags into the <HEAD> of a web-page.
Replaceable Optimizations:
Note that if many updates are being performed, it is very likely more efficient to use the Replaceable interface to perform these page changes.
The first thing to do is extract a SubSection instance (SubSection is one of the classes that implement Replaceable) containing the HTML <HEAD> ... </HEAD> section.
Once extracted, perform all the Header-Tag modifications by having this class, Features, operate on SubSection.html.
After all updates have been made, use the class ReplaceNodes to re-insert the previously extracted HEAD-Section back into the page, so that any node-shifts that have to occur happen only once. The example below demonstrates how this is done:
Example:
// Scrape any Page
Vector<HTMLNode> page = HTMLPage.getPageTokens(new URL("http://some.url.com/page.html"), false);

// IMPORTANT: By extracting a "SubSection", the next several lines which insert HTML into the
// header section **DO NOT** require shifting hundreds of HTML nodes forward to
// perform these inserts. Only the nodes in the header are shifted forward during
// these insert operations.
SubSection header = TagNodePeekInclusive.first(page, "HEAD");

// Add Some HTML / CSS Header Elements
Features.insertFavicon(header.html, "../../SiteLogo.png");
Features.insertCSSLink(header.html, "../../MyCSSPage.css");
Features.Meta.insertUTF8MetaTag(header.html);
Features.Meta.insertKeyWords(header.html, "Java", "HTML", "Parse");

// Re-Insert the header back into the main page. The node-shifting that has to occur of this
// potentially very-large Web-Page will only happen once!
page = ReplaceNodes.r(page, header);
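To see why working on an extracted header avoids repeated node-shifting, here is a minimal, self-contained sketch. It does not use the Torello classes: plain String elements stand in for HTMLNode instances, and the helper names extractHead and replaceHead are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical stand-in for the SubSection / ReplaceNodes workflow.
// Plain Strings play the role of HTMLNode instances.
final class HeadExtractionSketch
{
    // Copy the nodes from index 'start' through 'end' (inclusive) into their own Vector.
    static Vector<String> extractHead(Vector<String> page, int start, int end)
    { return new Vector<>(page.subList(start, end + 1)); }

    // Build the result in one pass: nodes before the header, the edited header,
    // then the nodes after the header. The (large) body is copied exactly once.
    static Vector<String> replaceHead
        (Vector<String> page, int start, int end, Vector<String> editedHead)
    {
        Vector<String> ret = new Vector<>();
        ret.addAll(page.subList(0, start));
        ret.addAll(editedHead);
        ret.addAll(page.subList(end + 1, page.size()));
        return ret;
    }

    static Vector<String> demo()
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<HTML>", "<HEAD>", "</HEAD>", "<BODY>", "Hello", "</BODY>", "</HTML>"
        ));

        int start = page.indexOf("<HEAD>");
        int end   = page.indexOf("</HEAD>");

        Vector<String> head = extractHead(page, start, end);

        // Each insert shifts only the few nodes inside the small header Vector.
        head.add(1, "<LINK REL='icon' TYPE='image/png' HREF='../../SiteLogo.png' />");
        head.add(1, "<LINK REL=stylesheet TYPE='text/css' HREF='../../MyCSSPage.css' />");

        return replaceHead(page, start, end, head);
    }

    public static void main(String[] args)
    { for (String node : demo()) System.out.println(node); }
}
```

The inserts touch only the small header Vector; the body nodes are moved a single time, when the edited header is merged back.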
File Size: 105,007 Bytes Line Count: 2,229 '\n' Characters Found
Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 10 Method(s), 10 declared static
- 7 Field(s), 7 declared static, 7 declared final
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class
static class         Features.Meta
-
Field Summary
Fields
Modifier and Type            Field
static String                canonicalTag
static String                cssExternalSheet
static String                cssExternalSheetWithMediaAttribute
static String                favicon
static String                javaScriptExternalPage
protected static TextNode    NEWLINE
static String                NO_HEADER_MESSAGE
-
Method Summary
Retrieve HTML Header Elements
Modifier and Type          Method
static Vector<TagNode>     getAllCSSLinks(Vector<? extends HTMLNode> html)
static String[]            getAllExternalJSLinks(Vector<? extends HTMLNode> html)
static String              hasCanonicalURL(Vector<? extends HTMLNode> html)
static String              hasFavicon(Vector<? extends HTMLNode> html)
Insert HTML Elements into <HEAD>...</HEAD>
Modifier and Type          Method
static void                insertCanonicalURL(Vector<HTMLNode> html, String canonicalURLAsStr)
static void                insertCSSLink(Vector<HTMLNode> html, String externalCSSFileURLAsString)
static void                insertCSSLink(Vector<HTMLNode> html, String externalCSSFileURLAsString, String mediaInnerTagValue)
static void                insertExternalJavaScriptLink(Vector<HTMLNode> html, String externalJSFileURLAsString)
static void                insertFavicon(Vector<HTMLNode> html, String imageURLAsString)
Internal Methods
Modifier and Type          Method
protected static void      checkForSingleQuote(String s)
-
-
-
Field Detail
-
NO_HEADER_MESSAGE
public static final java.lang.String NO_HEADER_MESSAGE
Error Message that is used repeatedly.
- See Also:
Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final String NO_HEADER_MESSAGE =
    "You are attempting to insert an HTML INSERT-STR, but such an element belongs in the " +
    "page's header. Unfortunately, the page or sub-page you have passed does not have a " +
    "<HEAD>...</HEAD> sub-section. Therefore, there is no place to insert the elements.";
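The insert methods documented further down specialize this message by replacing the INSERT-STR placeholder. A minimal sketch of that substitution follows; the class name HeaderMessageSketch and the helper messageFor are hypothetical, and the constant is simply copied from the declaration above.

```java
// Hypothetical demo class; only the String constant comes from the documentation above.
final class HeaderMessageSketch
{
    static final String NO_HEADER_MESSAGE =
        "You are attempting to insert an HTML INSERT-STR, but such an element belongs in the " +
        "page's header. Unfortunately, the page or sub-page you have passed does not have a " +
        "<HEAD>...</HEAD> sub-section. Therefore, there is no place to insert the elements.";

    // Substitute a description of the element being inserted for the placeholder.
    static String messageFor(String elementDescription)
    { return NO_HEADER_MESSAGE.replace("INSERT-STR", elementDescription); }

    public static void main(String[] args)
    { System.out.println(messageFor("favicon <LINK> element")); }
}
```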
-
favicon
public static final java.lang.String favicon
This String may be inserted in the HTML <HEAD> ... </HEAD> section to add a "logo-image" at the top-left corner of the Web-Browser's tab for the page when it loads. This logo is called a 'favicon'.
- See Also:
insertFavicon(Vector, String), hasFavicon(Vector), Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final String favicon = "<LINK REL='icon' TYPE='image/INSERT-IMAGE-TYPE-HERE' HREF='INSERT-URL-STRING-HERE' />";
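As an illustration of how the two placeholders in this template could be filled, here is a small sketch. It is an assumption-laden stand-in: guessExtension is a hypothetical helper that just takes the text after the last '.', and it is not the library's IF.getGuess. The documented insertFavicon method builds an equivalent tag by concatenation rather than by calling replace on this field.

```java
import java.util.Locale;

// Hypothetical demo class; only the template String comes from the documentation above.
final class FaviconTemplateSketch
{
    static final String FAVICON =
        "<LINK REL='icon' TYPE='image/INSERT-IMAGE-TYPE-HERE' HREF='INSERT-URL-STRING-HERE' />";

    // Hypothetical helper: use the text after the last '.' as the image type.
    // Returns null when no usable extension is present.
    static String guessExtension(String url)
    {
        int dot = url.lastIndexOf('.');
        if (dot == -1 || dot == url.length() - 1) return null;

        String ext = url.substring(dot + 1);

        // A '/' after the last dot means the dot belonged to a directory such as "../"
        if (ext.indexOf('/') != -1) return null;

        return ext.toLowerCase(Locale.ROOT);
    }

    // Fill both placeholders of the template.
    static String fill(String imageURL)
    {
        String ext = guessExtension(imageURL);

        if (ext == null) throw new IllegalArgumentException
            ("Could not determine an image type for: " + imageURL);

        return FAVICON
            .replace("INSERT-IMAGE-TYPE-HERE", ext)
            .replace("INSERT-URL-STRING-HERE", imageURL);
    }

    public static void main(String[] args)
    { System.out.println(fill("../../SiteLogo.png")); }
}
```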
-
cssExternalSheet
public static final java.lang.String cssExternalSheet
This String may be inserted in the HTML <HEAD> ... </HEAD> section to add a Cascading Style Sheet (a '.css' file) to your page.
The web-browser that ultimately loads the HTML that you are exporting will render the style elements across all the HTML elements in your page that match their respective CSS-Selectors. Without going into a big diatribe about how CSS works, just know that the String used to build / instantiate a new TagNode with an externally linked CSS-Page is provided here, by this field.
- See Also:
insertCSSLink(Vector, String), getAllCSSLinks(Vector), Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final String cssExternalSheet = "<LINK REL=stylesheet TYPE='text/css' HREF='INSERT-URL-STRING-HERE' />";
-
cssExternalSheetWithMediaAttribute
public static final java.lang.String cssExternalSheetWithMediaAttribute
This String may be inserted in the HTML <HEAD> ... </HEAD> section to add a Cascading Style Sheet (a '.css' file) to your page. This particular String-Constant Field includes / allows for a MEDIA-Attribute / Inner-Tag.
- See Also:
insertCSSLink(Vector, String), insertCSSLink(Vector, String, String), getAllCSSLinks(Vector), Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final String cssExternalSheetWithMediaAttribute =
    "<LINK REL=stylesheet TYPE='text/css' HREF='INSERT-URL-STRING-HERE' " +
    "MEDIA='INSERT-MEDIA-ATTRIBUTE-VALUE-HERE' />";
-
javaScriptExternalPage
public static final java.lang.String javaScriptExternalPage
This String may be inserted in the HTML <HEAD> ... </HEAD> section to add an externally-linked Java-Script File (a '.js' File) to your page.
The Web-Browser will download this Java-Script page from the URL that you ultimately provide and (hopefully) load all your variable definitions and methods when the page loads.
Closing </SCRIPT> Tag:
Inserting an external Java-Script Page has one important difference vis-a-vis inserting an external CSS-Page. Inserting a link to a '.js' page requires both the opening <SCRIPT ..> and the closing </SCRIPT> Tags.
This is expected and required even when (especially when) there is no actual Java-Script code being placed on the '.html' page itself. Regardless of whether you are putting actual Java-Script code inside your HTML page, or just inserting a link to a '.js' File on your server, you must always create both the open and the closed HTML <SCRIPT SRC='...'></SCRIPT> tags and insert them into your Vectorized-HTML Web-Page.
In the brief example below, it should be clear that even though the first pair of SCRIPT-Tags does not enclose any Java-Script, both the open and the closed versions of the tag are placed into the HTML-File.
HTML Elements:
<!-- This is a short note about including the HTML SCRIPT element in your web-pages. -->
<HTML>
<HEAD>
<!-- Version #1 Inserting a java-script 'variables & functions' external-page -->
<SCRIPT TYPE='text/javascript' SRC='/script/javaScriptFiles/functions.js'>
</SCRIPT>
<!-- Right here (line above) we always need the closing Script-tag, even when there is no
     actual java-script present, and the methods/variables are going to be downloaded
     from the java-script file identified by the SRC="..." attribute! -->
<SCRIPT TYPE='text/javascript'>
var someVar1;
var someVar2;
function someFunction()
{ return; }
</SCRIPT>
<!-- Either way, the closing-script tag is expected. -->
- See Also:
insertExternalJavaScriptLink(Vector, String), getAllExternalJSLinks(Vector), Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final String javaScriptExternalPage = "<SCRIPT TYPE='text/javascript' SRC='INSERT-URL-STRING-HERE'>";
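Because this template holds only the opening tag, whoever uses it has to supply the closing tag as a second node. The sketch below shows that pairing with plain Strings; the class ScriptPairSketch and the method scriptPair are hypothetical names, and only the template text is taken from the field above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Strings stand in for TagNode instances.
final class ScriptPairSketch
{
    static final String OPEN_TEMPLATE =
        "<SCRIPT TYPE='text/javascript' SRC='INSERT-URL-STRING-HERE'>";

    // Returns the two nodes that must be inserted together: opening and closing tag.
    static List<String> scriptPair(String jsURL)
    {
        List<String> nodes = new ArrayList<>();
        nodes.add(OPEN_TEMPLATE.replace("INSERT-URL-STRING-HERE", jsURL));
        nodes.add("</SCRIPT>");
        return nodes;
    }

    public static void main(String[] args)
    { for (String node : scriptPair("/script/javaScriptFiles/functions.js")) System.out.println(node); }
}
```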
-
canonicalTag
public static final java.lang.String canonicalTag
If you have pages on your site that are almost identical, then you may need to inform search engines which one to prioritize. Or you might have syndicated content on your site which was republished elsewhere. You can do both of these things without incurring a duplicate content penalty, as long as you use a CANONICAL-Tag.
Instead of confusing Google and missing your ranking on the SERP's, you are guiding the crawlers as to which URL counts as the “main” one. This places the emphasis on the right URL and prevents the others from cannibalizing your SEO.
Use CANONICAL-Tags to avoid having problems with duplicate content that may affect your rankings.
The content of this Documentation Page was copied from a page on the web-domain 'http://searchenginewatch.com'. It was lifted on May 24th, 2019.
See link below, if still valid:
https://searchenginewatch.com/2018/04/04/a-quick-and-easy-guide-to-meta-tags-in-seo/
- See Also:
insertCanonicalURL(Vector, String), hasCanonicalURL(Vector), Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final String canonicalTag = "<LINK REL=canonical HREF='INSERT-URL-STRING-HERE' />";
-
NEWLINE
-
-
Method Detail
-
checkForSingleQuote
protected static void checkForSingleQuote(java.lang.String s)
This method checks whether the String-Parameter 's' contains a Single-Quotation Punctuation-Mark anywhere inside that String. If so, a properly formatted exception is thrown. This is used as an internal Helper-Method.
- Parameters:
s - This may be any Java String, but generally it is one used to insert into an HTML CONTENT-Attribute.
- Throws:
QuotesException - If String-Parameter 's' contains any single-quotation marks.
- Code:
- Exact Method Body:
int pos;

if ((pos = s.indexOf("'")) != -1) throw new QuotesException(
    "The passed string-parameter may not contain a single-quote punctuation mark. " +
    "Yours was: [" + s + "], and has a single-quotation mark at string-position " +
    "[" + pos + "]"
);
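The reason for this check is that every tag built by this class wraps its attribute values in single quotes, so a single quote inside a value would end the attribute early. A self-contained sketch of the same idea follows; IllegalArgumentException is used as a stand-in for the library's QuotesException, and hrefAttribute is a hypothetical helper.

```java
// Hypothetical stand-in: IllegalArgumentException replaces the library's QuotesException.
final class SingleQuoteCheckSketch
{
    static void checkForSingleQuote(String s)
    {
        int pos = s.indexOf('\'');

        if (pos != -1) throw new IllegalArgumentException
            ("single-quote found in [" + s + "] at string-position [" + pos + "]");
    }

    // Builds an attribute whose value is delimited by single quotes. A quote inside
    // the value would terminate the attribute early, so the value is checked first.
    static String hrefAttribute(String url)
    {
        checkForSingleQuote(url);
        return "HREF='" + url + "'";
    }

    public static void main(String[] args)
    {
        System.out.println(hrefAttribute("../../MyCSSPage.css"));

        try
            { hrefAttribute("it's.css"); }
        catch (IllegalArgumentException e)
            { System.out.println(e.getMessage()); }
    }
}
```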
-
insertFavicon
public static void insertFavicon(java.util.Vector<HTMLNode> html, java.lang.String imageURLAsString)
This inserts a favicon HTML link element into the right location so that a particular Web-Page will render a "browser icon image" in the top-left corner of the Web-Page's Browser-Tab.
- Parameters:
html - Any Vectorized-HTML Web-Page, but it is important that this page contain an HTML <HEAD> ... </HEAD> section or area. If the passed Vectorized-HTML does not have a header, then this method will throw a NodeNotFoundException, because whenever a favicon <LINK>-Tag is inserted, it must be inserted into a page's HEAD-Section.
imageURLAsString - This String is substituted into the template held by the String-Field favicon; the result is used to build a new TagNode instance, which is inserted into the HTML Page's HEAD-Section.
- Throws:
NodeNotFoundException - Throws if there is no HTML HEAD-Section. Specifically, if parameter 'html' doesn't have a <HEAD> ... </HEAD> element where the insertion would have to be performed, then this exception will throw.
QuotesException - If String-Parameter 'imageURLAsString' contains any single-quotation marks.
- See Also:
favicon, checkForSingleQuote(String)
- Code:
- Exact Method Body:
// Insert the Favicon <LINK ...> element into the <HEAD> section of the input html page.
// <link rel='icon' type='image/INSERT-IMAGE-TYPE-HERE' href='INSERT-URL-STRING-HERE' />

checkForSingleQuote(imageURLAsString);

// The HTML Page must have a <HEAD> ... </HEAD> section, or an exception shall throw.
DotPair header = TagNodeFindInclusive.first(html, "head");

if (header == null) throw new NodeNotFoundException
    (NO_HEADER_MESSAGE.replace("INSERT-STR", "favicon <LINK> element"));

String ext = IF.getGuess(imageURLAsString).extension;

if (ext == null) throw new IllegalArgumentException(
    "The Image-Type of the 'imageURLAsString' parameter could not be determined. " +
    "The method IF.getGuess(faviconURL) returned null. Please provide a favicon with " +
    "standard image file-type. This is required because the image-type is required " +
    "to be placed inside the HTML <LINK TYPE=... HREF=...> Element 'TYPE' Attribute."
);

// Build a new Favicon TagNode.
TagNode faviconTN = new TagNode
    ("<LINK REL='icon' TYPE='image/" + ext + "' HREF='" + imageURLAsString + "' />");

// Insert the Favicon into the page. Put it at the top of the header, just after <HEAD>
Util.insertNodes(html, header.start + 1, NEWLINE, faviconTN, NEWLINE);
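All of the insert methods in this class end the same way: locate the opening HEAD tag, then place the new nodes at the position just after it. The sketch below mimics that pattern with a Vector of Strings. The names findHead and insertAfterHead are hypothetical, IllegalStateException stands in for NodeNotFoundException, and nothing here calls the real Util.insertNodes.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical demo class; Strings stand in for HTMLNode instances.
final class HeadInsertSketch
{
    // Returns the index of the opening <HEAD> tag (case-insensitive), or -1 if absent.
    static int findHead(Vector<String> page)
    {
        for (int i = 0; i < page.size(); i++)
            if (page.get(i).equalsIgnoreCase("<head>"))
                return i;

        return -1;
    }

    // Inserts all nodes, as one block, immediately after the opening <HEAD> tag.
    static void insertAfterHead(Vector<String> page, String... nodes)
    {
        int pos = findHead(page);

        if (pos == -1) throw new IllegalStateException("no <HEAD> element found");

        // A single addAll shifts the trailing nodes once, not once per inserted node.
        page.addAll(pos + 1, Arrays.asList(nodes));
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList("<HTML>", "<head>", "</head>", "</HTML>"));
        insertAfterHead(page, "\n", "<LINK REL=canonical HREF='https://example.com/' />", "\n");
        System.out.println(page);
    }
}
```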
-
hasFavicon
public static java.lang.String hasFavicon (java.util.Vector<? extends HTMLNode> html)
This method will search for an HTML <LINK REL="icon" ...> Tag, in hopes of finding a REL-Attribute whose value is 'icon'.
When this method finds such a tag, it will return the value of that Tag's HREF-Attribute.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
- Returns:
This method will return the String-value of the HREF-Attribute found inside the LINK-Tag. If this page or sub-page does not have such a tag with an HREF-Attribute, then null is returned.
NOTE: In the event that multiple copies of the HTML LINK-Tag are found, and more than one of these tags has a REL-Attribute with a value equal to "icon", then this method will simply return the HREF of the first of the 'favicon' tags that has one.
An (albeit erroneous) page, with multiple favicon definitions, will not cause this method to throw an exception.
- See Also:
InnerTagGet, favicon, TagNode.AV(String)
- Code:
- Exact Method Body:
// InnerTagGet.all: Returns a vector of TagNode's that resemble: <LINK rel="icon" ...>
//
// EQ_CI_TRM: Check the 'rel' Attribute-Value using a Case-Insensitive, Equality
//            String-Comparison.
//            Trim the 'rel' Attribute-Value String of possible leading & trailing
//            White-Space before performing the comparison.

Vector<TagNode> list = InnerTagGet.all
    (html, "LINK", "REL", TextComparitor.EQ_CI_TRM, "icon");

// If there were no HTML "<LINK ...>" elements with REL='ICON' attributes, then
// there was no favicon.

if (list.size() == 0) return null;

// Just in case there were multiple favicon <LINK ...> tags, just return the first
// one found. Inside of a <LINK REL="icon" HREF="..."> the 'HREF' Attribute contains
// the Image-URL. Use TagNode.AV("HREF") to retrieve that image url.

String s;

for (TagNode tn : list)
    if ((s = tn.AV("HREF")) != null)
        return s;

// If for some reason, none of these <LINK REL='ICON' ...> elements had an "HREF"
// attribute, then just return null.

return null;
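The lookup rule described above (trimmed, case-insensitive match on REL, then the first tag that actually carries an HREF) can be sketched without the library. In this illustration each Map is a hypothetical stand-in for the attributes of one LINK tag, and the names firstFaviconHref and link are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; each Map holds one <LINK> tag's attributes (lower-case keys).
final class FaviconLookupSketch
{
    // Convenience builder for a stand-in tag. A null argument means "attribute absent".
    static Map<String, String> link(String rel, String href)
    {
        Map<String, String> attrs = new HashMap<>();
        if (rel != null)  attrs.put("rel", rel);
        if (href != null) attrs.put("href", href);
        return attrs;
    }

    // Returns the HREF of the first tag whose REL equals "icon" (trimmed, ignoring case)
    // and which has an HREF at all. Returns null when no such tag exists.
    static String firstFaviconHref(List<Map<String, String>> linkTags)
    {
        for (Map<String, String> attrs : linkTags)
        {
            String rel = attrs.get("rel");
            if ((rel == null) || (! rel.trim().equalsIgnoreCase("icon"))) continue;

            String href = attrs.get("href");
            if (href != null) return href;
        }

        return null;
    }

    public static void main(String[] args)
    {
        List<Map<String, String>> tags = new ArrayList<>();
        tags.add(link("stylesheet", "../../MyCSSPage.css"));
        tags.add(link(" ICON ", "../../SiteLogo.png"));
        System.out.println(firstFaviconHref(tags));
    }
}
```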
-
insertCSSLink
public static void insertCSSLink (java.util.Vector<HTMLNode> html, java.lang.String externalCSSFileURLAsString)
This inserts an HTML LINK-Tag into Web-Page parameter 'html' with the purpose of linking an externally-defined Cascading Style Sheet (also known as a CSS-Page) into that Page-Vector.
- Parameters:
html - Any Vectorized-HTML Web-Page, but it is important that this page contain an HTML <HEAD> ... </HEAD> section or area. If the passed Vectorized-HTML does not have a header, then this method will throw a NodeNotFoundException, because whenever a Style-Sheet <LINK>-Tag is inserted, it must be inserted into a page's HEAD-Section.
externalCSSFileURLAsString - This String is substituted into the template held by the String-Field cssExternalSheet; the result is used to build a new TagNode instance, which is inserted into the HTML Page's HEAD-Section.
- Throws:
NodeNotFoundException - Throws if there is no HTML HEAD-Section. Specifically, if parameter 'html' doesn't have a <HEAD> ... </HEAD> element where the insertion would have to be performed, then this exception will throw.
QuotesException - If String-Parameter 'externalCSSFileURLAsString' contains any single-quotation marks.
- See Also:
cssExternalSheet, cssExternalSheetWithMediaAttribute, insertCSSLink(Vector, String, String), getAllCSSLinks(Vector), checkForSingleQuote(String), DotPair, TagNode
- Code:
- Exact Method Body:
// Inserts an external CSS Link into the <HEAD> section of this html page vector
// <link REL=stylesheet type='text/css' href='INSERT-URL-STRING-HERE' />

checkForSingleQuote(externalCSSFileURLAsString);

// The HTML Page must have a <HEAD> ... </HEAD> section, or an exception shall throw.
DotPair header = TagNodeFindInclusive.first(html, "head");

if (header == null) throw new NodeNotFoundException(
    NO_HEADER_MESSAGE.replace
        ("INSERT-STR", "externally-linked CSS page <LINK> element")
);

TagNode cssTN = new TagNode
    ("<LINK REL=stylesheet TYPE='text/css' HREF='" + externalCSSFileURLAsString + "' />");

// Insert the Style-Sheet link into the page. Put it at the top of the header,
// just after <HEAD>
Util.insertNodes(html, header.start + 1, NEWLINE, cssTN, NEWLINE);
-
insertCSSLink
public static void insertCSSLink (java.util.Vector<HTMLNode> html, java.lang.String externalCSSFileURLAsString, java.lang.String mediaInnerTagValue)
This inserts a Cascading Style Sheet with the extra MEDIA-Attribute using an HTML LINK-Tag into the Vectorized-HTML Web-Page parameter 'html'.
- Parameters:
html - Any Vectorized-HTML Web-Page, but it is important that this page contain an HTML <HEAD> ... </HEAD> section or area. If the passed Vectorized-HTML does not have a header, then this method will throw a NodeNotFoundException, because whenever a Style-Sheet <LINK>-Tag is inserted, it must be inserted into a page's HEAD-Section.
externalCSSFileURLAsString - This String is substituted into the template held by the String-Field cssExternalSheetWithMediaAttribute; the result is used to build a new TagNode instance, which is inserted into the HTML Page's HEAD-Section.
mediaInnerTagValue - Externally linked CSS-Pages, which are included using the HTML LINK-Tag, may explicitly request that a MEDIA-Attribute be inserted into that Tag. In such a tag, the extra attribute specifies when the listed CSS-Rules are to be applied.
Listed here are the five most common values for the MEDIA-Attribute:
Attribute Value    Intended CSS Meaning
screen             indicates for use on a computer screen
projection         for projected presentations
handheld           for handheld devices (typically with small screens)
print              to style printed Web-Pages
all                (default value) This is what most people choose. You can leave off the MEDIA-Attribute completely if you want your styles to be applied for all media types.
- Throws:
NodeNotFoundException - Throws if there is no HTML HEAD-Section. Specifically, if parameter 'html' doesn't have a <HEAD> ... </HEAD> element where the insertion would have to be performed, then this exception will throw.
QuotesException - If either of the String-Parameters 'externalCSSFileURLAsString' or 'mediaInnerTagValue' contains any single-quotation marks.
- See Also:
cssExternalSheet, cssExternalSheetWithMediaAttribute, insertCSSLink(Vector, String), getAllCSSLinks(Vector), checkForSingleQuote(String), DotPair
- Code:
- Exact Method Body:
// Inserts an external CSS Link (with 'media' attribute) into the <HEAD> section of
// this html page vector
// <link REL=stylesheet type='text/css' href='INSERT-URL-STRING-HERE'
//       media='INSERT-MEDIA-ATTRIBUTE-VALUE-HERE' />

checkForSingleQuote(externalCSSFileURLAsString);
checkForSingleQuote(mediaInnerTagValue);

// The HTML Page must have a <HEAD> ... </HEAD> section, or an exception shall throw.
DotPair header = TagNodeFindInclusive.first(html, "HEAD");

if (header == null) throw new NodeNotFoundException(
    NO_HEADER_MESSAGE.replace
        ("INSERT-STR", "externally-linked CSS Style-Sheet LINK-Tag")
);

// Build the TagNode
TagNode cssTN = new TagNode(
    "<LINK REL=stylesheet TYPE='text/css' HREF='" + externalCSSFileURLAsString + "' " +
    "MEDIA='" + mediaInnerTagValue + "' />"
);

// Insert the Style-Sheet link into the page. Put it at the top of the header, just
// after <HEAD>
Util.insertNodes(html, header.start + 1, NEWLINE, cssTN, NEWLINE);
-
getAllCSSLinks
public static java.util.Vector<TagNode> getAllCSSLinks (java.util.Vector<? extends HTMLNode> html)
This will retrieve all linked CSS-Pages from Vectorized-HTML Web-Page parameter 'html'.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
- Returns:
This will return the links as a list of TagNode's.
- See Also:
insertCSSLink(Vector, String), insertCSSLink(Vector, String, String), InnerTagGet
- Code:
- Exact Method Body:
// InnerTagGet.all: Returns a vector of TagNode's that resemble:
// <LINK rel="stylesheet" ...>
//
// EQ_CI_TRM: Check the 'rel' Attribute-Value using a Case-Insensitive, Equality
//            String-Comparison
//            Trim the 'rel' Attribute-Value String of possible leading & trailing
//            White-Space before performing the comparison.

return InnerTagGet.all(html, "LINK", "REL", TextComparitor.EQ_CI_TRM, "stylesheet");
-
insertExternalJavaScriptLink
public static void insertExternalJavaScriptLink (java.util.Vector<HTMLNode> html, java.lang.String externalJSFileURLAsString)
This inserts an HTML '<SCRIPT ...>' element, together with its closing '</SCRIPT>' tag, into the proper location for linking an externally-defined Java-Script (a '.js' File) into the Web-Page.
- Parameters:
html - Any Vectorized-HTML Web-Page, but it is important that this page contain an HTML <HEAD> ... </HEAD> section or area. If the passed Vectorized-HTML does not have a header, then this method will throw a NodeNotFoundException, because this method always inserts the <SCRIPT>-Tag into a page's HEAD-Section.
externalJSFileURLAsString - This String is substituted into the template held by the String-Field javaScriptExternalPage; the result is used to build a new TagNode instance, which is inserted into the HTML Page's HEAD-Section.
- Throws:
NodeNotFoundException - Throws if there is no HTML HEAD-Section. Specifically, if parameter 'html' doesn't have a <HEAD> ... </HEAD> element where the insertion would have to be performed, then this exception will throw.
QuotesException - If String-Parameter 'externalJSFileURLAsString' contains any single-quotation marks.
- See Also:
javaScriptExternalPage, getAllExternalJSLinks(Vector), checkForSingleQuote(String), TagNode, TextNode, DotPair, HTMLTags.hasTag(String, TC)
- Code:
- Exact Method Body:
// Builds an external Java-Script link, and inserts it into the header portion of
// this html page.
// <script type='text/javascript' src='INSERT-URL-STRING-HERE'>

checkForSingleQuote(externalJSFileURLAsString);

// The HTML Page must have a <HEAD> ... </HEAD> section, or an exception shall throw.
DotPair header = TagNodeFindInclusive.first(html, "HEAD");

if (header == null) throw new NodeNotFoundException(
    NO_HEADER_MESSAGE.replace(
        "INSERT-STR",
        "externally-linked Java-Script <SCRIPT> ... </SCRIPT> elements"
    )
);

// Build an HTML <SCRIPT ...> node, and a </SCRIPT> node.
HTMLNode n = new TagNode
    ("<SCRIPT TYPE='text/javascript' SRC='" + externalJSFileURLAsString + "'>");

HTMLNode closeN = HTMLTags.hasTag("script", TC.ClosingTags);

// Insert the Java-Script link into the page. Put it at the top of the header, just
// after <HEAD>
Util.insertNodes(html, header.start + 1, NEWLINE, n, closeN, NEWLINE);
-
getAllExternalJSLinks
public static java.lang.String[] getAllExternalJSLinks (java.util.Vector<? extends HTMLNode> html)
Inserting Java-Script directly onto an HTML-Page and including an external link to a '.js' File are extremely similar tasks. In both cases the construct is simply: <SCRIPT TYPE='text/javascript'> ... </SCRIPT>
When the actual functions and methods are pasted into an HTML-Page directly, they are pasted where the ellipsis '...' is in the String above. When a link is made to an external page from a directory on the same Web-Server, both the open and the close HTML SCRIPT-Tags must be included.
If just a link is being added, then the text-content of the SCRIPT-Tag should be left empty. Instead, the URL to the Java-Script Page is added as an HTML SRC-Attribute.
This method will retrieve any and all 'SCRIPT' nodes that meet the following criteria:
- The Script Body must be empty, meaning there is no Java-Script between the opening and closing SCRIPT-Tags
- The HTML SRC-Attribute must contain a non-null, non-zero-length value
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
- Returns:
This will return a list of relative URL's to externally linked Java-Script Pages as String's.
- See Also:
InnerTagGetInclusive, javaScriptExternalPage, insertExternalJavaScriptLink(Vector, String), TagNode, TextNode, TagNode.AV(String), HTMLNode.str
- Code:
- Exact Method Body:
// InnerTagGetInclusive.all: Returns a vector of TagNode's that resemble:
// <SCRIPT TYPE="javascript" ...>
//
// CN_CI: Check the 'type' Attribute-Value using a Case-Insensitive, "Contains"
//        String-Comparison
//        'contains' rather than 'equals' testing is done because this value may be
//        "javascript", but it may also be "text/javascript"
//
// Inclusive: This means that everything between the <SCRIPT type="javascript"> ... and
//            the closing </SCRIPT> tag are returned in a vector of vectors.

Vector<Vector<HTMLNode>> v = InnerTagGetInclusive.all
    (html, "SCRIPT", "TYPE", TextComparitor.CN_CI, "javascript");

Stream.Builder<String> b = Stream.builder();

TOP:
for (Vector<HTMLNode> scriptSection : v)
{
    String srcValue = null;

    for (HTMLNode n : scriptSection)
    {
        if (n.isTagNode())
            if ((srcValue = ((TagNode) n).AV("SRC")) != null)
                break;

        if (n.isTextNode())
            if (n.str.trim().length() > 0)
                break TOP;
    }

    b.add(srcValue);
}

return b.build().toArray(String[]::new);
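The two documented criteria (empty script body, non-empty SRC value) can be expressed as a plain filter. The following is a sketch of those criteria only, written against a hypothetical ScriptSection stand-in; it is not the method body shown above and does not claim to match it in every edge case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class illustrating the two documented selection criteria.
final class ExternalScriptSketch
{
    // Stand-in for one <SCRIPT> ... </SCRIPT> section of a page.
    static final class ScriptSection
    {
        final String src;   // value of the SRC attribute, or null when absent
        final String body;  // text between the opening and closing tags

        ScriptSection(String src, String body)
        { this.src = src; this.body = body; }
    }

    // Keeps the SRC of every section with a non-empty SRC and a blank body.
    static String[] externalLinks(List<ScriptSection> sections)
    {
        List<String> ret = new ArrayList<>();

        for (ScriptSection s : sections)
        {
            if ((s.src == null) || s.src.isEmpty()) continue;   // criterion 2
            if (! s.body.trim().isEmpty())          continue;   // criterion 1
            ret.add(s.src);
        }

        return ret.toArray(new String[0]);
    }

    static String[] demo()
    {
        return externalLinks(Arrays.asList(
            new ScriptSection("/script/javaScriptFiles/functions.js", "\n"),
            new ScriptSection(null, "var someVar1;"),
            new ScriptSection("/script/other.js", "")
        ));
    }

    public static void main(String[] args)
    { System.out.println(Arrays.toString(demo())); }
}
```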
-
insertCanonicalURL
public static void insertCanonicalURL(java.util.Vector<HTMLNode> html, java.lang.String canonicalURLAsStr)
This method will insert a Canonical-URL into Vectorized-HTML parameter 'html'. The URL itself will be inserted into an HTML LINK-Tag as below: <LINK REL=canonical HREF='the_url' />
Since HTML mandates that such elements be located in the 'HEAD' portion of an HTML-Page, if the Vectorized-HTML parameter 'html' does not have a 'HEAD' area, then this method shall throw a NodeNotFoundException.
Note that this exception is an unchecked / runtime exception.
- Parameters:
html - Any Vectorized-HTML Web-Page, but it is important that this page contain an HTML <HEAD> ... </HEAD> section or area. If the passed Vectorized-HTML does not have a header, then this method will throw a NodeNotFoundException, because whenever a canonical <LINK>-Tag is inserted, it must be inserted into a page's HEAD-Section.
canonicalURLAsStr - This String is substituted into the template held by the String-Field canonicalTag; the result is used to build a new TagNode instance, which is inserted into the HTML Page's HEAD-Section.
- Throws:
NodeNotFoundException - Throws if there is no HTML HEAD-Section. Specifically, if parameter 'html' doesn't have a <HEAD> ... </HEAD> element where the insertion would have to be performed, then this exception will throw.
QuotesException - If String-Parameter 'canonicalURLAsStr' contains any single-quotation marks.
- See Also:
canonicalTag, hasCanonicalURL(Vector), checkForSingleQuote(String), TagNode, DotPair
- Code:
- Exact Method Body:
// Inserts a link element into the header of this page
// <link REL=canonical href='INSERT-URL-STRING-HERE' />

checkForSingleQuote(canonicalURLAsStr);

// The HTML Page must have a <HEAD> ... </HEAD> section, or an exception shall throw.
DotPair header = TagNodeFindInclusive.first(html, "HEAD");

if (header == null) throw new NodeNotFoundException
    (NO_HEADER_MESSAGE.replace("INSERT-STR", "Canonical-url LINK-Tag"));

// Builds the canonical <LINK ...> element
TagNode linkTN = new TagNode
    ("<LINK REL=canonical HREF='" + canonicalURLAsStr + "' />");

// Insert the canonical-url into the page. Put it at the top of the header, just
// after <HEAD>
Util.insertNodes(html, header.start + 1, NEWLINE, linkTN, NEWLINE);
-
hasCanonicalURL
public static java.lang.String hasCanonicalURL (java.util.Vector<? extends HTMLNode> html) throws MalformedHTMLException
This method will check whether a Vectorized-HTML Page has an HTML <LINK REL=canonical ...> Tag. This tag is used to inform Search-Engines whether or not this page surrenders or relays to a "Canonical-URL".
Canonical-Pages help Search-Engines index large web-sites by providing a root or Master-URL to which all sub-pages may point. Such URL's are often (but not always) like a "Table of Contents".
The primary goal of having a canonical is to spare Search-Engines (and their users) from sifting through and indexing every page of a large Web-Site, and to let them focus instead on either an introductory T.O.C. or a Title-Page.
- Parameters:
html - This may be any Vectorized-HTML Web-Page (or sub-page).
The Variable-Type Wild-Card Expression '? extends HTMLNode' means that a Vector<TagNode>, Vector<TextNode> or Vector<CommentNode> will all be accepted by this parameter without causing an exception throw.
These 'sub-type' Vectors are often returned as search results from the classes in the 'NodeSearch' package.
- Returns:
This will return whatever text was placed inside the canonical-url HREF='some_url' attribute/value pair of the HTML link tag. If there is no HTML <LINK REL=canonical HREF='some_url'> tag, then this method will return null.
- Throws:
MalformedHTMLException - This exception will be thrown if there are multiple HTML tags that match the LINK and REL=canonical search criteria. If an HTML element <link REL=canonical> is found, but that element does not have an href='...' attribute, or that attribute is of zero length, then this exception is also thrown.
- See Also:
InnerTagGet, canonicalTag, insertCanonicalURL(Vector, String), TagNode.AV(String)
- Code:
- Exact Method Body:
// InnerTagGet.all: Returns a vector of TagNode's that resemble:
// <LINK rel="canonical" ...>
//
// EQ_CI_TRM: Check the 'rel' Attribute-Value using a Case-Insensitive, Equality
//            String-Comparison
//            Trim the 'rel' Attribute-Value String of possible leading & trailing
//            White-Space before performing the comparison.

Vector<TagNode> v = InnerTagGet.all
    (html, "LINK", "REL", TextComparitor.EQ_CI_TRM, "canonical");

if (v.size() == 0) return null;

if (v.size() > 1) throw new MalformedHTMLException(
    "The Web-Page you have passed has precisely " + v.size() +
    " Canonical-URL LINK-Tags, but it may not have more than 1. This is " +
    "invalid HTML."
);

String s = v.elementAt(0).AV("href");

if (s == null) throw new MalformedHTMLException(
    "The HTML LINK-Tag that was retrieved, contained a " +
    "REL=canonical Attribute-Value pair, but did not have an HREF-Attribute." +
    "This is invalid HTML."
);

if (s.length() == 0) throw new MalformedHTMLException(
    "The HTML LINK-Tag that was retrieved contained a zero-length " +
    "String as the Attribute-Value for the HREF-Attribute. This is not " +
    "invalid, but poorly formatted HTML."
);

return s;
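The decision table behind this method (no tag gives null, exactly one usable tag gives its HREF, anything else is an error) is easy to lose in the long messages. The sketch below restates it over a plain List of HREF values; IllegalStateException is a stand-in for MalformedHTMLException, and canonicalURL is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class. Each list entry is the HREF value of one
// <LINK REL=canonical> tag; a null entry means that tag had no HREF attribute.
final class CanonicalLookupSketch
{
    static String canonicalURL(List<String> hrefs)
    {
        // No canonical tag at all: this is legal, and simply reported as null.
        if (hrefs.isEmpty()) return null;

        // More than one canonical tag: treated as malformed HTML.
        if (hrefs.size() > 1) throw new IllegalStateException
            ("found " + hrefs.size() + " canonical LINK-Tags, at most 1 is allowed");

        String href = hrefs.get(0);

        if (href == null) throw new IllegalStateException
            ("canonical LINK-Tag has no HREF attribute");

        if (href.isEmpty()) throw new IllegalStateException
            ("canonical LINK-Tag has a zero-length HREF attribute");

        return href;
    }

    public static void main(String[] args)
    {
        System.out.println(canonicalURL(Collections.<String>emptyList()));
        System.out.println(canonicalURL(Arrays.asList("https://example.com/main")));
    }
}
```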
-
-