Package Torello.HTML
Class Escape
- java.lang.Object
-
- Torello.HTML.Escape
-
public final class Escape extends java.lang.Object
Easy utilities for escaping and un-escaping HTML characters such as
, and even code-point based Emoji's.
There are dozens of "Escaped HTML" symbols in the HTML language. This class helps convert from an "escaped character" to the underlying / actual UTF-8 or ASCII 'char' (and in the reverse direction as well).
Hi-Lited Source-Code: Torello/HTML/Escape.java
File Size: 28,536 Bytes
Line Count: 636 '\n' Characters Found
Stateless Class: This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any constructors or non-static member fields. It is a concept very similar to the Java-Bean's @Stateless Annotation.
- 1 Constructor(s), 1 declared private, zero-argument constructor
- 11 Method(s), 11 declared static
- 6 Field(s), 6 declared static, 6 declared final
-
-
Method Summary
Basic Methods
Modifier and Type    Method
static boolean       hasHTMLEsc(char c)
static void          printHTMLEsc()
Escape Characters to HTML Escape-Strings
Modifier and Type    Method
static String        escChar(char c, boolean use16BitEscapeSequence)
static String        escCodePoint(int codePoint, boolean use16BitEscapeSequence)
static String        htmlEsc(char c)
Un-Escape HTML Escape-Strings to Characters
Modifier and Type    Method
static char          escHTMLToChar(String escHTML)
static String        replace(String s)
static String        replaceAll(String s)
static String        replaceAll_DEC(String str)
static String        replaceAll_HEX(String str)
static String        replaceAll_TEXT(String str)
-
-
-
Method Detail
-
printHTMLEsc
public static void printHTMLEsc()
Prints the HTML Escape Character lookup table to System.out. This is useful for debugging.
View Escape-Codes:
The JAR Data-File List included within the page attached (below) is a complete list of all text-String HTML Escape Sequences that are known to this class. This list does not include any Code Point, Hex or Decimal Number sequences.
All HTML Escape Sequences
-
escHTMLToChar
public static char escHTMLToChar(java.lang.String escHTML)
Converts a single String from an HTML-escape sequence into the appropriate character.
&[escape-sequence]; ==> actual ASCII or UniCode character.
- Parameters:
escHTML - An HTML escape sequence.
- Returns:
- the ASCII or Unicode character represented by this escape sequence.
This method will return (char) 0 (the NUL character, not the digit '0') if the input does not represent a valid HTML Escape sequence, or if a numeric escape sequence lies outside the 16-bit 'char' range.
- Code:
- Exact Method Body:
if (! escHTML.startsWith("&") || ! escHTML.endsWith(";")) return (char) 0;

String s = escHTML.substring(1, escHTML.length() - 1);

// Temporary Variable.
int i = 0;

// Since the EMOJI Escape Sequences use Code Point, they cannot, generally be
// converted into a single Character.  Skip them.
if (HEX_CODE.matcher(s).find())
{
    if ((i = Integer.parseInt(s.substring(2), 16)) < Character.MAX_VALUE) return (char) i;
    else return 0;
}

// Again, deal with Emoji's here...  Parse the integer, and make sure it is a
// character in the standard UNICODE range.
if (DEC_CODE.matcher(s).find())
{
    if ((i = Integer.parseInt(s.substring(1))) < Character.MAX_VALUE) return (char) i;
    else return 0;
}

// Now check if the provided Escape String is listed in the htmlEscChars Hashtable.
Character c = htmlEscChars.get(s);

// If the character was found in the table that lists all escape sequence characters,
// then return it.  Otherwise just return ASCII zero.
return (c != null) ? c.charValue() : 0;
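The three cases described above (hexadecimal, decimal, and named sequences) can be illustrated with a small, stand-alone sketch. It does not use this class: the name SingleEscapeSketch, the method toChar, and the four-entry NAMED table are hypothetical stand-ins for the internal lookup table, which is not reproduced on this page.

```java
import java.util.Map;

final class SingleEscapeSketch {
    // Hypothetical stand-in for the internal lookup table (only four entries).
    private static final Map<String, Character> NAMED =
        Map.of("amp", '&', "lt", '<', "gt", '>', "quot", '"');

    /** Returns the character for one escape sequence, or (char) 0 if it cannot be converted. */
    static char toChar(String esc) {
        if (esc.length() < 3 || !esc.startsWith("&") || !esc.endsWith(";")) return (char) 0;

        String body = esc.substring(1, esc.length() - 1);
        try {
            int value;
            if (body.startsWith("#x") || body.startsWith("#X"))
                value = Integer.parseInt(body.substring(2), 16);   // hexadecimal form
            else if (body.startsWith("#"))
                value = Integer.parseInt(body.substring(1));       // decimal form
            else {
                Character c = NAMED.get(body);                     // named form
                return (c != null) ? c : (char) 0;
            }
            // Values above 0xFFFF (most Emoji's) do not fit into a single 'char'.
            return (value >= 0 && value <= Character.MAX_VALUE) ? (char) value : (char) 0;
        } catch (NumberFormatException e) {
            return (char) 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(toChar("&lt;"));    // named sequence
        System.out.println(toChar("&#x41;"));  // hexadecimal sequence
        System.out.println(toChar("&#64;"));   // decimal sequence
    }
}
```

Returning (char) 0 as a sentinel keeps the signature primitive, but callers have to check for it explicitly, since (char) 0 is itself a legal character value.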
-
replaceAll_HEX
public static java.lang.String replaceAll_HEX(java.lang.String str)
Will generate a String whereby any and all Hexadecimal Escape Sequences have been removed and subsequently replaced with their actual ASCII/UniCode un-escaped characters!
Hexadecimal HTML Escape-Sequence Examples:
Substring from Input:    Web-Browser Converts To:
&#xAA;                   'ª' within a browser
&#x67;                   'g' within a browser
&#x84;                   '' within a browser
This method might be thought of as similar to the older C/C++ 'Ord()' function, except it is for HTML.
- Parameters:
str - any String that contains an HTML Escape Sequence &#x[HEXADECIMAL VALUE];
- Returns:
- a String, with all of the hexadecimal escape sequences removed and replaced with their equivalent ASCII or UniCode Characters.
- See Also:
replaceAll_DEC(String str), StrReplace.r(String, String[], char[])
- Code:
- Exact Method Body:
// This is the RegEx Matcher from the top.  It matches string's that look like: &#x\d+;
Matcher m = HEX_CODE.matcher(str);

// Save the escape-string regex search matches in a TreeMap.  We need to use a
// TreeMap because it is much easier to check if a particular escape sequence has already
// been found.  It is easier to find duplicates with TreeMap's.
TreeMap<String, Character> escMap = new TreeMap<>();

while (m.find())
{
    // Use Base-16 Integer-Parse
    int i = Integer.valueOf(m.group(1), 16);

    // Do not un-escape EMOJI's...  It makes a mess - they are sequences of characters
    // not single characters.
    if (i > Character.MAX_VALUE) continue;

    // Retrieve the Text Information about the HTML Escape Sequence
    String text = m.group();

    // Check if it is a valid HTML 5 Escape Sequence.
    if (! escMap.containsKey(text)) escMap.put(text, Character.valueOf((char) i));
}

// Build the matchStr's and replaceChar's arrays.  These are just the KEY's and
// the VALUE's of the TreeMap<String, Character> which was just built.
// NOTE: A TreeMap is used *RATHER THAN* two parallel arrays in order to avoid keeping
// duplicates when the replacement occurs.
String[]    matchStrs       = escMap.keySet().toArray(new String[escMap.size()]);
char[]      replaceChars    = new char[escMap.size()];

// Lookup each "ReplaceChar" in the TreeMap, and put it in the output "replaceChars"
// array.  The class StrReplace will replace all the escape sequences with the actual
// characters.
for (int i=0; i < matchStrs.length; i++) replaceChars[i] = escMap.get(matchStrs[i]);

return StrReplace.r(str, matchStrs, replaceChars);
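For readers without access to StrReplace, the same idea can be sketched with only java.util.regex. This is not the method body above: the class HexUnescapeSketch and its HEX pattern are assumptions made for illustration (the actual HEX_CODE pattern is not shown on this page), and the replacement is done in a single pass with a StringBuilder rather than through a TreeMap.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HexUnescapeSketch {
    // Assumed pattern: "&#x" or "&#X", one to six hexadecimal digits, then ';'.
    private static final Pattern HEX = Pattern.compile("&#[xX]([0-9a-fA-F]{1,6});");

    static String unescapeHex(String s) {
        Matcher m = HEX.matcher(s);
        StringBuilder sb = new StringBuilder();
        int last = 0;   // end of the most recently replaced escape sequence

        while (m.find()) {
            int value = Integer.parseInt(m.group(1), 16);

            // Code points above 0xFFFF (most Emoji's) are left escaped, as described above.
            if (value > Character.MAX_VALUE) continue;

            sb.append(s, last, m.start()).append((char) value);
            last = m.end();
        }
        return sb.append(s, last, s.length()).toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapeHex("&#x67;o &#x1F600;"));
    }
}
```

Because 'last' is only advanced for sequences that are actually replaced, any skipped sequence is copied through unchanged by the next append.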
-
replaceAll_DEC
public static java.lang.String replaceAll_DEC(java.lang.String str)
This method functions the same as replaceAll_HEX(String), except it replaces only HTML Escape sequences that are represented using decimal (base-10) values. 'replaceAll_HEX(...)' works on hexadecimal (base-16) values.
Base-10 HTML Escape-Sequence Examples:
Substring from Input:    Web-Browser Converts To:
&#48;                    '0' in your browser
&#64;                    '@' in your browser
&#123;                   '{' in your browser
&#125;                   '}' in your browser
Base-10 & Base-16 Escape-Sequence Difference:
- &#x[hex base-16 value]; There is an 'x' as the third character in the String
- &#[decimal base-10 value]; There is no 'x' in the escape-sequence String!
This short example delineates the difference between an HTML escape-sequence that employs Base-10 numbers, and one using Base-16 (Hexadecimal) numbers.
- Parameters:
str - any String that contains the HTML Escape Sequence &#[DECIMAL VALUE];.
- Returns:
- a String, with all of the decimal escape sequences removed and replaced with ASCII UniCode Characters.
If this parameter does not contain such a sequence, then this method will return the same input-String reference as its return value.
- See Also:
replaceAll_HEX(String str), StrReplace.r(String, String[], char[])
- Code:
- Exact Method Body:
// This is the RegEx Matcher from the top.  It matches string's that look like: &#\d+;
Matcher m = DEC_CODE.matcher(str);

// Save the escape-string regex search matches in a TreeMap.  We need to use a
// TreeMap because it is much easier to check if a particular escape sequence has already
// been found.  It is easier to find duplicates with TreeMap's.
TreeMap<String, Character> escMap = new TreeMap<>();

while (m.find())
{
    // Use Base-10 Integer-Parse
    int i = Integer.valueOf(m.group(1));

    // Do not un-escape EMOJI's...  It makes a mess - they are sequences of characters
    // not single characters.
    if (i > Character.MAX_VALUE) continue;

    // Retrieve the Text Information about the HTML Escape Sequence
    String text = m.group();

    // Check if it is a valid HTML 5 Escape Sequence.
    if (! escMap.containsKey(text)) escMap.put(text, Character.valueOf((char) i));
}

// Build the matchStr's and replaceChar's arrays.  These are just the KEY's and
// the VALUE's of the TreeMap<String, Character> which was just built.
// NOTE: A TreeMap is used *RATHER THAN* two parallel arrays in order to avoid keeping
// duplicates when the replacement occurs.
String[]    matchStrs       = escMap.keySet().toArray(new String[escMap.size()]);
char[]      replaceChars    = new char[escMap.size()];

// Lookup each "ReplaceChar" in the TreeMap, and put it in the output "replaceChars"
// array.  The class StrReplace will replace all the escape sequences with the actual
// characters.
for (int i=0; i < matchStrs.length; i++) replaceChars[i] = escMap.get(matchStrs[i]);

return StrReplace.r(str, matchStrs, replaceChars);
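The Base-10 / Base-16 difference described above comes down to the radix handed to the integer parser. A minimal sketch, using a hypothetical helper named numericValue that is not part of this class:

```java
final class RadixSketch {
    /** Returns the number inside a numeric escape such as "&#123;" or "&#x7B;". */
    static int numericValue(String esc) {
        // Strip the leading "&#" and the trailing ';'.
        String body = esc.substring(2, esc.length() - 1);

        // An 'x' as the third character of the sequence marks a hexadecimal value.
        boolean hex = body.startsWith("x") || body.startsWith("X");

        return hex ? Integer.parseInt(body.substring(1), 16) : Integer.parseInt(body);
    }

    public static void main(String[] args) {
        // Both spellings name the same character, '{'.
        System.out.println((char) numericValue("&#123;"));
        System.out.println((char) numericValue("&#x7B;"));
    }
}
```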
-
-
replaceAll_TEXT
public static java.lang.String replaceAll_TEXT(java.lang.String str)
Replaces all text-word (named) HTML Escape Sequences with the characters they represent.
Standard (Text) HTML Escape-Sequence Examples:
ASCII or UNICODE:       Can be Escaped Using:
" (double-quote)        &quot; (in HTML)
& (ampersand)           &amp; (in HTML)
< (less-than)           &lt; (in HTML)
> (greater-than)        &gt; (in HTML)
View Escape-Codes:
The list included within the page attached (below) is a complete list of all Text-String HTML Escape Sequences known to this class. This list does not include any Code Point, Hex or Decimal Number sequences.
All HTML Escape Sequences
- Parameters:
str - any String that contains HTML Escape Sequences that need to be converted to their ASCII-UniCode character representations.
- Returns:
- a String, with all of the text escape sequences removed and replaced with ASCII UniCode Characters.
- Throws:
java.lang.IllegalStateException
- See Also:
replaceAll_HEX(String str), StrReplace.r(String, boolean, String[], Torello.Java.Function.ToCharIntTFunc)
- Code:
- Exact Method Body:
// We only need to find which escape sequences are in this string.
// use a TreeSet<String> to list them.  It will
Matcher                 m       = TEXT_CODE.matcher(str);
TreeMap<String, String> escMap  = new TreeMap<>();

while (m.find())
{
    // Retrieve the Text Information about the HTML Escape Sequence
    String text     = m.group();
    String sequence = text.substring(1, text.length() - 1);

    // Check if it is a valid HTML 5 Escape Sequence.
    if ((! escMap.containsKey(text)) && htmlEscChars.containsKey(sequence))
        escMap.put(text, sequence);
}

// Convert the TreeSet to a String[] array... and use StrReplace
String[] escArr = new String[escMap.size()];

return StrReplace.r(
    str, false, escMap.keySet().toArray(escArr),
    (int i, String sequence) -> htmlEscChars.get(escMap.get(sequence))
);
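A stand-alone sketch of named-sequence replacement follows. The four-entry NAMED map and the TEXT pattern are assumptions made for illustration; the actual htmlEscChars table and TEXT_CODE pattern are not reproduced on this page.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedUnescapeSketch {
    // Hypothetical, four-entry stand-in for the internal lookup table.
    private static final Map<String, Character> NAMED =
        Map.of("amp", '&', "lt", '<', "gt", '>', "quot", '"');

    // Assumed pattern: '&', a letter, any further letters or digits, then ';'.
    private static final Pattern TEXT = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static String unescapeNamed(String s) {
        Matcher m = TEXT.matcher(s);
        StringBuilder sb = new StringBuilder();
        int last = 0;

        while (m.find()) {
            Character c = NAMED.get(m.group(1));
            if (c == null) continue;    // unknown name: leave the text as it is

            sb.append(s, last, m.start()).append(c.charValue());
            last = m.end();
        }
        return sb.append(s, last, s.length()).toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapeNamed("a &lt; b &amp;&amp; c &bogus; d"));
    }
}
```

In this sketch the input is scanned once, so a doubly-escaped input such as "&amp;lt;" comes out as "&lt;" rather than "<".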
-
replaceAll
@Deprecated public static java.lang.String replaceAll(java.lang.String s)
Deprecated. Calls all of the HTML Escape Sequence convert/replace String functions at once.
- Parameters:
s - This may be any Java String which may (or may not) contain HTML Escape sequences.
- Returns:
- a new String where all HTML escape-sequence substrings have been replaced with their natural character representations.
- See Also:
replaceAll_DEC(String), replaceAll_HEX(String), replaceAll_TEXT(String)
- Code:
- Exact Method Body:
return replaceAll_HEX(replaceAll_DEC(replaceAll_TEXT(s)));
-
replace
public static java.lang.String replace(java.lang.String s)
This is an optimized HTML String-replacement method. It will substitute all HTML Escape Sequences with the actual characters they represent.
Emoji's:
In keeping with the other methods in this class, if there are any HTML Emoji Escape Sequences, these shall not be replaced. Emoji's work on the principle of Code-Point, and though replacing such escape sequences is not difficult, because they work in the Code-Point space, their substitutions are never single character representations (a Code Point above Character.MAX_VALUE always needs two Java char's).
There is an alternate method that can substitute the actual Java char's for a Code-Point Escape-Sequence.
Code-Point:
For those familiar with Code Point, the way this method works is that it just skips any escape sequence that uses a Base-10 or Base-16 representation if the number inside the Escape-Sequence is larger than Character.MAX_VALUE.
It is important to remember that all Java String's are simply char-Arrays which are wrapped in a java.lang.String class instance. Since the Primitive Type 'char' is fundamentally a 16-bit character, no character can be converted if it is larger than this value. Although Code Point works just fine in Java, it is left as a separate method in this class.
Rendering Emoji's:
Many standard web-pages use very little of the more advanced Escape-Sequences. Emoji's are somewhat popular. The issue isn't about whether the 'Code Point' based Escape-Sequences can be converted or handled, but rather it is about whether or not you really want to leave the comfortable world of HTML Escape-Sequences for your Code Point related characters.
Once a Code Point sequence has been un-escaped, it will only be visible in text-editors / viewers that are capable of rendering Code Point's or Emoji's (and not all text editors can do this!)
- Parameters:
s - This may be any Java String which may (or may not) contain HTML Escape sequences.
- Returns:
- a new String where all HTML escape-sequence substrings have been replaced with their natural character representations.
- Code:
- Exact Method Body:
// The primary optimization is to do this the "C" way (As in The C Programming Language)
// The String to Escape is converted to a character array, and the characters are shifted
// as the Escape Sequences are replaced.  This is all done "in place" without creating
// new substring's in memory.
char[] c = s.toCharArray();

// These two pointers are kept as the "Source Character" - as in the next character to
// "Read" ... and the "Destination Character" - as in the next location to write.
int sourcePos   = 0;
int destPos     = 0;

while (sourcePos < c.length)

    // All Escape Sequences begin with the Ampersand Symbol.  If the next character
    // does not begin with the Ampersand, we should skip and move on.  Copy the next source
    // character to the next destination location, and continue the loop.
    if (c[sourcePos] != '&')
    { c[destPos++]=c[sourcePos++]; continue; }

    // Here, an Ampersand has been found.  Now check if the character immediately
    // following the Ampersand is a Pound Sign.  If it is a Pound Sign, that implies
    // this escape sequence is simply going to be a number.
    else if ((sourcePos < (c.length-1)) && (c[sourcePos + 1] == '#'))
    {
        int     evaluatingPos   = sourcePos + 1;
        boolean isHex           = false;

        // If the Character after the Pound Sign is an 'X', it means that the number
        // that has been escaped is a Base 16 (Hexadecimal) number.
        // IMPORTANT: Check to see that the Ampersand wasn't the last char in the String
        if (evaluatingPos + 1 < c.length)
            if (c[evaluatingPos + 1] == 'x')
            { isHex = true; evaluatingPos++; }

        // Keep skipping the numbers, until a non-digit character is identified.
        while ((++evaluatingPos < c.length) && Character.isDigit(c[evaluatingPos]));

        // If the character immediately after the last digit isn't a ';' (Semicolon),
        // then this entire thing is NOT an escaped HTML character.  In this case, make
        // sure to copy the next source-character to the next destination location in the
        // char[] array...  Then continue the loop to the next 'char' (after Ampersand)
        if ((evaluatingPos == c.length) || (c[evaluatingPos] != ';'))
        { c[destPos++]=c[sourcePos++]; continue; }

        int escapedChar;

        try
        {
            // Make sure to convert 16-bit numbers using the 16-bit radix using the
            // standard java parse integer way.
            escapedChar = isHex
                ? Integer.parseInt(s.substring(sourcePos + 3, evaluatingPos), 16)
                : Integer.parseInt(s.substring(sourcePos + 2, evaluatingPos));
        }

        // If for whatever reason java was unable to parse the digits in the escape
        // sequence, then copy the next source-character to the next destination-location
        // and move on in the loop.
        catch (NumberFormatException e)
        { c[destPos++]=c[sourcePos++]; continue; }

        // If the character was an Emoji, then it would be a number greater than
        // 2^16.  Emoji's use Code Points - which are multiple characters used up
        // together.  Their escape sequences are always characters larger than 65,535.
        // If so, just copy the next source-character to the next destination location, and
        // move on in the loop.
        if (escapedChar > Character.MAX_VALUE)
        { c[destPos++]=c[sourcePos++]; continue; }

        // Replace the next "Destination Location" with the (un) escaped char.
        c[destPos++] = (char) escapedChar;

        // Skip the entire HTML Escape Sequence by skipping to the location after the
        // position where the "evaluation" (all this processing) was occurring.  This
        // just happens to be the next-character immediately after the semi-colon
        sourcePos = evaluatingPos + 1; // will be pointing at the ';' (semicolon)
    }

    // An Ampersand was just found, but it was not followed by a '#' (Pound Sign).  This
    // means that it is not a "numbered" (to invent a term) HTML Escape Sequence.  Instead
    // we shall check if there is a valid Escape-String (before the next semi-colon) that
    // can be identified in the Hashtable 'htmlEscChars'
    else if (sourcePos < (c.length - 1))
    {
        // We need to create a 'temp variable' and it will be called "evaluating position"
        int evaluatingPos = sourcePos;

        // All text (non "Numbered") HTML Escape String's are comprised of letter or digits
        while ((++evaluatingPos < c.length) && Character.isLetterOrDigit(c[evaluatingPos]));

        // If the character immediately after the last letter or digit is not a semi-colon,
        // then there is no way this is an HTML Escape Sequence.  Copy the next source to
        // the next destination location, and continue with the loop.
        if ((evaluatingPos == c.length) || (c[evaluatingPos] != ';'))
        { c[destPos++]=c[sourcePos++]; continue; }

        // Get the replacement character from the lookup table.
        Character replacement = htmlEscChars.get(s.substring(sourcePos + 1, evaluatingPos));

        // The lookup table will return null if there this was not a valid escape sequence.
        // If this was not a valid sequence, just copy the next character from the source
        // location, and move on in the loop.
        if (replacement == null)
        { c[destPos++]=c[sourcePos++]; continue; }

        c[destPos++]    = replacement;
        sourcePos       = evaluatingPos + 1;
    }

    else
    { c[destPos++]=c[sourcePos++]; continue; }

return new String(c, 0, destPos);
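The description above explains that a Code Point larger than Character.MAX_VALUE cannot be written into a single 'char'. A minimal sketch of what un-escaping such a value involves, using only java.lang.Character; the class SurrogatePairSketch and the helper unescapeCodePoint are hypothetical, and are not the alternate method mentioned above.

```java
final class SurrogatePairSketch {
    /** Converts the digits of a numeric escape (without "&#", "x" or ";") into a String. */
    static String unescapeCodePoint(String digits, int radix) {
        int codePoint = Integer.parseInt(digits, radix);

        if (!Character.isValidCodePoint(codePoint))
            throw new IllegalArgumentException("Not a valid code point: " + codePoint);

        // Character.toChars yields one char for the Basic Multi-Lingual Plane,
        // and a surrogate pair (two char's) for anything above 0xFFFF.
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(unescapeCodePoint("2603", 16).length());    // snowman: prints 1
        System.out.println(unescapeCodePoint("128512", 10).length());  // grinning face: prints 2
    }
}
```

Since the result can be two char's long, an in-place, char-for-sequence substitution as used above would need a different return shape, which is one reason to keep such handling in a separate method.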
-
escChar
public static java.lang.String escChar(char c, boolean use16BitEscapeSequence)
This method shall simply escape any char into an HTML Escape String.
Input 'char'                Returned String's
'中' (Middle / China)        "&#20013;" (Base 10), "&#x4E2D;" (Base 16)
'日' (Japan / Sun)           "&#26085;" (Base 10), "&#x65E5;" (Base 16)
'Ñ' (Spanish Tilde)         "&#209;" (Base 10), "&#xD1;" (Base 16)
'ñ' (Lower-Case Tilde)      "&#241;" (Base 10), "&#xF1;" (Base 16)
'☃' (Snowman Glyph)         "&#9731;" (Base 10), "&#x2603;" (Base 16)
Java 'char' Primitive-Type:
The Java primitive 'char' type, which, again, is a 16-bit type (2^16 = 65,536 values, 0 through 65,535), essentially equates to the primary plane (plane 0) of the 17 UNICODE planes. This is also known as the Basic Multi-Lingual Plane.
Here, nearly any foreign-language character needed by a programmer (including all Chinese Character Glyphs) is easily found with a bit of searching. Any modern web-browser can display these characters, if they are escaped using the HTML Escape Sequences returned by this method.
Modern-Browsers & UTF-8:
As an aside, if a programmer includes the HTML Element: <META CHARSET="utf-8"> in the <HEAD>...</HEAD> portion of an HTML Page, it becomes easy to include such characters (from the Multi-Lingual Plane) without even needing to use Escape-Sequences for the characters.
Any Web-Browser which knows before-hand that non-ASCII characters (higher than character #255 / 0xFF) are being transmitted, will interpret them using UTF-8. In this case escaping the char's becomes unnecessary.
- Parameters:
c - Any Java Character. Note that the Java Primitive Type 'char' is a 16-bit type. This parameter equates to the UNICODE Characters 0x0000 up to 0xFFFF.
use16BitEscapeSequence - If the user would like the returned, escaped, String to use Base 16 for the escaped digits, pass TRUE to this parameter. If the user would like to retrieve an escaped String that uses standard Base 10 digits, then pass FALSE to this parameter.
- Returns:
- The passed character parameter 'c' will be converted to an HTML Escape Sequence. For instance, if the character '我', which is the Chinese Character for I, Me, Myself, were passed to this method, then the String "&#25105;" would be returned.
If the parameter 'use16BitEscapeSequence' had been passed TRUE, then this method would, instead, return the String "&#x6211;".
- Code:
- Exact Method Body:
return use16BitEscapeSequence ? "&#" + ((int) c) + ";" : "&#x" + Integer.toHexString((int) c).toUpperCase() + ";";
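The table in the description can be reproduced with a few lines of plain Java. This is a stand-alone sketch, not this class: the helper name esc is hypothetical, and it assumes that TRUE selects Base 16, as the parameter description states.

```java
final class EscCharSketch {
    /** Escapes one char; 'hex' selects Base 16 (assumed to mirror use16BitEscapeSequence). */
    static String esc(char c, boolean hex) {
        return hex
            ? "&#x" + Integer.toHexString(c).toUpperCase() + ";"
            : "&#" + (int) c + ";";
    }

    public static void main(String[] args) {
        // '\u4E2D' is the character shown in the first row of the table.
        System.out.println(esc('\u4E2D', false));  // prints &#20013;
        System.out.println(esc('\u4E2D', true));   // prints &#x4E2D;
    }
}
```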
-
escCodePoint
public static java.lang.String escCodePoint(int codePoint, boolean use16BitEscapeSequence)
This method shall simply escape any Code Point integer into an HTML Escape String. Below is a list of a few examples of Code Points commonly used. As stated, the Basic Multi Lingual Plane - which is Plane 0 of the UNICODE Space - fits into the 16-bit Java Primitive Type 'char'. For such situations, "Code Points" have very little application to software. Essentially, Java's 16-bit 'char' primitive type gives that to the programmer "for free" - without needing to think past, again, Java's primitive-type 'char'.
Although "Code Points" were developed decades ago, today, one of the most common uses for them is the Emoji's being used on numerous web-sites. It is important to note that not all Emoji's will fit into a single Code Point, and, as such, equating a "Code Point" with an "Emoji" is actually incorrect. However, for the more complicated Emoji's available, all that is really going on is that sequences of code points are being sent and interpreted by the web-browser as a single glyph or character-image.
Escaping Emoji's:
Just as with Foreign-Language characters, the code-points themselves (without having been escaped) can be included directly into a text file, as long as the HTML-File indicates that non-ASCII, or UTF-8, data is being transmitted. In such cases, to avoid using these Escape-Sequences at all, just include the characters themselves, and place the meta tag in the HTML <HEAD>...</HEAD> section, as follows:
HTML-Tag to Include: <META CHARSET="utf-8">.
And here is a (very) brief sample table of Emoji's and their HTML Escape-Sequences:
Input Code Point (int)               Returned String's
😀 (Grinning Face) (128512)           "&#128512;" (Base 10), "&#x1F600;" (Base 16)
👍 (Thumb's Up) (128077)              "&#128077;" (Base 10), "&#x1F44D;" (Base 16)
🌮 (Taco) (127790)                    "&#127790;" (Base 10), "&#x1F32E;" (Base 16)
'A' (Upper-Case A) (ASCII# 65)       "&#65;" (Base 10), "&#x41;" (Base 16)
'0' (Number Zero) (ASCII# 48)        "&#48;" (Base 10), "&#x30;" (Base 16)
'中' (Middle-China) (20013)           "&#20013;" (Base 10), "&#x4E2D;" (Base 16)
'ü' (German Umlaut) (252)            "&#252;" (Base 10), "&#xFC;" (Base 16)
'Ñ' (Spanish Tilde) (209)            "&#209;" (Base 10), "&#xD1;" (Base 16)
Again, if the '.html' files you are providing to a web-browser indicate the <META CHARSET="utf-8">, it is not necessary to provide HTML escape sequences for an Emoji, or any 'Code Point' at all. Instead, if the text-editor you are using to edit your '.html' files can handle code points, they may be included directly into the 'html' file itself.
Multi-Code-Point Emoji's:
There are numerous Emoji's that are represented by sequences of code-points, AND NOT just a single code point integer. In such cases, providing HTML escape sequences will actually prevent the browser from rendering the "conglomerate" Emoji.
The Emoji's below do not need to be escaped (because they are sequences of code points, rather than just single code points). Instead, their code points must be included directly into the '.html' file itself - or they will not be properly rendered by the web-browser...
Emoji    Code Point Sequence
👁️🗨️ "Eye in Speech"    U+1F441 U+200D U+1F5E8 ==> 👁 (Eye - U+1F441) + GLUE (U+200D) + 🗨 (Speech Bubble - U+1F5E8)
👉🏿 "Index-Finger Pointing, Dark Hand"    U+1F449 U+1F3FF ==> 👉 (Index Finger Pointing - U+1F449) + Dark Skin Color - U+1F3FF
- Parameters:
codePoint - This will take any integer. It will be interpreted as a UNICODE code point.
NOTE: Java uses 16-bit values for its primitive 'char' type. This is also the "first plane" of the UNICODE Space and actually referred to as the Basic Multi Lingual Plane. Any value passed to this method that is no greater than 65,535 would receive the same escape-String that it would from a call to the method escChar(char, boolean).
use16BitEscapeSequence - If the user would like the returned, escaped, String to use Base 16 for the escaped digits, pass TRUE to this parameter. If the user would like to retrieve an escaped String that uses standard Base 10 digits, then pass FALSE to this parameter.
- Returns:
- The code point will be converted to an HTML Escape Sequence, as a java.lang.String. For instance, if the code point for "the snowman" glyph (character ☃), which happens to be represented by a code point that is below 65,535 (and, incidentally, does "fit" into a single Java 'char'), were passed, this method would return the String "&#9731;".
If the parameter 'use16BitEscapeSequence' had been passed TRUE, then this method would, instead, return the String "&#x2603;".
- Throws:
java.lang.IllegalArgumentException - Java has a method for determining whether any integer is a valid code point (Character.isValidCodePoint), and this exception is thrown if parameter 'codePoint' fails that test. Not all of the integers "fit" into the 17 Unicode "planes". Note that each of the planes in 'Unicode Space' contains 65,536 (or 2^16) characters.
- Code:
- Exact Method Body:
if (! Character.isValidCodePoint(codePoint)) throw new IllegalArgumentException(
    "The integer you have passed to this method [" + codePoint + "] was deemed an " +
    "invalid Code Point after a call to: [java.lang.Character.isValidCodePoint(int)].  " +
    "Therefore this method is unable to provide an HTML Escape Sequence."
);

return use16BitEscapeSequence
    ? "&#" + codePoint + ";"
    : "&#x" + Integer.toHexString(codePoint).toUpperCase() + ";";
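To escape a whole String that may contain Emoji's, the text has to be walked by code point rather than by 'char'. The sketch below is stand-alone and hypothetical: the class CodePointEscapeSketch and both of its methods are illustrations only, and 'hex' is assumed to select Base 16 as the parameter description states.

```java
final class CodePointEscapeSketch {
    /** Escapes one code point; 'hex' selects Base 16 (an assumption of this sketch). */
    static String escCodePoint(int cp, boolean hex) {
        if (!Character.isValidCodePoint(cp))
            throw new IllegalArgumentException("Invalid code point: " + cp);

        return hex ? "&#x" + Integer.toHexString(cp).toUpperCase() + ";" : "&#" + cp + ";";
    }

    /** Leaves ASCII untouched, and escapes every other code point in 's'. */
    static String escapeNonAscii(String s, boolean hex) {
        StringBuilder sb = new StringBuilder();

        // String.codePoints() joins each surrogate pair back into a single int.
        s.codePoints().forEach(cp -> {
            if (cp < 128) sb.append((char) cp);
            else          sb.append(escCodePoint(cp, hex));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\uD83D\uDC4D" is the surrogate pair for code point 128077 (Thumb's Up).
        System.out.println(escapeNonAscii("ok \uD83D\uDC4D", false));  // prints ok &#128077;
        System.out.println(escapeNonAscii("ok \uD83D\uDC4D", true));   // prints ok &#x1F44D;
    }
}
```

Iterating with charAt instead would see the two surrogate halves separately and emit two escape sequences, neither of which names the Emoji.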
-
hasHTMLEsc
public static boolean hasHTMLEsc(char c)
Check the internal Escape Sequence Lookup Table. If there is an escape sequence String associated with the char provided to this method, then return TRUE. If there is no such Escape Sequence in the Lookup Table associated with parameter 'c', then return FALSE.
The Lookup Table can identify whether char parameter 'c' has an associated HTML Escape Sequence, or not. Escape sequences are always short text-String's that were selected by the W3C (long ago, in the 1990's).
Returns TRUE if there is an associated String escape-sequence for char-parameter 'c', and FALSE otherwise. Please review the brief sample table below:
Input Character:             Method Return Value:
'&' (ampersand)              TRUE
'A' (letter-A)               FALSE
'<' (less-than-symbol)       TRUE
'9' (number-9)               FALSE
'>' (greater-than-symbol)    TRUE
View Escape-Codes:
The list included within the page attached (below) is a complete list of all Text-String HTML Escape-Sequences that are known to this class. This list does not include any Code-Point, Hex or Decimal-Number sequences.
All HTML Escape Sequences
- Parameters:
c - Any ASCII or UNICODE Character
- Returns:
TRUE if there is a String escape sequence for this character, and FALSE otherwise.
- See Also:
htmlEsc(char)
- Code:
- Exact Method Body:
return htmlEscSeq.get(Character.valueOf(c)) != null;
-
htmlEsc
public static java.lang.String htmlEsc(char c)
Check the internal Escape Sequence Lookup Table. If there is an escape sequence String associated with the char provided to this method, then return it.
For Instance:
Input Character:             Method Return Value:
'&' (ampersand)              "amp"
'A' (letter-A)               null
'<' (less-than-symbol)       "lt"
'9' (number-9)               null
'>' (greater-than-symbol)    "gt"
View Escape Codes:
The list included within the page attached (below) is a complete list of all Text-String HTML Escape-Sequences that are known to this class. This list does not include any Code-Point, Hex or Decimal-Number sequences.
All HTML Escape Sequences
- Parameters:
c - Any ASCII or UNICODE Character
- Returns:
- The String that is used by web-browsers to escape this ASCII / Uni-Code character - if there is one saved in the internal Lookup Table. If the character provided does not have an associated HTML Escape String, then 'null' is returned.
NOTE: The entire escape-String is not provided, just the inner-characters. The leading '&' (Ampersand) and the trailing ';' (Semi-Colon) are not appended to the returned String.
- See Also:
hasHTMLEsc(char)
- Code:
- Exact Method Body:
return htmlEscSeq.get(Character.valueOf(c));
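Because only the inner characters of the sequence are returned, a caller has to add the '&' and the ';' itself. A stand-alone sketch of that pattern, with a hypothetical four-entry SEQ map standing in for the internal Lookup Table:

```java
import java.util.Map;

final class LookupSketch {
    // Hypothetical stand-in for the internal char-to-sequence Lookup Table.
    private static final Map<Character, String> SEQ =
        Map.of('&', "amp", '<', "lt", '>', "gt", '"', "quot");

    /** Escapes every character of 's' that has an entry in the table. */
    static String escapeText(String s) {
        StringBuilder sb = new StringBuilder();

        for (char c : s.toCharArray()) {
            String name = SEQ.get(c);   // null when there is no named sequence

            if (name == null) sb.append(c);
            else              sb.append('&').append(name).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeText("a<b & c"));  // prints a&lt;b &amp; c
    }
}
```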
-
-