java.lang.Object
- Torello.CSS.CSSToken
- - Torello.CSS.UnicodeRange

All Implemented Interfaces:

java.io.Serializable, java.lang.CharSequence, java.lang.Comparable<java.lang.CharSequence>
```
public class UnicodeRange
extends CSSToken
implements java.lang.CharSequence, java.io.Serializable, java.lang.Comparable<java.lang.CharSequence>
```
This is a Token Data-Class. It descends from the root token class, CSSToken. Instances of this class are usually produced by the CSSTokenizer class. Many (but not all) CSSToken subclasses maintain a static method named 'build' for building instances. Any CSSToken subclass that is neither a singleton instance nor an "Error-Subtype" should have such a builder. Singleton instances do not need builders, and the two Error-Subtype classes can only be generated by the tokenizer.

All CSSToken subclasses have a CSSToken.str field which contains the exact character data that was extracted and used to construct the instance. All subclasses also have several "Loop Optimization" methods. These methods may or may not be useful in light of newer additions available in JDK 17 and 21, such as pattern matching for instanceof (the 'instanceof varName' form, which names a variable inside the conditional expression).
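
As an illustration of the 'is' and 'if' pattern, the sketch below compares an 'if'-style accessor with pattern matching for instanceof. It does not use the library: TokenSketch, RangeSketch, OtherSketch and LoopOptimizerSketch are hypothetical stand-ins, and the shortened method names isRange and ifRange merely mirror isUnicodeRange and ifUnicodeRange.

```java
import java.util.List;

/** Hypothetical stand-in for the abstract root token class. */
abstract class TokenSketch
{
    // Defaults inherited by every subclass that is not a range token
    boolean isRange() { return false; }
    RangeSketch ifRange() { return null; }
}

/** Hypothetical stand-in for a range token with start and end fields. */
final class RangeSketch extends TokenSketch
{
    final int sRange, eRange;

    RangeSketch(int sRange, int eRange)
    { this.sRange = sRange; this.eRange = eRange; }

    // Only this subclass overrides the defaults
    @Override boolean isRange() { return true; }
    @Override RangeSketch ifRange() { return this; }
}

/** Hypothetical stand-in for any other token type. */
final class OtherSketch extends TokenSketch { }

final class LoopOptimizerSketch
{
    /** Counts well-ordered ranges using the 'if' method: a null check replaces instanceof plus cast. */
    static int countWithIf(List<TokenSketch> tokens)
    {
        int count = 0;

        for (TokenSketch t : tokens)
        {
            RangeSketch r = t.ifRange();
            if (r != null && r.sRange <= r.eRange) count++;
        }

        return count;
    }

    /** The same loop using pattern matching for instanceof (Java 16 and later). */
    static int countWithInstanceof(List<TokenSketch> tokens)
    {
        int count = 0;

        for (TokenSketch t : tokens)
            if (t instanceof RangeSketch r && r.sRange <= r.eRange) count++;

        return count;
    }
}
```

Both loops visit the same tokens and apply the same test; which form reads better is largely a matter of taste and of the JDK version in use.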

The algorithms in this tokenizer are based solely on the CSS Working-Group's Syntax-Documentation. This document may be viewed here: CSS Working-Group CSS-Syntax. There is an external site that maintains all things CSS, located at drafts.csswg.org

Represents a range of characters in Unicode.

See Also:

Serialized Form
Hi-Lited Source-Code:
- View Here: Torello/CSS/UnicodeRange.java
- Open New Browser-Tab: Torello/CSS/UnicodeRange.java
File Size: 14,935 Bytes Line Count: 334 '\n' Characters Found

Field Summary

Serializable ID

Modifier and Type	Field
`protected static long`	`serialVersionUID`

This Data Class Instance Fields
Modifier and Type	Field
`int`	`eRange`
`int`	`sRange`
- Fields inherited from class Torello.CSS.CSSToken
  str

Method Summary

Static Builders: Build an Instance of this class

Modifier and Type	Method
`static UnicodeRange`	`build(String rangeStr)`

Verify & Identify: CSS Working-Group Implementation
Modifier and Type	Method
`static boolean`	`is(int[] css, int sPos)`

Tokenize CSS: CSS Working-Group Implementation
Modifier and Type	Method
`protected static void`	`consume(int[] css, ByRef<Integer> POS, Consumer<CSSToken> returnParsedToken)`

Loop Optimizer Methods: 'is' & 'if'
Modifier and Type	Method
`UnicodeRange`	`ifUnicodeRange()`
`boolean`	`isUnicodeRange()`

Methods inherited from class Torello.CSS.CSSToken
asAtKeyword, asBadStr, asBadURL, asCDC, asCDO, asComment, asDelimiter, asDimension, asFunc, asHash, asIdentifier, asNum, asPercentage, asPunct, asStr, asUnicodeRange, asURL, asWhitespace, charAt, compareTo, equals, ifAtKeyword, ifBadStr, ifBadURL, ifCDC, ifCDO, ifComment, ifDelimiter, ifDelimiter, ifDimension, ifFunc, ifHash, ifIdentifier, ifNum, ifPercentage, ifPunct, ifPunct, ifStr, ifURL, ifWhitespace, isAtKeyword, isBadStr, isBadURL, isCDC, isCDO, isComment, isDelimiter, isDelimiter, isDimension, isFunc, isHash, isIdentifier, isNum, isPercentage, isPunct, isPunct, isStr, isURL, isWhitespace, length, subSequence, toString

Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface java.lang.CharSequence
charAt, chars, codePoints, length, subSequence, toString

Methods inherited from interface java.lang.Comparable
compareTo

- Field Detail
  - serialVersionUID
    
    protected static final long serialVersionUID
    
    This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "Hibernate" to store data.
    
    See Also:
    
    Constant Field Values
    
    Code:
    
    Exact Field Declaration Expression:
    
    protected static final long serialVersionUID = 1;
  - sRange
    
    public final int sRange
    
    The starting value of the range that has been specified, as a Java int.
  - eRange
    
    public final int eRange
    
    The ending value of the range that has been specified, as a Java int.
- Method Detail
  - isUnicodeRange
    
    public final boolean isUnicodeRange()
    
    Description copied from class: CSSToken
    
    Loop Optimization: This method only returns TRUE if this is an actual instance of UnicodeRange.
    
    Overrides:
    
    isUnicodeRange in class CSSToken
    
    Returns:
    
    This method returns FALSE for all instances of CSSToken, except when 'this' instance is actually the UnicodeRange Subclass.
    
    That class has overridden this method, and returns TRUE.
    
    See Also:
    
    isUnicodeRange()
  - ifUnicodeRange
    
    public final UnicodeRange ifUnicodeRange()
    
    Description copied from class: CSSToken
    
    Loop Optimization: When this method is invoked on an instance of sub-class UnicodeRange this method produces 'this' instance.
    
    Overrides:
    
    ifUnicodeRange in class CSSToken
    
    Returns:
    
    This method shall return null, always, except when 'this' is an actual instance of UnicodeRange. When so, this method simply returns 'this'. All other sub-classes of (abstract) class CSSToken inherit this method, and therefore return null.
    
    See Also:
    
    ifUnicodeRange()
  - build
    
    public static UnicodeRange build(java.lang.String rangeStr)
    
    Static-Builder Method for creating an instance of this class. This Static-Method is a substitute for an actual Constructor. Because many of the 'consume(...)' methods in the Token Classes for Torello.CSS actually generate more than one CSSToken instance, writing publicly available constructors is largely impossible.
    
    The upside to this approach is that the build methods and the consume methods share identical code. Furthermore this code is (nearly) perfectly based on the Pseudo-Code on the CSS Working-Group Website.
    
    Parameters:
    
    rangeStr - Any Java-String that can be parsed into an instance of UnicodeRange
    
    Returns:
    
    An instance of UnicodeRange.
    
    If the contents of the Input-String parameter 'rangeStr' cannot be consumed, exactly, by this class' 'consume' method, then an exception is thrown.
    
    Throws:
    
    TokenizeException - This exception may be thrown for any number of reasons involving the inability to parse input parameter 'rangeStr'.
  - is
    
    public static boolean is(int[] css, int sPos)
    
    Checks whether or not the next token to consume is a Unicode Range.
    
    Tokenizer: Unicode-Range Check Method, Pseudo-Code
    Making use of the CSS Parser DOES NOT require any knowledge of how the underlying Pass 1 Tokenizer actually works. Browser-War people are usually pretty convincing that parsing CSS is a "Moving Target" type of operation, not to be engaged by mere mortals.
    
    Below is the CSS Working Group's Unicode-Range Check Pseudo-Code. You may review it if you are at wit's end, and have nothing better to do. There is no need to actually invoke this method; it is here solely for informational purposes.
    These Parsing Pseudo-Code Instructions and Rail-Road Diagrams have been copied from the CSS-Working-Group Web-Site:
    https://drafts.csswg.org/css-syntax/#check-if-three-code-points-would-start-a-unicode-range
    
    4.3.11. Check if three code points would start a unicode-range
    
    This section describes how to check if three code points would start a unicode-range. The algorithm described here can be called explicitly with three code points, or can be called with the input stream itself. In the latter case, the three code points in question are the current input code point and the next two input code points, in that order.
    
    Note: This algorithm will not consume any additional code points.
    
    If all of the following are true:
    
    The first code point is either U+0055 LATIN CAPITAL LETTER U (U) or U+0075 LATIN SMALL LETTER U (u)
    
    The second code point is U+002B PLUS SIGN (+).
    
    The third code point is either U+003F QUESTION MARK (?) or a hex digit
    
    then return true.
    
    Otherwise return false.
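    
    The three conditions above can be sketched in Java as follows. This is a minimal illustration, not the library's actual source: the class name UnicodeRangeCheckSketch and the helper isHexDigit are hypothetical, and treating an input with fewer than three remaining code points as a non-match is an assumption of the sketch.
    
    ```java
    /** Illustrative stand-in; not the library's actual source. */
    final class UnicodeRangeCheckSketch
    {
        /** True for the code points 0-9, A-F and a-f. */
        static boolean isHexDigit(int cp)
        {
            return (cp >= '0' && cp <= '9')
                || (cp >= 'A' && cp <= 'F')
                || (cp >= 'a' && cp <= 'f');
        }

        /** Applies the three conditions to css[sPos], css[sPos + 1] and css[sPos + 2]. */
        static boolean wouldStartUnicodeRange(int[] css, int sPos)
        {
            // Assumption of this sketch: fewer than three code points left means "no match"
            if (sPos < 0 || sPos + 2 >= css.length) return false;

            int first  = css[sPos];
            int second = css[sPos + 1];
            int third  = css[sPos + 2];

            return (first == 'U' || first == 'u')           // U+0055 or U+0075
                && (second == '+')                          // U+002B PLUS SIGN
                && (third == '?' || isHexDigit(third));     // U+003F or a hex digit
        }
    }
    ```
    
    A Java String can be turned into the expected code-point array with "U+26".codePoints().toArray(). The method only inspects the array and never advances a position, which matches the note that the algorithm consumes no code points.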
    
    Parameters:
    
    css - CSS-String as an array of code-points.
    
    sPos - The array-index where the tokenizer is to consume its next token
    
    Returns:
    
    TRUE if and only if the next token in the array is a Unicode-Range
  - consume
    
    protected static void consume (int[] css, ByRef<java.lang.Integer> POS, java.util.function.Consumer<CSSToken> returnParsedToken)
    
    This is a tokenizer method which "consumes" the next UnicodeRange-Token from the input Code-Point Array.
    
    Tokenizer: UnicodeRange Consume Method, Pseudo-Code
    Making use of the CSS Parser DOES NOT require any knowledge of how the underlying Pass 1 Tokenizer actually works. Browser-War people are usually pretty convincing that parsing CSS is a "Moving Target" type of operation, not to be engaged by mere mortals.
    
    Below is the CSS Working Group's UnicodeRange Pseudo-Code. You may review it if you are at wit's end, and have nothing better to do. There is no need to actually invoke this method; it is here solely for informational purposes.
    These Parsing Pseudo-Code Instructions and Rail-Road Diagrams have been copied from the CSS-Working-Group Web-Site:
    https://drafts.csswg.org/css-syntax/#consume-unicode-range-token
    
    Consume a unicode-range token
    
    This section describes how to consume a unicode-range token from a stream of code points. It returns a <unicode-range-token>.
    
    Note: This algorithm does not do the verification of the first few code points that are necessary to ensure the returned code points would constitute an <unicode-range-token>. Ensure that the stream would start a unicode-range before calling this algorithm.
    
    Note: This token is not produced by the tokenizer under normal circumstances. This algorithm is only called during consume the value of a unicode-range descriptor, which itself is only called as a special case for parsing the unicode-range descriptor; this single invocation in the entire language is due to a bad syntax design in early CSS.
    
    Consume the next two input code points and discard them.
    
    Consume as many hex digits as possible, but no more than 6. If less than 6 hex digits were consumed, consume as many U+003F QUESTION MARK (?) code points as possible, but no more than enough to make the total of hex digits and U+003F QUESTION MARK (?) code points equal to 6.
    
    Let first segment be the consumed code points.
    
    If first segment contains any question mark code points, then:
    
    Replace the question marks in first segment with U+0030 DIGIT ZERO (0) code points, and interpret the result as a hexadecimal number. Let this be start of range.
    
    Replace the question marks in first segment with U+0046 LATIN CAPITAL LETTER F (F) code points, and interpret the result as a hexadecimal number. Let this be end of range.
    
    Return a new <unicode-range-token> starting at start of range and ending at end of range.
    
    Otherwise, interpret first segment as a hexadecimal number, and let the result be start of range.
    
    If the next 2 input code points are U+002D HYPHEN-MINUS (-) followed by a hex digit, then:
    
    Consume the next input code point.
    
    Consume as many hex digits as possible, but no more than 6. Interpret the consumed code points as a hexadecimal number. Let this be end of range.
    
    Return a new <unicode-range-token> starting at start of range and ending at end of range.
    
    Otherwise, return a new <unicode-range-token> both starting and ending at start of range.
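    
    The steps above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the library's actual source: the class UnicodeRangeConsumeSketch is hypothetical, and it returns an int array of { start of range, end of range, new position } instead of reporting through the ByRef and Consumer parameters of the real method. As in the specification, the caller is assumed to have already verified that the input starts a unicode-range.
    
    ```java
    /** Illustrative stand-in; not the library's actual source. */
    final class UnicodeRangeConsumeSketch
    {
        /** True for the code points 0-9, A-F and a-f. */
        static boolean isHexDigit(int cp)
        {
            return (cp >= '0' && cp <= '9')
                || (cp >= 'A' && cp <= 'F')
                || (cp >= 'a' && cp <= 'f');
        }

        /** Returns { startOfRange, endOfRange, positionAfterToken }. */
        static int[] consume(int[] css, int pos)
        {
            // Consume the next two code points ("U+" or "u+") and discard them
            pos += 2;

            // Up to 6 hex digits ...
            StringBuilder first = new StringBuilder();
            while (pos < css.length && first.length() < 6 && isHexDigit(css[pos]))
                first.appendCodePoint(css[pos++]);

            // ... then question marks, until the segment holds 6 characters in total
            boolean hasWildcard = false;
            while (pos < css.length && first.length() < 6 && css[pos] == '?')
            {
                first.append('?');
                pos++;
                hasWildcard = true;
            }

            if (hasWildcard)
            {
                // '?' becomes '0' for the start of range and 'F' for the end of range
                int start = Integer.parseInt(first.toString().replace('?', '0'), 16);
                int end   = Integer.parseInt(first.toString().replace('?', 'F'), 16);
                return new int[] { start, end, pos };
            }

            int start = Integer.parseInt(first.toString(), 16);

            // A hyphen counts only when a hex digit follows it
            if (pos + 1 < css.length && css[pos] == '-' && isHexDigit(css[pos + 1]))
            {
                pos++; // consume the hyphen

                StringBuilder second = new StringBuilder();
                while (pos < css.length && second.length() < 6 && isHexDigit(css[pos]))
                    second.appendCodePoint(css[pos++]);

                int end = Integer.parseInt(second.toString(), 16);
                return new int[] { start, end, pos };
            }

            // Single code point: the range starts and ends at the same value
            return new int[] { start, start, pos };
        }
    }
    ```
    
    For example, the input "U+4??" yields a range from 0x400 to 0x4FF, and "U+0025-00FF" yields a range from 0x25 to 0xFF.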
    
