Package Torello.Languages
Class Verbs
- java.lang.Object
-
- Torello.Languages.Verbs
-
public class Verbs extends java.lang.Object
Conjugating Verbs (Spanish).
The primary use of this class is to facilitate adding HTML <SPAN DATA-RV="regular_verb"> elements, and also <SPAN DATA-IV="irregular_verb"> elements, to a page of text. This is valuable mainly because JavaScript zIndex-based popup windows can easily be added by including the simple JavaScript files provided in your Spanish Language Pages.
For more information, see the pages at SpanishNewsBoard.com, which demonstrate the concept of "Verb Conjugation Popup Windows."
-
-
Nested Class Summary
Modifier and Type        Class
static class             Verbs.WebFiles
-
Field Summary
Modifier and Type        Field
static String[]          skip
-
Method Summary
Modifier and Type        Method
static Vector<HTMLNode>  addSpanishVerbSpans(String text, TreeSet<String> regularVerbsFound, TreeSet<String> irregularVerbsFound, TreeSet<String> wordsNotFound)
static void              addSpanishVerbSpans(Vector<HTMLNode> page, TreeSet<String> regularVerbsFound, TreeSet<String> irregularVerbsFound, TreeSet<String> wordsNotFound)
static String            getDefinition(String infinitiveInLowerCase)
static String            getInfinitive(String wordInLowerCase)
static Iterator<String>  infinitives()
static Iterator<String>  irregularInfinitives()
static boolean           isIrregular(String infinitiveInLowerCase)
static void              loadConjugations()
static void              loadDefinitions()
static void              loadInfinitives()
static void              loadIrregularInfinitives()
static void              releaseConjugations()
static void              releaseDefinitions()
static void              releaseInfinitives()
static void              releaseIrregularInfinitives()
-
-
-
Field Detail
-
skip
public static java.lang.String[] skip
This software is not perfect; human language poses a whole new order of problems. Many features could be added to make a better translator, but I have been busy writing an HTML Scrape Package instead. The words in this array are extremely common in Spanish but, in about 80% to 90% of cases, are not being used as verbs. A "Lexical Analysis" could probably determine much more reliably when a word is actually a verb, but for now these words are simply skipped and never identified as verbs at all.
NOTE: You may change this list at your discretion by re-assigning the array.
-
-
Method Detail
-
loadConjugations
public static void loadConjugations()
Loads the conjugations String into memory. It must be in memory before working with Verb-Spans.
- See Also:
LFEC.loadFile_JAR(Class, String)
- Code:
- Exact Method Body:
conjugations = LFEC.loadFile_JAR(Verbs.class, CONJUGATIONS);
-
releaseConjugations
public static void releaseConjugations()
Releases the memory for the (rather large) Java String containing the verb conjugations, and calls System.gc().
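The load/release pairs in this class all follow the same lifecycle. The sketch below is a self-contained illustration of that pattern, not this class's source: the class name, the literal data, and the choice to clear the field by assigning null are assumptions made for the example.

```java
class LoadReleaseSketch {
    // Stand-in for a private static data field (hypothetical name and content)
    private static String conjugations = null;

    static void load() {
        // The real class reads its data from a file inside the JAR;
        // a short literal is used here so the sketch runs on its own.
        conjugations = "\nhablar: hablo, hablas, habla,\n";
    }

    static void release() {
        conjugations = null;   // assumption: drop the only reference to the data
        System.gc();           // only a hint; the JVM is free to ignore it
    }

    static boolean isLoaded() {
        return conjugations != null;
    }

    public static void main(String[] args) {
        load();
        System.out.println(isLoaded());   // true
        release();
        System.out.println(isLoaded());   // false
    }
}
```

With the library on the classpath, the equivalent calls would be Verbs.loadConjugations() before working with Verb-Spans and Verbs.releaseConjugations() once the data is no longer needed.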
-
loadIrregularInfinitives
public static void loadIrregularInfinitives()
Loads the list of Irregular Infinitives into Java memory from the JAR. This TreeSet needs to be loaded into memory before working with Verb-Spans.
- See Also:
LFEC.readObjectFromFile_JAR(Class, String, boolean, Class)
- Code:
- Exact Method Body:
irregularInfinitives = (TreeSet<String>) LFEC.readObjectFromFile_JAR (Verbs.class, IRREG_INFINITIVES, true, TreeSet.class);
-
releaseIrregularInfinitives
public static void releaseIrregularInfinitives()
Releases the memory for the (rather large) java.util.TreeSet of Irregular Infinitives, and calls System.gc().
-
loadInfinitives
public static void loadInfinitives()
Loads the complete list of known infinitives from the JAR into the TreeSet<String>.
- See Also:
LFEC.readObjectFromFile_JAR(Class, String, boolean, Class)
- Code:
- Exact Method Body:
infinitives = (TreeSet<String>) LFEC.readObjectFromFile_JAR (Verbs.class, INFINITIVES, true, TreeSet.class);
-
releaseInfinitives
public static void releaseInfinitives()
Releases the memory for the TreeSet of infinitives, and calls System.gc().
-
loadDefinitions
public static void loadDefinitions()
Loads the definitions file, which is typed as a TreeMap<String, String>.
- See Also:
LFEC.readObjectFromFile_JAR(Class, String, boolean, Class)
- Code:
- Exact Method Body:
definitions = (TreeMap<String, String>) LFEC.readObjectFromFile_JAR (Verbs.class, DEFINITIONS, true, TreeMap.class);
-
releaseDefinitions
public static void releaseDefinitions()
Releases the memory for the TreeMap of definitions, and calls System.gc().
-
infinitives
public static java.util.Iterator<java.lang.String> infinitives()
Generates an iterator of Spanish Verb Infinitives. Items may not be removed via the iterator's remove() method.
- Returns:
- An iterator of all Spanish Verbs loaded into the infinitives TreeSet.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
return new RemoveUnsupportedIterator<String>(infinitives.iterator());
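A wrapper of this kind can be sketched in a few lines. The example below is a hypothetical stand-in, not the source of RemoveUnsupportedIterator: it assumes that the wrapper delegates hasNext() and next() to the underlying iterator and rejects remove() with an UnsupportedOperationException.

```java
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

class ReadOnlyIteratorSketch {
    // Wraps any iterator so that callers can read from it but not remove through it
    static <T> Iterator<T> readOnly(Iterator<T> inner) {
        return new Iterator<T>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            @Override public T next()          { return inner.next(); }
            @Override public void remove() {
                // assumption: removal is rejected rather than silently ignored
                throw new UnsupportedOperationException("remove() is not supported");
            }
        };
    }

    public static void main(String[] args) {
        TreeSet<String> infinitives = new TreeSet<>(List.of("hablar", "comer"));
        Iterator<String> it = readOnly(infinitives.iterator());

        System.out.println(it.next());   // comer (a TreeSet iterates in sorted order)
        try {
            it.remove();
        } catch (UnsupportedOperationException e) {
            System.out.println("the backing set is unchanged: " + infinitives.size());
        }
    }
}
```

Wrapping the iterator protects the shared static TreeSet from modification by callers without the cost of copying it.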
-
irregularInfinitives
public static java.util.Iterator<java.lang.String> irregularInfinitives()
Generates an iterator of Spanish Irregular-Verbs in Infinitive Form. Items may not be removed via the iterator's remove() method.
- Returns:
- An iterator of all Irregular Spanish Verbs loaded into the irregular-infinitives TreeSet.
- See Also:
RemoveUnsupportedIterator
- Code:
- Exact Method Body:
return new RemoveUnsupportedIterator<String>(irregularInfinitives.iterator());
-
getDefinition
public static java.lang.String getDefinition (java.lang.String infinitiveInLowerCase)
Gets the quick-definition of a Spanish Verb.
EXPECTATIONS:
- The "definitions" data file must already be loaded into memory.
- To be precise, loadDefinitions() needs to have been called!
- The word MUST be in lower-case Spanish; otherwise results might be inaccurate!
- TRY: ES.toLowerCaseSpanish(String) to make sure.
- Parameters:
infinitiveInLowerCase
- This may be any Spanish Verb, as long as it is in the infinitive form.
- Returns:
- The string stored as the value in the TreeMap<String, String> definitions, or null if this infinitive is not found in the dictionary.
- See Also:
ES.toLowerCaseSpanish(String)
- Code:
- Exact Method Body:
return definitions.get(infinitiveInLowerCase);
-
getInfinitive
public static java.lang.String getInfinitive (java.lang.String wordInLowerCase)
Gets the infinitive form of a verb String.
EXPECTATIONS:
- The "conjugations" data file must already be loaded into memory.
- To be precise, loadConjugations() needs to have been called!
- The word MUST be in lower-case Spanish; otherwise results might be inaccurate!
- TRY: ES.toLowerCaseSpanish(String) to make sure.
- Parameters:
wordInLowerCase
- This can be any word (in Spanish, or any language for that matter). It is expected to be a conjugated form of a Spanish verb; if it is, the original infinitive form of that verb will be returned.
- Returns:
- The infinitive of a verb, if the word passed is a direct conjugation of that verb.
- null if there are no matching verb conjugations in private static String conjugations.
- Code:
- Exact Method Body:
// Eliminates common words that aren't verbs - but conjugate .. "para" "como"
// for (int k=0; k < skip.length; k++) if (wtlc.equals(skip[k])) return null;

// GREP through the conjugations data file (stored in String: conjugations)
int pos = conjugations.indexOf(" " + wordInLowerCase + ",");

if (pos == -1)
    if (wordInLowerCase.charAt(wordInLowerCase.length() - 1) == 'r')
        pos = conjugations.indexOf("\n" + wordInLowerCase + ":");

// the post-increment (++) is for the infinitive case match.
// Specifically, the first character, in this (the infinitive) case, would be a
// newline '\n'.. and a '\n' character is exactly what the loop which follows is
// grep'ing for...
if (pos == -1)  return null;
else            pos++;

// There *WAS* a match in the conjugations data file. - get infinitive and return
while ((conjugations.charAt(--pos) != '\n') && (pos > 0));

return conjugations.substring(pos + 1, conjugations.indexOf(':', pos + 1));
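The lookup above can be tried without the JAR using a small sketch. The data layout used here (one "infinitive: form, form," entry per line, each preceded by a newline) is an assumption inferred from the two search patterns in the method body; the sample verbs and the class name are hypothetical. The sketch locates the start of the line with lastIndexOf instead of the backwards loop, which is a substitution made for readability.

```java
class InfinitiveLookupSketch {
    // Hypothetical sample data; the real conjugations file ships inside the JAR.
    static final String CONJUGATIONS =
        "\nhablar: hablo, hablas, habla,\ncomer: como, comes, come,\n";

    static String getInfinitive(String wordInLowerCase) {
        if (wordInLowerCase.isEmpty()) return null;

        // First try to match a conjugated form: " word,"
        int pos = CONJUGATIONS.indexOf(" " + wordInLowerCase + ",");

        // Otherwise a word ending in 'r' may itself be an infinitive: "\nword:"
        if (pos == -1 && wordInLowerCase.endsWith("r")) {
            int lineBreak = CONJUGATIONS.indexOf("\n" + wordInLowerCase + ":");
            if (lineBreak != -1) pos = lineBreak + 1;
        }
        if (pos == -1) return null;

        // Back up to the start of the matching line, then read up to the colon
        int lineStart = CONJUGATIONS.lastIndexOf('\n', pos) + 1;
        return CONJUGATIONS.substring(lineStart, CONJUGATIONS.indexOf(':', lineStart));
    }

    public static void main(String[] args) {
        System.out.println(getInfinitive("hablas"));   // hablar
        System.out.println(getInfinitive("comer"));    // comer
        System.out.println(getInfinitive("mesa"));     // null
    }
}
```

Keeping the data as one large String and searching it with indexOf avoids building a map entry for every conjugated form, at the cost of a linear scan per lookup.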
-
isIrregular
public static boolean isIrregular(java.lang.String infinitiveInLowerCase)
Checks if a word is an irregular verb.
EXPECTATIONS:
- The "irregular infinitives" data file must already be loaded into memory.
- To be precise, loadIrregularInfinitives() needs to have been called!
- The word MUST be in lower-case Spanish; otherwise results will be inaccurate!
- TRY: ES.toLowerCaseSpanish(String) to make sure.
- Parameters:
infinitiveInLowerCase
- This may be any Spanish Verb, as long as it is in the infinitive form. This word must have been converted to lower case; if not, the method will likely return FALSE.
- Returns:
- TRUE if this verb is contained in the list of irregular verbs, FALSE otherwise.
- See Also:
ES.toLowerCaseSpanish(String)
- Code:
- Exact Method Body:
return irregularInfinitives.contains(infinitiveInLowerCase);
-
addSpanishVerbSpans
public static void addSpanishVerbSpans (java.util.Vector<HTMLNode> page, java.util.TreeSet<java.lang.String> regularVerbsFound, java.util.TreeSet<java.lang.String> irregularVerbsFound, java.util.TreeSet<java.lang.String> wordsNotFound)
Calls addSpanishVerbSpans(String, TreeSet, TreeSet, TreeSet) on each TextNode found in the page Vector.
- Parameters:
regularVerbsFound
- If this parameter isn't null, then any and all regular verbs found within the text will be added to this TreeSet. If this parameter is null, it will be ignored.
irregularVerbsFound
- If this parameter isn't null, then any irregular verbs found in this text will be added to this TreeSet. If this parameter is null, it will be ignored.
wordsNotFound
- If this parameter isn't null, all words that are found and aren't verbs are entered into this TreeSet. If this parameter is null, it will be ignored.
- See Also:
addSpanishVerbSpans(String, TreeSet, TreeSet, TreeSet)
- Code:
- Exact Method Body:
HTMLNode n;

for (int i=0; i < page.size(); i++)
    if ((n = page.elementAt(i)) instanceof TextNode)
    {
        Vector<HTMLNode> withSpans = addSpanishVerbSpans
            (n.str, regularVerbsFound, irregularVerbsFound, wordsNotFound);

        page.removeElementAt(i);
        page.addAll(i, withSpans);

        i += withSpans.size() - 1;
        // Trust me, this is right!
        // If "withSpans.size() == 1" (a.k.a. "no-change"), then should do: i += 0;
        // If "withSpans.size() == 2" (increased by 1), then should do: i += 1;
    }
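The index adjustment in the loop above is easiest to check on plain strings. The sketch below is a simplified stand-in that uses a Vector<String> instead of HTMLNode elements; the expand helper and the rule that items beginning with '<' are left alone are hypothetical substitutes for the TextNode test and the String overload.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class SpliceInPlaceSketch {
    // Hypothetical stand-in for the String overload: one text item becomes several
    static List<String> expand(String text) {
        return Arrays.asList(text.split(" "));
    }

    static void expandAll(Vector<String> page) {
        for (int i = 0; i < page.size(); i++) {
            String item = page.elementAt(i);
            if (item.startsWith("<")) continue;   // treat as a tag and leave it alone

            List<String> replacement = expand(item);
            page.removeElementAt(i);
            page.addAll(i, replacement);

            // Move i to the last inserted element, so that the loop's i++
            // lands on the first element that has not been examined yet.
            i += replacement.size() - 1;
        }
    }

    public static void main(String[] args) {
        Vector<String> page =
            new Vector<>(List.of("<p>", "uno dos tres", "</p>", "cuatro"));
        expandAll(page);
        System.out.println(page);   // [<p>, uno, dos, tres, </p>, cuatro]
    }
}
```

Skipping past the inserted elements also prevents the loop from processing its own output a second time.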
-
addSpanishVerbSpans
public static java.util.Vector<HTMLNode> addSpanishVerbSpans (java.lang.String text, java.util.TreeSet<java.lang.String> regularVerbsFound, java.util.TreeSet<java.lang.String> irregularVerbsFound, java.util.TreeSet<java.lang.String> wordsNotFound)
The purpose of this method is to go through the Spanish verbs in the text of an HTML page and wrap each of them in an HTML <SPAN> element, to facilitate Verb-Conjugation Popup-Windows.
- Parameters:
regularVerbsFound
- If this parameter isn't null, then any and all regular verbs found within the text will be added to this TreeSet. If this parameter is null, it will be ignored.
irregularVerbsFound
- If this parameter isn't null, then any irregular verbs found in this text will be added to this TreeSet. If this parameter is null, it will be ignored.
wordsNotFound
- If this parameter isn't null, all words that are found and aren't verbs are entered into this TreeSet. If this parameter is null, it will be ignored.
- Returns:
- An HTML sub-page (as a Vector) in which each Spanish verb found has been surrounded by an HTML <SPAN> element that indicates the regularity of the verb and its infinitive form.
- See Also:
ES.onlyLanguageChars(String), ES.toLowerCaseSpanish(String), HTMLPage.getPageTokens(CharSequence, boolean)
- Code:
- Exact Method Body:
// Keep list of found regular-verbs in the tree-set
boolean keepRV = regularVerbsFound != null;

// Keep list of found irregular-verbs in the tree set
boolean keepIV = irregularVerbsFound != null;

// Keep list of words that weren't verbs in the tree-set
boolean keepNV = wordsNotFound != null;

StringBuilder outSB = new StringBuilder();

// Splits the string by spaces
String[] words = text.split(" ");

for (int j=0; j < words.length; j++)
{
    // Sometimes it is the empty string or just white-space
    String trim = words[j].trim();
    if (trim.length() == 0) { outSB.append(" " + words[j]); continue; }

    // Eliminates leading and trailing punctuation & HTML tags
    Matcher m = P1.matcher(trim);
    if (! m.find()) { outSB.append(" " + words[j]); continue; }

    String pre  = m.group(2);
    String word = m.group(3);
    String post = m.group(4);

    if (! ES.onlyLanguageChars(word)) System.out.println
        ("ORIG: [" + words[j] + "], " + pre + ", " + word + ", " + post);

    if (word == null) { outSB.append(" " + words[j]); continue; }
    if (pre == null)  pre = "";
    if (post == null) post = "";
    if (word.length() == 0) { outSB.append(" " + words[j]); continue; }

    String lc = ES.toLowerCaseSpanish(word);

    // Skip the "ultra-common" non-verbs that look just like verbs.
    for (String w : skip) if (lc.equals(w)) continue;

    String infinitive= getInfinitive(lc);

    if (infinitive == null) { if (keepNV) wordsNotFound.add(lc); continue; }
    else { if (keepRV) regularVerbsFound.add(infinitive); }

    outSB.append(" " + pre + "<SPAN CLASS=\"");

    if (isIrregular(infinitive))
    {
        outSB.append('I');
        if (keepIV) irregularVerbsFound.add(infinitive);
    }
    else
    {
        outSB.append('R');
    }

    outSB.append("V\" DATA-V=\"" + infinitive + "\">" + word + "</SPAN>" + post);
}

outSB.append('\n');

return HTMLPage.getPageTokens(outSB, false);
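The markup that this method emits can be seen in a reduced, self-contained sketch. Everything here is hypothetical scaffolding rather than the library's code: the two small lookup tables replace the loaded data files, toLowerCase(Locale.ROOT) replaces ES.toLowerCaseSpanish, punctuation handling is omitted, and a plain String is returned instead of a Vector of nodes. The sketch also differs from the method body in two deliberate ways: it copies non-verb words to the output, and it uses a labeled continue for the skip list, because in Java an unlabeled continue inside the inner for-each only advances that inner loop.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class VerbSpanSketch {
    // Hypothetical sample data standing in for the loaded data files
    static final Map<String, String> INFINITIVE_OF = Map.of("hablas", "hablar", "soy", "ser");
    static final Set<String> IRREGULAR = Set.of("ser");
    static final String[] SKIP = { "para", "como" };

    static String wrap(String text) {
        StringBuilder out = new StringBuilder();

        nextWord:
        for (String word : text.split(" ")) {
            String lc = word.toLowerCase(Locale.ROOT);

            // Common non-verbs are copied through unchanged
            for (String s : SKIP)
                if (lc.equals(s)) { out.append(' ').append(word); continue nextWord; }

            String infinitive = INFINITIVE_OF.get(lc);
            if (infinitive == null) { out.append(' ').append(word); continue; }

            // CLASS is "IV" for irregular verbs and "RV" for regular ones
            out.append(" <SPAN CLASS=\"")
               .append(IRREGULAR.contains(infinitive) ? 'I' : 'R')
               .append("V\" DATA-V=\"").append(infinitive).append("\">")
               .append(word).append("</SPAN>");
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(wrap("yo soy"));
        // yo <SPAN CLASS="IV" DATA-V="ser">soy</SPAN>
    }
}
```

Map.of and Set.of require Java 9 or later; a HashMap and HashSet would serve equally well on older runtimes.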
-
-