Package Torello.HTML.Tools.NewsSite
Class NewsSite
- java.lang.Object
-
- Torello.HTML.Tools.NewsSite.NewsSite
-
- All Implemented Interfaces:
java.io.Serializable
public class NewsSite extends java.lang.Object implements java.io.Serializable
The 'data flow' encapsulation class that contains most of the salient features of a news oriented web-site.
This class lets a programmer store the complete set of object references needed to download a day's news content from a news web-site. Instances may be serialized and saved to disk.
- See Also:
- Serialized Form
Source-Code: Torello/HTML/Tools/NewsSite/NewsSite.java
File Size: 6,305 Bytes; Line Count: 177
-
-
Field Summary
Serializable ID (Modifier and Type, Field)
- static long serialVersionUID
Primary NewsSite Data (Modifier and Type, Field)
- Country country
- String description
- LC languageCode
- String siteName
- URL siteURL
Getters (@FunctionalInterface - Lambdas) (Modifier and Type, Field)
- ArticleGet articleGetter
- StrFilter bannerAndAddFinder
- URLFilter filter
- LinksGet linksGetter
-
Constructor Summary
Constructors (Constructor, Description)
- NewsSite(String siteName, Country country, String siteURLAsStr, LC languageCode, String description, Vector<URL> sectionURLs, URLFilter filter, LinksGet linksGetter, ArticleGet articleGetter, StrFilter bannerAndAddFinder)
Simple constructor for this data-class.
- NewsSite(String siteName, Country country, String siteURLAsStr, LC languageCode, String description, Vector<URL> sectionURLs, StrFilter filter, LinksGet linksGetter, ArticleGet articleGetter, StrFilter bannerAndAddFinder)
Convenience Constructor.
-
Method Summary
Instance Methods, Concrete Methods (Modifier and Type, Method)
- Iterator<URL> sectionURLsIter()
- Vector<URL> sectionURLsVec()
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state while debugging a lot easier. It can also be used to store data in place of more complicated systems like "Hibernate".
- See Also:
- Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
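To illustrate the kind of state-saving described above, here is a minimal sketch of a serialization round trip. The class `SiteRecord` is a hypothetical stand-in invented for this example; it is not the real `NewsSite`, and the site name and description are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch {
    // Hypothetical stand-in for a Serializable data-class; NOT the real NewsSite.
    static class SiteRecord implements Serializable {
        public static final long serialVersionUID = 1;
        public final String siteName;
        public final String description;

        SiteRecord(String siteName, String description) {
            this.siteName = siteName;
            this.description = description;
        }
    }

    // Serializes the record to an in-memory buffer, then reads a fresh copy back.
    static SiteRecord roundTrip(SiteRecord original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (SiteRecord) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SiteRecord copy = roundTrip(new SiteRecord("Example News", "A made-up site"));
        System.out.println(copy.siteName + ": " + copy.description);
    }
}
```

Replacing the in-memory streams with `FileOutputStream` and `FileInputStream` would save the object to disk instead. Note that, in general Java, a lambda-valued field only serializes if its target interface is itself `Serializable`.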
-
siteName
public final java.lang.String siteName
A Simple Name for the news-site
-
country
-
siteURL
public final java.net.URL siteURL
URL of the main page for the news web-site
-
languageCode
public final LC languageCode
A Language Code instance for the web-site, if needed.
-
description
public final java.lang.String description
A simple text description of the news web-site
-
filter
public final URLFilter filter
- See Also:
ScrapeURLs
- Code:
- Exact Field Declaration Expression:
public final URLFilter filter;
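The constructor documentation describes this filter as often being a one-line lambda that checks URLs against a regular-expression Pattern. The sketch below shows that idea using `java.util.function.Predicate<URL>` as a stand-in for `URLFilter`. The pattern, the example URLs, and the convention that `true` means "looks like an article" are all assumptions made for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.function.Predicate;
import java.util.regex.Pattern;

class ArticleUrlFilterSketch {
    // Hypothetical pattern: article paths look like /2024/05/17/some-title
    static final Pattern ARTICLE_PATH =
        Pattern.compile("^/\\d{4}/\\d{2}/\\d{2}/[\\w-]+/?$");

    // One-line lambda; Predicate<URL> stands in for the library's URLFilter.
    static final Predicate<URL> LOOKS_LIKE_ARTICLE =
        url -> ARTICLE_PATH.matcher(url.getPath()).matches();

    // Helper that converts a String to a URL without a checked exception.
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad URL: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(LOOKS_LIKE_ARTICLE.test(
            url("https://news.example.com/2024/05/17/budget-vote")));
        System.out.println(LOOKS_LIKE_ARTICLE.test(
            url("https://news.example.com/sports/")));
    }
}
```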
-
linksGetter
public final LinksGet linksGetter
An instance of LinksGet for retrieving article-URL links from a section page.
- See Also:
ScrapeURLs
- Code:
- Exact Field Declaration Expression:
public final LinksGet linksGetter;
-
articleGetter
public final ArticleGet articleGetter
An instance of ArticleGet used to retrieve news-articles from this site.
- See Also:
ScrapeArticles
- Code:
- Exact Field Declaration Expression:
public final ArticleGet articleGetter;
-
bannerAndAddFinder
public final StrFilter bannerAndAddFinder
An instance of StrFilter for finding banners or ads.
- See Also:
ScrapeArticles
- Code:
- Exact Field Declaration Expression:
public final StrFilter bannerAndAddFinder;
-
-
Constructor Detail
-
NewsSite
public NewsSite(java.lang.String siteName, Country country, java.lang.String siteURLAsStr, LC languageCode, java.lang.String description, java.util.Vector<java.net.URL> sectionURLs, StrFilter filter, LinksGet linksGetter, ArticleGet articleGetter, StrFilter bannerAndAddFinder)
Convenience Constructor.
Accepts a StrFilter in place of the URLFilter parameter.
Invokes: NewsSite(String, Country, String, LC, String, Vector, URLFilter, LinksGet, ArticleGet, StrFilter)
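How this constructor converts the string filter into a URL filter is not shown on this page. One plausible approach, sketched here with `Predicate<String>` and `Predicate<URL>` as stand-ins for `StrFilter` and `URLFilter`, is to apply the string test to the URL's text form. The helper name `onUrlText` and the example URLs are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.function.Predicate;

class FilterAdapterSketch {
    // Wraps a String predicate so it can be used where a URL predicate is expected.
    // Assumption: the string filter is applied to the full URL text.
    static Predicate<URL> onUrlText(Predicate<String> stringFilter) {
        return url -> stringFilter.test(url.toString());
    }

    // Helper that converts a String to a URL without a checked exception.
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad URL: " + s, e);
        }
    }

    public static void main(String[] args) {
        Predicate<URL> filter = onUrlText(s -> s.contains("/article/"));
        System.out.println(filter.test(url("https://news.example.com/article/123")));
        System.out.println(filter.test(url("https://news.example.com/about")));
    }
}
```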
-
NewsSite
public NewsSite(java.lang.String siteName, Country country, java.lang.String siteURLAsStr, LC languageCode, java.lang.String description, java.util.Vector<java.net.URL> sectionURLs, URLFilter filter, LinksGet linksGetter, ArticleGet articleGetter, StrFilter bannerAndAddFinder)
Simple constructor for this data-class.
- Parameters:
siteName - This site's name.
country - The country-of-origin for this news web-site.
siteURLAsStr - The primary URL for the news web-site.
languageCode - If this site uses a non-English system, the 'languageCode' parameter can keep track of the language.
description - Brief description of the site.
sectionURLs - This should list the primary news-sections on the web-site. News sections include lists such as "Life", "Health", "Business", "World News", "Sports", but this list could include just about anything.
filter - If, when scraping a section, there are URLs that need to be filtered, this parameter can help filter out non-article, non-news links. As explained in class ScrapeURLs, this is often a simple one-line lambda-expression that identifies which URLs match a regular-expression Pattern.
linksGetter - A 'getter', which is also often just a one-line regular-expression lambda, for retrieving the links from a section web-page.
articleGetter - This should implement the ArticleGet interface.
bannerAndAddFinder - Filter for finding repetitive ads or banners.
- Code:
- Exact Constructor Body:
this.siteName = siteName;
this.country = country;
this.languageCode = languageCode;
this.description = description;
this.sectionURLs = (Vector<URL>) sectionURLs.clone();
this.filter = filter;
this.linksGetter = linksGetter;
this.articleGetter = articleGetter;
this.bannerAndAddFinder = bannerAndAddFinder;

try
    { this.siteURL = new URL(siteURLAsStr); }
catch (MalformedURLException e)
{
    throw new NewsSiteException(
        "Unable to instantiate the parameter 'siteURLAsStr'. There was a Malformed URL " +
        "Exception thrown. Please see this Exceptions Throwable.getCause() for more " +
        "details.", e
    );
}
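The constructor body above converts a checked MalformedURLException into a NewsSiteException that carries the original exception as its cause. The sketch below isolates that pattern. `SiteConfigException` is a hypothetical stand-in for `NewsSiteException`, and it is assumed to be unchecked because the constructor signature declares no throws clause.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlWrapSketch {
    // Hypothetical unchecked exception standing in for NewsSiteException.
    static class SiteConfigException extends RuntimeException {
        SiteConfigException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Parses the string, rethrowing a parse failure as an unchecked exception
    // that keeps the original MalformedURLException as its cause.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
    static URL parseOrThrow(String urlAsStr) {
        try {
            return new URL(urlAsStr);
        } catch (MalformedURLException e) {
            throw new SiteConfigException(
                "Could not parse '" + urlAsStr + "' as a URL.", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrThrow("https://news.example.com/").getHost());
        try {
            parseOrThrow("not a url");
        } catch (SiteConfigException e) {
            // The underlying parse error remains available to the caller.
            System.out.println("cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Callers can then construct objects without a try/catch block, while `getCause()` still exposes the underlying parse error for debugging.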
-
-
Method Detail
-
sectionURLsIter
public java.util.Iterator<java.net.URL> sectionURLsIter()
Retrieves the section URLs (life, comedy, sports, business, world) for this news-site.
- Returns:
- An Iterator<URL> of the different sections for a particular news-site.
- Code:
- Exact Method Body:
return new RemoveUnsupportedIterator<URL>(sectionURLs.iterator());
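The implementation of RemoveUnsupportedIterator is not shown on this page. Judging by its name, it presumably wraps another iterator and rejects calls to remove(), so callers cannot modify the stored section list. A minimal sketch of such a wrapper follows; the class `ReadOnlyIterator` is hypothetical and is not the library's class.

```java
import java.util.Iterator;
import java.util.Vector;

class ReadOnlyIteratorSketch {
    // Hypothetical wrapper: delegates hasNext() and next(), but rejects remove().
    static final class ReadOnlyIterator<E> implements Iterator<E> {
        private final Iterator<E> inner;

        ReadOnlyIterator(Iterator<E> inner) {
            this.inner = inner;
        }

        @Override
        public boolean hasNext() {
            return inner.hasNext();
        }

        @Override
        public E next() {
            return inner.next();
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException("remove() is not supported");
        }
    }

    public static void main(String[] args) {
        Vector<String> sections = new Vector<>();
        sections.add("world");
        sections.add("sports");

        Iterator<String> it = new ReadOnlyIterator<>(sections.iterator());
        System.out.println(it.next());
        try {
            it.remove();
        } catch (UnsupportedOperationException e) {
            System.out.println("remove rejected; size is still " + sections.size());
        }
    }
}
```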
-
sectionURLsVec
public java.util.Vector<java.net.URL> sectionURLsVec()
Retrieves the section URLs (life, comedy, sports, business, world) for this news-site.
- Returns:
- A Vector<URL> of the different sections for a particular news-site.
- Code:
- Exact Method Body:
return (Vector<URL>) sectionURLs.clone();
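Both the constructor and this method clone the section Vector, so neither the caller's original list nor the returned list shares structure with the internal one. The sketch below demonstrates that isolation with a hypothetical class, `SectionHolder`, which uses String elements instead of URL to stay short.

```java
import java.util.Vector;

class DefensiveCopySketch {
    // Hypothetical holder that copies its Vector on the way in and on the way out.
    static class SectionHolder {
        private final Vector<String> sections;

        @SuppressWarnings("unchecked")
        SectionHolder(Vector<String> sections) {
            // Copy on the way in: later changes to the caller's Vector are not seen.
            this.sections = (Vector<String>) sections.clone();
        }

        @SuppressWarnings("unchecked")
        Vector<String> sectionsVec() {
            // Copy on the way out: callers may modify the result freely.
            return (Vector<String>) sections.clone();
        }

        int size() {
            return sections.size();
        }
    }

    public static void main(String[] args) {
        Vector<String> input = new Vector<>();
        input.add("world");
        input.add("sports");

        SectionHolder holder = new SectionHolder(input);
        input.clear();                    // does not affect the holder
        holder.sectionsVec().add("tech"); // modifies only the returned copy
        System.out.println(holder.size());
    }
}
```

Vector.clone() makes a shallow copy: the list structure is duplicated, but the elements themselves are shared between the copies.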
-
-