Class NewsSite

  • All Implemented Interfaces:
    java.io.Serializable

    public class NewsSite
    extends java.lang.Object
    implements java.io.Serializable
    A data-flow encapsulation class that holds most of the salient features of a news-oriented web-site.

    This class is intended to allow a programmer to store the entire list of object references necessary to download a day's news-content from a news website. This class may be serialized, and saved to disk.
    See Also:
    Serialized Form


    • Field Detail

      • serialVersionUID

        public static final long serialVersionUID
        This fulfils the serialVersionUID requirement for all classes that implement Java's interface java.io.Serializable. Java's built-in serialization is easy to use, and it can make saving program state while debugging much easier. It can also be used in place of more complicated systems such as Hibernate to store data.
        See Also:
        Constant Field Values
        Code:
        Exact Field Declaration Expression:
         public static final long serialVersionUID = 1;
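
As a minimal sketch of the save-and-restore pattern described above, the following round-trips a small Serializable object through a byte array. The class SiteRecord and its two fields are hypothetical stand-ins and are not part of this library; the same ObjectOutputStream and ObjectInputStream calls would apply to a file stream when saving to disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch
{
    // Hypothetical stand-in for a serializable data class; not part of the library.
    static class SiteRecord implements Serializable
    {
        public static final long serialVersionUID = 1;

        public final String siteName;
        public final String description;

        SiteRecord(String siteName, String description)
        {
            this.siteName    = siteName;
            this.description = description;
        }
    }

    // Writes the object to an in-memory byte array using standard Java serialization.
    static byte[] toBytes(Serializable obj) throws IOException
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        try (ObjectOutputStream oos = new ObjectOutputStream(baos))
            { oos.writeObject(obj); }

        return baos.toByteArray();
    }

    // Reads an object back from a byte array produced by toBytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException
    {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes)))
            { return ois.readObject(); }
    }

    public static void main(String[] args) throws Exception
    {
        SiteRecord original = new SiteRecord("Example News", "A made-up site for illustration");
        SiteRecord restored = (SiteRecord) fromBytes(toBytes(original));

        System.out.println(restored.siteName + ": " + restored.description);
    }
}
```

Every field reachable from a serialized object must itself be serializable, so a class that stores lambdas or other function objects can only be saved if those objects also implement java.io.Serializable.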
        
      • siteName

        public final java.lang.String siteName
        A simple name for the news-site
        Code:
        Exact Field Declaration Expression:
         public final String siteName;
        
      • country

        public final Country country
        Country of origin for the news-site in question
        Code:
        Exact Field Declaration Expression:
         public final Country country;
        
      • siteURL

        public final java.net.URL siteURL
        URL of the main-page for the news web-site
        Code:
        Exact Field Declaration Expression:
         public final URL siteURL;
        
      • languageCode

        public final LC languageCode
        A Language Code instance for the web-site, if needed.
        Code:
        Exact Field Declaration Expression:
         public final LC languageCode;
        
      • description

        public final java.lang.String description
        A simple text description of the news web-site
        Code:
        Exact Field Declaration Expression:
         public final String description;
        
    • Constructor Detail

      • NewsSite

        public NewsSite(java.lang.String siteName,
                        Country country,
                        java.lang.String siteURLAsStr,
                        LC languageCode,
                        java.lang.String description,
                        java.util.Vector<java.net.URL> sectionURLs,
                        URLFilter filter,
                        LinksGet linksGetter,
                        ArticleGet articleGetter,
                        StrFilter bannerAndAddFinder)
        Simple constructor for this data-class.
        Parameters:
        siteName - This site's name
        country - The country-of-origin for this news web-site.
        siteURLAsStr - The primary URL for the news web-site.
        languageCode - If this site is not in English, the 'languageCode' parameter can keep track of the site's language.
        description - Brief Description of the site.
        sectionURLs - This should list the primary news-sections on the web-site. News sections include lists such as "Life", "Health", "Business", "World News", "Sports" - but this list could actually include just about anything.
        filter - If, when scraping a section, some URLs need to be filtered out, this parameter helps remove non-article, non-news links. As explained in the class ScrapeURL's, this is often a simple one-line lambda expression that identifies which URLs match a regular-expression pattern.
        linksGetter - A 'getter', often just a one-line regular-expression lambda, that retrieves the links from a section web-page.
        articleGetter - This should implement the ArticleGet interface.
        bannerAndAddFinder - Filter for finding repetitive ads or banners.
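
The 'filter' parameter is described as a one-line lambda that tests URLs against a regular expression. A minimal sketch of that idea follows. It uses java.util.function.Predicate<URL> as a stand-in, which is an assumption and not the library's URLFilter type, and it assumes that a result of true means "keep this link". The pattern and the example.com addresses are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class UrlFilterSketch
{
    // Hypothetical pattern: article paths end in a numeric id followed by ".html".
    static final Pattern ARTICLE = Pattern.compile(".*/\\d+\\.html");

    // One-line lambda standing in for a URL filter.  Predicate<URL> is an assumption,
    // not the library's URLFilter type.  Returns true for links that should be kept.
    static final Predicate<URL> KEEP_ARTICLES = url -> ARTICLE.matcher(url.getPath()).matches();

    // Returns only the links accepted by the filter, in their original order.
    static List<URL> keepArticles(List<URL> links)
    {
        List<URL> kept = new ArrayList<>();

        for (URL link : links)
            if (KEEP_ARTICLES.test(link)) kept.add(link);

        return kept;
    }

    public static void main(String[] args) throws MalformedURLException
    {
        List<URL> links = new ArrayList<>();
        links.add(URI.create("https://news.example.com/world/12345.html").toURL());
        links.add(URI.create("https://news.example.com/about-us").toURL());
        links.add(URI.create("https://news.example.com/sports/678.html").toURL());

        // Only the two article links pass the filter.
        for (URL u : keepArticles(links)) System.out.println(u);
    }
}
```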
        Code:
        Exact Constructor Body:
         this.siteName           = siteName;
         this.country            = country;
         this.languageCode       = languageCode;
         this.description        = description;
         this.sectionURLs        = (Vector<URL>) sectionURLs.clone();
         this.filter             = filter;
         this.linksGetter        = linksGetter;
         this.articleGetter      = articleGetter;
         this.bannerAndAddFinder = bannerAndAddFinder;
        
         try
             { this.siteURL = new URL(siteURLAsStr); }
        
         catch (MalformedURLException e)
         {
             throw new NewsSiteException(
                 "Unable to instantiate the parameter 'siteURLAsStr'.  There was a Malformed URL " +
                 "Exception thrown.  Please see this Exceptions Throwable.getCause() for more " +
                 "details.", e
             );
         }
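
The constructor body above converts a checked MalformedURLException into a NewsSiteException and passes the original exception along as the cause. Since the constructor declares no 'throws' clause, NewsSiteException must be an unchecked exception. The sketch below shows the same wrapping pattern in isolation; SiteConfigException and parse are hypothetical names used only for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlParseSketch
{
    // Hypothetical unchecked exception, standing in for the library's NewsSiteException.
    static class SiteConfigException extends RuntimeException
    {
        private static final long serialVersionUID = 1;

        SiteConfigException(String message, Throwable cause)
            { super(message, cause); }
    }

    // Converts a String to a URL, rethrowing the checked exception as an unchecked one
    // while preserving the original exception as the cause.
    @SuppressWarnings("deprecation") // the URL(String) constructor is deprecated since Java 20
    static URL parse(String urlAsStr)
    {
        try
            { return new URL(urlAsStr); }

        catch (MalformedURLException e)
            { throw new SiteConfigException("Malformed URL: '" + urlAsStr + "'", e); }
    }

    public static void main(String[] args)
    {
        System.out.println(parse("https://news.example.com/").getHost());

        try
            { parse("no-protocol-here"); }

        catch (SiteConfigException e)
            { System.out.println("cause: " + e.getCause().getClass().getSimpleName()); }
    }
}
```

Callers that need the underlying details can inspect Throwable.getCause(), as the exception message in the constructor suggests.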
        
    • Method Detail

      • sectionURLsIter

        public java.util.Iterator<java.net.URL> sectionURLsIter()
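        Retrieves the section URLs (life, comedy, sports, business, world) for this news-site as an Iterator.
        Returns:
        An Iterator<URL> of the different sections for a particular news-site. Judging by the RemoveUnsupportedIterator wrapper in the method body, the returned iterator is not expected to support remove().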
        Code:
        Exact Method Body:
         return new RemoveUnsupportedIterator<URL>(sectionURLs.iterator());
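
The wrapper's name suggests an iterator that delegates traversal but refuses removal, which keeps callers from modifying the internal list through the iterator. A minimal sketch of such a wrapper follows. ReadOnlyIterator is a hypothetical class written for this example, and the library's RemoveUnsupportedIterator may be implemented differently.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Vector;

public class ReadOnlyIteratorSketch
{
    // Hypothetical wrapper; the library's RemoveUnsupportedIterator may differ.
    static class ReadOnlyIterator<E> implements Iterator<E>
    {
        private final Iterator<E> inner;

        ReadOnlyIterator(Iterator<E> inner) { this.inner = inner; }

        @Override public boolean hasNext() { return inner.hasNext(); }
        @Override public E next()          { return inner.next(); }

        // remove() is never forwarded to the wrapped iterator.
        @Override public void remove()
            { throw new UnsupportedOperationException("remove"); }
    }

    public static void main(String[] args)
    {
        Vector<String> sections = new Vector<>(Arrays.asList("world", "sports", "business"));
        Iterator<String> iter = new ReadOnlyIterator<>(sections.iterator());

        iter.next();

        try
            { iter.remove(); }

        catch (UnsupportedOperationException e)
            { System.out.println("remove() rejected; size is still " + sections.size()); }
    }
}
```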
        
      • sectionURLsVec

        public java.util.Vector<java.net.URL> sectionURLsVec()
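        Retrieves the section URLs (life, comedy, sports, business, world) for this news-site as a Vector.
        Returns:
        A Vector<URL> of the different sections for a particular news-site. The returned Vector is a clone, so adding or removing elements does not affect the list held by this NewsSite.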
        Code:
        Exact Method Body:
         return (Vector<URL>) sectionURLs.clone();
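
Both the constructor and this method clone the Vector, a pattern often called a defensive copy. The sketch below shows the effect with a hypothetical holder class, SectionList, that stores Strings instead of URLs. Vector.clone() makes a shallow copy: the list itself is duplicated, while the element references are shared.

```java
import java.util.Arrays;
import java.util.Vector;

public class DefensiveCopySketch
{
    // Hypothetical holder class used only to illustrate the clone-in, clone-out pattern.
    static class SectionList
    {
        private final Vector<String> sections;

        @SuppressWarnings("unchecked") // Vector.clone() returns Object
        SectionList(Vector<String> sections)
            { this.sections = (Vector<String>) sections.clone(); }

        @SuppressWarnings("unchecked")
        Vector<String> sectionsVec()
            { return (Vector<String>) sections.clone(); }
    }

    public static void main(String[] args)
    {
        Vector<String> input = new Vector<>(Arrays.asList("world", "sports"));
        SectionList list = new SectionList(input);

        input.add("added-by-caller");          // does not reach the holder's copy
        list.sectionsVec().add("added-later"); // modifies only the returned clone

        System.out.println(list.sectionsVec()); // prints [world, sports]
    }
}
```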