www.Gov.CN Article Carousel - Article Scrape
Generated by a call to:
Pause pause = Pause.getFSInstance("state.dat");
ScrapedArticleReceiver receiver = ScrapedArticleReceiver.saveToFS("articleData/");
StorageWriter log = new StorageWriter();
pause.initialize();
// The articleURLs parameter was generated in the previous step.
ScrapeArticles.download(receiver, articleURLs, ns.articleGetter, true, null, false, pause, log);
*****************************************************************************************
*****************************************************************************************
Downloading Articles
*****************************************************************************************
*****************************************************************************************
Visiting URL: [0000 of 0003, 0000 of 0004] - http://www.gov.cn/xinwen/2020-09/23/content_5546551.htm
Available Memory: 30,040,648 Total Memory: 32,571,392
Page contains (495) HTMLNodes.
Page <TITLE> element is: 习近平会见联合国秘书长古特雷斯_滚动新闻_中国政府网
Article body contains (268) HTMLNodes.
Article contains (6) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0000 of 0003, 0001 of 0004] - http://www.gov.cn/premier/2020-09/22/content_5546059.htm
Available Memory: 29,132,232 Total Memory: 32,571,392
Page contains (546) HTMLNodes.
Page <TITLE> element is: 李克强在上海考察_总理_中国政府网
Article body contains (326) HTMLNodes.
Article contains (11) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0000 of 0003, 0002 of 0004] - http://www.gov.cn/guowuyuan/2020-09/24/content_5546932.htm
Available Memory: 29,117,112 Total Memory: 32,571,392
Page contains (464) HTMLNodes.
Page <TITLE> element is: 韩正主持召开推动长三角一体化发展领导小组全体会_国务院副总理韩正_中国政府网
Article body contains (237) HTMLNodes.
Article contains (4) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0000 of 0003, 0003 of 0004] - http://www.gov.cn/xinwen/2020-09/25/content_5546975.htm
Available Memory: 29,354,096 Total Memory: 32,571,392
Page contains (689) HTMLNodes.
Page <TITLE> element is: 大连港:多措并举促产拓市场_图片新闻_中国政府网
Article body contains (458) HTMLNodes.
Article contains (15) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0001 of 0003, 0000 of 0005] - http://www.gov.cn/premier/2020-09/22/content_5545944.htm
Available Memory: 29,337,552 Total Memory: 32,571,392
Page contains (655) HTMLNodes.
Page <TITLE> element is: 李克强在上海考察_总理图片报道_中国政府网
Article body contains (431) HTMLNodes.
Article contains (18) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0001 of 0003, 0001 of 0005] - http://www.gov.cn/premier/2020-09/16/content_5544033.htm
Available Memory: 29,324,744 Total Memory: 32,571,392
Page contains (439) HTMLNodes.
Page <TITLE> element is: 李克强出席世界经济论坛全球企业家视频特别对话会_总理图片报道_中国政府网
Article body contains (212) HTMLNodes.
Article contains (6) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0001 of 0003, 0002 of 0005] - http://www.gov.cn/premier/2020-09/11/content_5542891.htm
Available Memory: 29,311,944 Total Memory: 32,571,392
Page contains (519) HTMLNodes.
Page <TITLE> element is: 李克强出席全国深化“放管服”改革优化营商环境电视电话会议并讲话 韩正主持_总理图片报道_中国政府网
Article body contains (292) HTMLNodes.
Article contains (12) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0001 of 0003, 0003 of 0005] - http://www.gov.cn/premier/2020-08/24/content_5537088.htm
Available Memory: 29,352,824 Total Memory: 32,571,392
Page contains (489) HTMLNodes.
Page <TITLE> element is: 李克强出席澜沧江—湄公河合作第三次领导人会议_总理图片报道_中国政府网
Article body contains (262) HTMLNodes.
Article contains (10) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0001 of 0003, 0004 of 0005] - http://www.gov.cn/premier/2020-08/21/content_5536371.htm
Available Memory: 29,319,816 Total Memory: 32,571,392
Page contains (687) HTMLNodes.
Page <TITLE> element is: 李克强在重庆考察_总理图片报道_中国政府网
Article body contains (463) HTMLNodes.
Article contains (21) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0002 of 0003, 0000 of 0005] - http://www.gov.cn/xinwen/2020-09/25/content_5547242.htm
Available Memory: 29,311,064 Total Memory: 32,571,392
Page contains (639) HTMLNodes.
Page <TITLE> element is: 新疆和若铁路建设稳步推进_图片新闻_中国政府网
Article body contains (408) HTMLNodes.
Article contains (9) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0002 of 0003, 0001 of 0005] - http://www.gov.cn/xinwen/2020-09/25/content_5547240.htm
Available Memory: 29,298,040 Total Memory: 32,571,392
Page contains (535) HTMLNodes.
Page <TITLE> element is: 河南宁陵:酥梨飘香助增收_图片新闻_中国政府网
Article body contains (304) HTMLNodes.
Article contains (7) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0002 of 0003, 0002 of 0005] - http://www.gov.cn/xinwen/2020-09/25/content_5547173.htm
Available Memory: 29,333,080 Total Memory: 32,571,392
Page contains (559) HTMLNodes.
Page <TITLE> element is: 长白山脚下:小特产为村民带来新“红利”_图片新闻_中国政府网
Article body contains (328) HTMLNodes.
Article contains (7) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0002 of 0003, 0003 of 0005] - http://www.gov.cn/xinwen/2020-09/25/content_5547255.htm
Available Memory: 29,295,944 Total Memory: 32,571,392
Page contains (519) HTMLNodes.
Page <TITLE> element is: 古村晒秋庆丰收_图片新闻_中国政府网
Article body contains (288) HTMLNodes.
Article contains (7) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
Visiting URL: [0002 of 0003, 0004 of 0005] - http://www.gov.cn/xinwen/2020-09/25/content_5547236.htm
Available Memory: 29,264,680 Total Memory: 32,571,392
Page contains (489) HTMLNodes.
Page <TITLE> element is: 河北唐山:剪纸文化进校园_图片新闻_中国政府网
Article body contains (258) HTMLNodes.
Article contains (6) image TagNodes.
ARTICLE LOADED. Sending to ScrapedArticleReceiver.
*****************************************************************************************
Traversing Site Completed.
Loaded a total of (14) articles.
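
The reported total can be cross-checked against the "Visiting URL" lines above. The following is a minimal sketch, assuming the line format seen in this log (`[section of sections, article of articles] - URL`). The class and method names (`VisitLogSummary`, `parseVisit`, `countVisits`, `expectedTotal`) are hypothetical helpers written for illustration and are not part of the scraping library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: summarizes "Visiting URL" lines from a log like the one above.
final class VisitLogSummary {
    // Assumed line format, inferred from the log:
    // Visiting URL: [0000 of 0003, 0001 of 0004] - http://...
    private static final Pattern VISIT = Pattern.compile(
        "^Visiting URL: \\[(\\d+) of (\\d+), (\\d+) of (\\d+)\\] - (\\S+)$");

    // Returns {section, sectionCount, article, articleCount},
    // or null when the line is not a "Visiting URL" line.
    static int[] parseVisit(String line) {
        Matcher m = VISIT.matcher(line.trim());
        if (!m.matches()) return null;
        return new int[] {
            Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4))
        };
    }

    // Counts how many lines are "Visiting URL" lines.
    static int countVisits(List<String> lines) {
        int n = 0;
        for (String line : lines) {
            if (parseVisit(line) != null) n++;
        }
        return n;
    }

    // Sums the advertised article count of each distinct section index.
    static int expectedTotal(List<String> lines) {
        Map<Integer, Integer> sizes = new TreeMap<>();
        for (String line : lines) {
            int[] v = parseVisit(line);
            if (v != null) sizes.put(v[0], v[3]);
        }
        int total = 0;
        for (int size : sizes.values()) total += size;
        return total;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>();
        sample.add("Visiting URL: [0000 of 0003, 0000 of 0004] - http://www.gov.cn/xinwen/2020-09/23/content_5546551.htm");
        sample.add("Available Memory: 30,040,648 Total Memory: 32,571,392");
        sample.add("Visiting URL: [0001 of 0003, 0000 of 0005] - http://www.gov.cn/premier/2020-09/22/content_5545944.htm");
        System.out.println("Visits in sample: " + countVisits(sample));
        System.out.println("Expected total for sampled sections: " + expectedTotal(sample));
    }
}
```

Applied to the full log, the section sizes are 4, 5, and 5, which agrees with the 14 articles reported in the final line.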