
News server side, JavaWeb implementation, based on Alibaba Cloud and Qiniu Cloud

Primary language: Lex. License: Apache License 2.0 (Apache-2.0)

SeeNewsServer

Server side of a personal news app, implemented with Java Servlet and MySQL.

The first version was hosted on Sina Cloud and was later moved to Alibaba Cloud. Pictures are stored on the Qiniu Cloud CDN.

JavaServlet+Mysql

Development records

Online log monitoring system.
News data updated the previous day is sent to your mailbox at 10 o'clock every day.
Fixed the split method returning a single element: splitting an empty string yields [""] rather than [].
Initialization method: crawl the news from the first page to the last page, 53 records per page.
If initialization is interrupted midway, it has to be resumed from the breakpoint. The method is:
Get the smallest id from the database, then find out which page of the website that id is on.
Crawl the news records below this location.
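
The resume steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual crawler: `ResumeCrawl`, `findPageOf`, `remainingIds` and the `idsOnPage` page fetcher are hypothetical names, and the sketch assumes the site lists news newest first, so that records not yet crawled sit below the smallest stored id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

final class ResumeCrawl {
    /** Returns the 1-based page whose id list contains minId, or -1 if no page does. */
    static int findPageOf(int minId, int lastPage, IntFunction<List<Integer>> idsOnPage) {
        for (int page = 1; page <= lastPage; page++) {
            if (idsOnPage.apply(page).contains(minId)) {
                return page;
            }
        }
        return -1;
    }

    /** Ids still to crawl: those listed below minId on its page, then every later page. */
    static List<Integer> remainingIds(int minId, int lastPage, IntFunction<List<Integer>> idsOnPage) {
        List<Integer> remaining = new ArrayList<>();
        int start = findPageOf(minId, lastPage, idsOnPage);
        if (start < 0) {
            return remaining; // minId not found: nothing to resume from
        }
        List<Integer> first = idsOnPage.apply(start);
        remaining.addAll(first.subList(first.indexOf(minId) + 1, first.size()));
        for (int page = start + 1; page <= lastPage; page++) {
            remaining.addAll(idsOnPage.apply(page));
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Fake three-page site (hypothetical ids), newest first.
        List<List<Integer>> site = Arrays.asList(
                Arrays.asList(9, 8, 7), Arrays.asList(6, 5, 4), Arrays.asList(3, 2, 1));
        System.out.println(remainingIds(5, 3, p -> site.get(p - 1)));
    }
}
```

In a real crawler `idsOnPage` would fetch and parse a list page of the website, and the smallest id would come from a query such as `SELECT MIN(id)` on the news table.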

Random Image API

Get a picture: http://7xr4g8.com1.z0.glb.clouddn.com/671

blog.csdn.net/never_cxb

671 is the numeric id of the picture. Currently, valid ids range from 0 to 964, so a random picture can be obtained by generating a random id.

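import java.util.Random;
// No fixed seed, so every run produces a different sequence; nextInt(964 + 1) yields ids 0..964.
Random random = new Random();
String url = "http://7xr4g8.com1.z0.glb.clouddn.com/" + random.nextInt(964 + 1);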

MySQL create table statement

Column types and lengths were adjusted in response to the exception "Data too long for column".
Example of a long title: "Intelligent Perception and Image Understanding" Key Laboratory of the Ministry of Education The 15th Academic Week and Brain-like Computing and Big Data Deep Learning Frontier Forum
Example of a long source: Key Laboratory of Antenna and Microwave Technology
The final column types and lengths are as follows:

CREATE TABLE `rotation` (
  `id` int(11) NOT NULL,
  `image_urls` text,
  `title` varchar(100) DEFAULT NULL,
  `publish_date` date NOT NULL,
  `read_times` int(11) NOT NULL,
  `source` varchar(50) DEFAULT NULL,
  `body` longtext,
  UNIQUE KEY `id_UNIQUE` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
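
To avoid hitting "Data too long for column" at insert time, a crawler could check values against the column lengths above before writing a row. The following is a minimal sketch of that idea. `RotationLimits`, `fits` and `firstOversizedColumn` are hypothetical names, and the limits are copied from the `title` and `source` definitions in the table above.

```java
final class RotationLimits {
    // Lengths taken from the CREATE TABLE statement: title varchar(100), source varchar(50).
    static final int TITLE_MAX = 100;
    static final int SOURCE_MAX = 50;

    /** True if value is null or has at most maxChars characters (counted as code points). */
    static boolean fits(String value, int maxChars) {
        return value == null || value.codePointCount(0, value.length()) <= maxChars;
    }

    /** Returns the name of the first column whose value is too long, or null if both fit. */
    static String firstOversizedColumn(String title, String source) {
        if (!fits(title, TITLE_MAX)) {
            return "title";
        }
        if (!fits(source, SOURCE_MAX)) {
            return "source";
        }
        return null;
    }

    public static void main(String[] args) {
        String source = "Key Laboratory of Antenna and Microwave Technology";
        System.out.println(firstOversizedColumn("A short title", source));
    }
}
```

A row that fails the check can be logged or reported by email instead of aborting the crawl.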

Feature list

Crawling exception email notification

Based on JavaMail, send email notifications for abnormal pictures and abnormal URLs

Abnormal image url

Normal path of a picture: /uploads/image/20160109/20160109***.jpg
Old path: /uploads/old/201152**.jpg

News id | Abnormal image link | Description
7798 | src="/Public/kindeditor/php/../../../uploads/image/2015**.jpg" | Contains an extra /Public/kindeditor/php/ prefix; http://see.xidian.edu.cn must be prepended
7302 | <img src="file://C:\Users\ADMINI~1\AppData\Local\Temp\%W@GJ$ACOF(TYDYECOKVDYB.png"> | The image resource does not exist; ignored
7017 | src="http://see.xidian.edu.cn/uploads/image/20141021/20**.jpg" | Starts with an absolute path
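
The three cases in the table above could be handled by a small normalization step like the one below. This is a sketch only, not the project's actual code: `ImageSrcNormalizer` and `normalize` are hypothetical names, and treating any other relative path as relative to the site root is an assumption.

```java
final class ImageSrcNormalizer {
    static final String HOST = "http://see.xidian.edu.cn";
    // Editor prefix seen in news 7798; it resolves back to the site root.
    static final String EDITOR_PREFIX = "/Public/kindeditor/php/../../..";

    /** Returns an absolute image URL, or null when the image should be ignored. */
    static String normalize(String src) {
        if (src == null) {
            return null;
        }
        String s = src.trim();
        if (s.isEmpty() || s.startsWith("file:")) {
            return null; // local file paths (news 7302) cannot be fetched
        }
        if (s.startsWith("http://") || s.startsWith("https://")) {
            return s; // already absolute (news 7017)
        }
        if (s.startsWith(EDITOR_PREFIX)) {
            s = s.substring(EDITOR_PREFIX.length()); // drop the extra prefix (news 7798)
        }
        if (!s.startsWith("/")) {
            s = "/" + s; // assumption: other relative paths are relative to the site root
        }
        return HOST + s;
    }

    public static void main(String[] args) {
        System.out.println(normalize("/Public/kindeditor/php/../../../uploads/image/example.jpg"));
    }
}
```

Returning null for ignored images lets the caller skip the upload and include the news id in the exception email.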

Reused file download icon

Icon | Original address | Qiniu key value
download | http://rsc.xidian.edu.cn/plus/img/addon.gif | 912720f605b84070e223d0dab690a114
download | http://see.xidian.edu.cn/uploads/old/ico/zip.jpg | 3949a245e521f81ffd18e5d01347a20d
download | http://xgc.xidian.edu.cn/images/mid.gif | 2a8eac72c3697a837dd66e9e5243a089
download | http://jwc.xidian.edu.cn/images/ico/rar.jpg | bc87e43d342b380a2145ee1bb8298759
download | http://202.117.120.88/images/download.gif (the resource does not exist, the gif above is used instead) | f7324b0d360946315ac83fb8f2703044
doc | http://see.xidian.edu.cn/uploads/old/file/doc.gif | b5805b46ce8cf9c634b3820a23d64ca6
doc | http://jwc.xidian.edu.cn/images/ico/doc.jpg | f8d0fc587a7c7295835e8094af094d2d
doc | http://see.xidian.edu.cn/uploads/old/ico/doc.jpg | ad5d0e0cf63834756dde3dc5e9629d8
xls | http://see.xidian.edu.cn/uploads/old/file/xls.gif | 84b7028179e09614540cea8dd0122c3c
xls | http://jwc.xidian.edu.cn/images/ico/xls.jpg | d72210a72c0e174245a65e8755f6eaa
xls | http://zzb.xidian.edu.cn/new/WebEdit/sysimage/icon16/xls.gif | 1323ef50b1457274c914413b067e9192

Each key corresponds to the link on the same row.

Collected exception href:

News id | Dirty data | Description
- | href="Electronic Academy" | href is Chinese text
7837 | /uploads/file/20151202/20151202101309_73187.zip | The same href appears multiple times, resulting in multiple substitutions: http://see.xidian.edu.cnhttp://see.xidian.edu.cn/**.zip
7710 | href="Cultivation project application related documents" | href is Chinese text
- | href="601240943@qq.com" | Only an email address, without the preceding "mailto:"
- | kb.xidian.cc | Does not start with http
6283 | https://mail.google.com/mail/h/** | Starts with https
6206 | ftp://linux.xidian.edu.cn | Starts with ftp

Note: a regular href starts with http or https.
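
One way to deal with the dirty href values collected above is to normalize each value exactly once with an idempotent function, which also avoids the double prefix seen in news 7837. The sketch below is an illustration under assumptions, not the project's actual code: `HrefCleaner` and `normalize` are hypothetical names, and the rules for bare email addresses and bare host names are guesses at a reasonable repair.

```java
import java.util.Locale;

final class HrefCleaner {
    static final String HOST = "http://see.xidian.edu.cn";

    /** Returns a usable link, or null when the href is not a link at all (e.g. plain text). */
    static String normalize(String href) {
        if (href == null) {
            return null;
        }
        String h = href.trim();
        if (h.isEmpty()) {
            return null;
        }
        String lower = h.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")
                || lower.startsWith("ftp://") || lower.startsWith("mailto:")) {
            return h; // already has a scheme; returning it unchanged keeps the function idempotent
        }
        if (h.startsWith("/")) {
            return HOST + h; // site-relative path such as /uploads/file/...
        }
        if (h.matches("[^@\\s/]+@[^@\\s/]+\\.[^@\\s/]+")) {
            return "mailto:" + h; // assumption: a bare email address gets a mailto: prefix
        }
        if (h.matches("[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+(/\\S*)?")) {
            return "http://" + h; // assumption: a bare host name gets an http:// prefix
        }
        return null; // e.g. an href that only contains Chinese text
    }

    public static void main(String[] args) {
        System.out.println(normalize("/uploads/file/20151202/20151202101309_73187.zip"));
        System.out.println(normalize("kb.xidian.cc"));
    }
}
```

Because an already normalized value is returned unchanged, applying `normalize` again to the same link does not prepend the host a second time.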

Upload pictures to Qiniu Cloud

Asynchronously upload pictures to Qiniu Cloud
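
A minimal sketch of how asynchronous uploads can be structured with the standard library is shown below. It does not use the Qiniu SDK: the actual upload call is passed in as a function, so `AsyncImageUploader` and the `uploader` parameter (taking an image URL and returning a storage key) are hypothetical and only stand in for the real upload code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class AsyncImageUploader {
    private final ExecutorService pool;
    // Hypothetical upload step: takes an image URL and returns the key it was stored under.
    private final Function<String, String> uploader;

    AsyncImageUploader(int threads, Function<String, String> uploader) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.uploader = uploader;
    }

    /** Submits the upload to the pool and returns immediately with a future for the key. */
    CompletableFuture<String> upload(String imageUrl) {
        return CompletableFuture.supplyAsync(() -> uploader.apply(imageUrl), pool);
    }

    /** Stops accepting new uploads; uploads already submitted still run to completion. */
    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        // Fake uploader for the demo; a real one would call the storage service.
        AsyncImageUploader uploads = new AsyncImageUploader(2, url -> "key-for-" + url);
        CompletableFuture<String> key = uploads.upload("/uploads/image/example.jpg");
        System.out.println(key.join());
        uploads.shutdown();
    }
}
```

The crawler thread can keep parsing news while uploads run in the pool, and failures surface through the returned future, where they can feed the exception email notification.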

Thanks to open source, dependent class libraries