/tongxinyun_spiders

Crawlers designed for Tongxinyun Blog and more.

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

tongxinyun_crawlers

FOR PRIVATE AND LEGITIMATE USE ONLY

Half a year ago, when I peered into the APIs of the once-popular site tongxinyun, I was just keen to record some of the blogs in sequence (see microblog.ipynb); yet I found more. As described in the instructions in the other two files, I found it possible to directly fetch the personal information of anyone who had ever registered. Though not shown on the frontend (of course it wouldn't be), information such as e-mail address, 11-digit phone number, degree, and of course name and major is fetchable, regardless of the user's settings.
The site has, however, ostensibly been under maintenance up to now and seems unlikely to reopen, although several of its internal subsites do still work (I hope that you logged in at some point and have never logged out since: the talented administrator made the login API inaccessible). It is still practical to browse manually to the public address book (though some irrelevant warnings may pop up and the webpage looks really shabby), use its built-in search-by-name function to get a person's system id, and finally view that person's information through a concealed API.
Each of the files was proven to work while the site was still open, but now that sending requests is no longer practical, they are kept for reference and archival. Hope you enjoy.

2019 Dec. 15: Made the repository public.

2020 Oct. 13: About half a month ago, my friends and I noticed that the site had opened up again and, to our joy, this project still worked, apart from some minor flaws in the data format. We are not going to update the code any further, though: the current version is adequate as far as our interest goes. Any further contribution is nonetheless welcome.

By the way, high tribute shall be paid to the "system manager", who is also an enthusiastic coder (according to our humble inference). This person devoted considerable resources to the site even while it was suspended, and we believe his commitment has gone beyond duty. We hope that the Cenotaph of Tongxinyun is no longer necessary and that the site will be well maintained no matter what mishap takes place.

To end with, this project was archived in the GitHub Arctic Code Vault in February 2020. The curiosity of several sophomores will hopefully be rediscovered generations from now. See you guys at the other end of the world.