
Only top headlines; no gossip, no ads, no clickbait, no nonsense.

Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).

SemanticNews

Rss Python FastAPI PyTorch SQLite

Sick of the "computer algorithms" that Google News uses, I wanted to make a more open news reader for personal use, with the benefit of zero tracking or cookies.

The theory is that, given a pool of article titles, a major event will be reported under similar titles by many sources, so groups of similar titles point to the headlines.

To compare titles, each one is embedded with distilbert-base-uncased; k-means then finds the cluster centers, and headlines are ranked by their distance to those centers.
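The embed, cluster and rank idea above can be sketched in a few lines. This is only an illustration, not the code in this repo: the titles and their 2-D vectors are made up (real distilbert-base-uncased embeddings have 768 dimensions), the k-means is a naive pure-Python version with a fixed initialisation, and it assumes that a smaller distance to the nearest cluster center means a more headline-like title.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def nearest(vector, centers):
    """Return (index, distance) of the center closest to `vector`."""
    distances = [math.dist(vector, center) for center in centers]
    index = min(range(len(centers)), key=distances.__getitem__)
    return index, distances[index]

def kmeans(vectors, k, iterations=10):
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    # Naive initialisation (an assumption for brevity): the first k vectors.
    centers = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        labels = [nearest(v, centers)[0] for v in vectors]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = mean_vector(members)
    return centers

def rank_titles(titles, vectors, k):
    """Return (title, cluster, distance) tuples, closest to a center first."""
    centers = kmeans(vectors, k)
    scored = [(title, *nearest(vector, centers))
              for title, vector in zip(titles, vectors)]
    return sorted(scored, key=lambda item: item[2])

# Hypothetical titles with toy 2-D "embeddings" standing in for DistilBERT output.
TITLES = [
    "Earthquake strikes coastal city",
    "Central bank raises interest rates",
    "Strong earthquake hits coast",
    "Interest rates rise again",
    "Quake rocks coastal region",
    "Local bakery wins award",
]
VECTORS = [
    [0.90, 0.10],
    [0.10, 0.90],
    [1.00, 0.20],
    [0.20, 1.00],
    [0.95, 0.15],
    [0.60, 0.45],
]

ranked = rank_titles(TITLES, VECTORS, k=2)
for title, cluster, distance in ranked:
    print(f"cluster {cluster}  distance {distance:.3f}  {title}")
```

In this toy data the two stories covered by several titles form tight clusters, while the one-off title sits far from both centers and ends up at the bottom of the ranking.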

Visit a live version of it (running on a Raspberry Pi 3) here!

RSS feed

An RSS feed is supported at https://semanticnews.dedyn.io/feed/rss as well as all other endpoints.
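If you want to read the feed from a script rather than a feed reader, something like the following should work. It is a rough sketch using only the standard library, and it assumes the feed follows the usual RSS 2.0 layout (item elements with title and link children); the sample XML and the helper names parse_items and fetch_items are made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://semanticnews.dedyn.io/feed/rss"

def parse_items(rss_xml):
    """Extract (title, link) pairs from an RSS 2.0 document (str or bytes)."""
    root = ET.fromstring(rss_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]

def fetch_items(url=FEED_URL, timeout=10):
    """Download the feed and parse it; needs network access."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_items(response.read())

# Made-up sample in the assumed RSS 2.0 shape, so the parser can be tried offline.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Example headline one</title>
      <link>https://example.com/1</link>
    </item>
    <item>
      <title>Example headline two</title>
      <link>https://example.com/2</link>
    </item>
  </channel>
</rss>"""

for title, link in parse_items(SAMPLE_RSS):
    print(title, link)
```

Calling fetch_items() instead of parsing the sample would hit the live feed.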

Running

You will need Python 3 (tested on 3.9.2) and the following libraries:

pip3 install fastapi uvicorn rfeed feedparser numpy transformers torch onnxruntime

Then just download, unzip, start the local server, and visit http://127.0.0.1:8080 in your browser!

python3 main.py

Note: startup will be awfully slow, because the BERT model has to be downloaded and converted to ONNX (so it can run on a Pi) and the initial batch of articles has to be fetched and vectorised.

How you can help

Got any other RSS sources you want to see added? Chuck in a pull request for sources.py and I'll see to it.

Remember to give this repo a ⭐ if you found it useful.