/semantic-web-scraper

Primary language: Java. License: GNU General Public License v2.0 (GPL-2.0)

SemanticWebScraper

I've got a terrible issue. I rack up tabs faster than I can read them, and I just keep opening more. It's a habit I can't kick, and I've got more articles backlogged than anyone could care to read. So rather than make a project to cure me of this obsessive desire to hoard tabs, I've made the semantic-web-scraper to enable it in a sane manner.

The goal is to have a program that handles the following:

  • Takes a list of URLs.
  • Reads all the text on the pages at those URLs.
  • Extracts keywords from that text.
  • Attempts to cluster the URLs based on their keywords.
  • Displays the clusters in a graph.
  • Outputs bookmark directories.
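
To make the keyword and clustering steps above a bit more concrete, here is a minimal sketch in Java (9 or later). It is not the project's actual code: the class `TabClusterSketch`, its method names, the tiny stop-word list, and the example.com URLs are all made up for illustration. It assumes the page text has already been fetched, uses plain term frequency as a stand-in for keyword extraction, and groups URLs greedily by Jaccard similarity of their keyword sets. The graph display and bookmark output are left out.

```java
import java.util.*;

public class TabClusterSketch {
    // Tiny illustrative stop-word list (an assumption, not taken from the project).
    static final Set<String> STOP_WORDS = Set.of("the", "and", "for", "with", "that", "this");

    // Returns up to k of the most frequent words in the text, skipping stop
    // words and tokens shorter than 3 characters. Ties are broken alphabetically.
    static Set<String> extractKeywords(String text, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (token.length() < 3 || STOP_WORDS.contains(token)) continue;
            counts.merge(token, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Set<String> keywords = new LinkedHashSet<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            keywords.add(entries.get(i).getKey());
        }
        return keywords;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 0.0;
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // Greedy clustering: each URL joins the first cluster whose seed keywords
    // are similar enough, otherwise it starts a new cluster of its own.
    static List<List<String>> cluster(Map<String, String> pageTextByUrl, int k, double threshold) {
        List<Set<String>> seeds = new ArrayList<>();
        List<List<String>> clusters = new ArrayList<>();
        for (Map.Entry<String, String> page : pageTextByUrl.entrySet()) {
            Set<String> keywords = extractKeywords(page.getValue(), k);
            int target = -1;
            for (int i = 0; i < seeds.size(); i++) {
                if (jaccard(keywords, seeds.get(i)) >= threshold) {
                    target = i;
                    break;
                }
            }
            if (target < 0) {
                seeds.add(keywords);
                clusters.add(new ArrayList<>());
                target = clusters.size() - 1;
            }
            clusters.get(target).add(page.getKey());
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Placeholder URLs and hand-written text standing in for fetched pages.
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("https://example.com/a", "Java streams make Java collections code shorter");
        pages.put("https://example.com/b", "Java collections and Java streams explained");
        pages.put("https://example.com/c", "Sourdough bread baking needs flour and patience");
        System.out.println(cluster(pages, 3, 0.3));
    }
}
```

With three keywords per page and a threshold of 0.3, the two Java pages end up in one cluster and the bread page in another. Both numbers are arbitrary here; a real run over long articles would likely want more keywords per page and some tuning of the threshold.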