
Scrape rateyourmusic.com to curate a personalized, self-hosted music database and album recommendation engine.

Primary Language: Rust

Lute

Lute is a RateYourMusic album scraper and recommendation engine. It is a power tool for music discovery that lets you curate a self-hosted music database and generate personalized recommendations.

Features

  • Efficient Scraping: Crawl and index albums from RateYourMusic.
  • Polite: Fully configurable crawler concurrency, rate limiting, backoff, max queue size, and data staleness checks. Remember, we love RYM and don't want to get banned.
  • Personalized Collections: Curate albums into "Profiles" for tailored recommendations.
  • Spotify Integration: Import albums from your Spotify catalogue into profiles.
  • Advanced Recommendation Methods:
    • Quantile Ranking
    • [Coming Soon] Vector Similarity Search: Using OpenAI's API for album embeddings.
  • Browser Extension: Parse and index albums from RYM in real-time while browsing the site.
  • Proxy Support: Bring your own crawler proxy for uninterrupted scraping.
  • Interfaces: gRPC API and web-based UI.
  • Data Export: Export to Postgres and Bolt-compatible graph databases (Neo4j, Memgraph) using connectors. Build your own connectors using the event-stream gRPC API.
  • Monitoring: OpenTelemetry support for diagnostics.
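
For readers unfamiliar with the term used in the recommendation methods above, a quantile rank places a value within a distribution: the share of observed values that fall at or below it. The sketch below is a generic illustration of that idea only, not Lute's recommendation algorithm. The function name `quantile_rank` and the use of album ratings as input are assumptions made for the example.

```rust
/// Generic quantile (percentile) rank; illustrative only, not Lute's API.
///
/// Returns the fraction of values in `sample` that are less than or equal
/// to `x`, as a number in [0.0, 1.0]. Returns `None` for an empty sample,
/// since no rank is defined without a distribution to compare against.
pub fn quantile_rank(sample: &[f64], x: f64) -> Option<f64> {
    if sample.is_empty() {
        return None;
    }
    // Count the observations at or below the value being ranked.
    let at_or_below = sample.iter().filter(|&&v| v <= x).count();
    Some(at_or_below as f64 / sample.len() as f64)
}
```

As a hypothetical usage, calling `quantile_rank(&[3.1, 3.5, 3.8, 4.0], 3.8)` ranks a rating of 3.8 against four ratings from a profile. Three of the four values are at or below 3.8, so the result is `Some(0.75)`.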

Disclaimer: This project is for educational purposes only. Excessive scraping of RateYourMusic may result in your IP being banned.
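
To make the "polite" crawling idea concrete, here is a minimal sketch of capped exponential backoff, one common way to space out retries so a crawler slows down instead of hammering a site. It does not reflect Lute's actual configuration or code. The `BackoffPolicy` type and its field names are hypothetical and chosen only for illustration.

```rust
use std::time::Duration;

/// Hypothetical retry settings; the names are illustrative and are not
/// Lute's real configuration keys.
#[derive(Debug, Clone)]
pub struct BackoffPolicy {
    /// Delay before the first retry.
    pub base_delay: Duration,
    /// Upper bound on any single delay.
    pub max_delay: Duration,
    /// Number of retries allowed before giving up.
    pub max_attempts: u32,
}

impl BackoffPolicy {
    /// Delay to wait before retry number `attempt` (0-based).
    ///
    /// The delay doubles with each attempt (`base_delay * 2^attempt`) and
    /// is capped at `max_delay`. Returns `None` once `attempt` reaches
    /// `max_attempts`, signalling that the caller should stop retrying.
    pub fn delay_for(&self, attempt: u32) -> Option<Duration> {
        if attempt >= self.max_attempts {
            return None;
        }
        // Saturate instead of overflowing for very large attempt numbers.
        let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
        let delay = self
            .base_delay
            .checked_mul(factor)
            .unwrap_or(self.max_delay);
        Some(delay.min(self.max_delay))
    }
}
```

With a base delay of 500 ms, a cap of 10 s, and 6 attempts, the delays would be 500 ms, 1 s, 2 s, 4 s, 8 s, and 10 s (capped), after which `delay_for` returns `None`. Real crawlers often add random jitter on top of this, which is omitted here to keep the sketch deterministic.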

Getting Started

Requirements

  • Redis
  • S3-compatible object storage

Work in progress