Banditore retrieves new releases from your starred GitHub repositories and generates an Atom feed with them.

Banditore

Banditore retrieves new releases from your GitHub starred repositories and puts them in an RSS feed, just for you.

Requirements

  • PHP >= 7.4 (with pdo_mysql)
  • MySQL >= 5.7
  • Redis (to cache requests to the GitHub API)
  • RabbitMQ, which is optional (see below)
  • Supervisor (only if you use RabbitMQ)
  • NVM & Yarn to install assets

Installation

  1. Clone the project

    git clone https://github.com/j0k3r/banditore.git
  2. Register a new OAuth GitHub application and get the Client ID & Client Secret for the next step (for the Authorization callback URL put http://127.0.0.1:8000/callback)

  3. Install dependencies using Composer and define your parameters during the installation

    APP_ENV=prod composer install -o --no-dev

    If you want to use:

    • Sentry to collect all errors, register with Sentry and get your DSN (in Project Settings > DSN).
  4. Set up the database

    php bin/console doctrine:database:create -e prod
    php bin/console doctrine:schema:create -e prod
  5. Install assets

    nvm install
    yarn install
  6. You can now launch the website:

    php bin/console server:run -e prod

    And access it at this address: http://127.0.0.1:8000

Running the instance

Once the website is up, you have to set up a few things to retrieve new releases. You have two choices:

  • using cron jobs only (very simple, and fine if you are the only user)
  • using RabbitMQ (probably better if you expect more than a few users, but more complex) 🤙

Without RabbitMQ

You just need to define these 2 cron jobs (replace every /path/to/banditore with the real path):

# retrieve new release of each repo every 10 minutes
*/10  *   *   *   *   php /path/to/banditore/bin/console -e prod banditore:sync:versions >> /path/to/banditore/var/logs/command-sync-versions.log 2>&1
# sync starred repos of each user every 5 minutes
*/5   *   *   *   *   php /path/to/banditore/bin/console -e prod banditore:sync:starred-repos >> /path/to/banditore/var/logs/command-sync-repos.log 2>&1

With RabbitMQ

  1. You'll need to declare exchanges and queues. Replace guest with the user of your RabbitMQ instance (guest is the default one):

    php bin/console messenger:setup-transports -vvv sync_starred_repos
    php bin/console messenger:setup-transports -vvv sync_versions
  2. You now have two queues and two exchanges defined:

    • banditore.sync_starred_repos: will receive messages to sync starred repos of all users
    • banditore.sync_versions: will receive messages to retrieve new releases for repos
  3. Enable these 2 cron jobs, which will periodically push messages to the queues (replace every /path/to/banditore with the real path):

    # retrieve new release of each repo every 10 minutes
    */10  *   *   *   *   php /path/to/banditore/bin/console -e prod banditore:sync:versions --use_queue >> /path/to/banditore/var/logs/command-sync-versions.log 2>&1
    # sync starred repos of each user every 5 minutes
    */5   *   *   *   *   php /path/to/banditore/bin/console -e prod banditore:sync:starred-repos --use_queue >> /path/to/banditore/var/logs/command-sync-repos.log 2>&1
  4. Set up Supervisor using the sample file from the repo. You can copy/paste it into /etc/supervisor/conf.d/ and adjust the paths. The default file will launch:

    • 2 workers to sync starred repos
    • 4 workers to fetch new releases

Once you've put the file in the Supervisor conf directory, run supervisorctl update && supervisorctl start all (update reads your conf, start all starts all workers).

Monitoring

A status page is available at /status. It returns JSON with some information about the freshness of fetched versions:

{
    "latest": {
        "date": "2019-09-17 19:50:50.000000",
        "timezone_type": 3,
        "timezone": "Europe\/Berlin"
    },
    "diff": 1736,
    "is_fresh": true
}
  • latest: the latest created version as a DateTime
  • diff: the difference between now and the latest created version (in seconds)
  • is_fresh: indicates whether everything is fine, by comparing the diff above with the status_minute_interval_before_alert parameter

For example, I've set up a check on updown.io that fetches that status page and verifies that it contains "is_fresh":true. I receive an alert when is_fresh is false, which means there is a potential issue on the server.
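
If you prefer to script that check yourself, here is a minimal Python sketch of the same idea. It only relies on the is_fresh key shown in the payload above. The should_alert name is hypothetical, and treating an unreadable payload as a reason to alert is an assumption of this example, not something Banditore prescribes.

```python
import json


def should_alert(status_body: str) -> bool:
    """Return True when the /status payload does not report fresh data.

    Hypothetical helper: a body that is not valid JSON, or that lacks the
    is_fresh key, is treated as a problem worth alerting on (assumption).
    """
    try:
        status = json.loads(status_body)
    except json.JSONDecodeError:
        return True
    if not isinstance(status, dict):
        return True
    return status.get("is_fresh") is not True


# The sample payload from the status page description above.
sample = (
    '{"latest": {"date": "2019-09-17 19:50:50.000000", "timezone_type": 3, '
    '"timezone": "Europe\\/Berlin"}, "diff": 1736, "is_fresh": true}'
)
print(should_alert(sample))  # False
```

Fetching the body of /status (with curl, a monitoring service or any HTTP client) is left out on purpose, so the sketch stays independent of where your instance is hosted.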

Running the test suite

If you plan to contribute (you're awesome, I know that ✌️), you'll need to install the project in a different way (for example, to retrieve dev packages):

git clone https://github.com/j0k3r/banditore.git
composer install -o
php bin/console doctrine:database:create -e=test
php bin/console doctrine:schema:create -e=test
php bin/console doctrine:fixtures:load --env=test -n
php bin/simple-phpunit -v

By default, the test connection logs in as root without a password. You can change it in app/config/config_test.yml.

How it works

OK, if you've read this far into the README, it means you're more than a bit interested. I like that.

Retrieving new release / tag

This is the complex part of the app. Here is a simplified explanation of how it works.

New release

It's not as easy as using the /repos/:owner/:repo/releases API endpoint to retrieve the latest release for a given repo, because not all repo owners use that feature (which is a shame in my case).

All the information for a release is available on that endpoint:

  • name of the tag (e.g. v1.0.0)
  • name of the release (e.g. yay first release)
  • published date
  • description of the release

Check a release of this repo as an example: https://api.github.com/repos/j0k3r/banditore/releases/5770680
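
To make the mapping concrete, here is a small Python sketch that keeps only those four pieces of information from a release payload. The keys tag_name, name, published_at and body are the ones the GitHub releases endpoint returns. The summarize_release function, the fallback to the tag name for untitled releases and the sample values are assumptions made for illustration, not Banditore's actual code.

```python
from typing import Any, Dict


def summarize_release(release: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the four fields listed above from a release payload."""
    return {
        "tag_name": release["tag_name"],
        # A release may have no title: fall back to the tag name (assumption).
        "name": release.get("name") or release["tag_name"],
        "published_at": release.get("published_at"),
        # The description is Markdown and may be missing.
        "body": release.get("body") or "",
    }


# Illustrative payload: the values are made up, only the keys mirror the API.
payload = {
    "id": 1,
    "tag_name": "v1.0.0",
    "name": "yay first release",
    "published_at": "2017-01-01T10:00:00Z",
    "body": "First **stable** release",
    "draft": False,
}
print(summarize_release(payload)["name"])  # yay first release
```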

New tag

Some owners also use tags, which makes it a bit more complex to retrieve all the information, because a tag only contains the SHA-1 of the commit that was used to make the tag. We only have this information:

  • name of the tag (e.g. v1.4.2)
  • name of the release, which in that case is the name of the tag

Check the tag list of swarrot/SwarrotBundle as an example: https://api.github.com/repos/swarrot/SwarrotBundle/tags

After retrieving the tag, we need to retrieve the commit to get this information:

  • date of the commit
  • message of the commit

Check a commit from the previous tag list as an example: https://api.github.com/repos/swarrot/SwarrotBundle/commits/84c7c57622e4666ae5706f33cd71842639b78755
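
The two-step lookup can be sketched in Python as follows: one entry of the tag list is combined with the payload of its commit to produce the same four fields a release would give. The version_from_tag function and the sample values are hypothetical. Using the author date rather than the committer date is also an assumption of this sketch.

```python
from typing import Any, Dict


def version_from_tag(tag: Dict[str, Any], commit: Dict[str, Any]) -> Dict[str, Any]:
    """Combine a tag list entry with the payload of the commit it points to.

    `tag` is one item of /repos/:owner/:repo/tags, `commit` is the payload
    of /repos/:owner/:repo/commits/:sha for tag["commit"]["sha"].
    """
    if tag["commit"]["sha"] != commit["sha"]:
        raise ValueError("commit payload does not match the tag")
    details = commit["commit"]
    return {
        "tag_name": tag["name"],
        # Without a release, the tag name doubles as the release name.
        "name": tag["name"],
        # Author date is used here; the committer date is the other option.
        "published_at": details["author"]["date"],
        "body": details["message"],
    }


# Illustrative payloads: the values are made up, only the keys mirror the API.
tag = {"name": "v1.4.2", "commit": {"sha": "0123abc"}}
commit = {
    "sha": "0123abc",
    "commit": {
        "author": {"date": "2020-01-01T12:00:00Z"},
        "message": "Prepare v1.4.2",
    },
}
print(version_from_tag(tag, commit)["published_at"])  # 2020-01-01T12:00:00Z
```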

GitHub Client Discovery

This is the most important piece of the app. One problem I ran into is hitting the GitHub rate limit. The rate limit for a given authenticated client is 5,000 calls per hour. This limit is never reached when looking for new releases on a daily basis (thanks to the conditional requests of the GitHub API).

But when a new user signs in, we need to sync all their starred repositories and also all the releases / tags of those repositories. And here comes the greedy part:

  • one call for the list of releases
  • one call to retrieve the information of each tag (if the repo doesn't have releases)
  • one call for each release to convert Markdown text to HTML

Let's say the repo:

  • has 50 tags: 1 (get tag list) + 50 (get commit information) + 50 (convert markdown) = 101 calls.
  • has 50 releases: 1 (get tag list) + 50 (get each release) + 50 (convert markdown) = 101 calls.

And keep in mind that some repos also have 1,000+ tags (!!).

To avoid hitting the limit in such cases and having to wait 1 hour before being able to make requests again, I created the GitHub Client Discovery class. It aims to find the best client with enough rate limit remaining (defined as 50).
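
The arithmetic above generalizes easily. This short Python sketch (the calls_for_repo name is hypothetical) counts the calls needed for the first sync of a single repo, following the breakdown in the list: one call for the list, one per version to get its details and one per version to convert its Markdown.

```python
def calls_for_repo(version_count: int) -> int:
    """Calls needed for the first sync of one repo with `version_count`
    tags or releases: 1 list call + 1 detail call and 1 Markdown
    conversion call per version."""
    return 1 + version_count + version_count


print(calls_for_repo(50))    # 101
print(calls_for_repo(1000))  # 2001
```

With a budget of 5,000 calls per hour, a single repo with 1,000 tags already uses about 40% of what one client can do in that hour.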

  • it first checks using the GitHub OAuth app
  • then it checks using each user's GitHub token

Which means that if you have 5 users on the app, you'll be able to make (1 + 5) x 5,000 = 30,000 calls per hour.
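
As an illustration only, the selection logic described above could look like the Python sketch below. It is not the actual class from the project: Client, find_client and hourly_budget are hypothetical names, each client is reduced to its remaining rate limit, and reading "enough" as strictly more than 50 remaining calls is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

RATE_LIMIT_PER_CLIENT = 5000  # calls per hour for one authenticated client
MIN_REMAINING = 50            # threshold mentioned above


@dataclass
class Client:
    label: str      # e.g. "oauth-app" or a user name (illustrative)
    remaining: int  # calls left in the current rate limit window


def find_client(app_client: Client, user_clients: Iterable[Client]) -> Optional[Client]:
    """Return the first client with enough calls left, or None.

    The OAuth app is checked first, then each user's token, in order.
    """
    for client in [app_client, *user_clients]:
        if client.remaining > MIN_REMAINING:  # strict comparison is an assumption
            return client
    return None


def hourly_budget(user_count: int) -> int:
    """Total calls per hour with the OAuth app plus one token per user."""
    return (1 + user_count) * RATE_LIMIT_PER_CLIENT


app = Client("oauth-app", remaining=12)
users = [Client("alice", remaining=40), Client("bob", remaining=4200)]
chosen = find_client(app, users)
print(chosen.label if chosen else "no client available")  # bob
print(hourly_budget(5))  # 30000
```

Returning None when every client is exhausted leaves it to the caller to decide whether to wait or to postpone the sync.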