
Sweep: AI-powered Junior Developer for small features and bug fixes.


GitHub Issues ⟶ Refactored and Tested Python Code!


🎊 We recently updated our README to reflect our improvements to Python refactors and unit tests!


Sweep is an AI junior developer that refactors and writes unit tests for Python. 🐍 🤖

Install Sweep and open a GitHub issue such as "Sweep: Refactor the run function in main.py", and Sweep will:

  1. Identify the best places to refactor your code
  2. Refactor and add unit tests through GitHub
  3. Run and debug your code to open a Pull Request
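
The issue from the example above can also be opened programmatically. The following is a minimal sketch, assuming GitHub's REST endpoint for creating issues (POST /repos/{owner}/{repo}/issues) and a personal access token. The helper name build_sweep_issue_request and the octocat/hello-world repository are placeholders for illustration and are not part of Sweep. The sketch only builds the request so it can be inspected before anything is sent.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_sweep_issue_request(owner, repo, token, task, body=""):
    """Build (but do not send) a request that opens a 'Sweep: ...' issue."""
    payload = {"title": f"Sweep: {task}", "body": body}
    return urllib.request.Request(
        url=f"{GITHUB_API}/repos/{owner}/{repo}/issues",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


# Placeholder owner, repository, and token: replace with your own values.
request = build_sweep_issue_request(
    "octocat", "hello-world", "<token>", "Refactor the run function in main.py"
)
print(request.get_method(), request.full_url)
# Sending is left to the caller: urllib.request.urlopen(request)
```

Keeping the network call out of the helper makes it easy to check the title prefix before the issue is created.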

Features

  • Turns issues directly into pull requests (without an IDE)
  • Addresses developer replies & comments on its PRs
  • Understands your codebase using the dependency graph, text, and vector search.
  • Runs your unit tests and autoformatters to validate generated code.



What Makes Sweep Different

We've been working on code modification with LLMs for a while. Along the way we found several recurring problems, and we are fixing them:

  • Refactoring Code: LLMs are bad at refactoring code. It's really challenging for them to extract all of the necessary parameters. Check out https://docs.sweep.dev/blogs/refactor-python!
    • Sweep solves this by using Rope and our custom DSL to perform perfect refactors every time!
  • Unit Tests: Most AI unit test copilots don't even validate the code. They leave it to the user to make sure the generated code works, which is half of the battle. Check out https://docs.sweep.dev/blogs/ai-unit-tests!
    • Sweep runs your code for you, which catches bugs and makes sure each line of old and new code has been properly validated!
  • Formatting: LLMs are also bad at properly formatting code, such as by adding typehints and making sure we use tabs instead of spaces. Check out https://docs.sweep.dev/blogs/super-linter!
    • Sweep uses its sandbox to format your code, and uses Rules to perform other changes like adding typehints, or any other small chores!
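
To make the "necessary parameters" problem from the refactoring point concrete, here is a minimal sketch of the analysis an extract-function refactor depends on: finding the names a snippet reads before it defines them. It uses only Python's standard ast module. It is not Sweep's DSL and not how Rope works internally; FreeNameFinder and required_parameters are hypothetical names, and comprehensions, nested functions, and imports are deliberately not handled.

```python
import ast
import builtins


class FreeNameFinder(ast.NodeVisitor):
    """Collect names a snippet reads before binding them itself."""

    def __init__(self):
        self.bound = set()  # names assigned so far inside the snippet
        self.free = []      # names that must come from outside, in first-use order

    def _read(self, name):
        if name in self.bound or name in self.free or hasattr(builtins, name):
            return
        self.free.append(name)

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            self.bound.add(node.id)
        else:
            self._read(node.id)

    def visit_Assign(self, node):
        self.visit(node.value)  # the right-hand side is evaluated first
        for target in node.targets:
            self.visit(target)

    def visit_AugAssign(self, node):
        if isinstance(node.target, ast.Name):
            self._read(node.target.id)  # `x += y` reads x before writing it
        self.visit(node.value)
        self.visit(node.target)

    def visit_For(self, node):
        self.visit(node.iter)    # the iterable is read before the loop variable is bound
        self.visit(node.target)
        for statement in node.body + node.orelse:
            self.visit(statement)


def required_parameters(snippet):
    """Return the names an extracted function would need as parameters."""
    finder = FreeNameFinder()
    finder.visit(ast.parse(snippet))
    return finder.free


snippet = (
    "for price, qty in items:\n"
    "    total += price * qty * (1 + tax_rate)\n"
    "print(total)\n"
)
print(required_parameters(snippet))  # ['items', 'total', 'tax_rate']
```

Missing any one of these names when moving the loop into its own function yields code that fails at runtime, which is why a programmatic refactoring tool is a safer choice than free-form generation for this step.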

Getting Started

GitHub App

Install Sweep by adding the Sweep GitHub App to your desired repositories.

  • For more details, visit our Installation page.

  • Note: Sweep only picks up an issue with the "Sweep:" title when the issue is created, not when it is updated. If you want Sweep to pick up an issue after it has been created, add the "Sweep" label to the issue.

  • We focus on Python but support all languages GPT-4 can write. This includes JS/TS, Rust, Go, Java, C# and C++.
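
The trigger rule in the note above can be summarized as a small predicate over GitHub's issues webhook payload. This is a sketch of the rule as described here, not Sweep's actual handler: the function name should_sweep_handle is hypothetical, and exact, case-sensitive matching on the title prefix and label name is an assumption. The "action", "issue", and "label" fields follow GitHub's issues webhook event.

```python
def should_sweep_handle(event):
    """Return True if an issues webhook event should start a Sweep run."""
    action = event.get("action")
    if action == "opened":
        # Titles are only checked when the issue is created.
        title = event.get("issue", {}).get("title", "")
        return title.startswith("Sweep:")
    if action == "labeled":
        # Adding the label is the way to opt in after creation.
        return event.get("label", {}).get("name") == "Sweep"
    # Edits (for example renaming an issue to "Sweep: ...") are ignored.
    return False


opened = {"action": "opened", "issue": {"title": "Sweep: Refactor the run function in main.py"}}
renamed = {"action": "edited", "issue": {"title": "Sweep: Refactor the run function in main.py"}}
labeled = {"action": "labeled", "issue": {"title": "Refactor run"}, "label": {"name": "Sweep"}}

print(should_sweep_handle(opened), should_sweep_handle(renamed), should_sweep_handle(labeled))
# True False True
```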

Self-Hosting

You can self-host Sweep with our Docker image (https://hub.docker.com/r/sweepai/sweep). See our deployment instructions at https://docs.sweep.dev/deployment.

Development

Starting the Webhook

  1. Clone the repo with git clone https://github.com/sweepai/sweep.
  2. Create .env according to https://docs.sweep.dev/deployment.
  3. Run docker compose up --build. This will take a few moments to start.

To build our Docker images, run docker compose build.


Story

We used to work in large, messy repositories, and we noticed how complex the code could get without regular refactors and unit tests. We realized that AI could handle these chores for us, so we built Sweep!

Unlike existing AI solutions, Sweep can solve entire tickets, and it works in parallel and asynchronously: developers can spin up 10 tickets and Sweep will address them all at once.

The Stack

  • GPT-4 32k
  • Code Search Engine using Python AST
  • Code Sandbox
  • Programmatic refactors using Rope!

Highlights

Examine pull requests created by Sweep here.

Pricing

Every user receives unlimited GPT-3.5 tickets and 5 GPT-4 tickets per month. Professionals who want unlimited GPT-4 tickets and priority support can get a one-week free trial of Sweep Pro.

For more GPT-4 tickets, visit our payment portal!

You can self-host Sweep's Docker image on any machine (AWS, Azure, your laptop) for free. You can get enterprise support by contacting us.


Limitations of Sweep

  • Gigantic repos: >5000 files. We exclude a default set of extensions and directories, but this doesn't always catch everything, so you may need to block some directories yourself (see blocked_dirs).

    • If Sweep is stuck at 0% for over 30 min and your repo has a few thousand files, let us know.
  • Large-scale refactors: >3 files or >150 lines of code changes

    • e.g. Refactor the entire codebase from TensorFlow to PyTorch
    • Sweep works best when pointed to a file, and we're continuously improving Sweep's automation!
  • Editing images and other non-text assets

    • e.g. Use the logo to create favicons for our landing page
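
For the repository-size limit above, a quick local check can show whether blocking a directory brings a repo under the 5000-file mark. This is a rough sketch using only the standard library: count_files is a hypothetical helper, skipping .git is an assumption, and the way Sweep itself applies blocked_dirs and its default exclusions may differ.

```python
import os
import tempfile

FILE_LIMIT = 5000  # threshold quoted in the limitation above


def count_files(root, blocked_dirs=()):
    """Count files under root, skipping .git and any blocked directories."""
    blocked = {os.path.normpath(d) for d in blocked_dirs}
    total = 0
    for current, dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        # Prune in place so os.walk never descends into skipped directories.
        dirs[:] = [
            d for d in dirs
            if d != ".git" and os.path.normpath(os.path.join(rel, d)) not in blocked
        ]
        total += len(files)
    return total


# Tiny throwaway repository: one source file and one vendored file.
with tempfile.TemporaryDirectory() as repo:
    os.makedirs(os.path.join(repo, "node_modules", "pkg"))
    for path in ("main.py", os.path.join("node_modules", "pkg", "index.js")):
        with open(os.path.join(repo, path), "w") as handle:
            handle.write("")
    everything = count_files(repo)
    kept = count_files(repo, blocked_dirs=["node_modules"])

print(everything, kept, kept > FILE_LIMIT)  # prints: 2 1 False
```

Running count_files on a real checkout with and without candidate directories gives a quick sense of which ones are worth blocking.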

Contributing

Contributions are welcome and greatly appreciated! To get set up, see Development. For detailed guidelines on how to contribute, please see the CONTRIBUTING.md file.

Contributors

Thank you for your contribution!

and, of course, Sweep!