
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, and more with your permission every step of the way.

Primary language: TypeScript | License: MIT

Claude Dev

Download VSCode Extension | Join the Discord

Thanks to Claude 3.5 Sonnet's agentic coding capabilities, Claude Dev can handle complex software development tasks step-by-step. With tools that let him create & edit files, explore complex projects, and execute terminal commands (after you grant permission), he can assist you in ways that go beyond simple code completion or tech support. While autonomous AI scripts traditionally run in sandboxed environments, Claude Dev provides a human-in-the-loop GUI to supervise every file changed and command executed, offering a safe and accessible way to explore the potential of agentic AI.

  • Paste images in chat to use Claude's vision capabilities and turn mockups into fully functional applications or fix bugs with screenshots
  • Inspect diffs of every change Claude makes right in the editor, and provide feedback until you're satisfied with the result
  • Run CLI commands directly in chat, so you never have to open a terminal yourself (+ respond to interactive commands by sending a message)
  • Approve each action with permission buttons (e.g. 'Approve terminal command') before any tool is used or information is sent to the API
  • Keep track of total tokens and API usage cost for the entire task loop and individual requests
  • Set a maximum # of API requests allowed for a task before being prompted for permission to proceed
  • When a task is completed, Claude determines if he can present the result to you with a terminal command like open -a "Google Chrome" index.html, which you run with a click of a button

Pro tip: Use the Cmd + Shift + P shortcut to open the command palette and type Claude Dev: Open In New Tab to start a new task right in the editor.

How it works

Claude Dev uses an autonomous task execution loop with chain-of-thought prompting and access to powerful tools that give him the ability to accomplish nearly any task. Start by providing a task and the loop fires off; Claude may then use tools (with your permission) to accomplish each step in his thought process.
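
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual implementation: the names ModelStep, LoopDeps, and runTaskLoop are hypothetical, the model call is stubbed behind nextStep, and the sketch is kept synchronous for brevity (a real extension would await the API and the user's click). It also simply stops at the request limit, whereas Claude Dev prompts you for permission to proceed.

```typescript
// Hypothetical types for illustration; not taken from the Claude Dev source.
type ToolRequest = { tool: string; input: string };
type ModelStep =
  | { kind: "tool"; request: ToolRequest }
  | { kind: "done"; result: string };

interface LoopDeps {
  // Stands in for one API request: given the history so far, return the next step.
  nextStep(history: string[]): ModelStep;
  // Stands in for the permission button shown in the GUI.
  approve(request: ToolRequest): boolean;
  // Runs an approved tool and returns its output.
  runTool(request: ToolRequest): string;
}

function runTaskLoop(task: string, deps: LoopDeps, maxRequests: number): string {
  const history: string[] = [`task: ${task}`];
  for (let requests = 0; requests < maxRequests; requests++) {
    const step = deps.nextStep(history);
    if (step.kind === "done") {
      return step.result;
    }
    if (deps.approve(step.request)) {
      history.push(`${step.request.tool} -> ${deps.runTool(step.request)}`);
    } else {
      // A denial is recorded so the next step can take a different approach.
      history.push(`${step.request.tool} -> denied by user`);
    }
  }
  return "stopped: request limit reached";
}
```

Passing the model, the approval prompt, and the tool runner in as dependencies keeps the loop itself small and makes it easy to exercise with stubs.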

Tools

Claude Dev has access to the following capabilities:

  1. execute_command: Execute terminal commands on the system (only with your permission, output is streamed into the chat and you can respond to stdin or exit long-running processes when you're ready)
  2. read_file: Read the contents of a file at the specified path
  3. write_to_file: Write content to a file at the specified path, automatically creating any necessary directories
  4. list_files: List all paths for files in the specified directory. When recursive = true, it recursively lists all files in the directory and its nested folders (excludes files in .gitignore). When recursive = false, it lists only top-level files (useful for generic file operations like retrieving a file from your Desktop).
  5. list_code_definition_names: Parses all source code files at the top level of the specified directory to extract names of key elements like classes and functions (see more below)
  6. search_files: Search files in a specified directory for text that matches a given regex pattern (useful for refactoring code, addressing TODOs and FIXMEs, removing dead code, etc.)
  7. ask_followup_question: Ask the user a question to gather additional information needed to complete a task (due to the autonomous nature of the program, this isn't a typical chatbot: Claude Dev must explicitly interrupt his task loop to ask for more information)
  8. attempt_completion: Present the result to the user after completing a task, potentially with a terminal command to kick off a demonstration
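
One way to picture the tool list is as a discriminated union with one variant per tool. The tool names below come from the list above, but the parameter names (command, path, content, recursive, regex, question, result) and the describeToolCall helper are assumptions made for illustration and may not match the extension's real schema.

```typescript
// Parameter names are assumed for this sketch; only the tool names come from the README.
type ToolCall =
  | { name: "execute_command"; command: string }
  | { name: "read_file"; path: string }
  | { name: "write_to_file"; path: string; content: string }
  | { name: "list_files"; path: string; recursive: boolean }
  | { name: "list_code_definition_names"; path: string }
  | { name: "search_files"; path: string; regex: string }
  | { name: "ask_followup_question"; question: string }
  | { name: "attempt_completion"; result: string; command?: string };

// Builds a one-line summary that could be shown next to a permission button.
function describeToolCall(call: ToolCall): string {
  switch (call.name) {
    case "execute_command":
      return `Run command: ${call.command}`;
    case "read_file":
      return `Read file: ${call.path}`;
    case "write_to_file":
      return `Write ${call.content.length} characters to: ${call.path}`;
    case "list_files":
      return `List files in: ${call.path}${call.recursive ? " (recursive)" : ""}`;
    case "list_code_definition_names":
      return `List code definitions in: ${call.path}`;
    case "search_files":
      return `Search ${call.path} for: ${call.regex}`;
    case "ask_followup_question":
      return `Ask: ${call.question}`;
    case "attempt_completion":
      return call.command ? `Finish and run: ${call.command}` : "Finish task";
    default: {
      // Compile-time exhaustiveness check: a new tool without a case fails to type-check.
      const unreachable: never = call;
      return unreachable;
    }
  }
}
```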

Working in Existing Projects

When given a task in an existing project, Claude looks for the most relevant files to read and edit the same way you or I would: by first looking at the names of directories, files, classes, and functions. These names tend to reflect their purpose and role within the broader system, and often encapsulate high-level concepts and relationships that help in understanding a project's overall architecture. With tools like list_code_definition_names and search_files, Claude can extract the names of various elements in a project to determine which files are most relevant to a given task, without you having to mention @files or @folders yourself.

  1. File Structure: When a task is started, Claude is given an overview of your project's file structure. It turns out Claude 3.5 Sonnet is really good at inferring what it needs to process further from these file names alone.

  2. Source Code Definitions: Claude may then use the list_code_definition_names tool on specific directories of interest. This tool uses tree-sitter to parse source code with custom tag queries that extract names of classes, functions, methods, and other definitions. It works by first identifying source code files that tree-sitter can parse (currently supports Python, JavaScript, TypeScript, Ruby, Go, Java, PHP, Rust, C, C++, C#, Swift), then parsing each file into an abstract syntax tree, and finally applying a language-specific query to extract definition names (you can see the exact query used for each language in src/parse-source-code/queries). The results are formatted into a concise & readable output that Claude can easily interpret to quickly understand the code's structure and purpose.

  3. Search Files: Claude can also use the search_files tool to search for specific patterns or content across multiple files. This tool uses ripgrep to perform regex searches on files in a specified directory. The results are formatted into a concise & readable output that Claude can easily interpret. This can be useful for tasks like refactoring function names, updating imports, addressing TODOs and FIXMEs, etc.

  4. Read Relevant Files: With insights gained from the names of various files and source code definitions, Claude can then use the read_file tool to examine specific files that are most relevant to the task at hand.

By carefully managing what information is added to context, Claude can provide valuable assistance even for complex, large-scale projects without overwhelming its context window.
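
To make the search step concrete, here is a small sketch of regex search with compact output. It is a stand-in only: the real tool shells out to ripgrep against files on disk, while this version uses JavaScript's RegExp over an in-memory map of path to contents. The names SearchMatch, searchFiles, and formatMatches are hypothetical, and the path:line: text output format is an assumption rather than the extension's actual format.

```typescript
// Hypothetical shapes for illustration; the real search_files tool uses ripgrep.
interface SearchMatch {
  path: string;
  line: number; // 1-based line number
  text: string; // the matching line, trimmed
}

function searchFiles(files: Record<string, string>, pattern: RegExp): SearchMatch[] {
  const matches: SearchMatch[] = [];
  for (const path of Object.keys(files)) {
    files[path].split("\n").forEach((text, index) => {
      // Reset lastIndex so a /g or /y pattern behaves the same on every line.
      pattern.lastIndex = 0;
      if (pattern.test(text)) {
        matches.push({ path, line: index + 1, text: text.trim() });
      }
    });
  }
  return matches;
}

// One line per match keeps the output short enough to add to the model's context.
function formatMatches(matches: SearchMatch[]): string {
  return matches.map((m) => `${m.path}:${m.line}: ${m.text}`).join("\n");
}
```

Returning only the matching lines, rather than whole files, is what keeps this kind of lookup cheap in terms of context.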

Only With Your Permission

Claude always asks for your permission before any tool is executed or any information is sent back to the API. This puts you in control of the agentic loop, every step of the way.
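
As a rough sketch of this idea, an action can be wrapped so that it only runs after an approval callback returns true. The names Gated and withPermission are hypothetical, and the approve callback stands in for the approval button in the GUI; the extension's own mechanism may look quite different.

```typescript
// Result of a gated action: either it ran and produced a value, or the user denied it.
type Gated<R> = { status: "ran"; value: R } | { status: "denied" };

// Wraps `action` so it is only invoked after `approve` accepts its description.
// `approve` is a hypothetical stand-in for a button click in the GUI.
function withPermission<A, R>(
  describe: (arg: A) => string,
  approve: (description: string) => boolean,
  action: (arg: A) => R,
): (arg: A) => Gated<R> {
  return (arg) => {
    if (!approve(describe(arg))) {
      return { status: "denied" };
    }
    return { status: "ran", value: action(arg) };
  };
}
```

Because the check lives in the wrapper, the wrapped action has no code path that skips approval.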


Contribution

Paul Graham said it best: "if you build something now that barely works with AI, the next models will make it really work." I've built this project with the assumption that scaling laws will continue to improve the quality (and cost) of AI models, and what might be difficult for Claude 3.5 Sonnet today will be effortless for future generations. That is the design philosophy I'd like to develop this project with, so it will always be updated with the best models, tools, and capabilities available, without wasting effort on implementing stopgaps like cheaper agents. With that said, I'm always open to suggestions and feedback, so please feel free to contribute to this project by submitting issues and pull requests.

To build Claude Dev locally, follow these steps:

  1. Clone the repository:
    git clone https://github.com/saoudrizwan/claude-dev.git
  2. Open the project in VSCode:
    code claude-dev
  3. Install the necessary dependencies for the extension and webview-gui:
    npm run install:all
  4. Launch by pressing F5 to open a new VSCode window with the extension loaded


License

This project is licensed under the MIT License. See the LICENSE file for details.

Questions?

Contact me on X @sdrzn. Please create an issue if you come across a bug or would like a feature to be added.

Acknowledgments

Special thanks to Anthropic for providing the model that powers this extension.