🤗 VSCode extension for testing open source code completion models

This extension is a fork of tabnine-vscode, modified to work with open source code models on hf.co/models.

Announcement: the latest version of this extension supports Code Llama 13B. See the Code Llama section below for how to test Code Llama with this extension.

We also have extensions for:

The currently supported model is StarCoder from the BigCode project. Find more info here.

Installing

Install it like any other VSCode extension.

By default, this extension uses bigcode/starcoder and the Hugging Face Inference API for inference. You can also configure it to send inference requests to a custom endpoint instead of the Hugging Face Inference API. If you use the default Hugging Face Inference API, you need to provide an HF API token.

HF API token

You can supply your HF API token (hf.co/settings/token) with this command:

Cmd/Ctrl+Shift+P to open VSCode command palette
Type: Hugging Face Code: Set API token

Testing

Create a new python file
Try typing def main():

Checking if the generated code is in The Stack

Hit Cmd+Shift+A to check if the generated code is in The Stack. This is a rapid first-pass attribution check using stack.dataportraits.org. We check for sequences of at least 50 characters that match a Bloom filter. This means false positives are possible and sufficiently long surrounding context is necessary (see the paper for details on n-gram striding and sequence length). The dedicated Stack search tool is a full dataset index and can be used for a complete second pass.
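To illustrate why a Bloom filter check can report false positives but never false negatives, here is a minimal sketch of the idea. It is not the stack.dataportraits.org implementation: the hash function, the filter size, the stride values, and all names (BloomFilter, indexDocument, hasPossibleMatch) are assumptions chosen for illustration.

```typescript
// Illustrative sketch only: hashing, sizes, and strides are assumptions.
const WINDOW = 50; // minimum sequence length mentioned above

class BloomFilter {
  private bits: Uint8Array; // one byte per bit, for simplicity

  constructor(private size: number, private hashCount: number) {
    this.bits = new Uint8Array(size);
  }

  // FNV-1a style hash, varied by a seed to obtain several hash functions.
  private hash(text: string, seed: number): number {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < text.length; i++) {
      h ^= text.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.size;
  }

  add(text: string): void {
    for (let k = 0; k < this.hashCount; k++) {
      this.bits[this.hash(text, k)] = 1;
    }
  }

  // true means "possibly present" (false positives can happen);
  // false means "definitely not present".
  mightContain(text: string): boolean {
    for (let k = 0; k < this.hashCount; k++) {
      if (this.bits[this.hash(text, k)] === 0) return false;
    }
    return true;
  }
}

// Cut text into WINDOW-character chunks, starting every `stride` characters.
function windows(text: string, stride: number): string[] {
  const out: string[] = [];
  for (let i = 0; i + WINDOW <= text.length; i += stride) {
    out.push(text.slice(i, i + WINDOW));
  }
  return out;
}

// Index a corpus document in non-overlapping chunks (stride = WINDOW).
function indexDocument(filter: BloomFilter, doc: string): void {
  for (const w of windows(doc, WINDOW)) filter.add(w);
}

// Query with stride 1 so that any alignment can hit an indexed chunk.
function hasPossibleMatch(filter: BloomFilter, generated: string): boolean {
  return windows(generated, 1).some((w) => filter.mightContain(w));
}
```

In this sketch, text shorter than 50 characters yields no windows and therefore never matches, which is one reason long enough surrounding context matters.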

Developing

Make sure you've installed yarn on your system.

Clone this repo: git clone https://github.com/huggingface/huggingface-vscode
Install deps: cd huggingface-vscode && yarn install --frozen-lockfile
In vscode, open Run and Debug side bar & click Launch Extension

Checking output

You can see input to & output from the code generation API:

Open VSCode OUTPUT panel
Choose Hugging Face Code

Configuring

You can configure the endpoint to which requests are sent and the special tokens.

Example:

Let's say your current code is this:

import numpy as np
import scipy as sp
{YOUR_CURSOR_POSITION}
def hello_world():
    print("Hello world")

Then, the request body will look like:

const inputs = `{start token}import numpy as np\nimport scipy as sp\n{end token}def hello_world():\n    print("Hello world"){middle token}`
const data = {inputs, parameters:{max_new_tokens:256}};  // {"inputs": "", "parameters": {"max_new_tokens": 256}}

const res = await fetch(endpoint, {
    body: JSON.stringify(data),
    headers,
    method: "POST"
});

const json = await res.json() as any as {generated_text: string};  // {"generated_text": ""}
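The way the inputs string above is assembled can be sketched as a small helper. This is a sketch under stated assumptions, not the extension's actual code: the names FimTokens, buildFimPrompt, and buildRequestBody are hypothetical, and the token values are StarCoder's fill-in-the-middle tokens, used here as one example configuration.

```typescript
// Hypothetical helper names; only the prompt layout and body shape come from the example above.
interface FimTokens {
  start: string;  // placed before the code that precedes the cursor
  end: string;    // placed before the code that follows the cursor
  middle: string; // placed last; the model generates the text that belongs at the cursor
}

// StarCoder's fill-in-the-middle tokens, as an example configuration.
const STARCODER_TOKENS: FimTokens = {
  start: "<fim_prefix>",
  end: "<fim_suffix>",
  middle: "<fim_middle>",
};

// Layout: {start token}{code before cursor}{end token}{code after cursor}{middle token}
function buildFimPrompt(prefix: string, suffix: string, t: FimTokens): string {
  return `${t.start}${prefix}${t.end}${suffix}${t.middle}`;
}

// Serialize the request body in the shape shown above.
function buildRequestBody(inputs: string, maxNewTokens = 256): string {
  return JSON.stringify({ inputs, parameters: { max_new_tokens: maxNewTokens } });
}

const prefix = "import numpy as np\nimport scipy as sp\n";
const suffix = 'def hello_world():\n    print("Hello world")';
const prompt = buildFimPrompt(prefix, suffix, STARCODER_TOKENS);
```

Keeping the tokens in a separate object reflects why they are configurable: a model with different special tokens only needs a different FimTokens value.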

Code Llama

To test Code Llama 13B model:

Make sure you have the latest version of this extension.
Make sure you have supplied your HF API token
Open VSCode Settings (Cmd+,) & type: Hugging Face Code: Config Template
From the dropdown menu, choose codellama/CodeLlama-13b-hf

Community

Repository	Description
huggingface-vscode-endpoint-server	Custom code generation endpoint for this repository
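
A custom endpoint needs to accept the request body shown in the Configuring section and reply with a generated_text field. The following is a minimal sketch of that contract, not the huggingface-vscode-endpoint-server implementation: handleGenerateRequest, Generator, and echoGenerator are hypothetical names, and the default of 256 new tokens is an assumption taken from the example request.

```typescript
// Hypothetical request handler; only the JSON shapes come from the Configuring example.
interface GenerateRequest {
  inputs: string;
  parameters?: { max_new_tokens?: number };
}

interface GenerateResponse {
  generated_text: string;
}

// A function that turns a prompt into generated text (a model call in a real server).
type Generator = (prompt: string, maxNewTokens: number) => string;

function handleGenerateRequest(rawBody: string, generate: Generator): GenerateResponse {
  const parsed = JSON.parse(rawBody) as Partial<GenerateRequest>;
  if (typeof parsed.inputs !== "string") {
    throw new Error('request body must contain a string "inputs" field');
  }
  // Assumption: fall back to 256 when the client does not send max_new_tokens.
  const maxNewTokens = parsed.parameters?.max_new_tokens ?? 256;
  return { generated_text: generate(parsed.inputs, maxNewTokens) };
}

// Stand-in generator for local testing; it does not run a model.
const echoGenerator: Generator = (prompt, maxNewTokens) =>
  `# received ${prompt.length} prompt characters, max_new_tokens=${maxNewTokens}`;
```

Separating the handler from the generator keeps the JSON contract testable without loading a model or opening a network port.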
