This extension was forked from tabnine-vscode and modified to make it compatible with open-source code models on hf.co/models.
Announcement: the latest version of this extension supports Code Llama 13B. Find more info here on how to test Code Llama with this extension.
We also have extensions for:
The currently supported model is StarCoder from the BigCode project. Find more info here.
Install it like any other VS Code extension.
By default, this extension uses bigcode/starcoder and the Hugging Face Inference API for inference. However, you can configure it to send inference requests to a custom endpoint instead of the Hugging Face Inference API. If you use the default Hugging Face Inference API, you need to provide an HF API token.
You can supply your HF API token (hf.co/settings/token) with this command:
- Press Cmd/Ctrl+Shift+P to open the VS Code command palette
- Type: Hugging Face Code: Set API token
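As a rough sketch of how a stored token might end up on a request: the Hugging Face Inference API accepts the token as a bearer token in the Authorization header. The helper name buildHeaders is hypothetical and is not taken from this extension's source; a custom endpoint that needs no authentication is assumed to work without a token.

```typescript
// Hypothetical helper (not from the extension's source): builds request headers.
// Assumption: the token, when present, is sent as a bearer token.
function buildHeaders(apiToken?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiToken !== undefined && apiToken.trim() !== "") {
    headers["Authorization"] = `Bearer ${apiToken.trim()}`;
  }
  return headers;
}
```

Leaving the Authorization header out when no token is set keeps the same helper usable for both the default API and an unauthenticated custom endpoint.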
- Create a new Python file
- Try typing def main():
Checking if the generated code is in The Stack
Hit Cmd+Shift+A to check if the generated code is in The Stack.
This is a rapid first-pass attribution check using stack.dataportraits.org.
We check for sequences of at least 50 characters that match a Bloom filter.
This means false positives are possible and sufficiently long surrounding context is necessary (see the paper for details on n-gram striding and sequence length).
The dedicated Stack search tool is a full dataset index and can be used for a complete second pass.
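To illustrate why false positives are possible and why enough surrounding context matters, here is a toy sketch of a Bloom-filter membership check over 50-character sequences. Everything in it (the class, the hash function, the filter size, the stride) is an assumption chosen for illustration; it is not the implementation or the parameters used by stack.dataportraits.org.

```typescript
// Toy Bloom filter over fixed-width character n-grams. Illustration only:
// hash functions, sizes, and stride are assumptions, not the real service's.
const WIDTH = 50;

class BloomFilter {
  private slots: Uint8Array; // one byte per slot for simplicity
  private size: number;
  private numHashes: number;

  constructor(size: number, numHashes: number) {
    this.size = size;
    this.numHashes = numHashes;
    this.slots = new Uint8Array(size);
  }

  // FNV-1a style hash, varied by a seed to simulate several hash functions.
  private hash(text: string, seed: number): number {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < text.length; i++) {
      h ^= text.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.size;
  }

  add(text: string): void {
    for (let s = 0; s < this.numHashes; s++) this.slots[this.hash(text, s)] = 1;
  }

  // True means "possibly present" (false positives can happen);
  // false means "definitely absent".
  mightContain(text: string): boolean {
    for (let s = 0; s < this.numHashes; s++) {
      if (this.slots[this.hash(text, s)] === 0) return false;
    }
    return true;
  }
}

// Index the corpus at a stride: only every `stride`-th n-gram is stored.
function indexCorpus(filter: BloomFilter, corpus: string, stride: number = WIDTH): void {
  for (let i = 0; i + WIDTH <= corpus.length; i += stride) {
    filter.add(corpus.slice(i, i + WIDTH));
  }
}

// Slide over the generated text one character at a time and report the
// offsets whose 50-character window might be in the corpus.
function findPossibleMatches(filter: BloomFilter, generated: string): number[] {
  const offsets: number[] = [];
  for (let i = 0; i + WIDTH <= generated.length; i++) {
    if (filter.mightContain(generated.slice(i, i + WIDTH))) offsets.push(i);
  }
  return offsets;
}
```

Because only strided n-grams are stored in this sketch, a copied passage is detected only if it is long enough to fully cover at least one stored 50-character window, which is one way to read the "long enough surrounding context" caveat.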
Make sure you've installed yarn on your system.
- Clone this repo: git clone https://github.com/huggingface/huggingface-vscode
- Install deps: cd huggingface-vscode && yarn install --frozen-lockfile
- In VS Code, open the Run and Debug sidebar and click Launch Extension
You can see the input to and output from the code generation API:
- Open the VS Code OUTPUT panel
- Choose Hugging Face Code
You can configure the endpoint to which requests are sent and the special tokens.
Example:
Let's say your current code is this:
import numpy as np
import scipy as sp
{YOUR_CURSOR_POSITION}
def hello_world():
    print("Hello world")
Then, the request body will look like:
// {start token}, {end token}, and {middle token} stand for the configured special tokens.
const inputs = `{start token}import numpy as np\nimport scipy as sp\n{end token}def hello_world():\n    print("Hello world"){middle token}`
const data = {inputs, parameters:{max_new_tokens:256}}; // {"inputs": "", "parameters": {"max_new_tokens": 256}}
// endpoint and headers come from the extension configuration.
const res = await fetch(endpoint, {
    body: JSON.stringify(data),
    headers,
    method: "POST"
});
const json = await res.json() as any as {generated_text: string}; // {"generated_text": ""}
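The prompt layout above can be sketched as a small helper that takes the text before the cursor, the text after it, and the three special tokens. The interface and function names are hypothetical and not taken from this extension's source. The token strings in the usage lines are assumed to be StarCoder's fill-in-the-middle tokens; check your model's documentation before relying on them.

```typescript
// Hypothetical types and helper; names are illustrative only.
interface FimTokens {
  start: string;  // placed before the text that precedes the cursor
  end: string;    // placed before the text that follows the cursor
  middle: string; // placed last; the model generates the missing middle
}

function buildRequestBody(
  prefix: string,
  suffix: string,
  tokens: FimTokens,
  maxNewTokens: number = 256
): { inputs: string; parameters: { max_new_tokens: number } } {
  const inputs = `${tokens.start}${prefix}${tokens.end}${suffix}${tokens.middle}`;
  return { inputs, parameters: { max_new_tokens: maxNewTokens } };
}

// Usage with the example file above. Token values are an assumption.
const starcoderTokens: FimTokens = {
  start: "<fim_prefix>",
  end: "<fim_suffix>",
  middle: "<fim_middle>",
};
const exampleBody = buildRequestBody(
  "import numpy as np\nimport scipy as sp\n",
  'def hello_world():\n    print("Hello world")',
  starcoderTokens
);
```

Keeping the tokens in a separate object mirrors the idea that the special tokens are configurable per model, while the prefix/suffix ordering stays the same.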
To test the Code Llama 13B model:
- Make sure you have the latest version of this extension.
- Make sure you have supplied your HF API token.
- Open VS Code Settings (Cmd+,) and type: Hugging Face Code: Config Template
- From the dropdown menu, choose codellama/CodeLlama-13b-hf
Read more here about Code Llama.
Repository | Description |
---|---|
huggingface-vscode-endpoint-server | Custom code generation endpoint for this repository |