Pinned Repositories
bitsandbytes
8-bit CUDA functions for PyTorch.
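A minimal sketch of typical usage, assuming a CUDA-enabled PyTorch install; the 8-bit optimizers are drop-in replacements for their torch.optim counterparts:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(1024, 1024).cuda()

# Drop-in replacement for torch.optim.Adam; optimizer state is kept
# in 8-bit, which substantially reduces optimizer memory.
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)

loss = model(torch.randn(8, 1024, device="cuda")).pow(2).mean()
loss.backward()
optimizer.step()
```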
bitsandbytes-windows-webui
Windows compile of bitsandbytes for use in text-generation-webui.
ctransformers-cuBLAS-wheels
ctransformers wheels with pre-built CUDA binaries for additional CUDA and AVX versions.
exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
flash-attention
Fast and memory-efficient exact attention (Windows wheels).
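A rough usage sketch, assuming the flash-attn 2.x API (flash_attn_func) and half-precision tensors on a CUDA device:

```python
import torch
from flash_attn import flash_attn_func

# q, k, v: (batch, seqlen, num_heads, head_dim); fp16/bf16 on CUDA.
q = torch.randn(1, 2048, 16, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact (non-approximate) attention, computed without materializing
# the full seqlen x seqlen attention matrix.
out = flash_attn_func(q, k, v, causal=True)
```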
GPTQ-for-LLaMa-CUDA
Oobabooga's GPTQ-for-LLaMa fork combined with the main cuda branch, packaged for installation as a Python package.
GPTQ-for-LLaMa-Wheels
Precompiled wheels for GPTQ-for-LLaMa.
llama-cpp-python-cuBLAS-wheels
Wheels for llama-cpp-python compiled with cuBLAS support.
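The wheels install through pip with an extra index URL pointing at the repository's pages; the exact URL and CUDA/AVX variant are documented there. With a cuBLAS build installed, llama-cpp-python can offload layers to the GPU; a minimal sketch, with a placeholder model path:

```python
from llama_cpp import Llama

# n_gpu_layers > 0 only has an effect when the wheel was built with
# cuBLAS support; those layers are offloaded to the GPU.
llm = Llama(model_path="./models/7b-q4_0.gguf", n_gpu_layers=35)

out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```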
one-click-installers
Simplified installers for oobabooga/text-generation-webui.
windows-venv-installers
Standalone, dependency-free scripts that automatically set up a virtual environment for easy project installation on Windows.
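A sketch of the core steps such scripts automate, using only the Python standard library (the requirements file is illustrative):

```python
import os
import subprocess
import venv

# Create an isolated environment with pip available.
venv.create("venv", with_pip=True)

# On Windows the environment's executables live under Scripts\.
pip = os.path.join("venv", "Scripts", "pip.exe")
subprocess.check_call([pip, "install", "-r", "requirements.txt"])
```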
jllllll's Repositories
jllllll/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
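A quantization sketch following the pattern in the upstream README; the model name and calibration text are illustrative, and a real run would use many calibration examples:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(pretrained)

# One calibration example, just to illustrate the API.
examples = [tokenizer("auto-gptq is an easy-to-use quantization package.",
                      return_tensors="pt")]

config = BaseQuantizeConfig(bits=4, group_size=128)
model = AutoGPTQForCausalLM.from_pretrained(pretrained, config)
model.quantize(examples)
model.save_quantized("opt-125m-4bit-gptq")
```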
jllllll/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs.
jllllll/text-generation-webui
A Gradio web UI for running large language models such as LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
jllllll/h2ogpt
Private Q&A and summarization of documents and images, or chat with a local GPT; 100% private with no data leaks; Apache 2.0 licensed. Demo: https://gpt.h2o.ai/
jllllll/GPTQ-for-LLaMa
4-bit quantization of LLMs using GPTQ.
jllllll/ctransformers
Python bindings for Transformer models implemented in C/C++ using the GGML library.
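A minimal sketch of the bindings, with a placeholder model file; model_type selects which C/C++ architecture implementation is used:

```python
from ctransformers import AutoModelForCausalLM

# Loads a GGML-format model and runs generation through the C/C++
# backend (llama, gptj, falcon, ... are selected via model_type).
llm = AutoModelForCausalLM.from_pretrained("./models/7b-ggml-q4_0.bin",
                                           model_type="llama")
print(llm("AI is going to", max_new_tokens=32))
```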
jllllll/safetensors
Simple, safe way to store and distribute tensors.
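A minimal save/load round-trip using the PyTorch bindings:

```python
import torch
from safetensors.torch import save_file, load_file

tensors = {"weight": torch.zeros(4, 4), "bias": torch.zeros(4)}

# Unlike pickle-based checkpoints, safetensors files contain no
# executable code, so loading untrusted files is safe.
save_file(tensors, "model.safetensors")
restored = load_file("model.safetensors")
```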
jllllll/scikit-build-core
A next-generation Python CMake adaptor and Python API for plugins.
jllllll/SillyTavern
LLM Frontend for Power Users.