Swipe4057's Stars
mit-han-lab/duo-attention
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
continuedev/prompt-file-examples
Sample .prompt files to use with Continue
Helixform/CodeCursor
An extension for using Cursor in Visual Studio Code.
unifiedjs/unified
☔️ interface for parsing, inspecting, transforming, and serializing content through syntax trees
Tencent/Tencent-Hunyuan-Large
simonmalberg/felix
aws-samples/generate-your-presentation-with-llm
getcursor/cursor
The AI Code Editor
continuedev/continue
⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
TabbyML/tabby
Self-hosted AI coding assistant
thu-ml/SageAttention
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without lossing end-to-end metrics across various models.
NVIDIA/RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
searxng/searxng
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
felladrin/MiniSearch
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses Web-LLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
datamllab/LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
andyrdt/refusal_direction
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
THUDM/GLM-4-Voice
GLM-4-Voice | 端到端中英语音对话模型
nlp-uoregon/trankit
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
Babelscape/wikineural
Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2021).
microsoft/table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
IlyaGusev/ping_pong_bench
aryopg/decore
Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination"
MERA-Evaluation/MERA
MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating SOTA models.
tianyi-lab/MoE-Embedding
Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
fishaudio/fish-speech
Brand new TTS solution
andy-yang-1/DoubleSparse
16-fold memory access reduction with nearly no loss
AniZpZ/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.