thtang's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
LlamaFamily/Llama-Chinese
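Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.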
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
dottxt-ai/outlines
Structured Text Generation
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
DA-southampton/NLP_ability
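A summary of the knowledge a natural language processing (NLP) engineer needs to build up, including interview questions, fundamentals, and engineering skills, to strengthen core competitiveness.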
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
microsoft/BlingFire
A lightning fast Finite State machine and REgular expression manipulation library.
noamgat/lm-format-enforcer
Enforce the output format (JSON Schema, Regex etc) of a language model
RUC-NLPIR/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
McGill-NLP/llm2vec
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
wangyuxinwhy/uniem
unified embedding model
quqxui/Awesome-LLM4IE-Papers
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
ContextualAI/gritlm
Generative Representational Instruction Tuning
texttron/tevatron
Tevatron - A flexible toolkit for neural retrieval research and development.
sunnweiwei/RankGPT
Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award]
SeanLee97/AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
rohan-paul/LLM-FineTuning-Large-Language-Models
LLM (Large Language Model) FineTuning
castorini/rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
mlpc-ucsd/BLIVA
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
chaoswork/llm_illustrated
Learn large models through illustrations
facebookresearch/tart
Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.
jakespringer/echo-embeddings
stanford-oval/wikidata-emnlp23
WikiSP, a semantic parser for Wikidata. WikiWebQuestions, a SPARQL-annotated dataset on Wikidata
livingbio/fuzzy-json
Fuzzy-JSON is a compact Python package with no dependencies, designed to address the pesky JSONDecodeError that sometimes occurs when utilizing OpenAI's powerful call function.
vegetablejuiceftw/wiki-search
Wikipedia / Wikidata search project for knowledge base RAG systems.