sunnyl95's Stars
hankcs/HanLP
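Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing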
weaviate/weaviate-python-client
A Python native client for easy interaction with a Weaviate instance.
pinecone-io/pinecone-python-client
The Pinecone Python client
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
iMayK/CRUSH4SQL
stanford-futuredata/ARES
truera/trulens
Evaluation and Tracking for LLM Experiments
explodinggradients/ragas
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
SeanLee97/AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
eosphoros-ai/sqlgpt-parser
sqlgpt-parser is a Python implementation of an SQL parser that effectively converts SQL statements into Abstract Syntax Trees (AST). By leveraging AST tree comparisons between two SQL queries, it becomes possible to achieve robust evaluation of text-to-SQL models.
macbre/sql-metadata
Uses tokenized query returned by python-sqlparse and generates query metadata
X-LANCE/medical-dataset
[ACL 2023 Findings] CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset
Dataherald/dataherald
Interact with your SQL database, Natural Language to SQL using LLMs
ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
meta-llama/codellama
Inference code for CodeLlama models
samlhuillier/code-llama-fine-tune-notebook
Fine-tune Code Llama to generate SQL queries from text
salesforce/WikiSQL
A large annotated semantic parsing corpus for developing natural language interfaces.
eosphoros-ai/DB-GPT
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
hiyouga/ChatGLM-Efficient-Tuning
Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
CVI-SZU/Linly
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets
wenge-research/YAYI
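YAYI large language models: secure and reliable custom LLMs built for customers. A series of LLaMA 2 & BLOOM models trained on large-scale Chinese and English multi-domain instruction data, developed by the Wenge Research algorithm team. (Repo for YaYi Chinese LLMs based on LLaMA 2 & BLOOM)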
LlamaFamily/Llama-Chinese
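Llama Chinese community. Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.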
LinkSoul-AI/Chinese-Llama-2-7b
The open-source community's first downloadable and runnable Chinese LLaMA2 model!
defog-ai/sqlcoder
SoTA LLM for converting natural language questions to SQL queries
RUCKBReasoning/RESDSQL
The PyTorch implementation of RESDSQL (AAAI 2023).