Pinned Repositories
100_Tensor_exercises
100 PyTorch Tensor exercises (with answers)
2021-GAIIC-Task3-Share
Global AI Technology Innovation Competition, Track 3: semantic matching of short dialogue texts for the Xiaobu assistant
2022_GAIIC_Task1_1st
1st-place solution for the 2022 JD Global AI Technology Innovation Competition task on image-text matching of key e-commerce attributes
2022_GAIIC_Task2_5st
ark-nlp
A private NLP coding package that quickly implements SOTA solutions.
awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models
bert4torch
A PyTorch implementation modeled on bert4keras
PaddleNLP
Easy-to-use and fast NLP library with an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications.
PromptCLUE
PromptCLUE: the open-source multi-task model that supports the most Chinese prompt tasks
Top-AI-Conferences-Paper-with-Code
Top-Conferences-Paper-with-Code (ACL, EMNLP, NAACL, COLING, AAAI, IJCAI, NeurIPS, ICLR, etc.)
Tim-taoxq's Repositories
Tim-taoxq/AI-and-competition
Storage for code from AI projects and from data mining competitions
Tim-taoxq/Awesome-LLMs-Datasets
A summary of existing representative LLM text datasets.
Tim-taoxq/Baidu-Business-AI-Technology-Innovation-Competition-Track-2-Advertising-Image-Description-Generation
Baidu Business AI Technology Innovation Competition, Track 2: Advertising Image Description Generation. Rank 3 solution write-up
Tim-taoxq/codellm-data-preprocess-pipeline
Industry SOTA data processing pipeline for code LLM pre-training, fine-tuning, and DPO
Tim-taoxq/cutword
A simple, fast tool for word segmentation and named entity recognition
Tim-taoxq/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Tim-taoxq/EasyNLP
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
Tim-taoxq/HFUTCheaterCollection
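Hefei University of Technology. Submissions, reports, oversight, and inquiries by email: hfutcheater@proton.me | Blog: https://hfut-cheater.github.io | Keywords: Hefei University of Technology, Anhui, cheating, fabrication, embezzlement, paper plagiarism, bribery, cover-ups, rent-seeking, misappropriation of funds, organized fraud, anti-China Vietnamese international students, Nansha Islands, purchased competitions, collective cheating | Cheating hall of fame, list of administrators involved in cover-ups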
Tim-taoxq/KDD2024-WhoIsWho-Top3
KDD2024-WhoIsWho-Top3
Tim-taoxq/KddCup-2024-OAG-Challenge-1st-Solutions
Tim-taoxq/llm-action
This project shares the technical principles behind large language models and hands-on experience with them.
Tim-taoxq/LLM-Dojo
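Welcome to LLM-Dojo, an open-source place to learn about large language models. It uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓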
Tim-taoxq/LLM-zero2hero
Tim-taoxq/llm_illustrated
Learn large language models through illustrations
Tim-taoxq/lmms-finetune
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, qwen-vl, phi3-v etc.
Tim-taoxq/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Tim-taoxq/MINI_LLM
A repository for individuals to experiment with and reproduce the LLM pre-training process.
Tim-taoxq/minimind
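[LLM] Train a small GPT with only 26M parameters entirely from scratch in 3 hours; a GPU with as little as 2 GB of memory is enough for inference and training!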
Tim-taoxq/nlp-competitions-list-review
A review of the top solutions from all NLP competitions. NLP competitions only; continuously updated!
Tim-taoxq/PyTorch-Tutorial-2nd
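"PyTorch Practical Tutorial" (2nd edition). Whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to production engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.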
Tim-taoxq/QWen2-from_ground_up
Tim-taoxq/qwen2_seq_cls
A simple implementation of a text classification task using Qwen2ForSequenceClassification.
Tim-taoxq/RAG-competition
RAG competitions
Tim-taoxq/SFT-and-DPO
A detailed code demo of full-parameter supervised fine-tuning (SFT) and direct preference optimization (DPO).
Tim-taoxq/Steel-LLM
Train a Chinese LLM from scratch as an individual
Tim-taoxq/TinyStories
Pre-train a super-mini LLaMA 3 from scratch: a reproduction of TinyStories
Tim-taoxq/transformers-code
A hands-on, step-by-step course on Hugging Face Transformers. Course videos are updated in sync on Bilibili and YouTube.
Tim-taoxq/tree2retriever
Recursive Abstractive Processing for Tree-Organized Retrieval
Tim-taoxq/YAYI-UIE
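YAYI information extraction LLM: instruction-tuned on million-scale, manually constructed, high-quality information extraction data, and developed by the algorithm team at 中科闻歌. (Repo for YAYI Unified Information Extraction Model)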
Tim-taoxq/YAYI2
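YAYI 2 is a new-generation open-source large language model developed by 中科闻歌, pre-trained on more than 2 trillion tokens of high-quality, multilingual corpus. (Repo for YaYi 2 Chinese LLMs)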