Pinned Repositories
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Awesome_APIs
:octocat: A collection of APIs
bert4torch
A PyTorch implementation modeled on bert4keras
Chinese-Text-Classification-Pytorch
Chinese text classification with TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, and Transformer; based on PyTorch and ready to use out of the box.
Chinese-Word-Vectors
100+ pre-trained Chinese word vectors
ConSERT
Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
PaddleNLP
Easy-to-use NLP library with an awesome pre-trained model zoo, supporting a wide range of NLP tasks from research to industrial applications.
vector_search
Assorted vector search tools
kg-nlp's Repositories
kg-nlp/vector_search
Assorted vector search tools
kg-nlp/PaddleNLP
Easy-to-use NLP library with an awesome pre-trained model zoo, supporting a wide range of NLP tasks from research to industrial applications.
kg-nlp/bert4torch
A PyTorch implementation modeled on bert4keras
kg-nlp/Chinese-Word-Vectors
100+ pre-trained Chinese word vectors
kg-nlp/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
kg-nlp/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
kg-nlp/DeepLearning-500-questions
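500 Questions on Deep Learning: a question-and-answer treatment of frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas, intended to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any shortcomings. To be continued. For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan, 2018.06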
kg-nlp/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
kg-nlp/EasyNLP
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
kg-nlp/DB-GPT
Revolutionizing Database Interactions with Private LLM Technology
kg-nlp/DB-GPT-Hub
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance, especially in Text-to-SQL.
kg-nlp/FinGLM
kg-nlp/k8s_images
k8s image repository
kg-nlp/kserve
Serverless Inferencing on Kubernetes
kg-nlp/kubeflow_pytorch
kg-nlp/Match-Ignition
kg-nlp/MrDoc
MrDoc, an online document system developed in Python. It is suitable for individuals and small-to-medium teams to manage documents, wikis, knowledge bases, and notes.
kg-nlp/NLP-Loss-Pytorch
Implementations of several loss functions for imbalanced data, such as focal_loss, dice_loss, DSC Loss, and GHM Loss, among others
kg-nlp/public-apis
A collective list of free APIs
kg-nlp/pytorch-loss
label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful
kg-nlp/Scorecard-Bundle
A High-level Scorecard Modeling API | Everything for scorecard modeling in one place
kg-nlp/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
kg-nlp/sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
kg-nlp/Tabular-LLM
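This project collects open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), converts the raw data into an instruction-tuning format, and fine-tunes LLMs on it to strengthen their understanding of tabular data, with the ultimate goal of building large language models specialized for table intelligence tasks.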
kg-nlp/text2vec
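text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.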
kg-nlp/text_classification
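Text classification with rnn, lstm, gru, fasttext, textcnn, dpcnn, rnn-att, and lstm-att; compatible with huggingface/transformers, and supports using transformers as the word-embedding model followed by cnn, rnn, attention, and so on. Also includes a comparison of the models.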
kg-nlp/torchrec
Pytorch domain library for recommendation systems
kg-nlp/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
kg-nlp/vocab-coverage
Analysis of the Chinese cognitive ability of language models
kg-nlp/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow