chinaphilip's Stars
cocacola-lab/ChatIE
The online demo is temporarily unavailable because we cannot afford the API key. You can clone the repo and run it locally. Note: we set a default OpenAI key; if the keys exceed their plan and become invalid, please let us know. Response speed depends on OpenAI (sometimes the official API is crowded and slow).
universal-ner/universal-ner
ljynlp/W2NER
Source code for AAAI 2022 paper: Unified Named Entity Recognition as Word-Word Relation Classification
thunlp/Few-NERD
Code and data of ACL 2021 paper "Few-NERD: A Few-shot Named Entity Recognition Dataset"
tomaarsen/SpanMarkerNER
SpanMarker for Named Entity Recognition
princeton-nlp/PURE
[NAACL 2021] A Frustratingly Easy Approach for Entity and Relation Extraction https://arxiv.org/abs/2010.12812
thunlp/PL-Marker
Source code for "Packed Levitated Marker for Entity and Relation Extraction"
facebookresearch/nougat
Implementation of Nougat: Neural Optical Understanding for Academic Documents
OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
wjn1996/scrapy_for_zh_wiki
A Scrapy-based crawler for Chinese Wikipedia using a hierarchical priority-queue strategy, with automatic extraction of structured and semi-structured data
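The crawler above orders its frontier with a hierarchical priority queue. A minimal, framework-free sketch of that idea using only Python's stdlib `heapq` (all names here are illustrative, not the repo's API):

```python
import heapq

class PriorityFrontier:
    """Toy crawl frontier: pop shallower (higher-priority) URLs first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker so equal-depth URLs keep insertion order

    def push(self, url, depth):
        if url in self._seen:  # skip already-queued URLs
            return
        self._seen.add(url)
        heapq.heappush(self._heap, (depth, self._counter, url))
        self._counter += 1

    def pop(self):
        depth, _, url = heapq.heappop(self._heap)
        return url, depth

frontier = PriorityFrontier()
frontier.push("https://zh.wikipedia.org/wiki/A", depth=1)
frontier.push("https://zh.wikipedia.org/wiki/B", depth=0)
frontier.push("https://zh.wikipedia.org/wiki/A", depth=2)  # duplicate, ignored
print(frontier.pop()[0])  # → https://zh.wikipedia.org/wiki/B
```

In a real Scrapy project the same effect is achieved by setting a priority on each `Request`; this sketch only shows the queueing discipline itself.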
BIT-ENGD/baidu_baike
cbdb-project/BaiduBaikeSpider
Get data from Baidu Baike. Supports retrieving multiple synonyms for the same entry.
JinJackson/Baike_spider
Spider for Baidu Baike
nam685/cosplade
CoSPLADE: Contextualizing SPLADE for Conversational Information Retrieval
jungomi/math-formula-recognition
Math formula recognition (Images to LaTeX strings)
MaliParag/ScanSSD
Scanning Single Shot Detector for Math in Document Images
gipplab/pdf-benchmark
A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents
odelliab/HowSumm
Large-scale query-focused multi-document summarization dataset
ddaedalus/tres
Official code implementation of "Tree-based Focused Web Crawling with Reinforcement Learning" and the TRES framework
hiyouga/ChatGLM-Efficient-Tuning
Fine-tuning ChatGLM-6B with PEFT (efficient ChatGLM fine-tuning based on PEFT)
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Tencent/TurboTransformers
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
bytedance/lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
kyegomez/LongNet
Implementation of plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
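LongNet's central idea is dilated attention: the sequence is split into fixed-length segments, and within each segment only every r-th position participates, which sparsifies the attention pattern. A minimal sketch of just that index pattern (illustrative names, not the repo's API; the real method mixes several segment/dilation configurations):

```python
def dilated_segment_indices(seq_len, segment_len, dilation):
    """For each non-overlapping segment, keep every `dilation`-th position.

    Tokens attend only to the kept positions within their own segment,
    reducing cost versus full quadratic attention.
    """
    segments = []
    for start in range(0, seq_len, segment_len):
        positions = list(range(start, min(start + segment_len, seq_len)))
        segments.append(positions[::dilation])
    return segments

print(dilated_segment_indices(8, 4, 2))  # → [[0, 2], [4, 6]]
```

The paper combines multiple (segment_len, dilation) pairs so that nearby tokens get dense attention and distant tokens get sparser attention; this sketch shows a single pair.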
AGI-Edgerunners/LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
lucidrains/toolformer-pytorch
Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI
THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM (open-source bilingual dialogue language model)
ydli-ai/CSL
[COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset
lyuchenyang/Macaw-LLM
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration