Pinned Repositories
CNMWP
NLPCC2023 Shared Task 3
code-of-acm
ACM Programming Contest Training Code.
comet-commonsense
Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction" https://arxiv.org/abs/1906.05317
Graph2Tree
Code for Graph-to-Tree Learning for Solving Math Word Problems (ACL 2020)
LLMDataPapers
Must-read papers and related blogs on data-related methods for LLMs.
ScaleBiO
This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
TAGCOS
This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
TSN-MD
Code for Teacher-Student Networks with Multiple Decoders for Solving Math Word Problem (IJCAI 2020).
LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
LMFlowBenchmark
2003pro's Repositories
2003pro/Graph2Tree
Code for Graph-to-Tree Learning for Solving Math Word Problems (ACL 2020)
2003pro/ScaleBiO
This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
2003pro/TSN-MD
Code for Teacher-Student Networks with Multiple Decoders for Solving Math Word Problem (IJCAI 2020).
2003pro/TAGCOS
This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
2003pro/CNMWP
NLPCC2023 Shared Task 3
2003pro/LLMDataPapers
Must-read papers and related blogs on data-related methods for LLMs.
2003pro/bridgecoder
This repository contains the official implementation of BridgeCoder.
2003pro/comet-commonsense
Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction" https://arxiv.org/abs/1906.05317
2003pro/2003pro.github.io
2003pro/AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
2003pro/Awesome-Code-Intelligence
Neural Code Intelligence Survey 2024; Reading lists and resources
2003pro/awesome-semantic-parsing
Reading list for research topics in semantic parsing
2003pro/CodeXGLUE
CodeXGLUE
2003pro/comp5111-assignment
2003pro/COMP5111-Spring2022-Student-Assignments
Assignment materials for HKUST COMP5111 (Spring 2022).
2003pro/doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
2003pro/dual-mfa-vqa
Co-attending Regions and Detections with Multi-modal Multiplicative Embedding for VQA.
2003pro/extreme-bert
Customize ExtremeBERT to Rich-Number Text Corpus
2003pro/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
2003pro/handong1587.github.io
2003pro/hhexiy.github.io
2003pro/incubator-singa
Mirror of Apache Singa (Incubating)
2003pro/JuICe
Code for generating the JuICe dataset.
2003pro/NaCGEC
2003pro/nlp_bibs
2003pro/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
2003pro/research_bibs
2003pro/sensitive-stop-words
Lexicon of sensitive words and stop words commonly used on the Internet
2003pro/share_slides
2003pro/Zero-shot-knowledge-graph-relational-learning
Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs