leopoldwhite
Undergrad, Xi'an Jiaotong University. Natural language processing, knowledge graphs, social network analysis.
Xi'an Jiaotong University, Xi'an, China
leopoldwhite's Stars
happen2me/subgraph-retrieval-toolkit
SRTK: Retrieve semantic-relevant subgraphs from large-scale knowledge graphs
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
whr000001/Bot_and_Misinformation
Code for "How Do Social Bots Participate in Misinformation Spread? A Comprehensive Dataset and Analysis"
whr000001/TeST
Official code for paper: TeST: Temporal-Spatial Separated Transformer for Temporal Action Localization
RulinShao/retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
dki-lab/Pangu
Code for reproducing the ACL'23 paper: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
dki-lab/GrailQA
xhluca/bm25s
Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
marzenakrp/nocha
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
gastonstat/harry-potter-data
Data files of Harry Potter books
SunlifeV/CBLPRD-330k
China-Balanced-License-Plate-Recognition-Dataset-330k: A balanced dataset of 330,000 images featuring various types of Chinese license plates for recognition tasks, ideal for training and evaluating license plate recognition models.
OSU-NLP-Group/GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
zjunlp/LLMAgentPapers
Must-read Papers on LLM Agents.
karpathy/LLM101n
LLM101n: Let's build a Storyteller
OSU-NLP-Group/HippoRAG
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personalized PageRank.
leopoldwhite/KGQuiz
Official repository of "KGQUIZ: Evaluating the Generalization of Encoded Knowledge in Large Language Models". TheWebConf 2024.
KindXiaoming/pykan
Kolmogorov Arnold Networks
pytorch/torchtune
PyTorch native finetuning library
SafeAILab/RAIN
[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning
NUS-HPC-AI-Lab/InfoBatch
Lossless Training Speed Up by Unbiased Dynamic Data Pruning
datamllab/LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
google-research/bleurt
BLEURT is a metric for Natural Language Generation based on transfer learning.
kevinyaobytedance/llm_unlearn
LLM Unlearning
uclanlp/corefBias
To analyze and remove gender bias in coreference resolution systems
EmpathYang/ADEPT
Source code and data for ADEPT: A DEbiasing PrompT Framework (AAAI-23).
McGill-NLP/bias-bench
ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models.
nyu-mll/crows-pairs
This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models" (EMNLP 2020).
KlaraKrieg/GrepBiasIR
Information Retrieval Gender Bias Dataset
W4ngatang/sent-bias
Code and test data for "On Measuring Bias in Sentence Encoders", to appear at NAACL 2019.