Pinned Repositories
contextual-repr-analysis
A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).
cython-crash-course
A quick intro to Cython for Python users who don't know C
evaluating-verifiability-in-generative-search-engines
Companion repo for "Evaluating Verifiability in Generative Search Engines".
flatten_gigaword
Dump the text of the Gigaword dataset into a single file, for use with language modeling (and other!) toolkits
inoculation-by-finetuning
Code for the paper "Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets", to be presented at NAACL 2019.
lexical-semantic-recognition
lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
paraphrase-id-tensorflow
Various models and code (Manhattan LSTM, Siamese LSTM + Matching Layer, BiMPM) for the paraphrase identification task, specifically with the Quora Question Pairs dataset.
pytorch-manylinux-binaries
pytorch-paper-classifier
A simple model for classifying papers by academic venue (AI/ML/ACL), given a title and abstract. Bare-metal PyTorch port of https://github.com/allenai/allennlp-as-a-library-example .
nelson-liu's Repositories
nelson-liu/lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
nelson-liu/pytorch-manylinux-binaries
nelson-liu/evaluating-verifiability-in-generative-search-engines
Companion repo for "Evaluating Verifiability in Generative Search Engines".
nelson-liu/lexical-semantic-recognition
nelson-liu/word2color
Given a description of a color, return its closest standard HTML4 color.
nelson-liu/allennlp
A natural language processing toolkit using state-of-the-art deep learning models.
nelson-liu/Adv360-Pro-ZMK
Production repository for the all-new Advantage360 Professional using ZMK engine
nelson-liu/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
nelson-liu/chatnoir-resiliparse
A robust web archive analytics toolkit
nelson-liu/codalab-worksheets
A collaborative platform for reproducible research (web interface and CLI).
nelson-liu/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
nelson-liu/dom-distiller
Distills the DOM
nelson-liu/downshift
🏎 A set of primitives to build simple, flexible, WAI-ARIA compliant React autocomplete, combobox or select dropdown components.
nelson-liu/editdistance-feedstock
A conda-smithy repository for editdistance.
nelson-liu/galai
Model API for GALACTICA
nelson-liu/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
nelson-liu/HolisticTraceAnalysis
A library to analyze PyTorch traces.
nelson-liu/llama-recipes
Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Llama2 for WhatsApp & Messenger.
nelson-liu/opentelemetry-python
OpenTelemetry Python API and SDK
nelson-liu/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
nelson-liu/react-hash-link
Painless hash link routing for React applications.
nelson-liu/ReadabiliPy
A simple HTML content extractor in Python. Can be run as a wrapper for Mozilla's Readability.js package or in pure-python mode.
nelson-liu/simple-wikidata-db
A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner.
nelson-liu/SummEval
Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper
nelson-liu/thefuzz
Fuzzy String Matching in Python
nelson-liu/tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
nelson-liu/toma
Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory
nelson-liu/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
nelson-liu/url-change-event
A wrapper event that listens to and controls URL changes
nelson-liu/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs