Pinned Repositories
aimrun
Simple interface for integrating Aim into MLOps frameworks
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use. Meanwhile, we created a new branch to build a Tabular LLM. The goal is an LLM instruction fine-tuning research and usage platform that is easy for researchers to get started with; open-source contributors are welcome to open any meaningful pull request that integrates further LLM-related techniques into this repo.
bitlinear
BitLinear implementation
cair
CAIR rubric for privacy metrics
dolma
Data and tools for generating and inspecting OLMo pre-training data.
nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
OLMo
Modeling, training, eval, and inference code for OLMo
synthesizers
Meta-library for synthetic data generation
syntheval
Software for evaluating the quality of synthetic data compared with real data.
syntheval-model-benchmark-example
Research paper supplement and code example showing how to use SynthEval to run a model benchmark
schneiderkamplab's Repositories
schneiderkamplab/bitlinear
BitLinear implementation
schneiderkamplab/syntheval
Software for evaluating the quality of synthetic data compared with real data.
schneiderkamplab/synthesizers
Meta-library for synthetic data generation
schneiderkamplab/cair
CAIR rubric for privacy metrics
schneiderkamplab/nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
schneiderkamplab/aim
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
schneiderkamplab/aimrun
Simple interface for integrating Aim into MLOps frameworks
schneiderkamplab/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use. Meanwhile, we created a new branch to build a Tabular LLM. The goal is an LLM instruction fine-tuning research and usage platform that is easy for researchers to get started with; open-source contributors are welcome to open any meaningful pull request that integrates further LLM-related techniques into this repo.
schneiderkamplab/cramming
Cramming the training of a (BERT-type) language model into limited compute.
schneiderkamplab/detect-gpt
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
schneiderkamplab/dolma
Data and tools for generating and inspecting OLMo pre-training data.
schneiderkamplab/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
schneiderkamplab/OLMo
Modeling, training, eval, and inference code for OLMo
schneiderkamplab/scaling-sdg
Scaling study of Synthetic Data Generation models and evaluations
schneiderkamplab/snopt
Sorting Network OPTimizer
schneiderkamplab/subprocessing
A subprocess-based reimplementation of parts of Python's multiprocessing library
schneiderkamplab/syntheval-model-benchmark-example
Research paper supplement and code example showing how to use SynthEval to run a model benchmark
schneiderkamplab/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
schneiderkamplab/youtube-insights
Simple script for downloading YouTube comments without using the YouTube API
schneiderkamplab/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
schneiderkamplab/interviewbot
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and Llama models.
schneiderkamplab/organoids
Automatic segmentation and analysis of organoids
schneiderkamplab/schedule_free
Schedule-Free Optimization in PyTorch