bminixhofer's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
typst/typst
A new markup-based typesetting system that is powerful and easy to learn.
huggingface/candle
Minimalist ML framework for Rust
stas00/ml-engineering
Machine Learning Engineering Open Book
lucidrains/DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's text-to-image Transformer, in PyTorch
h2oai/h2o-llmstudio
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/
facebookresearch/ijepa
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive architecture."
google-research/t5x
adapter-hub/adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
uber/petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models on datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
google/jaxopt
Hardware accelerated, batchable and differentiable optimizers in JAX.
clab/fast_align
Simple, fast unsupervised word aligner
tatuylonen/wiktextract
Wiktionary dump file parser and multilingual data extractor
acl-org/aclpubcheck
Tools for checking ACL paper submissions
cohere-ai/cohere-python
Python Library for Accessing the Cohere API
EleutherAI/concept-erasure
Erasing concepts from neural representations with provable guarantees
HEmile/storchastic
Stochastic Automatic Differentiation library for PyTorch.
e-bug/volta
[TACL 2021] Code and data for the framework in "Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs"
cohere-ai/cohere-aws
google-research-datasets/QAmeleon
QAmeleon introduces synthetic multilingual QA data generated with PaLM, a 540B large language model. The dataset was produced by prompt-tuning PaLM with only five examples per language. Fine-tuning downstream QA models on this synthetic data improves accuracy compared with English-only and translation-based baselines.
malteos/clp-transfer
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
konstantinjdobler/focus
[EMNLP 2023] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"
tomhosking/hercules
Hercules: Attributable and Scalable Opinion Summarization (ACL 2023)
CPJKU/ScaLearn
ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale
yuvalpinter/unblend
Will it Unblend? (Findings of EMNLP 2020)
fdschmidt93/trident
Generic model training framework abolishing boilerplate
wswu/worcomal
Word compounding across languages
MiniXC/speech-collator
A collator for speech datasets with different batching strategies and attribute extraction.
MiniXC/vocex
Voice Frame-Level and Utterance-Level Attribute Extraction
carlosniquini/Wtp