Pinned Repositories
entmax
The entmax mapping and its loss, a family of sparse softmax alternatives.
infinite-former
lp-sparsemap
LP-SparseMAP: Differentiable sparse structured prediction in coarse factor graphs
OpenNMT-APE
scheduled-sampling-transformers
Code for the paper "Scheduled Sampling for Transformers"
sparse-marginalization-lvm
Official PyTorch (Lightning) implementation of the NeurIPS 2020 paper "Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity".
tower-eval
triton-tutorial
From a+b to sparsemax(QK^T)V in Triton!
tutorial
Web page for our tutorial on latent structure for NLP
UA_COMET
Repository for "Uncertainty-Aware Machine Translation Evaluation", accepted to Findings of EMNLP 2021.
DeepSPIN's Repositories
deep-spin/entmax
The entmax mapping and its loss, a family of sparse softmax alternatives.
deep-spin/infinite-former
deep-spin/tower-eval
deep-spin/hallucinations-in-nmt
deep-spin/Infinite-Video
∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
deep-spin/spectra-rationalization
Repository for SPECTRA: Sparse Structured Text Rationalization, accepted at EMNLP 2021 main conference.
deep-spin/quest-decoding
A package for sampling from Gibbs distributions during inference with LLMs.
deep-spin/sigmorphon-seq2seq
DeepSPIN's submission to SIGMORPHON 2020
deep-spin/latim
deep-spin/reranking-laws
deep-spin/SSHN
Sparse and Structured Hopfield Networks
deep-spin/doce
This is the repo for DOCE
deep-spin/mt-pref-alignment
deep-spin/HFYN
Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval
deep-spin/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
deep-spin/sparse-activations-cp
Repository containing code to reproduce results of the paper "Sparse Activations as Conformal Predictors".
deep-spin/ssm-mt
deep-spin/towerllm-alignment
Code for alignment for the towerllm project.
deep-spin/Megatron-LM-pretrain
Ongoing research training transformer models at scale
deep-spin/adasplash
AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention)
deep-spin/axolotl
Go ahead and axolotl questions
deep-spin/CHM-Net
Modern Hopfield Networks with Continuous-Time Memories
deep-spin/COMET
A Neural Framework for MT Evaluation
deep-spin/fy-vi
deep-spin/lm-evaluation-harness
A fork of lm-eval-harness.
deep-spin/lmms-eval
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
deep-spin/nanotron
Minimalistic large language model 3D-parallelism training
deep-spin/robust-mt
deep-spin/translate-smart
deep-spin/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks