Pinned Repositories
disco-pointer
Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span Selection
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
flash-linear-rnn
Implementations of various linear RNN layers in PyTorch and Triton
gated_linear_attention_layer
mamba-triton
pointer-net-for-nested
The official implementation of ACL2022 "Bottom-Up Constituency Parsing and Nested Named Entity Recognition with Pointer Networks"
second-order-neural-dmv
Source code of COLING2020 "Second-Order Unsupervised Neural Dependency Parsing"
span-based-dependency-parsing
Source code of ACL2022 "Headed-Span-Based Projective Dependency Parsing" and "Combining (Second-Order) Graph-Based and Headed-Span-Based Projective Dependency Parsing"
TN-LCFRS
Official Implementation of ACL2023: Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars
TN-PCFG
Source code of NAACL2021 "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and ACL2021 main conference "Neural Bilexicalized PCFG Induction"
sustcsonglin's Repositories
sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
sustcsonglin/flash-linear-rnn
Implementations of various linear RNN layers in PyTorch and Triton
sustcsonglin/mamba-triton
sustcsonglin/TN-PCFG
Source code of NAACL2021 "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and ACL2021 main conference "Neural Bilexicalized PCFG Induction"
sustcsonglin/gated_linear_attention_layer
sustcsonglin/disco-pointer
Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span Selection
sustcsonglin/FlagAttention
A collection of memory efficient attention operators implemented in the Triton language.
sustcsonglin/lit-gpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
sustcsonglin/cuda-playground
sustcsonglin/mamba.py
An efficient Mamba implementation in PyTorch and MLX.
sustcsonglin/nanokitchen
Parallel Associative Scan for Language Models
sustcsonglin/safari
Convolutions for Sequence Modeling
sustcsonglin/stk
sustcsonglin/streaming-llm
Efficient Streaming Language Models with Attention Sinks
sustcsonglin/sustcsonglin.github.io
:page_facing_up: Elegant & friendly homepage (bio, tech portfolio, resume, doc...) template with Markdown and VuePress
sustcsonglin/sustcsonglin_old.github.io
:page_facing_up: Elegant & friendly homepage (bio, tech portfolio, resume, doc...) template with Markdown and VuePress
sustcsonglin/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
sustcsonglin/transformers_ssm_copy
sustcsonglin/zoology
Understand and test language model architectures on synthetic tasks.
sustcsonglin/Academic-project-page-template
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
sustcsonglin/BeamTreeRecursiveCells
sustcsonglin/cutlass-kernels
sustcsonglin/hyena-dna
Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
sustcsonglin/m2
Monarch Mixer
sustcsonglin/mamba
sustcsonglin/S5
sustcsonglin/s5-pytorch
PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
sustcsonglin/SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
sustcsonglin/stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
sustcsonglin/state-spaces
Sequence Modeling with Structured State Spaces