Evolving notes on machine learning and mathematical techniques in and for Natural Language Processing.
Update (2022.10.2): I am not yet knowledgeable enough to decide what this booklet should contain, so for now I may include specific research papers that each require some aspect of math and ML expertise, as a way of enumerating how ML and math extend into NLP.
density/distribution estimation over structures, generative story
- Minimax and Neyman–Pearson Meta-Learning for Outlier Languages,
Neyman-Pearson
- Annealing Techniques for Unsupervised Statistical Language Learning,
deterministic annealing
- Probability Divergences and Generative Models, Arthur Gretton's talk
mlss2021
variational inference, normalizing flow
- Learning Opinion Summarizers by Selecting Informative Reviews, Ivan Titov's group.
REINFORCE
amortized variational inference
- Sequence-to-Sequence Learning with Latent Neural Grammars, Yoon Kim.
likelihood bounding
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction, Ivan Titov's group.
marginalization
- Discrete Latent Structure in Neural Networks, Jan. 18 2023.
booklet
in structure prediction.
- Determinantal Beam Search, Ryan Cotterell's group.
beam search
- Mode recovery in neural autoregressive sequence modeling, Kyunghyun Cho's group.
sampling
- Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
- Searching for More Efficient Dynamic Programs, Jason Eisner et al.
dp
- Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs, Dec. 10 2021.
nips2021
- What Context Features Can Transformer Language Models Use?, Jacob Andreas's group.
V-information
- Conditional probing: measuring usable information beyond a baseline, Percy Liang's group.
V-information
- On the Complexity and Typology of Inflectional Morphological Systems, Ryan Cotterell.
complexity measure
- On Homophony and Renyi Entropy, Ryan Cotterell et al.
entropy
- The Low-Dimensional Linear Geometry of Contextualized Word Representations, Jacob Andreas's group.
- On Isotropy Calibration of Transformers, ETH's group.
submodular, Gumbel-Softmax, optimization for structured prediction
- Towards Dynamic Computation Graphs via Sparse Latent Structure,
emnlp2018
marginalize.
- Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder,
iclr2019
- Backpropagating through Structured Argmax using a SPIGOT,
acl2018
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions, Oct. 27 2021.
nips2021
code
- Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions, Oct. 22 2021.
- Understanding and Testing Generalization of Deep Networks on Out-of-Distribution Data,
nips2021
code
- Storchastic: A Framework for General Stochastic Automatic Differentiation,
nips2021
- Scaling Structured Inference with Randomization, Dec. 7 2021.
- Learning with Latent Structures in Natural Language Processing: A Survey, Jan. 3 2022.
- Gradient Estimation with Discrete Stein Operators, Feb. 19 2022.