Exploiting Similarities among Languages for Machine Translation
Word vectors; a bilingual-translation baseline

1. Learns the mapping from language A to language B with a simple linear matrix multiplication (see the sketch below). 2. Shows how to construct an evaluation set and control baselines, e.g. controls built with edit distance or word co-occurrence.
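A minimal sketch of the linear-mapping idea, with toy stand-in matrices `X` (source-word vectors) and `Y` (vectors of their translations); the paper trains W by gradient descent, while least squares solves the same objective in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, d = 5000, 300
X = rng.normal(size=(n_pairs, d))   # source-language embeddings (stand-in)
Y = rng.normal(size=(n_pairs, d))   # embeddings of their translations (stand-in)

# Closed-form solution of min_W ||X W - Y||_F^2
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def translate(x_vec, target_matrix, k=5):
    """Map a source vector through W and return the k nearest target rows."""
    z = x_vec @ W
    sims = target_matrix @ z / (
        np.linalg.norm(target_matrix, axis=1) * np.linalg.norm(z) + 1e-9)
    return np.argsort(-sims)[:k]
```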
Word Translation Without Parallel Data |
Multilingual embeddings; adversarial learning
Code link
Learns the language-A-to-language-B mapping with adversarial training (sketched below).
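A rough sketch of how such an adversarial mapping can be trained (MUSE-style); the batch tensors, layer sizes, and learning rates here are illustrative assumptions, not the authors' configuration:

```python
import torch
import torch.nn as nn

d = 300
W = nn.Linear(d, d, bias=False)                      # the mapping to learn
disc = nn.Sequential(nn.Linear(d, 512), nn.ReLU(),
                     nn.Linear(512, 1))              # language discriminator
opt_w = torch.optim.SGD(W.parameters(), lr=0.1)
opt_d = torch.optim.SGD(disc.parameters(), lr=0.1)
bce = nn.BCEWithLogitsLoss()

def step(src_batch, tgt_batch):
    mapped = W(src_batch)
    # 1) discriminator learns to tell mapped-source (1) from target (0)
    d_loss = bce(disc(mapped.detach()), torch.ones(len(mapped), 1)) \
           + bce(disc(tgt_batch), torch.zeros(len(tgt_batch), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) mapping learns to fool the discriminator
    g_loss = bce(disc(W(src_batch)), torch.zeros(len(src_batch), 1))
    opt_w.zero_grad(); g_loss.backward(); opt_w.step()
```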
Transfer Learning for Deep Sentiment Analysis |
Sentiment-aware embeddings; transfer learning

Loss-function design: adds a regularizer that makes the learned network perform well on words with known sentiment labels; the evaluation methodology is worth borrowing.
Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation |
Normalized word embeddings and an orthogonal transform improve bilingual word-translation performance

Techniques that improve training efficiency and stability (normalization + a regularization constraint); easy to implement in practice (see the sketch below).
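A sketch of the two tricks on toy matrices: length-normalize the embeddings, then constrain the mapping to be orthogonal. The closed-form orthogonal Procrustes solution via SVD is used here as one standard way to obtain such a transform (the paper's own training procedure may differ):

```python
import numpy as np

def normalize(M):
    return M / np.linalg.norm(M, axis=1, keepdims=True)

X = normalize(np.random.randn(5000, 300))   # source vectors of a seed dictionary
Y = normalize(np.random.randn(5000, 300))   # aligned target vectors

U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt                                   # orthogonal by construction
assert np.allclose(W.T @ W, np.eye(300), atol=1e-6)
```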
Training Neural Word Embeddings For Transfer Learning And Translation |
Embeddings for transfer learning
|
|
Wiktionary-Based Word Embeddings |
Wiktionary; multilingual global embeddings

Models word-word relations using a weighted inner product.
Adversarial Network Embedding |
Adversarial network embedding
|
|
Interpretable Adversarial Perturbation in Input Embedding Space for Text |
How to generate adversarial perturbations; adversarial examples for text

Adds constraints to perturbation generation so that perturbations move only along directions of actual, meaningful words rather than arbitrary random directions (stronger constraints, e.g. a language model, could be added later).
Enriching Word Vectors with Subword Information |
Character n-grams; morphology-aware word vectors

Models words with character-level n-gram features, yielding word vectors that capture word morphology (sketched below).
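A minimal sketch of the subword idea: a word vector is the sum of the vectors of its character n-grams (with `<` `>` boundary markers). The hash-bucket count, dimension, and n-gram range are illustrative assumptions:

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

n_buckets, dim = 100_000, 50
ngram_table = np.random.randn(n_buckets, dim) * 0.01   # stand-in for trained vectors

def word_vector(word):
    idx = [hash(g) % n_buckets for g in char_ngrams(word)]
    return ngram_table[idx].sum(axis=0)

# Out-of-vocabulary words still get a vector from their subwords:
v = word_vector("unseenword")
```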
Linguistic Regularities in Sparse and Explicit Word Representations |
|
|
nothing special |
Advances in Pre-Training Distributed Word Representations |
Tricks for optimizing word-vector training

Word frequencies follow a Zipf distribution, so the discard probability for very frequent words must be raised (see the worked example below); attention-like weighted context embeddings; a mutual-information criterion can select a small, highly informative subset from the combinatorially exploding set of n-grams.
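The discard rule the note refers to is the word2vec/fastText subsampling formula, p(w) = 1 - sqrt(t / f(w)), where f(w) is the word's relative frequency and t a small threshold; a worked example:

```python
import math

def discard_prob(count, total_count, t=1e-5):
    f = count / total_count
    return max(0.0, 1.0 - math.sqrt(t / f))

# A very frequent word is discarded almost always; a rare word is always kept.
print(discard_prob(1_000_000, 20_000_000))  # ≈ 0.986
print(discard_prob(5, 20_000_000))          # 0.0
```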
Learning to Compose Words into Sentences with Reinforcement Learning |
Tree-structured representations; reinforcement learning
|
|
Recent Trends in Deep Learning Based Natural Language Processing |
State-of-the-art methods across multiple NLP tasks

Surveys the latest progress on many NLP tasks; very valuable as a reference.
Invariant Variation Problems |
Classic invariant analysis
Other links
An all-time classic (Noether); see https://en.wikipedia.org/wiki/Noether%27s_theorem
Lexicon infused phrase embeddings for named entity resolution |
Lexicons; phrase embeddings

Modifies skip-gram: besides predicting the surrounding context, it also predicts the contexts associated with the word in a lexicon.
Improved Word and Symbol Embedding for Part-of-Speech Tagging |
Practical tricks for using word vectors

Uses byte-pair encoding (see the sketch below).
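A minimal byte-pair-encoding sketch on a toy word list (real BPE operates on a frequency-weighted vocabulary): repeatedly merge the most frequent adjacent symbol pair.

```python
from collections import Counter

def bpe_merges(words, num_merges=10):
    seqs = [list(w) + ["</w>"] for w in words]
    merges = []
    for _ in range(num_merges):
        pairs = Counter((s[i], s[i + 1]) for s in seqs for i in range(len(s) - 1))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append((a, b))
        for s in seqs:                      # apply the merge everywhere
            i = 0
            while i < len(s) - 1:
                if s[i] == a and s[i + 1] == b:
                    s[i:i + 2] = [a + b]
                else:
                    i += 1
    return merges, seqs

merges, segmented = bpe_merges(["low", "lower", "lowest", "newer", "wider"])
```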
Unsupervised POS Induction with Word Embeddings |
HMM; multivariate Gaussians

Shrinking the skip-gram window size is better for capturing syntactic information; assumes the word vectors belonging to a given tag follow a multivariate Gaussian (sketched below).
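A sketch of that Gaussian-emission assumption: fit one multivariate Gaussian to the (toy, stand-in) vectors of words sharing a tag and use its density as the HMM emission probability:

```python
import numpy as np
from scipy.stats import multivariate_normal

noun_vecs = np.random.randn(200, 50) + 1.0              # stand-in for NOUN embeddings

mu = noun_vecs.mean(axis=0)
cov = np.cov(noun_vecs, rowvar=False) + 1e-3 * np.eye(50)  # regularized covariance

# Emission log-probability of a word vector under the NOUN state:
log_p = multivariate_normal(mean=mu, cov=cov).logpdf(np.random.randn(50))
```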
Improving the Accuracy of Pre-trained Word Embeddings for Sentiment Analysis |
Improves word vectors by simple concatenation

Simply concatenates word2vec, GloVe, pos2vec, and lexicon2vec vectors; the drawback is the need for external supervised data.
Part-Of-Speech Tag Embedding for Modeling Sentences and Documents |
POS Embedding |
|
nothing special |
Diagnosing and Enhancing VAE Models |
Accepted at ICLR 2019
|
|
Using Generative Adversarial Networks for Distant-Supervision Relation Extraction

Other links
Uses an adversarial network to denoise samples. The generator and discriminator have opposite objectives: the generator predicts how clean each sample is, its predicted labels are flipped and fed to the discriminator for training, and training runs until the discriminator's performance has dropped as far as possible (initially the discriminator shares the generator's objective).
Phrase-Based & Neural Unsupervised Machine Translation |
Unsupervised machine translation
|
|
Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections
Unsupervised POS tagging
|
|
Distinguishing Antonyms and Synonyms in a Pattern-based Neural Network |
Supervised learning of word-pair relations

Uses parse-tree features to train antonym word pairs with supervision; the core assumption is that within one sentence, antonym pairs co-occur more often than synonym pairs.
Integrating Distributional Lexical Contrast into Word Embeddings for Antonym–Synonym Distinction |
Trains embeddings that can distinguish synonyms from antonyms
Code link
Two methods: (1) add a synonym/antonym regularizer to the ordinary word-vector objective; (2) re-weight the individual embedding features, boosting the features that discriminate antonyms.
Evaluating semantic relations in neural word embeddings with biomedical and general domain knowledge bases |
Evaluates the semantic and syntactic relations captured by word embeddings
Code link
Evaluates three kinds of word embeddings on semantic tasks; identifies eight types of relations that embeddings capture.
Refining Word Embeddings for Sentiment Analysis |
2018; supervised refinement using word-polarity information

Takes each word's top-k nearest vectors, re-ranks those neighbours by their polarity scores, and feeds the effect of the re-ranking back into the original vector (sketched below).
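A hedged sketch of that refinement loop; `emb` (embedding matrix) and `polarity` (lexicon scores as a NumPy array) are assumed inputs, and the neighbour-weighting scheme is an illustrative choice, not the paper's exact formula:

```python
import numpy as np

def refine(emb, polarity, k=10, beta=0.1, iters=5):
    E = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    for _ in range(iters):
        sims = E @ E.T
        np.fill_diagonal(sims, -np.inf)
        for i in range(len(E)):
            nn = np.argsort(-sims[i])[:k]            # top-k nearest neighbours
            # rerank: closer polarity => larger weight
            w = 1.0 / (1.0 + np.abs(polarity[nn] - polarity[i]))
            w /= w.sum()
            E[i] = (1 - beta) * E[i] + beta * (w[:, None] * E[nn]).sum(axis=0)
    return E
```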
Retrofitting Word Vectors to Semantic Lexicons |
2014; refines word vectors with semantic information from lexicons
Code link; code link 2; code link 3
Essentially the same idea as Refining Word Embeddings for Sentiment Analysis above (the update rule is sketched below).
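The retrofitting update of Faruqui et al. (2014) is simple: each vector becomes a convex combination of its original embedding and its lexicon neighbours, q_i = (alpha * q̂_i + beta * Σ_j q_j) / (alpha + beta * |N(i)|). A minimal sketch with an assumed `lexicon` dict mapping word indices to neighbour indices (uniform alpha/beta):

```python
import numpy as np

def retrofit(emb, lexicon, alpha=1.0, beta=1.0, iters=10):
    Q = emb.copy()
    for _ in range(iters):
        for i, nbrs in lexicon.items():
            if not nbrs:
                continue
            Q[i] = (alpha * emb[i] + beta * Q[nbrs].sum(axis=0)) \
                   / (alpha + beta * len(nbrs))
    return Q
```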
Refining Pretrained Word Embeddings Using Layer-wise Relevance Propagation |
2018 |
Other link
Refines word vectors by back-propagating feature relevance scores layer by layer (layer-wise relevance propagation).
SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors |
2018 |
|
(1) Builds unsupervised features from PMI and derives unsupervised relation vectors from them; (2) denoises with an autoencoder; (3) a nice idea, but the gains appear modest.
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers |
2018 |
|
|
Adversarial Training for Relation Extraction |
2017; adversarial training
Code link
Uses adversarial examples to make the classifier more robust.
Adversarial Feature Matching for Text Generation |
2017; text generation

Adds a Maximum Mean Discrepancy term to the standard GAN loss: text is mapped into a feature space and similarity is measured there with RKHS techniques (see the sketch below).
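A sketch of an MMD² term with an RBF kernel, the usual concrete choice for the RKHS similarity; the feature tensors here are random stand-ins for encoded real/generated sentences:

```python
import torch

def rbf(a, b, sigma=1.0):
    d2 = torch.cdist(a, b) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]  (biased estimator)
    return rbf(x, x, sigma).mean() + rbf(y, y, sigma).mean() \
         - 2 * rbf(x, y, sigma).mean()

real_feat, fake_feat = torch.randn(64, 128), torch.randn(64, 128)
loss_mmd = mmd2(real_feat, fake_feat)   # added to the generator loss
```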
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning |
2018 |
|
Rather disorganized writing.
Jointly Learning Word Embeddings and Latent Topics |
2017; jointly trains topic-specific embeddings

Borrows the idea behind LDA: adds a topic factor to the standard skip-gram model and optimizes iteratively with EM.
Distilled Wasserstein Learning for Word Embedding and Topic Modeling |
2018 |
|
|
Multi-Task Label Embedding for Text Classification |
2018 |
|
Multi-task learning; label embeddings.
Factors Influencing the Surprising Instability of Word Embeddings |
2018 |
|
|
The Interplay of Semantics and Morphology in Word Embedding |
2017 |
Code links 1 and 2
Word similarity splits into semantic similarity and morphological similarity; using lemmas increases semantic similarity but hurts morphological similarity.
Joint Embedding of Words and Labels for Text Classification |
2018; supervised
Code link
Embeds labels into the word-vector space and builds attention weights from the label-word similarity matrix (sketched below).
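A minimal sketch of that label-word attention, assuming words and labels already live in one embedding space; max-pooling over labels is an illustrative choice for collapsing the compatibility matrix:

```python
import torch
import torch.nn.functional as F

def label_attention(words, labels):
    # words: (seq_len, d), labels: (n_labels, d), same embedding space
    G = F.normalize(words, dim=1) @ F.normalize(labels, dim=1).t()  # (seq, n_labels)
    scores = G.max(dim=1).values             # max-pool over labels
    attn = torch.softmax(scores, dim=0)      # per-word attention weights
    return (attn[:, None] * words).sum(dim=0)  # attended document vector

doc_vec = label_attention(torch.randn(30, 300), torch.randn(5, 300))
```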
Domain Separation Networks |
2016 |
|
|
RAND-WALK: A Latent Variable Model Approach to Word Embeddings |
2016 |
Code link
|
Linear Algebraic Structure of Word Senses, with Applications to Polysemy |
2016 |
Code link
|
Querying Word Embeddings for Similarity and Relatedness |
2018 |
|
|
A Rank-Based Similarity Metric for Word Embeddings |
2018 |
|
|
Improving Word Embeddings with Convolutional Feature Learning and Subword Information |
2017 |
|
|
Computing Text Similarity using Tree Edit Distance |
2015 |
|
|
Normalisation of Historical Text Using Context-Sensitive Weighted Levenshtein Distance and Compound Splitting |
2013 |
|
|
Explicit Retrofitting of Distributional Word Vectors |
2018 |
|
Fits a neural network mapping into a new metric space in which synonym/antonym distance constraints are satisfied while the topology of the original space is preserved; also includes a data-augmentation method worth borrowing.
Unsupervised Learning of Style-sensitive Word Vectors |
2018; a CBOW variant that captures stylistic similarity

To capture semantic and syntactic similarities, the context predicts the target; to capture stylistic similarity, words outside the context window predict the target.
Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization |
2018; adversarial learning; word-vector fine-tuning
Code link
First refines word vectors with an external lexicon, obtaining refined vectors for a subset of words; then uses those pairs as a training set and learns the mapping from original to refined vectors with a GAN.
Using pseudo-senses for improving the extraction of synonyms from word embeddings |
2018 ACL

The novelty is in how a sentence is split into context-target pairs: on top of the usual scheme, it adds a description of the target's context over the whole sentence. Of the two refinement approaches, paragram and retrofitting, this work uses paragram.
Auto-Encoding Dictionary Definitions into Consistent Word Embeddings |
2018 EMNLP

Trains word vectors from dictionary definitions alone, which captures word similarity better; dictionary definitions are also worth folding in when refining word vectors.
Gromov-Wasserstein Alignment of Word Embedding Spaces |
2018 EMNLP
|
|
Word Relation Autoencoder for Unseen Hypernym Extraction Using Word Embeddings |
2018 EMNLP

To avoid overfitting the training samples (i.e. doing well only on pairs seen during training), switches from modeling (w1, w2) to modeling (w1, w2-w1) and uses an autoencoder.
Learning Gender-Neutral Word Embeddings |
2018 EMNLP
|
|
Specialising Word Vectors for Lexical Entailment |
2018 NAACL

Trains the mapping using externally supplied relation word pairs.
DeepAlignment: Unsupervised Ontology Matching with Refined Word Vectors |
2018 NAACL
|
|
Enhanced Word Representations for Bridging Anaphora Resolution |
2018 NAACL
|
|
Relation Induction in Word Embeddings Revisited |
2018 COLING
|
|
Encoding Sentiment Information into Word Vectors for Sentiment Analysis |
2018 COLING
|
|
Word Sense Disambiguation Based on Word Similarity Calculation Using Word Vector Representation from a Knowledge-based Graph |
2018 COLING
|
|
Learning Hierarchical Similarity Metrics |
2012 |
|
|
Poincaré Embeddings for Learning Hierarchical Representations |
2017 |
|
|
PME: Projected Metric Embedding on Heterogeneous Networks for Link Prediction |
2018; embeddings for heterogeneous network structure

For heterogeneous relations, first applies a relation-specific projection to the raw vectors; negative sampling is bidirectional (either node of an edge can be corrupted). See the sketch below.
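A sketch of the relation-specific projection and the two corruption directions; the shapes, the matrix `M_r`, and the omitted margin-based loss are illustrative assumptions, not the paper's exact formulation:

```python
import torch

d, d_r = 128, 64
M_r = torch.randn(d_r, d) * 0.01          # projection for relation r

def pme_distance(u, v):
    # distance measured in the relation-specific projected space
    return torch.norm(M_r @ u - M_r @ v, p=2)

# Bidirectional negative sampling: corrupt either endpoint of (u, r, v)
u, v, v_neg, u_neg = (torch.randn(d) for _ in range(4))
margin_terms = (pme_distance(u, v) - pme_distance(u, v_neg),
                pme_distance(u, v) - pme_distance(u_neg, v))
```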
Scalable Graph Embedding for Asymmetric Proximity |
2017; graph embeddings for asymmetric proximity
|
|
Hierarchical Embeddings for Hypernymy Detection and Directionality |
2017 ACL
|
|
Imposing Hard Constraints on Deep Networks: Promises and Limitations |
2017 |
|
|
Large-Scale Embedding Learning in Heterogeneous Event Data |
2018 AAAI
|
|
Relation Structure-Aware Heterogeneous Information Network Embedding |
2019 AAAI

Reduces all heterogeneous relations to two kinds, affiliation and interaction. Affiliation distance is plain Euclidean distance (the two nodes effectively sit in one class); interaction distance injects a relation vector Yr into the base L1 distance (sketched below).
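A sketch of the two distance functions on toy tensors (the names `affiliation_dist` / `interaction_dist` are mine, not the paper's):

```python
import torch

def affiliation_dist(u, v):           # nodes linked by an affiliation relation
    return torch.norm(u - v, p=2) ** 2

def interaction_dist(u, v, y_r):      # relation vector y_r shifts the source node
    return torch.norm(u + y_r - v, p=1)

u, v, y_r = torch.randn(64), torch.randn(64), torch.randn(64)
print(affiliation_dist(u, v), interaction_dist(u, v, y_r))
```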
Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces |
AAAI 2019

First runs random walks over an external lexicon graph to generate artificial sentences and trains word vectors v1 on them; then learns a mapping from v1 to the ordinary word vectors v2, with maximizing CCA (Canonical Correlation Analysis) as the objective.