- Bert_EmbeddingLayer.ipynb : BERT concept notes (embedding sketch below)
  - Token Embedding
  - Segment Embeddings
  - Positional Embedding
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Bert_EmbeddingLayer.ipynb
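  A minimal PyTorch sketch (not taken from the notebook; vocabulary size and dimensions are arbitrary) of how BERT builds its input representation by summing the three embeddings above:

  ```python
  # Minimal sketch: BERT input = token + segment + position embeddings, then LayerNorm.
  import torch
  import torch.nn as nn

  class BertEmbeddings(nn.Module):
      def __init__(self, vocab_size=30522, hidden=128, max_len=512, type_vocab=2):
          super().__init__()
          self.tok = nn.Embedding(vocab_size, hidden)   # token embedding
          self.seg = nn.Embedding(type_vocab, hidden)   # segment (sentence A/B) embedding
          self.pos = nn.Embedding(max_len, hidden)      # learned positional embedding
          self.norm = nn.LayerNorm(hidden)

      def forward(self, token_ids, segment_ids):
          positions = torch.arange(token_ids.size(1), device=token_ids.device)
          x = self.tok(token_ids) + self.seg(segment_ids) + self.pos(positions)
          return self.norm(x)

  emb = BertEmbeddings()
  tokens = torch.randint(0, 30522, (1, 8))        # fake token ids: batch of 1, length 8
  segments = torch.zeros(1, 8, dtype=torch.long)  # every token belongs to sentence A
  print(emb(tokens, segments).shape)              # torch.Size([1, 8, 128])
  ```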
- BertChat_code.ipynb : Chat program code that uses BERT; currently working through the code.
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/BertChat_code.ipynb
- Bert(Kaggle).ipynb : Studying BERT through Kaggle.
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Bert(Kaggle).ipynb
- Konlpy.ipynb : KoNLPy concepts and short code examples (usage sketch below).
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Konlpy.ipynb
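  A short usage sketch of the KoNLPy API (the `Okt` tagger; assumes KoNLPy and a Java runtime are installed):

  ```python
  # Basic KoNLPy usage: morpheme tokenization, noun extraction, and POS tagging.
  from konlpy.tag import Okt

  okt = Okt()
  sentence = "자연어 처리를 공부하고 있습니다"
  print(okt.morphs(sentence))   # morpheme tokens
  print(okt.nouns(sentence))    # nouns only
  print(okt.pos(sentence))      # (token, POS tag) pairs
  ```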
- LSTM.py : LSTM summarized in code (a minimal cell sketch follows).
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/LSTM.py
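  A minimal NumPy sketch of a single LSTM cell step; this is not the contents of LSTM.py, just the gating idea:

  ```python
  # One LSTM step: gates decide what to forget, what to write into the cell state,
  # and how much of the cell state to expose as the new hidden state.
  import numpy as np

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  def lstm_step(x, h_prev, c_prev, W, b):
      # W maps the concatenated [x, h_prev] to the four gate pre-activations.
      z = np.concatenate([x, h_prev]) @ W + b
      H = h_prev.shape[0]
      f = sigmoid(z[:H])          # forget gate
      i = sigmoid(z[H:2*H])       # input gate
      o = sigmoid(z[2*H:3*H])     # output gate
      g = np.tanh(z[3*H:])        # candidate cell state
      c = f * c_prev + i * g      # new cell state
      h = o * np.tanh(c)          # new hidden state
      return h, c

  rng = np.random.default_rng(0)
  D, H = 4, 3                                  # input and hidden sizes (arbitrary)
  W = rng.normal(size=(D + H, 4 * H)) * 0.1
  b = np.zeros(4 * H)
  h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
  print(h.shape, c.shape)                      # (3,) (3,)
  ```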
- RNN01.ipynb : RNN concept notes (recurrence sketch below)
  - What is an RNN?
  - What can an RNN be used for?
  - Training an RNN
  - Extended RNN models
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/RNN01.ipynb
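  A NumPy sketch of the basic recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b), to make the "previous state feeds the next step" idea concrete:

  ```python
  # Vanilla RNN forward pass over a short sequence.
  import numpy as np

  def rnn_forward(xs, W_x, W_h, b):
      h = np.zeros(W_h.shape[0])             # initial hidden state
      states = []
      for x in xs:                           # one step per time step
          h = np.tanh(W_x @ x + W_h @ h + b)
          states.append(h)
      return np.stack(states)

  rng = np.random.default_rng(0)
  D, H, T = 4, 3, 5                          # input size, hidden size, sequence length
  xs = rng.normal(size=(T, D))
  hs = rnn_forward(xs, rng.normal(size=(H, D)) * 0.1,
                   rng.normal(size=(H, H)) * 0.1, np.zeros(H))
  print(hs.shape)                            # (5, 3)
  ```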
- RNN02_Vanishing_Gradient.ipynb : A look at the vanishing gradient problem, a key weakness of RNNs (numerical illustration below)
  - BPTT (backpropagation through time)
  - Vanishing Gradient Problem
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/RNN02_Vanishing_Gradient.ipynb
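  A small numerical illustration (not from the notebook) of why gradients vanish under BPTT: the gradient with respect to early hidden states is a product of per-step Jacobians diag(1 - h_t^2) W_h, whose norm tends to shrink exponentially with the number of steps:

  ```python
  # Multiply the per-step Jacobians of a tanh RNN and watch the gradient norm decay.
  import numpy as np

  rng = np.random.default_rng(0)
  H, T = 8, 50
  W_h = rng.normal(size=(H, H)) * 0.3          # smallish recurrent weights
  h = np.zeros(H)
  grad = np.eye(H)                             # dh_0/dh_0

  for t in range(T):
      h = np.tanh(W_h @ h + rng.normal(size=H))
      grad = (np.diag(1 - h**2) @ W_h) @ grad  # chain rule through one more step
      if t % 10 == 9:
          print(f"after {t+1:2d} steps, ||dh_t/dh_0|| = {np.linalg.norm(grad):.2e}")
  ```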
- SimpleRNN.ipynb : NN and RNN implementations. Not real neural networks, just hard-coded examples written to give a feel for how an RNN derives new information from previous information.
- VanilaRNN.ipynb : Vanilla RNN
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/VanilaRNN.ipynb
- Word_embedding_basic.ipynb (CBOW training sketch below)
  - word2vec, explained simply
  - A first taste of word embeddings
  - Sparse vs. Dense Representations
  - Sparse representation
  - Dense representation
  - Dense representation (advantages of distributed representations)
  - Word2Vec
  - CBOW
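  A hedged sketch of training a CBOW model with gensim (assuming gensim 4.x; the notebook may use a different setup). `sg=0` selects CBOW, `sg=1` would be skip-gram:

  ```python
  # Train a tiny CBOW Word2Vec model on toy sentences and query it.
  from gensim.models import Word2Vec

  sentences = [["the", "cat", "sat", "on", "the", "mat"],
               ["the", "dog", "sat", "on", "the", "rug"]]

  model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0, epochs=50)
  print(model.wv["cat"].shape)                 # (50,)
  print(model.wv.most_similar("cat", topn=2))  # nearest neighbours in the toy space
  ```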
- WordEmbedding.ipynb (Keras embedding-layer sketch below)
  - Building an English Word2vec model
  - GloVe
  - Using pre-trained word embeddings in a Keras model
  - ELMo
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/WordEmbedding.ipynb
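  A minimal sketch of loading a pre-trained embedding matrix into a frozen Keras `Embedding` layer; `embedding_matrix` here is a random placeholder standing in for real GloVe/Word2Vec rows:

  ```python
  # Plug a (placeholder) pre-trained embedding matrix into a frozen Embedding layer.
  import numpy as np
  import tensorflow as tf

  vocab_size, embedding_dim, max_len = 1000, 100, 20
  embedding_matrix = np.random.rand(vocab_size, embedding_dim)  # stand-in for real vectors

  model = tf.keras.Sequential([
      tf.keras.Input(shape=(max_len,)),
      tf.keras.layers.Embedding(
          vocab_size, embedding_dim,
          embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
          trainable=False),                       # freeze the pre-trained vectors
      tf.keras.layers.GlobalAveragePooling1D(),
      tf.keras.layers.Dense(1, activation="sigmoid"),
  ])
  model.summary()
  ```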
- fastText (FastText, word representation using subwords); subword n-gram sketch below
  - Introduction
  - Out-of-vocabulary and infrequent words (Word2Vec)
  - Subword representation
  - Negative samples
  - FastText for Korean: decomposing syllables into initial/medial/final jamo (code)
  - Package (fastText)
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/WordEmbedding_fastText.ipynb
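  A tiny sketch of fastText's subword idea (not the package internals): each word is represented by the character n-grams of its `<word>` form, so out-of-vocabulary words still share subwords with known ones:

  ```python
  # Extract the character n-grams fastText would associate with a word.
  def char_ngrams(word, n_min=3, n_max=6):
      wrapped = f"<{word}>"                      # boundary markers, as in fastText
      grams = set()
      for n in range(n_min, n_max + 1):
          for i in range(len(wrapped) - n + 1):
              grams.add(wrapped[i:i + n])
      return grams

  print(sorted(char_ngrams("where", 3, 4)))
  # n-grams such as '<wh', 'whe', 'her', 'ere', 're>' are shared with other words,
  # which is what lets fastText build vectors for unseen or rare words.
  ```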
- Word Piece Model (BPE merge sketch below)
  - Word piece, unit of words
  - Word Piece Model (sentencepiece) tokenizer
  - Byte-Pair Encoding (BPE)
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Word_Piece_Embedding.ipynb
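  A compact sketch of BPE merges in the style of Sennrich et al.'s classic example (not the sentencepiece implementation): repeatedly merge the most frequent adjacent symbol pair in the vocabulary:

  ```python
  # Count adjacent symbol pairs and merge the most frequent one, a few times over.
  import re
  from collections import Counter

  def get_pair_counts(vocab):
      pairs = Counter()
      for word, freq in vocab.items():
          symbols = word.split()
          for a, b in zip(symbols, symbols[1:]):
              pairs[(a, b)] += freq
      return pairs

  def merge_pair(pair, vocab):
      pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
      return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

  # Words are written as space-separated symbols with an end-of-word marker </w>.
  vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
           "n e w e s t </w>": 6, "w i d e s t </w>": 3}
  for step in range(5):
      best = get_pair_counts(vocab).most_common(1)[0][0]
      vocab = merge_pair(best, vocab)
      print(step + 1, best)          # which pair was merged at this step
  ```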
- Document Similarity (sketch below)
  - Euclidean distance
  - Jaccard similarity
  https://github.com/hansw90/NLP/blob/master/NLP/DocumentSimilarity.ipynb
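  Short reference sketches of the two measures, assuming documents are compared as term-count vectors (Euclidean) or as token sets (Jaccard):

  ```python
  # Euclidean distance over count vectors and Jaccard similarity over token sets.
  import numpy as np

  def euclidean_distance(a, b):
      return np.linalg.norm(np.asarray(a) - np.asarray(b))

  def jaccard_similarity(tokens1, tokens2):
      s1, s2 = set(tokens1), set(tokens2)
      return len(s1 & s2) / len(s1 | s2)

  print(euclidean_distance([1, 0, 2], [1, 1, 0]))                       # distance between count vectors
  print(jaccard_similarity("the cat sat".split(), "the cat slept".split()))  # 0.5
  ```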
- BERT-BTC
  - An A-to-Z guide on how you can use Google's BERT for binary text classification tasks, aiming to explain, as simply and straightforwardly as possible, how to fine-tune a BERT model (with PyTorch) and use it for binary text classification (see the sketch below).
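  A hedged sketch of the core fine-tuning step using the Hugging Face `transformers` library; the guide itself may organize the code differently or use another BERT package:

  ```python
  # One gradient step of fine-tuning BERT for binary text classification.
  import torch
  from transformers import BertTokenizer, BertForSequenceClassification

  tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
  model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

  texts = ["great movie", "terrible movie"]          # toy batch
  labels = torch.tensor([1, 0])
  batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

  optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
  model.train()
  optimizer.zero_grad()
  outputs = model(**batch, labels=labels)   # forward pass returns the classification loss
  outputs.loss.backward()                   # backprop
  optimizer.step()                          # update the weights
  print(float(outputs.loss))
  ```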
- Attention, fully broken down (scaled dot-product sketch below)
  https://github.com/hansw90/NLP/blob/master/NLP/Attention_PaperDetail.ipynb
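  A NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, the formula the paper notes revolve around:

  ```python
  # Scaled dot-product attention over toy query/key/value matrices.
  import numpy as np

  def scaled_dot_product_attention(Q, K, V):
      d_k = Q.shape[-1]
      scores = Q @ K.T / np.sqrt(d_k)                           # query-key similarities
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax
      return weights @ V, weights

  rng = np.random.default_rng(0)
  Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
  context, attn = scaled_dot_product_attention(Q, K, V)
  print(context.shape, attn.shape)                              # (2, 4) (2, 3)
  ```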
- A Structured Self-Attentive Sentence Embedding