- Bert_EmbeddingLayer.ipynb : BERT concept notes (embedding sketch below)
  - Token Embedding
  - Segment Embeddings
  - Positional Embedding
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Bert_EmbeddingLayer.ipynb
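  A minimal PyTorch sketch (not taken from the notebook; vocabulary size and dimensions are arbitrary) of how BERT builds its input representation by summing the three embeddings above:

  ```python
  # Minimal sketch: BERT input = token + segment + position embeddings, then LayerNorm.
  import torch
  import torch.nn as nn

  class BertEmbeddings(nn.Module):
      def __init__(self, vocab_size=30522, hidden=128, max_len=512, type_vocab=2):
          super().__init__()
          self.tok = nn.Embedding(vocab_size, hidden)   # token embedding
          self.seg = nn.Embedding(type_vocab, hidden)   # segment (sentence A/B) embedding
          self.pos = nn.Embedding(max_len, hidden)      # learned positional embedding
          self.norm = nn.LayerNorm(hidden)

      def forward(self, token_ids, segment_ids):
          positions = torch.arange(token_ids.size(1), device=token_ids.device)
          x = self.tok(token_ids) + self.seg(segment_ids) + self.pos(positions)
          return self.norm(x)

  emb = BertEmbeddings()
  tokens = torch.randint(0, 30522, (1, 8))        # fake token ids: batch of 1, length 8
  segments = torch.zeros(1, 8, dtype=torch.long)  # every token belongs to sentence A
  print(emb(tokens, segments).shape)              # torch.Size([1, 8, 128])
  ```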
- BertChat_code.ipynb : Chat program code that uses BERT; currently working through the code.
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/BertChat_code.ipynb
- Bert(Kaggle).ipynb : Studying BERT through Kaggle.
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Bert(Kaggle).ipynb
- Konlpy.ipynb : KoNLPy concepts and short code examples (usage sketch below).
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Konlpy.ipynb
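  A short usage sketch of the KoNLPy API (the `Okt` tagger; assumes KoNLPy and a Java runtime are installed):

  ```python
  # Basic KoNLPy usage: morpheme tokenization, noun extraction, and POS tagging.
  from konlpy.tag import Okt

  okt = Okt()
  sentence = "자연어 처리를 공부하고 있습니다"
  print(okt.morphs(sentence))   # morpheme tokens
  print(okt.nouns(sentence))    # nouns only
  print(okt.pos(sentence))      # (token, POS tag) pairs
  ```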
- LSTM.py : LSTM summarized in code (a minimal cell sketch follows).
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/LSTM.py
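  A minimal NumPy sketch of a single LSTM cell step; this is not the contents of LSTM.py, just the gating idea:

  ```python
  # One LSTM step: gates decide what to forget, what to write into the cell state,
  # and how much of the cell state to expose as the new hidden state.
  import numpy as np

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  def lstm_step(x, h_prev, c_prev, W, b):
      # W maps the concatenated [x, h_prev] to the four gate pre-activations.
      z = np.concatenate([x, h_prev]) @ W + b
      H = h_prev.shape[0]
      f = sigmoid(z[:H])          # forget gate
      i = sigmoid(z[H:2*H])       # input gate
      o = sigmoid(z[2*H:3*H])     # output gate
      g = np.tanh(z[3*H:])        # candidate cell state
      c = f * c_prev + i * g      # new cell state
      h = o * np.tanh(c)          # new hidden state
      return h, c

  rng = np.random.default_rng(0)
  D, H = 4, 3                                  # input and hidden sizes (arbitrary)
  W = rng.normal(size=(D + H, 4 * H)) * 0.1
  b = np.zeros(4 * H)
  h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
  print(h.shape, c.shape)                      # (3,) (3,)
  ```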
- RNN01.ipynb : RNN concept notes (recurrence sketch below)
  - What is an RNN?
  - What can an RNN be used for?
  - Training an RNN
  - Extended RNN models
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/RNN01.ipynb
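  A NumPy sketch of the basic recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b), to make the "previous state feeds the next step" idea concrete:

  ```python
  # Vanilla RNN forward pass over a short sequence.
  import numpy as np

  def rnn_forward(xs, W_x, W_h, b):
      h = np.zeros(W_h.shape[0])             # initial hidden state
      states = []
      for x in xs:                           # one step per time step
          h = np.tanh(W_x @ x + W_h @ h + b)
          states.append(h)
      return np.stack(states)

  rng = np.random.default_rng(0)
  D, H, T = 4, 3, 5                          # input size, hidden size, sequence length
  xs = rng.normal(size=(T, D))
  hs = rnn_forward(xs, rng.normal(size=(H, D)) * 0.1,
                   rng.normal(size=(H, H)) * 0.1, np.zeros(H))
  print(hs.shape)                            # (5, 3)
  ```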
- RNN02_Vanishing_Gradient.ipynb : A look at the vanishing gradient problem, a key weakness of RNNs (numerical illustration below)
  - BPTT (backpropagation through time)
  - Vanishing Gradient Problem
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/RNN02_Vanishing_Gradient.ipynb
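  A small numerical illustration (not from the notebook) of why gradients vanish under BPTT: the gradient with respect to early hidden states is a product of per-step Jacobians diag(1 - h_t^2) W_h, whose norm tends to shrink exponentially with the number of steps:

  ```python
  # Multiply the per-step Jacobians of a tanh RNN and watch the gradient norm decay.
  import numpy as np

  rng = np.random.default_rng(0)
  H, T = 8, 50
  W_h = rng.normal(size=(H, H)) * 0.3          # smallish recurrent weights
  h = np.zeros(H)
  grad = np.eye(H)                             # dh_0/dh_0

  for t in range(T):
      h = np.tanh(W_h @ h + rng.normal(size=H))
      grad = (np.diag(1 - h**2) @ W_h) @ grad  # chain rule through one more step
      if t % 10 == 9:
          print(f"after {t+1:2d} steps, ||dh_t/dh_0|| = {np.linalg.norm(grad):.2e}")
  ```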
- SimpleRNN.ipynb : NN and RNN implementations. Not real neural networks, just hard-coded examples written to give a feel for how an RNN derives new information from previous information.
- VanilaRNN.ipynb : Vanilla RNN
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/VanilaRNN.ipynb
- Word_embedding_basic.ipynb (CBOW training sketch below)
  - word2vec, explained simply
  - A first taste of word embeddings
  - Sparse vs. Dense Representations
  - Sparse representation
  - Dense representation
  - Dense representation (advantages of distributed representations)
  - Word2Vec
  - CBOW
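  A hedged sketch of training a CBOW model with gensim (assuming gensim 4.x; the notebook may use a different setup). `sg=0` selects CBOW, `sg=1` would be skip-gram:

  ```python
  # Train a tiny CBOW Word2Vec model on toy sentences and query it.
  from gensim.models import Word2Vec

  sentences = [["the", "cat", "sat", "on", "the", "mat"],
               ["the", "dog", "sat", "on", "the", "rug"]]

  model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0, epochs=50)
  print(model.wv["cat"].shape)                 # (50,)
  print(model.wv.most_similar("cat", topn=2))  # nearest neighbours in the toy space
  ```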
- WordEmbedding.ipynb (Keras embedding-layer sketch below)
  - Building an English Word2vec model
  - GloVe
  - Using pre-trained word embeddings in a Keras model
  - ELMo
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/WordEmbedding.ipynb
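  A minimal sketch of loading a pre-trained embedding matrix into a frozen Keras `Embedding` layer; `embedding_matrix` here is a random placeholder standing in for real GloVe/Word2Vec rows:

  ```python
  # Plug a (placeholder) pre-trained embedding matrix into a frozen Embedding layer.
  import numpy as np
  import tensorflow as tf

  vocab_size, embedding_dim, max_len = 1000, 100, 20
  embedding_matrix = np.random.rand(vocab_size, embedding_dim)  # stand-in for real vectors

  model = tf.keras.Sequential([
      tf.keras.Input(shape=(max_len,)),
      tf.keras.layers.Embedding(
          vocab_size, embedding_dim,
          embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
          trainable=False),                       # freeze the pre-trained vectors
      tf.keras.layers.GlobalAveragePooling1D(),
      tf.keras.layers.Dense(1, activation="sigmoid"),
  ])
  model.summary()
  ```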
- fastText (FastText, word representation using subwords); subword n-gram sketch below
  - Introduction
  - Out-of-vocabulary and infrequent words (Word2Vec)
  - Subword representation
  - Negative samples
  - FastText for Korean: decomposing syllables into initial/medial/final jamo (code)
  - Package (fastText)
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/WordEmbedding_fastText.ipynb
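  A tiny sketch of fastText's subword idea (not the package internals): each word is represented by the character n-grams of its `<word>` form, so out-of-vocabulary words still share subwords with known ones:

  ```python
  # Extract the character n-grams fastText would associate with a word.
  def char_ngrams(word, n_min=3, n_max=6):
      wrapped = f"<{word}>"                      # boundary markers, as in fastText
      grams = set()
      for n in range(n_min, n_max + 1):
          for i in range(len(wrapped) - n + 1):
              grams.add(wrapped[i:i + n])
      return grams

  print(sorted(char_ngrams("where", 3, 4)))
  # n-grams such as '<wh', 'whe', 'her', 'ere', 're>' are shared with other words,
  # which is what lets fastText build vectors for unseen or rare words.
  ```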
- Word Piece Model (BPE merge sketch below)
  - Word piece, unit of words
  - Word Piece Model (sentencepiece) tokenizer
  - Byte-Pair Encoding (BPE)
  https://github.com/hansw90/NLP-natural-language-processing-/blob/master/NLP/Word_Piece_Embedding.ipynb
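  A compact sketch of BPE merges in the style of Sennrich et al.'s classic example (not the sentencepiece implementation): repeatedly merge the most frequent adjacent symbol pair in the vocabulary:

  ```python
  # Count adjacent symbol pairs and merge the most frequent one, a few times over.
  import re
  from collections import Counter

  def get_pair_counts(vocab):
      pairs = Counter()
      for word, freq in vocab.items():
          symbols = word.split()
          for a, b in zip(symbols, symbols[1:]):
              pairs[(a, b)] += freq
      return pairs

  def merge_pair(pair, vocab):
      pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
      return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

  # Words are written as space-separated symbols with an end-of-word marker </w>.
  vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
           "n e w e s t </w>": 6, "w i d e s t </w>": 3}
  for step in range(5):
      best = get_pair_counts(vocab).most_common(1)[0][0]
      vocab = merge_pair(best, vocab)
      print(step + 1, best)          # which pair was merged at this step
  ```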
- Document Similarity (sketch below)
  - Euclidean distance
  - Jaccard similarity
  https://github.com/hansw90/NLP/blob/master/NLP/DocumentSimilarity.ipynb
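  Short reference sketches of the two measures, assuming documents are compared as term-count vectors (Euclidean) or as token sets (Jaccard):

  ```python
  # Euclidean distance over count vectors and Jaccard similarity over token sets.
  import numpy as np

  def euclidean_distance(a, b):
      return np.linalg.norm(np.asarray(a) - np.asarray(b))

  def jaccard_similarity(tokens1, tokens2):
      s1, s2 = set(tokens1), set(tokens2)
      return len(s1 & s2) / len(s1 | s2)

  print(euclidean_distance([1, 0, 2], [1, 1, 0]))                       # distance between count vectors
  print(jaccard_similarity("the cat sat".split(), "the cat slept".split()))  # 0.5
  ```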
- BERT-BTC
  - An A-to-Z guide on how you can use Google's BERT for binary text classification tasks, aiming to explain, as simply and straightforwardly as possible, how to fine-tune a BERT model (with PyTorch) and use it for binary text classification (see the sketch below).
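  A hedged sketch of the core fine-tuning step using the Hugging Face `transformers` library; the guide itself may organize the code differently or use another BERT package:

  ```python
  # One gradient step of fine-tuning BERT for binary text classification.
  import torch
  from transformers import BertTokenizer, BertForSequenceClassification

  tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
  model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

  texts = ["great movie", "terrible movie"]          # toy batch
  labels = torch.tensor([1, 0])
  batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

  optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
  model.train()
  optimizer.zero_grad()
  outputs = model(**batch, labels=labels)   # forward pass returns the classification loss
  outputs.loss.backward()                   # backprop
  optimizer.step()                          # update the weights
  print(float(outputs.loss))
  ```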
- Attention, fully broken down (scaled dot-product sketch below)
  https://github.com/hansw90/NLP/blob/master/NLP/Attention_PaperDetail.ipynb
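  A NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, the formula the paper notes revolve around:

  ```python
  # Scaled dot-product attention over toy query/key/value matrices.
  import numpy as np

  def scaled_dot_product_attention(Q, K, V):
      d_k = Q.shape[-1]
      scores = Q @ K.T / np.sqrt(d_k)                           # query-key similarities
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax
      return weights @ V, weights

  rng = np.random.default_rng(0)
  Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
  context, attn = scaled_dot_product_attention(Q, K, V)
  print(context.shape, attn.shape)                              # (2, 4) (2, 3)
  ```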
- A Structured Self-Attentive Sentence Embedding