DSL-23-1-modeling--Img2Music-Music-Recommendation-System-based-on-sentiment-analysis-

감정 기반 이미지-음악 추천시스템

모델링 E조

팀명 : E = MusiC^2
팀원 : 8기 유채원 장준혁 최윤서 9기 김서진 서연우

Files

image_to_music.py : final end-to-end file to get the result
image_to_music_module.py : modules used in image_to_music.py
image_to_sentiment
- data
  - OASIS_with_minmaxscaling.csv
- dataset.py : get data and transform it into runnable format
- run.py : python file to receive arguments and run exp.py file
- exp.py : python file actually run to train
- utils.py : define functions for minor changes or transformations
- vgg.sh : pretrained VGG19_bn model
music_to_sentiment
- arousal_valence_prediction.py : file to get music's sentiment
- data
  - NRC-VAD-Lexicon.txt
  - da
  - song_lyric.csv
  - song_lyric_VA.csv
  - song_lyric_embedding_300.csv
  - song_lyric_embedding_pca_11.csv
  - song_music_8_characteristics.csv
  - song_normalized_VA_label.csv
- preprocess : preprocessing modules for VA model
  - api_module.py
  - da
  - lyric_VA.py
  - lyrics_to_vector.py
  - music_preprocessing.py
  - music_to_csv.py
  - preprocess_module.py
- result
  - arousal.pkl

More Explanations

0. Task

Presentation pdf
Presentation Youtube
영상 안의 다양한 이미지에 어울리는 음악을 추천.
Ekman's 6 feelings 에 기반한 Valence, Arousal 지표를 통해 이미지와 음악의 감성을 분석, image 2 music recommendation을 구현하고자 함.

1. Data

OASIS (Open Affective Standardized Image Set) : 900개의 이미지에 대해 274명의 사람이 valence, arousal을 평가
Spotify API : Spotify data 크롤링을 통해 아티스트, 앨범, 곡명 및 valence, arousal을 포함한 음악의 정량적 지표를 수집
Musixmatch API : Spotify API 기반으로 크롤링한 곡들의 가사를 수집
NRC VAD Lexicon : 캐나다 NRC 제작, 약 2만 여개의 단어에 대해 Valence, Arousal, Dominance 정보를 담고 있음.

2. Model

Overview
Image2Emotion : pretrained CNN model을 feature extractor로 사용해 valence, arousal 지표를 예측하도록 함
- model
  - Feature extractor : VGG19_bn
  - Classifier : Linear layer
  - Loss : MSE loss (valence + arousal)
Music2Emotion
- Regression task to predict VA with Spotify data's columns
  - Used 8 features(danceability, key, loudness, mode, speechiness, instrumentalness, liveness, tempo)
  - Preprocessed with log normalization and min-max scaling
- Lyric Embedding : Word2Vec + Weighted sum based on counts (NRC VAD Lexicon)
  - Used pretrained model(fse/word2vec-google-news-300(Hugging Face)) -> fine tuning -> PCA
- AutoML : chose best 3 model and stacked them to make new model
  - Valence model : Gradient Boost Regressor + Random Forest Regressor + Extra Trees Regressor
    - Input : music + lyric embedding PCA + lyric VA
  - Arousal model: Gradient Boost Regressor + Random Forest Regressor + LGBM Regressor
    - Input : music + lyric VA
Image2Music
- Used Euclidean distance as similarity measure(with sklearn)

3. Result

End-to-end simulation code(image_to_music.py)
result image

4. References

Matt McVicar, Bruno Di Giorgi, Baris Dundar, and Matthias Mauch. (2021). Lyric document embeddings for music tagging
James A. Russell. (1980). A Circumplex Model of Affect
Grekow, J. (2016). Music Emotion Maps in Arousal-Valence Space.
Fika Hastarita Rachman, RiyanartoSarno, ChastineFatichah. (2019). Song Emotion Detection Based on Arousal-Valencefrom Audio and Lyrics Using Rule Based Method
Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra. (2021). Emotion Embedding Spaces for Matching Music to Stories
Github hudsonbrendon/python-musixmatch
스포티파이 API로 음악 분석하기
추천시스템 이해
Shuai Zhang, Lina Yao, Aixin Sun, and Yi Tay. 2018. Deep Learning based Recommender System: A Survey and New Perspectives. ACM Comput. Surv. 1, 1, Article 1 (July 2018), 35 pages.
Shankar, Devashish & Narumanchi, Sujay & Ananya, H & Kompalli, Pramod & Chaudhury, Krishnendu. (2017). Deep Learning based Large Scale Visual Recommendation and Search for E-Commerce.
카카오 AI 추천 : 카카오의 콘텐츠 기반 필터링
이승진. 2019. 음악 추천을 위한 가사정보 및 음악신호 기반 특성 탐색 연구. 서울대학교 융합과학기술대학원 석사학위 논문.
X. He and L. Deng, "Deep Learning for Image-to-Text Generation: A Technical Overview," in IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 109-116, Nov. 2017, doi: 10.1109/MSP.2017.2741510.
Pan, Y., Mei, T., Yao, T., Li, H., & Rui, Y. (2015). Jointly Modeling Embedding and Translation to Bridge Video and Language. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4594-4602.
Yin, Pei & Zhang, Liang. (2020). Image Recommendation Algorithm Based on Deep Learning. IEEE Access. PP. 1-1. 10.1109/ACCESS.2020.3007353.
Lee, S. J., Seo, B.-G., & Park, D.-H. (2018). Development of Music Recommendation System based on Customer Sentiment Analysis. Journal of Intelligence and Information Systems, 24(4), 197–217. https://doi.org/10.13088/JIIS.2018.24.4.197

bwnebs1/DSL-23-1-modeling-Img2Music

DSL-23-1-modeling--Img2Music-Music-Recommendation-System-based-on-sentiment-analysis-

모델링 E조

Files

More Explanations

0. Task

1. Data

2. Model

3. Result

4. References