
Naver News Word Cloud 🌟✨ Transforming Naver news articles into captivating word clouds!


Naver News Word Cloud

Naver News Word Cloud is a program that crawls Naver news articles and converts them into word clouds.
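As a rough illustration of the counting step behind a word cloud, the sketch below tallies word frequencies from a few article titles using only the standard library. It is not the program's actual code: `word_frequencies` and the sample titles are hypothetical, and plain whitespace splitting stands in for the Korean tokenization that konlpy would perform.

```python
from collections import Counter


def word_frequencies(texts, stopwords=frozenset(), min_length=2):
    """Count how often each word appears across a list of texts.

    Whitespace splitting is a stand-in for a real Korean tokenizer.
    """
    counts = Counter()
    for text in texts:
        for word in text.lower().split():
            # Trim common punctuation from both ends of the token.
            word = word.strip(".,!?\"'()[]")
            if len(word) >= min_length and word not in stopwords:
                counts[word] += 1
    return counts


# Hypothetical sample titles, used only for illustration.
titles = [
    "Economy grows as exports rise",
    "Exports rise again, economy steady",
]
frequencies = word_frequencies(titles, stopwords={"as"})
print(frequencies.most_common(3))
```

A mapping like `frequencies` is the kind of input that the wordcloud library's `WordCloud.generate_from_frequencies` accepts when rendering the image.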

You can download the executable (exe) file through this link: Download Naver News Word Cloud


1. Instructions for Executable File (exe) Users:

  1. Download the exe file.
  2. Before running the program, make sure to install JDK.
  3. After installation, set the JAVA_HOME environment variable.
    • For example:
      • System Properties -> Advanced -> Environment Variables -> System variables -> New:
        • Name: JAVA_HOME
        • Value: C:\Program Files\Java\jdk-19\bin
      • System Properties -> Advanced -> Environment Variables -> System variables -> Path -> Edit:
        • Add: C:\Program Files\Java\jdk-19\bin
  4. Enjoy using the program!
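If the program fails to start, a quick way to confirm that the environment variable from step 3 is visible to new processes is a small check like the one below. This is a minimal sketch: `check_java_home` is a hypothetical helper, and it only verifies that JAVA_HOME is set and points to an existing folder, not that the JDK inside it is complete.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME looks usable."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    if not Path(java_home).is_dir():
        return False, f"JAVA_HOME points to a missing folder: {java_home}"
    return True, f"JAVA_HOME is set to {java_home}"


ok, message = check_java_home()
print(message)
```

Open a new command prompt before running it, because environment variable changes do not reach windows that were already open.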

2. Instructions for Python File (py) Users:

  1. Install Python 3.10.
    • jpype only works with Python versions up to 3.10, so use Python 3.10.
    • Python can be installed from the following link: Python Downloads
  2. Install JDK.
    • JDK can be downloaded from the following link: Java SE Downloads
    • Then set the JAVA_HOME environment variable, as described in the exe instructions above.
  3. Install the requirements:
pip install bs4 requests pandas lxml konlpy Cython JPype1

Note: konlpy and JPype1 need the JDK, so install the JDK before installing them. The following library must be downloaded and installed separately.

wordcloud: download the wheel (.whl) file manually from here and install it with the command below. Choose the wheel whose cpXY tag matches your Python version; the cp311 file shown here targets Python 3.11.

pip install folder_path\wordcloud-1.8.1-cp311-cp311-win_amd64.whl
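Because pip refuses a wheel built for a different interpreter, it can help to compare the tag in the file name with the Python you are running before installing. The following is a minimal sketch: `wheel_python_tag` and `interpreter_tag` are hypothetical helpers, and `interpreter_tag` assumes the standard CPython interpreter.

```python
import sys


def wheel_python_tag(wheel_filename):
    """Extract the Python tag (e.g. 'cp311') from a wheel file name.

    Wheel names follow name-version[-build]-python-abi-platform.whl,
    so the Python tag is always third from the end.
    """
    stem = wheel_filename.removesuffix(".whl")
    return stem.split("-")[-3]


def interpreter_tag():
    """Return the CPython tag of the running interpreter, e.g. 'cp310'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


wheel = "wordcloud-1.8.1-cp311-cp311-win_amd64.whl"
if wheel_python_tag(wheel) == interpreter_tag():
    print(f"{wheel} matches this interpreter")
else:
    print(f"{wheel} targets {wheel_python_tag(wheel)}, "
          f"but this interpreter is {interpreter_tag()}")
```

If the tags differ, download the wheel built for your Python version instead of renaming the file.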
  4. To convert the .py file to an .exe with pyinstaller:
pyinstaller --onefile --add-data="C:\python311\Lib\site-packages\konlpy;.\konlpy" --add-data="C:\python311\Lib\site-packages\konlpy\java;.\konlpy\java" --add-data="C:\python311\Lib\site-packages\konlpy\tag\*;.\konlpy\tag" --add-data="__init__.py;wordcloud" --add-data="__main__.py;wordcloud" --add-data="_version.py;wordcloud" --add-data="color_from_image.py;." --add-data="DroidSansMono.ttf;wordcloud" --add-data="query_integral_image.pyd;wordcloud" --add-data="stopwords;wordcloud" --add-data="tokenization.py;wordcloud" --add-data="wordcloud.py;wordcloud" --add-data="wordcloud_cli.py;wordcloud" naver_news_word_cloud.py
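The command above lists every --add-data pair by hand, which makes typos in the paths easy to miss. One possible alternative, sketched below, is to build the argument list in Python. Note that this is a different approach from the author's command: it bundles the whole konlpy and wordcloud package folders rather than individual files, and `build_pyinstaller_command` and its parameters are hypothetical names. The `;` separator matches the Windows command shown above.

```python
import ntpath  # Windows path rules, so the output uses backslashes


def build_pyinstaller_command(script, site_packages, separator=";"):
    """Build a pyinstaller command that bundles konlpy and wordcloud.

    Each package folder is copied into a folder of the same name
    inside the bundle (an assumption of this sketch).
    """
    command = ["pyinstaller", "--onefile"]
    for package in ("konlpy", "wordcloud"):
        source = ntpath.join(site_packages, package)
        command.append(f"--add-data={source}{separator}{package}")
    command.append(script)
    return command


command = build_pyinstaller_command(
    "naver_news_word_cloud.py",
    r"C:\python311\Lib\site-packages",
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where pyinstaller is installed.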


This program crawls Naver news and converts the results into a word cloud. You can download the exe file through this link.

1. When using the exe file

a. Download the exe file.

b. Install the JDK before running the program.

b-1. The JDK can be installed from the following link.

b-2. After installation, the environment variables must be set.

For example,

System Properties - Advanced - Environment Variables -> System variables -> New -> Name: JAVA_HOME, Value: C:\Program Files\Java\jdk-19\bin

System Properties - Advanced - Environment Variables -> System variables -> Path -> Edit -> C:\Program Files\Java\jdk-19\bin

c. Enjoy!

2. When using the code file (py)

a. Install Python from here. jpype only works with Python versions up to 3.10, so version 3.10 is required.

a-1. How to install Python: choose "Customize installation" -> check pip, tcl/tk and IDLE, and for all users -> check "Add Python to environment variables" -> set the location to C:\python310\ and click Install. Then go to System Properties - Advanced - Environment Variables -> System variables -> Path -> Edit and add the Python install folder and its Scripts subfolder (the original example adds C:\python311\ and C:\python311\Scripts; use the folder that matches your install location).

b. Install the JDK from here and set the environment variables as well:

System Properties - Advanced - Environment Variables -> System variables -> New -> Name: JAVA_HOME, Value: C:\Program Files\Java\jdk-19\bin

System Properties - Advanced - Environment Variables -> System variables -> Path -> Edit -> C:\Program Files\Java\jdk-19\bin

c. Right-click cmd and run it as administrator, then run the following commands.

pip install bs4 requests pandas

pip install lxml ==> If an error occurs, download the wheel from here and run: pip install file_path\lxml-4.9.0-cp311-cp311-win_amd64.whl

pip install wordcloud ==> If an error occurs, download the wheel from here and run: pip install file_path\wordcloud-1.8.1-cp311-cp311-win_amd64.whl

pip install konlpy ==> The JDK must already be installed, or this step fails with an error.
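After running the pip commands, a short check such as the one below can confirm which packages the interpreter can actually find. It is a minimal sketch that only uses the standard library: `missing_modules` is a hypothetical helper, and the list holds the import names of the packages named in the commands above.

```python
import importlib.util

# Import names of the packages installed by the pip commands above.
REQUIRED_MODULES = ["bs4", "requests", "pandas", "lxml", "wordcloud", "konlpy"]


def missing_modules(names):
    """Return the module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required modules are installed")
```

This only checks that each package is present; it does not start the JVM, so a konlpy problem caused by a missing JDK would still show up only at run time.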

3. When converting the py file to an exe file with pyinstaller

a. konlpy and wordcloud must be passed with --add-data. Otherwise, errors appear saying that the stopwords file is missing or that konlpy cannot find its tag files.

a-1. konlpy and wordcloud are usually located in the Lib\site-packages folder of your Python installation.

a-2. For reference, this is the command I used:

pyinstaller --onefile --add-data="C:\python311\Lib\site-packages\konlpy;.\konlpy" --add-data="C:\python311\Lib\site-packages\konlpy\java;.\konlpy\java" --add-data="C:\python311\Lib\site-packages\konlpy\tag\*;.\konlpy\tag" --add-data="__init__.py;wordcloud" --add-data="__main__.py;wordcloud" --add-data="_version.py;wordcloud" --add-data="color_from_image.py;." --add-data="DroidSansMono.ttf;wordcloud" --add-data="query_integral_image.pyd;wordcloud" --add-data="stopwords;wordcloud" --add-data="tokenization.py;wordcloud" --add-data="wordcloud.py;wordcloud" --add-data="wordcloud_cli.py;wordcloud" naver_news_word_cloud.py

b. This is optional, but if errors persist, consider installing JPype and Cython. JPype is the module that lets konlpy run its Java code, and Cython compiles C extension modules for Python.

pip install Cython JPype1