HTML_Table_Excel

Extract HTML tables and write their data to Excel

📚 This library uses HTML_Table_Extractor

📌 Library Name : Table_Excel

📌 Created Date : 27/Aug/2020

📌 Updated Date : 11/Mar/2021

📌 Author : Minku Koo

📌 E-Mail : corleone@kakao.com

📌 Version : 1.1.4

📌 Keywords : 'Excel', 'Table', 'HTML', 'Crawling', 'Selenium', 'Extractor'


⚙ How to Use

from HTML_Table_Excel import Table_Excel

# Arguments
#   url_list          : list of str, URLs of the pages to scrape
#   chromedriver_path : str, path to the ChromeDriver executable
#   excel_path        : str, path of the Excel file to create
#   header_color      : str, table header color as a hex code (default: F8E0EC)
TableExcel = Table_Excel(url_list, chromedriver_path)
TableExcel.makeExel_abs(excel_path, header_color)
TableExcel.makeExel_sep(excel_path)
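
The calls above can be put together as in the following minimal sketch. Only `Table_Excel`, `makeExel_abs` and `makeExel_sep` come from the library; the wrapper name `export_tables`, its parameter names, and the file paths in the commented call are assumptions made for illustration. Whether the Excel path should carry an `.xlsx` extension is not stated in this README, so treat that detail as unverified.

```python
def export_tables(urls, chromedriver_path, abs_path, sep_path,
                  header_color="F8E0EC"):
    """Write the tables found on `urls` to two Excel files.

    Hypothetical helper for illustration; it is not part of the library.
    """
    # Imported inside the function so that defining this helper does not
    # require Selenium or ChromeDriver to be installed.
    from HTML_Table_Excel import Table_Excel

    table_excel = Table_Excel(urls, chromedriver_path)
    # File that keeps the rowspan/colspan merges of the original tables.
    table_excel.makeExel_abs(abs_path, header_color)
    # File without merged cells.
    table_excel.makeExel_sep(sep_path)


# Example call (the paths are placeholders, adjust them to your machine):
# export_tables(
#     ["https://www.weather.go.kr/weather/observation/currentweather.jsp"],
#     "./chromedriver",
#     "./weather_abs.xlsx",
#     "./weather_sep.xlsx",
# )
```

Passing the two output paths separately avoids guessing how the library names its files.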

📝 Explanation

  • This library collects and transforms the data in HTML table tags and writes it to an Excel file.
  • The Excel file includes the link and the title of each page.
  • All tables on a web page are stacked vertically.
  • The header of each table is highlighted in a different color.

✔ makeExel_sep() shows each table as-is: cells with rowspan or colspan are not merged.

✔ makeExel_abs() reproduces the table's merges: rowspan and colspan cells are merged the same way in Excel.

✔ Nested tables and horizontally arranged tables are also handled.
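
To make the difference between an unmerged and a merged layout concrete, here is a minimal sketch that uses only the Python standard library. It is not the library's implementation: `TableParser`, `layout`, `column_letter` and the sample table are all hypothetical, the parser handles a single flat table only (no nested tables), and repeating a cell's text across the positions it spans is an assumption about how an unmerged view could be filled.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect (text, rowspan, colspan) for every cell of one flat table."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list of cell tuples per <tr>
        self._cell = None  # [text, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            a = dict(attrs)
            self._cell = ["", int(a.get("rowspan", 1)), int(a.get("colspan", 1))]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append(tuple(self._cell))
            self._cell = None


def column_letter(index):
    """Convert a 0-based column index to an Excel column name (0 -> A, 26 -> AA)."""
    name = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("A") + rem) + name
    return name


def layout(rows):
    """Place cells on a grid and list the Excel ranges that would be merged."""
    grid, merges = {}, []
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            while (r, c) in grid:  # position already taken by an earlier rowspan
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            if rowspan > 1 or colspan > 1:
                merges.append(
                    f"{column_letter(c)}{r + 1}:"
                    f"{column_letter(c + colspan - 1)}{r + rowspan}"
                )
            c += colspan
    return grid, merges


SAMPLE = """
<table>
<tr><th rowspan="2">City</th><th colspan="2">Temperature</th></tr>
<tr><th>Min</th><th>Max</th></tr>
<tr><td>Seoul</td><td>3</td><td>11</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE)
grid, merges = layout(parser.rows)
print(merges)  # ['A1:A2', 'B1:C1']
```

Writing every entry of `grid` to its own cell gives an unmerged sheet; additionally applying the ranges in `merges` gives a sheet whose merged cells mirror the HTML table.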

📢 Things to Check

  • Check your ChromeDriver version.

  • Make sure your Chrome browser version and your ChromeDriver version match.
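
The version check can be sketched as a small comparison of the two version strings. This sketch assumes that agreement on the major version (the first number) is what matters, which is the usual ChromeDriver guidance, and that you obtain the strings yourself, for example from the output of `chromedriver --version`. The function names and the version numbers in the sample strings are illustrative only.

```python
import re


def extract_version(text):
    """Pull the first dotted version number out of a `--version` output line."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return match.group(0)


def major_version(version):
    """Return the leading (major) number of a version such as '89.0.4389.90'."""
    return int(version.split(".")[0])


def versions_match(chrome_output, driver_output):
    """True when Chrome and ChromeDriver report the same major version."""
    chrome = major_version(extract_version(chrome_output))
    driver = major_version(extract_version(driver_output))
    return chrome == driver


# Illustrative strings, not taken from a real machine.
print(versions_match("Google Chrome 89.0.4389.90", "ChromeDriver 89.0.4389.23"))
print(versions_match("Google Chrome 90.0.4430.72", "ChromeDriver 89.0.4389.23"))
```

The first call reports a match (both are major version 89) and the second reports a mismatch.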


💡 Examples

📍 Sample 1 (What is the difference between makeExel_sep() and makeExel_abs()?)

(URL : https://www.weather.go.kr/weather/observation/currentweather.jsp)

🖥 Web Page

[Image: weather-web2]

🔍 Table_Excel -> makeExel_sep()

[Image: seq-weather2]

🔍 Table_Excel -> makeExel_abs()

[Image: abs-weather2]

📍 Sample 2 (What about nested tables or horizontally arranged tables?)

(URL : http://www.kweather.co.kr/kma/kma_digital.html)

🖥 Web Page

[Image: weather-web]

🔍 Table_Excel -> makeExel_abs()

[Image: abs-weather]

📍 Sample 3 (Nested table case)

(path : ./sample_html/innerTable_Sample.html)

🖥 HTML

[Image: inner-html]

🔍 Table_Excel -> makeExel_abs()

[Image: abs-html]

📍 Sample 4 (Horizontally arranged tables case)

(path : ./sample_html/horizontal_table_sample.html)

🖥 HTML

[Image: horizon-html]

🔍 Table_Excel -> makeExel_abs()

[Image: abs-html2]