
A simple Python script that reads Microsoft Excel files.

excelReader: Excel file reader


What it can do

Parses a single-sheet *.xlsx file into a Python list. An abstract usage example:

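from ThisScript import TheReaderFunction  # placeholder names; use the actual module and function

target_file = "path/to/file.xlsx"
table = TheReaderFunction(target_file)  # a list of rows, each row a list of cell values
C6 = table[5][2]  # row 6, column C (both indices are 0-based)
print(C6)  # type(C6) == str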

For a detailed example, see the last part of the script.


Limits

  • Only supports single-sheet files without any charts or Excel programming (such as macros). The sheet must also have the default name sheet1. (Yes, the simplest type you could imagine.)
  • Not fully tested; there may be issues across different versions of Excel. Good luck.

xlsx file format

If you are just looking for a handy tool, skip this part.


An xlsx file is actually a zip archive. You can see what is inside by unpacking it:


$ unzip sample.xlsx

The archive contains several files, but we only need two kinds: xl/sharedStrings.xml and xl/worksheets/*.xml. (Well, maybe more than two files.)
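As a rough Python 3 equivalent of the unzip command, the sketch below lists the files stored inside an xlsx archive with the standard zipfile module. The function name list_xlsx_members is made up for illustration and is not part of this script.

```python
import zipfile


def list_xlsx_members(source):
    """Return the names of all files stored inside an xlsx archive.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()
```

Calling list_xlsx_members("sample.xlsx") on a typical workbook should show xl/sharedStrings.xml and at least one file under xl/worksheets/ among the entries.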


Each *.xml file under the folder xl/worksheets/ represents one sheet. By default, the sheet is named sheet1, so its XML file is named sheet1.xml.


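Text in an xlsx file is stored in xl/sharedStrings.xml, and this script treats every cell value as a string. Duplicate strings are stored only once, which is probably why the file is called sharedStrings.xml. Note that numeric values are written directly into the sheet XML rather than into sharedStrings.xml.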
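To make the shared string table concrete, here is a minimal Python 3 sketch that turns the contents of xl/sharedStrings.xml into a plain list, so that a string can later be looked up by its position. The function name parse_shared_strings is hypothetical; this is not necessarily how the script itself does it.

```python
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML elements such as <sst>, <si> and <t>.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def parse_shared_strings(xml_bytes):
    """Return the shared strings as a list, in file order."""
    root = ET.fromstring(xml_bytes)
    strings = []
    for si in root.iter(NS + "si"):
        # An <si> entry holds either one <t> or several rich-text runs,
        # each with its own <t>; joining them gives the full string.
        # Simplification: phonetic annotations, if present, are joined too.
        strings.append("".join(t.text or "" for t in si.iter(NS + "t")))
    return strings
```

The index of each entry in the returned list is the number that string cells in the sheet XML refer to.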


sheet1.xml specifies the width and height of the sheet and links each cell to a string in sharedStrings.xml by its index. From these two files the sheet can be reconstructed.
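The lookup from cells to shared strings can be sketched in Python 3 as follows. The names column_index and parse_sheet are hypothetical, and the sketch assumes every row and cell carries an r attribute (such as r="6" or r="C6"), which Excel normally writes. Cells marked t="s" are resolved through the shared string list; any other value is kept as the raw text found in the sheet XML.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def column_index(cell_ref):
    """Convert the letters of a reference such as 'C6' to a 0-based index."""
    letters = re.match(r"[A-Z]+", cell_ref).group()
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def parse_sheet(sheet_xml, shared_strings):
    """Return the sheet as a list of rows; each row is a list of strings."""
    root = ET.fromstring(sheet_xml)
    table = []
    for row in root.iter(NS + "row"):
        row_idx = int(row.get("r")) - 1
        while len(table) <= row_idx:  # pad rows that are absent from the XML
            table.append([])
        cells = table[row_idx]
        for c in row.iter(NS + "c"):
            col = column_index(c.get("r"))
            v = c.find(NS + "v")
            text = "" if v is None or v.text is None else v.text
            if c.get("t") == "s" and text:
                # The value is an index into the shared string table.
                text = shared_strings[int(text)]
            while len(cells) <= col:  # pad skipped (empty) cells
                cells.append("")
            cells[col] = text
    return table
```

Rows in the result can have different lengths, because empty trailing cells are usually not stored in the file. With this layout, cell C6 is table[5][2].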


Improvement

I mean, improvement by YOUR hand. For me, this script is already good enough for my own use.


Multi-sheet support

This is simple: find all the *.xml files under xl/worksheets/ and return one list per sheet. Done.
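A possible starting point in Python 3, assuming the standard zipfile module is used: collect the raw XML of every worksheet in the archive, then feed each one to whatever single-sheet parsing already exists. The names natural_key and read_all_sheets_xml are hypothetical and not part of this script.

```python
import re
import zipfile


def natural_key(name):
    """Sort key that orders sheet2.xml before sheet10.xml."""
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", name)]


def read_all_sheets_xml(source):
    """Map each worksheet member name to its raw XML bytes.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        names = [
            n for n in archive.namelist()
            # Only files directly under xl/worksheets/, not its subfolders.
            if re.fullmatch(r"xl/worksheets/[^/]+\.xml", n)
        ]
        return {n: archive.read(n) for n in sorted(names, key=natural_key)}
```

One caveat: the file order is not guaranteed to match the tab order shown in Excel, which is defined in xl/workbook.xml, so this ordering is only a convenience.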


More: there may be other things as well...