<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv='Content-Type' content='text/html; charset=us-ascii' />
<title>The xlrd Module -- README</title>
</head>
<body>

<h3>Python package "xlrd"</h3>

<p><b>Purpose</b>: Provide a library that developers can use to extract data
    from Microsoft Excel (tm) spreadsheet files.
    It is not an end-user tool.
</p>
<p><b>Author</b>: John Machin, Lingfo Pty Ltd (sjmachin@lexicon.net)
</p>
<p><b>Licence</b>: BSD-style (see licences.py)
</p>
<p><b>Version of xlrd</b>: 0.7.1 -- 2009-05-31
</p>
<p><b>Versions of Python supported</b>: 2.1 to 2.6.
</p>
<p><b>External modules required</b>:
</p>
<dl><dd> The package itself is pure Python with no dependencies on modules or packages
    outside the standard Python distribution.
</dd>
<dd> Running the demo script runxlrd.py with
    Python 2.2 or 2.1 requires the Optik module (version 1.4.1 or later) from
    http://optik.sourceforge.net/
</dd>
</dl>
<p><b>Versions of Excel supported</b>:
    2004, 2003, XP, 2000, 97, 95, 5.0, 4.0, 3.0, 2.1, 2.0.
    Support for Excel 2007 .xlsx files scheduled for version 0.7.1.
</p>
<p><b>Outside the current scope</b>: xlrd will safely and reliably ignore any of these
if present in the file:
</p>
<ul>
<li> Charts, Macros, Pictures, any other embedded object. WARNING: currently
      this includes embedded worksheets.
</li>
<li> VBA modules
</li>
<li> Formulas (results of formula calculations are extracted, of course).
</li>
<li> Comments
</li>
<li> Hyperlinks
</li>
<li> Autofilters, advanced filters, pivot tables, conditional formatting, data validation
</li>
</ul>
<p><b>Unlikely to be done</b>:
</p>
<ul><li> Handling password-protected (encrypted) files.
</li>
</ul>
<p><b>Particular emphasis (refer to the docs for details)</b>:
</p>
<ul><li> Operability across OS, regions, platforms
</li>
<li> Handling Excel's date problems, including the Windows / Macintosh
      four-year differential.
</li>
<li> Providing access to named constants and named groups of cells (from version 0.6.0)
</li>
<li> Providing access to "visual" information: font, "number format", background, border,
     alignment and protection for cells, height/width etc for rows/columns (from version 0.6.1)
</li>
</ul>
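<p><b>Sketch of the date differential</b>: The Windows / Macintosh
    differential mentioned above comes from two different starting points for
    Excel's day-number dates. The following is only an illustrative sketch using
    the standard datetime module, not xlrd's own code; the helper name
    serial_to_datetime and the two epoch constants are assumptions made for this
    example. In real use, xlrd.xldate_as_tuple(value, book.datemode) does the
    conversion.
</p>

```python
import datetime

# Assumption: the usual Excel epochs. In the 1900-based system the epoch is
# chosen so that serial 61 maps to 1900-03-01; in the 1904-based system
# serial 0 is 1904-01-01.
EPOCH_1900 = datetime.datetime(1899, 12, 30)
EPOCH_1904 = datetime.datetime(1904, 1, 1)

# The two systems differ by four years and one day.
DIFFERENTIAL_DAYS = 1462


def serial_to_datetime(serial, datemode):
    """Convert an Excel day number to a datetime (hypothetical helper).

    datemode 0 means the 1900-based system, 1 means the 1904-based system.
    """
    if datemode == 1:
        epoch = EPOCH_1904
    elif serial >= 61:
        epoch = EPOCH_1900
    else:
        # Excel treats 1900 as a leap year, so small serials are ambiguous.
        # This sketch refuses them rather than guess.
        raise ValueError("ambiguous 1900-based serial: %r" % (serial,))
    return epoch + datetime.timedelta(days=serial)


# The same calendar day has a different serial in each system.
windows_value = serial_to_datetime(39448, 0)
mac_value = serial_to_datetime(39448 - DIFFERENTIAL_DAYS, 1)
print(windows_value)
print(mac_value)
```

<p>Both calls refer to the same day, which is why a reader of .xls files needs
    the workbook's date mode as well as the cell value.
</p>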
<p><b>Quick start</b>:
</p>
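<pre><code>    import xlrd
    # Open the workbook; myfile.xls is a placeholder file name.
    book = xlrd.open_workbook("myfile.xls")
    print "The number of worksheets is", book.nsheets
    print "Worksheet name(s):", book.sheet_names()
    # Sheets are indexed from zero.
    sh = book.sheet_by_index(0)
    print sh.name, sh.nrows, sh.ncols
    # rowx and colx are zero-based, so cell D30 is rowx=29, colx=3.
    print "Cell D30 is", sh.cell_value(rowx=29, colx=3)
    # Print each row of the sheet as a list of cell objects.
    for rx in range(sh.nrows):
        print sh.row(rx)
    # Refer to docs for more details.
    # Feedback on API is welcomed.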
</code></pre>
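<p><b>Sketch of cell addressing</b>: The quick start reads cell D30 with
    rowx=29 and colx=3 because row and column indexes are zero-based. The helper
    below is a minimal sketch of that mapping; cell_ref_to_indices is a
    hypothetical name used for illustration and is not presented here as part of
    the xlrd API.
</p>

```python
import re

# One or more column letters followed by a row number, e.g. "D30".
_A1_REF = re.compile(r"^([A-Za-z]+)([0-9]+)$")


def cell_ref_to_indices(ref):
    """Turn an A1-style reference into zero-based (rowx, colx)."""
    match = _A1_REF.match(ref)
    if match is None:
        raise ValueError("not an A1-style reference: %r" % (ref,))
    letters, digits = match.groups()
    # Column letters are a base-26 number with digits A=1 .. Z=26.
    col_number = 0
    for ch in letters.upper():
        col_number = col_number * 26 + (ord(ch) - ord("A") + 1)
    row_number = int(digits)
    if row_number == 0:
        raise ValueError("row numbers start at 1: %r" % (ref,))
    # Spreadsheet numbering starts at 1; the indexes start at 0.
    return row_number - 1, col_number - 1


rowx, colx = cell_ref_to_indices("D30")
print(rowx)
print(colx)
```

<p>The resulting pair could then be passed on as
    sh.cell_value(rowx=rowx, colx=colx).
</p>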
<p><b>Another quick start</b>: This will show the first, second and last rows of each
    sheet in each file:
</p>

<pre><code>    OS-prompt>python PYDIR/Scripts/runxlrd.py 3rows *blah*.xls</code></pre>
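<p><b>Sketch of the row selection</b>: The command above shows the first,
    second and last rows of each sheet. A rough equivalent in your own code might
    look like the sketch below. It is not taken from runxlrd.py; the function
    names are hypothetical, and the only things assumed of the sheet object are
    the nrows attribute and row() method used in the quick start.
</p>

```python
def three_row_indices(nrows):
    """Return the indexes of the first, second and last rows, without repeats."""
    result = []
    for rx in (0, 1, nrows - 1):
        # Skip indexes that do not exist in short sheets, and duplicates.
        if rx in range(nrows) and rx not in result:
            result.append(rx)
    return result


def three_rows(sheet):
    """Collect the selected rows from any object with nrows and row(rx)."""
    return [sheet.row(rx) for rx in three_row_indices(sheet.nrows)]


print(three_row_indices(10))
```

<p>With an open workbook, three_rows(book.sheet_by_index(0)) would return at
    most three rows, and fewer for sheets with fewer than three rows.
</p>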

<p><b>Installation</b>:
</p>
<ul><li> On Windows: use the installer.
</li>
<li> Any OS: Unzip the .zip file into a suitable directory,
    chdir to that directory, then do "python setup.py install".
</li>
<li> If PYDIR is your Python installation directory:
    the main files are in PYDIR/Lib/site-packages/xlrd
    (except for Python 2.1 where they will be in PYDIR/xlrd),
    the docs are in the doc subdirectory,
    and there's a sample script: PYDIR/Scripts/runxlrd.py
</li>
<li> If os.sep != "/": make the appropriate adjustments.
</li>
</ul>
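<p><b>Sketch of the path adjustments</b>: One way to avoid adjusting the
    separator by hand is to build the locations listed above with os.path.join.
    This is a minimal sketch; xlrd_locations is a hypothetical helper, and the
    directory names are simply the ones given in the list above, so they may
    differ on your platform.
</p>

```python
import os


def xlrd_locations(pydir):
    """Build the install locations described above for a given PYDIR."""
    return {
        # Assumption: the PYDIR/Lib/site-packages layout from the list above.
        "package": os.path.join(pydir, "Lib", "site-packages", "xlrd"),
        "script": os.path.join(pydir, "Scripts", "runxlrd.py"),
    }


locations = xlrd_locations("PYDIR")
print(locations["package"])
print(locations["script"])
```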
<p><b>Download URLs</b>:
</p>
<ul><li> http://pypi.python.org/pypi/xlrd
</li>
<li> http://www.lexicon.net/sjmachin/xlrd.htm
</li>
</ul>
<p><b>Acknowledgements</b>:
</p>
<ul><li> This package started life as a translation from C into Python
of parts of a utility called "xlreader" developed by David Giffin.
"This product includes software developed by David Giffin &lt;david@giffin.org&gt;."
</li>
<li> OpenOffice.org has truly excellent documentation of the Microsoft Excel file formats
and Compound Document file format, authored by Daniel Rentz. See http://sc.openoffice.org
</li>
<li> U+5F20 U+654F: over a decade of inspiration, support, and interesting decoding opportunities.
</li>
<li> Ksenia Marasanova: sample Macintosh and non-Latin1 files, alpha testing
</li>
<li> Backporting to Python 2.1 was partially funded by Journyx - provider of
timesheet and project accounting solutions (http://journyx.com/).
</li>
<li> Provision of formatting information in version 0.6.1 was funded by Simplistix Ltd
   (http://www.simplistix.co.uk/)
</li>
<li> &lt;&lt; a growing list of names; see HISTORY.html &gt;&gt;: feedback, testing, test files, ...
</li></ul>

</body>
</html>