scicloj/tablecloth

Document better how to read xlsx files (and support multi-sheet xlsx files)


Maybe point here:

https://techascent.github.io/tech.ml.dataset/tech.v3.libs.fastexcel.html

But in any case, "pure tablecloth" will then only read xlsx files with one sheet.

tech.ml.dataset supports multi-sheet files. For example, this reads the first sheet:

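(ns xxxx
  (:require [tablecloth.api :as tc]
            [tech.v3.libs.fastexcel :as xlsx]
            ;; required explicitly rather than relying on fastexcel loading it
            [tech.v3.dataset.io.spreadsheet :as ss]))

;; Open the workbook, take its first sheet, and parse that sheet into a dataset.
(-> (xlsx/input->workbook "my-file-with-mutiple-sheets.xlsx")
    first
    (ss/sheet->dataset {}))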
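
To get every sheet rather than only the first, something along these lines might work. It is a rough sketch that assumes `tech.v3.libs.fastexcel/workbook->datasets` returns one dataset per sheet, each named after its sheet, as its docstring describes. The namespace `xlsx-sketch` and the helper `sheets-by-name` are made-up names for illustration, not part of either library.

```clojure
(ns xlsx-sketch
  (:require [tablecloth.api :as tc]
            [tech.v3.libs.fastexcel :as xlsx]))

(defn sheets-by-name
  "Hypothetical helper: reads all sheets of the xlsx file at `path`
  and returns a map of dataset name -> dataset.
  Assumes each dataset is named after the sheet it came from."
  [path]
  (into {}
        (map (juxt tc/dataset-name identity))
        (xlsx/workbook->datasets path {})))
```

Usage would then look like `(get (sheets-by-name "my-file-with-mutiple-sheets.xlsx") "Sheet1")`, where `"Sheet1"` stands in for whatever the sheet is actually called.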

Maybe we can add an additional small namespace to cover this case? I am thinking about adding two functions, workbook and sheet->dataset, plus an alias in deps.edn with the required dependency. Is there anything else worth importing?
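
As a starting point for discussion, such a namespace could be as thin as the sketch below. The namespace name `tablecloth.xlsx-sketch` is hypothetical, and the two functions only delegate to the tech.ml.dataset calls already shown in this thread. It assumes the fastexcel reader dependency is on the classpath, which is what the proposed deps.edn alias would provide.

```clojure
(ns tablecloth.xlsx-sketch
  ;; Hypothetical namespace name; nothing like this exists in tablecloth yet.
  (:require [tech.v3.libs.fastexcel :as fastexcel]
            [tech.v3.dataset.io.spreadsheet :as spreadsheet]))

(defn workbook
  "Opens an xlsx input (for example a file path) as a workbook,
  which can be treated as a sequence of sheets."
  [input]
  (fastexcel/input->workbook input))

(defn sheet->dataset
  "Parses a single sheet into a dataset. `options` is passed through
  unchanged to tech.v3.dataset.io.spreadsheet/sheet->dataset."
  ([sheet] (sheet->dataset sheet {}))
  ([sheet options] (spreadsheet/sheet->dataset sheet options)))
```

With that in place, reading the first sheet would become `(-> (workbook "my-file-with-mutiple-sheets.xlsx") first sheet->dataset)`, and the result is an ordinary dataset that the rest of the tablecloth API can work on.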

Maybe we want "tablecloth" to have Excel and Arrow import working out of the box.
But then it would need to declare dependencies on them, which tech.ml.dataset avoided doing.

I think it is a "showstopper" for a beginner not to be able to load an Excel file without additional dependencies (the same goes for Arrow).