scicloj/tablecloth

let-dataset ... and more?

Opened this issue · 4 comments

Just found out about let-dataset ... great feature. Perhaps you could expand on let-dataset by adding add-columns-let: the same macro, but it takes a dataset to operate on as its first parameter. The second parameter is unchanged (the vector of let bindings). First, it would create a binding for every column name (a keyword-to-symbol mapping) so that the columns can be used in the bindings. Second, it would add all binding names to the dataset as columns (similar to how it is done in let-dataset).

; make ds1 with columns x, y, z
(def ds1
  (tc/let-dataset [x (range 1 6)
                   y 1
                   z (dfn/+ x y)]))

; add a column to ds1
; note x y are the :x :y columns in the dataset
(tc/add-columns-let ds1 [a (dfn/+ x y)])

This is my current approach:

https://github.com/clojure-quant/techml.vector-math/blob/main/test/syntax.clj

(s/calc d [x (+ a b)
           y (+ x c)
           z [y 1]])
This is the macro that adds bindings to all columns in the dataset:
https://github.com/clojure-quant/techml.vector-math/blob/main/src/cquant/vmath/syntax/column.clj

My goal is to be able to write vector math with exactly the same syntax as scalar-only math. So (* a b) in vector mode means (let [a (:a ds1) b (:b ds1)] (dfn/* a b)), and in scalar mode just (let [a 1 b 2] (* a b)). I feel such math would be better placed in tablecloth.
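
To make the proposal concrete, here is a minimal sketch of what such a macro could look like. It is not the implementation from the linked repository and not part of tablecloth. The name add-columns-let comes from the proposal; the helper column-symbols and the rule used to detect columns (any unqualified symbol that is neither a local, a special form, nor resolvable in the current namespace) are assumptions made for illustration. The sketch does not understand nested let or fn forms inside the binding expressions, nor Java interop symbols.

```clojure
(ns example.add-columns-let
  (:require [tablecloth.api :as tc]
            [tech.v3.datatype.functional :as dfn]))

;; Hypothetical helper: guess which symbols in `form` refer to columns.
;; A symbol counts as a column if it is unqualified, not already bound,
;; not a special form, and does not resolve to a var or class.
(defn- column-symbols
  [bound form]
  (->> (tree-seq coll? seq form)
       (filter simple-symbol?)
       (remove bound)
       (remove special-symbol?)
       (remove resolve)
       distinct))

;; Hypothetical macro following the proposal: `ds` is the dataset,
;; `bindings` is a let-style vector. Every binding name becomes a column.
(defmacro add-columns-let
  [ds bindings]
  (let [ds-sym (gensym "ds")
        {:keys [lets names]}
        (reduce
         (fn [{:keys [bound] :as acc} [sym expr]]
           (let [cols (column-symbols bound expr)]
             (-> acc
                 ;; bind each newly seen column symbol to its column
                 (update :lets into (mapcat (fn [c] [c `(get ~ds-sym ~(keyword c))]) cols))
                 ;; then bind the new name to its expression
                 (update :lets conj sym expr)
                 (update :names conj sym)
                 (update :bound into (conj cols sym)))))
         {:bound (set (keys &env)) :lets [] :names []}
         (partition 2 bindings))]
    `(let [~ds-sym ~ds ~@lets]
       (-> ~ds-sym
           ~@(map (fn [n] `(tc/add-column ~(keyword n) ~n)) names)))))

;; Usage: x and y are looked up as the :x and :y columns of the dataset,
;; and a and b are added as the new columns :a and :b.
(def small-ds (tc/dataset {:x [1 2 3] :y [10 20 30]}))

(def with-a-b
  (add-columns-let small-ds [a (dfn/+ x y)
                             b (dfn/* a 2)]))
```

Because the column names are only known at runtime, the macro has to guess them from the symbols in the expressions. That guess is the fragile part of the design: a var in the calling namespace with the same name as a column would silently win.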

Generally there is a subproject by @ezmiller which lifts all columnar / vector operations to its own namespace.

I think there is room for the macro you've proposed. The only change I would make is an explicit column accessor, since a column name can be anything, not only a keyword.

(tc/let-add-columns ds [a :a b :b z (dfn/+ a b)])
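
One possible reading of the explicit-accessor idea is sketched below. The single-vector syntax above does not say how an accessor such as a :a is told apart from a new column, so this sketch assumes a separate accessor vector. That split, the three-argument signature, and the string column name "b col" are assumptions for illustration, not an existing tablecloth API.

```clojure
(ns example.let-add-columns
  (:require [tablecloth.api :as tc]
            [tech.v3.datatype.functional :as dfn]))

;; Hypothetical macro: `accessors` pairs a symbol with a column name of any
;; type; `bindings` is a let-style vector whose names become new columns.
(defmacro let-add-columns
  [ds accessors bindings]
  (let [ds-sym (gensym "ds")
        names  (map first (partition 2 bindings))]
    `(let [~ds-sym ~ds
           ;; explicit accessors: symbol -> column looked up by name
           ~@(mapcat (fn [[sym col-name]] [sym `(get ~ds-sym ~col-name)])
                     (partition 2 accessors))
           ;; new columns, evaluated in order like a normal let
           ~@bindings]
       (-> ~ds-sym
           ~@(map (fn [n] `(tc/add-column ~(keyword n) ~n)) names)))))

;; Usage: one keyword-named column and one string-named column.
(def mixed-ds (tc/dataset {:a [1 2 3] "b col" [10 20 30]}))

(def with-z
  (let-add-columns mixed-ds
                   [a :a, b "b col"]
                   [z (dfn/+ a b)]))
```

With explicit accessors the macro no longer needs to guess which symbols are columns, and non-keyword column names work without any special handling.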

Or maybe add one more arity to let-dataset?