a easy-to-use transducer based data analysis tools for the clojure programming language.
any finitely complicated problem can be made infinitely complicated by a finite number of macros, so why not write macros that write (macro based)meander code that would generate transducers functions?
…wait, but why not just use a meander?
because meander.epsilon/scan
is slow, and because transducers are super
composable and can be combined into endless sequences
(require '[taoensso.encore :as enc])
(require '[ribelo.danzig :as dz :refer [=>>]])
(def data (vec (repeatedly 1000000 (fn [] {:a (* (rand-int 100) (if (enc/chance 0.5) 1 -1))
:b (* (rand-int 100) (if (enc/chance 0.5) 1 -1))
:c (* (rand-int 100) (if (enc/chance 0.5) 1 -1))}))))
because it is super concise and pleasing to the eye
(defn q1 []
(into []
(comp
(map (fn [{:keys [a b] :as m}] (assoc m :d (+ a b))))
(filter (fn [{:keys [c]}] (pos? c)))
(map (fn [m] (update m :a inc)))
(filter (fn [{:keys [a b c]}] (= a b c))))
data))
(defn q2 []
(=>> data
(dz/with :d [+ :a :b])
(dz/where :c pos?)
(dz/with :a [+ :a 1])
(dz/where [= :a :b :c])))
(= (q1) (q2))
;; => true
because it is much faster than handwritten code
(enc/qb 1 (q1) (q2))
;; => [309.46 145.71] - in ms
the most basic function is the fat arrow which replaces the tread last
(=>> [1 2 3 4 5]
(map inc)
(filter even?))
;; => [2 4 6]
(macroexpand '(=>> [1 2 3 4 5] (map inc) (filter even?)))
;; => (clojure.core/into [] (ribelo.danzig/comp-some (map inc) (filter even?)) [1 2 3 4 5])
(=>> [1 2 3 4 5]
(map inc)
(when false
(filter even?)))
;; => [2 3 4 5 6]
fat arrows can be mixed with other arrows
(=>> [1 2 3 4 5]
(map inc)
(->> (mapv inc)))
;; => [3 4 5 6 7]
you can also use first and last
(=>> [1 2 3 4 5]
(map inc)
first)
;; => 2
(=>> [1 2 3 4 5]
(map inc)
(last))
;; => 6
where can take the function
(=>> data
(dz/where (fn [{:keys [a]}] (= a 1)))
(take 1))
;; => [{:a 1, :b 87, :c -27}]
the assumption is that we have a collection of maps, so we can query the key value
(=>> data (dz/where :a 1) (take 1))
;; => [{:a 1, :b 87, :c -27}]
if we need to search for a key, we must use '
(=>> [{:a :some/key} {:a :other/key}] (dz/where [= :a ':other/key]))
;; => [{:a :other/key}]
or keys
(=>> data (dz/where {:a 1 :b 1}) (take 1))
;; => [{:a 1, :b 1, :c 74}]
or keys and functions
(=>> data (dz/where {:a even? :b odd?}) (take 1))
;; => [{:a 40, :b 39, :c -76}]
we can use a vector, where the first argument is the function
(=>> data (dz/where [= :a :b :c]) (take 1))
;; => [{:a 27, :b 27, :c 27}]
(=>> data (dz/where [= :a 1]) (take 1))
;; => [{:a 1, :b 87, :c -27}]
ask for the key that meets the condition
(=>> data (dz/where even? :a) (take 1))
;; => [{:a -96, :b -84, :c -76}]
(=>> data (dz/where :a even?) (take 1))
;; => [{:a -96, :b -84, :c -76}]
square clojure is still clojure
(=>> data (dz/where [= [+ :a :b] :c]) (take 1))
;; => [{:a 0, :b 2, :c 2}]
(=>> data (dz/where [= [+ :a :b] [+ :c :a]]) (take 1))
;; => [{:a 75, :b -43, :c -43}]
meander just works
(=>> data (dz/where {:a ?x :b ?x :c ?x}) (take 1))
;; => [{:a -32, :b -32, :c -32}]
(require '[meander.epsilon :as m])
(=>> data (dz/where {:a (m/pred pos?)}) (take 1))
;; => [{:a 92, :b -64, :c -96}]
is as fast as the fine-tuned hand-written code
(enc/qb 1
(=>> data (filter (fn [{:keys [a]}] (= a 1))))
(=>> data (filter (fn [m] (= 1 (:a m)))))
(=>> data (dz/where :a 1))
(=>> data (dz/where {:a 1})))
;; => [81.88 54.14 48.77 52.16]
you can change an individual value at i
element
(=>> data (dz/with 0 :a 999) (take 1))
;; => [{:a 999, :b 23, :c 32}]
a map can be used
(=>> data (dz/with 0 {:a 999 :b -999}) (take 1))
;; => [{:a 999, :b -999, :c 32}]
function
(=>> data (dz/with :d (fn [{:keys [a b]}] (+ a b 10))) (take 1))
;; => [{:a 24, :b 23, :c 32, :d 57}]
square clojure still behaves like clojure
(=>> data (dz/with :d [+ :a :b [- :c 10]]) (take 1))
;; => [{:a 92, :b -64, :c -96, :d -78}]
a whole column can be added
(=>> data (dz/with :d 5) (take 3))
;; => [{:a 24, :b 23, : c 32, :d 5}
;; {:a 53, :b 69, :c -99, :d 5}
;; {:a -4, :b 80, :c -16, :d 5}]
many things in one go
(=>> data (dz/with {:a 5 :b 10}) (take 1))
;; => [{:a 5, :b 10, :c -69}]
(=>> data (dz/with {:a [+ :a 1000] :b [+ :b 1000]}) (take 1))
;; => [{:a 927, :b 905, :c -69}]
conditional with
(=>> data
(dz/with 0 :a -999)
(dz/with :when [= :a -999] {:a 999 :b 999 :c 999})
(dz/where :a 999)
(dz/row-count))
;; => [1]
wip
wip
wip