I don't think most people actually store serialized HLLs in a table. cc @electrum, in case he has thought about interop in the past.
My use case is nearly 1T events aggregated down to a reasonable grain, which the BI team can then further aggregate across many dimensions. I don't believe it's an unusual use case.
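To illustrate why this pattern works, here is a minimal toy sketch of HLL mergeability in Python. It is not the Airlift or DataSketches implementation, and its in-memory layout matches neither library's serialized format; the register count, the hash choice, and every function name are assumptions made up for illustration. It only shows the general property that sketches stored at a fine grain can be combined later with a register-wise max.

```python
import hashlib
import math

P = 12          # illustrative precision: 2**12 = 4096 registers
M = 1 << P

def _hash64(value: str) -> int:
    # 64-bit hash taken from SHA-256 (an arbitrary choice for this toy).
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

def new_sketch() -> list:
    return [0] * M

def add(sketch: list, value: str) -> None:
    h = _hash64(value)
    idx = h >> (64 - P)                      # top P bits select a register
    rest = h & ((1 << (64 - P)) - 1)         # remaining 64 - P bits
    rank = (64 - P) - rest.bit_length() + 1  # leading zeros in `rest`, plus one
    sketch[idx] = max(sketch[idx], rank)

def merge(a: list, b: list) -> list:
    # Merging two sketches is a register-wise max.
    return [max(x, y) for x, y in zip(a, b)]

def estimate(sketch: list) -> float:
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros:
        # Small-range correction (linear counting).
        return M * math.log(M / zeros)
    return raw

# Two fine-grained sketches (say, one per day) with overlapping users.
day1, day2, both = new_sketch(), new_sketch(), new_sketch()
for i in range(0, 6000):
    add(day1, f"user-{i}")
    add(both, f"user-{i}")
for i in range(4000, 10000):
    add(day2, f"user-{i}")
    add(both, f"user-{i}")

rolled_up = merge(day1, day2)
```

Here `rolled_up` has exactly the same registers as `both`, the sketch built from all events directly, which is what lets a downstream team re-aggregate stored sketches across dimensions without rescanning the raw events. That only holds when both sides agree on the hash function and register layout, which is where the interop question comes from.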
I'm interested in using Spark to write HLL data structures that are compatible with Trino (and, by extension, AWS Athena). From what I can tell, Trino uses the HLL data structures from the associated Airlift project, while Spark relies on the Apache DataSketches library for HLL.
I'm hitting a dead end trying to find anyone doing this. Before I go down the rabbit hole of writing a custom Spark aggregator around the Airlift HLL data structure, does anyone know of prior art, or of any plans in the Trino pipeline to improve this compatibility?