Documentation party
Closed this issue · 1 comments
danthegoodman1 commented
- What is IceDB?
- Who uses it
- How to use it
- Architecture
- How the log works (data formats)
- How the data parts work
- How merging works
- How tombstone cleaning works
- Why IceDB, what makes it different
- Why not BigQuery
- Why not Athena
- Why not Spark/EMR
- Why not ClickHouse/Timescale/Redshift/etc.
- Performance
- Cost comparison to BigQuery for the same dataset and queries
- When to merge and tombstone clean
- Parameters
- Tips and tricks
- Merge and tombstone coordination for multiple ingestion nodes of the same table
- Large batch inserts
- Schema validation before insert: the schema must stay consistent, which is easiest when a single host manages a table exclusively. See the tips in #85 for detecting schema changes in the cache and checking them against a serializable transaction
- Pairing with Redpanda for ingest works well
- Self-batching like in https://github.com/danthegoodman1/icedb/blob/main/examples/api-full.py
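For the self-batching tip, here is a rough sketch of the general idea, not the code from `api-full.py`. It assumes a hypothetical `flush_fn` callback that performs the actual insert (whatever insert call the ingest service uses), and `RowBatcher`, `max_rows`, and `max_age_seconds` are illustrative names, not part of IceDB. It is single-threaded and only checks the age limit when a row arrives, so a real service would probably also flush on a timer and on shutdown.

```python
import time
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class RowBatcher:
    """Buffers rows and hands them to flush_fn in batches.

    Hypothetical helper for illustration only; not part of IceDB.
    """

    def __init__(
        self,
        flush_fn: Callable[[List[Row]], None],
        max_rows: int = 1000,
        max_age_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush_fn = flush_fn  # assumed to perform the real insert
        self._max_rows = max_rows
        self._max_age_seconds = max_age_seconds
        self._clock = clock
        self._buffer: List[Row] = []
        self._first_row_at = 0.0

    def add(self, row: Row) -> None:
        """Buffer one row; flush if the batch is full or too old."""
        if not self._buffer:
            # Start the age timer when the first row of a batch arrives.
            self._first_row_at = self._clock()
        self._buffer.append(row)

        too_many = len(self._buffer) >= self._max_rows
        too_old = self._clock() - self._first_row_at >= self._max_age_seconds
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Send buffered rows to flush_fn; a no-op when the buffer is empty."""
        if not self._buffer:
            return
        batch = list(self._buffer)
        # If flush_fn raises, the rows stay buffered for a later retry.
        self._flush_fn(batch)
        self._buffer.clear()
```

Each request handler would call `add()` for incoming rows, and the service would call `flush()` once more before exiting, so many small requests turn into fewer, larger inserts.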
danthegoodman1 commented
Performance testing: compare against other solutions such as BigQuery, Athena, ClickHouse, and Spark/EMR