ClickHouse Query Decomposition for Faster Metrics
Reduced heavy metrics query latency from ~8s to ~2s
Problem
A ClickHouse query was taking ~8 seconds to execute, long enough that most users would assume the feature was broken. The underlying query combined filtering with DISTINCT over multiple columns in a single pass, and no amount of index tuning helped.
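As a rough sketch of the shape of the problem query (the table and column names here are hypothetical, not from the original system):

```sql
-- Single heavyweight query: multi-column DISTINCT plus filtering in one pass.
-- On a large table, ClickHouse must deduplicate over the full candidate set
-- while also materializing every selected column.
SELECT DISTINCT ON (address, token)
    address, token, amount, block_time
FROM transfers
WHERE block_time >= now() - INTERVAL 1 DAY
ORDER BY address, token, block_time DESC;
```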
Constraints
- Response time needed to drop significantly without waiting for a full data-model redesign.
- The data set was large enough that "textbook one-query optimization" was not enough in this case.
- The API had to keep producing the same business output for clients.
- The solution needed to be practical and safe to ship quickly.
Solution
- After some experimentation, replaced the single heavyweight query with a two-step query plan.
- First query fetched only unique address values from the candidate set using DISTINCT.
- Second query fetched full rows using that reduced address list.
- Chose this intentionally over a single-query "clean" approach because real execution behavior was better with decomposition in this workload.
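The steps above can be sketched as two queries, assuming a hypothetical transfers table keyed by address (names are illustrative only):

```sql
-- Step 1: cheap pass. Fetch only the distinct addresses in the candidate set;
-- a single-column DISTINCT over a narrow projection is far cheaper than
-- deduplicating wide rows.
SELECT DISTINCT address
FROM transfers
WHERE block_time >= now() - INTERVAL 1 DAY;

-- Step 2: fetch full rows only for the reduced address list returned by
-- step 1 (passed back in by the application, e.g. as a bound parameter).
SELECT address, token, amount, block_time
FROM transfers
WHERE address IN ('0xabc...', '0xdef...')  -- list produced by step 1
ORDER BY block_time DESC;
```

The decomposition trades one round trip for a much smaller working set in each query, which is why it can beat the "cleaner" single-query form on a large table.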
Outcome
- Reduced query time from ~8s to ~2s.
- Improved API usability and consistency for endpoints consumed by clients.
- Shipped a pragmatic optimization that outperformed the theoretically cleaner option.
- Demonstrated decision quality based on measured behavior, not only query-style preferences.
Stack
- SQL
- ClickHouse