Degraded

Drop in reported Hive usage metrics

Jan 26 at 12:00am CET
Affected services
Usage reports processing

Resolved
Jan 29 at 04:00pm CET

A customer sent malformed data, leading to unnoticed ClickHouse insert errors.
Lack of monitoring for asynchronous inserts prevented early detection.

We took the following measures to resolve this issue:
- Implemented stricter input validation on the usage endpoint.
- Deployed fixes, restoring expected metric levels.
- Added integration tests and alerts for asynchronous insert errors.
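
As an illustration of the kind of validation described above, here is a minimal sketch. It is not the deployed fix: the `UsageRecord` shape, the `durationNs` field, and the `validateUsageBatch` helper are hypothetical names chosen for this example. The idea is to reject records with negative or non-finite durations individually, so that one bad record cannot fail an entire batch insert.

```typescript
// Hypothetical shape of a single usage record; the real schema is not shown in this report.
interface UsageRecord {
  operationName: string;
  durationNs: number;
}

interface ValidationResult {
  valid: UsageRecord[];
  rejected: { index: number; reason: string }[];
}

// Splits a batch into records that are safe to insert and records to reject.
// Assumption: a duration must be a finite, non-negative number.
function validateUsageBatch(records: UsageRecord[]): ValidationResult {
  const result: ValidationResult = { valid: [], rejected: [] };
  records.forEach((record, index) => {
    if (!Number.isFinite(record.durationNs)) {
      result.rejected.push({ index, reason: "duration is not a finite number" });
    } else if (record.durationNs < 0) {
      result.rejected.push({ index, reason: "duration is negative" });
    } else {
      result.valid.push(record);
    }
  });
  return result;
}
```

Only the `valid` records would then be passed on to the batch write, while the `rejected` entries could be logged or returned to the sender.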

Updated
Jan 29 at 08:00am CET

We completed the post-mortem. Please reach out to us for more information.

Created
Jan 26 at 12:00am CET

A customer reported a drop in usage metrics. Our investigation revealed that invalid data (negative duration values) caused insert failures in ClickHouse. Because usage metrics are written in batches, the issue affected all customers.