Flushing Live ContentBlockingLogs on Anti-Tracking DB Query

The top-K tracker query reads from the anti-tracking database, but recent block events live in an in-memory ContentBlockingLog that hasn’t been written out yet. This patch flushes the live logs into the database before the query runs, so the widget actually shows what just happened.

Bug 2030052 | D295196 (flush) + D295197 (browser test) | Reviewer: timhuang | Status: under review

The Problem

The Privacy Metrics Widget asks the anti-tracking database “what are the top trackers blocked recently.” The database is the source of truth for the widget’s UI: the widget renders whatever the query returns, and the user sees their top blocked trackers.

The catch: live block events don’t go straight into the database. They accumulate in ContentBlockingLog objects in memory, attached to the page they were observed on, and get persisted later — on tab close, on idle, or via batched writes. So if you query the database right now, the answer is almost current, but missing the events from the last few seconds or minutes of browsing.

For most use cases that’s fine. For a widget whose entire point is showing the user what their browser just did for them, “almost current” is a noticeable bug. A user who opens the dashboard right after a tracker is blocked sees nothing new and concludes the widget doesn’t work.
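To make the gap concrete, here is a minimal Python sketch of the stale read. Everything in it is a hypothetical stand-in: TrackerDatabase, LiveLog, and their methods are illustrative names, not the real anti-tracking classes, and a Counter plays the role of the database table.

```python
from collections import Counter


class TrackerDatabase:
    """Hypothetical stand-in for the persisted anti-tracking store."""

    def __init__(self):
        self._counts = Counter()

    def persist(self, origins):
        # Add a batch of blocked-tracker origins to the persisted counts.
        self._counts.update(origins)

    def top_k(self, k):
        # Reads persisted data only; knows nothing about in-memory logs.
        return self._counts.most_common(k)


class LiveLog:
    """Hypothetical stand-in for a per-page, in-memory ContentBlockingLog."""

    def __init__(self):
        self.pending = []

    def record_block(self, origin):
        self.pending.append(origin)

    def persist_to(self, db):
        # In the scenario described above this happens later
        # (tab close, idle, batched write), not at block time.
        db.persist(self.pending)
        self.pending = []


db = TrackerDatabase()
log = LiveLog()
log.record_block("tracker.example")

stale = db.top_k(3)   # queried before the log is persisted: event is missing
log.persist_to(db)    # the delayed write finally happens
fresh = db.top_k(3)   # only now does the event show up
```

The first query returns nothing even though a block has already happened, which is the behavior the widget was exposing.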

The Fix

Two-part stack:

Part 1. Flush the live ContentBlockingLog entries into the anti-tracking database at the moment the database is queried. The flush is bounded — only the live, not-yet-persisted entries — and runs synchronously in the query path. That guarantees the database has the most current view possible at the moment the widget reads from it.
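The shape of Part 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patch itself: register_live_log, drain, and _flush_live_logs are hypothetical names, and the real code works on ContentBlockingLog objects rather than plain lists.

```python
from collections import Counter


class LiveLog:
    """Hypothetical in-memory log of not-yet-persisted block events."""

    def __init__(self):
        self.pending = []

    def record_block(self, origin):
        self.pending.append(origin)

    def drain(self):
        # Hand over the pending entries and reset, so that each event
        # is persisted exactly once.
        drained, self.pending = self.pending, []
        return drained


class TrackerDatabase:
    """Hypothetical store that flushes live logs before answering."""

    def __init__(self):
        self._counts = Counter()
        self._live_logs = []

    def register_live_log(self, log):
        self._live_logs.append(log)

    def _flush_live_logs(self):
        # Bounded work: only entries that are still pending get written.
        for log in self._live_logs:
            self._counts.update(log.drain())

    def top_k(self, k):
        # Synchronous flush in the query path, then the normal read.
        self._flush_live_logs()
        return self._counts.most_common(k)


db = TrackerDatabase()
log = LiveLog()
db.register_live_log(log)

log.record_block("a.example")
log.record_block("a.example")
log.record_block("b.example")

first = db.top_k(1)
second = db.top_k(1)  # nothing pending, so counts are unchanged
```

Draining the log as part of the flush is what keeps a second query from double-counting the same events.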

Part 2. A browser test that exercises the flush-on-query behavior. The test stages a block event, immediately runs the top-K query, and verifies that the staged event shows up in the result. Without the flush, the test fails; with it, the test passes. This is the kind of test that locks in the contract: any future refactor that breaks flush-on-query will trip it.
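The real test runs in the browser, but its logic reduces to a stage-then-query check. The Python sketch below mirrors that logic against a toy store; Store, its flush_on_query switch, and staged_event_is_visible are all hypothetical and exist only to show the check failing without the flush and passing with it.

```python
from collections import Counter


class Store:
    """Toy store; flush_on_query toggles the behavior under test."""

    def __init__(self, flush_on_query):
        self.flush_on_query = flush_on_query
        self.persisted = Counter()
        self.pending = []

    def stage_block(self, origin):
        # The event lands in memory only, like a live log entry.
        self.pending.append(origin)

    def top_k(self, k):
        if self.flush_on_query:
            self.persisted.update(self.pending)
            self.pending = []
        return self.persisted.most_common(k)


def staged_event_is_visible(store):
    # Stage one block event, query immediately, and report whether
    # the event appears in the top-K result.
    store.stage_block("tracker.example")
    return ("tracker.example", 1) in store.top_k(5)


with_flush = staged_event_is_visible(Store(flush_on_query=True))
without_flush = staged_event_is_visible(Store(flush_on_query=False))
```

Here with_flush is the passing case and without_flush is the regression the test is meant to catch.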

The Design Tension

There are two reasonable views on where this flush should live.

View A: at the storage layer. The query is asking the database a question; the database should make sure its view is fresh before answering. Flush-on-read is a database-level invariant.

View B: at the caller. The widget knows it cares about freshness; the widget should explicitly request a flush before querying. Other callers might not care and shouldn’t pay the cost.

I went with View A for this patch, because the freshness expectation is implicit in what “top trackers” means and pushing that responsibility to every caller is a recipe for someone forgetting. But View B is defensible. It’s the kind of thing where reasonable engineers disagree and the answer comes down to which invariant is easier to maintain over time.
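The two views differ mainly in who calls flush. The Python sketch below puts them side by side; the class and function names are hypothetical and the stores are toys, so treat it as a picture of the trade-off rather than of either real API.

```python
from collections import Counter


class BaseStore:
    """Shared toy state: persisted counts plus pending in-memory events."""

    def __init__(self):
        self.persisted = Counter()
        self.pending = []

    def stage_block(self, origin):
        self.pending.append(origin)

    def flush(self):
        self.persisted.update(self.pending)
        self.pending = []


class StorageLayerStore(BaseStore):
    """View A: the store flushes before every query."""

    def top_k(self, k):
        self.flush()
        return self.persisted.most_common(k)


class CallerLayerStore(BaseStore):
    """View B: the store only reads; freshness is the caller's job."""

    def top_k(self, k):
        return self.persisted.most_common(k)


def forgetful_query(store, k):
    # A caller that never thinks about flushing.
    return store.top_k(k)


def careful_query(store, k):
    # A caller that remembers its View B obligation.
    store.flush()
    return store.top_k(k)


view_a = StorageLayerStore()
view_a.stage_block("tracker.example")
a_result = forgetful_query(view_a, 1)        # fresh without caller effort

view_b = CallerLayerStore()
view_b.stage_block("tracker.example")
b_forgetful = forgetful_query(view_b, 1)     # stale: nobody flushed
b_careful = careful_query(view_b, 1)         # fresh: caller flushed first
```

Under View B, callers that don’t need freshness skip the flush cost, but one forgetful caller silently gets stale data. Under View A, every caller pays for the flush and none can forget it.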

Tim’s review is in flight. If he pushes back on the layering, I’ll revisit.

What I Learned

“Eventually consistent” sounds fine until a UI depends on it. In-memory accumulation with periodic flushing is a perfectly good pattern for telemetry, logs, and anything else where bounded staleness is acceptable. It’s not OK for a widget whose whole job is reflecting what just happened.

Tests that pin down freshness contracts are valuable. The browser test in Part 2 isn’t testing a feature; it’s pinning down an invariant. Future refactors that quietly break flush-on-query will fail this test, which is exactly what I want.

Storage layer vs. caller layer is a real design choice. Both work. I prefer the storage layer because it makes the freshness guarantee a property of the API rather than an obligation on every consumer.