Why CS

In most knowledge-intensive fields, the capacity to identify what matters and the capacity to build what matters are structurally separated. The person who sets the direction is not the person who executes it. This is not arbitrary. The division reflects a real assumption: that judgment and production are distinct skills, requiring different training and rewarding different temperaments, and that conflating them produces worse outcomes than keeping them in separate roles.

CS collapses this. The gap between deciding what to build and building it compresses, at the limit, to almost nothing. You can identify what matters in the morning and have it running by evening. That specific property is why I am here. Everything else is evidence for it.


I train fifteen hours a week. I know what it feels like to be in the middle of a hard session, heart rate at threshold, and find that the effort is not the cost but the point. That the exertion itself is the reward. Around week three at Mozilla I felt something structurally identical in front of a screen. Not comfort. Not flow-state contentment. The same adrenaline. I would ship something and immediately want to go deeper into the problem. I would get to 2am and not feel the hour. I would wake up thinking about the next layer of the architecture before I had reached for my phone. I had not felt that before, not in debate, not in years of coursework I was good at and cared about. The distinction between effort that depletes and effort that feeds itself is real and not subtle. At Mozilla, for the first time, the work was feeding itself.


The starting point was PPE. I had offers from top programs and was serious about taking them. Philosophy, politics, economics: a coherent intellectual project organized around how to reason, how power works, and how to model incentives. I was not hedging toward CS. I was a PPE student.

PPE trains judgment about systems you cannot directly touch. None of it produces anything. Debate, which I competed in at the national level, added adversarial triage on top of that: which claims are load-bearing, which evidence is doing real work versus performing it. Still judgment. Still no lever.

I chose CS over those programs. The reason I can state now is cleaner than the reason I had at the time, but it has something to do with what I said at the top: CS is the one field where the gap between identifying what matters and acting on it can close. Everything I had built in PPE and debate was analytical capacity with nowhere to go. CS was the lever.

What I noticed when I entered CS, across the first year, was a pattern I could not immediately explain. I kept winning hackathons. Not because I was the best engineer in the room. I was not, certainly not in the first year. The first time, I noticed I had converged on what the problem actually required faster than most teams, and built toward that rather than toward what seemed most technically impressive. The second and third times, the same thing happened. Across eight competitions, including 1st out of 260 teams at Google HQ and 1st among 330 competitors at UofTHacks, the pattern held.

I did not, at the time, fully understand why. What I noticed was that the philosophical and debate training seemed to transfer in ways I had not predicted: the ability to distinguish what a problem actually required from what it appeared to require was doing more work than my technical execution, which was limited. The math, when I eventually ran it, confirmed what the pattern had suggested: under the null hypothesis that placements are determined by execution skill, the expected number of wins across eight competitions of this scale is approximately 0.03. The joint probability of winning the two largest fields alone is 1/85,800. More pointedly, a first-year student should have a below-average win rate if execution is binding. 8/8 is anomalous in the wrong direction for that hypothesis.
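The arithmetic is easy to check. A minimal sketch under the null model that each competition is a uniform lottery over entrants; only the two field sizes (260 and 330) come from the text above, and the assumption that the other six fields were of roughly the same scale is mine:

```python
from fractions import Fraction

google_field = 260   # teams at the Google HQ event (stated above)
uoft_field = 330     # competitors at UofTHacks (stated above)

# Joint probability of winning just the two largest fields under the null:
joint = Fraction(1, google_field) * Fraction(1, uoft_field)
print(joint)  # 1/85800, the figure quoted in the text

# Expected wins across k independent lotteries is sum(1/n_i). Assuming
# eight fields of a few hundred entrants each, that sum lands near
# 8 * (1/260), which is where the ~0.03 comes from.
approx_expected = 8 / google_field
print(round(approx_expected, 3))  # 0.031
```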

What the evidence showed, over time, was that the relevant skill was a filtration skill, developed through years of philosophy and debate under a different vocabulary. CS gave me a lever for it. The gap between identifying what mattered and building the thing that mattered collapsed. That was what had been missing.


The Mozilla work is the more controlled test of what this looks like at scale and over time.

Browser vendors are in the middle of a multi-year architectural dispute about tracking and privacy. Chrome is building its Privacy Sandbox. Safari has Intelligent Tracking Prevention. Firefox has Enhanced Tracking Protection. The dispute is partly philosophical and partly empirical. The empirical question, what tracker blocking actually costs users in performance and what it actually protects them from in precise quantitative terms, was largely unanswered. The decisions about which trackers to block by default, how to communicate privacy protection to users, and how to make the case to regulators were all being made without a cost model. The question of whether you can measure what actually matters in a system as complex as browser privacy is not a simple one.

The project I built at Mozilla tried to answer a piece of it: a tracker performance cost model trained on 348 million HTTP requests across 4,592 domains, comparing ten architectures including character-level URL CNNs. The model has to predict the performance cost of blocking a tracker from pre-response features alone, before the network responds, because that is the only point where intervention is possible. It is deployed as an ONNX model doing live per-request inference in Firefox’s anti-tracking engine for 250 million users, with a 39.4% MAE improvement over the static domain-level baselines it replaced.¹ When you can quantify what protection costs and what it saves at the per-request level, it becomes infrastructure for decisions that were previously intuitive. The output is not just a number. It is legibility where there was none.
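For concreteness, "pre-response features" means anything computable from the request itself, before a single byte comes back from the network. A minimal illustrative sketch; the specific features and the function name here are hypothetical examples, not the feature set the Firefox model actually uses:

```python
from urllib.parse import urlparse

def pre_response_features(url: str) -> dict:
    """Features derivable from the request URL alone, pre-response."""
    parsed = urlparse(url)
    return {
        "url_length": len(url),
        "domain_length": len(parsed.netloc),
        "path_depth": parsed.path.count("/"),
        "num_query_params": parsed.query.count("&") + 1 if parsed.query else 0,
        "subdomain_count": max(parsed.netloc.count(".") - 1, 0),
        "has_tracking_keyword": any(
            kw in url.lower() for kw in ("pixel", "beacon", "analytics", "track")
        ),
    }

feats = pre_response_features("https://cdn.tracker.example/v2/beacon?id=42&src=page")
print(feats["path_depth"], feats["has_tracking_keyword"])  # 2 True
```

Nothing here depends on the response: no status code, no payload size, no timing. That constraint is what makes the prediction usable at the only moment blocking is still possible.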

What I did not expect was how far the same problem would pull me. The research led into C++ security work: six zero-day vulnerabilities in Gecko’s engine, found by reverse-engineering Rice decoding, I/O serialization, and clipboard data flow across content process boundaries, with fixes shipped across 500 million daily sessions. It led into systems architecture: the new tab privacy widget, a three-layer cross-process service with C++ XPCOM storage, JS aggregation, IPC boundaries, and private browsing isolation preventing data leakage across contexts. Three layers, entered from the same starting question about what the product actually needed.

A lean environment with no product function meant the judgment about what to build and the execution of building it were not separate steps. I find it hard to describe this accurately without it sounding like a pitch. What I mean is something more specific: the thing I was building toward, without fully understanding it, was the ability to operate in exactly this way. To identify what matters and close the distance to it myself, in the same motion.


Three things, stated plainly: I am obsessed with this work in a way I have not been obsessed with anything else. I have judgment about what to build, developed through years of training a different kind of filtration skill in a different domain. And I can build at whatever layer the problem requires. The question of whether that combination is correctly calibrated for what comes next is the one I cannot resolve in advance. It is also the one I most want answered.

Footnotes

  1. The architecture comparison mattered less than the loss function. Tracker request data is zero-inflated and right-skewed: most requests have near-zero performance cost, and a small fraction are expensive. Standard regression losses produce predictions that are reasonable in expectation but wrong in the cases that matter for user-facing output. Tweedie loss is distribution-aware. Selecting it required identifying the distributional structure of the problem before selecting a model, which is a judgment call before it is a technical one.
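  What "distribution-aware" means here can be made concrete. For 1 < p < 2, the Tweedie unit deviance stays finite at exact zeros, which is precisely what a zero-inflated target needs; a minimal sketch, with an illustrative choice of p rather than the value used in the production model:

```python
def tweedie_deviance(y: float, mu: float, p: float = 1.5) -> float:
    """Tweedie unit deviance for 1 < p < 2; finite at y == 0."""
    assert 1.0 < p < 2.0 and mu > 0 and y >= 0
    term_y = y ** (2 - p) / ((1 - p) * (2 - p)) if y > 0 else 0.0
    return 2.0 * (term_y - y * mu ** (1 - p) / (1 - p) + mu ** (2 - p) / (2 - p))

# The loss is zero when the prediction matches the target exactly...
print(tweedie_deviance(1.0, 1.0))  # 0.0

# ...and an exact-zero target is handled gracefully: predicting a large
# cost for a request that cost nothing is penalized, without the loss
# blowing up the way a pure gamma deviance would at y == 0.
print(tweedie_deviance(0.0, 0.1) < tweedie_deviance(0.0, 5.0))  # True
```

A squared-error loss would treat a miss on a near-zero request and a miss on an expensive one symmetrically; the Tweedie deviance shapes the penalty to the point mass at zero plus the heavy right tail.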