How much data does LHCb collect?

How much data does LHCb collect? It sounds like an obvious question, and it can be asked in another form: what is the trigger rate of LHCb? More importantly, how does that compare with the other LHC experiments, particularly ATLAS/CMS? These are all valid questions, but the answers are not that straightforward. Let's try to understand this in some detail.

At first glance, LHCb appears to be collecting more data per second than either ATLAS or CMS during Run 3. LHCb’s trigger system is unique: it is fully software-based and processes the entire bunch-crossing rate of about 30 MHz. Its final data output reaches around 10 GB per second, corresponding to hundreds of thousands of events per second. In contrast, ATLAS/CMS use a traditional two-level trigger system: a hardware trigger reduces the input rate to about 100 kHz, and a software High-Level Trigger (HLT) further reduces the event rate to a few kHz (around 3 kHz for ATLAS, roughly 2.6 to 5 kHz for CMS). This numerical comparison might lead one to conclude that LHCb is recording more data than its larger counterparts.
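As a rough sanity check on the figures above, the following Python sketch computes the rate reduction implied at each trigger stage and the average LHCb event size that the quoted bandwidth would imply. The 200 kHz output rate is an assumption standing in for "hundreds of thousands of events per second", and the helper names are purely illustrative.

```python
# Order-of-magnitude inputs taken from the text above.
LHCB_INPUT_HZ = 30e6             # bunch crossings processed by the software trigger
LHCB_OUTPUT_BYTES_PER_S = 10e9   # ~10 GB/s written out
LHCB_OUTPUT_EVENTS_HZ = 2e5      # ASSUMPTION: stand-in for "hundreds of thousands"

ATLAS_CMS_L1_OUTPUT_HZ = 100e3   # hardware trigger output
ATLAS_HLT_OUTPUT_HZ = 3e3        # ATLAS HLT output (CMS: roughly 2.6 to 5 kHz)


def rejection_factor(rate_in_hz: float, rate_out_hz: float) -> float:
    """How many input events there are for each event that survives."""
    return rate_in_hz / rate_out_hz


def mean_event_size_bytes(bandwidth_bytes_per_s: float, event_rate_hz: float) -> float:
    """Average event size implied by an output bandwidth and an event rate."""
    return bandwidth_bytes_per_s / event_rate_hz


lhcb_rejection = rejection_factor(LHCB_INPUT_HZ, LHCB_OUTPUT_EVENTS_HZ)
atlas_hlt_rejection = rejection_factor(ATLAS_CMS_L1_OUTPUT_HZ, ATLAS_HLT_OUTPUT_HZ)
lhcb_event_size_kb = mean_event_size_bytes(
    LHCB_OUTPUT_BYTES_PER_S, LHCB_OUTPUT_EVENTS_HZ
) / 1e3

print(f"LHCb software trigger keeps about 1 in {lhcb_rejection:.0f} crossings")
print(f"ATLAS HLT keeps about 1 in {atlas_hlt_rejection:.0f} hardware-accepted events")
print(f"Implied mean LHCb event size: {lhcb_event_size_kb:.0f} kB")
```

With these inputs the implied mean LHCb event size comes out at a few tens of kilobytes. The number scales inversely with the assumed output event rate, so it should be read as an estimate only.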

However, there’s more to the story. LHCb operates at an instantaneous luminosity about ten times lower than ATLAS/CMS. At approximately $2 \times 10^{33}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$, compared with $2 \times 10^{34}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ for ATLAS/CMS, LHCb sees far fewer collisions per second and, importantly, much less pileup, that is, fewer overlapping proton-proton collisions per bunch crossing. While ATLAS/CMS experience around 60 pileup interactions, LHCb usually contends with around 6. This difference means that although LHCb records more events per second, these events are simpler and less crowded with overlapping collisions.
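The quoted pileup values can be roughly reproduced from the luminosities with the standard relation $\mu = L\,\sigma_{\mathrm{inel}} / (n_b\, f_{\mathrm{rev}})$. The sketch below is a back-of-the-envelope estimate, not an official calculation. The inelastic cross-section of about 80 mb and the 2400 colliding bunch pairs are assumptions, and the same bunch count is used for both cases for simplicity.

```python
LHC_REVOLUTION_HZ = 11_245    # LHC revolution frequency
SIGMA_INEL_CM2 = 80e-27       # ASSUMPTION: ~80 mb inelastic pp cross-section (1 mb = 1e-27 cm^2)
N_COLLIDING_BUNCHES = 2400    # ASSUMPTION: colliding bunch pairs, same for both cases


def mean_pileup(
    luminosity_cm2_s: float,
    sigma_inel_cm2: float = SIGMA_INEL_CM2,
    n_bunches: int = N_COLLIDING_BUNCHES,
    f_rev_hz: float = LHC_REVOLUTION_HZ,
) -> float:
    """Mean number of inelastic pp interactions per bunch crossing."""
    interactions_per_s = luminosity_cm2_s * sigma_inel_cm2
    crossings_per_s = n_bunches * f_rev_hz
    return interactions_per_s / crossings_per_s


mu_atlas_cms = mean_pileup(2e34)  # luminosity quoted for ATLAS/CMS
mu_lhcb = mean_pileup(2e33)       # luminosity quoted for LHCb

print(f"ATLAS/CMS pileup estimate: {mu_atlas_cms:.1f}")
print(f"LHCb pileup estimate:      {mu_lhcb:.1f}")
print(f"Ratio:                     {mu_atlas_cms / mu_lhcb:.1f}")
```

Under these assumptions the estimate lands near 60 for ATLAS/CMS and near 6 for LHCb, consistent with the values quoted above. Because $\mu$ is linear in luminosity at a fixed bunch count, the factor of ten in luminosity carries over directly to pileup.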

This difference is reflected in the trigger and data acquisition architecture. LHCb’s software-only trigger lets it pass essentially every bunch crossing to software for detailed filtering, prioritizing rare b-hadron decay signatures in a relatively clean environment. ATLAS/CMS, faced with much higher collision rates and complex overlapping events, rely on stringent hardware triggers to reduce this flood before software processing. Thus, they end up recording fewer events per second, but each individual event carries much more complex information because of the high pileup.

So while raw numbers indicate that LHCb saves more data per second, these figures are not directly comparable to ATLAS/CMS due to differences in luminosity, pileup, detector design, and physics goals. The volume of recorded data alone does not capture the richness or complexity of the information retained by each experiment.

To put it another way, LHCb’s seemingly higher data rate reflects its specialized focus and a trigger design tailored to a lower-luminosity environment, whereas ATLAS/CMS record fewer but more intricate events suited to their high-luminosity, broad physics programs. Understanding these subtleties gives a more nuanced picture of data collection across the LHC experiments in Run 3, and is a reminder that numbers need context, especially in particle physics.