A Formula 1 car carries over 300 sensors, sampling at rates from a few hertz up to the kilohertz range. Together they produce roughly 1.1 million data points per second from a single car. With 20 cars on track, a race weekend generates on the order of 160 terabytes of raw telemetry.

This isn’t a big data problem. It’s a fast data problem.

The engineers on the pit wall have seconds — sometimes fractions of a second — to decide whether to pit, switch tire compound, or adjust the car’s energy deployment strategy. Those decisions are made on data that’s less than 50 milliseconds old.

Here’s how the data pipeline behind all of that actually works.

What Gets Measured

Every F1 car is essentially a rolling data center. The sensors break down into six categories:

Positioning — GPS coordinates, inertial measurement units (6-axis IMU), and individual wheel speed sensors. These tell the team exactly where the car is on track, how fast it’s moving, and how each wheel is behaving independently. Updated 100 times per second.

Perception — In autonomous racing series like the Indy Autonomous Challenge, LiDAR point clouds run at 10–30Hz and generate massive data volumes. In Formula 1, radar and ultrasonic sensors handle proximity detection.

Powertrain — Engine RPM, throttle position, fuel flow rate, and the state of the Energy Recovery System (ERS). The hybrid power unit alone has dozens of sensors monitoring temperatures, pressures, and electrical states. All at 100Hz.

Thermal — Infrared arrays measure tire surface temperature across multiple zones of each tire. Brake disc temperatures can exceed 1,000°C during heavy braking. Coolant and oil temperatures are monitored continuously. This data directly feeds tire strategy decisions.

Dynamics — Suspension travel at each corner, steering angle, lateral and longitudinal G-forces. This tells engineers how the car is handling and whether the driver is pushing beyond the setup’s optimal window.

Aerodynamics — Pitot tubes for airspeed, pressure sensors across the bodywork, and ride height sensors. These validate wind tunnel and CFD predictions against real-world performance.

The Streaming Layer

All of this data needs to leave the car, travel across the circuit, and arrive at the pit wall and the team’s factory in near real-time.

The industry standard for this is Apache Kafka — specifically Confluent Cloud in most modern implementations. Kafka’s publish-subscribe model is built for exactly this scenario: thousands of data points per second from multiple producers (cars), consumed by multiple downstream systems (dashboards, ML models, strategy simulations).

Data is partitioned by car ID and session type. Each sensor category gets its own Kafka topic. This allows different downstream consumers to subscribe only to the data they need — the tire engineer doesn’t need LiDAR data, and the strategist doesn’t need raw suspension telemetry.
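The topic and key scheme described above can be sketched in a few lines. This is an illustrative sketch in plain Python, not any team's real configuration; the `telemetry.*` topic naming and the key format are assumptions.

```python
# Illustrative sketch of a per-category topic and per-car partition key.
# Topic names and key format are invented for illustration.

SENSOR_CATEGORIES = {
    "positioning", "perception", "powertrain",
    "thermal", "dynamics", "aerodynamics",
}

def topic_for(category: str) -> str:
    """One Kafka topic per sensor category, e.g. 'telemetry.thermal'."""
    if category not in SENSOR_CATEGORIES:
        raise ValueError(f"unknown sensor category: {category}")
    return f"telemetry.{category}"

def partition_key(car_id: int, session: str) -> str:
    """Keying by car and session keeps each car's events ordered
    within a partition, so a consumer replays them in sensor order."""
    return f"{car_id}:{session}"

# A producer would then publish along the lines of:
#   producer.send(topic_for("thermal"),
#                 key=partition_key(44, "race"), value=payload)
```

Because consumers subscribe per topic, the tire engineer's service reads only `telemetry.thermal` and never touches the perception stream.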

Rivian, which operates one of the largest connected vehicle fleets in the world (150,000+ vehicles processing 5,500+ signal types), uses exactly this architecture — Kafka with Flink for real-time telemetry processing.

Real-Time Processing

Raw sensor data is noisy and arrives at different rates from different sensors. Apache Flink handles the stream processing — cleaning, windowing, and enriching the data as it flows through.

Typical Flink operations include:

  • Windowed aggregations — Computing lap averages, sector times, and rolling statistics over 1-second and 5-second windows
  • Tire degradation rate — A rolling regression over the last N laps that predicts when grip will fall below the threshold for a pit stop
  • Anomaly detection — Flagging brake temperature spikes or sudden changes in suspension behavior that might indicate damage
  • Gap calculations — Real-time computation of gaps to the leader, the car ahead, and the car behind
  • Energy deployment tracking — Monitoring battery state and ERS harvesting efficiency lap by lap
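The tire-degradation bullet above is, at its core, a rolling linear fit. A minimal sketch, assuming lap time is a usable proxy for grip loss; the window size and threshold here are invented, not a real team's model:

```python
# Toy tire-degradation model: fit a least-squares line to the last N lap
# times and project when they cross a pit threshold. Window and threshold
# values are illustrative assumptions.

def laps_until_pit(lap_times: list[float], threshold: float,
                   window: int = 5) -> float:
    """Estimated laps remaining before lap times exceed `threshold`,
    using the least-squares slope over the last `window` laps."""
    recent = lap_times[-window:]
    n = len(recent)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(recent) / n
    denom = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean)
                for x, y in zip(xs, recent)) / denom
    if slope <= 0:   # not degrading: this model never forces a stop
        return float("inf")
    return max(0.0, (threshold - recent[-1]) / slope)
```

With lap times drifting from 90.0s to 92.0s over five laps and a 94.0s threshold, the fit projects roughly four laps of life left.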

Flink’s advantage over micro-batch engines (such as Spark Structured Streaming) is latency. Flink processes each event as it arrives — there’s no waiting for a batch to fill. For pit-wall decisions, the difference between 50ms and 5 seconds can be the difference between winning and losing a position.
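The event-at-a-time property can be illustrated without Flink at all: the aggregate below is refreshed the instant each sample arrives, which is exactly what a batch pipeline gives up. A toy sketch, not Flink's API:

```python
# Toy event-at-a-time aggregation: the mean is updated per event in O(1),
# so downstream consumers never wait for a window or batch to fill.

class RunningMean:
    """Incrementally updated mean over a stream of samples."""
    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        self.total += value
        return self.total / self.count  # fresh aggregate after every event

agg = RunningMean()
for sample in [101.2, 101.5, 100.9]:   # e.g. brake temps arriving live
    latest = agg.update(sample)        # usable immediately, no batch delay
```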

Storage: Time-Series First

Telemetry data is inherently time-series — every data point has a timestamp, a sensor ID, and a value. The storage layer needs to handle extremely high write throughput and efficient time-range queries.

InfluxDB is the most common choice in motorsport and IoT telemetry. It’s purpose-built for time-series data with built-in downsampling, retention policies, and a query language optimized for temporal patterns.
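Concretely, InfluxDB ingests points in its text line protocol: `measurement,tag_set field_set timestamp`. The wire format below is real; the measurement and tag names are invented for illustration:

```python
# Build an InfluxDB line-protocol string for one telemetry point.
# Measurement/tag names ('tire_temp', 'car', 'corner') are assumptions;
# the format itself is: measurement,tags fields timestamp(ns).

def to_line_protocol(measurement: str, tags: dict,
                     fields: dict, ts_ns: int) -> str:
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = to_line_protocol(
    "tire_temp",
    {"car": "44", "corner": "FL"},      # tags: indexed, low-cardinality
    {"surface_c": 98.6},                # fields: the measured values
    1_700_000_000_000_000_000,          # nanosecond timestamp
)
```

Tags are indexed and fields are not, so putting car and tire corner in tags keeps per-car time-range queries fast.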

For teams on Azure, Azure Data Explorer (ADX) is increasingly popular — Grafana has published detailed guides on building F1 analytics stacks with ADX as the backend.

Raw data also flows to a data lake (typically S3 on AWS or Azure Data Lake Storage Gen2) in Parquet format for long-term storage and batch analytics.
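A common way to lay out that lake is Hive-style partitioning, so batch jobs can prune files by event, session, or car. The bucket name and partition columns below are illustrative assumptions, not a known team convention:

```python
# Sketch of a Hive-style partition path for the Parquet lake.
# Bucket and column names are invented for illustration.

def lake_path(bucket: str, event: str, year: int, round_no: int,
              session: str, car: int) -> str:
    """Partitioning by event/session/car lets batch queries skip
    every file outside the requested slice."""
    return (f"s3://{bucket}/telemetry/event={event}/year={year}"
            f"/round={round_no:02d}/session={session}/car={car:02d}/")
```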

Batch Analytics and ML

Real-time processing handles the race. Batch analytics handles preparation.

Databricks (Apache Spark) is the standard for large-scale analysis:

  • Lap simulation — Red Bull Racing runs over 1,000 race simulations per weekend using Oracle Cloud, testing different pit strategies against probabilistic models of competitor behavior
  • Tire modeling — ML models trained on historical degradation curves predict optimal tire compound selection for given track conditions
  • Driver comparison — Overlaying telemetry traces from different drivers on the same track to identify braking points, turn-in angles, and throttle application differences
  • Setup optimization — Correlating suspension settings, wing angles, and tire pressures with lap time performance
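A race simulation of the kind described above reduces to a Monte Carlo loop over candidate strategies. The sketch below is a deliberately toy model (one-stop only, linear degradation, invented constants for pit loss and noise), not any team's simulator:

```python
# Toy Monte Carlo pit-strategy comparison. All constants (base lap time,
# degradation per lap, pit-lane loss, lap-time noise) are invented.
import random

def simulate_race(pit_lap: int, total_laps: int = 50, base: float = 90.0,
                  deg_per_lap: float = 0.12, pit_loss: float = 21.0,
                  rng=None) -> float:
    """Total race time for a one-stop strategy pitting on `pit_lap`."""
    rng = rng or random.Random()
    total, tire_age = 0.0, 0
    for lap in range(1, total_laps + 1):
        total += base + deg_per_lap * tire_age + rng.gauss(0, 0.2)
        tire_age += 1
        if lap == pit_lap:
            total += pit_loss   # pit stop: lose time, reset tire age
            tire_age = 0
    return total

def best_pit_lap(candidates: list[int], runs: int = 1000,
                 seed: int = 0) -> int:
    """Average many noisy race simulations per candidate, pick the fastest."""
    rng = random.Random(seed)
    avg = {p: sum(simulate_race(p, rng=rng) for _ in range(runs)) / runs
           for p in candidates}
    return min(avg, key=avg.get)
```

Under this linear-degradation model the mid-race stop wins, because total tire-age cost is minimized when the two stints are equal in length.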

AWS SageMaker is used for model training in the official F1 ecosystem, powering the “F1 Insights” graphics that fans see during broadcasts.

The Pit Wall Dashboard

Everything converges on a bank of screens at the pit wall. Grafana is the most common open-source visualization layer, connected to InfluxDB or ADX for real-time data.

Key dashboard panels include:

  • Tire temperature heatmap — Four tires, multiple surface zones, updated every second
  • Lap time delta — Current lap vs. personal best vs. theoretical best (combining best sectors)
  • Energy deployment timeline — Battery state of charge and ERS harvesting across a lap
  • Pit window calculator — When to pit based on current tire degradation, traffic, and undercut/overcut projections
  • Weather radar — Rain probability overlaid on strategy models
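The pit window and undercut panels boil down to a simple projection: pitting early pays off when the fresh-tire pace gain before the rival stops exceeds the current gap. A back-of-envelope sketch with invented numbers:

```python
# Back-of-envelope undercut projection. All inputs below are invented;
# the logic: fresh-tire pace gain over the laps before the rival pits,
# minus the out-lap penalty, must exceed the current gap.

def undercut_gain(fresh_tire_advantage: float,
                  laps_before_rival_pits: int,
                  out_lap_penalty: float) -> float:
    """Net seconds gained by pitting now versus the rival's later stop."""
    return fresh_tire_advantage * laps_before_rival_pits - out_lap_penalty

gap_to_rival = 1.8                       # seconds behind the car ahead
gain = undercut_gain(fresh_tire_advantage=1.2,
                     laps_before_rival_pits=2,
                     out_lap_penalty=0.5)
undercut_works = gain > gap_to_rival     # 1.9s projected gain vs 1.8s gap
```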

Professional motorsport teams also use Cosworth Pi Toolbox — the industry-standard telemetry analysis software used in IndyCar and WEC.

Beyond Racing

Racing telemetry is one of the most demanding data engineering use cases, but the same architecture maps directly onto problems in other industries:

| Racing | Other Industries |
| --- | --- |
| 1.1M sensor readings/sec | Financial market tick data |
| Sub-50ms pit strategy decisions | Real-time fraud detection |
| 20 cars × 300 sensors | IoT fleet management (thousands of devices) |
| Tire degradation prediction | Predictive maintenance in manufacturing |
| Lap simulation (1,000+ scenarios) | Monte Carlo risk modeling in finance |

If you can build a pipeline that handles Formula 1 telemetry, you can handle anything.

The Stack

| Layer | Tool | Why |
| --- | --- | --- |
| Ingestion | Kafka (Confluent Cloud) | Pub-sub at scale, partitioned by car/session |
| Processing | Apache Flink | True real-time stream processing, sub-50ms |
| Time-series DB | InfluxDB or Azure Data Explorer | Purpose-built for high-frequency sensor data |
| Data Lake | AWS S3 / Azure Data Lake (Parquet) | Long-term storage, batch replay |
| Analytics / ML | Databricks, SageMaker | Lap simulation, tire modeling, strategy optimization |
| Dashboards | Grafana | Real-time pit-wall visualization |
| Cloud | AWS (F1 official), Oracle (Red Bull), Google Cloud (McLaren) | Compute, storage, ML infrastructure |

Simba Hu helps companies make better decisions with data and AI — from strategy to implementation. Based in Tokyo, serving clients globally. Book a strategy call or visit simbahu.com.