11 years. 390,000 finishers. One result stands out: how fast you start relative to your ability group predicts your second-half collapse far more reliably than your raw pace. Here is the evidence.
390,000 finishers. 11 race years. One key question: does it matter how you pace the first half — not just how fast?
The short answer is yes, and the effect is larger than most runners expect. Runners who start 10% faster than their ability-group peers show, on average, about 11 extra percentage points of second-half slowdown compared with those who start in line with their peers. That association holds across every finish band from sub-3:00 to 6:00+.
Built with Python and pandas for data engineering (scraping, cleaning, feature engineering), statistical regression in NumPy, and Chart.js for interactive visualisation. The key methodological choice: normalising each runner's first-half pace relative to their finish-time peers rather than using raw pace — this within-group approach is what unlocks the r = −0.59 signal and makes the predictor meaningful across all ability levels.
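The within-group normalisation described above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the project's pandas pipeline; the 30-minute band width, the function names, and the sample values are all assumptions made for the example. Because every runner covers the same first-half distance, a ratio of first-half times is the same as a ratio of first-half paces.

```python
from collections import defaultdict
from statistics import median


def finish_band(finish_minutes, width=30):
    """Label a finish time by the lower edge of its band, in minutes.

    The 30-minute width is an assumption for this sketch.
    """
    return int(finish_minutes // width) * width


def relative_first_half_pace(runners, width=30):
    """Return each runner's first-half time divided by their band's median.

    `runners` is a list of (first_half_minutes, finish_minutes) tuples.
    Values below 1.0 mean a faster start than finish-time peers.
    """
    # Collect first-half times per finish band.
    by_band = defaultdict(list)
    for first_half, finish in runners:
        by_band[finish_band(finish, width)].append(first_half)

    # One median per band serves as the peer reference.
    band_median = {band: median(times) for band, times in by_band.items()}

    return [
        first_half / band_median[finish_band(finish, width)]
        for first_half, finish in runners
    ]


# Illustrative values only, not taken from the dataset.
sample = [(100, 215), (110, 225), (120, 235), (150, 330)]
print(relative_first_half_pace(sample))
```

In this sketch the first three runners share a band, so the first one, with a ratio of roughly 0.91, started about 9% faster than the band median.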
Six robust findings from 390,000 finishers (n = 389,122 with complete split data) across 11 race years.
The vast majority slow in the second half. Negative splits are genuinely rare — achieved by fewer than 6% of finishers.
The typical runner's second half is ~13% slower than their first. The fastest finishers are near 1.03×.
The biggest pace drop across the field occurs in the 30–35km segment — the classic marathon wall, quantified.
The pacing gap between the fastest and slowest finish bands is large and consistent across all 11 years.
Women's median split ratio is lower than men's within every finish time band — a consistent gap across the full dataset.
Going out 10% faster than your ability group is associated with ~11 extra percentage points of second-half slowdown. Raw pace is a weak predictor (r = 0.21); relative pace is a much stronger one (r = −0.59).
Select your target finish band to see how runners like you typically pace — and what the data suggests.
Based on the regression model (r = −0.59), this shows the average outcome observed for runners who started at different paces relative to their ability group. Set your target time, adjust the slider, and see the predicted outcome.
Based on linear regression across 390,000 finishers (r = −0.59, slope = −110 pp per unit relative pace). Predictions are population averages — individual results vary with fitness, conditions, and nutrition. "Ability group" means runners who finished in the same time band as your target.
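As a rough worked example of how the quoted slope turns into a prediction, the sketch below applies it to a relative pace and then converts the result to minutes. It makes two assumptions not spelled out in the text: that relative pace is first-half pace divided by the band median (so 0.90 means 10% faster than peers), and that slowdown is measured on the split ratio, so one percentage point equals 1% of the first-half time. The function names are hypothetical.

```python
SLOPE_PP_PER_UNIT = -110.0  # slope quoted in the text, in percentage points


def extra_slowdown_pp(relative_pace):
    """Predicted change in second-half slowdown versus an even-with-peers start.

    `relative_pace` is assumed to be first-half pace / band-median pace,
    so 0.90 means starting 10% faster than the ability group.
    """
    return SLOPE_PP_PER_UNIT * (relative_pace - 1.0)


def extra_minutes(relative_pace, first_half_minutes):
    """Convert the percentage-point change into second-half minutes.

    Assumes slowdown is measured on the split ratio (second half / first
    half), so one percentage point equals 1% of the first-half time.
    """
    return first_half_minutes * extra_slowdown_pp(relative_pace) / 100.0


print(round(extra_slowdown_pp(0.90), 1))   # percentage points
print(round(extra_minutes(0.90, 120), 1))  # minutes, for a 2:00 first half
```

Under these assumptions the cost in minutes grows with the first-half time, so the same percentage-point penalty means more lost minutes for slower finish bands.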
These three charts capture the main story. For the full breakdown across all 10 charts — gender, age, yearly trends, segment pacing, and pacing consistency — see the detailed findings page.
The distribution of split ratios (second-half ÷ first-half) peaks just above 1.0 with a long rightward tail. The median is ~1.13 — the typical runner's second half takes 13% longer than their first. Negative splits are achieved by fewer than 6% of finishers.
Takeaway: Even pacing is not the norm. Starting conservatively and holding pace is rare, not routine — even among experienced runners.
Hover any bar for exact count. Blue = negative split, red = positive split. Toggle Men / Women above to see how the distribution shifts.
Sub-3:00 runners show split ratios near 1.02–1.03. Runners finishing in 5:00–6:00 regularly reach 1.18–1.20 or above. The gap is large and consistent across all 11 years of data.
Caution: This is an association. Faster runners have greater aerobic capacity — the data cannot say whether even pacing causes faster finishing or simply reflects underlying fitness.
Hover any bar to see median, IQR, and runner count. Toggle Men / Women above to compare pacing by gender.
Raw first-half pace is a weak predictor (r = 0.21). But once ability is controlled for, by expressing each runner's pace as a ratio to their finish-band median, the correlation strengthens markedly to r = −0.59.
Going out 10% faster than your peer group is associated with ~11 extra percentage points of second-half slowdown. This is the project's strongest quantitative finding. It is a correlation, however: the data cannot prove that a slower start would have produced a better finish for those specific runners.
Hover bars to highlight the corresponding zone on the regression chart above.
Data sources, processing decisions, and assumptions. Understanding the methodology is essential for interpreting the findings correctly.
Results scraped from the official TCS London Marathon results platform (results.tcslondonmarathon.com), powered by mika:timing GmbH. Split times collected from individual runner detail pages.
Mass event finishers with a recorded finish time. Elite runners are excluded from mass charts. Virtual runners (different courses) are excluded entirely.
Split ratio = second-half time ÷ first-half time. Values above 1.0 indicate a positive split. Half time is cumulative elapsed time at the halfway mat (~21.1km).
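The split-ratio definition above can be computed from the two recorded times with a few lines of Python. This is a minimal sketch: the `H:MM:SS` input format and the function names are assumptions, and the second-half time is derived as finish time minus the cumulative halfway time.

```python
def to_seconds(clock):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def split_ratio(half_time, finish_time):
    """Return second-half time divided by first-half time.

    `half_time` is the cumulative elapsed time at the halfway mat, so the
    second half is the finish time minus the half time.
    """
    first_half = to_seconds(half_time)
    second_half = to_seconds(finish_time) - first_half
    if first_half <= 0 or second_half <= 0:
        raise ValueError("half time must be positive and earlier than finish time")
    return second_half / first_half


# A 2:00:00 first half and a 4:15:36 finish give a 2:15:36 second half.
print(split_ratio("2:00:00", "4:15:36"))
# A 1:30:00 first half and a 2:58:00 finish is a negative split (ratio below 1.0).
print(split_ratio("1:30:00", "2:58:00"))
```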
Median used throughout (not mean) because finish times are right-skewed. Regression used only where it adds meaning. No clustering applied without data-driven justification.
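The reason for preferring the median on right-skewed data can be seen in a small example. The finish times below are made up for illustration and are not taken from the dataset: most values cluster together and two late finishers form the tail.

```python
from statistics import mean, median

# Illustrative finish times in minutes, not taken from the dataset:
# most runners cluster together, a few finish much later.
finish_minutes = [225, 240, 250, 255, 265, 270, 285, 300, 390, 450]

print(mean(finish_minutes))    # pulled upward by the two late finishers
print(median(finish_minutes))  # stays with the bulk of the field
```

Here the mean sits well above the median even though eight of the ten runners finished in 300 minutes or less, which is the distortion the median avoids.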
Known Limitations
The analysis is in good shape. These are the remaining open questions worth pursuing.
A few lighter observations from the dataset — patterns that don't belong in the main analysis but are too interesting to leave out.
More runners finish with a 1.07 split ratio than with any other value: 14,740 of them. If you had to pick one number that describes how London usually goes wrong, this is it: a second half about 7% slower than the first.
Runners in their early forties pace more evenly (1.117×) than those under 40 (1.132×). Every age group above 54 fades progressively more — but the 40–44 bracket quietly outperforms the youngest runners by a measurable margin.
The 2018 race ran in 23°C+ heat. The median finish time was 23 minutes above the 10-year average, and 98.6% of runners posted a positive split — the highest rate in the dataset by a wide margin.
The gender pacing gap is small among fast finishers and peaks in the 4:30–5:00 band, where men fade 8.6 percentage points more than women on average. Hover to see the gap across every band.