Which numbers on your fitness tracker are real, and which to ignore

Trackers are good at counting things with a clean physical signal and bad at guessing the rest. Here is which numbers to act on, which to read as a trend, and which to treat with a fistful of salt.

[Image: close-up of a smartwatch on a person's wrist showing a heart rate reading of 104 beats per minute. Photo: Ahmed Akeri / Pexels]

Your fitness tracker is a brilliant pedometer wearing the costume of a lab. Some of the numbers on its screen are genuinely accurate; others are confident guesses dressed up as measurements. The single rule that predicts which is which: trackers are good at counting things that make a clean physical signal, and bad at modelling things they have to infer. A stride, a pulse beat and a satellite fix are countable. Your calorie burn, your sleep stage and how stressed you feel are not, so the device estimates them, and that is where the numbers wobble.

Here is the trust ranking, sorted by how strongly the evidence backs each metric, plus what to actually do with it.

Tier 1: trust the number

These are the metrics the device measures more or less directly. Act on them.

Steps. A 2025 living meta-analysis of Apple Watch studies put step error at 8.17% on average, inside the 10% threshold researchers treat as valid. Steps are a clean mechanical signal, so the watch counts them well. The catch is real and worth knowing: wrist trackers undercount badly when your arm is not swinging. Pushing a stroller, a shopping trolley or a wheelchair can hide 35% to 95% of your steps, and very slow walking is undercounted too, with error climbing sharply at a crawl. Your steps are not missing because you are unfit. They are missing because your wrist sat still.
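To make the 10% threshold concrete, here is a minimal sketch of how you could check your own tracker against a hand count on a short walk. The function name and the trial numbers are invented for illustration; they are not figures from the studies above.

```python
def step_error_pct(tracker_steps: int, counted_steps: int) -> float:
    """Absolute percentage error of the tracker against a hand count."""
    if counted_steps <= 0:
        raise ValueError("counted_steps must be positive")
    return abs(tracker_steps - counted_steps) / counted_steps * 100


VALID_THRESHOLD_PCT = 10.0  # the validity threshold cited in the text

# Hypothetical trials: (tracker reading, hand count)
trials = {
    "normal walk, arms swinging": (958, 1000),
    "pushing a trolley": (400, 1000),
}

for label, (tracker, counted) in trials.items():
    err = step_error_pct(tracker, counted)
    verdict = "within" if err <= VALID_THRESHOLD_PCT else "outside"
    print(f"{label}: {err:.1f}% error, {verdict} the 10% threshold")
```

A hand count of a few hundred steps, done once with arms swinging and once with your hands on a trolley, is usually enough to see the still-arm effect on your own device.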

Steady-state heart rate. This is the quiet success story of wearables. In the seminal 2017 Stanford evaluation of seven wrist devices, six measured heart rate to within 5% error. The 2025 Apple Watch meta-analysis landed at 4.43%. At a steady effort, optical heart rate is close enough to trust. The asterisk: it degrades during intervals and sprints, when arm motion and poor skin contact smear the signal, and the error climbs with intensity. So the number is trustworthy when you are jogging at a constant pace and shakier when you are doing hill repeats.

GPS pace and distance. A validation study of eight positioning-enabled sport watches found distance error of 3.2% to 6.1%, with only Polar's receivers landing under 5% overall. Distance tends to be underestimated, and accuracy drops in cities and forests, where buildings and tree cover scatter the satellite signal. Running produces more error than walking or cycling. For an easy park run on open ground, the distance is solid; for a route threading between HDB blocks in the CBD, knock off a little faith.

Tier 2: trust the trend, not the digit

Resting heart rate and sleep duration sit one rung down. The device measures them reliably enough to flag your own week-to-week changes, but not precisely enough to compare against a friend or a textbook figure.

For sleep, modern trackers nail the basic question. Across Oura, Apple Watch and Fitbit, sleep-versus-wake sensitivity runs at 95% or higher. The watch genuinely knows roughly how long you slept. Your resting heart rate creeping up over a stressful fortnight, or your sleep duration sliding after a month of late nights, is a signal worth reading. Just read the direction of travel, not the third decimal place.
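One way to read the trend instead of the digit is to compare weekly averages. The sketch below assumes you have exported daily resting heart rate values as a plain list, oldest first. The readings and the 3 bpm cut-off are illustrative assumptions, not validated thresholds.

```python
from statistics import mean


def weekly_shift(daily_values: list[float]) -> float:
    """Mean of the latest 7 days minus the mean of the 7 days before."""
    if len(daily_values) < 14:
        raise ValueError("need at least 14 daily values")
    previous_week = daily_values[-14:-7]
    latest_week = daily_values[-7:]
    return mean(latest_week) - mean(previous_week)


# Hypothetical resting heart rate readings (bpm), oldest first
resting_hr = [58, 57, 59, 58, 58, 57, 59,   # previous week
              60, 61, 60, 62, 61, 63, 62]   # latest week

shift = weekly_shift(resting_hr)
NOTICE_BPM = 3.0  # arbitrary cut-off chosen for this sketch

if abs(shift) >= NOTICE_BPM:
    print(f"Resting HR moved {shift:+.1f} bpm week on week: worth a look")
else:
    print(f"Resting HR moved {shift:+.1f} bpm week on week: small change")
```

Averaging over a week smooths out single-night noise, which is the point: the comparison is between you last week and you this week, not between you and anyone else.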

Tier 3: treat with a fistful of salt

Now the guesses. These metrics are inferred from the raw signals, and the inference is where accuracy goes to die.

Calorie burn. This is the single least accurate number on the device, full stop. In the 2017 Stanford study, not one of the seven wrist wearables measured energy expenditure accurately; the best was off by 27% and the worst by 93%. Eight years later, the 2025 meta-analysis found the Apple Watch's calorie error sitting at 27.96%, with cycling worse (around 52%) than walking or running (around 31%). The error has barely moved across hardware generations, which suggests this is a limit of guessing calories from a wrist signal, not a software bug a firmware update will fix.

The practical fallout: do not eat back the calories your watch says you burned. If the device tells you that spin class torched 600 calories and the truth is closer to 300, treating the screen as gospel is a reliable way to stall fat loss. The calorie figure is a vibe, not an accounting entry.
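As a rough illustration of why eating back the displayed figure is risky, the sketch below turns a displayed calorie number into a plausible range for the true burn. It assumes the error percentage is measured relative to the true value and can fall in either direction, which is a simplification. The error rates are the averages quoted above, so any single workout could land outside the range.

```python
def plausible_true_burn(displayed_kcal: float, error_pct: float) -> tuple[float, float]:
    """Range of true values consistent with a given relative error.

    Assumes |displayed - true| / true == error_pct / 100, in either direction.
    """
    e = error_pct / 100
    if not 0 <= e < 1:
        raise ValueError("error_pct must be in [0, 100)")
    low = displayed_kcal / (1 + e)   # case where the watch overestimated
    high = displayed_kcal / (1 - e)  # case where the watch underestimated
    return low, high


# Average errors quoted in the text (2025 Apple Watch meta-analysis)
for activity, err in {"walking or running": 31.0, "cycling": 52.0}.items():
    low, high = plausible_true_burn(600, err)
    print(f"600 kcal shown while {activity}: "
          f"true burn roughly {low:.0f} to {high:.0f} kcal")
```

A range that wide is too loose to budget a meal against, which is the practical case for ignoring the figure.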

Sleep-stage breakdowns. Knowing how long you slept is easy. Sorting that sleep into light, deep and REM is hard. A 2024 study against polysomnography (the clinical gold standard) found deep-sleep sensitivity of 79.5% for Oura, 61.7% for Fitbit and 50.5% for Apple Watch. An 11-device multicentre study covering 349,114 epochs of sleep found the best device managed a staging score (macro F1) of just 0.69, and the worst 0.26. So when your app announces you got 47 minutes of deep sleep, treat that as a loose estimate of a quantity even sleep labs find fiddly. The hypnogram is decorative.
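For readers unfamiliar with the two scores quoted here, the sketch below computes per-stage sensitivity and macro F1 from paired epoch labels. Sensitivity is the share of true deep-sleep epochs the device caught; macro F1 is the plain average of the per-stage F1 scores, so every stage counts equally however rare it is. The label lists are invented for illustration and are not data from either study.

```python
STAGES = ("wake", "light", "deep", "rem")


def per_stage_scores(reference: list[str], predicted: list[str]) -> dict[str, tuple[float, float]]:
    """Return {stage: (sensitivity, f1)} from paired epoch labels."""
    scores = {}
    pairs = list(zip(reference, predicted))
    for stage in STAGES:
        tp = sum(r == stage and p == stage for r, p in pairs)
        fn = sum(r == stage and p != stage for r, p in pairs)
        fp = sum(r != stage and p == stage for r, p in pairs)
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        scores[stage] = (sensitivity, f1)
    return scores


def macro_f1(scores: dict[str, tuple[float, float]]) -> float:
    """Unweighted mean of the per-stage F1 scores."""
    return sum(f1 for _, f1 in scores.values()) / len(scores)


# Hypothetical epochs: sleep-lab labels versus tracker labels
reference = ["wake", "light", "light", "light", "deep", "deep", "rem", "rem"]
predicted = ["wake", "light", "light", "deep", "deep", "light", "rem", "light"]

scores = per_stage_scores(reference, predicted)
print(f"deep-sleep sensitivity: {scores['deep'][0]:.0%}")
print(f"macro F1: {macro_f1(scores):.2f}")
```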

Stress scores and HRV "readiness". These are the boldest guesses of all, because they take an already-noisy input (heart-rate variability) and slap an emotional label on it. In a large real-world study of information workers, tracker HRV explained about 2.2% of the variance in how stressed people actually felt, which is roughly a correlation of 0.15. The researchers explicitly cautioned against calling HRV "stress" without proper validity data. A low score does not mean you are objectively stressed or that you must skip training. Treat it as a vague mood ring, not a verdict.
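The step from "2.2% of the variance" to "a correlation of 0.15" is a square root. The sketch below shows the arithmetic, assuming a simple linear relationship with a single predictor, where variance explained equals the correlation squared. The function name is illustrative.

```python
import math


def correlation_from_variance_explained(r_squared: float) -> float:
    """Magnitude of the correlation implied by a share of variance explained."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r_squared must be between 0 and 1")
    return math.sqrt(r_squared)


variance_explained = 0.022  # 2.2%, the figure quoted in the text
r = correlation_from_variance_explained(variance_explained)

print(f"correlation magnitude: {r:.2f}")
print(f"variance left unexplained: {1 - variance_explained:.1%}")
```

Read the other way round, about 98% of the variation in how stressed people felt was not captured by the tracker's HRV reading.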

Readiness, recovery and "body battery" scores deserve the same scepticism, because they are proprietary models built largely on the same shaky HRV and sleep-stage inputs. They can be a rough nudge to take it easier. They are not a number to obey.

Why better hardware can't fully fix it

Two limits are baked in, and the first is physical. Wrist heart rate uses optical sensors (PPG) that shine green light into your skin and read the reflection. Motion blurs that signal, and so does anything that absorbs the light: a 2025 study found error grows with exercise intensity and is larger for darker skin tones, because melanin absorbs green light. Tattoos under the sensor and higher BMI degrade it too. Steady-state heart rate is the safe zone; intervals, sprints and inked wrists are not.

The deeper issue is the gap between measuring a signal and inferring a quantity from it. The watch can read your pulse. It cannot read your metabolism, your brainwaves or your mood, so it models them from proxies, and a model is only ever as good as its weakest assumption. No price tag closes that gap.

The cheat sheet

| Metric | Trust level | What to do with it |
| --- | --- | --- |
| Steps | High | Act on it (mind the still-arm undercount) |
| Steady-state heart rate | High | Act on it (not for intervals) |
| GPS pace and distance | High | Act on it (worse in cities and forests) |
| Resting heart rate | Trend only | Watch your own week-to-week shift |
| Sleep duration | Trend only | Reliable enough to spot patterns |
| Calorie burn | Low | Do not eat it back |
| Sleep stages | Low | Loose estimate at best |
| Stress / HRV / readiness | Low | A nudge, never a verdict |

The Singapore angle is almost reassuring here. The National Steps Challenge, run through the Health Promotion Board's Healthy 365 app, rewards your daily step count and syncs with HPB-issued trackers plus Fitbit, Garmin, Huawei, Polar and Samsung. It runs on steps, the one metric that is genuinely trustworthy, with the still-arm caveat noted. Swing your arms on the way to the MRT and the count holds up.

Bottom line
Believe your steps, steady heart rate and GPS distance. Read resting HR and sleep duration as trends. And treat calorie burn, sleep stages and stress scores as rough suggestions, never instructions.

The Catalyst Feed Content Team