Sync & Dedup Engine
This page documents the deduplication engine, source priority order, and how rides flow from each provider. For the rider-facing version, see Connecting Strava, Garmin & Wahoo.
Why this is hard
A single ride recorded on a Garmin watch can arrive as a Garmin webhook, a Strava webhook (because Garmin auto-syncs to Strava), and a manual FIT upload. Each source reports slightly different timestamps, distances, and durations. The dedup engine has to recognize them as the same ride and decide what to keep.
Three-level dedup
- Provider ID match: unique indexes on strava_activity_id, garmin_activity_id, etc. Same provider re-importing the same ride is rejected instantly.
- Dedup key match: every ride gets a fingerprint, {start_minute}||{distance_m}, stored and indexed. Same fingerprint = same ride.
- Fuzzy match: when the fingerprint doesn't match exactly, RideTool checks for rides within ±2 minutes start time, ±15% duration, ±10% distance. Catches cross-source duplicates where each provider reported slightly different numbers.
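The fingerprint and fuzzy levels can be sketched roughly as follows. This is an illustration, not the code in dedup_service.py: the Ride class and the fingerprint and is_fuzzy_match helpers are hypothetical names, and the sketch assumes that start_minute means the start time truncated to the minute, that distance is rounded to a whole metre, and that the percentage tolerances are measured relative to the existing ride.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Ride:
    """Hypothetical minimal ride record used only for this sketch."""
    start: datetime      # ride start time
    distance_m: float    # distance in metres
    duration_s: float    # duration in seconds


def fingerprint(ride: Ride) -> str:
    # Assumption: start_minute is the start truncated to the minute,
    # and distance_m is rounded to the nearest metre.
    start_minute = ride.start.replace(second=0, microsecond=0).isoformat()
    return f"{start_minute}||{round(ride.distance_m)}"


def is_fuzzy_match(existing: Ride, incoming: Ride) -> bool:
    # Tolerances from the text: ±2 min start, ±15% duration, ±10% distance.
    # Assumption: percentages are relative to the existing ride.
    if abs(existing.start - incoming.start) > timedelta(minutes=2):
        return False
    if abs(existing.duration_s - incoming.duration_s) > 0.15 * existing.duration_s:
        return False
    return abs(existing.distance_m - incoming.distance_m) <= 0.10 * existing.distance_m


# Same ride as reported by two sources with slightly different numbers.
garmin = Ride(datetime(2024, 5, 1, 8, 0, 12), 40250.0, 5400.0)
strava = Ride(datetime(2024, 5, 1, 8, 0, 47), 40180.0, 5385.0)
# A different ride that starts five minutes later.
later = Ride(datetime(2024, 5, 1, 8, 5, 0), 40250.0, 5400.0)
```

Here the two fingerprints differ because the reported distances differ, so the exact key match fails, but the fuzzy check still pairs the Garmin and Strava records. The ride starting five minutes later falls outside the start-time window and is treated as a separate ride.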
Source priority
Higher priority wins on conflict:
- Manual FIT upload (4) — raw device file, richest data
- Garmin (3) — device-direct webhook
- Wahoo (2) — device-direct when recorded on Wahoo hardware
- Strava (1) — aggregator, least detail
When a duplicate is detected:
- Higher priority incoming → action: "replace" (delete existing, insert new)
- Lower priority incoming → action: "skip"
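The priority rule can be expressed as a small lookup. This is a minimal sketch, not the engine's actual code: the SOURCE_PRIORITY keys and the resolve_duplicate function are hypothetical names, and treating an equal-priority incoming ride as "skip" is an assumption, since this page only describes the higher and lower cases.

```python
# Priority values taken from the list above; the key names are hypothetical.
SOURCE_PRIORITY = {
    "manual_fit": 4,  # raw device file
    "garmin": 3,      # device-direct webhook
    "wahoo": 2,       # device-direct on Wahoo hardware
    "strava": 1,      # aggregator
}


def resolve_duplicate(existing_source: str, incoming_source: str) -> str:
    """Return the action for an incoming ride that duplicates an existing one."""
    if SOURCE_PRIORITY[incoming_source] > SOURCE_PRIORITY[existing_source]:
        return "replace"  # delete existing, insert new
    # Assumption: equal priority is treated like lower priority.
    return "skip"


garmin_over_strava = resolve_duplicate("strava", "garmin")
strava_over_garmin = resolve_duplicate("garmin", "strava")
```

With these values, a Garmin ride arriving after its Strava copy replaces it, while a Strava ride arriving after the Garmin copy is skipped.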
Known gaps
- Replace = delete + insert, not field-level merge. When a higher-priority source replaces a lower one, enrichment from the original (Strava ride name, social data) is lost.
- Race condition — if two webhooks arrive simultaneously for the same ride, both may pass dedup before either is inserted.
- Fuzzy thresholds — duplicates with timestamps or distances differing by more than the tolerances slip through.
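The race condition in the list above comes from the check and the insert being two separate steps. The following deterministic sketch simulates that interleaving with an in-memory list standing in for the rides collection. The passes_dedup helper and the fingerprint value are made up for illustration, and no real database or webhook handler is involved.

```python
from typing import List

rides: List[str] = []  # in-memory stand-in for the rides collection


def passes_dedup(dedup_key: str) -> bool:
    """Return True when no stored ride has this fingerprint."""
    return dedup_key not in rides


key = "2024-05-01T08:00:00||40250"  # made-up fingerprint

# Both handlers run their check before either has inserted.
garmin_ok = passes_dedup(key)
strava_ok = passes_dedup(key)

# Each handler then inserts based on its own, now stale, check result.
if garmin_ok:
    rides.append(key)
if strava_ok:
    rides.append(key)

duplicate_count = rides.count(key)
```

Because neither check saw the other handler's insert, both rides are stored and duplicate_count ends up at 2.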
When fitness recomputes
- After each ride upload — immediate recalculation.
- Nightly — around midnight in the user's local timezone (fatigue decays even on rest days).
- Manual rebuild — from Account → Recompute Fitness Stats.
Key files in the codebase
- backend/services/dedup_service.py: core engine (DedupService, make_dedup_key)
- backend/services/mongo_strava_sync_service.py: Strava sync with dedup
- backend/services/mongo_garmin_sync_service.py: Garmin sync (async)
- backend/services/mongo_wahoo_sync_service.py: Wahoo sync
- tests/test_dedup_service.py: unit tests