Smoke-testing the just-merged provenance PR against live logs showed
the new ID-match block was silently no-opping: no [ExtID Match] /
[Provenance Match] log lines despite the code path being live. Tracing
turned up two related gaps in ``extract_external_ids``' source detection:
1. **Underscore-prefixed key.** Deezer / Discogs / Hydrabase clients
tag normalized track dicts with ``_source`` (underscore prefix —
convention used in 8+ places across core/). The extractor only
looked for ``provider`` and ``source``, so Deezer-sourced tracks
silently returned no IDs.
2. **No provider field at all.** Spotify and iTunes raw API responses
carry ``id`` but no provider/source key of any kind. The extractor
couldn't disambiguate the native ``id``, so Spotify-primary scans
would have hit the same silent miss once the user switched primary
sources.
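
To illustrate the two gaps, here is a minimal sketch. The track dicts and their ID values are made up, and ``old_detect_provider`` is a hypothetical stand-in for the pre-fix lookup, not the actual body of ``extract_external_ids``:

```python
from typing import Optional


def old_detect_provider(track: dict) -> Optional[str]:
    """Hypothetical stand-in for the pre-fix lookup: only two keys are checked."""
    return track.get("provider") or track.get("source")


# Gap 1: the provider is tagged under an underscore-prefixed key.
deezer_track = {"id": "12345", "_source": "deezer"}

# Gap 2: a raw API response carries an id but no provider key at all.
spotify_track = {"id": "abcde"}

# Neither dict matches "provider" or "source", so no provider is detected
# and the native id cannot be attributed to any service.
deezer_provider = old_detect_provider(deezer_track)
spotify_provider = old_detect_provider(spotify_track)
```

Both lookups come back empty, which matches the silent miss seen in the logs.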
Two-part fix:
- ``extract_external_ids`` now recognizes ``_source`` as another
candidate provider field.
- New optional ``source_hint`` parameter lets the caller supply the
configured primary source as a fallback when the track dict has no
provider field of its own. Track-side provider field still wins
when present (defensive against a wrong hint).
The watchlist scanner now passes ``get_primary_source()`` as the hint so
both naming conventions (Deezer-style ``_source``, Spotify-style no-tag)
get handled uniformly.
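
The resolution order described above could be sketched roughly as follows. ``resolve_provider`` and ``PROVIDER_KEYS`` are hypothetical names used only for illustration, and the relative order of the three keys is an assumption; the only behavior taken from this change is that any track-side field beats the hint:

```python
from typing import Optional

# Candidate provider fields; "_source" is the newly recognized one.
# The ordering among these keys is assumed, not taken from the real code.
PROVIDER_KEYS = ("provider", "source", "_source")


def resolve_provider(track: dict, source_hint: Optional[str] = None) -> Optional[str]:
    """Return the track's own provider tag if present, else the caller's hint."""
    for key in PROVIDER_KEYS:
        value = track.get(key)
        if value:
            # Track-side field wins, which guards against a wrong hint.
            return value
    # No provider field on the track: fall back to the hint (may be None).
    return source_hint


# Deezer-style dict: the tag is found on the track itself.
tagged = resolve_provider({"id": "12345", "_source": "deezer"}, source_hint="spotify")

# Spotify-style raw dict: nothing on the track, so the hint is used.
hinted = resolve_provider({"id": "abcde"}, source_hint="spotify")

# No tag and no hint: resolves to None instead of raising.
unknown = resolve_provider({"id": "abcde"})
```

In this sketch a caller such as the watchlist scanner would pass the result of ``get_primary_source()`` as ``source_hint``.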
6 new regression tests cover:
- _source recognized for Deezer
- _source recognized for Hydrabase (cross-provider mapping)
- _source recognized for Discogs (no library column — verifies
graceful no-crash)
- source_hint disambiguates raw tracks for spotify/itunes/deezer
- track-side provider takes precedence over hint
- None hint defaults safely
Full pytest run: 1630 passed; ruff clean. After this lands and the server
restarts, watchlist scans should produce [ExtID Match] /
[Provenance Match] log lines for tracks already on disk regardless of
which metadata source the user has configured as primary.