SoulSync

Commit Graph

Author	SHA1	Message	Date
Broque Thomas	34ba26f5c8	Persist source IDs at download time + backfill onto tracks on sync Followup to fix/watchlist-external-id-match. The companion PR closed the demand side — the watchlist scanner asks for tracks by external IDs before falling back to fuzzy. But for users on Plex / Jellyfin / Navidrome the supply side was still broken: tracks.spotify_track_id (and the other ID columns) only got populated by the asynchronous enrichment workers, sometimes hours after the file was actually written. During that window the ID match fell through to fuzzy and the bug returned. We were already collecting every ID during post-processing — they live in the `pp` dict in core/metadata/source.py:embed_source_ids and get embedded into file tags. We just dropped the in-memory copy afterwards. This PR persists them and uses them: - Schema migration adds spotify_track_id / itunes_track_id / deezer_track_id / tidal_track_id / qobuz_track_id / musicbrainz_recording_id / audiodb_id / soul_id / isrc columns + indexes to the existing track_downloads table (already keyed by file_path). - core/metadata/source.py:embed_source_ids exposes pp["id_tags"] and the resolved ISRC back to the import context as _embedded_id_tags / _isrc. - core/imports/side_effects.py:record_download_provenance reads those context fields and passes them to db.record_track_download, which now accepts the new ID kwargs and persists them. - New db.get_provenance_by_file_path with exact + basename-suffix fallback (handles container mount-root differences between download-time path and media-server-reported path). - New db.backfill_track_external_ids_from_provenance copies IDs from track_downloads onto a tracks row idempotently — COALESCE on every column preserves any value the enrichment worker already wrote (enrichment is more authoritative for late binding). - database/music_database.py:insert_or_update_media_track (the single insertion point used by every Plex / Jellyfin / Navidrome sync) calls the backfill immediately after each INSERT/UPDATE. - New core/library/track_identity.py:find_provenance_by_external_id used as a second-tier fallback in watchlist_scanner.is_track_missing _from_library — catches the window between download and media-server sync. Caller checks os.path.exists on the provenance file_path before treating it as "already in library" so a deleted file doesn't prevent re-download. Effect: freshly downloaded files become ID-recognizable to the watchlist on the very next scan, no enrichment-wait window. 19 regression tests in tests/test_provenance_id_persistence.py: - Schema migration adds expected columns + indexes - record_track_download persists every ID kwarg - record_track_download backward-compat (old kwargs still work) - get_provenance_by_file_path: exact match, basename fallback for mount-root differences, multi-record latest-wins, defensive None - backfill: copies all IDs, preserves existing via COALESCE, no-op when no provenance exists - find_provenance_by_external_id: per-ID lookup, ISRC cross-bridge, OR semantics, latest-wins on multiple matches Out of scope: backfilling provenance for files downloaded BEFORE this PR (their track_downloads rows don't carry the new IDs). Those continue to wait for enrichment. Acceptable — only affects historical files; new downloads benefit immediately. Full pytest 1625 passed; ruff clean.	4 weeks ago
elmerohueso	cd19aa0301	revert tidal artist/track id name for hifi downloads Co-authored-by: Copilot <copilot@github.com>	4 weeks ago
elmerohueso	4ddb86522c	name tidal and hifi tags the same way	4 weeks ago
elmerohueso	e78dd7f593	get tidal tags during download, without needing to go through the enrichment pipeline	4 weeks ago
elmerohueso	1f4e8e5e3b	get hifi tags during download, without needing to go through the enrichment pipeline	4 weeks ago
elmerohueso	b363afe195	bpm for tidal, copyright and bpm for hifi	4 weeks ago
elmerohueso	f9f47f978e	fix post-download tagging, and enable tagging for hifi	4 weeks ago
Antti Kettunen	a759f778b6	Move metadata API into package - add package-owned metadata API, cache, registry, and lookup modules - keep legacy metadata_service and metadata_cache paths as explicit shims - update metadata call sites and tests to use package-owned helpers	1 month ago
Broque Thomas	c121582557	MusicBrainz genres: fall back to release then artist when recording is empty User report: SoulSync was only pulling MusicBrainz genres from the recording (track-level) endpoint. Most MB recordings don't carry genres at the track level — they live on the release (album) or artist. So the MB tier was contributing nothing to the genre merge for the overwhelming majority of tracks. Fix: - Added `'genres'` to the release-detail `includes` (was missing). - After release-detail processing, if pp['mb_genres'] is still empty, populate from release_detail['genres'] (sorted by count desc). - If still empty AND artist_mbid is set, fetch artist with `includes=['genres']` and use those. No extra API call when the recording (or release) already had genres — the artist fetch only fires when both upstream tiers came back empty. The downstream genre merge in _embed_metadata_genres is unchanged; this just makes the MB feed into it richer. Tests: 4 new (recording present, recording empty → release, recording + release empty → artist, all empty → []). Full suite 873 passing. Ruff clean. Reported by @kcaoyef421 in Discord.	1 month ago
Antti Kettunen	02305096a3	Tighten metadata and import safety - Normalize album import track display handling so queue labels and match rows stay consistent - Bound MusicBrainz caches and avoid caching transient lookup failures - Stop swallowing programmer errors in source enrichment helpers - Restore import config test seams without reintroducing lazy imports - Guard task completion calls and fix the Windows path test expectation - Keep file lock tracking from growing without bound	1 month ago
Antti Kettunen	9e496397da	Move shared metadata helpers into package - Relocate the shared metadata helper module from core/metadata_common.py into core/metadata/common.py. - Update the new metadata package, the import pipeline, and the web entrypoint to use the package-scoped helper. - Keep the shared config, mutagen, file-lock, and tag-writing helpers centralized without touching unrelated files.	1 month ago
Antti Kettunen	9656dbd46a	Thread runtime through metadata enrichment - Pass the live runtime bundle into the shared metadata facade so worker-backed source enrichment can actually run. - Forward runtime from the import pipeline and web-server wrapper into embed_source_ids. - Add a regression test that verifies the runtime object reaches the source-ID embedding path.	1 month ago
Antti Kettunen	8319c6679f	Move new metadata helpers into a package - Keep existing metadata_cache and metadata_service at the top level for now - Move the new branch-local metadata helpers under core/metadata - Share MusicBrainz release cache state from core.metadata.source and update import sites	1 month ago

13 Commits (03a7ccd74a511cfa79af9726dec698dc982a2fe5)