Honor manually-matched source IDs in per-source enrichment workers

GitHub issue #501 (@Tacobell444). After manually matching an album to
a specific source ID via the match-chip UI, clicking "Enrich" on that
album would fuzzy-search by name and overwrite the manual match with
whatever the search returned — or revert the match status to
``not_found`` if name search missed. Reorganize would then read the
now-wrong ID and move files to the wrong destination.

Root cause was in the per-source enrichment workers'
``_process_*_individual`` methods. Several workers (Spotify, iTunes)
ran search-by-name unconditionally with no check for an existing
stored ID. Others (Deezer, Tidal, Qobuz) skipped on existing-ID but
without refreshing metadata — preserved the ID but didn't actually
honor the user's intent of "use this match to pull fresh data".
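The pre-fix failure mode can be shown as a toy sketch. Everything here (`search_by_name`, `enrich_pre_fix`, the dict standing in for a DB row) is an illustrative stand-in, not the real worker code:

```python
# Toy sketch of the pre-fix bug: a worker that runs search-by-name
# unconditionally clobbers a manually stored source ID.

def search_by_name(name):
    """Stand-in fuzzy search that returns some other catalog entry."""
    return {"id": "WRONG-123", "name": name}

def enrich_pre_fix(row):
    """Pre-fix behavior: no check for an existing stored ID."""
    result = search_by_name(row["name"])
    row["spotify_album_id"] = result["id"]  # manual match overwritten
    return row

album = {"name": "Blonde", "spotify_album_id": "MANUAL-ABC"}  # user-set
enrich_pre_fix(album)
assert album["spotify_album_id"] == "WRONG-123"  # manual match is gone
```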

Cin-shape lift: same fix needed in 5 workers, so extracted the shared
behavior into ``core/enrichment/manual_match_honoring.py``:

    honor_stored_match(
        db, entity_table, entity_id, id_column,
        client_fetch_fn, on_match_fn, log_prefix,
    ) -> bool

Per-worker variability (DB column name, client fetch method, response
shape) plugs in via callbacks. Workers call the helper at the top of
``_process_album_individual`` / ``_process_track_individual``; if it
returns True, the manual match was honored and the search-by-name
fallback is skipped. If False (no stored ID, fetch failed, or empty
response), the worker's existing search-by-name flow runs as before.
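The True/False contract above can be sketched without the DB plumbing. This is a minimal sketch mirroring the helper's control flow; the `read_stored_id` callable replaces the real SQLite column read and is an assumption of this sketch, not part of the actual signature:

```python
from typing import Any, Callable, Optional

def honor_stored_match_sketch(
    read_stored_id: Callable[[], Optional[str]],
    client_fetch_fn: Callable[[str], Any],
    on_match_fn: Callable[[str, Any], None],
) -> bool:
    stored_id = read_stored_id()
    if not stored_id:           # no manual match -> fall through to search
        return False
    try:
        api_data = client_fetch_fn(stored_id)
    except Exception:           # transient API failure -> fall through
        return False
    if not api_data:            # dead/removed catalog entry -> fall through
        return False
    on_match_fn(stored_id, api_data)  # refresh metadata, keep the ID
    return True

# True path: stored ID exists and the fetch succeeds.
updates = []
assert honor_stored_match_sketch(
    read_stored_id=lambda: "SP-ABC",
    client_fetch_fn=lambda sid: {"id": sid, "name": "Real Album"},
    on_match_fn=lambda sid, data: updates.append((sid, data["name"])),
)
assert updates == [("SP-ABC", "Real Album")]

# False path: no stored ID, so the caller runs its search-by-name flow.
assert not honor_stored_match_sketch(
    read_stored_id=lambda: None,
    client_fetch_fn=lambda sid: {},
    on_match_fn=lambda sid, data: None,
)
```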

Workers wired:

- spotify_worker — album + track (was overwriting; now honors)
- itunes_worker — album + track (was overwriting; now honors)
- deezer_worker — album + track (was skip-on-id; now refreshes)
- tidal_worker — album + track (was skip-on-id; now refreshes)
- qobuz_worker — album + track (was skip-on-id; now refreshes)

Workers left alone (already correct):

- discogs_worker — already had inline stored-ID fast path that
  refreshes metadata. Same behavior, just inline; refactoring to use
  the shared helper would be churn for zero behavior change.
- audiodb_worker — same — inline fast path with full metadata refresh.
- musicbrainz_worker — preserves existing MBID and marks status,
  which is the correct behavior for MB (the MBID itself is the match
  payload — no separate metadata fetch).
- lastfm_worker / genius_worker — name-based services with no
  source-specific IDs to honor. Inherent re-search per call.

Reorganize fixed indirectly — it always honored stored IDs correctly
via ``library_reorganize._extract_source_ids``. The "Reorganize broken"
symptom was downstream of broken Enrich corrupting the stored ID.

Tests:

- ``tests/enrichment/test_manual_match_honoring.py`` — 11 tests
  pinning the shared helper contract: stored-ID fast path, no-ID
  fallthrough, empty-string treated as no ID, missing row, fetch
  exception caught and falls through, fetch returns None falls
  through, callback exceptions propagate, configurable table +
  column, defensive table-name whitelist.

- Per-worker wiring NOT tested individually — the workers depend
  on live DB / client objects that are heavy to mock. The shared
  helper's contract is pinned; per-worker call sites are short
  enough to verify by code review.

2173/2173 full suite green.

Closes #501.
pull/510/head
Broque Thomas 1 week ago
parent 4dd0640799
commit cceffbd8ec

@ -9,6 +9,7 @@ from utils.logging_config import get_logger
from database.music_database import MusicDatabase
from core.deezer_client import DeezerClient
from core.worker_utils import interruptible_sleep, set_album_api_track_count
from core.enrichment.manual_match_honoring import honor_stored_match
logger = get_logger("deezer_worker")
@ -383,11 +384,33 @@ class DeezerWorker:
self.stats['not_found'] += 1
logger.debug(f"No match for artist '{artist_name}'")
def _refresh_album_via_stored_id(self, album_id, stored_id, full_album_dict):
"""Issue #501 callback. Stored ID exists → fetched full Deezer
album payload. Use it as both args to ``_update_album`` (the
search-result and full-data shapes overlap on the fields we need;
artist verification is skipped since the manual match presumably
already vetted it)."""
self._update_album(album_id, full_album_dict, full_album_dict)
def _refresh_track_via_stored_id(self, track_id, stored_id, full_track_dict):
"""Issue #501 callback for tracks — same pattern as albums."""
self._update_track(track_id, full_track_dict, full_track_dict)
def _process_album(self, album_id: int, album_name: str, artist_name: str, item: Dict[str, Any]):
"""Process an album: search Deezer, verify, fetch full details, store metadata"""
existing_id = self._get_existing_id('album', album_id)
if existing_id:
logger.debug(f"Preserving existing Deezer ID for album '{album_name}': {existing_id}")
# Issue #501: honor manual matches. Pre-fix this method just
# SKIPPED when a stored ID was present (preserved the ID but
# never refreshed metadata). Now it goes through the full
# refresh path via the stored ID, picking up label / genres /
# explicit updates without ever overwriting the manual match.
if honor_stored_match(
db=self.db, entity_table='albums', entity_id=album_id,
id_column='deezer_id',
client_fetch_fn=self.client.get_album_raw,
on_match_fn=self._refresh_album_via_stored_id,
log_prefix='Deezer',
):
self.stats['matched'] += 1
return
result = self.client.search_album(artist_name, album_name)
@ -430,9 +453,15 @@ class DeezerWorker:
def _process_track(self, track_id: int, track_name: str, artist_name: str, item: Dict[str, Any]):
"""Process a track: search Deezer, verify, fetch full details for BPM, store metadata"""
existing_id = self._get_existing_id('track', track_id)
if existing_id:
logger.debug(f"Preserving existing Deezer ID for track '{track_name}': {existing_id}")
# Issue #501: honor manual matches (see _process_album).
if honor_stored_match(
db=self.db, entity_table='tracks', entity_id=track_id,
id_column='deezer_id',
client_fetch_fn=self.client.get_track_raw,
on_match_fn=self._refresh_track_via_stored_id,
log_prefix='Deezer',
):
self.stats['matched'] += 1
return
result = self.client.search_track(artist_name, track_name)

@ -0,0 +1,128 @@
"""Honor manually-matched source IDs in per-source enrichment workers.
GitHub issue #501 (@Tacobell444): every per-source enrichment worker's
``_process_*_individual`` method ran a fuzzy text search on the album /
track name and overwrote the stored source ID with whatever the search
returned. If the user had manually matched an album to a specific source
ID (e.g. set ``albums.spotify_album_id = 'ABC'`` via the match-chip UI),
the next "Enrich" click would search by name, pick a different
result, and overwrite the manual match with the wrong ID, OR fail
to match anything and revert the status to ``not_found``.
This module lifts the "honor stored ID" fast path into one shared
helper. Each per-source worker that needs it (Spotify / iTunes /
Deezer / Tidal / Qobuz) calls it before falling back to its
existing search-by-name flow. The same fix in 5 workers gets
exactly one implementation; per-worker variability (column name,
client fetch method, response shape) plugs in via callbacks.
Lift what's truly shared. Caller knows its own column + client
method + update logic; the helper just orchestrates.
"""
from __future__ import annotations
from typing import Any, Callable, Optional
from utils.logging_config import get_logger
logger = get_logger("enrichment.manual_match_honoring")
def _read_id_column(db, entity_table: str, entity_id, id_column: str) -> Optional[str]:
"""Read the stored source ID for one entity. Returns None when the
column is empty / unset."""
if entity_table not in ('albums', 'tracks', 'artists'):
# Defensive: we only operate on these three. Avoids SQL injection
# via a bad table name (id_column is also restricted to known
# column names by callers but defense in depth never hurts).
return None
conn = db._get_connection()
try:
cursor = conn.cursor()
cursor.execute(
f"SELECT {id_column} FROM {entity_table} WHERE id = ?",
(entity_id,),
)
row = cursor.fetchone()
finally:
conn.close()
if not row:
return None
value = row[0] if not hasattr(row, 'keys') else row[id_column]
return str(value).strip() if value else None
def honor_stored_match(
*,
db,
entity_table: str,
entity_id,
id_column: str,
client_fetch_fn: Callable[[str], Any],
on_match_fn: Callable[[Any, str, Any], None],
log_prefix: str = '',
) -> bool:
"""Fast-path enrichment via a stored source ID — preserves manual
matches.
Args:
db: ``MusicDatabase`` instance (for the column read).
entity_table: ``'albums'``, ``'tracks'``, or ``'artists'``.
entity_id: Library DB ID of the entity to enrich.
id_column: Column on ``entity_table`` that stores the source-
specific ID (``spotify_album_id`` / ``itunes_album_id`` /
``deezer_id`` / etc).
client_fetch_fn: Callable taking the stored ID and returning
the source's raw response (Album dataclass, dict, or
whatever the client returns). Typically
``self.client.get_album`` or ``self.client.get_track``.
on_match_fn: Worker callback invoked with
``(entity_id, stored_id, api_response)`` to apply the
metadata refresh. Worker knows the response shape; helper
doesn't.
log_prefix: Display name for log lines (``'Spotify'`` /
``'iTunes'`` / etc).
Returns:
True if a stored ID was found AND the fetch returned data AND
the on-match callback ran. Caller skips its search-by-name
flow and counts a match.
False if no stored ID is set, the fetch failed, or the fetch
returned empty. Caller falls through to its existing search-
by-name flow (the legacy behavior for un-matched entities).
Notes:
- Exceptions in ``client_fetch_fn`` are caught and logged at
warning level; the caller falls through to search.
- Exceptions in ``on_match_fn`` propagate (those are real
DB errors the worker should know about).
"""
stored_id = _read_id_column(db, entity_table, entity_id, id_column)
if not stored_id:
return False
try:
api_data = client_fetch_fn(stored_id)
except Exception as exc:
logger.warning(
f"[{log_prefix}] Stored-ID fetch failed for "
f"{entity_table[:-1]} #{entity_id} (id={stored_id}): {exc}"
)
return False
if not api_data:
logger.debug(
f"[{log_prefix}] Stored ID {stored_id} for "
f"{entity_table[:-1]} #{entity_id} returned empty data — "
f"falling through to search-by-name"
)
return False
on_match_fn(entity_id, stored_id, api_data)
logger.info(
f"[{log_prefix}] Honored manual match: "
f"{entity_table[:-1]} #{entity_id}: {id_column}={stored_id}"
)
return True

@ -3,12 +3,14 @@ import re
import threading
import time
from difflib import SequenceMatcher
from types import SimpleNamespace
from typing import Optional, Dict, Any, List
from datetime import datetime, timedelta
from utils.logging_config import get_logger
from database.music_database import MusicDatabase
from core.itunes_client import iTunesClient
from core.worker_utils import interruptible_sleep, set_album_api_track_count
from core.enrichment.manual_match_honoring import honor_stored_match
logger = get_logger("itunes_worker")
@ -526,11 +528,49 @@ class iTunesWorker:
# ── Individual fallback processing ─────────────────────────────────
def _refresh_album_via_stored_id(self, album_id, stored_id, api_album_dict):
"""Issue #501 callback. Convert ``client.get_album()`` dict into
the Album-shaped object ``_update_album`` expects, then call it.
Preserves the manual match; never overwrites the stored ID
with a different name-search result."""
images = api_album_dict.get('images') or []
image_url = ''
if images and isinstance(images[0], dict):
image_url = images[0].get('url', '') or ''
adapter = SimpleNamespace(
id=api_album_dict.get('id') or stored_id,
name=api_album_dict.get('name', ''),
image_url=image_url,
album_type=api_album_dict.get('album_type', 'album'),
release_date=api_album_dict.get('release_date', ''),
total_tracks=api_album_dict.get('total_tracks', 0),
)
self._update_album(album_id, adapter)
def _refresh_track_via_stored_id(self, track_id, stored_id, api_track_dict):
"""Track-level callback — track update only writes ID + status,
no metadata backfill, so the dict shape is irrelevant beyond
carrying the stored ID through."""
adapter = SimpleNamespace(id=api_track_dict.get('id') or stored_id)
self._update_track_from_search(track_id, adapter)
def _process_album_individual(self, item: Dict[str, Any]):
album_id = item['id']
album_name = item['name']
artist_name = item.get('artist', '')
# Issue #501: honor manual matches (see SpotifyWorker for full
# explanation — same pattern across every per-source worker).
if honor_stored_match(
db=self.db, entity_table='albums', entity_id=album_id,
id_column='itunes_album_id',
client_fetch_fn=self.client.get_album,
on_match_fn=self._refresh_album_via_stored_id,
log_prefix='iTunes',
):
self.stats['matched'] += 1
return
query = f"{artist_name} {album_name}" if artist_name else album_name
results = self.client.search_albums(query, limit=5)
@ -561,6 +601,17 @@ class iTunesWorker:
track_name = item['name']
artist_name = item.get('artist', '')
# Issue #501: honor manual matches.
if honor_stored_match(
db=self.db, entity_table='tracks', entity_id=track_id,
id_column='itunes_track_id',
client_fetch_fn=self.client.get_track_details,
on_match_fn=self._refresh_track_via_stored_id,
log_prefix='iTunes',
):
self.stats['matched'] += 1
return
query = f"{artist_name} {track_name}" if artist_name else track_name
results = self.client.search_tracks(query, limit=5)

@ -9,6 +9,7 @@ from utils.logging_config import get_logger
from database.music_database import MusicDatabase
from core.qobuz_client import _qobuz_is_rate_limited
from core.worker_utils import interruptible_sleep
from core.enrichment.manual_match_honoring import honor_stored_match
logger = get_logger("qobuz_worker")
@ -426,12 +427,26 @@ class QobuzWorker:
self.stats['not_found'] += 1
logger.debug(f"No match for artist '{artist_name}'")
def _refresh_album_via_stored_id(self, album_id, stored_id, full_album_dict):
"""Issue #501 callback. Same shape as Tidal/Deezer — pass the
full-album dict in both arg slots."""
self._update_album(album_id, full_album_dict, full_album_dict)
def _refresh_track_via_stored_id(self, track_id, stored_id, full_track_dict):
self._update_track(track_id, full_track_dict, full_track_dict)
def _process_album(self, album_id: int, album_name: str, artist_name: str, item: Dict[str, Any]):
"""Process an album: search Qobuz, verify, fetch full details, store metadata"""
existing_id = self._get_existing_id('album', album_id)
if existing_id:
logger.debug(f"Preserving existing Qobuz ID for album '{album_name}': {existing_id}")
self._mark_status('album', album_id, 'matched')
# Issue #501: honor manual matches. Pre-fix this just marked
# status='matched' without refreshing metadata.
if honor_stored_match(
db=self.db, entity_table='albums', entity_id=album_id,
id_column='qobuz_id',
client_fetch_fn=self.client.get_album,
on_match_fn=self._refresh_album_via_stored_id,
log_prefix='Qobuz',
):
self.stats['matched'] += 1
return
result = self.client.search_album(artist_name, album_name)
@ -484,10 +499,15 @@ class QobuzWorker:
def _process_track(self, track_id: int, track_name: str, artist_name: str, item: Dict[str, Any]):
"""Process a track: search Qobuz, verify, fetch full details, store metadata"""
existing_id = self._get_existing_id('track', track_id)
if existing_id:
logger.debug(f"Preserving existing Qobuz ID for track '{track_name}': {existing_id}")
self._mark_status('track', track_id, 'matched')
# Issue #501: honor manual matches.
if honor_stored_match(
db=self.db, entity_table='tracks', entity_id=track_id,
id_column='qobuz_id',
client_fetch_fn=self.client.get_track,
on_match_fn=self._refresh_track_via_stored_id,
log_prefix='Qobuz',
):
self.stats['matched'] += 1
return
result = self.client.search_track(artist_name, track_name)

@ -3,12 +3,14 @@ import re
import threading
import time
from difflib import SequenceMatcher
from types import SimpleNamespace
from typing import Optional, Dict, Any, List
from datetime import datetime, date, timedelta
from utils.logging_config import get_logger
from database.music_database import MusicDatabase
from core.spotify_client import SpotifyClient, SpotifyRateLimitError
from core.worker_utils import interruptible_sleep, set_album_api_track_count
from core.enrichment.manual_match_honoring import honor_stored_match
logger = get_logger("spotify_worker")
@ -630,11 +632,53 @@ class SpotifyWorker:
# ── Individual fallback processing ─────────────────────────────────
def _refresh_album_via_stored_id(self, album_id, stored_id, api_album_dict):
"""``honor_stored_match`` callback. Wraps the dict from
``client.get_album(stored_id)`` in a SimpleNamespace adapter
with the attributes ``_update_album`` reads, then calls it.
Preserves the manual match; never reaches search-by-name."""
images = api_album_dict.get('images') or []
image_url = ''
if images and isinstance(images[0], dict):
image_url = images[0].get('url', '') or ''
adapter = SimpleNamespace(
id=api_album_dict.get('id') or stored_id,
name=api_album_dict.get('name', ''),
image_url=image_url,
album_type=api_album_dict.get('album_type', 'album'),
release_date=api_album_dict.get('release_date', ''),
total_tracks=api_album_dict.get('total_tracks', 0),
)
self._update_album(album_id, adapter)
def _refresh_track_via_stored_id(self, track_id, stored_id, api_track_dict):
"""``honor_stored_match`` callback for tracks. The track-level
update only writes the ID + match status; no metadata
backfill, so the dict shape is irrelevant beyond carrying the
stored ID through."""
adapter = SimpleNamespace(id=api_track_dict.get('id') or stored_id)
self._update_track_from_search(track_id, adapter)
def _process_album_individual(self, item: Dict[str, Any]):
album_id = item['id']
album_name = item['name']
artist_name = item.get('artist', '')
# Issue #501: honor manual matches. If the user has already
# set spotify_album_id on this album row (via match-chip UI),
# refresh metadata via that ID and skip search-by-name — which
# would otherwise overwrite the manual match with whatever
# name-search returned.
if honor_stored_match(
db=self.db, entity_table='albums', entity_id=album_id,
id_column='spotify_album_id',
client_fetch_fn=self.client.get_album,
on_match_fn=self._refresh_album_via_stored_id,
log_prefix='Spotify',
):
self.stats['matched'] += 1
return
query = f"{artist_name} {album_name}" if artist_name else album_name
results = self.client.search_albums(query, limit=5)
@ -672,6 +716,17 @@ class SpotifyWorker:
track_name = item['name']
artist_name = item.get('artist', '')
# Issue #501: honor manual matches (see _process_album_individual).
if honor_stored_match(
db=self.db, entity_table='tracks', entity_id=track_id,
id_column='spotify_track_id',
client_fetch_fn=self.client.get_track_details,
on_match_fn=self._refresh_track_via_stored_id,
log_prefix='Spotify',
):
self.stats['matched'] += 1
return
query = f"{artist_name} {track_name}" if artist_name else track_name
results = self.client.search_tracks(query, limit=5)

@ -8,6 +8,7 @@ from utils.logging_config import get_logger
from database.music_database import MusicDatabase
from core.tidal_client import TidalClient
from core.worker_utils import interruptible_sleep
from core.enrichment.manual_match_honoring import honor_stored_match
logger = get_logger("tidal_worker")
@ -435,12 +436,30 @@ class TidalWorker:
self.stats['not_found'] += 1
logger.debug(f"No match for artist '{artist_name}'")
def _refresh_album_via_stored_id(self, album_id, stored_id, full_album_dict):
"""Issue #501 callback. Stored ID exists → fetched full Tidal
album; call ``_update_album`` with the dict in both arg slots
(the search-result and full-data shapes overlap on the fields we
need)."""
self._update_album(album_id, full_album_dict, full_album_dict)
def _refresh_track_via_stored_id(self, track_id, stored_id, full_track_dict):
"""Issue #501 callback for tracks — same pattern as albums."""
self._update_track(track_id, full_track_dict, full_track_dict)
def _process_album(self, album_id: int, album_name: str, artist_name: str, item: Dict[str, Any]):
"""Process an album: search Tidal, verify, fetch full details, store metadata"""
existing_id = self._get_existing_id('album', album_id)
if existing_id:
logger.debug(f"Preserving existing Tidal ID for album '{album_name}': {existing_id}")
self._mark_status('album', album_id, 'matched')
# Issue #501: honor manual matches. Pre-fix this just marked
# status='matched' without refreshing metadata. Now goes
# through the full refresh path via the stored ID.
if honor_stored_match(
db=self.db, entity_table='albums', entity_id=album_id,
id_column='tidal_id',
client_fetch_fn=self.client.get_album,
on_match_fn=self._refresh_album_via_stored_id,
log_prefix='Tidal',
):
self.stats['matched'] += 1
return
result = self.client.search_album(artist_name, album_name)
@ -486,10 +505,15 @@ class TidalWorker:
def _process_track(self, track_id: int, track_name: str, artist_name: str, item: Dict[str, Any]):
"""Process a track: search Tidal, verify, fetch full details, store metadata"""
existing_id = self._get_existing_id('track', track_id)
if existing_id:
logger.debug(f"Preserving existing Tidal ID for track '{track_name}': {existing_id}")
self._mark_status('track', track_id, 'matched')
# Issue #501: honor manual matches.
if honor_stored_match(
db=self.db, entity_table='tracks', entity_id=track_id,
id_column='tidal_id',
client_fetch_fn=self.client.get_track,
on_match_fn=self._refresh_track_via_stored_id,
log_prefix='Tidal',
):
self.stats['matched'] += 1
return
result = self.client.search_track(artist_name, track_name)

@ -0,0 +1,302 @@
"""Tests for ``core.enrichment.manual_match_honoring.honor_stored_match``.
The helper is the shared "fast-path enrichment via stored source ID"
used by the per-source enrichment workers (Spotify / iTunes / Deezer
/ Tidal / Qobuz). It reads the stored ID from a configurable column,
fetches via a caller-supplied client method, and invokes a
caller-supplied update callback. Pin the contract so per-worker
wiring can rely on uniform semantics.
Issue #501: enrichment workers were running fuzzy name search and
overwriting manually-set source IDs. This helper is the lift point;
every wired worker plugs in the same way.
"""
from __future__ import annotations
import sqlite3
from unittest.mock import MagicMock
import pytest
from core.enrichment.manual_match_honoring import honor_stored_match
# ---------------------------------------------------------------------------
# Fake DB fixture (just enough to exercise _read_id_column)
# ---------------------------------------------------------------------------
class _FakeConn:
"""Minimal sqlite3.Connection-like that the helper can use."""
def __init__(self, real_conn):
self._real = real_conn
def cursor(self):
return self._real.cursor()
def close(self):
# Don't actually close — tests share the connection.
pass
class _FakeDB:
"""Stand-in MusicDatabase. Supports ``_get_connection()`` returning
a wrapper that doesn't close, so per-test in-memory state survives
across helper invocations."""
def __init__(self):
self._conn = sqlite3.connect(":memory:")
self._conn.row_factory = sqlite3.Row
cur = self._conn.cursor()
cur.execute("""
CREATE TABLE albums (
id INTEGER PRIMARY KEY,
spotify_album_id TEXT,
deezer_id TEXT,
itunes_album_id TEXT
)
""")
cur.execute("""
CREATE TABLE tracks (
id INTEGER PRIMARY KEY,
spotify_track_id TEXT,
deezer_id TEXT
)
""")
self._conn.commit()
def insert_album(self, album_id, **id_columns):
cols = ['id'] + list(id_columns.keys())
placeholders = ','.join('?' for _ in cols)
values = [album_id] + list(id_columns.values())
self._conn.execute(
f"INSERT INTO albums ({','.join(cols)}) VALUES ({placeholders})",
values,
)
self._conn.commit()
def insert_track(self, track_id, **id_columns):
cols = ['id'] + list(id_columns.keys())
placeholders = ','.join('?' for _ in cols)
values = [track_id] + list(id_columns.values())
self._conn.execute(
f"INSERT INTO tracks ({','.join(cols)}) VALUES ({placeholders})",
values,
)
self._conn.commit()
def _get_connection(self):
return _FakeConn(self._conn)
@pytest.fixture
def db():
return _FakeDB()
# ---------------------------------------------------------------------------
# Stored-ID fast path (the new behavior)
# ---------------------------------------------------------------------------
def test_honors_stored_id_when_present(db):
"""Pin: stored ID found → fetch → on_match called → returns True.
Caller skips its search-by-name flow."""
db.insert_album(42, spotify_album_id='SP-ABC')
api_payload = {'id': 'SP-ABC', 'name': 'Real Album'}
fetch = MagicMock(return_value=api_payload)
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
log_prefix='Spotify',
)
assert result is True
fetch.assert_called_once_with('SP-ABC')
on_match.assert_called_once_with(42, 'SP-ABC', api_payload)
def test_returns_false_when_no_stored_id(db):
"""Pin: no stored ID → returns False, no fetch attempted, no
callback. Caller proceeds with its search-by-name fallback."""
db.insert_album(42, spotify_album_id=None)
fetch = MagicMock()
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
log_prefix='Spotify',
)
assert result is False
fetch.assert_not_called()
on_match.assert_not_called()
def test_returns_false_when_stored_id_empty_string(db):
"""Pin: empty string treated same as NULL — no fetch, return False."""
db.insert_album(42, spotify_album_id='')
fetch = MagicMock()
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is False
fetch.assert_not_called()
def test_returns_false_when_entity_not_in_db(db):
"""Pin: missing row → returns False, no fetch, no callback."""
fetch = MagicMock()
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=999,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is False
fetch.assert_not_called()
# ---------------------------------------------------------------------------
# Failure paths — fall through to search instead of crashing the worker
# ---------------------------------------------------------------------------
def test_returns_false_when_fetch_raises(db):
"""Pin: client.get_X(stored_id) raises → caught, logged at warning,
returns False so caller falls through to search. Worker must not
crash on a transient API failure."""
db.insert_album(42, spotify_album_id='SP-ABC')
fetch = MagicMock(side_effect=RuntimeError("API down"))
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is False
on_match.assert_not_called()
def test_returns_false_when_fetch_returns_none(db):
"""Pin: stored ID points at a removed/invalid catalog entry →
fetch returns None; falls through to search instead of writing
junk to the DB."""
db.insert_album(42, spotify_album_id='SP-DEAD')
fetch = MagicMock(return_value=None)
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is False
on_match.assert_not_called()
def test_returns_false_when_fetch_returns_empty_dict(db):
"""Pin: empty dict treated same as None — falsy result skips
callback."""
db.insert_album(42, spotify_album_id='SP-EMPTY')
fetch = MagicMock(return_value={})
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is False
on_match.assert_not_called()
def test_on_match_exceptions_propagate(db):
"""Pin: exceptions inside on_match (DB write errors) propagate to
the worker; they're real errors the worker should surface, not
swallow silently."""
db.insert_album(42, spotify_album_id='SP-ABC')
fetch = MagicMock(return_value={'id': 'SP-ABC'})
on_match = MagicMock(side_effect=ValueError("bad write"))
with pytest.raises(ValueError, match="bad write"):
honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
# ---------------------------------------------------------------------------
# Per-table / per-column wiring
# ---------------------------------------------------------------------------
def test_works_with_tracks_table(db):
"""Pin: ``entity_table='tracks'`` works the same as 'albums' — the
helper is generic across both."""
db.insert_track(7, spotify_track_id='SP-T-1')
fetch = MagicMock(return_value={'id': 'SP-T-1'})
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='tracks', entity_id=7,
id_column='spotify_track_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is True
fetch.assert_called_once_with('SP-T-1')
def test_works_with_alternate_columns(db):
"""Pin: ``id_column`` is configurable so each worker reads its own
column (deezer_id, itunes_album_id, etc)."""
db.insert_album(42, deezer_id='12345', itunes_album_id='IT-99')
fetch = MagicMock(return_value={'id': '12345'})
on_match = MagicMock()
# Read the deezer_id column even though spotify_album_id exists too.
result = honor_stored_match(
db=db, entity_table='albums', entity_id=42,
id_column='deezer_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is True
fetch.assert_called_once_with('12345')
def test_rejects_invalid_table_name(db):
"""Pin: defensive — only known tables (albums/tracks/artists)
accepted. Avoids SQL injection via crafted table name even though
every caller is hard-coded."""
fetch = MagicMock()
on_match = MagicMock()
result = honor_stored_match(
db=db, entity_table='not_a_real_table; DROP TABLE albums',
entity_id=1, id_column='spotify_album_id',
client_fetch_fn=fetch, on_match_fn=on_match,
)
assert result is False
fetch.assert_not_called()

@ -3432,6 +3432,7 @@ const WHATS_NEW = {
'2.4.2': [
// --- post-2.4.1 dev work — entries hidden by _getLatestWhatsNewVersion until the build version bumps ---
{ date: 'Unreleased — 2.4.2 dev cycle' },
{ title: 'Fix: Enrich Honors Manual Album Matches', desc: 'github issue #501 (tacobell444): if you manually matched an album to a specific source ID via the match-chip UI, then clicked "enrich" on that album, the worker would search by name and overwrite your manual match with whatever the search returned (or revert status to "not_found" if it found nothing). reorganize then read the now-wrong id and moved files to the wrong destination. fix: extracted a shared `core/enrichment/manual_match_honoring.py` helper. every per-source enrichment worker (spotify / itunes / deezer / tidal / qobuz) now reads its stored id column at the top of `_process_*_individual` — if present, it fetches via `client.get_album(stored_id)` directly and refreshes metadata without touching the id. fuzzy name search only runs as fallback for never-matched entities. discogs / audiodb / musicbrainz already had inline stored-id fast paths and are left alone. lastfm / genius are name-based and don\'t store ids. cin-shape lift: same fix in 5 workers gets exactly one helper, per-worker variability (column name, client method, response shape) plugs in via callbacks. 11 new helper tests pin: stored-id fast-path, no-id fallthrough, fetch-failure fallthrough, table/column whitelist, callback contract.', page: 'library' },
{ title: 'Fix: "no such table: hifi_instances" When Adding HiFi Instance', desc: 'github issue #503 (hadshaw21): adding a hifi instance via downloader settings popped up `no such table: hifi_instances` even though the connection test and "check all instances" both worked. root cause: `_initialize_database` runs every CREATE TABLE + every migration step inside one sqlite transaction. python\'s sqlite3 module doesn\'t autocommit DDL by default, so if any later migration step throws on a user\'s specific DB shape (e.g. an old volume from a prior soulsync version with quirky schema state), the WHOLE batch rolls back — including the hifi_instances CREATE that ran successfully. user\'s next boot retries init, hits the same migration failure, rolls back again. table never lands. fix: defensive lazy-create. every hifi_instances CRUD method now runs `CREATE TABLE IF NOT EXISTS hifi_instances (...)` immediately before its operation. idempotent — costs one PRAGMA-level no-op when the table is already present, fully recovers from a broken init. read methods (`get_hifi_instances`, `get_all_hifi_instances`) now return empty instead of raising when init failed. write methods (`add`, `remove`, `toggle`, `reorder`, `seed`) work end-to-end. doesn\'t paper over the underlying init issue (still worth tracking down which migration breaks for which users) but makes hifi instance management self-healing. 7 new tests pin the lazy-create behavior — every method works against a DB that\'s missing the table.', page: 'settings' },
{ title: 'Plex: "All Libraries (Combined)" Mode', desc: 'github issue #505 (popebruhlxix): users with multiple plex music libraries (e.g. one per plex home user) only saw one library inside soulsync because the connection settings forced you to pick a single library section. now there\'s a new "all libraries (combined)" option in settings → connections → plex → music library dropdown. picking it flips the plex client into a server-wide read mode where every read method (`get_all_artists` / `get_all_album_ids` / `search_tracks` / `get_library_stats` / etc) dispatches through `server.library.search(libtype=...)` instead of querying a single section. one api call, plex handles the aggregation. cross-section dedup applied at the listing layer — same-name artists across sections collapse to a canonical entry (the one with more tracks), so plex home families with overlapping music tastes don\'t see "drake" twice. removal-detection id enumeration stays raw on purpose — deduping there would falsely prune tracks linked to non-canonical ratingKeys. write methods (genre / poster / metadata updates) are unaffected and operate on plex objects via ratingKey directly — write-back targets one section\'s copy of an artist if it exists in multiple; documented as a known limitation to revisit if it matters. trigger_library_scan + is_library_scanning fan out across every music section in the new mode. backward compatible — existing users with a real library name saved see no behavior change. the "all libraries" option only appears in the dropdown when more than one music library exists on the server. 29 new tests pin both modes (single-section preserved, all-libraries dispatches through server-wide search, dedup keeps canonical, id enumeration stays raw).', page: 'settings' },
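// The cross-section dedup rule above ("same name collapses to the copy with
// more tracks") can be sketched like this. A standalone illustration: the
// real code works on plexapi artist objects, and these dict fields are
// assumptions, not the shipped data shape:

```python
def dedupe_artists(artists: list) -> list:
    """Collapse same-name artists from different Plex sections to one
    canonical entry: the copy with the higher track count."""
    canonical = {}
    for artist in artists:
        key = artist["title"].casefold()  # same-name match across sections
        best = canonical.get(key)
        if best is None or artist["track_count"] > best["track_count"]:
            canonical[key] = artist
    return list(canonical.values())
```

// Note this runs only at the listing layer; per the entry above,
// removal-detection id enumeration deliberately stays un-deduped.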
{ title: 'Fix: Download Discography Showed Wrong Artist\'s Albums', desc: 'clicked download discography on 50 cent → modal showed young hot rod\'s albums. clicked weird al → modal showed the beatles. cause: the endpoint received whichever single artist id the frontend happened to pick (spotify or itunes or deezer or library db id) and dispatched it as-is to whichever source it queried. when the picked id didn\'t match the queried source\'s id format, lookup either returned wrong-artist results (numeric collisions — db id 194687 was a real deezer artist for someone else) or fell back to a fuzzy name search that picked a wrong artist. the per-source id dispatch mechanism (`MetadataLookupOptions.artist_source_ids`) already existed and the watchlist scanner already used it; the on-demand discography endpoint just wasn\'t wired to it. fixed: when the url artist_id matches a library row by ANY stored id (db id, spotify_artist_id, itunes_artist_id, deezer_id, musicbrainz_id), backend pulls every stored provider id and dispatches the right id to each source. each source gets its OWN stored id regardless of what the url carries. when the url id is a non-library source-native id and the row lookup misses entirely, behavior is identical to before (single-id fallback). also fixed two log-namespace bugs: enhance quality and multi-source search were writing through `getLogger(__name__)` which resolves to `core.artists.quality` / `core.metadata.multi_source_search` — neither under the soulsync handler — so every diagnostic line was silently dropped. switched both to `get_logger()` from utils.logging_config so they actually land in app.log.', page: 'library' },
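// The row-resolution step above can be sketched like this. Column and source
// names follow the entry's description; the function name and dict shapes
// are hypothetical, not the actual endpoint code:

```python
from typing import Optional

# Columns checked when resolving the URL's artist_id against a library row.
_ID_COLUMNS = ("id", "spotify_artist_id", "itunes_artist_id", "deezer_id", "musicbrainz_id")

# Which stored column feeds which source's dispatch.
_SOURCE_COLUMNS = {
    "spotify": "spotify_artist_id",
    "itunes": "itunes_artist_id",
    "deezer": "deezer_id",
    "musicbrainz": "musicbrainz_id",
}


def resolve_artist_source_ids(row: dict, url_artist_id: str) -> Optional[dict]:
    """If the URL id matches the row by ANY stored id, return every stored
    per-source id so each source is queried with its own native id.
    Returns None on a miss: the caller keeps the old single-id fallback."""
    if not any(str(row.get(col)) == str(url_artist_id) for col in _ID_COLUMNS):
        return None
    return {
        source: str(row[col])
        for source, col in _SOURCE_COLUMNS.items()
        if row.get(col)  # only sources with a stored id get an entry
    }
```

// This mirrors the existing `MetadataLookupOptions.artist_source_ids`
// dispatch the watchlist scanner already used; the fix wires the on-demand
// discography endpoint to the same mechanism.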
// Section shape: { title, description, features: [bullet strings],
// usage_note?: 'optional hint shown at the bottom' }
const VERSION_MODAL_SECTIONS = [
{
title: "Enrich Now Honors Manual Album Matches",
description: "github issue #501 (tacobell444): manually matching an album then clicking enrich would overwrite your manual match with whatever the worker\'s name-search returned, or revert status to \"not found\". reorganize would then read the wrong id and move files to the wrong destination.",
features: [
"• every per-source enrichment worker (spotify / itunes / deezer / tidal / qobuz) now reads its stored id column at the top of `_process_*_individual` — if present, fetch directly via that id and refresh metadata without touching the id",
"• fuzzy name search only runs as fallback for entities that have never been matched",
"• discogs / audiodb / musicbrainz already had inline stored-id fast paths and are left alone — same correct behavior, just inline",
"• lastfm / genius are name-based and don\'t store ids — no-op for those",
"• cin-shape lift: same fix in 5 workers gets exactly one shared helper at `core/enrichment/manual_match_honoring.py`, per-worker variability (column name / client method / response shape) plugs in via callbacks",
"• reorganize fixed indirectly — it always honored stored ids correctly, the bug was upstream in enrich corrupting the id",
],
usage_note: "no settings to change — applies on next click of enrich on a manually-matched album or track",
},
{
title: "HiFi Instance Add No Longer Errors With \"no such table\"",
description: "github issue #503 (hadshaw21): adding a hifi instance from downloader settings popped up `no such table: hifi_instances` even when the connection test passed.",
