Reject AcoustID matches whose version disagrees with the expected track

Discord report (corruption [BWC]): downloads coming through as the instrumental cut when a vocal track was requested. The verification step's `_normalize` function strips parentheticals and version-suffix tags ("(Instrumental)", "- Live", etc) so legitimate name variations don't false-fail the title-similarity check. That also means "In My Feelings" and "In My Feelings (Instrumental)" both normalize to "in my feelings", title similarity is 1.0, and the wrong cut passes verification. Detect the version label on each side BEFORE normalization runs. If the expected and matched recordings disagree on version (one is original, the other is instrumental / live / acoustic / remix / etc), return FAIL — the fingerprint identified a real song, just not the version the caller asked for. Reuses `MusicMatchingEngine.detect_version_type` so the same regex patterns the pre-download Soulseek matcher applies also drive post-download verification. No duplicated tables. Also gates the secondary fallback scan, so a wrong-version variant sitting in the same fingerprint cluster can't win the loop after the best match has already been version-rejected. 6 tests pin the behavior: - instrumental returned for vocal request → FAIL - vocal returned for instrumental request → FAIL - live vs acoustic → FAIL - matching versions on both sides → PASS - original-to-original happy path → PASS (regression guard) - secondary scan skips wrong-version recordings → not PASS 2194/2194 full suite green (was 2188 + 6 new).
3 weeks ago · 01c528fd5f
parent caef3dc9f1
commit 01c528fd5f
3 changed files with 240 additions and 0 deletions
--- a/core/acoustid_verification.py
+++ b/core/acoustid_verification.py
@ -15,6 +15,7 @@ from typing import Optional, Dict, Any, Tuple, List
 from enum import Enum
 from utils.logging_config import get_logger
 from core.acoustid_client import AcoustIDClient
+from core.matching_engine import MusicMatchingEngine
 from core.musicbrainz_client import MusicBrainzClient

 logger = get_logger("acoustid.verification")
@ -24,6 +25,25 @@ MIN_ACOUSTID_SCORE = 0.80       # Minimum AcoustID fingerprint score to trust
 TITLE_MATCH_THRESHOLD = 0.70    # Title similarity needed to consider a match
 ARTIST_MATCH_THRESHOLD = 0.60   # Artist similarity needed to consider a match

+# Single matching-engine instance so version detection reuses the same patterns
+# used by the pre-download Soulseek matcher (remix / live / acoustic /
+# instrumental / etc). detect_version_type doesn't use self state, so one
+# shared instance is fine.
+_match_engine_for_version = MusicMatchingEngine()
+
+
+def _detect_title_version(title: str) -> str:
+    """Return version label for a track title.
+
+    Returns ``'original'`` when no version marker is detected, otherwise one
+    of the labels produced by ``MusicMatchingEngine.detect_version_type``
+    (``'instrumental'``, ``'live'``, ``'acoustic'``, ``'remix'``, etc).
+    """
+    if not title:
+        return 'original'
+    version_type, _ = _match_engine_for_version.detect_version_type(title)
+    return version_type
+

 class VerificationResult(Enum):
    """Possible outcomes of audio verification."""
@ -286,6 +306,33 @@ class AcoustIDVerification:
                f"(title_sim={title_sim:.2f}, artist_sim={artist_sim:.2f})"
            )

+            # Step 4b: Version-mismatch gate.
+            #
+            # The ``_normalize`` step deliberately strips parentheticals and
+            # version tags ("(Instrumental)", "- Live", etc) so that legit
+            # name variations don't fail the title-similarity comparison.
+            # That same stripping made it impossible to tell a vocal track
+            # apart from its instrumental: "In My Feelings" and "In My
+            # Feelings (Instrumental)" both normalize to "in my feelings",
+            # the title sim ends up 1.0, and the file passes verification
+            # even though it's the wrong cut.
+            #
+            # Detect the version on each side BEFORE normalization runs.
+            # If the expected track and the AcoustID-matched recording
+            # disagree on version (one is original, the other is
+            # instrumental / live / remix / acoustic / etc), reject — the
+            # fingerprint identified a real song but it's not the one the
+            # caller asked for.
+            expected_version = _detect_title_version(expected_track_name)
+            matched_version = _detect_title_version(matched_title)
+            if expected_version != matched_version:
+                msg = (
+                    f"Version mismatch: expected '{expected_track_name}' ({expected_version}) "
+                    f"but file is '{matched_title}' ({matched_version})"
+                )
+                logger.warning(f"AcoustID verification FAILED (version mismatch) - {msg}")
+                return VerificationResult.FAIL, msg
+
            # Step 5: Decide pass/fail based on similarity
            if title_sim >= TITLE_MATCH_THRESHOLD and artist_sim >= ARTIST_MATCH_THRESHOLD:
                msg = (
@ -343,9 +390,15 @@ class AcoustIDVerification:

            # Title doesn't match — check ALL recordings for any title/artist match
            # (the best combined match might not be the right one if there are many results)
+            # Skip recordings whose version (instrumental/live/etc) disagrees with
+            # what the caller asked for — the version mismatch above checked
+            # only the best recording, but a wrong-version variant could still
+            # win this fallback scan if its bare title matched.
            for rec in recordings:
                t = rec.get('title') or ''
                a = rec.get('artist') or ''
+                if _detect_title_version(t) != expected_version:
+                    continue
                if (_similarity(expected_track_name, t) >= TITLE_MATCH_THRESHOLD and
                        _similarity(expected_artist_name, a) >= ARTIST_MATCH_THRESHOLD):
                    msg = (
--- a/tests/test_acoustid_version_mismatch.py
+++ b/tests/test_acoustid_version_mismatch.py
@ -0,0 +1,186 @@
+"""AcoustID verification rejects version mismatches (instrumental / live / etc).
+
+Discord report (corruption [BWC]): downloads coming through as instrumentals
+when the user expected vocal versions. Slipped past AcoustID verification
+because the ``_normalize`` step strips parentheticals and version-suffix
+tags ("(Instrumental)", "- Live", etc) so that legitimate name variations
+don't fail the title-similarity check. Side effect: "Song Name" and
+"Song Name (Instrumental)" both normalize to "song name", title sim is
+1.0, file passes verification despite being the wrong cut.
+
+Fix: detect the version label on each side BEFORE normalization runs.
+If the expected and matched versions disagree, return FAIL — the
+fingerprint did identify a real song, just not the version the caller
+asked for.
+
+Reuses ``MusicMatchingEngine.detect_version_type`` so the same patterns
+that the pre-download Soulseek matcher uses also drive post-download
+verification (no duplicated regex tables).
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from core.acoustid_verification import (
+    AcoustIDVerification,
+    VerificationResult,
+)
+
+
+@pytest.fixture
+def verifier():
+    """Verifier with the network/fingerprint side stubbed so tests can
+    drive the title/artist comparison logic directly."""
+    v = AcoustIDVerification()
+
+    class _StubClient:
+        def is_available(self):
+            return True, "available"
+
+        def fingerprint_and_lookup(self, path):
+            return None  # tests inject via _stub_lookup below
+
+    v.acoustid_client = _StubClient()
+    return v
+
+
+def _stub_lookup(verifier, *, recordings, best_score=0.95):
+    verifier.acoustid_client.fingerprint_and_lookup = lambda path: {
+        "recordings": recordings,
+        "best_score": best_score,
+        "recording_mbids": [r.get("id") for r in recordings if r.get("id")],
+    }
+
+
+# ---------------------------------------------------------------------------
+# The headline bug — instrumental returned where vocal was expected.
+# ---------------------------------------------------------------------------
+
+
+def test_instrumental_returned_for_vocal_request_fails(verifier):
+    """User asked for a vocal track; file's fingerprint matched an
+    instrumental version of the same song. Old normalizer stripped
+    "(Instrumental)" and let it pass. Must FAIL now."""
+    _stub_lookup(verifier, recordings=[
+        {"title": "In My Feelings (Instrumental)", "artist": "Drake"},
+    ])
+
+    result, msg = verifier.verify_audio_file(
+        "/fake/path.flac",
+        "In My Feelings",
+        "Drake",
+    )
+    assert result == VerificationResult.FAIL
+    assert "version mismatch" in msg.lower()
+    assert "instrumental" in msg.lower()
+
+
+def test_instrumental_request_with_vocal_file_fails(verifier):
+    """Symmetric case: user asked for the instrumental cut explicitly,
+    file's fingerprint matched the regular vocal version. Also FAIL —
+    they're different recordings."""
+    _stub_lookup(verifier, recordings=[
+        {"title": "In My Feelings", "artist": "Drake"},
+    ])
+
+    result, _ = verifier.verify_audio_file(
+        "/fake/path.flac",
+        "In My Feelings (Instrumental)",
+        "Drake",
+    )
+    assert result == VerificationResult.FAIL
+
+
+# ---------------------------------------------------------------------------
+# Different-version mismatches (live vs acoustic, etc) — also FAIL.
+# ---------------------------------------------------------------------------
+
+
+def test_different_versions_disagree_fails(verifier):
+    """Caller asked for the live cut; file is the acoustic cut. Both
+    are non-original versions, but they're different non-original
+    versions — must FAIL."""
+    _stub_lookup(verifier, recordings=[
+        {"title": "Hello (Acoustic)", "artist": "Adele"},
+    ])
+
+    result, _ = verifier.verify_audio_file(
+        "/fake/path.flac",
+        "Hello (Live at Wembley)",
+        "Adele",
+    )
+    assert result == VerificationResult.FAIL
+
+
+# ---------------------------------------------------------------------------
+# Regression guards — version match must NOT cause false-FAIL.
+# ---------------------------------------------------------------------------
+
+
+def test_original_to_original_passes(verifier):
+    """Plain track to plain track — no version on either side. Verify
+    the version gate doesn't get in the way of the normal happy path."""
+    _stub_lookup(verifier, recordings=[
+        {"title": "Bohemian Rhapsody", "artist": "Queen"},
+    ])
+
+    result, _ = verifier.verify_audio_file(
+        "/fake/path.flac",
+        "Bohemian Rhapsody",
+        "Queen",
+    )
+    assert result == VerificationResult.PASS
+
+
+def test_matching_versions_pass(verifier):
+    """Both expected and matched are the live version of the same song.
+    Versions agree — must PASS."""
+    _stub_lookup(verifier, recordings=[
+        {"title": "Hello (Live at Wembley)", "artist": "Adele"},
+    ])
+
+    result, _ = verifier.verify_audio_file(
+        "/fake/path.flac",
+        "Hello (Live at Wembley)",
+        "Adele",
+    )
+    assert result == VerificationResult.PASS
+
+
+# ---------------------------------------------------------------------------
+# Secondary scan path — version gate must apply to fallback recordings too.
+# ---------------------------------------------------------------------------
+
+
+def test_secondary_scan_skips_wrong_version_recordings(verifier):
+    """When the best AcoustID recording's title doesn't match strongly
+    enough, verify scans through the rest of the recordings list looking
+    for a better candidate. That fallback path must also reject
+    wrong-version variants — otherwise an instrumental from the same
+    fingerprint cluster could win the scan and pass verification."""
+    _stub_lookup(verifier, recordings=[
+        # Best by combined score: a different track entirely. Same
+        # artist, completely different title. Original version (no
+        # version mismatch on this one).
+        {"title": "Some Other Song", "artist": "Drake"},
+        # Fallback candidate: instrumental version of the requested
+        # song. Without the scan-loop version gate, this would PASS
+        # (title matches after stripping "(Instrumental)", artist
+        # matches). With the gate, it gets skipped and the loop falls
+        # through.
+        {"title": "In My Feelings (Instrumental)", "artist": "Drake"},
+    ])
+
+    result, _ = verifier.verify_audio_file(
+        "/fake/path.flac",
+        "In My Feelings",
+        "Drake",
+    )
+    # Best-recording version check passes (both 'original'), then the
+    # main pass/fail bucket misses (title doesn't match), fallback scan
+    # skips the instrumental, no other valid recording → falls through
+    # to the final unmatched logic. Either FAIL or SKIP is acceptable
+    # here; the critical assertion is "did NOT pass via the
+    # instrumental-version recording".
+    assert result != VerificationResult.PASS
--- a/webui/static/helper.js
+++ b/webui/static/helper.js
@ -3432,6 +3432,7 @@ const WHATS_NEW = {
    '2.4.2': [
        // --- post-2.4.1 dev work — entries hidden by _getLatestWhatsNewVersion until the build version bumps ---
        { date: 'Unreleased — 2.4.2 dev cycle' },
+        { title: 'Fix: AcoustID Verification Let Instrumentals Pass As Vocal Tracks', desc: 'discord report (corruption [BWC]): downloads coming through as instrumental versions when the user expected the vocal cut. slipped past acoustid verification because the title-similarity normalizer strips parentheticals and version-suffix tags ("(Instrumental)", "- Live", etc) so legit name variations don\'t false-fail the comparison. side effect: "in my feelings" and "in my feelings (instrumental)" both normalize to "in my feelings", title sim is 1.0, file passes verification despite being the wrong cut. fix: detect the version label on each side BEFORE normalization runs — if expected and matched disagree (one is original, the other is instrumental / live / acoustic / remix / etc), reject as version mismatch. reuses `MusicMatchingEngine.detect_version_type` so post-download verification uses the same patterns the pre-download soulseek matcher already applies (no duplicated regex tables). also gates the secondary fallback scan, so a wrong-version variant in the same fingerprint cluster can\'t win the loop after the best match is rejected. 6 new tests pin the four direction cases (instrumental returned for vocal request → fail, vocal returned for instrumental request → fail, live vs acoustic → fail, matching versions on both sides → pass) plus the original-to-original happy path and the secondary-scan gate.', page: 'downloads' },
        { title: 'Fix: Search Picker Defaulted to Spotify on Non-Admin Profiles', desc: 'github issue #515 (jaruca): admin sets primary metadata source to deezer / itunes / discogs, but every non-admin profile saw spotify as the active source on the search page and global search popover, requiring manual click each time. cause: `shared-helpers.js` resolved the active source by fetching `/api/settings` — that endpoint is `@admin_only` because it returns full config including credentials, so non-admin profiles got 403 and silently fell back to the hardcoded `spotify` default. fix: read from `/status` instead, which is public and already returns `metadata_source` for the dashboard. one-line scope change, behavior preserved for admins (same value, different endpoint), non-admins now see the real configured source.', page: 'search' },
        { title: 'Internal: Stop Swallowing Exceptions Silently', desc: 'github issue #369 (johnbaumb): the codebase had ~300 `except Exception: pass` blocks — and another ~30 bare `except: pass` ones — across web_server.py, every metadata client, every download/import worker, the repair jobs, and most service modules. when one of those paths failed at runtime, the failure was completely invisible: no log line, no telemetry, nothing. you\'d see "downloads stopped working after a few hours" or "enrichment never finishes" and there was nothing to grep for in app.log because the exception had been thrown straight into the void. swept all of them. converted to `except Exception as e: logger.debug("<context>: %s", e)` so failures land in the log with enough context to grep. bare `except:` cases (which also swallow KeyboardInterrupt and SystemExit — actively wrong) got upgraded to `except Exception:` first so ctrl-c works correctly. ~14 cleanup-path sites (atexit handlers, finally-block conn.close calls) were intentionally left silent with explicit `# noqa: S110` comments — logging during shutdown can itself crash because file handles get torn down before the handler fires. and added ruff S110 to the lint config so this pattern fails CI going forward — drift fails at PR review instead of at runtime against a wedged worker thread. zero behavior change to any happy path; just made the failure paths inspectable. test suite (2188 tests) green throughout the sweep.' },
        { title: 'Fix: Repair Job Card "X Findings" Badge Was Misleading After Bulk-Fix', desc: 'discord report: duplicate detector card said "372 findings" and cover art filler said "60 findings", but clicking the findings tab pending filter showed 0 — read like a bug ("findings aren\'t being created"). actual cause: job-card badge displayed `last_run.findings_created` (historical "found in last scan") which doesn\'t reflect current state when those findings have since been bulk-fixed and moved to status="resolved". fix: api response now includes `pending_findings_count` per job (current pending count from a single sql aggregation). badge now shows "X pending" when pending count > 0 (urgent red color), or "X found in last scan" with a muted grey color when pending = 0 but the last scan did surface something. user can tell at a glance whether something needs review vs whether it\'s a historical reminder. 3 new tests pin the per-job pending count helper.', page: 'stats' },