Cin-pass on the #524 + multi-disc fixes. Pre-merge polish.
Lifts: `core/imports/album_matching.py`
`AutoImportWorker._match_tracks` was a 100+-line method buried in a
1400-line class. Testing it required monkey-patching `_read_file_tags`
+ mocking the metadata client just to exercise the matching algorithm.
Per Cin's "lift logic out of monolithic classes" pattern (same shape
as the album-info builders / discography / quality scanner lifts),
moved the dedup + scoring into `core/imports/album_matching.py` as
pure functions over already-fetched data.
Helper exposes:
- Constants for every match weight (TITLE_WEIGHT, ARTIST_WEIGHT,
POSITION_WEIGHT, NEAR_POSITION_WEIGHT, CROSS_DISC_POSITION_WEIGHT,
ALBUM_WEIGHT, MATCH_THRESHOLD). Magic numbers killed.
- `dedupe_files_by_position(audio_files, file_tags, *, quality_rank)` —
position-keyed quality dedup.
- `score_file_against_track(file_path, file_tags, track, *,
target_album, similarity)` — pure per-(file, track) scorer.
- `match_files_to_tracks(audio_files, file_tags, tracks, *,
target_album, similarity, quality_rank)` — full matching with
greedy best-per-track + first-come-first-serve over deduped files.
Worker shrinks from 100 lines of inline algorithm to 8 lines that
fetch tags + delegate to the helper.
Tests added (26 new across 3 files):
`tests/imports/test_album_matching_helper.py` (19 tests):
- Constants pin: weights sum to 1.0, threshold above position-only
- `dedupe_files_by_position`: quality wins, cross-disc preserved,
tag-less files passed through, first-wins on equal quality
- `score_file_against_track`: perfect-agreement = 1.0, position
needs both disc+track, near-position only same-disc, missing
artist tags handled, disc field aliases (Spotify/Deezer/iTunes),
filename fallback when title tag missing
- `match_files_to_tracks`: happy path, file used at-most-once,
below-threshold left unmatched
- Edge case Cin would flag: tag-less file with strong filename title
matches multi-disc album track via title alone (perfect-name
scenario works); tag-less file with weak filename title against
multi-disc API correctly stays unmatched (the behavior delta from
the disc-aware fix — pinned so future readers see it's intentional)
`tests/test_import_album_match_endpoint.py` (3 tests):
- Backend warning fires when source missing from match POST
- No warning fires on the legit path (catches noisy-warning regression)
- Endpoint actually forwards source/name/artist to the payload
builder (catches "logging the right warning but doing the wrong
lookup" regression)
`tests/test_import_page_album_lookup_pattern.py` (4 tests):
- Source-text guard for the import-page #524 fix in stats-automations.js.
Until the file is modularized enough for a behavioral JS test (under
the existing tests/static/*.mjs pattern), regex-based assertions pin:
the `_albumLookup` field exists, the click handler reads from it,
both card renderers populate it before emitting onclick, and the
cache stores `source` per entry. Caveat documented in the test
module docstring.
Verification:
- All 26 new tests pass.
- Existing multi-disc tests (test_auto_import_multi_disc_matching.py)
still pass after the lift — proves the helper is behavior-equivalent
to the inline implementation it replaced.
- Full suite: 2293 passed, 1 flaky-timing failure
(test_library_reorganize_orchestrator.py::test_watchdog_warns_about_stuck_workers
— passes in isolation, fails only in full-suite runs, pre-existing,
unrelated to this PR).
- Ruff clean.
Notes for the reviewer:
- The frontend stats-automations.js JS test is structural-only.
Behavioral JS testing for that file requires modularizing the
~7k-line monolith first — out of scope for this fix.
- The cross-disc 5% consolation bonus is a small behavior change for
users with weak/missing tag info on multi-disc albums. Pinned
explicitly in `test_tagless_file_with_weak_title_unmatched_in_multidisc`
so the trade-off is visible: correct multi-disc matching wins over
optimistic position-only matching that produced wrong-disc files.