Fixes two pre-existing parity gaps in `record_soulsync_library_entry`
that the prior parity commits left untouched. Both fixes close real
holes between auto-import writes and what the soulsync_client deep
scan would have produced.
# Gap 1: Album duration was the first-imported track's duration
`record_soulsync_library_entry` is called once per track. The album
INSERT only fires for the FIRST track of a new album (subsequent
tracks find the album row already exists). The INSERT was passing
`duration_ms` (i.e. `track_info["duration_ms"]`) as the album's
`duration` column. That's the duration of one track, not the album
total. Compare with `SoulSyncAlbum.duration` in soulsync_client,
which is `sum(t.duration for t in self._tracks)`.
Fix:
- Worker computes `album_total_duration_ms = sum(...)` across every
matched track and threads it onto context as
`album.duration_ms`.
- side_effects reads that value (or falls back to the per-track
duration for legacy non-auto-import callers) and writes it as the
album row's `duration`.
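A minimal sketch of the two steps above, for illustration only. The
helper names (`album_total_duration_ms`, `resolve_album_duration`) and
the nested-dict context layout are assumptions, not the actual worker
or side_effects code; only the `album.duration_ms` key and the
per-track fallback come from the description above.

```python
from typing import Any


def album_total_duration_ms(matched_tracks: list[dict[str, Any]]) -> int:
    """Worker side: sum per-track durations, treating missing values as 0."""
    return sum(int(t.get("duration_ms") or 0) for t in matched_tracks)


def resolve_album_duration(context: dict[str, Any], track_info: dict[str, Any]) -> int:
    """side_effects side: prefer the album total threaded onto the context.

    Falls back to the per-track duration when the context carries no
    album total (the legacy, non-auto-import case).
    """
    album_total = (context.get("album") or {}).get("duration_ms")
    if album_total:
        return int(album_total)
    return int(track_info.get("duration_ms") or 0)
```

With an album total on the context the album row gets that total;
without one, the old per-track behaviour is unchanged.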
# Gap 2: Re-imports of the same artist/album were insert-only
When the SELECT-by-id or SELECT-by-name found an existing soulsync
artist or album row, the function skipped the write entirely, with
no UPDATE path. As a result, artist genres / thumb / source-id
reflected ONLY whatever the FIRST imported album supplied, never
refreshing as more albums by that artist landed. Ten more imports
later, the artist row still held whatever the first import wrote.
Conservative fix: when an existing row matches, run an UPDATE that
fills only the columns whose current value is NULL or empty. Never
overwrites populated values — protects manual edits +
enrichment-worker writes the same way the scanner UPDATE path
preserves enrichment columns.
Implementation note: the empty-check happens in Python, NOT SQL.
Initial pass tried `COALESCE(NULLIF(col, ''), NULLIF(col, 0), ?)`,
but on a text column holding `''`, SQLite's `NULLIF(col, 0)` returns
the original empty string instead of NULL, because `''` never
compares equal to integer 0. COALESCE then picks that empty string
over the `?` fallback, so the SQL-only conditional failed to fill
empty-string text columns. New helper does
`SELECT cols FROM table WHERE id = ?`, compares each column in
Python, and emits UPDATE clauses only for the ones that need filling.
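The SQLite behaviour described above can be reproduced with the
stdlib `sqlite3` module. This is a standalone sketch: the `artists`
table, the `thumb_url` column, and the `sql_only_fill` helper are
hypothetical names chosen for the demo, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artists (id INTEGER PRIMARY KEY, thumb_url TEXT)")
conn.execute("INSERT INTO artists (id, thumb_url) VALUES (1, '')")    # empty string
conn.execute("INSERT INTO artists (id, thumb_url) VALUES (2, NULL)")  # real NULL


def sql_only_fill(conn: sqlite3.Connection, row_id: int, new_value: str):
    """Evaluate the SQL-only expression from the initial pass for one row."""
    row = conn.execute(
        "SELECT COALESCE(NULLIF(thumb_url, ''), NULLIF(thumb_url, 0), ?) "
        "FROM artists WHERE id = ?",
        (new_value, row_id),
    ).fetchone()
    return row[0]


# Row 1 holds '': the second NULLIF hands the empty string back, so
# COALESCE never reaches the fallback value.
empty_result = sql_only_fill(conn, 1, "thumb.jpg")

# Row 2 holds NULL: both NULLIFs yield NULL and the fallback is used.
null_result = sql_only_fill(conn, 2, "thumb.jpg")
```

Here `empty_result` stays `''` while `null_result` becomes
`'thumb.jpg'`, which is the inconsistency that pushed the check into
Python.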
Allowlist defense: column names interpolated into the f-string SQL
go through `_SOULSYNC_FILLABLE_COLUMNS` validation first. A caller
that adds a new column without updating the allowlist fails closed
(logger.debug + skip).
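A rough sketch of how a fill-only-empty helper with an allowlist
guard might look. Everything here beyond the `_SOULSYNC_FILLABLE_COLUMNS`
name is an assumption: the allowlist's shape and contents, the
`fill_empty_columns` and `_is_empty` helpers, the table and column
names, and the choice to skip the whole UPDATE (rather than just the
offending column) are illustrative only.

```python
import logging
import sqlite3

logger = logging.getLogger(__name__)

# Hypothetical shape and contents; the real allowlist may differ.
_SOULSYNC_FILLABLE_COLUMNS = {
    "artists": {"thumb_url", "genres", "spotify_artist_id"},
}


def _is_empty(value) -> bool:
    """Python-side emptiness check: NULL, empty string, or 0."""
    return value is None or value == "" or value == 0


def fill_empty_columns(conn, table: str, row_id: int, candidates: dict) -> list:
    """Fill only columns whose current value is empty; return the filled names."""
    allowed = _SOULSYNC_FILLABLE_COLUMNS.get(table)
    # Fail closed: unknown table or any non-allowlisted column skips the update.
    if allowed is None or not set(candidates) <= allowed:
        logger.debug("skipping fill on %s: columns not allowlisted", table)
        return []
    cols = list(candidates)
    if not cols:
        return []
    # Names are safe to interpolate here because they passed the allowlist.
    row = conn.execute(
        f"SELECT {', '.join(cols)} FROM {table} WHERE id = ?", (row_id,)
    ).fetchone()
    if row is None:
        return []
    to_fill = [
        col for col, current in zip(cols, row)
        if _is_empty(current) and not _is_empty(candidates[col])
    ]
    if to_fill:
        assignments = ", ".join(f"{col} = ?" for col in to_fill)
        conn.execute(
            f"UPDATE {table} SET {assignments} WHERE id = ?",
            [candidates[col] for col in to_fill] + [row_id],
        )
    return to_fill
```

Populated columns never appear in the UPDATE, so manual edits and
enrichment-worker writes are left alone.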
# Tests added (4)
- `test_album_duration_uses_album_total_not_single_track` —
album with single-track context carrying explicit
`album.duration_ms = 2_500_000` writes 2_500_000 to the album row,
not the per-track 200_000 fallback.
- `test_re_import_fills_empty_artist_fields` — first import lands
artist with empty thumb + empty genres; second import for same
artist with thumb + genres present updates the existing row.
- `test_re_import_does_not_clobber_populated_artist_fields` —
first import writes rich genres + thumb; second import with
worse / different metadata leaves the existing row untouched.
- `test_re_import_fills_empty_source_id_when_missing` — first
import had no source artist ID; second import does — fills the
empty `spotify_artist_id` column on the existing row.
# Verification
- 10/10 side-effects tests pass (including 4 new + 4 from prior
parity commit + 2 history/provenance)
- 217 imports tests pass (no regression)
- Full suite: 2369 tests pass (+4 from prior, +22 PR-total from baseline 2347)
- 1 pre-existing flake (`test_watchdog_warns_about_stuck_workers`,
passes in isolation, unrelated)
- Ruff clean