mirror of https://github.com/Nezreka/SoulSync.git
video
main
dev
fix/disable-beatport-features
johnbaumb-discover-redesign
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
2.1
2.2
2.3
2.4.0
2.4.1
2.4.2
2.5.0
2.5.1
2.5.2
2.5.3
2.5.4
2.5.5
2.5.6
2.5.7
2.5.9
2.6.0
2.6.1
2.6.2
2.6.3
2.6.4
2.6.5
2.6.6
2.6.7
2.6.8
2.6.9
2.7.0
2.7.1
2.7.2
2.7.3
2.7.4
v0.65
${ noResults }
3590 Commits (2fcdfd3145d415355b9202c406d8985fda8eb003)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
2fcdfd3145 |
Canonical findings: include as much (free) data as possible
Per request, pack each finding with everything available WITHOUT extra API calls (kettui: reuse what's already fetched, read the album row we already loaded, degrade per-field, keep it tested): - Pinned release's track titles — already fetched during scoring, so free (capped at 60 to bound details_json). - From the album row (free): year, DB track count, total duration, genres-free context, and the album's currently-linked source IDs. - file_track_titles (your library's titles) for a side-by-side with the release. - Artist + album thumbs (artist via the guarded lookup) and names. _describe_pin now renders: "Artist — Album (year)", the fit breakdown, "Currently linked: … → pinning X", "Beat: <alternatives>", and the release tracklist — so the card is judge-able at a glance, and the structured fields are in details for a richer UI. NOT included (would cost an extra per-album API fetch, left as opt-in): the *release's* own year/type/cover/URL from get_album_for_source, vs the library's. Tests: _describe_pin rich-render (year/linked/tracklist), resolver release-titles, orchestration free-context fields. 94 canonical + reorganize regression pass. |
2 weeks ago |
|
|
03d099fb1d |
Canonical findings: add artist image (guarded, schema-safe)
Findings now carry artist_thumb_url alongside album_thumb_url (same key the track-repair findings use, so the findings UI already renders it). Fetched via a guarded _lookup_artist_thumb() — checks the artists table has a thumb_url column first and swallows any error — rather than adding ar.thumb_url to the shared load_album_and_tracks SELECT. The shared-loader approach was tried first and REVERTED: it crashed reorganize on schemas whose artists table has no thumb_url column (caught by 40 orchestrator tests). The lookup only runs for albums that actually resolve, so it adds no cost to the no-source-id short-circuit majority. Tests: orchestration test asserts artist_name + album_thumb_url + artist_thumb_url flow through. 47 canonical + 104 canonical/reorganize regression tests pass. |
2 weeks ago |
|
|
ec8091caad |
Canonical: richer, judge-able findings (the why behind a pin)
Live-run feedback: "Best-fit release: deezer (665666731), score 1.0" is too thin
to trust/accept. Each finding now explains WHY:
- score_release_detail() exposes the per-signal breakdown (count/duration/title)
instead of just the blended score.
- resolve_canonical_for_album returns an enriched result: the breakdown,
file_track_count vs release_track_count, and a `candidates` list of every
source it scored (so a finding can show what the winner beat).
- resolve_and_store adds album/artist/thumb context from the row it already
loaded (no extra query). Storage still only reads source/album_id/score.
- The job builds a real description via _describe_pin(), e.g.:
"Pin deezer release 665666731 (confidence 100%).
Fit to your library: 11 files vs 11 tracks on this release — track count
100%, durations 100%, titles 100%.
Beat: spotify 65% (17 tk)."
and a clearer title ("Pin deezer as canonical: <artist> — <album>").
Tests: resolver enrichment (breakdown + candidate comparison fields), and
_describe_pin (judge-able text incl. the beaten alternatives, and honest "n/a"
for a missing signal). 42 canonical tests pass.
Note: the description string carries the judge-able info regardless of UI; how
the findings tab renders the extra details keys (thumb image, candidates table)
is still UI-dependent and unverified.
|
2 weeks ago |
|
|
57e039e34d |
Canonical: make source selection a job setting (default active-preferred)
Feedback from the live dry-run: the job was pinning whichever source best fit the files regardless of which source it was, which was surprising — users expect it to respect their active metadata source. Made it a per-job setting instead of a baked-in policy. source_selection (default 'active_preferred'): - active_preferred — use the active/primary metadata source's release when the album has an ID for it AND it clears the score floor; otherwise fall back to the best-fit among the other sources. Respects the configured source but self-heals when that link is clearly broken (below floor / no ID). - active_only — only ever the active source; never considers others. - best_fit — previous behavior: whichever source matches the files best. resolve_canonical_for_album gains mode + primary_source; the orchestration threads the primary source through; the job reads source_selection from its settings. Note: active_preferred respects the active source as long as it clears the floor, so it will NOT override a deluxe-vs-standard mismatch on the primary (#767-Bug2) — that's what best_fit is for; the choice is now the user's. Tests: per-mode coverage in test_canonical_resolver.py (active_preferred uses primary when it fits, falls back when primary is below floor, keeps primary even when another fits better; active_only pins primary / never falls back; best_fit unchanged), orchestration default-mode test, and the setting default. 39 canonical tests pass. |
2 weeks ago |
|
|
e40b328a94 |
docs: canonical-album-version design spec
The staged design doc for this branch (#765 + #767-Bug2): the match-your-files canonical rule, the additive/dormant rollout, and the stage-by-stage plan the 6 implementation commits followed. Kept on the branch as its reference; not relevant to dev/main. |
2 weeks ago |
|
|
f9271c0cd8 |
Canonical album version — backfill job (the opt-in activation)
The populate trigger that turns the (until now dormant) feature on. Until a user enables and runs this job, no album has a canonical -> both read sides (Stages 3-4) fall back -> zero behavior change. So the whole feature ships safely off. - core/repair_jobs/canonical_version_resolve.py — "Resolve Canonical Album Versions". Iterates the active server's albums, skips ones already pinned, and calls the tested resolve_and_store_canonical_for_album per album. Opt-in (default_enabled=False) and dry-run-by-default: resolving compares an album's candidate releases across sources (metadata-source API calls, once per album), so it's deliberately user-triggered. Dry run reports a finding per album it would pin; live mode stores. Registered in _JOB_MODULES. - core/metadata/canonical_resolver.py — resolve_and_store gains store=True; the job's dry run passes store=False to resolve-without-writing. Tests: tests/test_canonical_version_job.py (6) — registered, opt-in + dry-run defaults, live resolves+stores (auto_fixed), dry run creates findings without persisting, already-pinned albums skipped. Registry loads all 19 jobs cleanly. 145 tests across the full feature + reorganize/track-repair/DB regression pass. |
2 weeks ago |
|
|
f5752e3dc0 |
Canonical album version — Stage 4: Track Number Repair prefers canonical (read)
_resolve_album_tracklist gains a Fallback -1: if the album has a pinned canonical (source, album_id), use it before the existing 6-level cascade — so Track Number Repair resolves the SAME release the Reorganizer does (Stage 3) and the two stop contradicting each other (#765, the Spotify-4 vs MusicBrainz-3 conflict). Gated + additive: the entire existing cascade is untouched for albums without a canonical, so this job's all-01-album rescue (which relies on the MusicBrainz/ AudioDB fallbacks for albums with no DB source ID) is fully preserved — that's the regression we explicitly refused to take in a reactive fix. New helper _lookup_canonical_from_db() mirrors _lookup_album_ids_from_db (file-path -> track -> album), returns None when no DB / no match / columns absent / unresolved. Tests: tests/test_track_repair_canonical.py (4) — returns canonical when pinned, None when unresolved / file untracked / no DB. Existing track_number_repair tests still pass (no regression). |
2 weeks ago |
|
|
ecdfde03c6 |
Canonical album version — Stage 3: Reorganizer prefers pinned canonical (read)
_resolve_source now prefers the album's pinned canonical (source, album_id) when set, before the source-priority walk. So once an album's canonical is resolved, reorganize agrees with Track Number Repair (Stage 4) and stops mislabelling a standard album as deluxe (#767-Bug2). Gated + side-effect-free: only changes behavior for albums that already carry a canonical (none do until the populate step runs), an explicit user source pick (strict_source) still wins over the canonical, and a failed canonical fetch falls through to today's priority walk. So this stage is behavior-neutral until canonical is populated. Tests: tests/test_reorganize_canonical_source.py (4) — canonical preferred over priority, fetch-failure falls back, strict_source ignores canonical, no-canonical unchanged. 113 reorganize-orchestrator/tag-source/unknown-artist tests still pass (no regression). |
2 weeks ago |
|
|
43878b4d3d |
Canonical album version — Stage 2 (trigger): resolve+store orchestration
Completes Stage 2's populate path. Still dormant — no consumer calls it yet.
- resolve_and_store_canonical_for_album(db, album_id, ...): loads the album's
source IDs + its tracks' (duration_ms, title) from the DB via the SAME
loader the Reorganizer uses (load_album_and_tracks + _extract_source_ids), so
the canonical is chosen over exactly the source IDs the reorganizer sees;
scores off the DB track rows (the library's view of the files — no per-file
disk reads), resolves the best fit, and persists it. Returns the stored result
or None when unresolved.
- default_fetch_tracklist(): production fetcher wrapping
get_album_tracks_for_source, normalising to {title, track_number, duration_ms}
(duration best-effort; sec->ms; absent -> scorer leans on count+title).
Design note: chose LAZY resolution (Stages 3-4 consumers call this when they hit
an album with no canonical) over a standalone backfill repair job — no new
scheduling/UI surface, resolves only when a tool actually needs it, and stays
gated (NULL canonical = today's behavior).
Tests: tests/test_canonical_orchestration.py (5) — end-to-end on a real temp DB
(11 files pick the 11-track release over a 17-track deluxe and persist it),
no-source-ids -> None, missing-album -> None, and default_fetch_tracklist
normalization (dict items, seconds->ms) + failure -> None. All canonical +
DB-migration tests green.
|
2 weeks ago |
|
|
f37bc34082 |
Canonical album version — Stage 2 (core): resolver + persistence (dormant)
Turns the Stage-1 scorer into an end-to-end resolver + persists the result.
Still DORMANT — no consumer reads it yet, so zero behavior change.
- core/metadata/canonical_resolver.py — resolve_canonical_for_album(): builds
candidate releases from the album's per-source IDs (in source-priority order),
fetches each tracklist via an INJECTED fetch_tracklist (so it's unit-testable
without live APIs), scores them with pick_canonical_release, and returns the
best-fit {source, album_id, score}. Skips sources with no id / failed fetch;
returns None when there are no files, no candidates, or nothing clears the
confidence floor.
- database/music_database.py — set_album_canonical() / get_album_canonical()
write/read the Stage-1 columns. get returns None when unresolved, which every
consumer will treat as "fall back to today's behavior".
Tests: tests/test_canonical_resolver.py (7) — best-fit beats priority, priority
breaks true ties, skips missing-id/failed-fetch sources, None on
no-candidates/no-files/below-floor, score rounding. tests/test_canonical_db.py
(4) — set/get round-trip incl. timestamp, unresolved -> None, overwrite,
missing-album -> False. 34 canonical + DB-migration tests pass.
Remaining for Stage 2 (the trigger): read on-disk file durations/titles for an
album, gather its source IDs, call the resolver, store — wired via a backfill
repair job + an enrichment hook. Then Stages 3-4 wire the Reorganizer and Track
Number Repair to READ the pinned canonical.
|
2 weeks ago |
|
|
818c4f0bff |
Canonical album version — Stage 1: schema + pure scorer (dormant)
First stage of the canonical-album-version fix (#765 + #767-Bug2). Pins ONE canonical (source, album_id) per album, chosen by best-fit to the user's actual files, so the Reorganizer, Track Number Repair, and tagging stop re-resolving independently and contradicting each other. Ships DORMANT — nothing reads or writes the new data yet, so zero behavior change. Later stages populate (Stage 2) and consume (Stages 3-4) it. - core/metadata/canonical_version.py — pure scorer (the testable heart): score_release_against_files() rates a candidate release by track-count fit + duration alignment (greedy nearest within ±3s) + title overlap, dropping and renormalizing missing signals so it never crashes on sparse metadata. pick_canonical_release() takes candidates in source-priority order, picks the best fit, breaks ties toward the earlier (higher-priority) candidate so the choice is DETERMINISTIC — that determinism is what makes every tool agree (#765), while count/duration fit picks the right EDITION (#767-Bug2). A confidence floor (default 0.5) means a low-confidence guess is never pinned. - database/music_database.py — additive, nullable columns on albums (canonical_source / canonical_album_id / canonical_score / canonical_resolved_at), guarded by the existing PRAGMA-table_info pattern. NULL = unresolved = every consumer falls back to today's behavior. Tests: tests/test_canonical_version.py (11) — edition discrimination (11 files -> standard, 17 -> deluxe), deterministic priority tiebreak, duration disambiguation on count ties, graceful degradation (no durations / counts only / fuzzy titles), confidence floor, empty-input safety. tests/test_canonical_ columns_migration.py (4) — fresh DB has the columns, they're nullable w/ NULL default, migration is idempotent, and it ALTERs them onto an old albums table. 60 DB/schema regression tests still pass. |
2 weeks ago |
|
|
cd9e4abc7c |
#766 follow-on: source rows borrow their matched server track's cover
A source row with no art of its own (e.g. a YouTube source, which provides
none at mirror time) now borrows the cover from its MATCHED server track, so
both sides of the sync editor show an image.
The endpoint already had a borrow fallback (_server_art_map), but it matched by
an exact normalized "{artist}|{title}" key — so a YouTube-shaped row like
"Arctic Monkeys - Do I Wanna Know?" never matched the library's "Do I Wanna
Know?" and stayed blank even though the server had the cover. This borrow is
keyed off the ACTUAL source<->server pairing the reconcile already computed, so
it works for those rows once #768's canonical matching pairs them.
Done in the pure reconcile_playlist (final pass), so no frontend change is
needed — the editor already renders source_track.image_url. Guarded so it only
fills an EMPTY source image (Spotify/CDN art is never overwritten) and only when
the matched server track actually has a thumb.
Composes with the rest: #766 made the server cover URL work, #768 made the
YouTube row match, this makes the matched source row borrow that cover — so an
artless YouTube row matched to a Navidrome track with art shows on both sides.
Tests: tests/test_playlist_reconcile.py (+4) — artless source borrows the
matched cover; source with its own art keeps it; unmatched source has nothing to
borrow; borrow skipped when the server track has no thumb. 15 reconcile + 59
sync/navidrome tests pass.
|
2 weeks ago |
|
|
89b438974f |
Fix #766: Navidrome album covers blank in the sync editor (+ other modals)
The sync editor renders server covers as <img src="/api/navidrome/cover/{id}">,
but no Flask route ever served that path — so every Navidrome cover 404'd, on
every album, art or not. The source (left) side then went blank too: a source
row with no native art (e.g. YouTube, which provides none at mirror time) falls
back to borrowing the matched server track's cover — i.e. that same dead route.
So both sides collapsed to nothing.
Fix:
- New NavidromeClient.build_cover_art_url(cover_id) — builds the absolute,
Subsonic-authenticated getCoverArt URL (base_url + token/salt), keeping
credentials server-side. Uses a FIXED cover-art salt so the URL is
deterministic for a given (server, password, cover_id): a rotating salt (as
in _generate_auth_params) would make every request a unique URL → image-cache
miss every time + a dead, never-reused cache row per fetch. Token auth doesn't
require a unique salt, and the password is never exposed (only its salted md5).
- New route /api/navidrome/cover/<cover_id> — resolves that URL and streams the
image through the shared image cache (same pattern as /api/image-proxy), with
a private max-age so the browser caches by the stable route URL.
Effect: server side works for any album that has art in Navidrome; matched
source rows with no native art now borrow the (now-working) server cover.
Unmatched YouTube rows stay blank — no image exists anywhere to show.
Tests: tests/test_navidrome_cover_url.py (8) — URL structure + salted-token auth
(never the raw password), determinism (same id -> same URL so the cache hits;
different id/password -> different URL), optional size, and the not-connected /
no-id / no-credentials guards.
Caveats: not executed against a live Navidrome (no server in CI) — the URL
builder is unit-tested; the route's cache→HTTP→bytes round-trip is read-verified
only. Scope is the sync editor's Navidrome route; Plex/Jellyfin server-cover
branches and any other modals using a different mechanism are untouched.
|
3 weeks ago |
|
|
3b49ac8280 |
Fix #767: Library Organizer dry run no longer creates folders
The reorganize preview (dry run) was physically creating destination album folders, littering the library with empty dirs and making "changes" before the user ever hit Apply. Cause: preview_album_reorganize calls build_final_path_for_track purely to COMPUTE the destination path string — but that shared helper has 9 os.makedirs side effects (it's also the live download/import path builder, where creating the dir is correct). So computing the preview path created "Lenka (Expanded Edition)/" on disk. Fix: build_final_path_for_track gains create_dirs=True; all 9 makedirs now route through a gated helper. The reorganize PREVIEW passes create_dirs=False, so a dry run computes the exact destination path with zero filesystem side effects. Everything else keeps the default True: - the download/import post-process flow (still writes files into the dir), - retag, - the reorganize APPLY path — verified it goes through post_process_fn (the real pipeline → build_final_path_for_track with create_dirs=True), so live moves still create their destination dirs. The gate only silences the dry run. Tests: tests/imports/test_import_paths.py — create_dirs=False computes the correct path (matching the reported "01 - The Show.flac") but writes NOTHING to disk (not even the Transfer root); create_dirs=True still creates folders; both yield an identical path. Updated two reorganize-orchestrator test doubles to accept the new kwarg. 148 reorganize/paths/retag/pipeline tests pass. Does NOT fix the second half of #767 (Expanded Edition picked over the standard album). That is NOT a reorganizer bug: the library album row was linked to the deluxe release at enrichment time (its stored spotify_album_id/itunes_album_id/ deezer_id points at "Lenka (Expanded Edition)"), and the reorganizer faithfully reorganizes to whatever the album is linked to. The real fix is in album enrichment's edition preference — tracked separately. |
3 weeks ago |
|
|
bba0836324 |
Fix #768: playlist sync editor refusing to match certain tracks
Three compounding bugs hit tracks whose source metadata is YouTube/streaming- shaped — title "Artist - Song", artist "Official Artist"/"Artist - Topic"/ "ArtistVEVO" (reported: "Arctic Monkeys - Do I Wanna Know?" by "Official Arctic Monkeys"). Server-agnostic — affects Plex/Jellyfin/Navidrome, not just the reporter's Navidrome. Bug A — the match fails. The confidence scorer and the editor's reconcile both compared the raw "Artist - Song" title against the library's clean "Song"; the length-ratio penalty + floor drove it to ~0.18 (NO-MATCH), so the track showed unmatched while its server copy showed as an orphan "extra". New pure core/text/source_title.py (clean_source_artist / strip_artist_prefix / canonical_source_track) strips the channel/video decoration, applied at BOTH matching seams: services/sync_service._find_track_in_media_server (tries raw then canonical, keeps the best) and the editor reconcile. Conservative: a title prefix is stripped only when it equals the artist, so "Self-Titled", "Jay-Z", and "Marvin Gaye" (by another artist) are untouched, and the canonical form is an additional best-of candidate so it can only help. Bug B — manual matches never persisted. get_server_playlist_tracks built the per-source entry WITHOUT source_track_id, so "Find & add" posted an empty id and _persist_find_and_add_match returned early. The match reverted to "extra" on reload and re-adding looped. The editor's 3-pass matcher is now lifted to a pure, tested core.sync.playlist_reconcile.reconcile_playlist that includes source_track_id (the frontend at pages-extra.js:1836 already reads + sends it). Bug C — manual match duplicated + delete wiped all copies. "Find & add" always inserted, so linking a source to an already-present server track appended a duplicate (pos 72, 73...); remove filtered out EVERY entry with the target id. New pure core.sync.playlist_edit (plan_playlist_add: link-don't-duplicate when the target is already present; remove_one_occurrence: drop a single copy) wired into the Plex/Jellyfin/Navidrome add + remove branches. Tests (extreme): tests/test_source_title.py (35), tests/test_playlist_reconcile.py (11 — incl. the reported case, parity for override/exact/fuzzy/extra, and duplicate-server handling), tests/test_playlist_edit.py (12). 286 matching/sync tests still pass. Caveats: the sync_service change and the add/remove/editor endpoints are read-verified, not executed against a live media server (none in CI). The pure cores they call are exhaustively unit-tested; output-shape parity of the reconcile lift is covered. Delete removes the first matching copy (duplicates are identical, so harmless). |
3 weeks ago |
|
|
174513d351 |
Fix #769: playlist sync matched wrong same-artist track with high confidence
Tracks NOT in the library were matched to a DIFFERENT song by the SAME artist
and reported with high confidence instead of as missing — e.g. "Dani
California" -> "Californication" (Red Hot Chili Peppers), "Under The Bridge"
-> "Around the World".
Root cause: _calculate_track_confidence scores 0.5*title + 0.5*artist. A
same-artist comparison always yields artist = 1.0, so the title score is the
only thing that can tell two of an artist's songs apart — but that score is a
SequenceMatcher CHARACTER ratio, which over-credits unrelated titles that
share a long substring ("californi…" = 0.67) or just a stopword ("the" =
0.62). With the flat 0.5 artist term, anything clearing the weak 0.6 char
floor lands at ~0.81-0.83, well over the 0.7 sync threshold. Reproduced on
dev: both reported pairs score 0.81/0.83.
Fix: new core/text/title_match.py:titles_plausibly_same, called in
_calculate_track_confidence right before the floor. It accepts a pair only
when it's near-identical char-wise (>=0.85, so typos / punctuation / casing
like "Beleive"->"Believe", "HUMBLE."->"Humble" still match) OR the titles
share at least one significant (non-stopword) word. Two different songs by the
same artist share no content word, so they're rejected and the real track is
correctly reported missing. ("the" is a stopword — that's what leaked "Under
The Bridge"/"Around the World".)
Scoped deliberately: the word-overlap test fires ONLY when at least one side
has 2+ content words. For single-word titles there is no other word to share,
so it defers to the existing char floor — otherwise legitimate stylized
spellings ("Grey"/"Gray", "Tonite"/"Tonight", "4ever"/"Forever") would become
new false-negatives. Verified those still match. The few single-word variants
that do score low (Ok/Okay, Thru/Through) were already rejected by the
pre-existing length-ratio penalty, not by this gate.
Both reported false positives now score 0.33/0.31 -> missing. Does NOT address
the harder case of two different same-artist songs that DO share a content
word (e.g. "Believe"/"Believer") — pre-existing and unworsened. Any residual
error fails safe: a false-missing is re-downloaded/wishlisted, vs the old
behavior which silently substituted the wrong song.
Tests: tests/test_title_match_guard.py (14) — pure-guard unit tests + a
13-pair battery driving the REAL _calculate_track_confidence (genuine matches
stay >=0.7, same-artist different songs drop below), plus an explicit
no-regression test for stylized single-word spellings. 292 matching/sync tests
pass.
|
3 weeks ago |
|
|
3c15041b88 |
Fix #764: manual import reported quarantined files as a successful "Done"
The manual-import routes (album + singles) call post_process_matched_download directly. When the pipeline quarantines a file — integrity / AcoustID / FLAC bit-depth — or hits the race guard, it sets a context flag and RETURNS NORMALLY (it only marks the task failed + notifies when there's a task_id, which manual imports don't have). So the inner pipeline raised no exception, and routes.py counted `processed += 1` for a file that had just been moved to ss_quarantine, not the library. Result: the UI shows a green "Done" while the track silently vanished — exactly the #764 report (Coldplay - Yellow.flac -> ss_quarantine, but "Done"). The download path already handles this in post_process_matched_download_with_verification (it reads the same flags and marks the task failed); only the manual-import routes were missing the check. Fix: new pure helper import_rejection_reason(context) returns a human-readable reason for any terminal rejection (_integrity_failure_msg / _acoustid_quarantined / _bitdepth_rejected / _race_guard_failed) or None for a clean import. Both manual-import routes now consult it: album_process reports the track in `errors` instead of counting it processed; process_single_import_file returns ("error", reason) instead of ("ok", ...). Verified every move_to_quarantine call site (4, all in pipeline.py) sets one of those flags, so no quarantine path slips through. This also delivers the "direct display of the error" the reporter asked for — the reason now surfaces in the response `errors` list. Does NOT address the reverse symptom ("failed even though it moved correctly") — not yet root-caused — nor the separate bit-depth hole on the download-path wrapper. Tests: tests/imports/test_import_rejection_reason.py (10) — each trigger detected, falsy flags ignored, deterministic ordering, plus two route-level tests driving the REAL process_single_import_file (quarantine -> "error"; clean -> "ok"). |
3 weeks ago |
|
|
3dfec8a157 |
Fix #764: import no longer destroys embedded cover art
enhance_file_metadata rebuilds tags from scratch: for FLAC it calls
clear_pictures(), for MP3/MP4 it clears the whole tag block — and it does
this UP FRONT, then saves the file, long before it tries to fetch and embed
the replacement art. So every way the re-embed could come up empty left the
file saved with the original art destroyed and nothing put back:
- extract_source_metadata returns nothing -> early save, no embed
- no album-art URL / art download fails / rejected by the min-size guard
-> embed_album_art_metadata returns early without adding a picture
- art embedding disabled in config -> embed skipped entirely
- embed raises mid-enrichment -> file left cleared on disk
This is the "cover art gets corrupted/destroyed during import" half of #764
(continuation of #755); distinct from #750's truncated-cache DISPLAY bug.
Fix: new core/metadata/art_preservation.py snapshots the existing art
(the live Picture / APIC / MP4Cover objects, so they re-apply verbatim)
BEFORE the clear, and restores it before each save IFF the file currently
has none. Wired into all three exit paths in enhance_file_metadata
(no-metadata early return, the final save, and the except handler). The
restore is a strict no-op when art is already present, so the happy path —
new art embedded — is byte-for-byte unchanged: it never clobbers or
duplicates a freshly-embedded cover. embed_album_art_metadata now returns a
bool so the intent (embedded / didn't) is explicit.
Tests:
- tests/test_art_preservation.py (5) — snapshot/restore round-trips through
real mutagen FLAC + ID3 objects; restore no-ops when new art is present.
- tests/test_enrichment_art_preservation.py (4) — runs the REAL
enhance_file_metadata over a real FLAC with embedded art and asserts the
art survives on disk for missing-metadata / failed-embed / embed-raises,
and is correctly REPLACED (exactly one picture, new bytes) on success.
1019 tests pass across the metadata/enrichment/imports/acoustid suites.
|
3 weeks ago |
|
|
de20897f83 |
Fix: deep-scan / DB-update automation falsely errors on large libraries (stall-based timeout)
The DB-update + deep-scan automation monitor used a hard 2-hour TOTAL cap
(while elapsed < 7200). It tracked progress but only used it to print a stall
warning — the only thing that actually timed out was wall-clock. So a large
library that scans for >2h while progressing fine (reported: 4781 artists) trips
the cap and the automation card flips to 'error: timed out after 2 hours' even
though the scan thread is healthy and still running (the timeout never cancels
it, which is why it keeps progressing in the logs after the 'error').
Time out on STALL, not total runtime:
- 30 min with NO progress -> error ('stalled'); catches a genuinely hung scan.
- 10 min idle -> warning (repeats); unchanged heads-up.
- 24h absolute backstop, purely a runaway-loop guard.
- An actively-progressing scan keeps resetting the idle clock, so it never
times out no matter how many hours the whole library takes.
- Progress is judged on (processed, progress, current_item) so a slow stretch
where the rounded % holds steady (but the artist keeps changing) isn't a
false stall.
The decision is extracted into a pure, testable scan_wait_action(); both the
deep-scan and full-refresh handlers share the monitor loop, so both are fixed.
Tests: tests/test_scan_wait_action.py (9) — headline regression (5h/12h total
but progressing -> 'continue', not timeout), finished/stall-warn/stall-timeout/
abs-cap thresholds, and ordering. 280 automation tests still pass.
|
3 weeks ago |
|
|
c8c3789cb9 |
Album bundle: fall back to per-track on an I/O error, don't hard-fail the batch
Defense-in-depth follow-up to #760. Even with the entrypoint chown fix, if the album-bundle staging dir ever can't be created/written (permissions, read-only mount, disk full), the dispatch caught the plugin exception and marked the whole batch failed — even though the album had already downloaded (the #715 symptom: 'release finishes downloading but the batch fails'). Now an OSError from the plugin is flagged fallback-eligible, so the dispatch returns to the per-track flow instead of hard-failing. OSError covers the staging/filesystem failure that motivated this (#760's PermissionError) and, by Python's IOError==OSError aliasing, any propagated transient I/O error — falling back is never worse than hard-failing, and per-track is the universal graceful path. Programming errors (TypeError, KeyError, RuntimeError, …) are NOT OSError and stay terminal, so genuine bugs still fail loudly — the existing 'plugin exception => failure' contract and its test are preserved. Test: new test_dispatch_staging_oserror_falls_back_to_per_track (PermissionError on the staging dir -> result False, phase 'analysis', not failed). Existing RuntimeError-is-terminal test still passes. 131 album-bundle/plugin tests green. |
3 weeks ago |
|
|
aabf1c0e6a |
Fix #760: chown /app/storage to PUID on every start (album-bundle staging EACCES)
The album-bundle staging area /app/storage is baked into the image owned by the build-time soulsync UID. The entrypoint only re-chowned it to the runtime PUID inside the GATED recursive chown (entrypoint.sh:43), which is skipped whenever /app/data is already owned correctly — and /app/storage was missing from the UNCONDITIONAL per-start chown (line 85). So on installs whose PUID differs from the build UID and whose /app/data is already correct, /app/storage kept its build ownership and wasn't writable, and the Soulseek album-bundle flow died with: PermissionError: [Errno 13] Permission denied: 'storage/album_bundle_staging' (/app/Stream was added to the unconditional chown after this exact bug; /app/storage slipped through.) Add /app/storage — plus /app/MusicVideos and /app/scripts, which were also missing — to the unconditional mkdir+chown (lines 84-85) and the writability audit (line 92), matching the Dockerfile's pre-baked dir list. /app/storage is now chowned to the runtime PUID on every start regardless of the gated recursive chown. Verified with bash -n; all four dir lists are now consistent. |
3 weeks ago |
|
|
cea0e365c2 |
Fix #759: Amazon enrichment floods when its public proxy is down
After an update, installs became unusable: the Amazon enrichment worker runs by
default, the default public T2Tunes proxy (t2tunes.site) was returning
503 'Amazon Music API is not initialized', and the worker treated every album
as an individual error -- logging an ERROR per item, churning network + DB
continuously across the whole library, and marking every row 'error' (a state
the retry tiers never re-attempt, so even after the proxy recovered nothing
re-enriched). The reporter couldn't reach the UI to turn it off.
Two-part fix:
1. Source-outage circuit breaker (core/amazon_outage.py, pure + tested):
- is_source_outage(exc) distinguishes a whole-source outage (HTTP 5xx,
'not initialized', connection failure, non-JSON error page) from a real
per-item miss (404, transient 400, etc.).
- On an outage the worker now leaves the item UNTOUCHED (so it's retried once
the proxy recovers instead of being permanently burned to 'error'), logs
ONCE per streak, and backs off with next_poll_delay_seconds() -- escalating
30s -> 60s -> ... capped at 30 min -- instead of grinding every 2s. It
auto-resumes the normal cadence the moment the source answers (success OR a
non-outage error both clear the streak).
- AmazonClientError now carries status_code so detection doesn't rely on
message parsing.
2. Opt-in by default (web_server.py): amazon_enrichment_paused now defaults to
True. Because enrichment depends on an external public proxy that can be
down, it stays paused unless the user explicitly enables it -- a proxy outage
can no longer take down installs that never opted in. (Behaviour change:
anyone on the old auto-on default is now paused; re-enable in Settings.)
Together: on update the worker is paused -> no flood -> UI accessible; opted-in
users are protected from future outages by the breaker.
Tests: tests/test_amazon_outage.py (12) pin the classifier across every error
surface (incl. the exact 503 'not initialized' case) and the back-off schedule
(monotonic, capped). 157 Amazon tests pass; lint clean.
Note: could not reproduce the exact 'UI fully unreachable' mechanism remotely
(WAL + 8 gthreads shouldn't hard-lock); the fix removes the flood/churn that is
the practical cause and defaults the feature off.
|
3 weeks ago |
|
|
28850672a6 |
Fix: duplicate detector kept lossy over lossless (rank format first)
The Duplicate Detector's 'Keep Best' auto-selection ranked copies by highest
bitrate -> duration -> track number, with no notion of format. A FLAC whose
bitrate the library scan never populated (a common gap) therefore lost to a
282 kbps MP3: 282 > 0, so the MP3 was kept and the FLAC deleted (reported on
Havok 'Prepare For Attack', and again on Kendrick GNX).
Fix: rank by format/lossless tier FIRST, then bitrate, duration, track number.
A lossless file now always beats a lossy one regardless of the recorded
bitrate; bitrate/duration/track# only break ties within the same format.
- core/library/duplicate_keep.py (new): pure, importable pick_duplicate_to_keep
+ duplicate_keep_sort_key + format_rank_for_path (extension rank mirroring
auto_import_worker._quality_rank: flac=10 ... mp3=5 ... unknown=1).
- core/repair_worker.py: _fix_duplicates auto-pick now calls
pick_duplicate_to_keep instead of the bitrate-first max().
- webui/static/enrichment.js: the KEEP/REMOVE recommendation mirrors the same
format-first ranking so the badge matches what the backend will delete.
Parity: Python uses '.ext' keys (os.path.splitext), JS uses 'ext'
(split('.').pop()) -> identical results; both keep the first copy on a full
tie. Verified the only other dedup path (the standalone Duplicate Cleaner
automation, core/library/duplicate_cleaner.py) was already format-priority-first
and correct -- no change needed there.
Tests: tests/test_duplicate_keep.py (11 -- incl. the exact FLAC-with-missing-
bitrate vs 282 kbps MP3 case, format ranking, within-format tie-breakers, and
edge cases). 147 repair/duplicate tests still pass.
Note: why FLAC bitrate is NULL in the DB is a separate library-scan gap;
format-first ranking makes the keep decision correct regardless.
|
3 weeks ago |
|
|
b202c176f7 |
Cover-art sources: skip low-res art (min-resolution guard) + max-res iTunes
Follow-up to the preferred-art feature. Real test runs showed a source could win on priority while handing back a small cover: Cover Art Archive is volunteer-uploaded with no size floor, so CAA-first gave a 599x531 (Taylor Swift) and a 600x600 (Kendrick GNX) -- front-1200 only caps the max, so a ~600px upload stays ~600px -- and Deezer/iTunes lower in the order never got a turn. Fix: - Minimum-resolution guard: artwork._min_size_art_validator builds the resolver's validate hook -- it fetches each candidate, caches the bytes (so the winner isn't fetched twice), and accepts art only when its shortest side >= metadata_enhancement.min_art_size (default 1000px; 0 disables). Art that's too small is a miss, so the resolver falls through to the next source instead of winning on priority. Unmeasurable images are accepted (don't over-reject; fallback is still today's art). Wired into both embed_album_art_metadata and download_cover_art. - iTunes art upgraded to /3000x3000bb/ (was the 600px default) so it contributes high-res when it wins. - select_preferred_art_url gains a validate passthrough to the resolver. - config default metadata_enhancement.min_art_size: 1000. Effect: with an order like caa > deezer > spotify > itunes, a ~600px CAA upload is now skipped and Deezer's ~1900px wins -- consistent big art. (Spotify art often maxes ~640px, so it's skipped at the 1000 floor in favor of bigger sources; lower min_art_size to ~640 to allow it.) Tests: tests/metadata/test_art_min_size.py (6 -- incl. the real 599x531 and 600x600 cases, shortest-side logic, unmeasurable-accept, no-bytes-reject, 0-disables) + iTunes max-res upgrade test. Full metadata suite green (617). |
3 weeks ago |
|
|
6bc2836f47 |
Feature: preferred album-art source selection (opt-in, ordered, with fallback)
Lets users pick which providers' cover art to use and in what priority, generalizing the single prefer_caa_art toggle into an ordered, mix-and-match list (Sokhi's request). Fully opt-in: default album_art_order is [], so every existing install is byte-for-byte unchanged until the user enables sources. How it works: - Per album, walk the user's ordered sources top-to-bottom; the first source that actually has THIS album's cover wins. A miss falls through to the next; if all miss, the download's own art is kept (today's default). The worst case is always exactly the cover you'd get today -- never wrong art, never an error into the download. - Connection-gated: a source is only tried when the user is connected to it (free sources CAA/Deezer/iTunes/AudioDB always; Spotify only when authenticated). Tidal/Qobuz/HiFi deferred (cover-URL construction + no clean core accessor -- not shipping unverified extraction). - Album-match validated: a source's art is used only when the album it returns matches the requested artist+album (significant-token subset, tolerant of Deluxe/Remastered/articles/feat./multi-artist). A loose top search hit for a different record is treated as a miss -> guarantees no wrong-album art. - The list supersedes the legacy prefer_caa_art toggle: when album_art_order is non-empty it is the sole authority (add 'caa' to the list to use Cover Art Archive), and prefer_caa_art is neutralized for both the embedded-tag art and cover.jpg paths. With an empty list, prefer_caa_art behaves exactly as before. Implementation: - core/metadata/art_sources.py: pure resolver -- effective_art_order (config + legacy back-compat) and resolve_cover_art (ordered walk + fallback, exception-safe per source). No network/config/DB; fully unit-testable. - core/metadata/art_lookup.py: availability gating, per-source lookups against existing clients (Deezer/iTunes/AudioDB/Spotify search + CAA via MBID), album-match validation, per-album caching, and select_preferred_art_url -- the single gate the pipeline calls (no-op unless an explicit list is set). - core/metadata/artwork.py: wired into embed_album_art_metadata and download_cover_art, gated so no configured list == current behavior. - web_server.py: GET /api/metadata/art-sources (connected sources only). - config/settings.py: default album_art_order: []. - webui (index.html + settings.js): reorderable list in Core Features reusing the hybrid-source-list pattern + real service logos (with emoji fallback); load/save wired through the existing metadata_enhancement settings flow. loadArtSourceOrder populates the saved order synchronously (filtered to known sources, not availability) so a save before the availability fetch resolves, or a temporarily-disconnected source, can never wipe the saved order. Tests: 40 unit/seam tests (resolver ordering/fallback/back-compat, availability, per-source extraction, album-match validation incl. wrong-album/wrong-artist rejection, caching, exception-safety, the off-by-default gate). Full metadata suite still green (610 passed) -- the gated integration changes nothing when no list is configured. Note: the settings UI (DOM-heavy, not unit-testable in the JS harness) and the live per-source art-fetch quality are validated by manual testing. |
3 weeks ago |
|
|
95d6ad4bc9 |
Fix: torrent/usenet album bundle hard-fails on 'no results' instead of falling back
A torrent-first (or usenet-first) hybrid download would freeze at
"Torrent searching for release 0%" and never move to the next source when
Prowlarr returned no results for the album. Reported by Cezar:
[Album Bundle] torrent flow failed for '...': No torrent results found
Cause: the album-bundle dispatch only returns to the per-track flow (which,
in hybrid mode, tries the next configured source) when the plugin's failure
outcome carries fallback=True; otherwise it marks the batch failed and stops.
Both plugins set fallback=True on their 'results found but none matched the
album' branch, but the adjacent 'no results at all' branch set only an error
and no fallback flag -- so zero results hard-failed while wrong results fell
back. Backwards, and soulseek's plugin already defaults fallback=True for
exactly this reason.
Fix: set fallback=True on the no-results branch in torrent.py and usenet.py.
The dispatch's fallback handling (return False -> per-track flow) was already
correct and is unchanged.
The only consumer of download_album_to_staging is the dispatch, which reads
the result via .get('fallback'), so the change is additive and locally
contained.
Tests: new test_torrent_album_to_staging_no_results_flags_fallback and
test_usenet_album_to_staging_no_results_flags_fallback assert the plugins now
emit fallback=True on an empty search; the existing torrent no-results test is
extended with the same assertion. Existing dispatch tests already pin
fallback=True -> per-track flow. Full downloads/plugins/adapters sweep: 690
passed.
|
3 weeks ago |
|
|
6e7948b642 |
Mirrored playlist modal: revamp + fix header clipping on long playlists
The preview modal looked amateur and its header/footer clipped on long playlists (wolf39's 316-track "Road trip" showed neither title nor buttons). Root cause of the clip: .mm-list (the scroll area) was a flex child with flex:1 but no min-height:0. Flex items default to min-height:auto, so the list refused to shrink below its content, the modal blew past max-height, and overflow:hidden + vertical centering pushed the header off the top and the footer off the bottom. Now the list has min-height:0 and the hero + action bar are flex-shrink:0, so they stay pinned and the list scrolls. Visual revamp to match the rest of the app, using data already returned by /api/mirrored-playlists/<id> (image_url on the playlist and each track): - Hero uses real artwork (playlist cover -> first track art -> gradient fallback) with a blurred art backdrop + darkening overlay, replacing the emoji-in-a-box. Eyebrow + large title + meta line (source pill, owner, track count, total runtime, mirrored-ago). - Track rows gain per-track album thumbnails, two-line title/artist, album, duration, and a sticky column header. Missing art falls back to a gradient tile via onerror (no broken-image icons). - Cleaner action bar: primary Discover, secondary Auto-Sync, ghost Edit/ Close, quiet danger-outline Delete. Old .mirrored-modal-* / .mirrored-track-* / .mirrored-btn-* classes removed from style.css and replaced with the new .mm-* set; the _escJs escaping in the footer buttons (apostrophe fix) is preserved. |
3 weeks ago |
|
|
ffb9249ded |
Fix: mirrored playlist action buttons dead when name has an apostrophe
A mirrored playlist named with an apostrophe (e.g. "Road trip-The
Rolfe's") rendered dead action buttons. _escAttr HTML-escapes ' to ',
but it was used to inject the name into a single-quoted JS string inside an
inline onclick. The HTML parser decodes ' back to a bare ' BEFORE the JS
parser runs, producing an unterminated string literal -> SyntaxError -> the
whole handler fails to compile.
Two symptoms (both reproduced with the real name + the literal line-524
onclick template): clicking the X delete never ran event.stopPropagation(),
so the click bubbled to the card and opened the track preview instead; and
the preview's "Delete Mirror" silently did nothing (no DELETE request, no
log). Plain names ("Classic Rock") were unaffected, which is why it looked
intermittent.
Add a dedicated _escJs() that backslash-escapes the JS metacharacters (\, ')
first, then HTML-escapes the attribute-breaking chars - correct for a
single-quoted JS string inside a double-quoted HTML attribute. Convert all 16
inline-onclick string-argument sites to it: mirrored card (clear/Auto-Sync/
link/delete) and preview modal, plus the same latent bug in pool Fix Match /
Rematch, group bulk-toggle/rename/delete, and automation history/group/delete.
Genuine HTML-attribute usages (class/value/data-*/title/option) stay on
_escAttr where it is correct.
Tests: tests/static/test_stats_automations_esc.mjs extracts the real _escJs/
_escAttr from source and asserts apostrophe + quote/backslash/&/<> names
round-trip through HTML+JS decoding, documents that _escAttr throws a
SyntaxError for the apostrophe case while _escJs compiles clean, and pins
wolf39's exact name. pytest shim tests/test_stats_automations_esc_js.py runs
it under node --test (skips if node<22 / absent).
|
3 weeks ago |
|
|
22f30d3f40 |
tests: isolate the database so the suite can never touch the real DB
Several tests exercise modules (e.g. album_mbid_cache) that call get_database() with no path, which resolves to the real database/music_library.db. Running those writers against the live DB over a WSL-mounted Windows drive corrupted a user's library. conftest never isolated the DB path. conftest.py now sets os.environ['DATABASE_PATH'] to a throwaway temp file at import time (before any test module loads). MusicDatabase resolves its default path from DATABASE_PATH, so EVERY default-path MusicDatabase()/get_database() call in the suite now lands in /tmp — the real DB is never opened. Temp dir is removed at exit. Guard tests assert DATABASE_PATH, MusicDatabase().database_path, and get_database().database_path all resolve into the temp dir and never to database/music_library.db. (album_mbid_cache tests, which were failing against the corrupted live DB, now pass clean on the isolated temp DB.) |
3 weeks ago |
|
|
efe3895d5d |
Fix: metadata cache tables silently missing after DB recovery (stale migration marker)
Nothing was landing in the metadata cache browser because the metadata_cache_entities / metadata_cache_searches tables did not exist, so every cache write no-op-ed. Root cause: _add_metadata_cache_tables short-circuited on a marker-only guard (if the metadata_cache_v1 marker row exists, return). After a DB corruption-recovery the small metadata table (with the marker) survived but the large cache tables did not, so the stale marker permanently blocked the idempotent CREATE TABLE IF NOT EXISTS and the cache was dead forever. Guard now skips only when the marker is set AND the tables actually exist, so a stale marker self-heals: the tables are re-created on the next init. Tests: marker present but tables dropped -> re-created; marker + tables present -> no-op (idempotent). |
3 weeks ago |
|
|
482d5fbc79 |
Fix: Spotify sync crash 'unexpected keyword argument candidate_pool'
When no media server is connected, discovery/sync patches sync_service's
matcher with a database-only implementation. sync_service calls it as
_find_track_in_media_server(track, candidate_pool=...) (a per-artist candidate
cache), but the database-only override took only (spotify_track) — so every
sync raised 'database_only_find_track() got an unexpected keyword argument
candidate_pool' and aborted.
Lift the override from a nested closure to a module-level
_database_only_find_track(spotify_track, candidate_pool=None) so it (a) accepts
the kwarg for interface parity with the real matcher and (b) is importable and
unit-testable. The DB-only path queries the library directly via
check_track_exists, so it accepts but doesn't need the candidate cache. Also
dropped the dead original_find_track local.
Tests: signature includes candidate_pool; called with candidate_pool={} returns
(None, 0.0) on no match; returns the match when the DB has it.
|
3 weeks ago |
|
|
227c9373fe |
Batches panel: redesign expanded track rows + fix scrollbar clipping
The per-track list inside an expanded batch was a cramped flat row with a faint title and a -2px progress-bar hack, and the nested scrollbar sat on top of the text. Reworked: - Each row is now a grid: track number · title (+ artist sub-line) · right-aligned state, with hover, tabular-aligned numbers, per-row state coloring (✓ green / ✗ red / % accent / dim queued / strikethrough cancelled), and a clean full-width progress bar beneath downloading rows. - Track list gets right padding + a thin, subtle scrollbar so it no longer clips titles; same thin-scrollbar treatment on the panel itself. - Panel widened 340->366 with rebalanced side padding for more readable content. Collapsed-panel behavior unchanged. |
3 weeks ago |
|
|
f50e67ac9b |
Batches panel: Phase A visual upgrade (summary, segmented progress, ETA, live track)
Takes the Active Downloads batch panel from flat cards to a glanceable, information-rich view: - Sticky aggregate summary strip: 'N batches · X downloading · Y queued · speed · ~ETA'. - Segmented progress bar per batch — proportional done (green) / failed (red) / active (accent, animated shimmer) / remaining, so the state reads at a glance instead of one dim fill. - Colored stat chips (✓ done · ✗ failed · ↓ active · queued) + a per-batch ETA from a client-side completion-rate sampler (album bundles use the downloader's own speed/size). No backend changes — Phase A is frontend-only. - 'Now downloading' line showing the live track on active batches. - Expand chevron affordance (rotates when open); subtle phase tinting. - Polished empty state with quick-start links (Search / Sync / Wishlist). Card actions (filter / cancel / open-modal / expand) and the fade/history behavior are unchanged. ETA/speed for non-bundle batches and a retry-failed action are Phases B/C (backend). |
3 weeks ago |
|
|
e072b49138 |
Track-detail modal: fix stray cover-art placeholder (hidden overridden by display:flex)
The 🎵 cover placeholder (and the empty provenance block) stayed visible even
when JS set hidden, because .td-thumb-ph / .td-provenance set display:flex,
which a class selector applies over the browser's [hidden] { display:none }.
Scope a winning rule (#track-detail-overlay [hidden] { display:none !important })
so toggled-off elements actually disappear — the cover shows alone when present.
|
3 weeks ago |
|
|
134d306511 |
Track-detail modal: click any download row for a rich, status-aware view
Clicking a track row in the download modal now opens a polished detail modal (its own template, webui/track-detail-modal.html, included into index.html; behavior in static/track-detail.js): cover, title/artist/album, status badge, in-app play, source, quality, AcoustID verdict, file location, and the expected-vs-downloaded provenance — backed by /api/downloads/task/<id>/detail. It adapts by status: - completed -> play (library stream) + full provenance - quarantined-> reason + Listen (quarantine stream) + Accept & Import + Search - failed/not_found -> reason + Search This absorbs the standalone quarantine chooser, which is removed (its Listen/Accept/Search live here now, with the same Windows file-handle release before Accept and the thin-sidecar -> Recover-to-Staging fallback). Plain failed/not-found rows still go straight to the search modal; sync-import modal unaffected. Status cells clear their clickable/detail state each render so a row that flips to completed isn't left with a stale handler. |
3 weeks ago |
|
|
e4bbcfda1b |
Downloads: add per-track detail endpoint for the track-detail modal
New GET /api/downloads/task/<id>/detail merges the live download task with its library_history row (the data the Download History cards show) into one payload the upcoming track-detail modal renders: status kind, title/artist/album, source, quality, final location, AcoustID verdict, and expected-vs-downloaded. Assembly + status classification live in core/downloads/track_detail.py as a pure, importable build_track_detail()/classify_status_kind() (9 unit tests); the endpoint is thin glue that looks up the matching history row by track. |
3 weeks ago |
|
|
ba6c39bae3 |
AcoustID: report errors honestly instead of masking them as 'Skipped'
An invalid API key (and rate limits / missing chromaprint / fingerprint failures) all collapsed into the same None as a genuine no-match, so: - every download showed a benign 'AcoustID: Skipped', and - the 'Test API key' button reported a dead key as VALID because test_api_key trusted 'no exception = valid' but fingerprint_and_lookup swallows the error and returns None. A broken AcoustID setup looked completely normal — which cost a real debugging session to untangle. - New AcoustIDClient.lookup_with_status() returns a structured result that distinguishes ok / no_match / error / no_backend / fingerprint_error / unavailable. fingerprint_and_lookup() stays dict-or-None (library scanner / auto-import / their tests unchanged) as a thin wrapper over it. - verify_audio_file() uses it: a real error -> new VerificationResult.ERROR (-> _acoustid_result='error' -> the existing red 'Error' history badge), a genuine no-match -> SKIP 'No match in AcoustID database'. ERROR never quarantines (an outage/bad key must not punish good files). - test_api_key() now validates via the authoritative direct API call (error code 4 = invalid key) instead of the swallowed-exception path. Tests: structured-status distinction, legacy-wrapper contract, verify ERROR vs SKIP, and test_api_key invalid-vs-accepted. Existing verify tests updated to stub lookup_with_status (a stub returning just recordings is inferred as ok). |
3 weeks ago |
|
|
a703c5fdc2 |
Quarantine: inline 'Approve' button also marks the row Completed
The actions-column Approve button (approveQuarantineFromDownloadRow) POSTed /approve without a task_id, so it took the inner-pipeline path and never marked the task completed — the row stayed 'Quarantined' even though the file imported. The chooser's Accept was already fixed; this brings the inline button in line: it now carries data-task-id and sends task_id, so the re-import runs through the verification wrapper and the row flips to Completed on success. |
3 weeks ago |
|
|
bad3eb1fab |
Quarantine: flip the modal row to Completed after Accept & Import
Accepting a quarantined item re-imported the file correctly, but the download modal kept showing 'Quarantined'. The re-import ran through the inner pipeline, which doesn't mark task completion (that's the verification wrapper's job), and the sidecar context had no task_id anyway (popped before quarantine). The chooser's Accept now sends the originating task_id, and the endpoint re-runs the import through the verification wrapper with that task_id (+ batch_id looked up from the task), so the task is marked completed only after the file is verified moved — the row flips to Completed on the next poll. Manager-tab approvals (no task_id, no JSON body — handled via get_json(silent=True)) keep the original inner-pipeline path. Also clear has-candidates + the quarantine dataset on every status render so a row that goes quarantined -> completed doesn't keep a stale chooser attached. |
3 weeks ago |
|
|
3060678f29 |
Quarantine: manage a quarantined file from the download modal (Listen / Accept / Search)
Clicking a quarantined track's status used to open the generic search modal,
identical to a plain failure — no way to review or recover the file. It now
opens a chooser:
- Listen: streams the file in-app via a new /api/quarantine/<id>/stream
endpoint (range-supported; the real audio Content-Type is recovered from the
sidecar since the on-disk file ends in .quarantined).
- Accept & Import: existing /approve (restore + re-import, gates bypassed).
- Search for a different result: the existing candidates modal (old behavior).
Non-quarantine failures (not_found / failed / cancelled) are unchanged — a
single click listener routes by dataset set at render time, so a task that
fails then later quarantines can't end up double-bound.
Also fixes the Accept failure on Windows: the Listen stream holds an open file
handle, so the subsequent restore move hit WinError 32 ('file in use') and the
endpoint mislabeled it 'thin sidecar'. Accept now releases the audio handle
before approving, and approve/recover moves retry briefly on transient OS locks
(_move_with_retry). Accept also auto-falls-back to Recover-to-Staging for
genuinely thin/orphaned sidecars.
Tests: stream-info resolution (sidecar + filename-fallback + missing), and
_move_with_retry success/give-up.
|
3 weeks ago |
|
|
ec8c8d939c |
Quarantine: propagate quarantine_entry_id through the verification wrapper
post_process_matched_download_with_verification pops task_id/batch_id out of the context before running the inner pipeline (so the inner doesn't fire its own task notifications). But _mark_task_quarantined runs inside that inner call and reads context['task_id'] — which is now None — so it silently no-op'd. Result: every download through this wrapper (album-bundle / staging path) quarantined WITHOUT recording quarantine_entry_id on the task, so the UI had no handle to manage the file (the status click just fell back to the search modal). _mark_task_quarantined now also stashes the entry id on the context (survives the pop), and the wrapper applies it to the real task in both quarantine branches (integrity + AcoustID). Direct (non-wrapper) callers are unchanged. Tests: unit coverage for the stash-with/without-task_id behavior, plus a wrapper-level test proving the entry id reaches the task on integrity quarantine. |
3 weeks ago |
|
|
d6f37f9667 |
Integrity check: don't quarantine valid streamed FLAC as 'zero-length' (#756)
HiFi assembles FLAC from HLS segments and demuxes with `ffmpeg -c copy`, which preserves total_samples=0 in the STREAMINFO of Tidal's fragmented/ streamed FLAC. Every audio frame is present and the file plays fine, but mutagen computes length = total_samples / sample_rate = 0, so the integrity check rejected it as 'zero-length audio' and quarantined nearly every HiFi download. Users confirmed the quarantined files play normally once restored. Length 0 is not proof of corruption at that point: the file already passed the size gate, was identified as a real audio format, and has a valid info block — a genuinely empty/truncated/stub file fails one of those earlier checks instead. Treat length 0 as 'length unknown': accept the file and skip the duration cross-check we can't perform without a length. mutagen never decoded/validated frame data anyway, so this doesn't weaken real corruption detection — size, parse, format, info-block, and duration-drift guards all remain. Tests: a large valid-parse length-0 file (streamed-FLAC signature) is now accepted; a tiny length-0 stub still fails (size gate fires first). |
3 weeks ago |
|
|
2824c25ec6 |
Album bundle: let Soulseek staging-misses fall through to per-track/cross-source fallback (#743)
A Soulseek album bundle stages whichever single folder scored best. If that
folder doesn't contain every track the album needs, the missing tracks were
marked not_found with no fallback — even in hybrid mode where later sources
(Deezer, YouTube, etc.) could fill them. The staging-miss short-circuit fired
for Soulseek because 'soulseek' was lumped into the torrent/usenet source set
when album bundles were added, and album_bundle_partial only reflects whether
the files found IN the folder downloaded, not whether the folder had every
needed track.
Drop 'soulseek' from the short-circuit (keep torrent/usenet). A track not
claimed from the staged Soulseek folder now falls through to the normal
per-track Soulseek search and, in hybrid mode, onward down the configured
chain. Unlike torrent/usenet — where per-track search re-adds the same
release — Soulseek per-track search is a genuine per-file network search, so
this is correct and cheap. Realizes the original author's stated intent
('keep partial bundles from blocking per-track fallback') robustly, since the
partial flag couldn't detect a folder that was simply missing tracks.
Only affects tracks NOT claimed from staging — fully-staged albums claim every
track via try_staging_match and never reach this gate, so working albums are
unchanged. Likely also mitigates #755 (all-album-import failures now fall
through to per-track instead of dying).
Tests: rewrote the two Soulseek staged-miss tests to assert fall-through
(single + hybrid-first); kept the torrent guard; added a usenet guard test.
|
3 weeks ago |
|
|
163de6c146 |
MusicBrainz manual search: field-scope the artist in non-strict mode (#754)
The user-facing Search-for-Match / Fix popup runs non-strict MB searches. That path built a bare "track artist" query with no field scoping, so the artist was just a free fuzzy term — covers and karaoke whose TITLES contained the artist name outranked the canonical recording. Reproduced live: searching "Say You Will" / Foreigner returned cover artists with Foreigner absent, and "Sweet Child O Mine" / Guns N Roses returned only covers (Presnyakov, PMJ…), never the Guns N' Roses original. Keep the track/album side loose (no phrase quotes → diacritic + bracketed- suffix recall, the reason non-strict exists) but field-scope the artist as artist:(...) so it constrains. The artist value is Lucene-escaped via _escape_lucene() — without it, names like "Sunn O)))" or "Anthony Green (Saosin)" would close the artist:( group early (returning unrelated artists) or break the query (zero results). Same fix applied to search_release. Verified against the live MB API: both reporter queries now return the real artist top-to-bottom; diacritic recall is preserved (artist:(Bjork) folds to Björk); and paren/?/!-laden artist names produce valid, balanced queries. Tests pin the constructed query string (no network): non-strict scopes and escapes the artist while keeping the track loose/unquoted; strict path unchanged; plus _escape_lucene unit coverage. |
3 weeks ago |
|
|
ce9ec3f6f4 |
Manual library match: accept non-numeric library track ids (#754)
The save endpoint coerced library_track_id with int(), which rejected every non-numeric id with "Invalid library track id". Library ids are str(ratingKey) — numeric for Plex but GUIDs/hashes for Navidrome, Jellyfin, and other Subsonic servers — and are stored in the TEXT tracks.id column, so the coercion broke manual matching on every non-Plex server. Replace the int() coercion with a normalize_library_track_id() helper that trims and rejects only empty input, passing the opaque string id straight through. Plex numeric ids are unaffected (SQLite INTEGER affinity still stores a clean numeric string as an int, so existing matches are byte-identical) and no schema migration is needed (the INTEGER column already stores non-numeric ids as text). Tests: pure-helper cases (numeric/GUID/whitespace/empty) plus a real-DB round-trip proving a GUID id saves, reads back unchanged, and enriches. |
3 weeks ago |
|
|
3b5a5518a6 |
Cache cap test: exercise the REAL _run_maintenance_write, not a stub
Self-review caught a test-fidelity hole: the temp cache overrode _run_maintenance_write with a simplified version, so evict_over_capacity was tested against the stub's plumbing, not production's (retry + connection handling). Removed the override — _get_db is now the only injected seam, so the test runs the genuine code path. Differential-verified the LRU assertions are real: flipping ORDER BY ASC->DESC makes them fail. 8/8 pass; ruff clean. |
3 weeks ago |
|
|
bb2241498f |
Metadata cache: hard LRU row cap to stop unbounded growth (7.6GB incident)
Investigation (not assumption): the cache's TTL eviction + junk cleanup ARE correct and DO run automatically every 6h (CacheEvictorJob, auto_fix=True). The real gap is there's NO SIZE CEILING — TTL-only eviction means 'how big can it get' = 'however much you fetch within the 30-day window', so heavy discovery/enrichment legitimately grew metadata_cache_entities to ~1.8M rows / 7.6 GB, bloating the main DB (a factor in the corruption incident). Fix — add a bounded LRU cap: - entities_to_evict_for_capacity(total, max_rows): pure decision fn (cap<=0 disables), unit-testable like core.db_integrity.prune_backups. - MetadataCache.evict_over_capacity(): deletes the least-recently-ACCESSED rows (uses the already-stored last_accessed_at; NULL = never-touched = evicted first) down to the ceiling. Default 250k rows, tunable. - Wired as Phase 5 of CacheEvictorJob — runs LAST, after TTL/junk/orphan/null cleanup, so it only trims a still-oversized HEALTHY cache. Verified safe to bound/wipe: audited every cache reader (get_entity/ get_entities_batch/get_search_results/get_entity_detail/browse) — all degrade to None/[]/empty on miss, treated as 'go fetch'. Nothing depends on a row existing, so eviction can't break callers. Tests: tests/metadata/test_cache_capacity_eviction.py (8) — pure-fn coverage + real temp-DB proof that it drops the LRU rows specifically (not arbitrary) and NULL-access rows go first. 18 adjacent cache tests still green; ruff clean. Follow-ups (separate phases, scoped): (2) move the cache to its own bounded metadata_cache.sqlite3 (no JOINs to library tables — confirmed clean to split; invalidate-and-rebuild rather than migrate the 7.6GB), (3) kill the raw_json + 22-extracted-column double storage. |
3 weeks ago |
|
|
9231cbd506 |
Merge branch 'main' into dev
|
3 weeks ago |
|
|
ca2f4da9f4 |
DB backups: verify integrity + never evict the last good backup
Post-incident hardening. A WAL-mode DB corrupted (most likely an interrupted write during a hard restart), and the backup routine made it unrecoverable: it (a) never checked integrity, so src.backup() faithfully copied the corrupt pages into every rolling backup, and (b) pruned oldest-by-mtime, so each new corrupt backup evicted the last good one. Result: all snapshots poisoned. New core/db_integrity.py (pure, unit-tested): - quick_check()/is_healthy(): fast read-only PRAGMA quick_check probe. - safe_backup(): verifies the SOURCE is healthy BEFORE the Online-Backup copy and the RESULT after; refuses + discards rather than save a corrupt copy. - prune_backups(): rotation that NEVER deletes the most-recent verified-healthy backup, even to honor max_keep — so a run of bad backups can't drop your last good snapshot. Wired into BOTH backup paths (the /api/database/backup endpoint and the auto_backup_database automation handler) — they now refuse on integrity failure (409 / error status, existing backups untouched) and prune safely. Tests: tests/test_db_integrity.py (8) using REAL temp DBs incl. a physically corrupted one — proves refuse-corrupt-source, discard-corrupt-result, and the exact incident scenario (newest backups corrupt -> the older healthy one is protected from pruning). Existing maintenance-handler backup test still green (29 passed). compile + ruff clean. NOTE: this prevents silent backup poisoning; it does NOT stop the underlying corruption. Follow-ups still worth doing: WAL-checkpoint on clean shutdown + a periodic live-DB integrity alert (so corruption is caught on day 1). |
3 weeks ago |
|
|
cc433fad37 |
Album picker #730: add word-boundary full-phrase bonus (from PR #731 review)
Compared my #730 fix against contributor PR #731 (same independent design). Grafted their good idea — a confidence bonus when the album's full core phrase appears intact in the release title (rescues long multi-word names whose token coverage gets diluted) — and kept my accent-folding, which #731 lacks (their normalize drops accented chars: Bjork -> 'bj rk'). IMPORTANT: implemented the phrase bonus WORD-BOUNDARY anchored, not as a raw substring. My first cut used 'phrase in norm_title' (matching #731) and it immediately reintroduced the substring bug #730 exists to fix — 'heroes' matched 'superheroes' and the wrong album scored 0.9/passed. PR #731 has this latent flaw. The regex anchors the phrase to word boundaries so the bonus fires for real matches only. Verified: substring trap (Superheroes/Scary Monsters) rejected; edition suffixes + intact-phrase albums kept. +1 phrase-bonus test (incl. the word-boundary guard). 126 plugin tests pass; ruff clean. Co-authored-by: Tyler Richardson-LaPlume <170156756+IamGroot60@users.noreply.github.com> |
3 weeks ago |