Reorganize: move orphan-format siblings alongside the canonical

Discord report (Foxxify): users with the lossy-copy feature enabled
have `track.flac` AND `track.opus` side-by-side in their library.
Reorganize is DB-driven and only knows about ONE file per track
(the lossy copy). The other format used to get left behind in the
old location while the canonical moved to its new destination.
Empty-folder cleanup never fired because the source dir still had
audio.

# What was happening

1. User downloads album → SoulSync transcodes `.flac` → `.opus`,
   embeds `.lrc` lyrics
2. DB row points at `.opus` (the lossy library copy)
3. User runs Library Reorganize
4. Reorganize moves `.opus` to new template path → `Artist/Album/01 Track.opus`
5. `.flac` orphan stays at old location, `.lrc` follows `.opus`
6. Source dir still has the `.flac` → cleanup skips → empty folders pile up

# Fix

`_finalize_track` now finds sibling-stem audio files at the source
BEFORE removing the canonical and moves them to the same destination
dir, preserving both formats with the canonical's renamed stem.

Two new helpers in `core/library_reorganize.py`:

- `_find_sibling_audio_files(audio_path) -> list[str]` — returns
  paths to other audio files at the same directory that share the
  canonical's filename stem. Excludes the canonical itself, non-
  audio extensions (sidecars handled separately by
  `_delete_track_sidecars`), and different-stem tracks (different
  songs in the same dir).
- `_move_sibling_to_destination(sibling_src, canonical_dst) -> str`
  — moves a sibling-format file to the canonical's destination dir
  with the canonical's renamed stem + the sibling's original
  extension. Defensive — OS errors logged at warning, return None,
  doesn't raise (caller treats as best-effort).

After the fix:

1. `.opus` → moved to new dir
2. `.flac` sibling detected → moved to same new dir with same stem
3. Source `.opus` removed, `.lrc` sidecar deleted from source
4. Source dir empty → cleanup proceeds normally
5. Both formats end up paired at the new location

# Tests added (11)

`tests/test_reorganize_orphan_format_handling.py`:

- Sibling detection: finds `.flac` when `.opus` is canonical (and
  symmetric direction), excludes canonical itself, excludes
  different-stem tracks, excludes non-audio (`.lrc`/`.nfo`),
  finds multiple siblings (3+ formats), returns empty when source
  dir missing
- Sibling move: renames to canonical stem + preserves sibling
  extension, creates destination dir if missing, no-op when source
  already at destination, returns None on OS failure (caller
  treats as best-effort)

# Verification

- 11/11 new tests pass
- 97/97 reorganize-related tests pass total (no regression in
  existing helpers)
- Ruff clean

# Follow-up in same PR

Next commit: cleanup repair job for legacy "Unknown Artist /
album_id" rows from the pre-#524 manual-import bug. Reorganize
correctly leaves those alone (they're DB-broken, not file-broken),
but a separate maintenance job to find + re-enrich them is needed.
pull/545/head
Broque Thomas 6 days ago
parent 4f9a9c79a2
commit d944a166f8

@ -1051,6 +1051,23 @@ def _finalize_track(ctx: _RunContext, track_id, resolved_src, new_path) -> bool:
with ctx.state_lock:
ctx.src_dirs_touched.add(os.path.dirname(resolved_src))
ctx.dst_dirs_touched.add(os.path.dirname(new_path))
# Discord report (Foxxify): users with lossy-copy enabled have
# `track.flac` AND `track.opus` side-by-side. The DB tracks ONE
# (the lossy copy). Reorganize used to move only the canonical
# and leave the orphan behind, blocking empty-folder cleanup.
# Move sibling-format audio to the same destination dir BEFORE
# removing the canonical source, preserving both formats with
# the canonical's renamed stem.
siblings = _find_sibling_audio_files(resolved_src)
for sibling_src in siblings:
moved_to = _move_sibling_to_destination(sibling_src, new_path)
if moved_to:
logger.debug(
"[Reorganize] Moved sibling-format file alongside canonical: %s",
moved_to,
)
try:
os.remove(resolved_src)
except OSError as rm_err:
@ -1450,6 +1467,80 @@ _AUDIO_EXTS = frozenset(
)
def _find_sibling_audio_files(audio_path: str) -> list:
"""Find OTHER audio files at the same source directory that share
the canonical file's stem.
Discord report (Foxxify): users with the lossy-copy feature
enabled end up with `track.flac` AND `track.opus` side-by-side.
Reorganize is DB-driven and only knows about ONE file per track
(the lossy copy in library), so the other format gets left behind
in the old location while the canonical moves to the new
destination. Cleanup never fires because the source dir still has
audio.
This helper returns the orphan-format paths so the caller can
move them alongside the canonical to the new destination dir.
Same stem + audio extension + NOT the canonical itself.
Returns empty list when source dir doesn't exist or read fails
(defensive never raises).
"""
src_dir = os.path.dirname(audio_path)
if not os.path.isdir(src_dir):
return []
stem = os.path.splitext(os.path.basename(audio_path))[0]
canonical_basename = os.path.basename(audio_path)
siblings = []
try:
entries = os.listdir(src_dir)
except OSError:
return []
for name in entries:
if name == canonical_basename:
continue
sibling_stem, ext = os.path.splitext(name)
if sibling_stem != stem:
continue
if ext.lower() not in _AUDIO_EXTS:
continue
full = os.path.join(src_dir, name)
if os.path.isfile(full):
siblings.append(full)
return siblings
def _move_sibling_to_destination(sibling_src: str, canonical_dst: str) -> Optional[str]:
"""Move a sibling-format audio file to the same destination
directory as the canonical, preserving its extension.
Example: canonical at ``/library/Artist/Album/01 Track.opus`` +
sibling source ``/old/01 Track.flac`` destination ``/library/
Artist/Album/01 Track.flac``. The destination filename uses the
canonical's stem (post-template-rename) + the sibling's original
extension so a renamed canonical gets matching siblings.
Returns the destination path on success, None on failure (logged
at warning, doesn't raise — sibling moves are best-effort).
"""
dst_dir = os.path.dirname(canonical_dst)
canonical_stem = os.path.splitext(os.path.basename(canonical_dst))[0]
_, sibling_ext = os.path.splitext(sibling_src)
sibling_dst = os.path.join(dst_dir, canonical_stem + sibling_ext)
if os.path.normpath(sibling_src) == os.path.normpath(sibling_dst):
return sibling_dst # already at the right place
try:
os.makedirs(dst_dir, exist_ok=True)
shutil.move(sibling_src, sibling_dst)
return sibling_dst
except OSError as e:
logger.warning(
"[Reorganize] Couldn't move sibling-format file %s%s: %s",
sibling_src, sibling_dst, e,
)
return None
def _delete_track_sidecars(audio_path: str) -> None:
"""Delete per-track sidecars (.lrc / .nfo / .txt / .cue / .json) that
sit alongside `audio_path` and share its filename stem. Best-effort

@ -0,0 +1,174 @@
"""Pin orphan-format handling in library_reorganize.
Discord report (Foxxify): users with the lossy-copy feature enabled
end up with `track.flac` AND `track.opus` side-by-side. Reorganize is
DB-driven and only knows about ONE file per track (the lossy copy in
the library), so the other format used to get left behind in the old
location while the canonical moved to the new destination. Cleanup
never fired because the source dir still had audio.
Post-fix the reorganize finalisation step finds sibling-stem audio
files at the source and moves them to the same destination dir as
the canonical, preserving both formats.
"""
from __future__ import annotations
import os
from core.library_reorganize import (
_find_sibling_audio_files,
_move_sibling_to_destination,
)
# ---------------------------------------------------------------------------
# Sibling detection
# ---------------------------------------------------------------------------
class TestFindSiblingAudioFiles:
def test_finds_flac_when_opus_is_canonical(self, tmp_path):
"""Reporter's exact case: lossy-copy `.opus` is the
canonical (DB-tracked); `.flac` is the orphan to move."""
opus = tmp_path / "01 Track.opus"
flac = tmp_path / "01 Track.flac"
opus.write_bytes(b"opus-data")
flac.write_bytes(b"flac-data")
siblings = _find_sibling_audio_files(str(opus))
assert siblings == [str(flac)]
def test_finds_opus_when_flac_is_canonical(self):
"""Symmetric direction — works either way."""
# tmp_path fixture handled by next test inline
import tempfile, pathlib
tmp = pathlib.Path(tempfile.mkdtemp())
flac = tmp / "X.flac"
opus = tmp / "X.opus"
flac.write_bytes(b"a"); opus.write_bytes(b"b")
siblings = _find_sibling_audio_files(str(flac))
assert siblings == [str(opus)]
def test_excludes_canonical_itself(self, tmp_path):
"""Canonical must NOT appear in its own sibling list."""
canonical = tmp_path / "X.opus"
canonical.write_bytes(b"data")
siblings = _find_sibling_audio_files(str(canonical))
assert siblings == []
def test_excludes_different_stem(self, tmp_path):
"""Different track in same dir shouldn't be flagged as
sibling only same-stem files."""
canonical = tmp_path / "01 Track One.opus"
other_track = tmp_path / "02 Track Two.flac"
canonical.write_bytes(b"a"); other_track.write_bytes(b"b")
siblings = _find_sibling_audio_files(str(canonical))
assert siblings == []
def test_excludes_non_audio_extensions(self, tmp_path):
"""Sidecars (.lrc, .nfo, .txt) handled by separate sidecar
helper must not appear in audio-sibling list."""
canonical = tmp_path / "X.opus"
sidecar = tmp_path / "X.lrc"
nfo = tmp_path / "X.nfo"
canonical.write_bytes(b"a")
sidecar.write_bytes(b"lyrics")
nfo.write_bytes(b"info")
siblings = _find_sibling_audio_files(str(canonical))
assert siblings == []
def test_finds_multiple_siblings(self, tmp_path):
"""User could have 3+ formats: .flac + .opus + .mp3."""
opus = tmp_path / "X.opus"
flac = tmp_path / "X.flac"
mp3 = tmp_path / "X.mp3"
opus.write_bytes(b"a"); flac.write_bytes(b"b"); mp3.write_bytes(b"c")
siblings = _find_sibling_audio_files(str(opus))
# All formats other than canonical
assert sorted(siblings) == sorted([str(flac), str(mp3)])
def test_missing_source_dir_returns_empty(self, tmp_path):
"""Defensive: source dir vanished mid-reorganize. Return
empty, don't raise."""
siblings = _find_sibling_audio_files(str(tmp_path / "nonexistent" / "X.opus"))
assert siblings == []
# ---------------------------------------------------------------------------
# Sibling move
# ---------------------------------------------------------------------------
class TestMoveSiblingToDestination:
def test_moves_to_same_dir_as_canonical_with_renamed_stem(self, tmp_path):
"""Canonical's renamed stem propagates to siblings — so a
renamed `.opus` (`01 Track.opus`) gets a matching `.flac`
(`01 Track.flac`) at the new location, even if source was
`track-original-name.flac`."""
src_dir = tmp_path / "old"
dst_dir = tmp_path / "Artist" / "Album"
src_dir.mkdir()
sibling_src = src_dir / "track-original-name.flac"
sibling_src.write_bytes(b"flac-data")
canonical_dst = dst_dir / "01 Track.opus"
result = _move_sibling_to_destination(str(sibling_src), str(canonical_dst))
# Sibling at new location with canonical's renamed stem +
# sibling's original extension
expected = dst_dir / "01 Track.flac"
assert result == str(expected)
assert expected.exists()
assert expected.read_bytes() == b"flac-data"
# Source removed
assert not sibling_src.exists()
def test_creates_destination_dir_if_missing(self, tmp_path):
src = tmp_path / "old" / "X.flac"
src.parent.mkdir()
src.write_bytes(b"data")
canonical_dst = tmp_path / "new" / "X.opus"
result = _move_sibling_to_destination(str(src), str(canonical_dst))
assert result is not None
assert (tmp_path / "new" / "X.flac").exists()
def test_no_op_when_source_equals_destination(self, tmp_path):
"""Defensive: if sibling is already at the destination (e.g.
idempotent re-run), return path without raising."""
f = tmp_path / "X.flac"
f.write_bytes(b"data")
canonical_dst = tmp_path / "X.opus"
result = _move_sibling_to_destination(str(f), str(canonical_dst))
# Sibling stays put (same dir as canonical destination)
assert f.exists()
assert result == str(f)
def test_returns_none_on_failure(self, tmp_path, monkeypatch):
"""OS error on move → returns None, doesn't raise. Caller
treats as best-effort (sibling stays at old location, user
sees it next reorganize run)."""
src = tmp_path / "old" / "X.flac"
src.parent.mkdir()
src.write_bytes(b"data")
def fake_move(s, d):
raise OSError("disk full")
monkeypatch.setattr('core.library_reorganize.shutil.move', fake_move)
result = _move_sibling_to_destination(
str(src), str(tmp_path / "new" / "X.opus"),
)
assert result is None
Loading…
Cancel
Save