mirror of https://github.com/Nezreka/SoulSync.git
Discogs: strip artist disambiguation suffixes at every name surface (#634)
Discogs uses two disambiguation conventions for duplicate artist names: - legacy `(N)` numeric suffix: "Bullet (2)", "Madonna (3)" - newer `*` asterisk suffix: "John Smith*", "Foo*" Both were leaking through to the UI on artist search and album search, and worse — through the import path into folder names on disk (reported: importing yielded folders literally named `Foo*`). The pre-existing cleanup only handled `(N)` and only at ONE site — `get_user_collection` (line 469) and one path inside `extract_track_from_release` (line 448 — `re.sub(r'\s*\(\d+\)$', '', artist_name)`). Every other surface (artist search, album search, album-track lookups, get_artist_albums feature matching) returned the raw Discogs string. Centralized into `_clean_discogs_artist_name(name)` at module top, with regex covering both suffixes including repeated forms (`Baz**`, `Foo (3)*`). Applied at six sites: - `Artist.from_discogs_artist` (artist search) - `Album.from_discogs_release` (album search — three fallbacks: array, string, title-split) - `Track.from_discogs_track` (track lookup — track-level + release-level fallback) - `extract_track_from_release` (replaces the inline `(N)`-only re.sub) - `get_user_collection` (existing site, now also strips `*`) - `get_artist_albums` (artist_name used for primary-vs-feature matching; cleaning prevents `Beyoncé*` from failing equality vs `Beyoncé`) - `get_album` (artists_list + per-track artists in the tracklist projection) Tests: - New `test_clean_discogs_artist_name` parametrized over 14 cases covering `(N)`, `*`, repeated `**`, combined `(N) *`, whitespace handling, empty/None defensive returns. - New `test_get_user_collection_strips_discogs_asterisk_disambiguation` pinning the asterisk path end-to-end through the collection import flow (sibling to the existing `(N)` test). - Existing 37 discogs tests still pass. Out of scope (separate issue): the same #634 report flagged track-count and year fields rendering as 0 / empty in Discogs album search. Both are inherent to Discogs `/database/search` response shape — search results don't carry `tracklist` (only release detail does) and `year` is often `0` in search payloads. Fixing requires lazy-fetching release detail per row, which hits the 25 req/min unauth limit hard. Not bundled here.fix/usenet-album-poll-sab-handoff
parent
36614e1a4d
commit
1d6ced286b
Loading…
Reference in new issue