working

10 months ago · df34ff45f9
parent 2431eba11b
commit df34ff45f9
2 changed files with 498 additions and 55 deletions
--- a/SPOTIFY_MATCHING_SPEC.md
+++ b/SPOTIFY_MATCHING_SPEC.md
@ -216,29 +216,29 @@ Playlist Track → Plex Check → (Missing) → Soulseek Search → Quality Filt
   - ✅ Real-time status updates on playlist buttons (🔍 Analyzing, ⏬ Downloading)
   - ✅ Maintain operation state across modal open/close cycles

-4. **⚠️ NEEDS FIXING - Soulseek Search Integration**
-   - ⚠️ **CRITICAL**: Must use existing downloads.py infrastructure for search/download
-   - ⚠️ **CRITICAL**: Implement smart search strategy for artist name issues
-   - ⚠️ **CRITICAL**: Use existing quality filtering and result matching logic
-   - ⚠️ **CRITICAL**: Integrate with existing download queue system
-
-5. **🔄 IN PROGRESS - Smart Search Strategy**
-   - **Primary Search**: Track name only (e.g., "humble" not "kendrick lamar humble")
-   - **Secondary Search**: Shortened artist + track (e.g., "kendrick humble" not "kendrick lamar humble")  
-   - **Matching Logic**: Use duration, artist name from slskd results for verification
-   - **Quality Selection**: Leverage existing downloads.py filtering and sorting
-
-6. **🔄 IN PROGRESS - Downloads.py Integration**
-   - Use existing `SoulseekClient.search()` and filtering infrastructure
-   - Integrate with existing download queue management
-   - Apply matched download folder structure automatically
-   - Use existing file organization and metadata handling
-
-7. **🔄 IN PROGRESS - Folder Organization & Matching**
-   - **Structure**: `ArtistName/ArtistName - AlbumName/Track.ext` (existing matched download logic)
-   - **Album Detection**: Use Spotify metadata to determine album vs single
-   - **Automatic Matching**: Treat as "matched downloads" with Spotify metadata
-   - **Quality Filtering**: Use existing downloads.py quality/format preferences
+4. **✅ COMPLETED - Soulseek Search Integration**
+   - ✅ **CRITICAL**: Using existing downloads.py infrastructure for search/download
+   - ✅ **CRITICAL**: Implemented smart search strategy for artist name issues
+   - ✅ **CRITICAL**: Using existing quality filtering and result matching logic
+   - ✅ **CRITICAL**: Integrated with existing download queue system
+
+5. **✅ COMPLETED - Smart Search Strategy**
+   - ✅ **Single-word tracks**: Track + full artist first (e.g., "Aether Virtual Mage")
+   - ✅ **Multi-word tracks**: Track name first (e.g., "Astral Chill")
+   - ✅ **Fallback strategies**: Shortened artist, first word, full artist combinations
+   - ✅ **Strict matching**: Exact track name containment required in results
+
+6. **✅ COMPLETED - Downloads.py Integration**
+   - ✅ Using existing `SoulseekClient.search()` and filtering infrastructure
+   - ✅ Integrated with existing download queue management
+   - ✅ Applied matched download folder structure automatically
+   - ✅ Using existing file organization and metadata handling
+
+7. **⚠️ NEEDS IMPROVEMENT - Advanced Matching & Quality Selection**
+   - ⚠️ **HIGH PRIORITY**: FLAC preference when multiple valid matches exist
+   - ⚠️ **HIGH PRIORITY**: More intelligent track title parsing (handle '-', '_', bitrate, etc.)
+   - ⚠️ **HIGH PRIORITY**: Spotify matching for proper folder naming structure
+   - ⚠️ **HIGH PRIORITY**: Confidence-based auto-matching with failed matches tracking

 ### ✅ COMPLETE WORKFLOW IMPLEMENTED:

@ -282,4 +282,114 @@ Playlist → Spotify Tracks → Plex Analysis → Track Table Updates → Missin
 - Seamless integration with current UI and workflow
 - Intelligent Plex deduplication preventing unnecessary downloads
 - Proper folder organization matching app standards
- Robust error handling with graceful degradation to download all tracks
+- Robust error handling with graceful degradation to download all tracks
+
+---
+
+## 🚀 CURRENT STATE & NEXT PHASE IMPROVEMENTS
+
+### ✅ CURRENT WORKING STATE (What's Working Now):
+
+#### **Core Functionality Complete:**
+1. **Modal System**: Sophisticated UI with live counters, dual progress bars, track table
+2. **Plex Analysis**: Background thread analyzes tracks against Plex library
+3. **Smart Search**: Single-word tracks prioritize artist inclusion, multi-word tracks work well
+4. **Download Integration**: Uses existing downloads.py infrastructure properly
+5. **Progress Tracking**: Real-time updates, modal can be closed/reopened
+6. **Folder Structure**: Basic folder creation for downloaded tracks
+
+#### **Search Strategy Working:**
+- ✅ "Aether Virtual Mage" → finds correct Virtual Mage track
+- ✅ "Astral Chill" → finds correct track 
+- ✅ "Orbit Love" → finds correct track
+- ✅ Downloads integrate with existing queue system
+- ✅ Sequential searching prevents overwhelming slskd
+
+### ⚠️ CRITICAL IMPROVEMENTS NEEDED (Next Phase):
+
+#### **1. INTELLIGENT MATCHING SYSTEM**
+**Current Issue**: System is finding tracks but not always selecting the best quality/match
+**Requirements**:
+- **FLAC Priority**: When multiple valid matches exist, always choose FLAC over MP3/other formats
+- **Advanced Title Parsing**: Handle track names with extra characters like:
+  - `Artist - Track Name [320kbps]`
+  - `01. Track_Name - Artist_Name.flac`
+  - `Track Name (feat. Other Artist) - 2023 Remaster`
+- **Bitrate Recognition**: Parse and prefer higher quality files
+- **Version Filtering**: Avoid unwanted remixes, live versions, instrumentals unless specified
+
+#### **2. SPOTIFY INTEGRATION FOR FOLDER STRUCTURE**
+**Current Issue**: Downloads go to basic folders without proper Spotify metadata integration
+**Requirements**:
+- **Must work exactly like "matched downloads"** from the main downloads.py functionality
+- **Spotify API Lookup**: For each track, find exact Spotify match for metadata
+- **Album Detection**: Determine if track is part of album or is a single
+- **Proper Folder Structure**:
+  - **Singles**: `Transfer/ARTIST_NAME/ARTIST_NAME - SINGLE_NAME/SINGLE_NAME.flac`
+  - **Albums**: `Transfer/ARTIST_NAME/ARTIST_NAME - ALBUM_NAME/01 TRACK_NAME.flac`
+- **Cover Art**: Download album/artist artwork automatically
+- **Metadata Enhancement**: Update file tags with Spotify metadata
+
+#### **3. CONFIDENCE-BASED AUTO-MATCHING**
+**Current Issue**: No systematic tracking of failed matches or confidence thresholds
+**Requirements**:
+- **High Confidence Auto-Download**: Tracks with >80% confidence match automatically
+- **Medium Confidence Review**: 60-80% confidence tracks flagged for manual review
+- **Failed Matches List**: Maintain list of tracks that couldn't be matched reliably
+- **Manual Search Integration**: Allow manual search for failed tracks
+- **Success Rate Tracking**: Show user statistics on match success rates
+
+#### **4. ENHANCED QUALITY SELECTION ALGORITHM**
+**Current Scoring System Improvements Needed**:
+```python
+# Current basic scoring needs enhancement:
+# - Track name containment: 120-150 points
+# - Artist containment: 40-80 points  
+# - Duration matching: Up to 100 points
+
+# NEEDED: Advanced quality scoring:
+# - FLAC/Lossless: +50 points (higher than current +15)
+# - High bitrate: +30 points (320kbps vs 128kbps)
+# - Clean filename: +20 points (avoid [tags], underscores)
+# - Proper metadata: +15 points (correct artist/title fields)
+# - Album context: +10 points (part of complete album)
+```
+
+### 🔧 TECHNICAL IMPLEMENTATION ROADMAP:
+
+#### **Phase 1: FLAC Priority & Quality Enhancement** (Immediate)
+1. Update `select_best_match()` scoring in `sync.py:2955`
+2. Add FLAC detection and boost scoring significantly
+3. Implement bitrate parsing and quality preference
+4. Add file format detection improvements
+
+#### **Phase 2: Spotify Matching Integration** (High Priority)
+1. Add Spotify API lookup for each downloaded track
+2. Implement album vs single detection using existing matched download logic
+3. Create proper Transfer folder structure with Spotify metadata
+4. Integration with existing downloads.py matched download functions
+
+#### **Phase 3: Advanced Matching Intelligence** (Critical)
+1. Enhanced track title parsing with regex patterns
+2. Improved artist name normalization and matching
+3. Context-aware matching (album context, release year, etc.)
+4. Machine learning-style confidence scoring improvements
+
+#### **Phase 4: Failed Matches & Manual Review** (Important)
+1. Failed matches tracking and storage
+2. Manual search interface for problem tracks
+3. Success rate analytics and reporting
+4. User feedback integration for match quality
+
+### 📊 EXPECTED OUTCOMES:
+- **90%+ automatic match rate** for popular tracks
+- **FLAC preference** ensuring highest quality downloads
+- **Perfect folder organization** matching existing matched download standards
+- **Zero manual intervention** for high-confidence matches
+- **Clear manual review workflow** for edge cases
+
+### 🎯 CURRENT NEXT STEPS:
+1. **Update FLAC priority** in matching algorithm
+2. **Add Spotify metadata lookup** for proper folder structure
+3. **Enhance track title parsing** for better matching accuracy
+4. **Implement confidence thresholds** for auto vs manual matching
--- a/ui/pages/sync.py
+++ b/ui/pages/sync.py
@ -3016,26 +3016,42 @@ class DownloadMissingTracksModal(QDialog):
            if hasattr(result, 'duration') and result.duration:
                result_duration = result.duration
            
-            # STRICT REQUIREMENT: Track title must be contained exactly in filename or title
-            # This is now a mandatory requirement - no match without this
-            track_contained = False
-            
-            # Check if full track name is contained in the result title
-            if track_name in result_title:
-                score += 150  # High score for exact containment in title
-                reasons.append("track_exact_in_title")
-                track_contained = True
-            
-            # Check if full track name is contained in the filename
-            elif track_name in result_filename:
-                score += 120  # High score for exact containment in filename
-                reasons.append("track_exact_in_filename")
-                track_contained = True
-            
-            # If track name is not contained exactly, reject this result immediately
-            if not track_contained:
+            # INTELLIGENT TRACK MATCHING: Use advanced matching with confidence scoring
+            match_result = self.intelligent_track_match(track_name, result_title, result_filename)
+            
+            # Only proceed if we have a reasonable match
+            if not match_result['matched'] or match_result['confidence'] < 60:
                continue  # Skip this result entirely
            
+            # Score based on match confidence and type
+            base_score = match_result['confidence']
+            match_type = match_result['type']
+            
+            if match_type == 'exact_title':
+                score += 150  # Highest priority for exact title match
+                reasons.append(f"track_exact_title({match_result['confidence']}%)")
+            elif match_type == 'exact_filename':
+                score += 140  # Very high for exact filename match
+                reasons.append(f"track_exact_filename({match_result['confidence']}%)")
+            elif match_type == 'substring_title':
+                score += 130  # High for substring in title
+                reasons.append(f"track_substring_title({match_result['confidence']}%)")
+            elif match_type == 'substring_filename':
+                score += 120  # High for substring in filename
+                reasons.append(f"track_substring_filename({match_result['confidence']}%)")
+            elif match_type == 'word_match_high':
+                score += 110  # Good for high word match
+                reasons.append(f"track_word_match_high({match_result['confidence']}%)")
+            elif match_type == 'word_match_medium':
+                score += 100  # Medium for medium word match
+                reasons.append(f"track_word_match_medium({match_result['confidence']}%)")
+            elif match_type == 'fuzzy_match':
+                score += 90   # Lower for fuzzy match
+                reasons.append(f"track_fuzzy_match({match_result['confidence']}%)")
+            else:
+                score += base_score  # Use confidence as base score
+                reasons.append(f"track_match_{match_type}({match_result['confidence']}%)")
+            
            # BONUS: Artist name contained (extra points)
            artist_contained = False
            
@ -3077,15 +3093,11 @@ class DownloadMissingTracksModal(QDialog):
                    score -= 20
                    reasons.append(f"duration_mismatch({duration_diff:.1f}s)")
            
-            # Quality preference: FLAC > other lossless > high bitrate > low bitrate
-            if hasattr(result, 'quality') and result.quality:
-                quality_lower = result.quality.lower()
-                if quality_lower in ['flac', 'alac', 'ape']:
-                    score += 15  # Lossless bonus (lower priority than matching)
-                    reasons.append(f"quality_lossless({quality_lower})")
-                elif 'mp3' in quality_lower or 'aac' in quality_lower:
-                    score += 5  # Standard formats
-                    reasons.append(f"quality_standard({quality_lower})")
+            # ENHANCED QUALITY PREFERENCE: Heavily prioritize FLAC and high quality
+            quality_score = self.calculate_quality_score(result, result_filename)
+            score += quality_score['score']
+            if quality_score['reason']:
+                reasons.append(quality_score['reason'])
            
            # File size reasonableness (avoid tiny or corrupted files)
            if hasattr(result, 'size') and result.size:
@ -3156,6 +3168,243 @@ class DownloadMissingTracksModal(QDialog):
        
        return len(intersection) / len(union) if union else 0.0
    
+    def calculate_quality_score(self, result, filename):
+        """Calculate enhanced quality score prioritizing FLAC and high bitrates"""
+        score = 0
+        reason = ""
+        
+        # Get file format from multiple sources
+        file_format = ""
+        bitrate = 0
+        
+        # Check quality field first
+        if hasattr(result, 'quality') and result.quality:
+            quality_lower = result.quality.lower()
+            file_format = quality_lower
+        
+        # Also check filename for format clues
+        filename_lower = filename.lower() if filename else ""
+        
+        # Extract format from filename if not found in quality field
+        if not file_format:
+            if '.flac' in filename_lower:
+                file_format = 'flac'
+            elif '.alac' in filename_lower or '.m4a' in filename_lower:
+                file_format = 'alac'
+            elif '.ape' in filename_lower:
+                file_format = 'ape'
+            elif '.mp3' in filename_lower:
+                file_format = 'mp3'
+            elif '.aac' in filename_lower:
+                file_format = 'aac'
+            elif '.ogg' in filename_lower or '.oga' in filename_lower:
+                file_format = 'ogg'
+        
+        # Extract bitrate from filename (common patterns)
+        import re
+        bitrate_match = re.search(r'(\d{2,4})\s*k?bps?', filename_lower)
+        if not bitrate_match:
+            bitrate_match = re.search(r'\[(\d{2,4})k?\]', filename_lower)
+        if not bitrate_match:
+            bitrate_match = re.search(r'(\d{2,4})k(?![a-z])', filename_lower)  # 320k but not 320kb
+        
+        if bitrate_match:
+            try:
+                bitrate = int(bitrate_match.group(1))
+            except:
+                bitrate = 0
+        
+        # PRIORITY 1: FLAC gets highest bonus (user requirement)
+        if file_format == 'flac' or 'flac' in filename_lower:
+            score += 50  # Significantly higher than old +15
+            reason = f"format_flac_priority"
+            
+            # Extra bonus for high quality FLAC indicators
+            if any(indicator in filename_lower for indicator in ['24bit', '24-bit', '96khz', '192khz', 'hi-res']):
+                score += 20
+                reason += "_hires"
+                
+        # PRIORITY 2: Other lossless formats
+        elif file_format in ['alac', 'ape']:
+            score += 35
+            reason = f"format_lossless_{file_format}"
+            
+        # PRIORITY 3: High bitrate MP3/AAC (320kbps)
+        elif file_format in ['mp3', 'aac']:
+            if bitrate >= 320:
+                score += 25
+                reason = f"format_mp3_320kbps"
+            elif bitrate >= 256:
+                score += 15
+                reason = f"format_mp3_256kbps"
+            elif bitrate >= 192:
+                score += 10
+                reason = f"format_mp3_192kbps"
+            elif bitrate >= 128:
+                score += 5
+                reason = f"format_mp3_128kbps"
+            else:
+                score += 5  # Unknown bitrate MP3
+                reason = f"format_mp3_unknown"
+                
+        # PRIORITY 4: Other formats
+        elif file_format == 'ogg':
+            score += 8
+            reason = "format_ogg"
+        else:
+            # Unknown format - give minimal points
+            score += 2
+            reason = "format_unknown"
+        
+        # BONUS: Clean filename (no brackets, underscores, or messy formatting)
+        clean_filename_score = 0
+        if filename_lower:
+            # Penalty for messy filenames
+            underscore_count = filename_lower.count('_')
+            if underscore_count > 3:  # Too many underscores
+                clean_filename_score -= 5
+            elif '[' in filename_lower and ']' in filename_lower:
+                # Some brackets are OK (like [FLAC]) but too many is messy
+                bracket_count = filename_lower.count('[') + filename_lower.count(']')
+                if bracket_count > 4:
+                    clean_filename_score -= 3
+            
+            # Bonus for clean formatting
+            if not any(char in filename_lower for char in ['_', '@', '#', '$', '%']):
+                clean_filename_score += 10
+                if reason:
+                    reason += "_clean"
+        
+        score += clean_filename_score
+        
+        # BONUS: Album context detection
+        if any(indicator in filename_lower for indicator in ['album', 'discography', 'collection']):
+            score += 5
+            if reason:
+                reason += "_album_context"
+        
+        return {'score': score, 'reason': reason}
+    
+    def intelligent_track_match(self, spotify_track_name, result_title, result_filename):
+        """Intelligent track matching that handles various formatting patterns"""
+        import re
+        
+        # Normalize the Spotify track name
+        clean_spotify_name = self.normalize_track_title(spotify_track_name)
+        
+        # Create multiple versions of the result title/filename for matching
+        result_text = f"{result_title} {result_filename}".lower()
+        clean_result_title = self.normalize_track_title(result_title)
+        clean_result_filename = self.normalize_track_title(result_filename)
+        
+        # Matching strategies with different confidence levels
+        match_types = []
+        
+        # EXACT MATCH (highest confidence)
+        if clean_spotify_name == clean_result_title:
+            match_types.append(('exact_title', 100))
+        elif clean_spotify_name == clean_result_filename:
+            match_types.append(('exact_filename', 95))
+        
+        # SUBSTRING MATCH (high confidence)
+        elif clean_spotify_name in clean_result_title:
+            match_types.append(('substring_title', 90))
+        elif clean_spotify_name in clean_result_filename:
+            match_types.append(('substring_filename', 85))
+        
+        # WORD MATCH (medium confidence)
+        # Check if all important words from track name appear in result
+        spotify_words = set(clean_spotify_name.split())
+        result_words = set(clean_result_title.split()) | set(clean_result_filename.split())
+        
+        if spotify_words and len(spotify_words) > 0:
+            word_match_ratio = len(spotify_words.intersection(result_words)) / len(spotify_words)
+            if word_match_ratio >= 0.8:  # 80% of words match
+                match_types.append(('word_match_high', int(80 * word_match_ratio)))
+            elif word_match_ratio >= 0.6:  # 60% of words match
+                match_types.append(('word_match_medium', int(60 * word_match_ratio)))
+        
+        # FUZZY MATCH (lower confidence for complex cases)
+        # Handle cases like "Track Name - Artist Name [320kbps]"
+        simplified_result = self.simplify_complex_title(result_text)
+        if clean_spotify_name in simplified_result:
+            match_types.append(('fuzzy_match', 70))
+        
+        # Return the best match type found
+        if match_types:
+            best_match = max(match_types, key=lambda x: x[1])
+            return {'type': best_match[0], 'confidence': best_match[1], 'matched': True}
+        else:
+            return {'type': 'no_match', 'confidence': 0, 'matched': False}
+    
+    def normalize_track_title(self, title):
+        """Normalize track title by removing common formatting and extra content"""
+        if not title:
+            return ""
+        
+        import re
+        
+        # Convert to lowercase and strip
+        normalized = title.lower().strip()
+        
+        # Remove file extensions
+        normalized = re.sub(r'\.(flac|mp3|aac|alac|ape|ogg|m4a)$', '', normalized)
+        
+        # Remove common bracketed content (but preserve essential parts)
+        # Remove quality indicators: [320kbps], [FLAC], [24bit], etc.
+        normalized = re.sub(r'\[(320|256|192|128)k?bps?\]', '', normalized)
+        normalized = re.sub(r'\[flac\]', '', normalized)
+        normalized = re.sub(r'\[24bit\]', '', normalized)
+        normalized = re.sub(r'\[hi-?res\]', '', normalized)
+        
+        # Remove track numbers: "01. ", "1-", "01 - "
+        normalized = re.sub(r'^(\d{1,2}[-.\s]*)', '', normalized)
+        
+        # Remove common separators between track and artist when they appear together
+        # "Track Name - Artist Name" -> focus on track name part
+        if ' - ' in normalized:
+            parts = normalized.split(' - ')
+            # Usually the first part is the track name
+            if len(parts) >= 2:
+                normalized = parts[0].strip()
+        
+        # Remove featuring info: "(feat. Artist)", "ft. Artist", etc.
+        normalized = re.sub(r'\(feat\.?[^)]*\)', '', normalized)
+        normalized = re.sub(r'\bft\.?\s+[^,\s]+', '', normalized)
+        normalized = re.sub(r'\bfeat\.?\s+[^,\s]+', '', normalized)
+        
+        # Remove common extra content
+        normalized = re.sub(r'\(remix\)', '', normalized)
+        normalized = re.sub(r'\(remaster\)', '', normalized)
+        normalized = re.sub(r'\(official[^)]*\)', '', normalized)
+        
+        # Replace multiple separators with spaces
+        normalized = re.sub(r'[_\-\.\s]+', ' ', normalized)
+        
+        # Remove extra whitespace
+        normalized = re.sub(r'\s+', ' ', normalized).strip()
+        
+        return normalized
+    
+    def simplify_complex_title(self, text):
+        """Simplify complex titles that may have artist names, quality info, etc."""
+        import re
+        
+        # Remove everything in brackets and parentheses
+        simplified = re.sub(r'\[[^\]]*\]', '', text)
+        simplified = re.sub(r'\([^)]*\)', '', text)
+        
+        # Remove common quality indicators
+        simplified = re.sub(r'\b(320|256|192|128)k?bps?\b', '', simplified)
+        simplified = re.sub(r'\bflac\b', '', simplified)
+        simplified = re.sub(r'\bmp3\b', '', simplified)
+        
+        # Remove excessive punctuation
+        simplified = re.sub(r'[_\-\.\s]+', ' ', simplified)
+        simplified = re.sub(r'\s+', ' ', simplified).strip()
+        
+        return simplified
+    
    def start_download_with_match(self, search_result, spotify_track, track_index, table_index):
        """Start download using the matched search result and downloads.py infrastructure"""
        print(f"🚀 Starting download with matched result: {search_result.filename}")
@ -3246,8 +3495,88 @@ class DownloadMissingTracksModal(QDialog):
        
        return search_result
    
+    def create_spotify_based_search_result(self, original_search_result, spotify_track, spotify_artist):
+        """Create a search result using Spotify metadata instead of Soulseek metadata"""
+        from dataclasses import dataclass
+        
+        # Debug: Check what type of search result we received
+        print(f"🔍 Debug - original_search_result type: {type(original_search_result)}")
+        print(f"🔍 Debug - original_search_result attributes: {dir(original_search_result)}")
+        if hasattr(original_search_result, 'filename'):
+            print(f"🔍 Debug - filename: {original_search_result.filename}")
+        if hasattr(original_search_result, 'user'):
+            print(f"🔍 Debug - user: {original_search_result.user}")
+        else:
+            print(f"🔍 Debug - NO USER ATTRIBUTE FOUND")
+        
+        @dataclass
+        class SpotifyBasedSearchResult:
+            # Soulseek download details - using expected field names
+            filename: str
+            username: str      # downloads.py expects 'username' not 'user'
+            size: int
+            bitrate: int       # downloads.py expects 'bitrate' not 'bit_rate'
+            sample_rate: int
+            duration: int
+            quality: str       # downloads.py expects 'quality' not 'format'
+            
+            # Spotify metadata for organization
+            title: str
+            artist: str
+            album: str
+            track_number: int = 0
+            
+            # Add compatibility properties for any code expecting old names
+            @property
+            def user(self):
+                return self.username
+                
+            @property 
+            def bit_rate(self):
+                return self.bitrate
+                
+            @property
+            def format(self):
+                return self.quality
+            
+        # Get Spotify metadata 
+        spotify_title = spotify_track.name
+        spotify_artist_name = spotify_artist.name
+        spotify_album = getattr(spotify_track, 'album', 'Unknown Album')
+        spotify_duration = int(spotify_track.duration_ms / 1000) if hasattr(spotify_track, 'duration_ms') else 0
+        
+        # Determine track number if this is part of an album
+        track_number = getattr(spotify_track, 'track_number', 0) if hasattr(spotify_track, 'track_number') else 0
+        
+        # Create hybrid result - Soulseek download data + Spotify metadata
+        # Map TrackResult attributes to expected format
+        spotify_based_result = SpotifyBasedSearchResult(
+            # Soulseek download details (keep for actual download) - map attributes correctly
+            filename=getattr(original_search_result, 'filename', f"{spotify_title}.flac"),
+            username=getattr(original_search_result, 'username', 'unknown_user'),  # TrackResult uses 'username'
+            size=getattr(original_search_result, 'size', 50000000),
+            bitrate=getattr(original_search_result, 'bitrate', 1411),  # TrackResult uses 'bitrate'
+            sample_rate=getattr(original_search_result, 'sample_rate', 44100),
+            duration=getattr(original_search_result, 'duration', spotify_duration),
+            quality=getattr(original_search_result, 'quality', 'flac'),  # TrackResult uses 'quality'
+            
+            # Spotify metadata (used for folder organization)
+            title=spotify_title,
+            artist=spotify_artist_name,
+            album=spotify_album,
+            track_number=track_number
+        )
+        
+        print(f"🎯 Created Spotify-based search result:")
+        print(f"   📁 Title: {spotify_title} (Spotify)")
+        print(f"   🎤 Artist: {spotify_artist_name} (Spotify)")  
+        print(f"   💿 Album: {spotify_album} (Spotify)")
+        print(f"   📄 File: {original_search_result.filename} (Soulseek)")
+        
+        return spotify_based_result
+    
    def start_matched_download_via_infrastructure(self, search_result, track_index, table_index):
-        """Start matched download using downloads.py infrastructure with automatic artist matching"""
+        """Start matched download using downloads.py infrastructure with Spotify metadata"""
        try:
            # Get the Spotify track for artist info
            track_result = self.missing_tracks[track_index]
@ -3273,9 +3602,13 @@ class DownloadMissingTracksModal(QDialog):
            
            artist = SpotifyArtist(name=artist_name)
            
-            # Call downloads.py infrastructure directly with auto-matched artist
-            # This bypasses the SpotifyMatchingModal since we already have the artist info
-            download_item = self.downloads_page._start_download_with_artist(search_result, artist)
+            # CREATE SPOTIFY-BASED SEARCH RESULT instead of using Soulseek metadata
+            # This ensures folder organization uses Spotify metadata, not Soulseek metadata
+            spotify_based_result = self.create_spotify_based_search_result(search_result, spotify_track, artist)
+            
+            # Call downloads.py infrastructure with Spotify-based search result
+            # This ensures proper folder organization using Spotify metadata
+            download_item = self.downloads_page._start_download_with_artist(spotify_based_result, artist)
            
            if download_item:
                print(f"✅ Successfully queued download for: {spotify_track.name}")