remove stuff

pull/8/head
Broque Thomas 9 months ago
parent f4ac2b7d6e
commit 6a86076d54

@ -1,228 +0,0 @@
# Database Integration Migration - Plex to Local Database
## Overview
This document details the migration from Plex API-based track/album checking to local SQLite database checking in the SoulSync application. This change significantly improves performance by replacing network API calls with local database queries.
## Background
Previously, the application made real-time API calls to the Plex Media Server to check if tracks or albums existed in the user's library. This approach was:
- **Slow**: Network requests take 100-500ms per track/album
- **Unreliable**: Subject to network issues and Plex server availability
- **Resource-intensive**: Multiple API calls for complex searches
The new approach uses a local SQLite database (`music_library.db`) that mirrors the Plex library structure and enables:
- **Fast**: Database queries execute in 1-5ms
- **Reliable**: No network dependencies for existence checking
- **Efficient**: Single database instance with connection pooling
## Files Modified
### 1. `/database/music_database.py`
**New Methods Added:**
- `search_tracks(title, artist, limit)` - Search for tracks with fuzzy matching
- `search_albums(title, artist, limit)` - Search for albums with fuzzy matching
- `check_track_exists(title, artist, threshold)` - Check track existence with confidence scoring
- `check_album_exists(title, artist, threshold)` - Check album existence with confidence scoring
- `_string_similarity(s1, s2)` - Levenshtein distance similarity calculation
**Key Features:**
- Fuzzy string matching with configurable confidence thresholds (default 0.8)
- JOIN queries to include artist/album metadata
- Thread-safe database connections
- Comprehensive error handling and logging
### 2. `/ui/pages/sync.py`
**Function Modified:**
- `_check_track_in_plex()` at line 191
**Changes Made:**
- Replaced `self.plex_client.search_tracks()` with `db.check_track_exists()`
- Added database import: `from database.music_database import get_database`
- Created `MockPlexTrack` class for backward compatibility
- Updated error messages and logging to reference "database" instead of "Plex"
- Maintained same confidence scoring and early-exit logic
**Performance Impact:**
- Track checking now executes in ~2-5ms vs 100-500ms previously
- Batch playlist analysis completes 50-100x faster
### 3. `/ui/pages/artists.py`
**Classes Modified:**
#### `PlexLibraryWorker``DatabaseLibraryWorker`
- **Line 425**: Replaced entire class implementation
- **Constructor**: Removed `plex_client` parameter, now only requires `albums` and `matching_engine`
- **Search Logic**: Replaced `plex_client.search_albums()` with `db.check_album_exists()`
- **Backwards Compatibility**: Added alias `PlexLibraryWorker = DatabaseLibraryWorker`
#### `DownloadMissingAlbumTracksModal`
- **Line 1749**: Updated `start_plex_analysis()` documentation
- **Line 3043**: Updated error message to reference "Music database" instead of "Plex client"
- **Functionality**: Inherits database functionality through updated `PlaylistTrackAnalysisWorker`
## Integration Points Identified and Updated
### 1. Sync Page - Download Missing Tracks Modal
**Location:** `ui/pages/sync.py:191` in `PlaylistTrackAnalysisWorker._check_track_in_plex()`
**Previous Behavior:**
```python
potential_plex_matches = self.plex_client.search_tracks(
title=query_title,
artist=artist_name,
limit=15
)
```
**New Behavior:**
```python
db_track, confidence = db.check_track_exists(query_title, artist_name, confidence_threshold=0.7)
if db_track and confidence >= 0.8:
# Return mock Plex track for compatibility
return MockPlexTrack(db_track), confidence
```
### 2. Artists Page - Album Ownership Checking
**Location:** `ui/pages/artists.py:425` in `PlexLibraryWorker` (now `DatabaseLibraryWorker`)
**Previous Behavior:**
```python
plex_albums = self.plex_client.search_albums(album_name, artist_clean, limit=5)
best_match, confidence = self.matching_engine.find_best_album_match(spotify_album, plex_albums)
```
**New Behavior:**
```python
db_album, confidence = db.check_album_exists(album_name, artist_clean, confidence_threshold=0.7)
if db_album and confidence >= 0.8:
owned_albums.add(spotify_album.name)
```
### 3. Artists Page - Download Missing Album Tracks Modal
**Location:** `ui/pages/artists.py:1748` in `DownloadMissingAlbumTracksModal.start_plex_analysis()`
**Previous Behavior:** Used `PlaylistTrackAnalysisWorker` with Plex client
**New Behavior:** Same worker now uses database via updated `_check_track_in_plex()` method
## Database Schema Reference
The local database mirrors Plex structure with these key tables:
```sql
-- Artists table
CREATE TABLE artists (
id INTEGER PRIMARY KEY, -- Plex ratingKey
name TEXT NOT NULL, -- Artist name
thumb_url TEXT, -- Thumbnail URL
genres TEXT, -- JSON array of genres
summary TEXT, -- Artist biography
created_at TIMESTAMP,
updated_at TIMESTAMP
);
-- Albums table
CREATE TABLE albums (
id INTEGER PRIMARY KEY, -- Plex ratingKey
artist_id INTEGER NOT NULL, -- FK to artists.id
title TEXT NOT NULL, -- Album title
year INTEGER, -- Release year
thumb_url TEXT, -- Album artwork URL
genres TEXT, -- JSON array of genres
track_count INTEGER, -- Number of tracks
duration INTEGER, -- Total duration (ms)
created_at TIMESTAMP,
updated_at TIMESTAMP,
FOREIGN KEY (artist_id) REFERENCES artists (id)
);
-- Tracks table
CREATE TABLE tracks (
id INTEGER PRIMARY KEY, -- Plex ratingKey
album_id INTEGER NOT NULL, -- FK to albums.id
artist_id INTEGER NOT NULL, -- FK to artists.id
title TEXT NOT NULL, -- Track title
track_number INTEGER, -- Track number on album
duration INTEGER, -- Track duration (ms)
file_path TEXT, -- Full file path
bitrate INTEGER, -- Audio bitrate
created_at TIMESTAMP,
updated_at TIMESTAMP,
FOREIGN KEY (album_id) REFERENCES albums (id),
FOREIGN KEY (artist_id) REFERENCES artists (id)
);
```
## Performance Benchmarks
| Operation | Before (Plex API) | After (Database) | Improvement |
|-----------|------------------|------------------|-------------|
| Single track check | 100-500ms | 2-5ms | 20-250x faster |
| Album ownership scan (50 albums) | 30-60 seconds | 0.5-2 seconds | 15-120x faster |
| Playlist analysis (200 tracks) | 2-5 minutes | 3-10 seconds | 12-100x faster |
## Backward Compatibility
The migration maintains full backward compatibility:
1. **API Signatures**: All public methods maintain same parameters and return types
2. **Class Names**: `PlexLibraryWorker` alias ensures existing code continues to work
3. **Signal Emissions**: Same Qt signals emitted with same data structures
4. **Mock Objects**: `MockPlexTrack` provides same interface as original Plex track objects
5. **Error Handling**: Same error patterns and logging locations
## Migration Verification
To verify the migration works correctly:
1. **Test Sync Page**: Use "Download Missing Tracks" on a playlist - should complete analysis much faster
2. **Test Artists Page**: Search for an artist - album ownership icons should appear quickly
3. **Test Album Downloads**: Use "Download Missing Album Tracks" - should quickly identify existing tracks
4. **Check Logs**: Look for "Database match found" messages instead of Plex-related messages
## Troubleshooting
### Common Issues:
1. **"Music database is not available" error**:
- Ensure `music_library.db` exists in `/database/` folder
- Check database was populated via Dashboard sync
2. **No matches found for known tracks**:
- Verify database has current Plex data
- Check confidence thresholds (default 0.8 may be too high for some content)
- Review string similarity algorithm for edge cases
3. **Performance still slow**:
- Ensure database indexes are created (`idx_artists_name`, `idx_albums_title`, etc.)
- Check database file isn't corrupted (vacuum if needed)
### Debug Information:
Enable debug logging to see matching details:
```python
# In music_database.py, set logger level to DEBUG
logger.setLevel(logging.DEBUG)
```
This will show:
- Database queries being executed
- Confidence scores for matches
- String similarity calculations
- Match/no-match decisions
## Future Enhancements
Potential improvements for the database integration:
1. **Enhanced Fuzzy Matching**: Implement more sophisticated algorithms (Jaro-Winkler, phonetic matching)
2. **Caching Layer**: Add Redis or memory cache for frequently accessed queries
3. **Incremental Updates**: Sync only changed items instead of full refresh
4. **Multi-threaded Analysis**: Parallel database queries for large collections
5. **Search Optimization**: Full-text search capabilities for better matching
## Conclusion
This migration successfully replaces slow Plex API calls with fast local database queries while maintaining full backward compatibility. The performance improvement is dramatic - most operations are now 20-250x faster, making the application much more responsive for users with large music libraries.
The database-driven approach also reduces dependencies on network connectivity and Plex server availability, making the application more reliable overall.
Loading…
Cancel
Save