
Sonarr Dub List

A tool that identifies anime in your Sonarr/Radarr collection for which an English dub is available but your copy has only the Japanese audio track.

Features

  • Compares your media collection against the MAL (MyAnimeList) dubbed-anime database
  • Identifies episodes/movies missing English dub tracks
  • Excludes Season 0 (specials/OVAs), which is often not dubbed
  • Supports an exclusion list for series with partial dubs
  • Uses fuzzy matching with similarity scoring to reduce false positives
  • Generates detailed reports in JSON and Markdown formats

Setup

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Configure credentials:

    cp config.json.example config.json
    # Edit config.json with your Sonarr/Radarr API keys and URLs
    
  3. Data sources:

    • MAL anime data is downloaded into the data/ directory (one JSON file per anime)
    • The community dub list is in MAL-Dubs/data/dubInfo.json (4800+ dubbed anime)
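
As a rough illustration of reading the community dub list, the sketch below loads a set of MAL IDs from a JSON file. The layout it expects (a top-level object whose "dubbed" key holds a list of MAL IDs) is an assumption, not something this README states, and load_dubbed_ids is a hypothetical helper; check the actual file before relying on it.

```python
import json
from typing import Set


def load_dubbed_ids(path: str, key: str = "dubbed") -> Set[int]:
    """Return the MAL IDs stored under `key` in a JSON file.

    ASSUMPTION: the file is a JSON object and `key` maps to a list of
    integer MAL IDs. A missing key yields an empty set.
    """
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    return {int(mal_id) for mal_id in payload.get(key, [])}
```

A call such as load_dubbed_ids("MAL-Dubs/data/dubInfo.json") would then give a set that can be checked with a plain membership test.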

Usage

# Full run - fetch from Sonarr/Radarr and generate report
python3 media_dub_report.py

# Quick run - use cached data from previous run
python3 media_dub_report.py --skip-api-fetch

# Adjust matching thresholds
python3 media_dub_report.py --min-score 5 --min-similarity 0.5

Command Line Options

  • --skip-api-fetch - Use the existing database instead of reloading from Sonarr/Radarr
  • --min-score <float> - Minimum FTS score required to accept a match (default: 5.0)
  • --min-stdev <float> - Minimum standard deviation of candidate scores (default: 0.0, higher = more confident)
  • --min-similarity <float> - Minimum edit-distance similarity, from 0 to 1 (default: 0.0, higher = stricter)
  • --db-path <path> - SQLite database path (default: media_dubs.db)
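
The options above map naturally onto Python's argparse module. The sketch below is one way they could be declared; the flag names and defaults come from the list above, while the function name build_parser and the help strings are illustrative and not taken from media_dub_report.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(prog="media_dub_report.py")
    parser.add_argument("--skip-api-fetch", action="store_true",
                        help="use the existing database instead of reloading")
    parser.add_argument("--min-score", type=float, default=5.0,
                        help="minimum FTS score to accept a match")
    parser.add_argument("--min-stdev", type=float, default=0.0,
                        help="minimum score standard deviation")
    parser.add_argument("--min-similarity", type=float, default=0.0,
                        help="minimum edit-distance similarity (0-1)")
    parser.add_argument("--db-path", default="media_dubs.db",
                        help="SQLite database path")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--skip-api-fetch", "--min-similarity", "0.5"])
print(args.skip_api_fetch, args.min_score, args.min_similarity, args.db_path)
```

Note that argparse converts dashes in flag names to underscores, so --skip-api-fetch is read back as args.skip_api_fetch.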

Exclusions

To exclude a series that is only partially dubbed (e.g., Detective Conan, where only the first 130 episodes are dubbed):

  1. Find the Sonarr ID:

    sqlite3 media_dubs.db "SELECT sonarr_id, title FROM sonarr_series WHERE title LIKE '%YourSeries%'"
    
  2. Add to exclusions.txt:

    841  # Detective Conan - partial dub
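
For reference, a file in this format (one Sonarr ID per line, with an optional # comment) could be read roughly as follows. This is a minimal sketch: parse_exclusions is a hypothetical helper, and skipping non-numeric lines silently is an assumption about the desired behavior, not a description of what the script does.

```python
from typing import Iterable, Set


def parse_exclusions(lines: Iterable[str]) -> Set[int]:
    """Collect Sonarr IDs, ignoring comments, blank lines, and non-numeric text."""
    ids: Set[int] = set()
    for raw in lines:
        # Everything after the first '#' is treated as a comment.
        entry = raw.split("#", 1)[0].strip()
        if entry.isdigit():
            ids.add(int(entry))
    return ids


sample = ["841  # Detective Conan - partial dub", "", "# comment-only line"]
print(parse_exclusions(sample))  # prints {841}
```

With a real file, the same function can be fed an open file handle, since iterating a text file yields its lines.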
    

How It Works

  1. Data Loading:

    • Loads MAL anime metadata from data/*.json
    • Fetches your collection from Sonarr/Radarr APIs
    • Checks language tracks for each episode/movie file
  2. Title Matching:

    • Uses FTS5 (Full-Text Search) for initial candidate matches
    • Calculates edit distance similarity after normalization
    • Sorts by similarity first (prefer exact matches), then FTS score
    • Filters by configurable thresholds
  3. Report Generation:

    • Lists episodes with files but missing English audio
    • Excludes Season 0 (specials/OVAs)
    • Excludes series in exclusions list
    • Outputs to missing_dubs_report.md and missing_dubs_report.json
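
The ranking in the title-matching step can be sketched in plain Python. The example below takes candidate titles that already carry an FTS score, computes a normalized Levenshtein similarity, filters by the two thresholds, and sorts by similarity first and FTS score second. Several details are assumptions made for illustration: the normalization rule (lowercase, punctuation collapsed to spaces), the convention that a higher FTS score is better, and all function names. The script's actual implementation may differ.

```python
import re
from typing import List, Optional, Tuple


def normalize(title: str) -> str:
    """Lowercase and collapse non-alphanumeric runs (an assumed normalization)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit-distance similarity in [0, 1]; 1.0 means identical after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(query: str,
               candidates: List[Tuple[str, float]],
               min_score: float = 5.0,
               min_similarity: float = 0.0) -> Optional[Tuple[str, float, float]]:
    """Return (title, fts_score, similarity) for the top candidate, or None."""
    scored = [(title, fts, similarity(query, title)) for title, fts in candidates]
    kept = [c for c in scored if c[1] >= min_score and c[2] >= min_similarity]
    # Similarity is the primary key so exact matches win; FTS score breaks ties.
    kept.sort(key=lambda c: (c[2], c[1]), reverse=True)
    return kept[0] if kept else None
```

Sorting on similarity before FTS score means a shorter exact title beats a longer title that merely contains the same words, even when the longer one has the higher FTS score.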

Output

The markdown report includes:

  • Summary statistics
  • Match quality scores for transparency
  • Collapsible tables showing the top 10 candidate matches
  • Clickable MAL links for verification
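
A collapsible candidate table of this kind is commonly written in Markdown with an HTML details element. The sketch below shows one possible shape; the column names, the function name candidates_section, and the tuple layout are assumptions for illustration, and the real report format may differ.

```python
from typing import List, Tuple

MAL_ANIME_URL = "https://myanimelist.net/anime/{mal_id}"


def candidates_section(series_title: str,
                       candidates: List[Tuple[int, str, float, float]],
                       limit: int = 10) -> str:
    """Render up to `limit` (mal_id, title, similarity, fts_score) rows."""
    lines = [
        "<details>",
        f"<summary>Candidate matches for {series_title}</summary>",
        "",
        "| MAL title | Similarity | FTS score |",
        "| --- | --- | --- |",
    ]
    for mal_id, title, sim, fts in candidates[:limit]:
        safe_title = title.replace("|", "\\|")  # keep pipes from splitting cells
        link = f"[{safe_title}]({MAL_ANIME_URL.format(mal_id=mal_id)})"
        lines.append(f"| {link} | {sim:.2f} | {fts:.2f} |")
    lines += ["", "</details>"]
    return "\n".join(lines)


print(candidates_section("Cowboy Bebop", [(1, "Cowboy Bebop", 1.0, 6.0)]))
```

The blank line after the summary is needed by most Markdown renderers for the table inside the details element to be parsed as a table.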

API References