mirror of https://github.com/Nezreka/SoulSync.git
normalize_string() was running unidecode on all text, converting Japanese kanji to Chinese pinyin gibberish (命の灯火 → "tvanimedei"). It now detects CJK characters (kanji, hiragana, katakana, hangul, fullwidth forms) and skips unidecode for any text containing them, lowercasing that text instead. Non-CJK text (Latin accents, Cyrillic) still goes through unidecode as before.
pull/253/head
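The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual SoulSync code: the helper name `contains_cjk`, the exact Unicode ranges, and the fallback used when `unidecode` is not installed are all assumptions, and the real `normalize_string()` may do additional cleanup.

```python
import re
import unicodedata

try:
    from unidecode import unidecode  # third-party package named in the commit message
except ImportError:
    def unidecode(text):
        # Fallback for this sketch only (an assumption, not SoulSync behaviour):
        # strip combining accents; does not transliterate Cyrillic.
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Assumed ranges; the real commit may check a different set of blocks.
_CJK_RE = re.compile(
    "["
    "\u3040-\u309f"  # hiragana
    "\u30a0-\u30ff"  # katakana
    "\u3400-\u4dbf"  # CJK unified ideographs, extension A
    "\u4e00-\u9fff"  # CJK unified ideographs (kanji / hanzi)
    "\uac00-\ud7af"  # hangul syllables
    "\uff00-\uffef"  # halfwidth and fullwidth forms
    "]"
)


def contains_cjk(text):
    """Return True if any character falls in one of the ranges above."""
    return bool(_CJK_RE.search(text))


def normalize_string(text):
    """Lowercase CJK text as-is; transliterate everything else first."""
    if not text:
        return ""
    if contains_cjk(text):
        # Skip transliteration so the original characters survive.
        return text.lower()
    return unidecode(text).lower()
```

With this sketch, `normalize_string("命の灯火")` returns the input unchanged, while `normalize_string("Café")` returns `"cafe"`. Note that one CJK character anywhere in the string disables transliteration for the whole string, which matches the "text containing them" wording in the commit message.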
parent
1646c3d9e1
commit
d944d4a7d2