The icu_tokenizer uses the same Unicode Text Segmentation algorithm as the standard tokenizer, but adds better support for some Asian languages by using a dictionary-based approach to identify words in Thai, Lao, Chinese, Japanese, and Korean, and by using custom rules to break Myanmar and Khmer text into syllables.
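As a minimal sketch of how this tokenizer is wired up, the settings below create an index whose custom analyzer uses icu_tokenizer in place of the standard tokenizer (the index name `icu_sample` and analyzer name `my_icu_analyzer` are illustrative; the ICU analysis plugin must be installed):

```json
PUT icu_sample
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_icu_analyzer": {
          "type": "custom",
          "tokenizer": "icu_tokenizer"
        }
      }
    }
  }
}
```

You can then exercise the dictionary-based segmentation with the `_analyze` API, e.g. `GET icu_sample/_analyze` with `"analyzer": "my_icu_analyzer"` and Thai or CJK sample text, and compare the tokens against the standard analyzer's output.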