dictionary 3

3-4-1. Algorithmic Stemmers

Most of the stemmers available in Elasticsearch are algorithmic: they apply a series of rules to a word to reduce it to its root form, such as stripping the final s or es from plurals. They don’t have to know anything about individual words in order to stem them. ..
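
To make the idea of rule-based stemming concrete, here is a minimal sketch in Python. It is not one of the stemmers that ship with Elasticsearch; the function name `strip_plural` and the minimum stem length of 3 are assumptions chosen only for illustration.

```python
def strip_plural(word: str) -> str:
    """Toy rule-based stemmer: strip a final 'es' or 's' from a word."""
    # Rules are tried in order; the first suffix that matches is removed.
    # The minimum stem length of 3 (an assumption for this sketch) keeps
    # very short words such as "is" intact.
    for suffix in ("es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print([strip_plural(w) for w in ["foxes", "cats", "is", "houses"]])
# prints ['fox', 'cat', 'is', 'hous']
```

The result for "houses" shows the trade-off: the rules are applied without any knowledge of the word itself, so the output is not always a real word.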

3-4-2. Dictionary Stemmers

Dictionary stemmers work quite differently from algorithmic stemmers. Instead of applying a standard set of rules to each word, they simply look up the word in a dictionary. In theory, they could produce much better results than an algorithmic stemmer. A dictionary stemmer should be able to do the following: ..
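
The lookup approach can be sketched in a few lines of Python. The dictionary contents and the names `STEM_DICTIONARY` and `dictionary_stem` are hypothetical; a real dictionary stemmer relies on far larger word lists.

```python
# Hypothetical miniature dictionary mapping inflected forms to root forms.
STEM_DICTIONARY = {
    "mice": "mouse",
    "feet": "foot",
    "houses": "house",
}


def dictionary_stem(word: str) -> str:
    """Return the root form listed in the dictionary, or the word unchanged."""
    # Lookup is case-insensitive; words missing from the dictionary pass through.
    return STEM_DICTIONARY.get(word.lower(), word)


print([dictionary_stem(w) for w in ["mice", "houses", "foxes"]])
# prints ['mouse', 'house', 'foxes']
```

Irregular forms such as "mice" are handled correctly, but a word that is missing from the dictionary ("foxes" here) is left unstemmed, so the quality of the results depends on the quality of the dictionary.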

3-4-3. Hunspell Stemmer

Elasticsearch provides dictionary-based stemming via the hunspell token filter. Hunspell (hunspell.github.io) is the spell checker used by OpenOffice, LibreOffice, Chrome, Firefox, Thunderbird, and many other open and closed source projects. ..
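
As a rough sketch, index settings that wire the hunspell token filter into a custom analyzer might look like the following, built here as a Python dictionary and printed as JSON. The filter name `hunspell_en` and the analyzer name `en_hunspell` are hypothetical, and the sketch assumes that Hunspell dictionary files for the `en_US` locale are already installed in the `hunspell/en_US` directory under the Elasticsearch config path.

```python
import json

# Index settings for a custom analyzer that uses the hunspell token filter.
# "hunspell_en" and "en_hunspell" are hypothetical names chosen for this sketch.
settings = {
    "settings": {
        "analysis": {
            "filter": {
                "hunspell_en": {
                    "type": "hunspell",
                    # Assumes en_US dictionary files are installed on every node.
                    "locale": "en_US",
                }
            },
            "analyzer": {
                "en_hunspell": {
                    "tokenizer": "standard",
                    # Lowercase first so tokens match the dictionary entries.
                    "filter": ["lowercase", "hunspell_en"],
                }
            },
        }
    }
}

# This JSON document would be the request body when creating the index.
print(json.dumps(settings, indent=2))
```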