Fuzzy 8

2. Search in Depth

In Getting Started we covered the basic tools in just enough detail to allow you to start searching your data with Elasticsearch. It won’t take long, though, before you find that you want more: more flexibility when matching user queries, more-accurate ranking of results, more-specific searches to cover different problem domains.

2-2. Full-Text Search

Now that we have covered the simple case of searching for structured data, it is time to explore full-text search: how to search within full-text fields in order to find the most relevant documents. The two most important aspects of full-text search are as follows..

2-2-1. Term-Based Versus Full-Text

While all queries perform some sort of relevance calculation, not all queries have an analysis phase. Besides specialized queries like the bool or function_score queries, which don’t operate on text at all, textual queries can be broken down into two families: ..

3-7. Typoes and Mispelings

We expect a query on structured data like dates and prices to return only documents that match exactly. However, good full-text search shouldn’t have the same restriction. Instead, we can widen the net to include words that may match, but use the relevance score to push the better matches to the top of the result set.

3-7-1. Fuzziness

Fuzzy matching treats two words that are "fuzzily" similar as if they were the same word. First, we need to define what we mean by fuzziness. In 1965, Vladimir Levenshtein developed the Levenshtein distance, which measures the number of single-character edits required to transform one word into t..
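To make the idea of "single-character edits" concrete, here is a minimal Python sketch of the classic Levenshtein distance (insertions, deletions, and substitutions only). The function name `levenshtein` is an illustrative choice, and this is not a claim about how Elasticsearch computes fuzziness internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short

    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for pair in [("kitten", "sitting"), ("surprise", "surprize")]:
        print(pair, levenshtein(*pair))
```

Two words whose distance is small relative to their length are the ones a fuzzy match would treat as "the same word".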

3-7-2. Fuzzy Query

The fuzzy query is the fuzzy equivalent of the term query. You will seldom use it directly yourself, but understanding how it works will help you to use fuzziness in the higher-level match query. To understand how it works, we will first index some documents: ..
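As a rough illustration of what a fuzzy query request body can look like, the Python sketch below builds one as a plain dictionary and prints it as JSON. The helper name `fuzzy_query_body`, the field name `text`, and the search term `surprize` are hypothetical placeholders and are not taken from this excerpt. The `fuzzy` query with `value` and `fuzziness` parameters follows the Elasticsearch Query DSL, but the details should be checked against the documentation for your Elasticsearch version.

```python
import json


def fuzzy_query_body(field: str, value: str, fuzziness="AUTO") -> dict:
    """Build a search request body containing a single fuzzy query.

    `field` and `value` are placeholders supplied by the caller;
    `fuzziness` is the maximum edit distance (or "AUTO").
    """
    return {
        "query": {
            "fuzzy": {
                field: {
                    "value": value,
                    "fuzziness": fuzziness,
                }
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical field and misspelled term, for illustration only.
    body = fuzzy_query_body("text", "surprize", fuzziness=1)
    print(json.dumps(body, indent=2))
```

Like the term query, the fuzzy query does not analyze its input, so the term is matched against the index as given, within the allowed edit distance.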

3-7-4. Scoring Fuzziness

Users love fuzzy queries. They assume that these queries will somehow magically find the right combination of proper spellings. Unfortunately, the truth is somewhat more prosaic. Imagine that we have 1,000 documents containing "Schwarzenegger", and just one document with the misspelling "Schwarzenege..