proximity 8

2. Search in Depth

In Getting Started we covered the basic tools in just enough detail to allow you to start searching your data with Elasticsearch. It won’t take long, though, before you find that you want more: more flexibility when matching user queries, more-accurate ranking of results, more-specific searches to cover different problem domains.

2-2. Full-Text Search

Now that we have covered the simple case of searching for structured data, it is time to explore full-text search: how to search within full-text fields in order to find the most relevant documents. The two most important aspects of full-text search are as follows..

2-3-06. Most Fields

Full-text search is a battle between recall (returning all the documents that are relevant) and precision (not returning irrelevant documents). The goal is to present the user with the most relevant documents on the first page of results. To improve recall, we ..

2-4. Proximity Matching

Standard full-text search with TF/IDF treats documents, or at least each field within a document, as a big bag of words. The match query can tell us whether that bag contains our search terms, but that is only part of the story. It can’t tell us anything about the relationship between words.
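To make the bag-of-words limitation concrete, here is a minimal sketch in plain Python. It is not how Elasticsearch analyzes text: the whitespace tokenizer, the `bag_of_words` helper, and the two sample sentences are all assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase, split on whitespace, and count terms.

    Word positions are thrown away, which is the point of the example.
    """
    return Counter(text.lower().split())


# Hypothetical sample documents: same words, different order.
doc_a = "the quick brown fox jumped over the lazy dog"
doc_b = "the lazy brown dog jumped over the quick fox"

# The two bags are identical, so a pure bag-of-words view cannot
# tell these sentences apart.
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# Only a position-aware check sees that "quick fox" is adjacent in one
# document and not in the other.
adjacent_in_a = "quick fox" in doc_a
adjacent_in_b = "quick fox" in doc_b
```

Both documents contain the same terms with the same counts, so any check that only looks inside the bag treats them as equivalent. Telling them apart requires knowing where each word sits.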

2-4-5. Proximity for Relevance

Although proximity queries are useful, the fact that they require all terms to be present can make them overly strict. It’s the same issue that we discussed in Controlling Precision in Full-Text Search: if six out of seven terms match, a document is probably relevant enough to be worth showing to the user, but the match_phrase query would exclude it.
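One possible way to relax this strictness is to let a plain match query decide which documents are included and use match_phrase only as an optional scoring signal. The sketch below builds such a request body as a Python dictionary. The field name `title`, the sample text, the `30%` threshold, the slop of `50`, and the helper name `build_proximity_relevance_query` are illustrative assumptions, not values taken from this excerpt.

```python
import json


def build_proximity_relevance_query(field: str, text: str, slop: int = 50) -> dict:
    """Build a bool query: `must` controls inclusion, `should` rewards proximity.

    The helper name and default values are hypothetical.
    """
    return {
        "query": {
            "bool": {
                # Documents only need to match some of the terms to be included.
                "must": {
                    "match": {
                        field: {
                            "query": text,
                            "minimum_should_match": "30%",
                        }
                    }
                },
                # Documents whose terms also appear close together score higher,
                # but failing this clause does not exclude a document.
                "should": {
                    "match_phrase": {
                        field: {
                            "query": text,
                            "slop": slop,
                        }
                    }
                },
            }
        }
    }


query = build_proximity_relevance_query("title", "quick brown fox")
body = json.dumps(query, indent=2)  # JSON text that could be sent as a search body
```

Because the phrase clause sits under `should`, a document matching six of seven terms is still returned. It simply ranks below documents where the terms appear near one another.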

2-4-6. Improving Performance

Phrase and proximity queries are more expensive than simple match queries. Whereas a match query just has to look up terms in the inverted index, a match_phrase query has to calculate and compare the positions of multiple possibly repeated terms.
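The difference in work can be seen in a toy positional index. This is a simplified in-memory sketch, not Lucene's implementation; the function names and sample documents are assumptions. Term matching needs only the set of document IDs per term, while phrase matching must also walk the position lists.

```python
from collections import defaultdict


def build_positional_index(docs: dict) -> dict:
    """Map term -> doc_id -> list of positions where the term occurs."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def match_docs(index: dict, terms: list) -> set:
    """Term lookup only: documents containing every term, in any order."""
    doc_sets = [set(index.get(term, {})) for term in terms]
    return set.intersection(*doc_sets) if doc_sets else set()


def phrase_docs(index: dict, terms: list) -> set:
    """Extra work: compare positions so the terms appear consecutively."""
    hits = set()
    for doc_id in match_docs(index, terms):
        starts = index[terms[0]][doc_id]
        # A term that repeats contributes several candidate start positions.
        if any(
            all(start + offset in index[term][doc_id]
                for offset, term in enumerate(terms))
            for start in starts
        ):
            hits.add(doc_id)
    return hits


# Hypothetical sample documents.
docs = {1: "quick brown fox", 2: "brown quick fox", 3: "quick fox"}
index = build_positional_index(docs)

any_order = match_docs(index, ["quick", "fox"])
exact_phrase = phrase_docs(index, ["quick", "fox"])
```

`match_docs` finishes after a set intersection. `phrase_docs` runs the same intersection and then checks every candidate start position in every surviving document, which is where the additional cost comes from.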

2-4-7. Finding Associated

As useful as phrase and proximity queries can be, they still have a downside. They are overly strict: all terms must be present for a phrase query to match, even when using slop. The flexibility in word ordering that you gain with slop also comes at a price, because you lose the assoc..
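One common way to keep track of which words belong together is to index adjacent word pairs (bigrams, sometimes called shingles) alongside single words. The excerpt is cut off before it names a technique, so treat the following as an assumption about the general idea rather than as this section's method. The `word_bigrams` helper and the sample sentences are hypothetical.

```python
def word_bigrams(text: str) -> list:
    """Return adjacent word pairs from a whitespace-tokenized, lowercased text."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


# Same four words, opposite meaning.
first = word_bigrams("Sue ate the alligator")
second = word_bigrams("The alligator ate Sue")

# Only the pairs that really occur in both sentences overlap.
shared = set(first) & set(second)
```

Unlike a phrase query, pairs can match individually: a document sharing some bigrams with the query still gets credit for them, while the pairs themselves preserve which words appeared side by side.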