full-text 16

1-01. You Know, for Search…

Elasticsearch is an open-source search engine built on top of Apache Lucene™, a full-text search-engine library. Lucene is arguably the most advanced, high-performance, and fully featured search engine library in existence today, whether open source or proprietary. ..

1-01-10. Full-Text Search

The searches so far have been simple: single names, filtered by age. Let's try a more advanced, full-text search, a task that traditional databases would really struggle with. We are going to search for all employees who enjoy rock climbing: GET /megacorp/employee/_s..

1-06. Mapping and Analysis

While playing around with the data in our index, we notice something odd. Something seems to be broken: we have 12 tweets in our indices, and only one of them contains the date 2014-09-15, but have a look at the total hits for the following queries: GET /_searc..

1-06-1. Exact Values Versus Full Text

Data in Elasticsearch can be broadly divided into two types: exact values and full text. Exact values are exactly what they sound like. Examples include a date or a user ID, as well as exact strings such as a username or an email address. The exact value Foo is not the same as the exact value foo. The exact value 2014 is not the..

1-06-2. Inverted Index

Elasticsearch uses a structure called an inverted index, which is designed to allow very fast full-text searches. An inverted index consists of a list of all the unique words that appear in any document, and for each word, a list of the documents in which it appears. ..
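
The structure described in this excerpt can be sketched in a few lines of Python. This is only a toy illustration, not how Elasticsearch or Lucene implements it: the function name `build_inverted_index`, the sample documents, and the whitespace-only word splitting are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each unique word to the sorted list of document IDs that contain it.

    `docs` is a hypothetical dict of {doc_id: text}; words are split on
    whitespace only, which is far simpler than a real analyzer.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    # Sort the document IDs so lookups return a stable, readable result.
    return {word: sorted(doc_ids) for word, doc_ids in index.items()}


# Hypothetical sample documents, chosen only for illustration.
docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
}
index = build_inverted_index(docs)

print(index["brown"])  # documents containing "brown"
print(index["fox"])    # documents containing "fox"
```

Looking up a word is then a single dictionary access instead of a scan over every document, which is the property that makes full-text search fast.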

1-06-3. Analysis and Analyzers

Analysis is a process that consists of the following: First, tokenizing a block of text into individual terms suitable for use in an inverted index. Then normalizing these terms into a standard form to improve their "searchability" or recall. This job is perfor..
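
The two steps named in this excerpt, tokenizing and then normalizing, can be sketched as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular Elasticsearch analyzer: the function name `analyze` is hypothetical, tokens are taken to be runs of word characters, and lowercasing stands in for normalization.

```python
import re


def analyze(text):
    """Toy two-step analysis: tokenize, then normalize.

    Assumptions for illustration only: a token is any run of word
    characters, and normalization is just lowercasing.
    """
    # Step 1: tokenize the block of text into individual terms.
    tokens = re.findall(r"\w+", text)
    # Step 2: normalize each term into a standard form.
    return [token.lower() for token in tokens]


terms = analyze("The QUICK brown Fox!")
print(terms)
```

Because both the indexed text and the query text would pass through the same function, "Fox" and "fox" end up as the same term, which is what improves recall.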

1-07-4. Most Important Queries

While Elasticsearch comes with many queries, you will use just a few frequently. We discuss them in much greater detail in Search in Depth, but next we give you a quick introduction to the most important queries. match_all Query: The match_all query simply matches all documen..

1-10-04. Configuring Analyzers

The third important index setting is the analysis section, which is used to configure existing analyzers or to create new custom analyzers specific to your index. In Analysis and Analyzers, we introduced some of the built-in analyzers, which are used to convert full-text strings into an inverted..

1-11-1. Making Text Searchable

The first challenge that had to be solved was how to make text searchable. Traditional databases store a single value per field, but this is insufficient for full-text search. Every word in a text field needs to be searchable, which means that the database needs to be able to index multiple values (words, in this case) in a single field. ..