Posts in the '2.X/2. Search in Depth' category (Page 5)

2-5-3. wildcard and regexp Queries

The wildcard query is a low-level, term-based query similar in nature to the prefix query, but it allows you to specify a pattern instead of just a prefix. It uses the standard shell wildcards: ? matches any character, and * matches zero or more characters. ..
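To make the wildcard semantics concrete, here is a minimal Python sketch that translates a shell-style pattern into a regular expression and checks it against a list of terms. It only illustrates the meaning of ? and *; it is not how Elasticsearch or Lucene evaluate the query internally. The helper names and the sample postcode terms are assumptions made for this illustration.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a shell-style wildcard pattern into a compiled regex.

    '?' matches exactly one character, '*' matches zero or more characters,
    and every other character is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def match_terms(pattern, terms):
    """Return the terms that match the whole pattern (hypothetical helper)."""
    rx = wildcard_to_regex(pattern)
    return [t for t in terms if rx.fullmatch(t)]

# Hypothetical, already-lowercased terms standing in for an inverted index.
terms = ["w1v 3dg", "w2f 8hw", "w1f 7hw", "wc1n 1lz", "sw5 0be"]
print(match_terms("w?f*hw", terms))
```

Note that the sketch has to test every term in the list, which hints at why such pattern queries cost more than a single-term lookup.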

2.X/2. Search in Depth 2017.09.24

2-5-4. Query-Time Search-as-You-Type

Leaving postcodes behind, let’s take a look at how prefix matching can help with full-text queries. Users have become accustomed to seeing search results before they have finished typing their query (so-called instant search, or search-as-you-type). Not only do users receive their search results in less time, but we can guide them toward results that actually exist in our index. ..

2.X/2. Search in Depth 2017.09.24

2-5-5. Index-Time Optimizations

All of the solutions we’ve talked about so far are implemented at query time. They don’t require any special mappings or indexing patterns; they simply work with the data that you’ve already indexed. The flexibility of query-time operations comes at a cost: search performance. Sometimes it may make sense ..

2.X/2. Search in Depth 2017.09.24

2-5-6. Ngrams for Partial Matching

As we have said before, "You can find only terms that exist in the inverted index". Although the prefix, wildcard, and regexp queries demonstrated that that is not strictly true, it is true that doing a single-term lookup is much faster than iterating through the terms list to find matching terms on the fly. Preparing your data for partial matching ahead of time will increase your search perform..
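The trade-off described above can be sketched in a few lines of Python: if every word is split into n-grams at index time, a partial match becomes a single dictionary lookup instead of a scan over all terms. This is a toy model under stated assumptions (a plain dict as the inverted index, whitespace tokenization, a fixed gram length of 3); the function names and sample documents are hypothetical.

```python
from collections import defaultdict

def ngrams(term, n):
    """Return all contiguous substrings of length n in term."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]

def build_index(docs, n=3):
    """Map every n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for gram in ngrams(word, n):
                index[gram].add(doc_id)
    return index

# Hypothetical sample documents.
docs = {1: "quick brown fox", 2: "quiet brook"}
index = build_index(docs)

# A partial match is now one lookup rather than a scan of the terms list.
print(sorted(index["qui"]))
print(sorted(index["fox"]))
```

The cost moves to index time and to index size, since each word now contributes several terms.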

2.X/2. Search in Depth 2017.09.24

2-5-7. Index-Time Search-as-You-Type

The first step to setting up index-time search-as-you-type is to define our analysis chain, which we discussed in Configuring Analyzers, but we will go over the steps again here. Preparing the Index: The first step is to configure a custom edge_ngram token filter, which we will cal..
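As a rough illustration of what an edge n-gram filter emits, the Python sketch below produces every prefix of a term between a minimum and maximum length. It mimics the idea only; it is not the Elasticsearch implementation, and the function name and default gram sizes are assumptions chosen for this example.

```python
def edge_ngrams(term, min_gram=1, max_gram=20):
    """Return the prefixes of term whose length is in [min_gram, max_gram]."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]

print(edge_ngrams("quick"))
print(edge_ngrams("brown", min_gram=2, max_gram=3))
```

Indexing these prefixes lets a partially typed word such as "qui" match with an ordinary single-term lookup.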

2.X/2. Search in Depth 2017.09.24

2-5-8. Ngrams for Compound Words

Finally, let’s take a look at how n-grams can be used to search languages with compound words. German is famous for combining several small words into one massive compound word in order to capture precise or complex meanings. For example: Aussprachewörterbuch (Pronunci..

2.X/2. Search in Depth 2017.09.24

2-6. Controlling Relevance

Databases that deal purely in structured data (such as dates, numbers, and string enums) have it easy: they just have to check whether a document (or a row, in a relational database) matches the query. While Boolean yes/no matches are an essential part of full-text search, they are not enough by t..

2.X/2. Search in Depth 2017.09.24

2-6-01. Theory Behind Relevance Scoring

Lucene (and thus Elasticsearch) uses the Boolean model to find matching documents, and a formula called the practical scoring function to calculate relevance. This formula borrows concepts from term frequency/inverse document frequency and the vector space model but adds more-modern features like a coordination factor, field length normalization, and term or query clause boosting. ..
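For orientation, here is a small Python sketch of the three per-term ingredients named above, using the formulas of Lucene's classic TF/IDF similarity: tf is the square root of the term frequency, idf is 1 + ln(numDocs / (docFreq + 1)), and the field-length norm is 1 / sqrt(numTerms). It is a simplified illustration, not the full practical scoring function (which also involves query normalization, the coordination factor, and boosts), and the function names are assumptions.

```python
import math

def tf(freq):
    """Term frequency: more occurrences in the field raise the weight."""
    return math.sqrt(freq)

def idf(num_docs, doc_freq):
    """Inverse document frequency: rarer terms weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def field_norm(num_terms):
    """Field-length norm: shorter fields weigh more."""
    return 1.0 / math.sqrt(num_terms)

def term_weight(freq, num_docs, doc_freq, num_terms):
    """Combine the three factors into a single per-term weight."""
    return tf(freq) * idf(num_docs, doc_freq) * field_norm(num_terms)

# Hypothetical numbers: a term appearing once in a 3-term field,
# in an index of 1000 documents where 9 documents contain the term.
print(term_weight(freq=1, num_docs=1000, doc_freq=9, num_terms=3))
```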

2.X/2. Search in Depth 2017.09.24

2-6-02. Lucene’s Practical Scoring Function

For multiterm queries, Lucene takes the Boolean model, TF/IDF, and the vector space model and combines them in a single efficient package that collects matching documents and scores them as it goes. A multiterm query like GET /my_index/doc/_search { "que..

2.X/2. Search in Depth 2017.09.24

2-6-03. Query-Time Boosting

In Prioritizing Clauses, we explained how you could use the boost parameter at search time to give one query clause more importance than another. For instance: GET /_search { "query": { "bool": { "should": [ { "match": { "title": { "query": "quick brown fox", "boost": 2 } } }, { "match": { "content"..
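A request body of this shape can be assembled programmatically. The Python sketch below builds a bool query whose should clauses carry individual boost values. The helper names are hypothetical, and because the excerpt is cut off before the second clause is complete, the query text and default boost used for the content clause here are assumptions for illustration only.

```python
import json

def boosted_match(field, text, boost=1.0):
    """Build a match clause on field with an explicit boost (hypothetical helper)."""
    return {"match": {field: {"query": text, "boost": boost}}}

def should_query(clauses):
    """Wrap clauses in a bool query where any of them may match."""
    return {"query": {"bool": {"should": list(clauses)}}}

body = should_query([
    boosted_match("title", "quick brown fox", boost=2),
    # Assumed second clause: the source excerpt is truncated at this point.
    boosted_match("content", "quick brown fox"),
])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized to JSON and sent as the body of a search request.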

2.X/2. Search in Depth 2017.09.24



