1-09-1. Query Phase

drscg 2017. 9. 30. 18:02

During the initial query phase, the query is broadcast to one copy (a primary or a replica) of every shard in the index. Each shard executes the search locally and builds a priority queue of matching documents.

Priority Queue

A priority queue is just a sorted list that holds the top-n matching documents. Its size depends on the pagination parameters from and size. For example, the following search request requires a priority queue big enough to hold 100 documents (from + size = 90 + 10):

GET /_search
{
    "from": 90,
    "size": 10
}
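To make the sizing rule concrete, here is a minimal Python sketch of a bounded top-n queue. It is an illustration only, not Elasticsearch's implementation: the names `queue_capacity` and `TopN`, the document IDs, and the scores are all hypothetical.

```python
import heapq
from typing import List, Tuple


def queue_capacity(from_: int, size: int) -> int:
    """Entries a priority queue must hold to serve the page [from, from + size)."""
    return from_ + size


class TopN:
    """Bounded priority queue keeping the n highest-scoring (score, doc_id) pairs."""

    def __init__(self, n: int) -> None:
        self.n = n
        self._heap: List[Tuple[float, str]] = []  # min-heap: weakest kept hit at index 0

    def offer(self, doc_id: str, score: float) -> None:
        if self.n <= 0:
            return
        entry = (score, doc_id)
        if len(self._heap) < self.n:
            heapq.heappush(self._heap, entry)
        elif entry > self._heap[0]:
            # The new hit beats the weakest kept hit, so it takes its place.
            heapq.heapreplace(self._heap, entry)

    def ranked(self) -> List[Tuple[str, float]]:
        """Kept hits as (doc_id, score), best first."""
        return [(doc_id, score) for score, doc_id in sorted(self._heap, reverse=True)]


capacity = queue_capacity(90, 10)   # the request above: from=90, size=10
queue = TopN(capacity)
for i in range(250):                # 250 hypothetical matching documents
    queue.offer(f"doc-{i}", float(i))
top = queue.ranked()
print(capacity, len(top))           # 100 100
```

Only the 100 best hits survive, even though 250 documents matched; the requested page is the last 10 entries of that list.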

The query phase process is depicted in Figure 14, “Query phase of distributed search”.

Figure 14. Query phase of distributed search

The query phase consists of the following three steps:

1. The client sends a search request to Node 3, which creates an empty priority queue of size from + size.
2. Node 3 forwards the search request to a primary or replica copy of every shard in the index. Each shard executes the query locally and adds the results to a local sorted priority queue of size from + size.
3. Each shard returns the doc IDs and sort values of all the docs in its priority queue to the coordinating node, Node 3, which merges these values into its own priority queue to produce a globally sorted list of results.
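The three steps can be sketched in a few lines of Python. This is a simplified simulation under stated assumptions, not how Elasticsearch is implemented: each shard is modeled as a hypothetical dict of doc ID to score, and `shard_query` and `coordinate` are illustrative names.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (sort value, doc ID): all a shard returns in the query phase


def shard_query(scores: Dict[str, float], from_: int, size: int) -> List[Hit]:
    """Step 2: a shard copy ranks its own documents and keeps the top from + size."""
    hits = [(score, doc_id) for doc_id, score in scores.items()]
    return heapq.nlargest(from_ + size, hits)  # sorted best first


def coordinate(shard_results: List[List[Hit]], from_: int, size: int) -> List[Hit]:
    """Step 3: the coordinating node merges the shard lists into one global ranking."""
    merged = heapq.merge(*shard_results, reverse=True)  # inputs are sorted descending
    return list(itertools.islice(merged, from_ + size))


# Hypothetical data: two shards, each holding a few scored documents.
shards = [
    {"a": 0.9, "b": 0.4, "c": 0.1, "g": 0.05},
    {"d": 0.7, "e": 0.6, "f": 0.2},
]
from_, size = 1, 2  # step 1: the coordinating node sizes its queue as from + size = 3

per_shard = [shard_query(s, from_, size) for s in shards]
global_queue = coordinate(per_shard, from_, size)
print(global_queue)  # [(0.9, 'a'), (0.7, 'd'), (0.6, 'e')]
```

Each shard has to return from + size entries rather than just size, because the coordinating node cannot know in advance which shard holds the globally best hits.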

When a search request is sent to a node, that node becomes the coordinating node. Its job is to broadcast the search request to all involved shards and to gather their responses into a globally sorted result set that it can return to the client.

The first step is to broadcast the request to a shard copy of every shard in the index. Just like document GET requests, search requests can be handled by a primary shard or by any of its replicas. This is how more replicas (when combined with more hardware) can increase search throughput. A coordinating node will round-robin through all shard copies on subsequent requests in order to spread the load.
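The round-robin idea can be illustrated with a small Python sketch. It is a toy model, not Elasticsearch's routing code: the shard numbers, the copy labels such as "P0@node1", and the helper names are hypothetical.

```python
import itertools
from typing import Dict, Iterator, List


def make_selectors(copies_by_shard: Dict[int, List[str]]) -> Dict[int, Iterator[str]]:
    """One endlessly cycling iterator per shard, over that shard's copies."""
    return {shard: itertools.cycle(copies) for shard, copies in copies_by_shard.items()}


def route(selectors: Dict[int, Iterator[str]]) -> Dict[int, str]:
    """Pick the next copy of every shard for one search request."""
    return {shard: next(copies) for shard, copies in selectors.items()}


# Hypothetical layout: two shards, each with one primary (P) and two replicas (R).
copies = {
    0: ["P0@node1", "R0@node2", "R0@node3"],
    1: ["P1@node2", "R1@node1", "R1@node3"],
}
selectors = make_selectors(copies)

first = route(selectors)    # {0: 'P0@node1', 1: 'P1@node2'}
second = route(selectors)   # {0: 'R0@node2', 1: 'R1@node1'}
third = route(selectors)    # {0: 'R0@node3', 1: 'R1@node3'}
fourth = route(selectors)   # wraps around to the same copies as the first request
```

Every request still touches each shard exactly once; only the copy that serves it changes, which is why adding replicas spreads search load.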

Each shard executes the query locally and builds a sorted priority queue of length from + size (in other words, enough results to satisfy the global search request all by itself). It returns a lightweight list of results to the coordinating node, containing just the doc IDs and any values required for sorting, such as the _score.

The coordinating node merges these shard-level results into its own sorted priority queue, which represents the globally sorted result set. Here the query phase ends.

An index can consist of one or more primary shards, so a search request against a single index needs to be able to combine the results from multiple shards. A search against multiple or all indices works in exactly the same way; there are just more shards involved.
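As a rough worked example of "just more shards involved", the sketch below counts the most entries the coordinating node could receive in one query phase. The function name and the shard counts are hypothetical; the figure is an upper bound, since a shard with fewer matches returns fewer entries.

```python
from typing import List


def max_query_phase_entries(primary_shards_per_index: List[int], from_: int, size: int) -> int:
    """Upper bound on (doc ID, sort value) entries sent to the coordinating node."""
    per_shard = from_ + size                  # each shard returns at most from + size
    total_shards = sum(primary_shards_per_index)
    return total_shards * per_shard


# One hypothetical index with 5 primary shards, using from=90 and size=10.
single_index = max_query_phase_entries([5], 90, 10)        # 5 * 100 = 500

# Three hypothetical indices with 5 primary shards each: same process, more shards.
three_indices = max_query_phase_entries([5, 5, 5], 90, 10)  # 15 * 100 = 1500
```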

elasticsearch, definitive guide
