1-03-12. Retrieving Multiple Documents

drscg 2017. 10. 1. 09:57

As fast as Elasticsearch is, it can be faster still. Combining multiple requests into one avoids the network overhead of processing each request individually. If you know that you need to retrieve multiple documents from Elasticsearch, it is faster to retrieve them all in a single request with the multi-get, or mget, API than to fetch them one document at a time.

The mget API expects a docs array, each element of which specifies the _index, _type, and _id metadata of the document you wish to retrieve. You can also specify a _source parameter if you just want to retrieve one or more specific fields:

GET /_mget
{
   "docs" : [
      {
         "_index" : "website",
         "_type" :  "blog",
         "_id" :    2
      },
      {
         "_index" : "website",
         "_type" :  "pageviews",
         "_id" :    1,
         "_source": "views"
      }
   ]
}
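
As a minimal sketch of how a client might assemble this body, the following Python builds the docs array from plain tuples. The helper name build_mget_body and the tuple layout are hypothetical, chosen only for illustration; the _index, _type, _id, and _source keys are the ones used in the request above.

```python
import json


def build_mget_body(specs):
    """Build an mget request body from (index, type, id, source) tuples.

    `source` may be None to request the whole document. This helper is
    illustrative only; it is not part of any Elasticsearch client.
    """
    docs = []
    for index, doc_type, doc_id, source in specs:
        doc = {"_index": index, "_type": doc_type, "_id": doc_id}
        if source is not None:
            # Restrict the returned _source to the named field(s).
            doc["_source"] = source
        docs.append(doc)
    return {"docs": docs}


body = build_mget_body([
    ("website", "blog", 2, None),
    ("website", "pageviews", 1, "views"),
])
# Serialize to the JSON that would be sent as the request body.
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the request body shown above and could be sent to /_mget with any HTTP client.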

The response body also contains a docs array, with one response per document, in the same order as specified in the request. Each of these responses is the same response body that we would expect from an individual get request:

{
   "docs" : [
      {
         "_index" :   "website",
         "_id" :      "2",
         "_type" :    "blog",
         "found" :    true,
         "_source" : {
            "text" :  "This is a piece of cake...",
            "title" : "My first external blog entry"
         },
         "_version" : 10
      },
      {
         "_index" :   "website",
         "_id" :      "1",
         "_type" :    "pageviews",
         "found" :    true,
         "_version" : 2,
         "_source" : {
            "views" : 2
         }
      }
   ]
}
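
Because the responses come back in request order, a client can match them up by position. The sketch below assumes the request and response have already been parsed into Python dictionaries; pair_by_position is a hypothetical helper name, and the data is copied from the example above.

```python
def pair_by_position(request_docs, response_docs):
    """Zip each requested doc with its response, relying on mget's ordering."""
    if len(request_docs) != len(response_docs):
        raise ValueError("request and response differ in length")
    return list(zip(request_docs, response_docs))


request_docs = [
    {"_index": "website", "_type": "blog", "_id": 2},
    {"_index": "website", "_type": "pageviews", "_id": 1, "_source": "views"},
]
response_docs = [
    {"_index": "website", "_type": "blog", "_id": "2", "found": True,
     "_version": 10,
     "_source": {"text": "This is a piece of cake...",
                 "title": "My first external blog entry"}},
    {"_index": "website", "_type": "pageviews", "_id": "1", "found": True,
     "_version": 2, "_source": {"views": 2}},
]

for requested, returned in pair_by_position(request_docs, response_docs):
    # The example response reports IDs as strings, so compare as strings.
    assert str(requested["_id"]) == returned["_id"]
    print(requested["_type"], "->", returned["_source"])
```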

If the documents you wish to retrieve are all in the same _index (and maybe even of the same _type), you can specify a default /_index or a default /_index/_type in the URL.

You can still override these values in the individual requests:

GET /website/blog/_mget
{
   "docs" : [
      { "_id" : 2 },
      { "_type" : "pageviews", "_id" :   1 }
   ]
}
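
To make the override rule concrete, here is a small Python sketch that mimics how the URL defaults combine with per-document values. It only models the behavior described above and is not how Elasticsearch implements it; resolve_targets is a hypothetical name.

```python
def resolve_targets(url_index, url_type, docs):
    """Return (index, type, id) per doc, applying URL defaults unless overridden."""
    resolved = []
    for doc in docs:
        # A value given in the doc wins over the default from the URL.
        index = doc.get("_index", url_index)
        doc_type = doc.get("_type", url_type)
        resolved.append((index, doc_type, doc["_id"]))
    return resolved


# Mirrors GET /website/blog/_mget with the two docs shown above.
targets = resolve_targets("website", "blog", [
    {"_id": 2},
    {"_type": "pageviews", "_id": 1},
])
print(targets)
```

The first document inherits both defaults (website, blog), while the second keeps the default index but uses its own type, pageviews.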

In fact, if all the documents have the same _index and _type, you can just pass an array of ids instead of the full docs array:

GET /website/blog/_mget
{
   "ids" : [ "2", "1" ]
}

Note that the second document that we requested doesn’t exist. We specified type blog, but the document with ID 1 is of type pageviews. This nonexistence is reported in the response body:

{
  "docs" : [
    {
      "_index" :   "website",
      "_type" :    "blog",
      "_id" :      "2",
      "_version" : 10,
      "found" :    true,
      "_source" : {
        "title":   "My first external blog entry",
        "text":    "This is a piece of cake..."
      }
    },
    {
      "_index" :   "website",
      "_type" :    "blog",
      "_id" :      "1",
      "found" :    false  
    }
  ]
}

In the second entry, "found" is false: this document was not found.

The fact that the second document wasn’t found didn’t affect the retrieval of the first document. Each document is retrieved and reported on individually.

The HTTP status code for the preceding request is 200, even though one document wasn’t found. In fact, it would still be 200 if none of the requested documents were found, because the mget request itself completed successfully. To determine the success or failure of the individual documents, you need to check the found flag.
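
Since the status code says nothing about individual documents, client code has to inspect each entry. The following Python sketch splits a parsed mget response by its found flag. The function name split_by_found is hypothetical, and the response is the one shown above, written as a Python dictionary.

```python
def split_by_found(response):
    """Separate mget response entries into found and missing lists."""
    found, missing = [], []
    for doc in response["docs"]:
        # Treat an absent or false "found" flag as a miss.
        if doc.get("found"):
            found.append(doc)
        else:
            missing.append(doc)
    return found, missing


response = {
    "docs": [
        {"_index": "website", "_type": "blog", "_id": "2", "_version": 10,
         "found": True,
         "_source": {"title": "My first external blog entry",
                     "text": "This is a piece of cake..."}},
        {"_index": "website", "_type": "blog", "_id": "1", "found": False},
    ]
}

found, missing = split_by_found(response)
print("found:", [doc["_id"] for doc in found])
print("missing:", [doc["_id"] for doc in missing])
```

A caller could then retry, log, or ignore the missing IDs without treating the whole request as a failure.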
