4-10-4. Limiting Memory Usage

drscg 2017. 9. 23. 22:31

Once analyzed strings have been loaded into fielddata, they will sit there until evicted (or your node crashes). For that reason it is important to keep an eye on this memory usage, understand how and when it loads, and how you can limit the impact on your cluster.

Fielddata is loaded lazily. If you never aggregate on an analyzed string, you’ll never load fielddata into memory. Furthermore, fielddata is loaded on a per-field basis, meaning only actively used fields will incur the "fielddata tax".

However, there is a subtle surprise lurking here. Suppose your query is highly selective and only returns 100 hits. Most people assume fielddata is only loaded for those 100 documents.

In reality, fielddata will be loaded for all documents in that index (for that particular field), regardless of the query’s specificity. The logic is: if you need access to documents X, Y, and Z for this query, you will probably need access to other documents in the next query.

Unlike doc values, the fielddata structure is not created at index time. Instead, it is populated on-the-fly when the query is run. This is a potentially non-trivial operation and can take some time. It is cheaper to load all the values once, and keep them in memory, than load only a portion of the total fielddata repeatedly.
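
The loading behavior described above can be pictured with a toy model. This is only a sketch: `ToyFielddata` and its methods are hypothetical names invented for illustration, not Elasticsearch internals. It shows that the first access to a field builds the structure for every document in the index, while fields that are never touched stay unloaded.

```python
class ToyFielddata:
    """Toy model: fielddata is built lazily, per field, for the whole index."""

    def __init__(self, docs):
        self.docs = docs      # doc_id -> {field: value}
        self.loaded = {}      # field -> {doc_id: value}

    def values_for(self, field, hit_ids):
        # First access: build the structure for every document in the index,
        # not just for the hits of the current query.
        if field not in self.loaded:
            self.loaded[field] = {
                doc_id: doc[field]
                for doc_id, doc in self.docs.items()
                if field in doc
            }
        column = self.loaded[field]
        return [column[doc_id] for doc_id in hit_ids if doc_id in column]


docs = {i: {"tag": f"t{i % 3}", "body": "..."} for i in range(1000)}
fd = ToyFielddata(docs)
hits = fd.values_for("tag", hit_ids=[1, 2, 3])

print(len(fd.loaded["tag"]))  # entries held for "tag" after a 3-hit query
print("body" in fd.loaded)    # "body" was never aggregated on
```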

The JVM heap is a limited resource that should be used wisely. A number of mechanisms exist to limit the impact of fielddata on heap usage. These limits are important because abuse of the heap will cause node instability (thanks to slow garbage collections) or even node death (with an OutOfMemory exception).

Choosing a Heap Size

There are two rules to apply when setting the Elasticsearch heap size, with the $ES_HEAP_SIZE environment variable:

No more than 50% of available RAM

Lucene makes good use of the filesystem caches, which are managed by the kernel. Without enough filesystem cache space, performance will suffer. Furthermore, the more memory you dedicate to the heap, the less remains available for all your other fields that use doc values.

No more than 32 GB

If the heap is less than 32 GB, the JVM can use compressed pointers, which saves a lot of memory: 4 bytes per pointer instead of 8 bytes.

For a longer and more complete discussion of heap sizing, see Heap: Sizing and Swapping
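
As a rough illustration, the two rules can be combined into one calculation. This is a minimal sketch: the function name is hypothetical, and the 31 GiB ceiling is an assumption chosen only to stay safely under the 32 GB boundary mentioned above, not a value taken from the text.

```python
GIB = 1024 ** 3


def suggested_heap_bytes(total_ram_bytes, ceiling_bytes=31 * GIB):
    """Apply both rules: at most half of RAM, and below the 32 GB boundary.

    ceiling_bytes is an assumed safety margin under 32 GB, so that the JVM
    can keep using compressed pointers.
    """
    half_of_ram = total_ram_bytes // 2
    return min(half_of_ram, ceiling_bytes)


for ram_gib in (16, 64, 128):
    heap = suggested_heap_bytes(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM -> {heap // GIB} GiB heap")
```

On a machine with plenty of RAM the 32 GB rule becomes the binding one, and everything above the heap is left to the filesystem cache.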

Fielddata Size

The indices.fielddata.cache.size setting controls how much heap space is allocated to fielddata. As you issue queries, an aggregation on an analyzed string loads that field's values into fielddata if the field wasn't previously loaded. If the resulting fielddata size would exceed the specified size, other values will be evicted in order to make space.

By default, this setting is unbounded—Elasticsearch will never evict data from fielddata.

This default was chosen deliberately: fielddata is not a transient cache. It is an in-memory data structure that must be accessible for fast execution, and it is expensive to build. If you have to reload data for every request, performance is going to be awful.

A bounded size forces the data structure to evict data. We will look at when to set this value, but first a warning:

This setting is a safeguard, not a solution for insufficient memory.

If you don’t have enough memory to keep your fielddata resident in memory, Elasticsearch will constantly have to reload data from disk, and evict other data to make space. Evictions cause heavy disk I/O and generate a large amount of garbage in memory, which must be garbage collected later on.

Imagine that you are indexing logs, using a new index every day. Normally you are interested in data from only the last day or two. Although you keep older indices around, you seldom need to query them. However, with the default settings, the fielddata from the old indices is never evicted! Fielddata will just keep on growing until you trip the fielddata circuit breaker (see Circuit Breaker), which will prevent you from loading any more fielddata.

At that point, you’re stuck. While you can still run queries that access fielddata from the old indices, you can’t load any new values. Instead, we should evict old values to make space for the new values.

To prevent this scenario, place an upper limit on the fielddata by adding this setting to the config/elasticsearch.yml file:

```
indices.fielddata.cache.size: 20%
```

This can be set to a percentage of the heap size, or to a concrete value such as 5gb.

With this setting in place, the least recently used fielddata will be evicted to make space for newly loaded data.
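
The eviction policy can be sketched as a size-bounded, least-recently-used map. This is a toy model under stated assumptions: `LruFielddataCache`, the `(index, field)` keys, and the byte sizes are all hypothetical, and real fielddata bookkeeping is more involved. As in the description above, the size is checked after a new entry has been loaded.

```python
from collections import OrderedDict


class LruFielddataCache:
    """Toy size-bounded cache that evicts the least recently used entry."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.entries = OrderedDict()  # (index, field) -> size in bytes
        self.used_bytes = 0
        self.evictions = 0

    def access(self, key, size_bytes):
        if key in self.entries:
            # Already resident: just mark it as most recently used.
            self.entries.move_to_end(key)
            return
        # Load first, then check the size and evict the oldest entries.
        self.entries[key] = size_bytes
        self.used_bytes += size_bytes
        while self.used_bytes > self.max_bytes and len(self.entries) > 1:
            _, freed = self.entries.popitem(last=False)
            self.used_bytes -= freed
            self.evictions += 1


cache = LruFielddataCache(max_bytes=100)
cache.access(("logs-01", "host"), 40)
cache.access(("logs-02", "host"), 40)
cache.access(("logs-01", "host"), 40)  # touch logs-01 again
cache.access(("logs-03", "host"), 40)  # over the limit: logs-02 is evicted
print(list(cache.entries), cache.evictions)
```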

There is another setting that you may see online: indices.fielddata.cache.expire.

We beg that you never use this setting! It will likely be deprecated in the future.

This setting tells Elasticsearch to evict values from fielddata if they are older than expire, whether the values are being used or not.

This is terrible for performance. Evictions are costly, and this effectively schedules evictions on purpose, for no real gain.

There isn’t a good reason to use this setting; we literally cannot theory-craft a hypothetically useful situation. It exists only for backward compatibility at the moment. We mention the setting in this book only since, sadly, it has been recommended in various articles on the Internet as a good performance tip.

It is not. Never use it!

Monitoring Fielddata

It is important to keep a close watch on how much memory is being used by fielddata, and whether any data is being evicted. High eviction counts can indicate a serious resource issue and a reason for poor performance.

Fielddata usage can be monitored:

per-index using the indices-stats API:

```
GET /_stats/fielddata?fields=*
```

per-node using the nodes-stats API:

```
GET /_nodes/stats/indices/fielddata?fields=*
```

Or even per-index per-node:

```
GET /_nodes/stats/indices/fielddata?level=indices&fields=*
```

By setting ?fields=*, the memory usage is broken down for each field.
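
A response from the per-node call could be condensed along these lines. This is a sketch, not a client: the `sample` document is made up, and the key names in it (`memory_size_in_bytes`, `evictions`, `fields`) are assumptions about the response shape that should be checked against the output of your own cluster version.

```python
def summarize_fielddata(nodes_stats):
    """Collect fielddata memory, evictions, and per-field usage for each node."""
    summary = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        fields = fielddata.get("fields", {})
        summary[node.get("name", node_id)] = {
            "memory_bytes": fielddata.get("memory_size_in_bytes", 0),
            "evictions": fielddata.get("evictions", 0),
            "by_field": {
                name: stats.get("memory_size_in_bytes", 0)
                for name, stats in fields.items()
            },
        }
    return summary


# Hypothetical response body, trimmed to the parts used above.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "node-a",
            "indices": {
                "fielddata": {
                    "memory_size_in_bytes": 3000,
                    "evictions": 2,
                    "fields": {
                        "tag": {"memory_size_in_bytes": 2000},
                        "host": {"memory_size_in_bytes": 1000},
                    },
                }
            },
        }
    }
}

summary = summarize_fielddata(sample)
print(summary)
```

A steadily rising `evictions` value between two samples is the signal to look for.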

Circuit Breaker

An astute reader might have noticed a problem with the fielddata size settings. Fielddata size is checked after the data is loaded. What happens if a query arrives that tries to load more into fielddata than available memory? The answer is ugly: you would get an OutOfMemoryException.

Elasticsearch includes a fielddata circuit breaker that is designed to deal with this situation. The circuit breaker estimates the memory requirements of a query by introspecting the fields involved (their type, cardinality, size, and so forth). It then checks to see whether loading the required fielddata would push the total fielddata size over the configured percentage of the heap.

If the estimated query size is larger than the limit, the circuit breaker is tripped and the query will be aborted and return an exception. This happens before data is loaded, which means that you won’t hit an OutOfMemoryException.
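
The check-before-load idea can be sketched in a few lines. The class and exception names here are hypothetical, and the estimate is simply passed in as a number; how Elasticsearch actually derives it from field type and cardinality is not modeled.

```python
class CircuitBreakingError(Exception):
    """Raised instead of loading data that would exceed the limit."""


class FielddataBreaker:
    def __init__(self, heap_bytes, limit_percent):
        # The limit is a share of the total heap, not of the free heap.
        self.limit_bytes = heap_bytes * limit_percent // 100
        self.used_bytes = 0

    def reserve(self, estimated_bytes):
        # Check the estimate before anything is loaded.
        if self.used_bytes + estimated_bytes > self.limit_bytes:
            raise CircuitBreakingError(
                f"would use {self.used_bytes + estimated_bytes} bytes, "
                f"limit is {self.limit_bytes}"
            )
        self.used_bytes += estimated_bytes


breaker = FielddataBreaker(heap_bytes=1000, limit_percent=60)
breaker.reserve(500)      # fits under the 600-byte limit
try:
    breaker.reserve(200)  # 700 > 600: rejected before loading
    tripped = False
except CircuitBreakingError as err:
    tripped = True
    print("query aborted:", err)
```

The rejected request leaves `used_bytes` unchanged, which is the point: the application gets an exception it can handle, and the node keeps its memory.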

Available Circuit Breakers

Elasticsearch has a family of circuit breakers, all of which work to ensure that memory limits are not exceeded:

indices.breaker.fielddata.limit

The fielddata circuit breaker limits the size of fielddata to 60% of the heap, by default.

indices.breaker.request.limit

The request circuit breaker estimates the size of structures required to complete other parts of a request, such as creating aggregation buckets, and limits them to 40% of the heap, by default.

indices.breaker.total.limit

The total circuit breaker wraps the request and fielddata circuit breakers to ensure that the combination of the two doesn’t use more than 70% of the heap by default.
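
How the three limits interact can be shown with a small calculation. This is an illustrative sketch with a hypothetical function name; the default percentages are the ones listed above, and the byte figures are invented.

```python
DEFAULT_LIMITS = {"fielddata": 60, "request": 40, "total": 70}  # percent of heap


def tripped_breakers(heap_bytes, fielddata_bytes, request_bytes, limits=None):
    """Return the names of the breakers whose limit would be exceeded."""
    limits = limits or DEFAULT_LIMITS
    usage = {
        "fielddata": fielddata_bytes,
        "request": request_bytes,
        "total": fielddata_bytes + request_bytes,
    }
    return [
        name
        for name in ("fielddata", "request", "total")
        if usage[name] * 100 > heap_bytes * limits[name]
    ]


# 50% fielddata and 30% request are each fine, but together they are 80%.
print(tripped_breakers(heap_bytes=1000, fielddata_bytes=500, request_bytes=300))
```

The example shows why the total breaker exists: two estimates can each stay under their own limit while their sum does not.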

The circuit breaker limits can be specified in the config/elasticsearch.yml file, or can be updated dynamically on a live cluster:

```
PUT /_cluster/settings
{
  "persistent" : {
    "indices.breaker.fielddata.limit" : "40%"
  }
}
```

The limit is expressed as a percentage of the heap.
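
The same update could be prepared from a script using only the Python standard library. This is a sketch: the helper name is hypothetical and the base URL `http://localhost:9200` is an assumption about where the cluster listens. The request is only constructed here, not sent.

```python
import json
import urllib.request


def breaker_update_request(limit, base_url="http://localhost:9200"):
    """Build (but do not send) the PUT request for the fielddata breaker limit."""
    body = json.dumps(
        {"persistent": {"indices.breaker.fielddata.limit": limit}}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = breaker_update_request("40%")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a running cluster: urllib.request.urlopen(req)
```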

It is best to configure the circuit breaker with a relatively conservative value. Remember that fielddata needs to share the heap with the request circuit breaker, the indexing memory buffer, the filter cache, Lucene data structures for open indices, and various other transient data structures. For this reason, it defaults to a fairly conservative 60%. Overly optimistic settings can cause potential OOM exceptions, which will take down an entire node.

On the other hand, an overly conservative value will simply return a query exception that can be handled by your application. An exception is better than a crash. These exceptions should also encourage you to reassess your query: why does a single query need more than 60% of the heap?

In Fielddata Size, we spoke about adding a limit to the size of fielddata, to ensure that old unused fielddata can be evicted. The relationship between indices.fielddata.cache.size and indices.breaker.fielddata.limit is an important one. If the circuit-breaker limit is lower than the cache size, no data will ever be evicted, because the breaker rejects new loads before the cache grows large enough to trigger an eviction. In order for eviction to work properly, the circuit breaker limit must be higher than the cache size.
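
A configuration check for this relationship might look as follows. This is a sketch with hypothetical helper names, and the unit parsing is deliberately simplified (percentages plus b, kb, mb, gb); it is not the parser Elasticsearch uses.

```python
import re

_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def to_bytes(setting, heap_bytes):
    """Convert a value such as '20%' or '5gb' to bytes (simplified parser)."""
    value = setting.strip().lower()
    if value.endswith("%"):
        return heap_bytes * float(value[:-1]) / 100
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(b|kb|mb|gb)", value)
    if match is None:
        raise ValueError(f"unrecognized size: {setting!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def eviction_can_happen(cache_size, breaker_limit, heap_bytes):
    """True if the cache fills up before the circuit breaker trips."""
    return to_bytes(cache_size, heap_bytes) < to_bytes(breaker_limit, heap_bytes)


heap = 8 * 1024 ** 3  # an 8 GiB heap, chosen for the example
print(eviction_can_happen("20%", "60%", heap))  # cache limit below breaker
print(eviction_can_happen("5gb", "40%", heap))  # 5 GiB cache vs 3.2 GiB breaker
```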

It is important to note that the circuit breaker compares estimated query size against the total heap size, not against the actual amount of heap memory used. This is done for a variety of technical reasons (for example, the heap may look full but is actually just garbage waiting to be collected, which is hard to estimate properly). For you as the end user, this means the setting needs to be conservative, since it is comparing against total heap, not free heap.
