6-4-10. User-Based Data

drscg 2017. 9. 23. 12:58

Users often start with Elasticsearch because they need to add full-text search or analytics to an existing application. They create a single index that holds all of their documents. Gradually, others in the company see how much benefit Elasticsearch brings and want to add their own data to it as well.

Fortunately, Elasticsearch supports multitenancy, so each new user can have their own index in the same cluster. Occasionally somebody will want to search the documents of all users, which they can do by searching across all indices, but most of the time users are interested only in their own documents.
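As a rough illustration of the index-per-user idea, the sketch below builds the REST path a search request would be sent to. The `user_` prefix and the helper names are hypothetical and do not come from the original text. `_all` is the Elasticsearch name that targets every index in the cluster.

```python
from typing import Optional


def user_index(user_id: str) -> str:
    """Index name for one tenant. The 'user_' prefix is a hypothetical
    naming scheme; index names must be lowercase, hence lower()."""
    return f"user_{user_id.lower()}"


def search_path(user_id: Optional[str] = None) -> str:
    """Return the REST path for a search request.

    With a user_id, the search is limited to that user's own index.
    Without one, the search runs across all indices via `_all`.
    """
    target = user_index(user_id) if user_id is not None else "_all"
    return f"/{target}/_search"


if __name__ == "__main__":
    print(search_path("alice"))  # the common case: one user's documents
    print(search_path())         # the occasional search across all users
```

With a shared prefix like this, a wildcard pattern such as `user_*` could replace `_all` if the cluster also holds indices that are not per-user.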

Some users have more documents than others, and some have heavier search loads, so the ability to specify the number of primary shards and replica shards for each index fits well with the index-per-user model. Similarly, busier indices can be allocated to more powerful nodes with shard allocation filtering. (See Migrate Old Indices.)
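The per-index settings mentioned above could be sketched as two request bodies: one for creating an index with its own shard counts, and one for pinning an existing index to particular nodes. The helper names are hypothetical, and `box_type` is assumed to be a custom node attribute configured when the nodes were started. Nothing here talks to a cluster; the functions only build the JSON.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Body for `PUT /<index>` when creating a per-user index.

    number_of_shards is fixed at creation time; number_of_replicas
    can be changed later through the settings API.
    """
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def allocation_filter(box_type: str) -> dict:
    """Body for `PUT /<index>/_settings` that restricts the index to nodes
    whose (assumed) custom attribute `box_type` matches the given value."""
    return {"index.routing.allocation.include.box_type": box_type}


if __name__ == "__main__":
    # A small user: one primary shard, one replica.
    print(json.dumps(index_settings(1, 1), indent=2))
    # A busy user: move the index to the more powerful machines.
    print(json.dumps(allocation_filter("strong"), indent=2))
```

Keeping the two bodies separate mirrors the fact that the shard count is decided once, while the allocation filter can be updated whenever a user's load changes.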

Don't just use the default number of primary shards for every index. Think about how much data that index needs to hold. It may be that all you need is one shard, and any more is a waste of resources.
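One way to make "think about how much data the index needs to hold" concrete is a back-of-the-envelope calculation like the one below. It is only a sketch: the function name is hypothetical, and the default of 30 GB per shard is a placeholder assumption, not a figure from the original text. The real capacity of a shard depends on hardware, documents, and queries, and has to be measured.

```python
import math


def primary_shards_for(expected_gb: float, target_shard_gb: float = 30.0) -> int:
    """Estimate how many primary shards an index needs.

    expected_gb     -- data the index is expected to hold, in GB
    target_shard_gb -- assumed comfortable size of one shard (placeholder)

    The result is never lower than 1, because every index needs
    at least one primary shard.
    """
    if expected_gb < 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be non-negative, target must be positive")
    return max(1, math.ceil(expected_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (2, 30, 100):
        print(size_gb, "GB ->", primary_shards_for(size_gb), "primary shard(s)")
```

For a small user the estimate stays at a single shard, which is the case the tip above has in mind.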

Most users of Elasticsearch can stop here. A simple index-per-user approach is sufficient for the majority of cases.

In exceptional cases, you may find that you need to support a large number of users, all with similar needs. An example might be hosting a search engine for thousands of email forums. Some forums may have a huge amount of traffic, but the majority are quite small. Dedicating an index with a single shard to a small forum is overkill: a single shard could hold the data for many forums.

What we need is a way to share resources across users, giving the impression that each user has their own index without wasting resources on small users.

