
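Elasticsearch faces many challenges. But what happens when you want to use it to store logs in bulk? And, on top of that, to survive the failure of any one of several data centers without trouble? What kind of architecture is worth building, and what pitfalls will you run into?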
elasticsearch -, : , .
— , . , Manticore Search, Sphinx search, Elasticsearch. , - ...search, . .
, , Elasticsearch. , - , -, .
:
- Graylog. , , .
- : 50-80 , - , , 2-3
- , , : .
- , , , .
- , — - .
- , : , , - (, ).
, .
-, - Elasticsearch ( ).
- 18 — , , .
: Podman , one-cloud. 2 , 2.0GHz v4 , .
:

:
- 3-4 VIP - Graylog, , .
- VIP LVS.
- Graylog, GELF, syslog.
- Elasticsearch.
- , , -.

, , .
Elasticsearch — master, coordinator, data node. , .
Master
, , , cluster wide housekeeping.
Coordinator
- : . , , , master, , .
Data node
, , , .
Graylog
- Kibana Logstash ELK-. Graylog UI . Graylog Kafka Zookeeper, Graylog . Graylog (Kafka) Elasticsearch , . Logstash, Graylog Elasticsearch.
, Graylog service discovery, Elasticsearch , .
:

. , .
, , , .
: Elasticsearch data nodes.
— , Elasticsearch. , Lucene index. Lucene index, , .

, «» -.
, ( ) -. -, , replication factor, ( ). , -, , , , .
30 .
:

- — . — primary-, . — replica-. -.
, -. , , , :

, .. , 48 ( : 48 ).
:
- , , , , . “” . “ ” , .
, - , . - , - . .
latency , SSD. , , 56 . 56 - , , Elasticsearch . Elasticsearch thread pool , - " — ".
, - 20 , 1 360 . , 48 , 15 . 2 .
, .
, Graylog - . , 2-3 .
, Graylog , : « , — ».
Master : « 71», -, primary-shard 71.
replica-shard, -.

Graylog . , Elasticsearch round-robin primary-shard replica-shard.

180 , , , , «» -. , , , .
48 300-400ms, , leading wildcard.
«» Elasticsearch: Java

, , .
, Elasticsearch Java.
, Lucene, background job', Lucene . , OutOfMemoryError-. , , , .
, Lucene- . . ( heap.size RAM), - off-heap , - ~500MB, .
: RAM , , .
4-5 , - 10-20.
, , off-heap Elasticsearch . , direct buffer pools , , explicit GC Elasticsearch.
- , . .
: Java . 16 (-XX:MaxDirectMemorySize=16g), , explicit GC , , .
, «, » , .
, mmapfs, . , mmapfs , mapped-. - , GC safepoint, , . , master , . 5-10 garbage collector, , . “, ” - .
, niofs, , Elastic , hybridfs, . .
, . 2-3 , .
Full GC, - , . GC : , , , — .
, , - , . , , .
, , - , - Elasticsearch, , .
, , , . GC , . GC, , .
, , — JDK13 Shenandoah. , .
Java .
«» Elasticsearch:

, , - .
: - «» , , Graylog es_rejected_execution.
- , thread_pool.write.queue - , Elasticsearch , 200 . Elasticsearch . - .
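One way to watch for the `es_rejected_execution` errors and the write queue mentioned above is to poll the write thread pool and flag nodes whose `rejected` counter is non-zero. The sketch below is only an illustration and is not taken from the cluster described here. It assumes the rows come from `GET _cat/thread_pool/write?format=json&h=node_name,name,active,queue,rejected`. The function name, the node names and the numbers in `sample` are all invented for the example.

```python
def nodes_with_rejections(cat_rows, pool="write"):
    """Return {node_name: rejected_count} for nodes that rejected tasks in `pool`."""
    flagged = {}
    for row in cat_rows:
        if row.get("name") != pool:
            continue
        # _cat responses in JSON format usually carry numbers as strings,
        # so convert before comparing.
        rejected = int(row.get("rejected", 0))
        if rejected > 0:
            flagged[row["node_name"]] = rejected
    return flagged


# Hypothetical rows; real values would come from the cluster.
sample = [
    {"node_name": "data-1", "name": "write", "active": "4", "queue": "0", "rejected": "0"},
    {"node_name": "data-2", "name": "write", "active": "8", "queue": "200", "rejected": "17"},
    {"node_name": "data-2", "name": "search", "active": "1", "queue": "0", "rejected": "3"},
]

print(nodes_with_rejections(sample))
```

The `rejected` counter is cumulative per node, so a real check would more likely compare two consecutive samples than look at the absolute value.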
, : 300 , , Full GC.
, , , Graylog , , 3 , . , , Elasticsearch , , ( ), , , .
, - - , Elastic, - — - Graylog.
, , : , , .
. , , , , , , .
, Elasticsearch , - round-robin (, primary-shard, , ), replica-shard, . , use_adaptive_replica_selection: true.
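The `use_adaptive_replica_selection: true` setting named above corresponds to the dynamic cluster setting `cluster.routing.use_adaptive_replica_selection`. As a minimal sketch, the request body for enabling it through the cluster settings API could be built like this. The helper name is hypothetical, and the choice between `persistent` and `transient` scope is an assumption, not something the text specifies.

```python
import json


def adaptive_replica_selection_body(enabled=True, persistent=True):
    """Build the body for PUT /_cluster/settings that toggles adaptive replica selection."""
    # Assumption: persistent scope, so the setting survives a full cluster restart.
    scope = "persistent" if persistent else "transient"
    return {scope: {"cluster.routing.use_adaptive_replica_selection": enabled}}


body = adaptive_replica_selection_body()
print("PUT /_cluster/settings")
print(json.dumps(body, indent=2))
```

With this setting enabled, the coordinating node picks the shard copy to query based on observed response times and queue sizes rather than plain round-robin.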
:

query time , .
, -.
:
- - master, .
- .
- : - - primary-, replica- -, .
- , , .
, - :

:

?
- .
?
, TaskBatcher, , . , replica primary, - - — TaskBatcher, .
- , - - « - - -».
- , . , , . , , .
, - , full GC. - , , .
, 6.4.0, , 10 - 360 , .
:

6.4.0, , - . «» . : 2, 3 10 ( , ) -, - , , , B, C, D.
- - - , - 20-30 , - .
, , , « » . , , 7.2.
, - , , , , - primary-shard ( replica-shard - primary, ).
, «», - stale . , , - , -, - - . .
- 5 . .
:
- 360 - 700 .
- 60 -.
- 40 , 6.4.0 — -, ,
- , .
- heap.size, 31 : , leading wildcard - , circuit breaker Elasticsearch.
- , , , .
, , :
- - , , - . - - , 2-3 , 2, 3, 4 — , - , .
- , pending-. , , - , , , replica- primary, - .
- garbage collector, .
- , , « ».
- , heap, RAM I/O.
Thread Pool Elasticsearch. Elasticsearch , , thread_pool.management. , , _cat/shards , . , , thread_pool.management , , 5 , , .
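Since the paragraph above ties `_cat/shards` to `thread_pool.management`, one cautious pattern is to fetch the shard listing once per monitoring interval and derive everything else locally instead of issuing many requests. The sketch below assumes rows shaped like the output of `GET _cat/shards?format=json&h=index,shard,prirep,state,node`. The function name, index names and node names are invented for illustration.

```python
from collections import Counter


def shards_per_node(cat_rows):
    """Count primary ("p") and replica ("r") shards per node from one _cat/shards response."""
    counts = {}
    for row in cat_rows:
        # Unassigned shards have no node; group them under a placeholder key.
        node = row.get("node") or "UNASSIGNED"
        counts.setdefault(node, Counter())[row["prirep"]] += 1
    return counts


# Hypothetical rows; real values would come from the cluster.
sample_shards = [
    {"index": "graylog_71", "shard": "0", "prirep": "p", "state": "STARTED", "node": "data-1"},
    {"index": "graylog_71", "shard": "0", "prirep": "r", "state": "STARTED", "node": "data-2"},
    {"index": "graylog_71", "shard": "1", "prirep": "p", "state": "STARTED", "node": "data-1"},
    {"index": "graylog_71", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "node": None},
]

for node, c in sorted(shards_per_node(sample_shards).items()):
    print(node, "primaries:", c["p"], "replicas:", c["r"])
```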
: ! , .
, - , , , , .
