200 TB+ Elasticsearch Cluster


Many people have dealt with Elasticsearch. But what happens when you want to use it to store logs "in especially large volumes"? And to survive the failure of any one of several data centers painlessly as well? What kind of architecture is worth building, and what pitfalls will you run into?


elasticsearch -, : , .


— , . , Manticore Search, Sphinx search, Elasticsearch. , - ...search, . .


, , Elasticsearch. , - , -, .



:


  • Graylog. , , .
  • : 50-80 , - , , 2-3
  • , , : .
  • , , , .
  • , — - .
  • , : , , - (, ).

, .



-, - Elasticsearch ( ).


- 18 — , , .


: Podman , one-cloud. 2 , 2.0Ghz v4 , .


:




:


  • 3-4 VIP - Graylog, , .
  • VIP LVS.
  • Graylog, GELF, syslog.
  • Elasticsearch.
  • , , -.



, , .


Elasticsearch — master, coordinator, data node. , .


Master
, , , cluster wide housekeeping.


Coordinator
- : . , , , master, , .


Data node
, , , .


Graylog
- Kibana Logstash ELK-. Graylog UI . Graylog Kafka Zookeeper, Graylog . Graylog (Kafka) Elasticsearch , . Logstash, Graylog Elasticsearch.


, Graylog service discovery, Elasticsearch , .


:



. , .



, , , .


: Elasticsearch data nodes.


— , Elasticsearch. , Lucene index. Lucene index, , .



, «» -.


, ( ) -. -, , replication factor, ( ). , -, , , , .


30 .


:



- — . — primary-, . — replica-. -.


, -. , , , :



, .. , 48 ( : 48 ).


:


- , , , , . “” . “ ” , .


, - , . - , - . .


latency , SSD. , , 56 . 56 - , , Elasticsearch . Elasitcsearch thread pool , - " — ".


, - 20 , 1 360 . , 48 , 15 . 2 .



, .


, Graylog - . , 2-3 .


, Graylog , : « , — ».


Master : « 71», -, primary-shard 71.


replica-shard, -.



Graylog . , Elasticsearch round-robin primary-shard replica-shard.



180 , , , , «» -. , , , .


48 300-400ms, , leading wildcard.


«» Elasticsearch: Java



, , .


, Elasticsearch Java.



, Lucene, background job', Lucene . , OutOfMemoryError-. , , , .


, Lucene- . . ( heap.size RAM), - off-heap , - ~500MB, .


: RAM , , .



4-5 , - 10-20.


, , off-heap Elasticsearch . , direct buffer pools , , explicit GC Elasticsearch.


- , . .


: Java . 16 (-XX:MaxDirectMemorySize=16g), , explicit GC , , .



, «, » , .


, mmapfs, . , mmapfs , mapped-. - , GC safepoint, , . , master , . 5-10 garbage collector, , . “, ” - .


, niofs, , Elastic , hybridfs, . .



, . 2-3 , .


Full GC, - , . GC : , , , — .


, , - , . , , .


, , - , - Elasticsearch, , .


, , , . GC , . GC, , .


, , — JDK13 Shenandoah. , .


Java .


«» Elasticsearch:



, , - .


: - «» , , Graylog es_rejected_execution.


- , thread_pool.write.queue - , Elasticsearch , 200 . Elasticsearch . - .


, : 300 , , Full GC.


, , , Graylog , , 3 , . , , Elasticsearch , , ( ), , , .


, - - , Elastic, - — - Graylog.


, , : , , .


. , , , , , , .


, Elasticsearch , - round-robin (, primary-shard, , ), replica-shard, . , use_adaptive_replica_selection: true.
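The `use_adaptive_replica_selection: true` option named above can be illustrated with a minimal sketch. It assumes the full dynamic setting name is `cluster.routing.use_adaptive_replica_selection` and that it is applied through the cluster settings endpoint (`PUT /_cluster/settings`). The helper name `adaptive_replica_selection_request` is hypothetical, and the snippet only builds the request rather than sending it to a cluster.

```python
import json


def adaptive_replica_selection_request(enabled: bool = True, persistent: bool = True) -> dict:
    """Build (but do not send) a cluster-settings request that toggles
    adaptive replica selection.

    Assumption: the setting name is cluster.routing.use_adaptive_replica_selection
    and it is updated via PUT /_cluster/settings.
    """
    # "persistent" settings survive a full cluster restart, "transient" ones do not.
    scope = "persistent" if persistent else "transient"
    return {
        "method": "PUT",
        "path": "/_cluster/settings",
        "body": {scope: {"cluster.routing.use_adaptive_replica_selection": enabled}},
    }


request = adaptive_replica_selection_request()
print(request["method"], request["path"])
print(json.dumps(request["body"], indent=2))
```

The resulting body can be sent with any HTTP client to a coordinator node. Keeping the construction separate from the network call makes it easy to review the exact setting before it is applied.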


:



query time , .


, -.


:


  • - master, .
  • .
  • : - - primary-, replica- -, .
  • , , .

, - :



:



?


- .


?


, TaskBatcher, , . , replica primary, - - — TaskBatcher, .


- , - - « - - -».


- , . , , . , , .


, - , full GC. - , , .


, 6.4.0, , 10 - 360 , .


:



6.4.0, , - . «» . : 2, 3 10 ( , ) -, - , , , B, C, D.


- - - , - 20-30 , - .


, , , « » . , , 7.2.


, - , , , , - primary-shard ( replica-shard - primary, ).


, «», - stale . , , - , -, - - . .


- 5 . .


:


  • 360 - 700 .
  • 60 -.
  • 40 , 6.4.0 — -, ,
  • , .
  • heap.size, 31 : , leading wildcard - , circuit breaker Elasticsearch.
  • , , , .


, , :


  • - , , - . - - , 2-3 , 2, 3, 4 — , - , .
  • , pending-. , , - , , , replica- primary, - .
  • garbage collector, .
  • , , « ».
  • , heap, RAM I/O.

Thread Pool Elasticsearch. Elasticsearch , , thread_pool.management. , , _cat/shards , . , , thread_pool.management , , 5 , , .


: ! , .


, - , , , , .


