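I suggest you read the transcript of the lecture "Hadoop. ZooKeeper" from the series "Methods for distributed processing of large volumes of data in Hadoop".
What ZooKeeper is and its place in the Hadoop ecosystem. The truth about distributed computing. The layout of a standard distributed system. Why coordinating distributed systems is hard. Typical coordination problems. The principles behind ZooKeeper's design. The ZooKeeper data model. Znode flags. Sessions. The client API. Primitives (configuration, group membership, simple locks, leader election, locking without the herd effect). ZooKeeper architecture. ZooKeeper DB. ZAB. The request processor.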

Today let's talk about ZooKeeper. It is a very useful thing. Like any Apache Hadoop product, it has a logo. The logo depicts a man.
, , , . . - - . . ZooKeeper – , . , , .
, , . , MapReduce , , . , , . MapReduce - , . MapReduce , , , . , ZooKeeper.
, Hadoop, Yahoo! Apache. HBase. JIRA HBase, , - , . . . ZooKeeper, , , , . , Hadoop. , , , .

, - , . , , . , , ZooKeeper, . . – , . HDFS, MapReduce , . , ZooKeeper. - , .

? , , , . , , , , , , - , . TCP, , . TCP . . - . . , , . , - , , . .
, , latency. , . Latency – - , , .
. – . -, , , . , . . . , , . .
. , - . , - Hadoop. . , . , - , , . - . .
, , -, . , , cat . – Vim . , . Vim , , , - . , , .

, .
, , , . – ? , . , ? , , - . - , , - , - . , -, - , ?
, . , – , - .
, , , - , – ? - race condition, , , - ? - . .
– . , , , .
, , , , . , , , . - , , , . , , . . . . - , , .
ZooKeeper. – , , , .

, . , – HDFS, HBase. -, , slave-. , , , .

– Coordination Service, . . , - backup standby , . , . backup. - , , backup. Coordination Service. , .

– , . , . - , . , - slaves, , , . .

, , -slaves, . , . , Cassandra, .
, . .

- , , , , . – , , , . - , . . .

(), , , .

– partial failures. , , - , , , , , . . .
, . – , . . , .
ZooKeeper , .

, , , , , . , shared-nothing. , , , , , .
, shared memory . context switch, . . , , .

, , , . - . , , , , , , . , , , , . , .
. , , , , . . , .
. , , .
? , ? , , , . -, , - , - . - , , - . , . . , , .
– group membership. - , – , . , , , , . , , .
– leader election, , . – , - , . , , . , , , .
– mutually exclusive access. . , , - , , , . - . , , -. , , locks.
ZooKeeper . , .

. - , - . , , - . .
.
- , , .

ZooKeeper . – standalone, . . , . – 100 , , 100 . , ZooKeeper. high availability. ZooKeeper . , . , . – , , – . , .
. , , . .
, - . - . , , , . . – . – -? . watch mechanism, , - . .

Client – , ZooKeeper.
Server – ZooKeeper.
Znode – ZooKeeper. znode ZooKeeper , .
. – update/write, - . .
, , , ZooKeeper.

ZooKeeper . , . , . znodes.
znode - , , , 10 . znode - .

Znode . . znode , .
. – ephemeral flag. Znode - . , . , . , - . , .
– sequential flag. znode. , 1_5. , p_1, – p_2. , , , , – sequential.
znode. , .
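The sequential flag described above, with names such as p_1 and p_2, can be pictured with a small in-memory model. This is only a sketch: `ToyParent` and its methods are hypothetical names, not part of any ZooKeeper client, and the counter follows the naming used in the lecture (a real ZooKeeper server appends a zero-padded ten-digit number instead).

```python
class ToyParent:
    """Toy model of one parent znode that hands out sequential child names.

    Hypothetical illustration only; this is not the ZooKeeper client API.
    """

    def __init__(self):
        self._next = 1        # per-parent counter, it only ever grows
        self.children = []    # names of the existing children

    def create(self, prefix, sequential=False):
        name = prefix
        if sequential:
            # The server, not the client, picks the numeric suffix.
            name = f"{prefix}{self._next}"
            self._next += 1
        if name in self.children:
            raise ValueError(f"{name} already exists")
        self.children.append(name)
        return name

    def delete(self, name):
        self.children.remove(name)


parent = ToyParent()
first = parent.create("p_", sequential=True)    # "p_1"
second = parent.create("p_", sequential=True)   # "p_2"
parent.delete(first)
third = parent.create("p_", sequential=True)    # "p_3": numbers are not reused
```

Because the suffix is assigned in one place, two clients that create a child with the same prefix can never end up with the same name, which is what the lock and election recipes later rely on.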

– watch flag. , - . , . ZooKeeper , . , - . , - , .
, . , , .

. , , . - .
- . , - . , , .

, API . , , create znode . znode, . - , . . znode.
– . , , znode . , znode, , , , .
, «-1».

– znode. true, , false.
flag watch, .
flag , . .
– getData. , znode . flag watch. , . , , .

SetData. version. , znode .
«-1», .
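The version argument of setData, and the special value «-1», can be sketched as a conditional update. The model below is an assumption-laden toy: `ToyZnode`, `set_data` and `BadVersion` are made-up names that only mimic the behaviour described here, where a write succeeds if the supplied version matches the znode's current version and «-1» skips the check.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale (toy stand-in)."""


class ToyZnode:
    """Toy znode holding data plus a version counter; not the real client."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def set_data(self, data, version):
        # -1 means "do not check the version", any other value must match.
        if version != -1 and version != self.version:
            raise BadVersion(f"expected {version}, current {self.version}")
        self.data = data
        self.version += 1
        return self.version


node = ToyZnode(b"a")
node.set_data(b"b", version=0)        # matches, version becomes 1
try:
    node.set_data(b"c", version=0)    # stale version, rejected
    rejected = False
except BadVersion:
    rejected = True
node.set_data(b"d", version=-1)       # unconditional, version becomes 2
```

A client that reads the data together with its version and writes back with that version gets an optimistic compare-and-set: the write fails if somebody else updated the znode in between.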
– getChildren. znode, . , flag watch.
sync , , .
, , , write, - , , , . , , , , .

. . , .
Operations that change state are update/write operations: create, setData, sync, delete. The read operations are exists, getData, getChildren.

, . , -. . , . . ZooKeeper? , ? , , , , ?
ZooKeeper . , znode. , , . , . , , .
getData . true. - , , , , . , - , true. , .
SetData. , «-1», . . , , , . , . , , , . , , , - . , , . , , . .
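The configuration primitive (workers call getData with the watch flag set to true, the administrator calls setData with «-1») might look roughly like this in a toy model. Everything named here (`ToyConfigNode`, `Worker`, `_reload`) is hypothetical, and the sketch assumes watches are one-shot, so the worker re-arms the watch every time it re-reads the data.

```python
class ToyConfigNode:
    """Toy znode holding configuration; watches fire once and are dropped."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        # One-shot semantics: clear the list before notifying.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback()


class Worker:
    """Reads the config at start-up and again after every notification."""

    def __init__(self, node):
        self.node = node
        self.seen = []
        self._reload()

    def _reload(self):
        # Read the data and re-register the watch in the same call.
        self.seen.append(self.node.get_data(watcher=self._reload))


node = ToyConfigNode("v1")
worker = Worker(node)
node.set_data("v2")
node.set_data("v3")
# worker.seen is now ["v1", "v2", "v3"]
```

Reading and re-arming in a single call matters: if the two were separate steps, an update arriving between them would go unnoticed.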

– group membership. , , . , . -, - , .
? workers create. . sequential , . children , , .
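As a rough illustration of group membership, the sketch below has each worker create a sequential child under a group node and reads the member list back as the node's children. It assumes the children are ephemeral, so they vanish when the owning session ends; `ToyGroup`, `join`, `get_children` and `expire_session` are hypothetical names for this toy, not ZooKeeper calls.

```python
class ToyGroup:
    """Toy group znode: children are ephemeral and sequentially named."""

    def __init__(self):
        self._members = {}    # child name -> id of the session that owns it
        self._next = 1

    def join(self, session_id):
        # Corresponds to a worker creating its own child under the group.
        name = f"worker_{self._next}"
        self._next += 1
        self._members[name] = session_id
        return name

    def get_children(self):
        return sorted(self._members)

    def expire_session(self, session_id):
        # Ephemeral children disappear together with their session.
        self._members = {
            name: owner
            for name, owner in self._members.items()
            if owner != session_id
        }


group = ToyGroup()
group.join("session-a")
group.join("session-b")
group.join("session-c")
group.expire_session("session-b")     # worker b crashed or disconnected
alive = group.get_children()          # ["worker_1", "worker_3"]
```

Nobody has to deregister a dead worker explicitly: the list of children is the list of live members.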


, Java-. , main. , . host, , . . . – .
? API, . . ZooKeeper. hosts. , , 5 . , connectedSignal. , . , - . persistent. , , . . . , . , close , . , - , ZooKeeper .

- ? . , - , . , , lock1. , , lock . , getData , . . , watcher , , . . , lock, , lock , , . . . , . lock , , - .

, . , . ? , . . , . . - , . , .
- , , , . , . - , , .
, , herd effect, . . , , , , , .

, . , lock, herd effect. , id lock. , lock , , , , lock. , lock. , id, lock, . , .
id, lock, watcher , - . . . lock. , id lock, . , lock, - .
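One common way to avoid the herd effect is for every waiting client to watch only the id directly before its own, so that a release wakes exactly one client. The toy model below sketches that scheme under stated assumptions: `ToyLock` and its methods are hypothetical, ids are handed out in arrival order, and a notified client re-checks its position before taking the lock.

```python
class ToyLock:
    """Toy lock queue where each waiter watches only its predecessor."""

    def __init__(self):
        self._next = 1
        self.queue = []           # ids currently in line, ascending
        self.watchers = {}        # watched id -> callback of its successor
        self.notifications = 0    # how many watch events were delivered

    def _wait_or_grant(self, my_id, on_granted):
        if my_id not in self.queue:
            return                # this client already left the queue
        position = self.queue.index(my_id)
        if position == 0:
            on_granted(my_id)     # lowest id holds the lock
        else:
            predecessor = self.queue[position - 1]
            self.watchers[predecessor] = (
                lambda: self._wait_or_grant(my_id, on_granted)
            )

    def acquire(self, on_granted):
        my_id = self._next
        self._next += 1
        self.queue.append(my_id)
        self._wait_or_grant(my_id, on_granted)
        return my_id

    def release(self, my_id):
        self.queue.remove(my_id)
        callback = self.watchers.pop(my_id, None)
        if callback is not None:
            self.notifications += 1
            callback()


granted = []
lock = ToyLock()
a = lock.acquire(granted.append)   # id 1 gets the lock immediately
b = lock.acquire(granted.append)   # id 2 watches id 1
c = lock.acquire(granted.append)   # id 3 watches id 2
lock.release(a)                    # only id 2 is notified
```

With the simple lock every waiter watches the same znode, so each release notifies all of them and all of them retry at once; here the number of notifications per release stays at one regardless of how many clients are waiting.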

ZooKeeper? 4 . – Request. ZooKeeper Atomic Broadcast. Commit Log, . In-memory Replicated DB, . . , .
, Request Processor. In-memory .

. instances ZooKeeper .
, Commit log. , , , log . .

ZooKeeper Atomic Broadcast – , .
ZAB ZooKeeper. , - . , . , . , , , . . broadcasting , .
write request. , transactional update. .
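It is worth mentioning here that updates for the same operation are guaranteed to be idempotent. What does that mean? If you apply the update twice, you get the same state; applying the request again does not change anything further. This is needed so that after a crash the operation can be restarted, replaying the changes that were lost at that moment. The state of the system will then be the same; that is, a series of identical operations, such as updates, must not lead to different final states of the system.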
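The difference between an idempotent update and a non-idempotent one shows up when a log is replayed twice, as can happen after a crash. In this sketch the log format, `apply_txn`, `apply_increment` and `replay` are all assumptions made for illustration; the only point is that a transaction carrying the absolute result ("set /counter to 2 at version 2") can be reapplied safely, while a relative request ("add 1") cannot.

```python
def apply_txn(state, txn):
    """Idempotent: install the absolute data and version for a path."""
    path, data, version = txn
    new_state = dict(state)
    new_state[path] = (data, version)
    return new_state


def apply_increment(state, path):
    """Not idempotent: the result depends on how often it is applied."""
    data, version = state.get(path, (0, 0))
    new_state = dict(state)
    new_state[path] = (data + 1, version + 1)
    return new_state


def replay(state, log, apply):
    for entry in log:
        state = apply(state, entry)
    return state


# Two "add 1" requests, recorded as transactions with their final values.
txn_log = [("/counter", 1, 1), ("/counter", 2, 2)]
once = replay({}, txn_log, apply_txn)
twice = replay(once, txn_log, apply_txn)       # log replayed after a crash

# The same two requests recorded as relative increments.
inc_log = ["/counter", "/counter"]
inc_once = replay({}, inc_log, apply_increment)
inc_twice = replay(inc_once, inc_log, apply_increment)
```

Here `once` and `twice` are equal, whereas `inc_twice` has drifted away from `inc_once`, which is why recording the resulting state rather than the original request makes recovery by replay safe.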







