
How-to: Use Apache ZooKeeper to Build Distributed Apps (and Why)

by Sean Mackrory

February 14, 2013


It's widely accepted that you should never design or implement your own cryptographic algorithms but
rather use well-tested, peer-reviewed libraries instead. The same can be said of distributed systems:
Making up your own protocols for coordinating a cluster will almost certainly result in frustration and
failure.
Architecting a distributed system is not a trivial problem; it is very prone to race conditions, deadlocks, and
inconsistency. Making cluster coordination fast and scalable is just as hard as making it reliable. That's
where Apache ZooKeeper, a coordination service that gives you the tools you need to write correct
distributed applications, comes in handy.
With ZooKeeper, these difficult problems are solved once, allowing you to build your application without
trying to reinvent the wheel. ZooKeeper is already used by Apache HBase, HDFS, and other Apache
Hadoop projects to provide highly-available services and, in general, to make distributed programming
easier. In this blog post you'll learn how you can use ZooKeeper to easily and safely implement important
features in your distributed software.

How ZooKeeper Works


ZooKeeper runs on a cluster of servers called an ensemble that share the state of your data. (These may
be the same machines that are running other Hadoop services or a separate cluster.) Whenever a change
is made, it is not considered successful until it has been written to a quorum (a majority) of the servers
in the ensemble. A leader is elected within the ensemble, and if two conflicting changes are made at the
same time, the one that is processed by the leader first will succeed and the other will fail. ZooKeeper
guarantees that writes from the same client will be processed in the order they were sent by that client.
This guarantee, along with other features discussed below, allows the system to be used to implement
locks, queues, and other important primitives for distributed computing. The outcome of a write operation
allows a node to be certain that an identical write has not succeeded for any other node.
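This first-write-wins outcome can be illustrated with a minimal in-memory simulation (plain Python; the class and names here are mine for illustration, not the ZooKeeper API): two clients race to create the same node, and because the leader serializes writes, exactly one succeeds and the loser learns from the return value that another node already won.

```python
class SimulatedZNodeStore:
    """Toy stand-in for a ZooKeeper ensemble's data tree. The real
    leader serializes all writes, so at most one create() per path
    can succeed; we model that with a simple dict."""

    def __init__(self):
        self._nodes = {}

    def create(self, path, data):
        # Like ZooKeeper's create operation: it fails if the node
        # already exists, so the outcome alone tells the caller
        # whether its write (and not some other client's) succeeded.
        if path in self._nodes:
            return False
        self._nodes[path] = data
        return True


store = SimulatedZNodeStore()
won_a = store.create("/election/leader", b"client-a")  # first write wins
won_b = store.create("/election/leader", b"client-b")  # conflicting write fails
print(won_a, won_b)  # True False
```

This is exactly the primitive behind ZooKeeper-based leader election and locks: every candidate attempts the same create, and whichever write the leader processes first wins.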
A consequence of the way ZooKeeper works is that a server will disconnect all client sessions any time it
has not been able to connect to the quorum for longer than a configurable timeout. The server has no way
to tell if the other servers are actually down or if it has just been separated from them due to a network
partition, and can therefore no longer guarantee consistency with the rest of the ensemble. As long as
more than half of the ensemble is up, the cluster can continue service despite individual server failures.
When a failed server is brought back online it is synchronized with the rest of the ensemble and can
resume service.
It is best to run your ZooKeeper ensemble with an odd number of servers; typical ensemble sizes are
three, five, or seven. For instance, if you run five servers, the cluster remains available with two of them
down (so you can have one server down for maintenance and still survive an unexpected failure). If you
run six servers, the cluster still tolerates only two failures, since four servers are required for a majority,
yet the chance of three simultaneous failures is now slightly higher, so the sixth server buys you nothing.
Also remember that as you add more servers you may be able to tolerate more failures, but write
throughput may drop, because every write must be acknowledged by a larger quorum. (Apache's
documentation has a nice illustration of the performance characteristics of various ZooKeeper ensemble
sizes.)
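The odd-versus-even trade-off can be checked with a little arithmetic (a quick sketch; the function names are mine, not part of ZooKeeper):

```python
def quorum_size(ensemble_size):
    # A write must be acknowledged by a strict majority of the ensemble.
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size):
    # How many servers can fail while a majority remains reachable.
    return ensemble_size - quorum_size(ensemble_size)

for n in (3, 4, 5, 6, 7):
    print(f"{n} servers: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

An ensemble of five tolerates two failures, and an ensemble of six also tolerates only two, which is why adding an even server is all cost and no benefit.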
