Pacuna's Blog

Running a Zookeeper Ensemble on AWS

In this post I’ll describe the steps to deploy a Zookeeper ensemble on AWS. I won’t be using any tool to automate the process, just plain old manual configuration. I’m writing this because while setting up the cluster I ran into some issues that weren’t covered in other posts I saw. So if you are trying to run your own Zookeeper cluster on AWS and you want to learn how to configure it manually, this post might be useful for you.

Prerequisites

I’m assuming you’re comfortable working with AWS and Linux, and that you already have an AWS account to work with. I won’t be covering basic AWS stuff. Also, this post is about the deployment, not about Zookeeper itself, so if you want to learn more about the tool, you can visit the official site.

Launching the instances

One thing I don’t enjoy too much is configuring Java on a new machine. There’s an Amazon Linux image (AMI) that comes with Java already installed and configured. In the image list, make sure you pick that one (there’s another one that doesn’t come with Java preinstalled).

Select that image and choose the instance type you want to use. I’ll be using m4.large in case you want to follow along.

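Let’s launch a 3-node cluster. The recommended setting for a non-standalone Zookeeper deployment is to always use an odd number of nodes. This is because the ensemble stays available only while a majority of its nodes are up, so an even-sized ensemble costs an extra machine without tolerating any more failures: 3 nodes survive the loss of 1, and so do 4.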
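The arithmetic behind that recommendation can be sketched in a few lines of shell. This is only an illustration of the majority rule; the function name tolerated_failures is made up for this example and is not part of Zookeeper.

```shell
# An ensemble of n nodes needs a strict majority, floor(n/2) + 1, to be up.
# Whatever is left over is the number of nodes it can afford to lose.
tolerated_failures() {
  local n=$1
  local quorum=$(( n / 2 + 1 ))
  echo $(( n - quorum ))
}

for n in 3 4 5 6; do
  echo "nodes=$n quorum=$(( n / 2 + 1 )) tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 nodes, or from 5 to 6, raises the quorum size but not the number of failures the ensemble can absorb.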

I’ll leave the default values for the rest of the configuration. You can also configure the storage, but since this is just a proof of concept, I’ll stick with the 8 GB EBS volume that comes by default. Then click Review and Launch, and then Launch.

Installing Zookeeper

We have to install Zookeeper on each of the nodes. The configuration will be almost the same for the 3 machines, with just one minor difference.

Once the instances are up and running, let’s add a name to each one so we don’t get confused. We need to be able to identify each node and its address. I’ll use zk-1, zk-2 and zk-3.

Now let’s ssh into the first machine and start the process:

ssh -i zk-cluster.pem ec2-user@ec2-18-222-212-204.us-east-2.compute.amazonaws.com

Look for the stable version of Zookeeper in the download section, copy the link and download it:

wget https://www-us.apache.org/dist/zookeeper/stable/zookeeper-3.4.12.tar.gz

Now, extract Zookeeper and install it:

tar -xzf zookeeper-3.4.12.tar.gz
sudo mv zookeeper-3.4.12 /usr/local/zookeeper

We also need to create a data directory:

sudo mkdir /var/lib/zookeeper

Zookeeper comes with a sample configuration file that we can use as a base:

sudo cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg

This zoo.cfg file is the one the zkServer.sh script picks up by default when starting the server. Open the file and change the dataDir parameter to the data directory we just created:

dataDir=/var/lib/zookeeper

Also, we need to add a section listing every node in the ensemble. The only information we need is their private IP addresses, which you can copy from the AWS console.

Using the same id order we used for the name tags, add the following to the configuration file:

server.1=172.31.47.44:2888:3888
server.2=172.31.39.15:2888:3888
server.3=172.31.46.188:2888:3888

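We are defining 3 nodes and their ids (1, 2 and 3). Each line has the form server.<id>=<address>:<port>:<port>. The address is the node’s private IP, the first port (2888) is the one followers use to connect to the leader, and the second port (3888) is used for leader election.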
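If you have more than a handful of nodes, typing these lines by hand gets error-prone. Here’s a minimal sketch that generates them from a list of addresses. The helper name server_lines is hypothetical, and it assumes you pass the addresses in the same order as your zk-N name tags.

```shell
# Prints one server.<id> line per address, numbering from 1.
# 2888 and 3888 are the ports used throughout this post.
server_lines() {
  local id=1
  local ip
  for ip in "$@"; do
    echo "server.${id}=${ip}:2888:3888"
    id=$(( id + 1 ))
  done
}

# The three private addresses from this post, in zk-1, zk-2, zk-3 order.
server_lines 172.31.47.44 172.31.39.15 172.31.46.188
```

The output can be pasted into zoo.cfg on every node, since the server list is identical across the ensemble.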

The final configuration file should look like this:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=172.31.47.44:2888:3888
server.2=172.31.39.15:2888:3888
server.3=172.31.46.188:2888:3888

Important: all the steps we’ve done so far should be repeated on each node.

Once you’ve finished with all the nodes, the last step is to create an identifier file for each of them. This file should be located inside the data directory /var/lib/zookeeper and should be named myid. It contains only the id of the corresponding node.

So, for example, for the zk-1 node, create the file and add 1 to it:

echo 1 | sudo tee /var/lib/zookeeper/myid

For the second server zk-2 use 2:

echo 2 | sudo tee /var/lib/zookeeper/myid

And finally for the third server zk-3 use 3:

echo 3 | sudo tee /var/lib/zookeeper/myid
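A mismatch between myid and the server list is an easy mistake to make when repeating the steps on three machines. The following is a rough sketch of a sanity check. The function check_myid is something made up for this post, not a Zookeeper tool, and it only verifies that the id in myid has a matching server.<id> line in the configuration.

```shell
# check_myid CONFIG MYID_FILE
# Succeeds if MYID_FILE holds a single number that has a server.<id>= line in CONFIG.
check_myid() {
  local cfg=$1
  local idfile=$2
  local id
  id=$(cat "$idfile") || return 1
  case $id in
    ''|*[!0-9]*)
      # Catches an empty file, stray text, or an id that was appended twice.
      echo "myid must contain a single number, got: '$id'"
      return 1
      ;;
  esac
  if grep -q "^server\.${id}=" "$cfg"; then
    echo "ok: $(grep "^server\.${id}=" "$cfg")"
  else
    echo "no server.${id} entry in $cfg"
    return 1
  fi
}

# Usage on a node, with the paths from this post:
#   check_myid /usr/local/zookeeper/conf/zoo.cfg /var/lib/zookeeper/myid
```

It’s still up to you to confirm that the address printed on the ok line is the private IP of the machine you’re logged into.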

And that’s it for the configuration.

Launching the cluster

Before launching the Zookeeper servers, we need to modify the security group the machines were launched with, so they can communicate with each other through the ports we defined in the configuration. The 3 machines were launched using the same security group, so let’s just add a new inbound rule for all TCP ports, restricted to the subnet we’re using.

If you don’t want to open every port, you need at least ports 2181, 2888 and 3888 open between all the nodes.
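If you prefer the command line over the console, the same three rules can be added with the AWS CLI. Treat this as a sketch under a few assumptions: the aws CLI is installed and configured, and the security group id and CIDR below are placeholders you must replace with your own. The helper print_ingress_rules is hypothetical and only prints the commands so you can review them before running anything.

```shell
# Placeholder values: replace with your own security group id and subnet CIDR.
SG_ID="sg-0123456789abcdef0"
CIDR="172.31.32.0/20"

# Prints one `aws ec2 authorize-security-group-ingress` call per Zookeeper port.
# Nothing is executed; pipe the output to `sh` once it looks right.
print_ingress_rules() {
  local sg=$1
  local cidr=$2
  local port
  for port in 2181 2888 3888; do
    echo "aws ec2 authorize-security-group-ingress --group-id $sg --protocol tcp --port $port --cidr $cidr"
  done
}

print_ingress_rules "$SG_ID" "$CIDR"
```

Whatever CIDR you use has to cover the private addresses of all three nodes.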

Now we can start the Zookeeper server on each node. It’s convenient to keep an ssh session open to every node at the same time in case you want to check for any problems.

Inside each node, run the following command:

sudo /usr/local/zookeeper/bin/zkServer.sh start

After starting the servers, you can check the status of each node by running:

sudo /usr/local/zookeeper/bin/zkServer.sh status

You’ll see which one was elected as the leader and which ones are followers.
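Another way to ask a node for its role is Zookeeper’s srvr four-letter command, sent to the client port. The sketch below assumes nc is installed on the instance and that the srvr command is enabled in your Zookeeper version. The function mode_from_srvr is a made-up helper that pulls the Mode line out of the reply.

```shell
# Reads `srvr` output on stdin and prints the value of its "Mode:" line
# (leader, follower or standalone).
mode_from_srvr() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Usage on a node, with the client port from this post:
#   echo srvr | nc localhost 2181 | mode_from_srvr
```

Running it against the three private addresses from any one machine gives a quick picture of the whole ensemble.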

Troubleshooting

Useful things I found during the process:

  • Use start-foreground instead of start to test if the server starts correctly (or use tail -f zookeeper.out to check the logs if you’re using the background start).
  • Make sure your security groups are well configured and that the CIDR you choose covers all the nodes’ IP addresses.
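To tell a security group problem apart from a Zookeeper problem, it helps to test the ports directly from one node to another. Here’s a minimal sketch, assuming bash and the coreutils timeout command are available on the instance. The function name port_open is hypothetical.

```shell
# port_open HOST PORT
# Succeeds if a TCP connection can be opened within 2 seconds.
# Relies on bash's /dev/tcp redirection, so it is run through `bash -c`.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical usage from zk-1, checking the other two nodes from this post:
#   for host in 172.31.39.15 172.31.46.188; do
#     for port in 2181 2888 3888; do
#       if port_open "$host" "$port"; then
#         echo "$host:$port open"
#       else
#         echo "$host:$port closed"
#       fi
#     done
#   done
```

Keep in mind that a port only shows as open when something is listening on it, so run the check after starting the servers. Port 2888 is normally only listened on by the current leader, so seeing it closed on a follower is expected.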

And that’s it! Now you can start experimenting with your Zookeeper cluster.

Thanks for reading!
