In my previous post, I talked about setting up Ganglia in a CentOS environment. At that time, I used only a single cluster for the whole setup, but it's highly unlikely that you have only a single cluster in your development/production environment. Suppose you have two clusters, Storm and Kafka, and you want to monitor all of their nodes through a single Ganglia UI. You do not have to install Ganglia multiple times for that; you just need to configure it properly. This would be much easier if AWS supported multicast, but since it doesn't, you need a work-around in unicast mode to monitor multiple clusters in one Ganglia instance.
The idea behind this work-around is pretty straightforward. Suppose I have two clusters, cluster #1 (Storm) and cluster #2 (Kafka), with the following IP addresses:
10.0.0.194 - Storm Cluster (supervisor 1)
10.0.0.195 - Storm Cluster (supervisor 2)
10.0.0.196 - Storm Cluster (supervisor 3)
10.0.0.182 - Storm Cluster (nimbus)
10.0.0.249 - Kafka Cluster
10.0.0.250 - Kafka Cluster
10.0.0.251 - Kafka Cluster
10.0.0.33 - my client machine
What I am going to do is configure each cluster to send its collected data (via gmond) to one designated node only, and configure the gmetad daemon so that it collects data only from that designated node (gmond daemon) of each cluster. Ganglia categorizes each cluster's data by the unique cluster name defined in its gmond.conf file.
As you can see in the above figure, all of the Kafka cluster's data is sent to one specific node, 10.0.0.249, and all of the Storm cluster's data is sent to one of its nodes, 10.0.0.182. The client machine (10.0.0.33) runs the gmetad daemon, and I will configure that daemon to look for two data sources, one per cluster, whose source IP addresses are 10.0.0.249 and 10.0.0.182 for Kafka and Storm respectively.
I'm assuming that you have already set up Ganglia and it's running as expected, so I am not going to discuss the gmond.conf and gmetad.conf files here. If you have not set it up yet, you might want to take a look at this post.
This is my gmond.conf file (only the part which I modified) that I'm using for all Kafka hosts (the same file goes on every host in this cluster):
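Here is a minimal sketch of what the modified sections look like. The default Ganglia port 8649 and the placeholder owner/latlong/url values are assumptions; the cluster name and the receiver address 10.0.0.249 come from the setup above:

```
/* gmond.conf (Kafka hosts) -- sketch of the modified part only */

cluster {
  name = "Kafka"            /* unique cluster name Ganglia groups metrics by */
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

/* Unicast: every Kafka host sends its metrics to the designated node. */
udp_send_channel {
  host = 10.0.0.249
  port = 8649
}

/* Only the designated node (10.0.0.249) strictly needs to listen,
   but leaving this in on every host is harmless. */
udp_recv_channel {
  port = 8649
}
```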
And here is my gmond.conf file for all Storm hosts (again, the same file goes on every host in this cluster):
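Again a minimal sketch under the same assumptions, with only the cluster name and the designated receiver changed:

```
/* gmond.conf (Storm hosts) -- sketch of the modified part only */

cluster {
  name = "Storm"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

/* Unicast: every Storm host (supervisors and nimbus) sends to the nimbus node. */
udp_send_channel {
  host = 10.0.0.182
  port = 8649
}

udp_recv_channel {
  port = 8649
}
```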
Notice that I'm using a different designated host address in udp_send_channel for each cluster. Now I need to tell my gmetad daemon to collect data from those two host addresses. Here is my gmetad.conf file:
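A minimal sketch of the relevant lines (the grid name and port 8649 are assumptions; the data_source names must match the cluster names defined in the gmond.conf files above):

```
# gmetad.conf on the client machine (10.0.0.33) -- sketch

# data_source "cluster-name" host:port
data_source "Kafka" 10.0.0.249:8649
data_source "Storm" 10.0.0.182:8649

# Optional: a name for the grid shown in the UI ("MyGrid" is an assumption)
gridname "MyGrid"
```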
You are done! Now restart all gmond daemons and the gmetad daemon, and wait a few minutes.
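On CentOS with SysV init scripts (an assumption; on systemd-based systems use systemctl instead), that would look something like this:

```
# On every Storm and Kafka node:
sudo service gmond restart

# On the client machine (10.0.0.33):
sudo service gmetad restart
```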
Once you navigate to your Ganglia UI URL, you should be able to see your grid and the list of your clusters in the drop-down.
You can dig further to see each host of each cluster:
There is another work-around you can try to get a better understanding of Ganglia: use a separate port number for each cluster. Here, I'm distinguishing each cluster's data source by IP address, but in that work-around you can have a single IP address for all clusters with a different port number per cluster. You can try that work-around as an exercise :).
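As a hypothetical illustration of that variant, each cluster's gmond would send and listen on its own port, all pointing at one receiver, and gmetad would distinguish the sources by port instead of by IP (the ports 8650/8651 and the shared receiver address are made-up values):

```
# gmond.conf: use port 8650 in udp_send_channel/udp_recv_channel for
# Kafka and 8651 for Storm, with host = 10.0.0.249 in both clusters.

# gmetad.conf -- sketch of the port-based variant
data_source "Kafka" 10.0.0.249:8650
data_source "Storm" 10.0.0.249:8651
```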
Note: For privacy purposes, I had to modify several lines of this post from the original. So if you find something is not working or you face any issues, please do not hesitate to contact me.