How a DNS problem can take your MySQL server down

Last week I was woken up by the monitoring team at my company. A DNS problem was under way and, as a side effect, my application was down. Since it handles a lot of traffic, the problem had to be solved immediately.

I jumped to the computer and quickly diagnosed the system. Everything was fine except the MySQL connection pool, which was exhausted. My first thought was that this was just a coincidence, so I ran SHOW PROCESSLIST to see the list of MySQL threads. The output was a seemingly endless list of connections from the load balancer’s IP address, all with “login” as their status. To achieve high availability I run MySQL behind a balanced IP address shared between two MySQL servers. The balancer runs a quick health check every 5 seconds: it connects to MySQL and does a simple SELECT on a table.

So for some reason the load balancer was not able to finish its login attempts, and the pending connections were overloading my MySQL servers. While I was in the middle of the investigation the problem suddenly stopped. I was happy but also a little scared: I had no idea what had just happened.

A quick search of the MySQL documentation revealed that MySQL performs a reverse DNS lookup on each new connection, and this was the cause of my problems. Since the DNS server was having trouble, each reverse lookup took far more than 5 seconds to time out, so the health-check connections piled up faster than they completed and overloaded the database servers. See the explanation in the official documentation, “How MySQL Uses DNS”.

After reading that page, I think MySQL needs this reverse DNS lookup only for its permission checks, so if you don’t use host names in your GRANT statements it is safe to disable it. Here is the option that does this, quoted from the documentation:

--skip-name-resolve

Do not resolve host names when checking client connections. Use only IP numbers. If you use this option, all Host column values in the grant tables must be IP numbers or localhost. See Section 7.5.11, “How MySQL Uses DNS”.
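
To get a feel for how long such a lookup takes from a given machine, the reverse lookup can be timed from Java. This is only a sketch: the class name ReverseDnsTimer is made up for illustration, and it measures the JVM's resolver rather than MySQL's, so the numbers are an indication at best.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of MySQL or any library.
class ReverseDnsTimer {

    /** Outcome of one timed reverse lookup. */
    static class Result {
        final String hostName;
        final long elapsedMillis;

        Result(String hostName, long elapsedMillis) {
            this.hostName = hostName;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Times a reverse DNS lookup for an IP literal such as "10.72.10.1". */
    static Result time(String ipLiteral) {
        InetAddress address;
        try {
            // For an IP literal this only validates the format; no lookup yet.
            address = InetAddress.getByName(ipLiteral);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address: " + ipLiteral);
        }
        long start = System.nanoTime();
        // The reverse lookup happens here. If it fails, the JDK returns
        // the textual IP address instead of a host name.
        String name = address.getCanonicalHostName();
        long elapsedMillis = (System.nanoTime() - start) / 1000000L;
        return new Result(name, elapsedMillis);
    }

    public static void main(String[] args) {
        String ip = args.length > 0 ? args[0] : "127.0.0.1";
        Result result = time(ip);
        System.out.println(ip + " -> " + result.hostName
                + " in " + result.elapsedMillis + " ms");
    }
}
```

If the reported time gets anywhere near the interval of your health check, connections can start to pile up the way they did here.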

Could I have avoided this? Perhaps, but considering that this was my first time running MySQL in production, it is unlikely.

Long live the reverse DNS, cheers!

Creating a secure JMX Agent in JDK 1.5

What is JMX?

Java Management Extensions (JMX) is an open technology for management and monitoring that can be deployed wherever management and monitoring are needed. In a web application its most common use is application management. This is very often an afterthought, which results in many unmanaged application deployments.

You can monitor your application for availability and performance, but at the same time you can use JMX to manage and monitor your application from a business perspective. The application’s runtime metrics can be exposed through JMX, and in a service-oriented architecture you could use JMX to control your services.

All good, but when you start to work with JMX on JDK 1.5 you will soon discover one big limitation, which was fixed in JDK 1.6 update 16 if I recall correctly:

The default RMI JMX agent for remote access opens two ports: one set by -Dcom.sun.management.jmxremote.port=XXXX and one assigned randomly. What about firewalls?
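
One common way around the random port is to start your own connector server instead of the default agent, so that both ports are chosen explicitly. The sketch below shows the idea using only standard JDK classes. The class name FixedPortJmxAgent and the ports 3000 and 9999 are assumptions for illustration, and authentication and SSL are left out entirely (the environment map is null), so this is not a secure agent as it stands.

```java
import java.lang.management.ManagementFactory;
import java.net.MalformedURLException;
import java.rmi.registry.LocateRegistry;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical agent class; the name and ports are illustrative only.
class FixedPortJmxAgent {

    /** Builds a service URL in which both the server and registry ports are explicit. */
    static JMXServiceURL buildUrl(String host, int serverPort, int registryPort) {
        try {
            return new JMXServiceURL("service:jmx:rmi://" + host + ":" + serverPort
                    + "/jndi/rmi://" + host + ":" + registryPort + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e.getMessage());
        }
    }

    /** Starts an RMI registry and a connector server on the given fixed ports. */
    static JMXConnectorServer start(String host, int serverPort, int registryPort)
            throws Exception {
        // The registry is where clients look up the "jmxrmi" stub.
        LocateRegistry.createRegistry(registryPort);
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // A null environment means no authentication and no SSL.
        JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                buildUrl(host, serverPort, registryPort), null, mbs);
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        JMXConnectorServer server = start("localhost", 3000, 9999);
        System.out.println("JMX agent listening at " + server.getAddress());
    }
}
```

With both ports known in advance, the firewall only needs two fixed rules instead of a wide open range.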

JMX service URL

service:jmx:rmi://hostname:port1/jndi/rmi://hostname:port2/jmxrmi

Where:

Read more

Configure Apache and Tomcat servers together

The most common way to deploy your application in a production environment is to hide Tomcat behind Apache. This has its good and bad sides, but it gives you a lot of flexibility and the support of Apache. There are a couple of alternatives for putting these two servers together:

Read more

Tomcat Clustering & Java Servlet Specification

After reading more about Tomcat clustering I realized that its main purpose is to offer fault tolerance, failover and high availability. I read a lot about load balancing, but when it comes to Java Servlets I found out that the only choice you have in terms of balancing is to use sticky sessions. This is a limitation that comes from the Java Servlet Specification rather than from Tomcat, but it makes sense.

For an application to be “distributed” you have to mark it as “distributable” by adding the <distributable/> tag to web.xml.

<web-app>
<distributable />
</web-app>
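
Marking the application as distributable also puts a requirement on what you store in the session: for a session to be copied to another JVM, its attributes have to implement java.io.Serializable. A rough way to check an attribute class is to push an instance through Java serialization and back, as in this sketch. The names SessionAttributeCheck and Cart are made up for illustration, and Tomcat's own replication code is of course more involved than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper; not part of Tomcat or the Servlet API.
class SessionAttributeCheck {

    /** Example of an attribute that is safe to keep in a replicated session. */
    static class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        final String owner;
        final int items;

        Cart(String owner, int items) {
            this.owner = owner;
            this.items = items;
        }
    }

    /** Serializes and deserializes the object, roughly what replication must do. */
    static Object roundTrip(Object attribute) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(attribute);
            out.close();
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            try {
                return in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            // NotSerializableException ends up here.
            throw new IllegalStateException("cannot serialize attribute: " + e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("cannot restore attribute: " + e);
        }
    }

    /** Returns true if the object survives a serialization round trip. */
    static boolean isReplicable(Object attribute) {
        try {
            roundTrip(attribute);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

Running a check like this in a unit test for every class you put in the session is cheap, and it catches the problem before the cluster does.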

There are multiple ways to balance client requests across your server pool, but when it comes to the Java Servlet Specification you have only one choice. As the spec says:

Within an application that is marked as distributable, all requests that are part of a session can only be handled on a single JVM at any one time.

You may have multiple JVMs, each handling requests from different clients concurrently for any given distributable web application

So I guess you can kiss round robin and all the other load balancing options goodbye, but at least Tomcat will provide you with failover, scalability and high availability.

Tomcat clustering configuration

The following steps assume that you have installed a Tomcat 5.5.x bundle. I only tested on 5.5.27, but they should work for other 5.5.x releases as well. The network configuration applies to Linux and may vary with the distribution. It should work as is on distributions based on Red Hat.

For Tomcat clustering we have two main things to configure:

Configure the network support for cluster

Opening specific ports (e.g. ports 45564 and 4001)

The cluster class will start up a membership service (multicast) and a replication service (TCP unicast), and both need their ports open. See also http://www.cyberciti.biz/faq/howto-rhel-linux-open-port-using-iptables/ for a brief article on opening ports with iptables. You will need root access to complete this.

Your server may or may not already have these entries. Open the iptables configuration:

> vi /etc/sysconfig/iptables

Add the following entries:

-A RH-Firewall-1-INPUT -p udp -m udp --dport 45564 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 45564 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 4001 -j ACCEPT

Save and close the file, then restart iptables:

  > /etc/init.d/iptables restart

Configure the multicast address and routes.

Cluster membership is established using very simple multicast pings. Each Tomcat instance periodically sends out a multicast ping, and in the ping message the instance broadcasts its IP address and the TCP port it listens on for replication. If no such ping has been received from an instance within a given timeframe, that member is considered dead.

Add the multicast route (10.72.10.1 is the server’s IP address):
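
The bookkeeping behind this kind of membership can be sketched in a few lines of Java. This is not Tomcat's McastService implementation, only an illustration of the rule: remember when each member was last heard from, and treat it as dead once the drop time has passed. The class name MembershipTable is made up, and the 3000 ms used in the usage note corresponds to the mcastDropTime value in the configuration later in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of ping-based membership; not Tomcat code.
class MembershipTable {

    private final long dropTimeMillis;
    private final Map<String, Long> lastHeard = new HashMap<String, Long>();

    MembershipTable(long dropTimeMillis) {
        this.dropTimeMillis = dropTimeMillis;
    }

    /** Records a ping from the member, e.g. "10.72.10.1:4001". */
    void pingReceived(String member, long nowMillis) {
        lastHeard.put(member, Long.valueOf(nowMillis));
    }

    /** A member is alive if its last ping is more recent than the drop time. */
    boolean isAlive(String member, long nowMillis) {
        Long last = lastHeard.get(member);
        return last != null && nowMillis - last.longValue() < dropTimeMillis;
    }
}
```

With new MembershipTable(3000L), a member that pinged at time 0 is still alive at 2999 ms and considered dead from 3000 ms on, unless another ping arrives first.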

sudo /sbin/route add 228.0.0.4 gw 10.72.10.1 dev bond0

Edit rc.local to make the change persist across restarts:

sudo vim /etc/rc.d/rc.local

Add this line at the end (again using the server’s IP address):

/sbin/route add 228.0.0.4 gw 10.72.10.1 dev bond0

Configure Tomcat to support clustering.

Application clustering with Tomcat has two steps:

Enable Tomcat clustering support

You need to enable cluster support in Tomcat by editing the server.xml file. Open server.xml:

sudo vim /usr/local/tomcat-5.5.27/conf/server.xml

Enable the clustering section in the configuration file. Notice that the default configuration uses the DeltaManager, which replicates only the changes made to a session and not the entire session object:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true"
         notifyListenersOnReplication="true">
  <Membership className="org.apache.catalina.cluster.mcast.McastService"
              mcastAddr="228.0.0.4"
              mcastPort="45564"
              mcastFrequency="500"
              mcastDropTime="3000"/>
  <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
            tcpListenAddress="10.72.10.1"
            tcpListenPort="4001"
            tcpSelectorTimeout="100"
            tcpThreadCount="6"/>
  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
          replicationMode="pooled"
          ackTimeout="15000"
          waitForAck="true"/>
  <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
         filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
  <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
            tempDir="/tmp/war-temp/"
            deployDir="/tmp/war-deploy/"
            watchDir="/tmp/war-listen/"
            watchEnabled="false"/>
  <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
</Cluster>

One main condition for replication to work is that your session content is serializable.

Next, add a jvmRoute attribute to the Engine element. Change it from

  <Engine name="Catalina" defaultHost="localhost">

To

  <Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1">

jvmRoute uniquely identifies a Tomcat instance in the cluster. If multiple servers are used, I recommend descriptive names.
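
The reason the name matters for sticky sessions is that Tomcat appends the jvmRoute to the session ID after a dot (for example 0A1B2C3D.tomcat1), and the balancer in front reads that suffix to send the request back to the same instance. The sketch below shows that parsing step on its own. The class name JvmRoute is made up for illustration, and a real balancer module does considerably more than this.

```java
// Hypothetical helper showing how a route can be read from a session ID.
class JvmRoute {

    /**
     * Returns the part of the session ID after the first dot,
     * or null if the ID carries no route.
     */
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.indexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    public static void main(String[] args) {
        // Prints the route of an example session ID.
        System.out.println(routeOf("0A1B2C3D.tomcat1"));
    }
}
```

This is also why every node needs a different jvmRoute: two nodes with the same name would be indistinguishable to the balancer.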

Make your application clusterizable

Configuring Tomcat clustering is not enough to cluster your application. You also need to tell Tomcat which applications should be clustered. This can be done in either of two ways:

Enable application clustering in ROOT.xml

Edit the ROOT.xml file:

 sudo vim /usr/local/tomcat-5.5.27/conf/Catalina/localhost/ROOT.xml

Look for

 <Context path="" debug="0" reloadable="true"
cookies="true" crossContext="false" privileged="false" >

Change it to

 <Context path="" cookies="true" distributable="true" crossContext="true">

Enable application clustering by editing the web.xml

Edit the web.xml file

 sudo vim /usr/local/tomcat-5.5.27/webapps/ROOT/WEB-INF/web.xml

Look for:

 <web-app xmlns="http://java.sun.com/xml/ns/j2ee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" version="2.4">
	<context-param>
	<param-name>contextClass</param-name>
	.............

Change it to:

 <web-app xmlns="http://java.sun.com/xml/ns/j2ee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" version="2.4">
	<distributable/>
	<context-param>
	<param-name>contextClass</param-name>
	.............

Restart Tomcat

cd /usr/local/tomcat-5.5.27/bin/
sudo ./shutdown.sh
sudo ./startup.sh

Or, if you have an init script:

sudo /etc/init.d/tomcat5 restart

You need to configure every node in the cluster as detailed above. Each node should have a unique name, set with the “jvmRoute” attribute.

Further reading

Cluster-howto | http://tomcat.apache.org/tomcat-5.5-doc/cluster-howto.html
