How a DNS problem can take your MySQL server down
Last week I was woken up by my company's monitoring team. A DNS problem was under way, and as a side effect my application was down. Since it handles a lot of traffic, the problem had to be solved immediately.
I jumped to the computer and quickly diagnosed the system. Everything was fine except the MySQL connection pool, which was exhausted. My first thought was that this was just a coincidence, so I ran show processlist to see the list of MySQL connections. The output was a seemingly endless list of connections from the load balancer’s IP address, all with “login” as their status. To achieve high availability I run MySQL behind a balanced IP address shared between two MySQL servers. The balancer runs a quick health check every 5 seconds by connecting to MySQL and doing a simple select on a table.
So for some reason the load balancer was unable to finish its login attempts, and they were piling up on my MySQL servers. While I was in the middle of the investigation the problem suddenly stopped. I was happy but also a little scared: I had no idea what had happened.
A quick search of the MySQL documentation revealed that MySQL does a reverse DNS lookup on each new client connection, and this was the cause of my problems. Since the DNS server had a problem, each reverse lookup took far more than 5 seconds to time out, so new health checks kept arriving before the earlier ones had finished. This overloaded the database servers. See the explanation in the official documentation, How MySQL Uses DNS.
After reading that page I think that MySQL needs this reverse DNS lookup only for its permission system, and if you don’t use host names in your grants then you can safely disable it. Here is the documentation for the option that does this:
--skip-name-resolve
Do not resolve host names when checking client connections. Use only IP numbers. If you use this option, all Host column values in the grant tables must be IP numbers or localhost. See Section 7.5.11, “How MySQL Uses DNS”.
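Before turning the option on, it may help to check that no account relies on a host name. Here is a minimal sketch in Java, assuming you have already read the Host column values out of the grant tables yourself. The class and method names are hypothetical, and the rule for what counts as "numeric" is my own simplification: digits, dots, the % and _ wildcards and a netmask slash, so IPv4 only. An IPv6 entry would be reported as needing DNS even though it does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; not part of MySQL or of any MySQL client library.
final class GrantHostCheck {
    // Assumption: an entry is "numeric" if it contains only digits, dots,
    // the grant wildcards % and _, and a netmask slash (IPv4 only).
    private static final Pattern NUMERIC_HOST = Pattern.compile("[0-9.%_/]+");

    // True if the Host value can still match a client when names are not resolved.
    static boolean worksWithoutDns(String host) {
        return host.equalsIgnoreCase("localhost") || NUMERIC_HOST.matcher(host).matches();
    }

    // Returns the Host values that would stop matching with --skip-name-resolve.
    static List<String> hostsNeedingDns(List<String> hosts) {
        List<String> result = new ArrayList<String>();
        for (String host : hosts) {
            if (!worksWithoutDns(host)) {
                result.add(host);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example values only; in practice they come from your own grant tables.
        List<String> hosts = Arrays.asList(
                "localhost", "10.72.10.1", "192.168.1.%", "db-client.example.com");
        System.out.println(hostsNeedingDns(hosts));
    }
}
```

If the returned list is empty, nothing in the grants appears to depend on name resolution, at least under the assumptions above.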
Could I have avoided this? Perhaps, but considering that this was my first time running MySQL in production, it is unlikely.
Long live the reverse DNS, cheers!
Creating a secure JMX Agent in JDK 1.5
What is JMX?
Java Management Extensions (JMX) is an open technology for management and monitoring that can be deployed wherever management and monitoring are needed. The most common use in a web application is application management. Management is very often an afterthought, which results in many unmanaged application deployments.
You can monitor your application for availability and performance, but at the same time you can use JMX to manage and monitor your application from a business perspective. An application’s runtime metrics can be exposed through JMX, or in a service-oriented architecture you could use JMX to control your services.
All good, but when you start to work with JMX on JDK 1.5 you will soon discover one big limitation, which was fixed in JDK 1.6 update 16 if I recall correctly:
The default RMI JMX agent for remote access opens two ports: one set by -Dcom.sun.management.jmxremote.port=XXXX and one randomly assigned. What about firewalls?
JMX service URL
service:jmx:rmi://hostname:port1/jndi/rmi://hostname:port2/jmxrmi
Where:
- port1 is the port number on which the RMIServer and RMIConnection remote objects are exported
- port2 is the port number of the RMI Registry
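One way around the random port is to start the connector server yourself, with both ports spelled out in the service URL. The sketch below uses only the standard javax.management.remote and java.rmi APIs; the class name FixedPortJmxAgent and its methods are hypothetical. It is deliberately not secure: it passes an empty environment map, so there is no authentication and no SSL.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.net.MalformedURLException;
import java.rmi.registry.LocateRegistry;
import java.util.HashMap;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical agent class; names are illustrative only.
final class FixedPortJmxAgent {

    // Builds the service URL: port1 is for the exported RMI objects,
    // port2 is the RMI registry port.
    static JMXServiceURL serviceUrl(String hostname, int port1, int port2) {
        try {
            return new JMXServiceURL("service:jmx:rmi://" + hostname + ":" + port1
                    + "/jndi/rmi://" + hostname + ":" + port2 + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid host or port", e);
        }
    }

    // Starts an RMI registry on port2 and a connector server that exports
    // its remote objects on port1, so both ports are known in advance.
    static JMXConnectorServer start(String hostname, int port1, int port2) throws IOException {
        LocateRegistry.createRegistry(port2);
        MBeanServer mbeanServer = ManagementFactory.getPlatformMBeanServer();
        // Empty environment: no authentication and no SSL (assumption for brevity).
        JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                serviceUrl(hostname, port1, port2), new HashMap<String, Object>(), mbeanServer);
        server.start();
        return server;
    }
}
```

Calling FixedPortJmxAgent.start("hostname", 9998, 9999) early in your application's startup would leave exactly two known ports to open in the firewall; the port numbers here are arbitrary examples.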
Configure Apache and Tomcat servers together
The most common way to deploy your application in a production environment is to hide Tomcat behind Apache. This has good and bad sides, but it gives you a lot of flexibility and support from Apache. There are a few alternatives for putting these two servers together:
- mod_jk is the older connector, developed under the Tomcat project, and it uses Tomcat’s AJP protocol. AJP is a binary protocol and is expected to be faster than HTTP, which is text based.
- mod_proxy is the proxy module for the HTTP protocol. It is TCP based and uses HTTP, which is plain text. When a web client makes a request to Apache, Apache makes the same call to Tomcat and then passes Tomcat’s response back to the web client. This module has been part of Apache for a very long time and is also available for older versions of Apache. It is the simplest way to put Apache in front of Tomcat, but also the slowest.
- mod_proxy_ajp is new and is part of Apache 2.2. It works like mod_proxy but, as the name says, it uses the AJP protocol to send data to and receive data from Tomcat. It also runs over TCP and is expected to be faster than plain mod_proxy.
Tomcat Clustering & Java Servlet Specification
After reading more about Tomcat clustering I realized that its main purpose is to offer fault tolerance, failover and high availability. I read a lot about load balancing, but when it comes to Java servlets I found that the only choice you have in terms of balancing is sticky sessions. This is a limitation that comes from the Java Servlet Specification rather than from Tomcat, but it makes sense.
For an application to be “distributed” you have to mark it as “distributable” by adding the <distributable/> element to web.xml.
<web-app>
<distributable />
</web-app>
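An application marked as distributable can have its sessions copied or moved between JVMs, which in practice means that whatever you store in the session has to be serializable. A minimal sketch of a check you could run in a unit test, assuming plain Java serialization is what the container uses; the class and method names are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical test helper; not part of the Servlet API or of Tomcat.
final class SessionAttributeCheck {

    // True if the object, including everything it references, can be
    // written with standard Java serialization.
    static boolean canBeReplicated(Object attribute) {
        if (!(attribute instanceof Serializable)) {
            return false;
        }
        try {
            ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
            out.writeObject(attribute);
            out.close();
            return true;
        } catch (IOException e) {
            // NotSerializableException (a nested field is not serializable)
            // is a subclass of IOException.
            return false;
        }
    }
}
```

The actual write matters because a class can implement Serializable and still fail at runtime if one of its fields holds a non-serializable object.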
There are multiple ways to balance client requests across your server pool, but when it comes to the Java Servlet Specification you have only one choice, as the spec says:
“Within an application that is marked as distributable, all requests that are part of a session can only be handled on a single JVM at any one time.”
“You may have multiple JVMs, each handling requests from different clients concurrently for any given distributable web application”
So I guess you can kiss round robin and all the other per-request load balancing options goodbye, but at least Tomcat will give you failover, scalability and high availability.
Tomcat clustering configuration
The following steps assume that you have installed a Tomcat 5.5.x bundle or later. I only tested on 5.5.27, but it should work for other configurations as well. The network configuration applies to Linux and may vary with the distribution. It should work as is on distributions based on Red Hat.
For Tomcat clustering we have two main things to configure:
- Configure the network environment for clustering (open ports, add multicast route),
- Configure Tomcat clustering support.
Configure network support for the cluster
Opening specific ports (e.g. ports 45564 and 4001)
The cluster class will start up a membership service (multicast) and a replication service (TCP unicast). See also http://www.cyberciti.biz/faq/howto-rhel-linux-open-port-using-iptables/ for a brief article on opening ports with iptables. You will need root access to complete this.
Your server may or may not already have these entries. Open the iptables configuration file:
> vi /etc/sysconfig/iptables
Add the following entries:
-A RH-Firewall-1-INPUT -p udp -m udp --dport 45564 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 45564 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 4001 -j ACCEPT
Save and close the file, then restart iptables:
> /etc/init.d/iptables restart
Configure the multicast address and routes.
Cluster membership is established using very simple multicast pings. Each Tomcat instance periodically sends out a multicast ping, and in the ping message the instance broadcasts its IP address and the TCP listen port it uses for replication. If no such ping has been received from an instance within a given time frame, that member is considered dead.
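To make the "considered dead" rule concrete, here is a small simulation of the bookkeeping. It is not Tomcat's McastService code. The class MembershipTable is hypothetical, the timestamps are passed in by the caller instead of being read from the clock, and no network traffic is involved. The 3000 ms drop time in the usage example matches the mcastDropTime value used in the server.xml configuration later in this post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of ping-based membership; illustration only.
final class MembershipTable {
    private final long dropTimeMillis;
    private final Map<String, Long> lastSeen = new HashMap<String, Long>();

    MembershipTable(long dropTimeMillis) {
        this.dropTimeMillis = dropTimeMillis;
    }

    // Records that a ping from the given member ("ip:port") arrived at nowMillis.
    void pingReceived(String member, long nowMillis) {
        lastSeen.put(member, Long.valueOf(nowMillis));
    }

    // A member is alive if its last ping is no older than the drop time.
    boolean isAlive(String member, long nowMillis) {
        Long seen = lastSeen.get(member);
        return seen != null && nowMillis - seen.longValue() <= dropTimeMillis;
    }

    // Returns the members currently considered alive, sorted by name.
    List<String> aliveMembers(long nowMillis) {
        List<String> alive = new ArrayList<String>();
        for (String member : lastSeen.keySet()) {
            if (isAlive(member, nowMillis)) {
                alive.add(member);
            }
        }
        Collections.sort(alive);
        return alive;
    }
}
```

With new MembershipTable(3000), a node that last pinged at time 0 is still a member at time 3000 and is dropped from time 3001 onwards, unless another ping arrives first.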
Add the route (10.72.10.1 is the server’s IP address):
sudo /sbin/route add 228.0.0.4 gw 10.72.10.1 dev bond0
Edit rc.local to make the change persist across restarts:
sudo vim /etc/rc.d/rc.local
Add this line at the end (10.72.10.1 is the server’s IP address):
/sbin/route add 228.0.0.4 gw 10.72.10.1 dev bond0
Configure Tomcat to support clustering.
Application clustering with Tomcat has two steps:
- Enable clustering support,
- Make your application clusterizable.
Enable Tomcat clustering support
You need to enable cluster support in Tomcat by editing the server.xml file. Open server.xml:
sudo vim /usr/local/tomcat-5.5.27/conf/server.xml
Enable the clustering configuration in the file. Notice that the default configuration uses the DeltaManager, which replicates only the session’s changes and not the entire object:
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true"
         notifyListenersOnReplication="true">
  <Membership className="org.apache.catalina.cluster.mcast.McastService"
              mcastAddr="228.0.0.4"
              mcastPort="45564"
              mcastFrequency="500"
              mcastDropTime="3000"/>
  <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
            tcpListenAddress="10.72.10.1"
            tcpListenPort="4001"
            tcpSelectorTimeout="100"
            tcpThreadCount="6"/>
  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
          replicationMode="pooled"
          ackTimeout="15000"
          waitForAck="true"/>
  <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
         filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
  <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
            tempDir="/tmp/war-temp/"
            deployDir="/tmp/war-deploy/"
            watchDir="/tmp/war-listen/"
            watchEnabled="false"/>
  <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
</Cluster>
One main condition for replication to work is that your session content is serializable.
Add a jvmRoute attribute to the Engine element in the same file. From
<Engine name="Catalina" defaultHost="localhost">
To
<Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1">
jvmRoute uniquely identifies a Tomcat instance in a cluster. If multiple servers are used, I recommend descriptive names.
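The jvmRoute matters for sticky sessions because Tomcat appends it to the session ID after a dot (for example A1B2C3.tomcat1), and a balancer can use that suffix to send each request back to the node that owns the session. The sketch below shows the idea only; it is not mod_jk's implementation, the class StickyRoute and its methods are hypothetical, and falling back to round robin for requests without a usable route is my own assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical routing helper; illustration only.
final class StickyRoute {

    // Returns the jvmRoute suffix of a session ID, or null if there is none.
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    // Sticks to the node named in the session ID when that node is known;
    // otherwise picks a node round robin from the request counter.
    static String chooseNode(String sessionId, List<String> nodes, int requestCounter) {
        String route = routeOf(sessionId);
        if (route != null && nodes.contains(route)) {
            return route;
        }
        return nodes.get(Math.abs(requestCounter % nodes.size()));
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("tomcat1", "tomcat2");
        System.out.println(chooseNode("A1B2C3.tomcat2", nodes, 0));
    }
}
```

This also shows why the names must be unique: two nodes with the same jvmRoute would be indistinguishable to the balancer.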
Make your application clusterizable
Configuring Tomcat clustering is not enough to cluster your application. You also need to tell Tomcat which application you want to be clusterizable. This can be done in one of two ways:
- by modifying ROOT.xml (the context configuration file)
- by modifying the web.xml
Enable application clustering by ROOT.xml
Edit ROOT.xml file
sudo vim /usr/local/tomcat-5.5.27/conf/Catalina/localhost/ROOT.xml
Look for
<Context path="" debug="0" reloadable="true" cookies="true" crossContext="false" privileged="false">
Change it to
<Context path="" cookies="true" distributable="true" crossContext="true">
Enable application clustering by editing the web.xml
Edit the web.xml file
sudo vim /usr/local/tomcat-5.5.27/webapps/ROOT/WEB-INF/web.xml
Look for:
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
         version="2.4">
  <context-param>
    <param-name>contextClass</param-name>
    .............
Change it to:
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
         version="2.4">
  <distributable/>
  <context-param>
    <param-name>contextClass</param-name>
    .............
Restart Tomcat
cd /usr/local/tomcat-5.5.27/bin/
sudo ./shutdown.sh
sudo ./startup.sh
or, if you have an init script:
sudo /etc/init.d/tomcat5 restart
You need to configure all the nodes in the cluster as detailed above. Every node should have a unique name, provided by the “jvmRoute” attribute.
Further reading
Cluster-howto | http://tomcat.apache.org/tomcat-5.5-doc/cluster-howto.html