<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>bogdan@j3e &#187; Architecture</title>
	<atom:link href="http://www.bserban.org/category/architecture/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.bserban.org</link>
	<description>Web, Java, J2EE, SaaS, Tips&#38;Tricks</description>
	<lastBuildDate>Thu, 08 Jul 2010 11:53:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>How a DNS problem can put your Mysql server down</title>
		<link>http://www.bserban.org/2010/01/how-a-dns-problem-can-put-your-mysql-server-down/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=how-a-dns-problem-can-put-your-mysql-server-down</link>
		<comments>http://www.bserban.org/2010/01/how-a-dns-problem-can-put-your-mysql-server-down/#comments</comments>
		<pubDate>Sat, 16 Jan 2010 15:10:39 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[MySql]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[DNS]]></category>
		<category><![CDATA[Mysql client]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=288</guid>
		<description><![CDATA[Last week I was woken up by my company's monitoring team. A DNS problem was under way, and as a side effect my application was down. Since it handles a lot of traffic, it had to be fixed immediately. I jumped to the computer [...]]]></description>
			<content:encoded><![CDATA[
<p>Last week I was woken up by my company&#8217;s monitoring team. A DNS problem was under way, and as a side effect my application was down. Since it handles a lot of traffic, it had to be fixed immediately.</p>
<p>I jumped to the computer and quickly diagnosed the system. Everything was fine except the MySQL connection pool, which was exhausted. My first thought was that this was just a coincidence, so I ran <em>show processlist</em> to see the list of MySQL processes. The output was a seemingly endless list of connections from the load balancer&#8217;s IP address, all with the status &#8220;login&#8221;. For high availability I run MySQL behind a balanced IP address shared by two MySQL servers. The balancer runs a health check every 5 seconds: it connects to MySQL and runs a simple select on a table.</p>
<p>So for some reason the load balancer was not able to finish its login attempts, and it was overloading my MySQL servers. While I was in the middle of the investigation the problem suddenly stopped. I was happy but also a little scared, because I had no idea what had happened.</p>
<p>A quick search of the MySQL documentation revealed that MySQL does a reverse DNS lookup on each new connection, and this was the cause of my problems. Since the DNS server had a problem, each reverse lookup took far more than 5 seconds to time out, so new login attempts arrived faster than the old ones finished. This overloaded the database servers. See the explanation in the official documentation, <a href="http://dev.mysql.com/doc/refman/5.0/en/dns.html">How MySQL Uses DNS</a>.</p>
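<p>The pile-up can be estimated with Little&#8217;s law: the average number of logins stuck at any moment is the arrival rate multiplied by how long each one blocks. The sketch below is only an illustration. The 30 second lookup stall and the 20 connections per second of application traffic are hypothetical numbers, not measurements from this incident; only the 5 second health check interval comes from the setup described above.</p>

```java
public class LoginBacklog {
    // Little's law: average number of requests in flight equals
    // arrival rate multiplied by the time each request stays in flight.
    static double pendingLogins(double attemptsPerSecond, double secondsBlocked) {
        return attemptsPerSecond * secondsBlocked;
    }

    public static void main(String[] args) {
        // One health check every 5 seconds (from the post); the 30 second
        // stall on the reverse DNS lookup is a hypothetical value.
        double fromBalancer = pendingLogins(1.0 / 5.0, 30.0);

        // Hypothetical application traffic: 20 new connections per second,
        // each blocked for the same 30 seconds.
        double fromApplication = pendingLogins(20.0, 30.0);

        System.out.println("stuck health checks: " + fromBalancer);
        System.out.println("stuck application logins: " + fromApplication);
    }
}
```

<p>Any client that keeps opening new connections while the lookups hang adds to the same backlog, which is how a connection limit can be reached even though no query is slow.</p>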
<p>After reading that page, my understanding is that MySQL needs the reverse DNS lookup only for its permission checks. If you don&#8217;t use host names in your grants, you can safely disable it. I quote here the option that does this:</p>
<blockquote><p>--skip-name-resolve</p>
<p>Do not resolve host names when checking client connections. Use only IP numbers. If you use this option, all Host column values in the grant tables must be IP numbers or localhost. See Section 7.5.11, “How MySQL Uses DNS”.</p></blockquote>
<p>Could I have avoided this? Perhaps, but considering that this was my first time running MySQL in production, it is unlikely.</p>
<p>Long live the reverse DNS, cheers!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2010/01/how-a-dns-problem-can-put-your-mysql-server-down/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating a secure JMX Agent in JDK 1.5</title>
		<link>http://www.bserban.org/2009/10/creating-a-secure-jmx-agent-in-jdk-1-5/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=creating-a-secure-jmx-agent-in-jdk-1-5</link>
		<comments>http://www.bserban.org/2009/10/creating-a-secure-jmx-agent-in-jdk-1-5/#comments</comments>
		<pubDate>Sat, 31 Oct 2009 17:03:39 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[JMX]]></category>
		<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Agent]]></category>
		<category><![CDATA[JMX Architecture]]></category>
		<category><![CDATA[JMX JDK 1.5]]></category>
		<category><![CDATA[JMXConnectorServer]]></category>
		<category><![CDATA[MBeanServer]]></category>
		<category><![CDATA[SSL JMX]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=254</guid>
		<description><![CDATA[What is JMX? Java Management Extensions (JMX) is an open technology for management and monitoring that can be deployed wherever management and monitoring are needed. Its most common use in a web application is application management. This is very often an afterthought, which results in many unmanaged application deployments. You can monitor your application for [...]]]></description>
			<content:encoded><![CDATA[
<h2>What is JMX?</h2>
<p>Java Management Extensions (JMX) is an open technology for management and monitoring that can be deployed wherever management and monitoring are needed. Its most common use in a web application is application management. This is very often an afterthought, which results in many unmanaged application deployments.</p>
<p>You can monitor your application for availability and performance, but at the same time you can use JMX to manage and monitor your application from a business perspective. An application&#8217;s runtime metrics can be exposed through JMX, or in a service-oriented architecture you could use JMX to control your services.</p>
<p>All good, but when you start to work with <strong>JMX and JDK 1.5</strong> you will soon discover one big limitation, which was fixed in JDK 1.6 update 16 if I recall correctly:</p>
<blockquote><p>The default RMI JMX agent for remote access opens 2 ports: one set by -Dcom.sun.management.jmxremote.port=XXXX <strong>and one randomly assigned port</strong>. What about firewalls?</p></blockquote>
<h2>JMX service url</h2>
<p>service:jmx:rmi://hostname:<strong>port1</strong>/jndi/rmi://hostname:<strong>port2</strong>/jmxrmi</p>
<p>Where:</p>
<ul>
<li><strong>port1 </strong>is the port number on which the <strong>RMIServer </strong>and <strong>RMIConnection </strong>remote objects are exported</li>
<li><strong>port2 </strong>is the port number of the <strong>RMI Registry</strong></li>
</ul>
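<p>As a small illustration of the URL layout above, the sketch below builds such a URL with the standard <em>javax.management.remote.JMXServiceURL</em> class and reads its parts back. The class name, the host and the port numbers 9998 and 9999 are placeholders chosen for the example.</p>

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

public class ServiceUrlParts {
    // Builds the service URL from the two ports described above:
    // serverPort is port1 (RMIServer and RMIConnection objects),
    // registryPort is port2 (RMI registry).
    static JMXServiceURL build(String host, int serverPort, int registryPort)
            throws MalformedURLException {
        return new JMXServiceURL("service:jmx:rmi://" + host + ":" + serverPort
                + "/jndi/rmi://" + host + ":" + registryPort + "/jmxrmi");
    }

    public static void main(String[] args) throws MalformedURLException {
        // Placeholder host and ports, for illustration only.
        JMXServiceURL url = build("localhost", 9998, 9999);
        System.out.println("protocol: " + url.getProtocol());
        System.out.println("port1:    " + url.getPort());
        // The registry location (port2) travels inside the URL path.
        System.out.println("url path: " + url.getURLPath());
    }
}
```

<p>Note that only port1 is a first-class part of the URL. The registry address with port2 is carried in the URL path after <em>/jndi/</em>.</p>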
<p><span id="more-254"></span><br />
To access the RMI agent you only need to know where the RMI registry is located, since the connection objects are obtained from it. I guess this is why port1 is assigned randomly, but it gives you a pretty high chance of running into a firewall problem.</p>
<p>The solution is to replace the default agent with your own version of the JMX agent, one that exports the RMI connection objects on a specific port.</p>
<h2>JMX Architecture</h2>
<p>There are three main components that make JMX possible:</p>
<ul>
<li>Instrumentation layer: the managed beans and their resources, that is, what you want to manage.</li>
<li>JMX agent: the standard management agent that directly controls resources and makes them available to remote management applications. It is the means of exposing the managed beans, and it provides the MBean server plus the monitoring, timing, relation, and class-loading services.</li>
<li>Remote management: permits the interaction between remote clients and the JMX agent. There are a couple of default adapters built in: HTTP Adapter (for viewing management data), RMI Adapter, SOAP Adapter, and SNMP Adapter.</li>
</ul>
<h2>Creating the Agent</h2>
<p>To overcome the JDK 1.5 problem explained above we need to create a Java agent that exports the RMI registry and the RMIServer on specific ports.</p>
<p>This is pretty straightforward: we export the RMI registry on a specific port, get the MBeanServer, and create a JMXConnectorServer that ties the two together. Then we just start the connector server and we are done.</p>
<pre class="brush: java">....................
LocateRegistry.createRegistry(port);
MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
final String hostname = InetAddress.getLocalHost().getHostName();
JMXServiceURL url = new JMXServiceURL(&quot;service:jmx:rmi://&quot;+hostname+&quot;:&quot;+port+&quot;/jndi/rmi://&quot;+hostname+&quot;:&quot;+port+&quot;/jmxrmi&quot;);
JMXConnectorServer cs = JMXConnectorServerFactory.newJMXConnectorServer(url, env, mbs);
cs.start();
</pre>
<p>This is as simple as it gets, but right now we don&#8217;t have any security in place. If we want to add SSL and authorization, things get a little more complicated.</p>
<h2>Securing the Agent</h2>
<p>To make the access secure we have to expose the RMI registry over SSL. For this we need to make the following modifications:</p>
<pre class="brush: java">
..............
SslRMIClientSocketFactory csf = new SslRMIClientSocketFactory();
SslRMIServerSocketFactory ssf = new SslRMIServerSocketFactory();
Registry registry = LocateRegistry.createRegistry(port, csf, ssf);
................

// Now specify the SSL Socket Factories:
//
// For the client side (remote)
//
env.put(RMIConnectorServer.RMI_CLIENT_SOCKET_FACTORY_ATTRIBUTE, csf);
// For the server side (local)
//
env.put(RMIConnectorServer.RMI_SERVER_SOCKET_FACTORY_ATTRIBUTE, ssf);
// For binding the JMX RMI Connector Server with the registry
// created above:
//
env.put(&quot;com.sun.jndi.rmi.factory.socket&quot;, csf);

final RMIServerImpl stub = new RMIJRMPServerImpl(port, csf, ssf, env);

final JMXConnectorServer cs =
    new RMIConnectorServer(new JMXServiceURL(&quot;rmi&quot;, hostname, port),
            env, stub, mbs) {
        @Override
        public JMXServiceURL getAddress() {
            return url;
        }

        @Override
        public synchronized void start() throws IOException {
            try {
                registry.bind(&quot;jmxrmi&quot;, stub);
            } catch (AlreadyBoundException x) {
                final IOException io = new IOException(x.getMessage());
                io.initCause(x);
                throw io;
            }
            super.start();
        }
    };
cs.start();</pre>
<p>As you can see, we also had to secure the connection between the connector server and the JMX agent, and between the RMI server and client. It is not as simple as the unsecured version. To add authorization we just need to provide the credentials. Usually these are stored in files whose paths are passed to the JMX connector server through the environment map:</p>
<pre class="brush: java">env.put(&quot;jmx.remote.x.password.file&quot;, password);
env.put(&quot;jmx.remote.x.access.file&quot;, access);
</pre>
<p>And their content:</p>
<pre class="brush: java">admin password1
monitor password2
..................
admin readwrite
monitor readonly
</pre>
<p>For SSL connections to work we need to create a keystore. This is done with keytool, like this:</p>
<pre class="brush: java">
keytool -genkey -keyalg RSA -keysize 1024 -dname &quot;CN=org.bserban.www&quot; -keystore ./jmx-demo.jks -storepass bserban
</pre>
<h3>Running the Agent</h3>
<pre class="brush: java">
java -javaagent:jmx-agent.jar -Djavax.net.ssl.trustStore=./jmx-demo.jks -Djavax.net.ssl.trustStorePassword=bserban -Djavax.net.ssl.keyStore=./jmx-demo.jks -Djavax.net.ssl.keyStorePassword=bserban org.abserban.jmx.agent.StartAgentStandalone
</pre>
<h3>Running the Client</h3>
<pre class="brush: java">
java -Djavax.net.ssl.trustStore=./jmx-demo.jks -Djavax.net.ssl.trustStorePassword=bserban -Djavax.net.ssl.keyStore=./jmx-demo.jks -Djavax.net.ssl.keyStorePassword=bserban org.abserban.jmx.client.StartClient -host:localhost -port:8787 -status:on
</pre>
<p>See the complete code in the Resources section. It includes the full Java code, the Ant file and the keystore.</p>
<h3>Resources</h3>
<ul>
<li><a href="http://www.bserban.org/wp-content/uploads/2009/10/jmx-agent.zip">Complete source code</a></li>
<li><a href="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/best-practices.jsp">JMX Best practices</a></li>
<li><a href="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/articles.jsp">JMX Articles</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/10/creating-a-secure-jmx-agent-in-jdk-1-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Configure Apache and Tomcat servers together</title>
		<link>http://www.bserban.org/2009/08/configure-apache-and-tomcat-severs-together/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=configure-apache-and-tomcat-severs-together</link>
		<comments>http://www.bserban.org/2009/08/configure-apache-and-tomcat-severs-together/#comments</comments>
		<pubDate>Sat, 08 Aug 2009 08:12:00 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Tomcat]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[mod_proxy_ajp]]></category>
		<category><![CDATA[AJP]]></category>
		<category><![CDATA[Apache]]></category>
		<category><![CDATA[Apache Benchmark]]></category>
		<category><![CDATA[mod_proxy]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=209</guid>
		<description><![CDATA[The most common way to deploy your application in a production environment is to hide Tomcat behind Apache. This has good and bad sides, but it gives you a lot of flexibility and the full support of Apache. There are a few alternatives for putting these two servers together: mod_jk, this is the old connector [...]]]></description>
			<content:encoded><![CDATA[
<p>The most common way to deploy your application in a production environment is to hide Tomcat behind Apache. This has good and bad sides, but it gives you a lot of flexibility and the full support of Apache. There are a few alternatives for putting these two servers together:</p>
<ul>
<li><em>mod_jk</em> is the older connector developed under the Tomcat project. It uses Tomcat&#8217;s AJP protocol, which is expected to be faster than HTTP because HTTP is text based.</li>
<li><em>mod_proxy</em> is the support module for the HTTP protocol. It is TCP based and speaks plain-text HTTP. When a web client makes a request to Apache, Apache makes the same call to Tomcat and then passes Tomcat&#8217;s response back to the web client. This connector has been part of Apache for a very long time and is also available for older versions of Apache. It is the simplest way to put Apache in front of Tomcat, but also the slowest.</li>
<li><em>mod_proxy_ajp</em> is new and is part of Apache 2.2. It works like <em>mod_proxy</em> but, as the name says, it uses the AJP connector to send data to and receive data from Tomcat. It also runs over TCP and is expected to be faster than plain <em>mod_proxy</em>.</li>
</ul>
<p><span id="more-209"></span></p>
<h1>Using mod_proxy</h1>
<p>Create your own tomcat-httpd.conf file and configure the proxy:</p>
<p><code>#<br />
# Server will not close the connection after each request allowing the browser to use the same connection<br />
#<br />
KeepAlive On<br />
MaxKeepAliveRequests 100<br />
KeepAliveTimeout 5<br />
<br />
#<br />
# Load mod_proxy modules<br />
#<br />
&lt;IfModule !proxy_module&gt;<br />
LoadModule proxy_module modules/mod_proxy.so<br />
&lt;/IfModule&gt;<br />
<br />
&lt;IfModule !proxy_http_module&gt;<br />
LoadModule proxy_http_module modules/mod_proxy_http.so<br />
&lt;/IfModule&gt;<br />
<br />
ProxyRequests Off<br />
ProxyPreserveHost On<br />
ProxyTimeout 1000<br />
TimeOut 1000<br />
<br />
#<br />
# Configure the mod_proxy<br />
#<br />
ProxyPass / http://127.0.0.1:8080/<br />
ProxyPassReverse / http://127.0.0.1:8080/</code></p>
<p>Include your configuration into Apache <em>httpd.conf</em> using Include directive:</p>
<p><code>#load Tomcat proxy configuration<br />
Include /usr/local/tomcat6/conf/tomcat-httpd.conf</code></p>
<h1>Using mod_proxy_ajp</h1>
<p>The only difference between mod_proxy and mod_proxy_ajp is that you have to load mod_proxy_ajp and proxy the request to Tomcat using the ajp protocol.</p>
<p><code>#<br />
# Load mod_proxy modules<br />
#<br />
&lt;IfModule !proxy_module&gt;<br />
LoadModule proxy_module modules/mod_proxy.so<br />
&lt;/IfModule&gt;<br />
<br />
&lt;IfModule !proxy_http_module&gt;<br />
LoadModule proxy_http_module modules/mod_proxy_http.so<br />
&lt;/IfModule&gt;<br />
<br />
&lt;IfModule !proxy_ajp_module&gt;<br />
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so<br />
&lt;/IfModule&gt;<br />
<br />
ProxyRequests Off<br />
ProxyPreserveHost On<br />
ProxyTimeout 1000<br />
TimeOut 1000<br />
#<br />
# Enable the AJP proxy<br />
#<br />
ProxyPass / ajp://localhost:8009/<br />
ProxyPassReverse / ajp://localhost:8009/</code></p>
<p>Include your configuration into Apache <em>httpd.conf</em> using Include directive:</p>
<p><code>#load Tomcat proxy configuration<br />
Include /usr/local/tomcat6/conf/tomcat-httpd.conf</code></p>
<h1>Using mod_jk</h1>
<p>Edit your <em>tomcat-httpd.conf</em> file and add mod_jk configuration:<br />
<code><br />
#<br />
# Load mod_jk if it is not loaded already<br />
#<br />
&lt;IfModule !jk_module&gt;<br />
LoadModule jk_module modules/mod_jk.so<br />
&lt;/IfModule&gt;<br />
#<br />
#<br />
#<br />
JkWorkersFile /etc/httpd/conf/workers.properties<br />
JkLogFile /var/logs/httpd/mod_jk.log<br />
JkLogLevel info<br />
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "<br />
JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories<br />
JkRequestLogFormat "%w %V %T"<br />
JkMount /test/* worker1</code></p>
<p>Include your configuration into Apache httpd.conf using Include directive:</p>
<p><code>#load mod_jk configuration<br />
Include /usr/local/tomcat6/conf/tomcat-httpd.conf<br />
</code><br />
Now edit the workers.properties and configure your worker:<br />
<code><br />
worker.list=worker1<br />
worker.worker1.type=ajp13<br />
worker.worker1.host=localhost<br />
worker.worker1.port=8009<br />
worker.worker1.connection_pool_size=150<br />
worker.worker1.connection_pool_timeout=600<br />
worker.worker1.socket_keepalive=1</code></p>
<h1>Test each configuration using Apache Benchmark tool</h1>
<p>Apache Benchmark (ab) is a great tool for testing the configurations above. All you need is a typical application page to hit with ab. The tool takes a single URL and requests it repeatedly over several concurrent connections; the concurrency level is set on the command line. It also supports keep-alive connections.</p>
<p>For more details about Apache Benchmark check this page <a href="http://httpd.apache.org/docs/2.0/programs/ab.html">Apache Benchmark</a></p>
<p>For testing purposes I created a test WAR containing a test.jsp page. Because the thing we are testing, the connector, does not affect the application&#8217;s processing time, we don&#8217;t need a full test that includes database calls or specific frameworks. All we need is some application output, so we can measure how that output reaches the browser in each of the modes explained above.</p>
<p>The test page contains a previous post of mine, which is medium sized: http://www.bserban.org/2009/05/put-together-struts2-jpa-hibernate-and-spring/. I right-clicked in the browser, chose view source, and copied the content into test.jsp. The file is about 111 KB in size.</p>
<p><em><br />
bserban-mac:~ bserban$ ls -la ~/bin/srv/apache-tomcat-5.5.27/webapps/test/test.jsp<br />
-rw-r--r--  1 bserban  staff  113698 Aug  7 09:39 /Users/bserban/bin/srv/apache-tomcat-5.5.27/webapps/test/test.jsp<br />
</em></p>
<p>I am going to hit Tomcat and Apache with 10,000 requests at a concurrency level of 50. The ab command looks like this:</p>
<p><em><br />
ab -k -n 10000 -c 50 http://localhost/test/test.jsp<br />
</em></p>
<p>To access Tomcat directly I am going to hit port 8080. To keep the environment and test conditions comparable, I will restart Apache and Tomcat after each test.</p>
<h2>Test results</h2>
<p>The table below summarizes the results obtained.</p>
<table class="design5" border="0">
<thead>
<tr>
<th></th>
<th>Direct</th>
<th>mod_proxy</th>
<th>mod_proxy_ajp</th>
</tr>
</thead>
<tbody>
<tr>
<td>Throughput</td>
<td>992 req/s</td>
<td>667 req/s</td>
<td>702 req/s</td>
</tr>
<tr>
<td>Average Response Time</td>
<td>50 ms</td>
<td>75 ms</td>
<td>71 ms</td>
</tr>
<tr>
<td>90% Response line</td>
<td>75 ms</td>
<td>98 ms</td>
<td>83 ms</td>
</tr>
<tr>
<td>100% Response line</td>
<td>207 ms</td>
<td>980 ms</td>
<td>972 ms</td>
</tr>
</tbody>
</table>
<p>As you can see, the fastest mode is connecting directly to Tomcat over HTTP: it serves more requests per second than the other modes. The second choice is mod_proxy_ajp, followed very closely by mod_proxy. However, the overhead added by Apache will matter less in real-life applications, because the application&#8217;s own processing time will dominate and shrink the relative impact of putting Apache in front of Tomcat. In the real world the difference between direct HTTP and mod_proxy_ajp will probably not exceed 5-10% in throughput and average time per request. This is the price to pay for the flexibility Apache brings: having Apache in front of Tomcat gives you access to the whole of Apache&#8217;s functionality and support.</p>
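<p>To see why the relative overhead shrinks, here is a rough sketch. It assumes the proxy adds a constant cost per request no matter how much work the application does, which is a simplification and not something the benchmark proves. The mean times 50.394 ms and 71.256 ms are taken from the ab traces for direct HTTP and mod_proxy_ajp; the 200 ms of extra application work is a hypothetical figure.</p>

```java
public class ProxyOverhead {
    // Relative slowdown of the proxied setup, assuming the proxy adds a
    // constant number of milliseconds regardless of application work.
    static double relativeOverhead(double directMs, double proxiedMs, double extraAppMs) {
        double proxyCostMs = proxiedMs - directMs;
        return proxyCostMs / (directMs + extraAppMs);
    }

    public static void main(String[] args) {
        double directMs = 50.394;   // mean time per request, direct Tomcat HTTP
        double proxiedMs = 71.256;  // mean time per request, mod_proxy_ajp

        // Static test page, no additional application work.
        double bare = relativeOverhead(directMs, proxiedMs, 0.0);

        // Hypothetical application that spends 200 ms more per request.
        double realistic = relativeOverhead(directMs, proxiedMs, 200.0);

        System.out.printf("static page: %.1f%%%n", bare * 100.0);
        System.out.printf("with 200 ms of application work: %.1f%%%n", realistic * 100.0);
    }
}
```

<p>Under these assumptions the slowdown drops from roughly 41% on the static test page to roughly 8% once the application does 200 ms of its own work per request.</p>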
<p>For those who want to see the test results in detail, I have included the test logs below.</p>
<h2>Test result trace for Tomcat HTTP</h2>
<p><code>bserban-mac:~ bserban$ ab -k -n 10000 -c 50 http://localhost:8080/test/test.jsp<br />
This is ApacheBench, Version 2.3 &lt;$Revision: 655654 $&gt;<br />
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/<br />
Licensed to The Apache Software Foundation, http://www.apache.org/</code></p>
<p>Benchmarking localhost (be patient)<br />
Completed 1000 requests<br />
Completed 2000 requests<br />
Completed 3000 requests<br />
Completed 4000 requests<br />
Completed 5000 requests<br />
Completed 6000 requests<br />
Completed 7000 requests<br />
Completed 8000 requests<br />
Completed 9000 requests<br />
Completed 10000 requests<br />
Finished 10000 requests</p>
<p>Server Software:        Apache-Coyote/1.1<br />
Server Hostname:        localhost<br />
Server Port:            8080</p>
<p>Document Path:          /test/test.jsp<br />
Document Length:        113678 bytes</p>
<p>Concurrency Level:      50<br />
Time taken for tests:   10.079 seconds<br />
Complete requests:      10000<br />
Failed requests:        0<br />
Write errors:           0<br />
Keep-Alive requests:    0<br />
Total transferred:      1139157786 bytes<br />
HTML transferred:       1137007356 bytes<br />
Requests per second:    992.18 [#/sec] (mean)<br />
Time per request:       50.394 [ms] (mean)<br />
Time per request:       1.008 [ms] (mean, across all concurrent requests)<br />
Transfer rate:          110376.03 [Kbytes/sec] received</p>
<p>Connection Times (ms)<br />
min  mean[+/-sd] median   max<br />
Connect:        0    7   7.3      5      67<br />
Processing:     5   43  22.4     37     182<br />
Waiting:        0   16  19.9     11     161<br />
Total:          7   50  22.4     45     207</p>
<p>Percentage of the requests served within a certain time (ms)<br />
50%     45<br />
66%     52<br />
75%     57<br />
80%     61<br />
90%     75<br />
95%     91<br />
98%    119<br />
99%    147<br />
100%    207 (longest request)</p>
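<p>The derived figures in the trace above follow from the request count, the concurrency level and the elapsed time. The sketch below recomputes them with the formulas given in the ab documentation; the class and method names are made up for the example. The results differ from the trace in the last digits because the elapsed time printed by ab is rounded.</p>

```java
public class AbMetrics {
    // Requests per second (mean): completed requests divided by elapsed time.
    static double requestsPerSecond(int completed, double secondsTaken) {
        return completed / secondsTaken;
    }

    // Time per request (mean), in milliseconds, as seen by one client slot:
    // concurrency * elapsed time * 1000 / completed requests.
    static double meanTimePerRequestMs(int concurrency, int completed, double secondsTaken) {
        return concurrency * secondsTaken * 1000.0 / completed;
    }

    public static void main(String[] args) {
        // Values from the direct Tomcat HTTP trace:
        // 10000 requests, concurrency 50, 10.079 seconds.
        System.out.printf("requests per second: %.2f%n",
                requestsPerSecond(10000, 10.079));
        System.out.printf("time per request: %.3f ms%n",
                meanTimePerRequestMs(50, 10000, 10.079));
    }
}
```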
<h2>Test result trace for mod_proxy_ajp</h2>
<p><code>bserban-mac:~ bserban$ ab -k -n 10000 -c 50 http://localhost/test/test.jsp<br />
This is ApacheBench, Version 2.3 &lt;$Revision: 655654 $&gt;<br />
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/<br />
Licensed to The Apache Software Foundation, http://www.apache.org/</code></p>
<p>Benchmarking localhost (be patient)<br />
Completed 1000 requests<br />
Completed 2000 requests<br />
Completed 3000 requests<br />
Completed 4000 requests<br />
Completed 5000 requests<br />
Completed 6000 requests<br />
Completed 7000 requests<br />
Completed 8000 requests<br />
Completed 9000 requests<br />
Completed 10000 requests<br />
Finished 10000 requests</p>
<p>Server Software:<br />
Server Hostname:        localhost<br />
Server Port:            80</p>
<p>Document Path:          /test/test.jsp<br />
Document Length:        113678 bytes</p>
<p>Concurrency Level:      50<br />
Time taken for tests:   14.251 seconds<br />
Complete requests:      10000<br />
Failed requests:        0<br />
Write errors:           0<br />
Keep-Alive requests:    0<br />
Total transferred:      1140884936 bytes<br />
HTML transferred:       1139000800 bytes<br />
Requests per second:    701.70 [#/sec] (mean)<br />
Time per request:       71.256 [ms] (mean)<br />
Time per request:       1.425 [ms] (mean, across all concurrent requests)<br />
Transfer rate:          78179.54 [Kbytes/sec] received</p>
<p>Connection Times (ms)<br />
min  mean[+/-sd] median   max<br />
Connect:        0    7   8.3      5      87<br />
Processing:     6   64  89.8     47     967<br />
Waiting:        0   38  86.9     21     920<br />
Total:          6   71  89.4     52     972</p>
<p>Percentage of the requests served within a certain time (ms)<br />
50%     52<br />
66%     58<br />
75%     64<br />
80%     69<br />
90%     83<br />
95%    115<br />
98%    507<br />
99%    689<br />
100%    972 (longest request)</p>
<h2>Test result trace for mod_proxy</h2>
<p><code>bserban-mac:~ bserban$ ab -k -n 10000 -c 50 http://localhost/test/test.jsp<br />
This is ApacheBench, Version 2.3 &lt;$Revision: 655654 $&gt;<br />
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/<br />
Licensed to The Apache Software Foundation, http://www.apache.org/</code></p>
<p>Benchmarking localhost (be patient)<br />
Completed 1000 requests<br />
Completed 2000 requests<br />
Completed 3000 requests<br />
Completed 4000 requests<br />
Completed 5000 requests<br />
Completed 6000 requests<br />
Completed 7000 requests<br />
Completed 8000 requests<br />
Completed 9000 requests<br />
Completed 10000 requests<br />
Finished 10000 requests</p>
<p>Server Software:        Apache-Coyote/1.1<br />
Server Hostname:        localhost<br />
Server Port:            80</p>
<p>Document Path:          /test/test.jsp<br />
Document Length:        113678 bytes</p>
<p>Concurrency Level:      50<br />
Time taken for tests:   14.985 seconds<br />
Complete requests:      10000<br />
Failed requests:        0<br />
Write errors:           0<br />
Keep-Alive requests:    0<br />
Total transferred:      1140715856 bytes<br />
HTML transferred:       1138561771 bytes<br />
Requests per second:    667.34 [#/sec] (mean)<br />
Time per request:       74.925 [ms] (mean)<br />
Time per request:       1.498 [ms] (mean, across all concurrent requests)<br />
Transfer rate:          74339.90 [Kbytes/sec] received</p>
<p>Connection Times (ms)<br />
min  mean[+/-sd] median   max<br />
Connect:        0    9  10.1      5     123<br />
Processing:     9   66  83.2     52     975<br />
Waiting:        0   41  83.7     27     949<br />
Total:          9   74  82.6     59     980</p>
<p>Percentage of the requests served within a certain time (ms)<br />
50%     59<br />
66%     66<br />
75%     72<br />
80%     76<br />
90%     98<br />
95%    123<br />
98%    192<br />
99%    675<br />
100%    980 (longest request)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/08/configure-apache-and-tomcat-severs-together/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tomcat Clustering &amp; Java Servlet Specification</title>
		<link>http://www.bserban.org/2009/08/tomcat-clustering-java-servlet-specification/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=tomcat-clustering-java-servlet-specification</link>
		<comments>http://www.bserban.org/2009/08/tomcat-clustering-java-servlet-specification/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 11:34:51 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Tomcat]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[Distributable]]></category>
		<category><![CDATA[Servlet Specification]]></category>
		<category><![CDATA[Sticky Sessions]]></category>
		<category><![CDATA[Tomcat clustering]]></category>

		<guid isPermaLink="false">http://www.bserban.org/2009/08/tomcat-clustering-java-servlet-specification/</guid>
		<description><![CDATA[After I read more about Tomcat Clustering I realized that the main purpose of Tomcat clustering is to offer fault tolerance, failover  and high availability support. I read a lot about load balancing but when it comes to Java Servlets I found out that the only choice you have in terms of balancing is to [...]]]></description>
			<content:encoded><![CDATA[
<p>After I read more about Tomcat clustering I realized that its main purpose is to offer fault tolerance, failover and high availability support. I read a lot about load balancing, but when it comes to Java Servlets I found out that the only choice you have in terms of balancing is to use sticky sessions. This is a limitation that comes from the Java Servlet Specification rather than from Tomcat, but it makes sense.</p>
<p>For an application to be &#8220;distributed&#8221; you have to mark it as &#8220;distributable&#8221; by adding the &lt;distributable/&gt; tag to web.xml.</p>
<p>&lt;web-app&gt;<br />
&lt;distributable /&gt;<br />
&lt;/web-app&gt;</p>
<p>There are multiple ways to balance client requests across your server pool, but when it comes to the Java Servlet Specification you have only one choice, as the spec says:</p>
<p>&#8220;<em>Within an application that is marked as distributable, all requests that are part of a session can only be handled on a single JVM at any one time.</em>&#8221;</p>
<p>&#8220;<em>You may have multiple JVMs, each handling requests from different clients concurrently for any given distributable web application</em>&#8221;</p>
<p>So, I guess you can kiss per-request round robin and similar load balancing options goodbye: once a session exists, its requests have to stick to a single node. At least Tomcat will still provide you with failover, scalability and high availability.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/08/tomcat-clustering-java-servlet-specification/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Tomcat clustering configuration</title>
		<link>http://www.bserban.org/2009/06/tomcat-clustering-configuration/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=tomcat-clustering-configuration</link>
		<comments>http://www.bserban.org/2009/06/tomcat-clustering-configuration/#comments</comments>
		<pubDate>Tue, 09 Jun 2009 08:33:36 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Tomcat]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[deltamanager]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[Tomcat clustering]]></category>
		<category><![CDATA[tomcat session replication]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=180</guid>
		<description><![CDATA[The following steps assume that you have installed a Tomcat 5.5.x bundle or later; I only tested on 5.5.27 but it should work for other configurations as well. The network configuration applies to Linux and may vary with the distribution. It should work as is for distributions based on Red Hat. For Tomcat clustering we [...]]]></description>
			<content:encoded><![CDATA[
<p align="justify">The following steps assume that you have installed a Tomcat 5.5.x bundle. I only tested on 5.5.27, but it should work for other 5.5.x releases as well; later major versions renamed the cluster classes, so the server.xml shown below would need adapting. The network configuration applies to Linux and may vary with the distribution. It should work as is for distributions based on Red Hat.</p>
<p>For Tomcat clustering we have two main things to configure:</p>
<ul>
<li>Configure the network environment for clustering (open ports, add multicast route),</li>
<li>Configure Tomcat clustering support.</li>
</ul>
<h2>Configure network support for the cluster</h2>
<h3>Opening the cluster ports (e.g. ports 45564 and 4001)</h3>
<p align="justify">The cluster will start a membership service (multicast, port 45564 here) and a replication service (TCP unicast, port 4001 here), so both ports must be open. See also http://www.cyberciti.biz/faq/howto-rhel-linux-open-port-using-iptables/ for a brief article on opening ports with iptables. You will need root access to complete this.</p>
<p>Your server may or may not already have these entries. Open the iptables rules file:</p>
<pre class="xml:nogutter:nocontrols">&gt; vi /etc/sysconfig/iptables</pre>
<p>Add the following entries:</p>
<pre class="xml:nogutter:nocontrols">-A RH-Firewall-1-INPUT -p udp -m udp --dport 45564 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 45564 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 4001 -j ACCEPT</pre>
<p>Save and close the file, then restart iptables:</p>
<pre class="xml:nogutter:nocontrols">  &gt; /etc/init.d/iptables restart</pre>
<h3>Configure the multicast address and routes.</h3>
<p align="justify">Cluster membership is established using very simple multicast pings. Each Tomcat instance periodically sends out a multicast ping in which it broadcasts its IP and the TCP listen port it uses for replication. If no such ping has been received from an instance within a given timeframe, that member is considered dead.</p>
<p>Add the multicast route (10.72.10.1 stands for the server&#8217;s IP address; adjust it and the interface to your setup):</p>
<pre class="xml:nogutter:nocontrols">sudo /sbin/route add 228.0.0.4 gw 10.72.10.1 dev bond0</pre>
<p>Edit rc.local to make the change persistent through restarts.</p>
<pre class="xml:nogutter:nocontrols">sudo vim /etc/rc.d/rc.local</pre>
<p>Add this line at the end (again using the server&#8217;s IP address):</p>
<pre class="xml:nogutter:nocontrols">/sbin/route add 228.0.0.4 gw 10.72.10.1 dev bond0</pre>
<h2>Configure Tomcat to support clustering.</h2>
<p>Application clustering with Tomcat has two steps:</p>
<ul>
<li>Enable clustering support,</li>
<li>Make your application clusterizable.</li>
</ul>
<h3>Enable Tomcat clustering support</h3>
<p>You need to enable the cluster support in Tomcat by editing the server.xml file. Open server.xml</p>
<pre class="xml:nogutter:nocontrols">sudo vim /usr/local/tomcat-5.5.27/conf/server.xml</pre>
<p>Enable the cluster configuration in this file. Notice that the default configuration uses the DeltaManager, which replicates only the changes made to a session and not the entire session object:</p>
<pre class="xml:nogutter:nocontrols">&lt;Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
managerClassName="org.apache.catalina.cluster.session.DeltaManager"
	expireSessionsOnShutdown="false"
	useDirtyFlag="true"
	notifyListenersOnReplication="true"&gt;
&lt;Membership className="org.apache.catalina.cluster.mcast.McastService"
	mcastAddr="228.0.0.4"
	mcastPort="45564"
	mcastFrequency="500"
	mcastDropTime="3000"/&gt;
&lt;Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
	tcpListenAddress="10.72.10.1"
	tcpListenPort="4001"
	tcpSelectorTimeout="100"
	tcpThreadCount="6"/&gt;
&lt;Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
	replicationMode="pooled"
	ackTimeout="15000"
	waitForAck="true"/&gt;
&lt;Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
  filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/&gt;
&lt;Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
	tempDir="/tmp/war-temp/"
	deployDir="/tmp/war-deploy/"
	watchDir="/tmp/war-listen/"
	watchEnabled="false"/&gt;
&lt;ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/&gt;
  &lt;/Cluster&gt;</pre>
<p>One main condition for replication to work is that your session content is serializable. Next, add a <em>jvmRoute</em> attribute to the Engine element of your Tomcat configuration. Change it from:</p>
<pre class="xml:nogutter:nocontrols">  &lt;Engine name="Catalina" defaultHost="localhost"&gt;</pre>
<p>To</p>
<pre class="xml:nogutter:nocontrols">  &lt;Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1"&gt;</pre>
<p><em>jvmRoute</em> uniquely identifies a Tomcat instance in a cluster. If multiple servers are used, I recommend descriptive names.</p>
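<p>To illustrate why each node needs its own jvmRoute: Tomcat appends the jvmRoute to the session id (for example <em>A1B2C3.tomcat1</em>), and a sticky-session balancer can use that suffix to send a request back to the node that owns the session. The Python sketch below is only a toy model of that routing decision, not the code of any real balancer. The class name <em>StickyRouter</em>, the node addresses and the port 8009 are assumptions made up for the example.</p>
<pre class="python:nogutter:nocontrols">import itertools


class StickyRouter:
    """Toy model of a sticky-session balancer (hypothetical, for illustration)."""

    def __init__(self, nodes):
        # nodes maps a jvmRoute name to the address of that Tomcat instance.
        self.nodes = dict(nodes)
        self._round_robin = itertools.cycle(sorted(self.nodes))

    def route(self, session_id=None):
        # A session id such as "A1B2C3.tomcat1" ends with the owner's jvmRoute.
        if session_id:
            _, dot, jvm_route = session_id.rpartition(".")
            if dot and jvm_route in self.nodes:
                return self.nodes[jvm_route]
        # No session yet, or the owning node is unknown: pick the next node.
        return self.nodes[next(self._round_robin)]


# Made-up addresses, only to show how the router is used.
router = StickyRouter({"tomcat1": "10.72.10.1:8009", "tomcat2": "10.72.10.2:8009"})
print(router.route("A1B2C3.tomcat2"))  # always the node named tomcat2
print(router.route())                  # new client: round robin over all nodes
</pre>
<p>If two nodes shared the same jvmRoute, the lookup above could not tell them apart, which is why the name has to be unique per instance.</p>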
<h3>Make your application clusterizable</h3>
<p>Configuring Tomcat clustering is not enough to cluster your application. You also need to tell Tomcat which application should be clusterizable. This can be achieved in two ways:</p>
<ul>
<li>by modifying ROOT.xml (the context configuration file)</li>
<li>by modifying the web.xml</li>
</ul>
<h4>Enable application clustering by ROOT.xml</h4>
<p>Edit ROOT.xml file</p>
<pre class="xml:nogutter:nocontrols"> sudo vim /usr/local/tomcat-5.5.27/conf/Catalina/localhost/ROOT.xml</pre>
<p>Look for the Context element, for example:</p>
<pre class="xml:nogutter:nocontrols"> &lt;Context path="" debug="0" reloadable="true"
cookies="true" crossContext="false" privileged="false" &gt;</pre>
<p>Add the distributable="true" attribute to it:</p>
<pre class="xml:nogutter:nocontrols"> &lt;Context path="" debug="0" reloadable="true"
cookies="true" crossContext="false" privileged="false" distributable="true" &gt;</pre>
<h4>Enable application clustering by editing the web.xml</h4>
<p>Edit the web.xml file</p>
<pre class="xml:nogutter:nocontrols"> sudo vim /usr/local/tomcat-5.5.27/webapps/ROOT/WEB-INF/web.xml</pre>
<p>Look for:</p>
<pre class="xml:nogutter:nocontrols"> &lt;web-app xmlns="http://java.sun.com/xml/ns/j2ee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" version="2.4"&gt;
	&lt;context-param&gt;
	&lt;param-name&gt;contextClass&lt;/param-name&gt;
	.............</pre>
<p>Change it to:</p>
<pre class="xml:nogutter:nocontrols"> &lt;web-app xmlns="http://java.sun.com/xml/ns/j2ee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" version="2.4"&gt;
	&lt;distributable/&gt;
	&lt;context-param&gt;
	&lt;param-name&gt;contextClass&lt;/param-name&gt;
	.............</pre>
<p>Restart Tomcat</p>
<pre class="xml:nogutter:nocontrols"> cd /usr/local/tomcat-5.5.27/bin/
sudo ./shutdown.sh
sudo ./startup.sh
# or, if you have an init script:
sudo /etc/init.d/tomcat5 restart</pre>
<p>You need to configure all the nodes in the cluster as detailed above. Every node should have a unique name, provided by the &#8220;jvmRoute&#8221; attribute.</p>
<h2>Further reading</h2>
<p>Cluster-howto | http://tomcat.apache.org/tomcat-5.5-doc/cluster-howto.html</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/06/tomcat-clustering-configuration/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Key-Value Storage using MemcacheDB</title>
		<link>http://www.bserban.org/2009/03/key-value-storage-using-memcachedb/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=key-value-storage-using-memcachedb</link>
		<comments>http://www.bserban.org/2009/03/key-value-storage-using-memcachedb/#comments</comments>
		<pubDate>Mon, 30 Mar 2009 21:08:37 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Memcachedb]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[distributed storage]]></category>
		<category><![CDATA[Entity Attribute value]]></category>
		<category><![CDATA[high performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=146</guid>
		<description><![CDATA[What is the Entity-Attribute-Value model (aka key-value storage)? Key-value storage is also known as the Entity-Attribute-Value model, and it is used in circumstances where the number of attributes (properties) that can be used to describe an entity is vast but the number of attributes that will actually be used is modest. Let’s think in terms of a [...]]]></description>
			<content:encoded><![CDATA[
<h2>What is the Entity-Attribute-Value model (aka key-value storage)?</h2>
<p>Key-value storage is also known as the Entity-Attribute-Value (EAV) model. It is used in circumstances where the number of attributes (properties) that could describe an entity is vast, but the number of attributes actually used for any given entity is modest.</p>
<p>Let’s think in database terms about what an Entity-Attribute-Value model would look like for storing a user profile.</p>
<table border="0" cellspacing="0" cellpadding="2" width="400">
<tbody>
<tr>
<td width="100" valign="top">id</td>
<td width="100" valign="top">user_id</td>
<td width="100" valign="top">key</td>
<td width="100" valign="top">value</td>
</tr>
<tr>
<td width="100" valign="top">1</td>
<td width="100" valign="top">101</td>
<td width="100" valign="top">screen_name</td>
<td width="100" valign="top">john</td>
</tr>
<tr>
<td width="100" valign="top">2</td>
<td width="100" valign="top">101</td>
<td width="100" valign="top">first_name</td>
<td width="100" valign="top">John</td>
</tr>
<tr>
<td width="100" valign="top">3</td>
<td width="100" valign="top">101</td>
<td width="100" valign="top">last_name</td>
<td width="100" valign="top">Smith</td>
</tr>
</tbody>
</table>
<p align="justify">The table has one row for each attribute-value pair. In practice, we prefer to separate values by data type so that the database can perform type validation checks and support proper indexing. So programmers tend to create separate EAV tables for strings, real and integer numbers, dates, long text and BLOBs.</p>
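<p>To make the row-per-attribute layout concrete, the following Python sketch folds the rows of the example table back into one profile per user. It is an illustration only and is not tied to any particular database; the function name <em>pivot</em> is hypothetical.</p>
<pre class="python:nogutter:nocontrols">from collections import defaultdict

# Rows copied from the example table: (id, user_id, key, value).
ROWS = [
    (1, 101, "screen_name", "john"),
    (2, 101, "first_name", "John"),
    (3, 101, "last_name", "Smith"),
]


def pivot(rows):
    """Fold attribute-value rows into one dictionary per entity."""
    profiles = defaultdict(dict)
    for _row_id, user_id, key, value in rows:
        profiles[user_id][key] = value
    return dict(profiles)


print(pivot(ROWS)[101]["first_name"])  # John
</pre>
<p>Adding a new attribute to a user is just one more row, which is where the flexibility of the model comes from.</p>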
<p align="justify">The benefits of such a structure are:</p>
<ol>
<li>Flexibility, there is no limit on attributes used to describe an entity. No schema redesign.</li>
<li>The storage is efficient on sparse data.</li>
<li>Easy to put the data into an XML format for interchange.</li>
</ol>
<p align="justify">There are also some important drawbacks:</p>
<ol>
<li>No real use of data types</li>
<li>Awkward use of database constraints</li>
<li>There are several problems in querying such a structure.</li>
</ol>
<h2>What is MemcacheDB</h2>
<p align="justify">MemcacheDB is a <strong>distributed key-value storage</strong> system designed for persistence. It is a very <strong>fast and reliable</strong> distributed storage. It supports transactions and replication, and it uses Berkeley DB as its persistent storage.</p>
<p>Why is it better than a database?</p>
<ol>
<li>Faster, no SQL engine on top of MemcacheDB</li>
<li>Designed for concurrency, designed for millions of requests</li>
<li>Optimized for small data</li>
</ol>
<p align="justify">MemcacheDB is suitable for messaging, metadata storage, identity management (accounts, profiles, preferences, etc.), indexes, counters, flags, etc.</p>
<p>The main features for Memcachedb are:</p>
<ul>
<li>
<div>High performance read/write for a key-value based object<br />
Rapid set/get for a key-value based object, not relational. Benchmarks<br />
will tell you the truth later.</div>
</li>
<li>
<div>Highly reliable persistent storage with transactions<br />
Transactions are used to make your data more reliable.</div>
</li>
<li>
<div>High availability data storage with replication<br />
Replication rocks! Achieve your HA, spread your reads, make your transactions durable.</div>
</li>
<li>
<div>Memcache protocol compatibility<br />
Lots of Memcached client APIs can be used with MemcacheDB, in almost any language: Perl, C, Python, Java.</div>
</li>
</ul>
<h3>Storage, replication and recovery</h3>
<p align="justify">Berkeley DB stores data quickly and easily without the overhead found in other databases. Read more about Berkeley DB <a href="http://www.oracle.com/technology/products/berkeley-db/db/index.html">here</a></p>
<p><img style="display: block; float: none; margin-left: auto; margin-right: auto" src="http://oracleimg.com/technology/products/berkeley-db/images/berkeley-db-stack.gif" alt="" /></p>
<p align="justify">MemcacheDB supports replication using master and slave nodes. The exact deployment design must be chosen according to your application&#8217;s needs. A MemcacheDB environment consists of three things:</p>
<ul>
<li>Database files, the files that store your data</li>
<li>Log files, all your transactions are committed to the logs first</li>
<li>Region files, which back the shared memory regions</li>
</ul>
<p align="justify">One problem can show up in the log files that record your transactions: over time they accumulate a lot of data, which makes recovery painful. For this MemcacheDB has a <em>Checkpoint</em>. The checkpoint empties the in-memory cache, writes a checkpoint record, flushes the logs and writes a list of open database files.</p>
<p align="justify">Berkeley DB also allows hot backups, and the backup can be compressed with tar and gzip.</p>
<h3>Monitoring</h3>
<p align="justify">MemcacheDB has a lot of built-in commands for monitoring, such as:</p>
<ul>
<li>
<div>Current status: <em>stats</em></div>
</li>
<li>
<div>Database engine status: <em>stats db</em></div>
</li>
<li>
<div>Replication status: <em>stats rep </em></div>
</li>
</ul>
<p align="justify">What I liked most about Memcached is that you can use telnet to connect to the running process and issue commands from the command prompt. The same is true for MemcacheDB.</p>
<p>Besides the built-in memcached commands, the Berkeley DB engine comes with its own statistics utility:</p>
<p>db_stat: -c locking statistics, -l logging statistics, -m cache statistics, -r replication statistics, -t transaction statistics.</p>
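<p>Because MemcacheDB speaks the memcached text protocol, the reply to a plain <em>stats</em> command can be expected to be a series of <em>STAT name value</em> lines terminated by <em>END</em>. The Python sketch below does what a telnet session would do and parses such a reply. It is a sketch under that assumption only: the function names are hypothetical, the sample values are invented, and the host and port of your server are yours to fill in.</p>
<pre class="python:nogutter:nocontrols">import socket


def parse_stats(reply):
    """Turn a memcached-style stats reply into a dictionary."""
    stats = {}
    for line in reply.splitlines():
        if line.startswith("STAT "):
            _, name, value = line.split(" ", 2)
            stats[name] = value
    return stats


def fetch_stats(host, port):
    """Send the stats command over a plain socket, as telnet would."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_stats(data.decode("ascii", "replace"))


# Invented sample reply, only to show the shape of the data.
sample = "STAT pid 4242\r\nSTAT uptime 3600\r\nSTAT curr_connections 3\r\nEND\r\n"
print(parse_stats(sample)["uptime"])  # 3600
</pre>
<p>A monitoring script could call <em>fetch_stats</em> periodically and alert on the values it cares about.</p>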
<p align="justify">Overall I liked what I saw of this alternative, and I think it is the most suitable solution for storing user profiles and other user data that doesn’t need to be queried. When you need to scale, this is a very reliable solution. Have fun!</p>
<h2>Further reading</h2>
<p>Homepage: <a href="http://memcachedb.org">http://memcachedb.org</a></p>
<p>Mailing list: <a href="http://groups.google.com/group/memcachedb">http://groups.google.com/group/memcachedb</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/03/key-value-storage-using-memcachedb/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Facebook temporarily lost data.</title>
		<link>http://www.bserban.org/2009/03/facebook-temporarily-lost-data/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=facebook-temporarily-lost-data</link>
		<comments>http://www.bserban.org/2009/03/facebook-temporarily-lost-data/#comments</comments>
		<pubDate>Thu, 12 Mar 2009 22:51:20 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[data loss]]></category>
		<category><![CDATA[distributed storage]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[heavy load]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=134</guid>
		<description><![CDATA[Last Sunday Facebook reported a data loss. We are talking about approximately 15% of users&#8217; photos. Losing your clients’ data is the worst thing that could happen to you, and it reminded me of what a guy once said in a tech talk: “The main rules in running an online community service are: Never lose data and [...]]]></description>
			<content:encoded><![CDATA[
<p>Last Sunday <a href="http://blog.facebook.com/blog.php?post=58637767130">Facebook reported</a> a data loss. We are talking about approximately 15% of users&#8217; photos. Losing your clients’ data is the worst thing that could happen to you, and it reminded me of what a guy once said in a tech talk: “The main rules in running an online community service are: Never lose data and never go to jail.”</p>
<p>Facebook has not yet made public the details of what happened but only assured users that their photos will be restored using a backup. The official report states that we are talking about a hardware failure at storage level.</p>
<p>First off, some key facts about Facebook:</p>
<ul>
<li><a href="http://www.alexa.com/data/details/main/facebook.com">Facebook is the number 5</a> site in the world, which means it has huge traffic (source: Alexa.com),</li>
<li>They have 10,000 servers including 1,800 MySQL servers (administered by only two people, they say),</li>
<li>By last October users had uploaded <a href="http://www.facebook.com/note.php?note_id=30695603919">10 billion pictures on Facebook</a>, and since each picture is kept in 4 copies, they have to store 40 billion pictures,</li>
<li>2-3 Terabytes of photos are uploaded every day,</li>
<li>They serve 15 billion photo images per day,</li>
<li>Daily uploads are around 100 million photos,</li>
<li>The peak is about 450,000 images per second.</li>
</ul>
<p>Based on the above numbers, 15% of 10 billion means that they lost approximately <strong>1.5 billion</strong> pictures. Wow!</p>
<p>How is Facebook handling users’ images? Last year Jason Sobel, Manager of the Facebook Infrastructure Group, presented some insights about the current Facebook storage solution and the future one. We don’t know right now whether the new storage solution failed or the old one is to blame.</p>
<h2>Writing files using the old way</h2>
<p>They were using upload servers and stored images via NFS into a NetApp storage (last year they were planning to replace it). Each image is stored 4 times. This solution experienced heavy workload when processing metadata.</p>
<h2>Reading files using the old way</h2>
<p>Here it all comes down to speed.</p>
<ul>
<li>First level of Cache is done using CDN, which has a hit rate of 99.8% for profiles and 92% for the rest.</li>
<li>Second Level of Cache is done using Cachr for profiles which is a modified evhttpd with memcached as storage,  and a File Handle Cache (lighttpd and memcached) for the rest of it to reduce metadata workload on NetApp.</li>
<li>NetApp storage via NFS. They tried to optimize it and to reduce the number of I/O accesses because of the heavy metadata workload.</li>
</ul>
<p>The main concerns with the above architecture are that the NetApp storage is overwhelmed and that they rely too much on CDNs.</p>
<p>Obviously, when your app grows like hell, you start to think that it is better to build your own toys, fully customized and optimized for your particular problem. So did Amazon back in 2001, and Google too. This is how the Facebook storage was born: <strong>Haystack</strong></p>
<h2>Haystack</h2>
<p>The answer was to develop in-house a distributed file system like GFS (Google File System). Haystack should run on inexpensive commodity hardware, and it should deliver high aggregate performance to a large number of clients.</p>
<p>Haystack is file based and stores arbitrary data in files. For each 1 GB of on-disk data they keep about 1 MB of in-memory index. This way a read needs a single disk seek, which is much better than NetApp, which needed 3.</p>
<p>The Haystack format is rather simple and efficient: version number, magic number (supplied by the client to prevent brute-force attacks), length, data, checksum. The index simply stores the version, photo key, photo size, start and length.</p>
<p>Using a Haystack server:</p>
<div>
<div style="border-style: none; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: #f4f4f4;">
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: white;">To write, use POST</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: #f4f4f4;">/write/[pvid]_[key]_[magic]_[size].jpg</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: white;">- writes the data to the on-disk haystack file</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: #f4f4f4;">- adds an entry to the in-memory index</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: #f4f4f4;">To read, use GET</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: #f4f4f4;">/[pvid]_[key]_[magic]_[size].jpg</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: white;">- uses the in-memory index to retrieve the offset</pre>
<pre style="border-style: none; margin: 0em; padding: 0px; overflow: visible; font-size: 8pt; width: 100%; color: black; line-height: 12pt; font-family: consolas,'Courier New',courier,monospace; background-color: #f4f4f4;">- reads the data from the on-disk file</pre>
</div>
</div>
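<p>The write and read paths above can be sketched in a few lines of Python. This is only a toy model of the idea (append records to one big file, keep the offsets in memory), not Facebook&#8217;s implementation. The field widths (1-byte version, 8-byte magic number, 4-byte length, CRC32 checksum), the class name <em>TinyHaystack</em> and the use of an in-memory buffer in place of the on-disk file are all assumptions made for the example.</p>
<pre class="python:nogutter:nocontrols">import io
import struct
import zlib

HEADER = struct.Struct("!BQI")  # version, magic number, data length
VERSION = 1


class TinyHaystack:
    """Append-only blob store with an in-memory index (illustrative only)."""

    def __init__(self, storage=None):
        # storage stands in for the big on-disk haystack file.
        self.storage = storage if storage is not None else io.BytesIO()
        self.index = {}  # photo key mapped to (start offset, record length)

    def write(self, key, magic, data):
        start = self.storage.seek(0, io.SEEK_END)
        record = HEADER.pack(VERSION, magic, len(data)) + data
        record += struct.pack("!I", zlib.crc32(data))
        self.storage.write(record)
        self.index[key] = (start, len(record))

    def read(self, key, magic):
        # One index lookup, then one seek and one read on the data file.
        start, length = self.index[key]
        self.storage.seek(start)
        record = self.storage.read(length)
        _version, stored_magic, size = HEADER.unpack_from(record)
        data = record[HEADER.size:HEADER.size + size]
        (checksum,) = struct.unpack_from("!I", record, HEADER.size + size)
        if stored_magic != magic or checksum != zlib.crc32(data):
            raise ValueError("magic number or checksum mismatch")
        return data


store = TinyHaystack()
store.write("photo-1", 0xCAFE, b"jpeg bytes of the first photo")
store.write("photo-2", 0xBEEF, b"jpeg bytes of the second photo")
print(store.read("photo-2", 0xBEEF))
</pre>
<p>The point of the design is visible in <em>read</em>: the index lookup happens in memory, so the only disk work is a single seek followed by a single sequential read.</p>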
<p>This simple approach allows Facebook to easily balance reads and writes across Haystack clusters, but to speed up reads they still plan to use CDNs in areas where they don’t have datacenters, and Cachr for profiles. This is their first step towards creating their own CDN.</p>
<h2>Additional readings</h2>
<p><a href="http://www.flowgram.com/p/2qi3k8eicrfgkv/">Needle in a haystack: efficient storage of billions of photos</a></p>
<p><a href="http://www.facebook.com/note.php?note_id=30695603919">Engineering[at]Facebook’s Notes</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/03/facebook-temporarily-lost-data/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Simple way to scale your Web App &#8211; Part 1</title>
		<link>http://www.bserban.org/2009/02/simple-way-to-scale-your-wep-app-part-1/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=simple-way-to-scale-your-wep-app-part-1</link>
		<comments>http://www.bserban.org/2009/02/simple-way-to-scale-your-wep-app-part-1/#comments</comments>
		<pubDate>Sun, 08 Feb 2009 12:12:50 +0000</pubDate>
		<dc:creator>bserban</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[High Availability]]></category>
		<category><![CDATA[MySql]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Tomcat]]></category>

		<guid isPermaLink="false">http://www.bserban.org/?p=3</guid>
		<description><![CDATA[There is a simple way to scale your application and it doesn't cost much. Having such an architecture will let you sleep soundly at night.]]></description>
			<content:encoded><![CDATA[
<p>Every time I built a J2EE application, my main concern was how many requests it could ultimately handle. Early in my career I saw my application become a victim of its own success: too much load for a single server. We always took a reactive attitude and tried to deal with the problem when it happened, but sometimes it was too late. For an application to scale, it must be designed to scale.</p>
<p>But what would be a simple scalable architecture?</p>
<p><img class="alignleft size-full wp-image-70" src="http://www.bserban.org/wp-content/uploads/2009/02/simple-scalable-architecture1.png" alt="Simple Web Application Architecture" width="291" height="439" /></p>
<p>Let&#8217;s consider the diagram on the left. Our application is split into three logical clusters: the application cluster, the database cluster, and the logging and statistics cluster.</p>
<h1>The Load Balancer</h1>
<p>A load balancer distributes the traffic among your application servers. It can be <strong>software</strong> or a <strong>hardware</strong> device. A handy solution is to use a software balancer, such as Apache, but a software solution is generally not as robust or as fast as a hardware balancer.</p>
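<p>To make the idea of distributing traffic concrete, here is a minimal sketch of round-robin selection, the simplest balancing policy. It is only an illustration: the class name and the host names are hypothetical, and a real balancer would also track server health and connection counts.</p>

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin picker; the host names used below are made up. */
class RoundRobinBalancer {
    private final String[] backends;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(String... backends) {
        if (backends.length == 0) {
            throw new IllegalArgumentException("at least one back-end is required");
        }
        this.backends = backends.clone();
    }

    /** Returns the back-end that should receive the next request. */
    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), backends.length);
        return backends[index];
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer("app1:8080", "app2:8080", "app3:8080");
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " goes to " + lb.pick());
        }
    }
}
```

<p>Each call to <code>pick()</code> hands out the next server in the list and wraps around at the end, so the load is spread evenly as long as requests cost roughly the same.</p>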
<h2>Hardware</h2>
<p>A dedicated load-balancing device is more suitable and gives you better performance. It comes at an additional cost, but there are many devices on the market and you should choose the one with the best cost/feature ratio. When evaluating a load balancer, keep the following in mind:</p>
<ul>
<li><strong>Maximum connections supported</strong></li>
<li><strong>Throughput</strong></li>
<li><strong>RIPS</strong> (Real IP Address of a Real Server)</li>
<li><strong>SSL</strong></li>
<li><strong>Gb-NICs</strong></li>
<li><strong>Price</strong></li>
</ul>
<h2>Software alternatives</h2>
<ul>
<li>Apache, <code>mod_rewrite</code>, which can be used to provide simple URL-based balancing.</li>
<li>Apache, <a title="mode_proxy_balancer" href="http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html">mod_proxy_balancer</a>, which provides load balancing support for the <code>HTTP</code>, <code>FTP</code> and <code>AJP13</code> protocols. The best option for a medium-sized web application.</li>
<li>Apache, <a title="apache mod backhand" href="http://www.backhand.org/">mod_backhand</a>, for HTTP and log spread. It is considered faster than mod_proxy_balancer.</li>
<li><a title="Perball" href="http://www.danga.com/perlbal/">Perlbal</a> from Danga, already used for <a href="http://www.livejournal.com/">LiveJournal</a>, <a href="http://www.vox.com/">Vox</a> and <a href="http://www.typepad.com/">TypePad</a>.</li>
<li><a title="high-performance HTTP accelerator" href="http://varnish.projects.linpro.no/">Varnish</a>, an open source HTTP accelerator and a newcomer. It also has load balancing capabilities, see <a title="Varnish FAQ" href="http://varnish.projects.linpro.no/wiki/FAQ">http://varnish.projects.linpro.no/wiki/FAQ</a>.</li>
<li><a title="Pure Load Balancer" href="http://plb.sunsite.dk/index.html">Pure Load Balancer [PLB]</a>, for HTTP and SMTP.</li>
<li><a title="Linux Virtual Server" href="http://www.linuxvirtualserver.org/">Linux Virtual Server</a>, an advanced load balancing solution that can be used to build highly scalable and highly available network services, such as web, cache, mail, FTP, media and VoIP services. This solution is aimed at bigger deployments than scaling a simple application, though.</li>
</ul>
<p>To balance SSL connections, your balancer should provide SSL termination. Otherwise, because SSL is a connection-level protocol, the SSL connection has to persist between client and server, which means the balancer must always allocate the same host to the same client.</p>
<p>More about Load Balancing in a future post.</p>
<h1>The Application Cluster</h1>
<p>A web application can only scale if it was designed to scale. The main issue is how session state is handled. Depending on the load balancing algorithm, you will need either to replicate the session or to use sticky sessions.</p>
<h2>Stateless</h2>
<p>This mode doesn&#8217;t require much from a load balancer. It is very common in REST applications, and for a Web 2.0 application it could be the usual approach. The application stores everything it needs on the client side. The first user request may go to one machine while the second hits a different machine. No data has to be shared between web servers. To handle more requests, new servers can be added to the web pool and the system will scale out.</p>
<p>A stateless design gives us seamless failover, and it can be achieved no matter what language we use to develop our application.</p>
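<p>One way to keep state on the client without trusting the client is to sign it. The sketch below is an assumption of mine rather than something the architecture requires: it packs the state into a token protected by an HMAC, so any server that knows the shared secret can verify it. The class name, the token layout and the secret are all hypothetical.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: client-side state as "payload.signature" (Base64-url parts). */
class SignedClientState {
    private final byte[] key;

    SignedClientState(String secret) {
        this.key = secret.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes the state and appends an HMAC-SHA256 signature. */
    String encode(String state) {
        byte[] payload = state.getBytes(StandardCharsets.UTF_8);
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        return enc.encodeToString(payload) + "." + enc.encodeToString(sign(payload));
    }

    /** Returns the original state, or null if the token is malformed or was altered. */
    String decode(String token) {
        int dot = token.indexOf('.');
        if (dot < 0) {
            return null;
        }
        try {
            Base64.Decoder dec = Base64.getUrlDecoder();
            byte[] payload = dec.decode(token.substring(0, dot));
            byte[] signature = dec.decode(token.substring(dot + 1));
            // Constant-time comparison of the expected and the received signature.
            if (!MessageDigest.isEqual(sign(payload), signature)) {
                return null;
            }
            return new String(payload, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException badBase64) {
            return null;
        }
    }

    private byte[] sign(byte[] payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(payload);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }
}
```

<p>Because every server can validate the token on its own, no server needs to remember anything between requests. Note that the payload is signed but not encrypted, so it should not carry secrets.</p>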
<h2>Sticky session</h2>
<p>Sometimes our application needs to store user-specific data at session level. This means that every time a user hits the server, we need some local data and state to process the request. This data must be available on every server that the user&#8217;s requests reach. To cope with this problem we can use &#8220;sticky sessions&#8221;, which means ensuring that the user always hits the same server that handled the initial request.</p>
<p>The most common approach to sticky sessions is:</p>
<ul>
<li>a cookie that holds the routing information</li>
<li>configuration that defines this cookie id for the balancer</li>
<li>configuration that defines routing for each back-end</li>
</ul>
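<p>The three pieces above can be sketched as follows. The sketch assumes the session id ends with a route suffix after a dot (Tomcat, for instance, can append its <code>jvmRoute</code> to the session id this way); the class name, route names and addresses are hypothetical.</p>

```java
/** Hypothetical sticky router: maps the route suffix of a session id to a back-end. */
class StickyRouter {
    private final String[] routes;   // route names, e.g. "node1"
    private final String[] backends; // back-end address for the route at the same index
    private int nextFresh = 0;       // round-robin cursor for clients without a valid route

    StickyRouter(String[] routes, String[] backends) {
        if (routes.length == 0 || routes.length != backends.length) {
            throw new IllegalArgumentException("need one back-end per route");
        }
        this.routes = routes.clone();
        this.backends = backends.clone();
    }

    /** Extracts the route from a session id such as "A1B2C3.node2"; null if there is none. */
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    /** Returns the pinned back-end, or the next one in turn for new or unknown clients. */
    synchronized String backendFor(String sessionId) {
        String route = routeOf(sessionId);
        for (int i = 0; i < routes.length; i++) {
            if (routes[i].equals(route)) {
                return backends[i];
            }
        }
        String chosen = backends[nextFresh];
        nextFresh = (nextFresh + 1) % backends.length;
        return chosen;
    }
}
```

<p>A request whose cookie carries a known route always lands on the same server; everything else is balanced normally and becomes sticky once the server has issued its cookie.</p>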
<h2>Session replication</h2>
<p>This technique is very common in J2EE, where servlet containers provide a way to replicate the session between servers. A session has to meet several conditions before it can be replicated, but replication can be a viable alternative to sticky sessions.</p>
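<p>One of those conditions, in Tomcat for example, is that every session attribute must implement <code>java.io.Serializable</code>, because the container ships the attributes to the other nodes in serialized form. The sketch below uses a hypothetical <code>ShoppingCart</code> attribute and simulates the trip to another node with plain Java serialization; it does not use any container API.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical session attribute; it must be Serializable to be replicated. */
class ShoppingCart implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String userId;
    private int itemCount;

    ShoppingCart(String userId) {
        this.userId = userId;
    }

    void addItem() {
        itemCount++;
    }

    String userId() {
        return userId;
    }

    int itemCount() {
        return itemCount;
    }
}

/** Simulates what replication does to an attribute: serialize here, deserialize there. */
class SessionCopy {
    static Object roundTrip(Serializable attribute) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(attribute);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("attribute cannot be replicated", e);
        }
    }
}
```

<p>Running every session attribute through a round trip like this in a unit test is a cheap way to catch a non-serializable field before it breaks replication in production.</p>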
<h1>The Database cluster</h1>
<p>Obviously, my immediate choice is MySQL as the database server. It&#8217;s free, has a large community, and serves a lot of well-known Web 2.0 sites.</p>
<p>With MySQL you can scale horizontally by using its replication mechanism. When the database grows, it is time to partition the data horizontally into shards.</p>
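<p>As a rough illustration of sharding, the sketch below maps a user id to one of several databases with a simple modulo. The class name and the JDBC URLs are hypothetical, and the modulo scheme is just the simplest possible choice; it makes adding shards later painful, which is why real systems often use a lookup table or consistent hashing instead.</p>

```java
/** Hypothetical shard router: picks a database by user id modulo the shard count. */
class ShardRouter {
    private final String[] shardUrls;

    ShardRouter(String... shardUrls) {
        if (shardUrls.length == 0) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardUrls = shardUrls.clone();
    }

    /** Index of the shard that owns this user; floorMod also handles negative ids. */
    int shardIndex(long userId) {
        return (int) Math.floorMod(userId, (long) shardUrls.length);
    }

    /** JDBC URL of the shard that owns this user. */
    String urlFor(long userId) {
        return shardUrls[shardIndex(userId)];
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(
                "jdbc:mysql://db0.example.com/app",
                "jdbc:mysql://db1.example.com/app");
        System.out.println("user 42 lives on " + router.urlFor(42));
    }
}
```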
<h2>Replication</h2>
<h4>Master-Slave</h4>
<p>This type of configuration helps you scale reads independently of writes. Reads go to the slaves, while transactions that alter data go to the master. You can have one master and multiple slaves.</p>
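<p>In the application this split usually shows up as a small routing decision made before a connection is opened. Here is a minimal sketch, with a hypothetical class name and placeholder URLs, that sends writes to the master and rotates reads across the slaves.</p>

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical read/write splitter for one master and any number of slaves. */
class ReplicaRouter {
    private final String masterUrl;
    private final String[] slaveUrls;
    private final AtomicInteger nextSlave = new AtomicInteger();

    ReplicaRouter(String masterUrl, String... slaveUrls) {
        this.masterUrl = masterUrl;
        this.slaveUrls = slaveUrls.clone();
    }

    /** Writes always go to the master; reads rotate over the slaves if there are any. */
    String urlFor(boolean readOnly) {
        if (!readOnly || slaveUrls.length == 0) {
            return masterUrl;
        }
        int index = Math.floorMod(nextSlave.getAndIncrement(), slaveUrls.length);
        return slaveUrls[index];
    }
}
```

<p>Keep in mind that MySQL replication is asynchronous by default, so a slave can lag behind the master. A read that must see a write the same user has just made is safer on the master.</p>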
<h4>Master &#8211; Master</h4>
<p>Such an approach distributes the load evenly and also provides high availability. MySQL 5.0 replicates at statement level, and conflicts can often break the replication. The most common conflicts I have met involve unique indexes, but you can cope with this problem by using the &#8220;REPLACE&#8221; command instead of INSERT. In any case, MySQL 5.1 will bring some significant changes to the replication module.</p>
<h4>Tree Replication</h4>
<p>This gives you a lot of possibilities: you can combine master-master with master-slave in a tree structure, and combined with sharding it can result in a finely tuned MySQL cluster. More about tree replication and data partitioning in future posts.</p>
<h1>Logging &amp; Batch processing</h1>
<p>Finally we reach the last component of our application: the logging server and the batch processing machine. Why do we need them? Simply because somebody has to do that work. Every application has some batch processing to do, usually with cron and scripts. I recommend using a scripting language for batch processing, such as bash, Ruby or Python.</p>
<p>For logging I suggest syslog, or syslog-ng, which has more advanced features than syslog, better CPU performance, and UTF-8 support.</p>
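<p>To give an idea of how an application server can ship its log lines to a central syslog server, here is a minimal sketch that sends a message over UDP. It uses a simplified BSD-syslog line (priority, tag, message) without timestamp or host name, which is an assumption on my part; the class name, tag and target host are hypothetical, and in practice a logging library with a syslog appender would do this for you.</p>

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical minimal syslog client that sends one UDP datagram per message. */
class SyslogSender {
    static final int FACILITY_LOCAL0 = 16;
    static final int SEVERITY_INFO = 6;

    /** Builds a simplified line; the priority value is facility * 8 + severity. */
    static String format(int facility, int severity, String tag, String message) {
        int priority = facility * 8 + severity;
        return "<" + priority + ">" + tag + ": " + message;
    }

    /** Sends the line to the syslog server; UDP gives no delivery guarantee. */
    static void send(String host, int port, String line) throws IOException {
        byte[] data = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(data, data.length, InetAddress.getByName(host), port));
        }
    }

    public static void main(String[] args) throws IOException {
        String line = format(FACILITY_LOCAL0, SEVERITY_INFO, "webapp", "user logged in");
        send("localhost", 514, line); // 514 is the conventional syslog UDP port
    }
}
```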
<h1>What&#8217;s next?</h1>
<p>Next I would like to walk step by step through designing a J2EE application based on the architecture discussed above, using Apache, Tomcat, Struts, Hibernate and MySQL.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bserban.org/2009/02/simple-way-to-scale-your-wep-app-part-1/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
