This weekend I tried to build an AIR client for searching Adobe Community Help. I had no prior experience with Flex and AIR, and my first attempt ended up looking like a simple college project: spaghetti code. All the examples found on the web or in the official documentation rely on massive amounts of MXML, so I ended up with one glorious, huge MXML file.
Some smart guys tell you to split your MXML using an ActionScript file, but that is like sweeping the trash under the carpet. It is still spaghetti code, just served in two dishes.
I tried to remember how I would have programmed it in Java Swing. Nine years have passed since then, but I remembered: TableModel, DefaultListModel, ButtonModel and friends all came out from a dark corner of my memory. Yes, I remembered Swing MVC.
A typical sample from the documentation…
Who hard-codes an application this way? In a typical application you get this data from files or a database. But let's look at the good part: at least it gives you a hint that ComboBox and List accept an Array as input.
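The hard-coded documentation sample mentioned above typically looks something like this (a reconstruction for illustration, not the exact original):

```xml
<!-- Hard-coded data baked straight into the view: the kind of
     sample the Flex docs give, and exactly what MVC avoids. -->
<mx:ComboBox id="fruitCombo">
    <mx:dataProvider>
        <mx:Array>
            <mx:String>Apple</mx:String>
            <mx:String>Orange</mx:String>
            <mx:String>Banana</mx:String>
        </mx:Array>
    </mx:dataProvider>
</mx:ComboBox>
```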
MVC applied to Flex
First of all, a brief overview of MVC:
- Model: the data. It manages internal state and fires events when that state changes.
- View: the visual representation of the Model's data (the controls on the screen).
- Controller: responsible for interpreting the user's actions on the view and making changes to the model (usually an event handler in Flex).
In reality there is no 100% demarcation between these three layers. It is not easy to make them completely decoupled, and usually we end up making some tradeoffs.
The controller will always know about the view, and the view about the controller. The controller also knows about the model. In the end, I would say the model is the only piece of the MVC that can be "100% decoupled".
Let’s try to put a simple screen like this into MVC.
We fill in some text and then press Add. After we press Add, the text box is cleared and the text is added to the list. In the end we will look into it and try to turn it into a reusable component.
The model contains the actual data; for our example it should be a class that holds the list values. As I pointed out earlier, we could use an ArrayCollection to store the list. For this we create an ActionScript class named ListModel.
The view should display the model's changes, and for this we mark the model class as [Bindable]. This is a nice feature of ActionScript; there is no such thing in Java Swing.
Now, to add a new item to the list we simply need an addElement function. In this example we make the model a singleton, which means that if we create more than one list they will share the same data.
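A minimal sketch of what such a model might look like (the class name ListModel comes from the text; the package name and the exact implementation are my assumptions):

```actionscript
package model
{
    import mx.collections.ArrayCollection;

    [Bindable]
    public class ListModel
    {
        // Singleton instance: every list shares the same data
        private static var instance:ListModel;

        // The actual list data the view binds to
        public var items:ArrayCollection = new ArrayCollection();

        public static function getInstance():ListModel
        {
            if (instance == null)
                instance = new ListModel();
            return instance;
        }

        // Adding an item fires a collection change event,
        // so any bound view updates automatically
        public function addElement(value:String):void
        {
            items.addItem(value);
        }
    }
}
```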
The controller is responsible for the interaction between view and model. In our case it validates the text: it does not allow the view to add empty text to the list. Let's create a new class named Controller. As an exercise, add a sort function as well.
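The controller described above might be sketched like this (only the class name Controller comes from the text; the validation details are my assumption):

```actionscript
package control
{
    import model.ListModel;
    import mx.utils.StringUtil;

    public class Controller
    {
        // The controller holds a reference to the model
        private var model:ListModel = ListModel.getInstance();

        // Reject empty or whitespace-only input before touching the model
        public function addElement(value:String):void
        {
            if (value != null && StringUtil.trim(value).length > 0)
                model.addElement(value);
        }
    }
}
```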
The controller holds a reference to the model and provides functions to the view. In a more advanced implementation, the controller would listen for events from the view and then decide what action to take.
The view is the graphical representation of the component. Usually it is an MXML file, but it can also be an ActionScript file for advanced programmers. It binds its data to the model and uses the controller to process the view events. Let's name it SimpleAirMVC.mxml. I added Air to its name because I decided to create an AIR project in Flex Builder.
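The view might look roughly like this (only the file name SimpleAirMVC.mxml comes from the text; the control ids and binding expressions are illustrative):

```xml
<?xml version="1.0" encoding="utf-8"?>
<mx:WindowedApplication xmlns:mx="http://www.adobe.com/2006/mxml">
    <mx:Script>
        <![CDATA[
            import control.Controller;
            import model.ListModel;

            [Bindable]
            private var model:ListModel = ListModel.getInstance();
            private var controller:Controller = new Controller();

            private function onAdd():void
            {
                controller.addElement(input.text);
                input.text = "";   // clear the text box after adding
            }
        ]]>
    </mx:Script>

    <!-- The list binds directly to the model's data -->
    <mx:List dataProvider="{model.items}" width="200"/>
    <mx:TextInput id="input"/>
    <mx:Button label="Add" click="onAdd()"/>
</mx:WindowedApplication>
```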
This is a simple way to implement MVC in a small application. Things can become more complicated if the application is big; if that is the case, you should look at the PureMVC framework or Cairngorm. They eliminate the dependencies between the MVC layers by using events; it is an event-driven approach.
You may download the source code here SimpleAirMVC
Recently I had to build an API for my application. Coming from the J2EE world, my first thought was to create a SOAP-based web service, but I soon realized that this type of J2EE web service is heavyweight: slow, cumbersome, and requiring specialized frameworks or J2EE containers that support such services. After a careful study of the problem, I concluded that the best solution would be REST-like services based on XML and JSON.
Read more about REST in Roy Thomas Fielding's dissertation, Representational State Transfer (REST). It will give you some insight into what REST should be.
Anyway, I don’t plan to write about REST itself; I just want to share some best practices for developing a web API. When you design an API, be aware that from the moment it is launched to the public, changing it becomes nearly impossible. An API evolves over time, but because you already have customers, you need to stay compatible with earlier versions, otherwise customers will leave.
Some things to keep in mind.
- Create a subdomain for the API; it will help you a lot to load-balance your traffic. You could also use a URL path, but then the API shares the same entry point as the main application. The best option is a dedicated subdomain.
- Version the API by including the version in the URL. This will help you stay compatible with earlier versions of the API until everyone upgrades to the new version.
- Split your API into packages by using URL namespaces.
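To illustrate the two previous points, a versioned, namespaced URL scheme might look like this (the domain and paths are hypothetical):

```
https://api.example.com/v1/users/123
https://api.example.com/v1/search/products?q=shoes
https://api.example.com/v2/users/123      <- new version, v1 still served
```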
- Create API keys. You need a way to see who is using your API and how. If you do not have such keys, you will never know how many customers you have. This practice allows you to measure service usage per customer and to impose usage limits.
- Monitor everything. Use your access logs to monitor the use of your services. You need to know how many accesses, errors, reads, queries and changes you have for each service.
- Create API documentation with examples. Create applications for demo purposes.
- Use GET for reads and POST for changes. If a change does not require a large volume of data, transmit the data via URL parameters; this way you can log it in access.log, which is useful for statistics.
- Use the data collected in the access logs to improve the service or to build personalization and recommendation engines.
Keep an eye on this post, because I intend to update it regularly. Do you know other good practices? If yes, leave a message. Thanks!
On 21 January, JBoss announced the first GA release of RESTEasy.
Like any other Java nuts-and-bolts framework, it is “certified” against the JAX-RS specification, which makes me worry that it is a heavy approach.
JBoss RESTEasy is a framework that allows you to write RESTful web services in Java. It is a fully certified and portable implementation of the JAX-RS specification.
It can run in a Servlet container such as Tomcat, but the full benefits come when it is integrated with JBoss AS. What is new is that, although JAX-RS is only a server-side specification, the JBoss team innovated on the client side and implemented a JAX-RS client framework to speed up development.
* Fully certified JAX-RS implementation
* Portable to any app-server/Tomcat that runs on JDK 5 or higher
* Embeddable server implementation for JUnit testing
* Rich set of providers for: XML, JSON, YAML, Fastinfoset, Atom, etc.
* JAXB marshalling into XML, JSON, Fastinfoset, and Atom as well as wrappers for arrays, lists, and sets of JAXB Objects.
* Asynchronous HTTP (Comet) abstractions for JBoss Web, Tomcat 6, and Servlet 3.0
* EJB, Spring, and Spring MVC integration
* Client framework that leverages JAX-RS annotations so that you can write HTTP clients easily (JAX-RS only defines server bindings)
I know it sounds like another heavy J2EE framework, but give it a try.
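As a taste of the programming model, a JAX-RS resource looks roughly like this (the class, path and payload are my own illustration; the JAX-RS API jar must be on the classpath):

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;

// RESTEasy discovers this class and maps GET /customers/{id} to the method.
@Path("/customers")
public class CustomerResource {

    @GET
    @Path("/{id}")
    @Produces("application/xml")
    public String getCustomer(@PathParam("id") int id) {
        return "<customer id=\"" + id + "\"/>";
    }
}
```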
When you create an index at design time, you can only guess what will happen to your application in production. Once you are live, the real hunt for the right indexes begins. Very often we choose bad indexes, ignoring basic rational thinking. I’ve seen a lot of indexes that were slowing the application down instead of speeding it up.
Modern databases have advanced so far that the differences between indexing a number column and indexing a varchar column are no longer obvious. Whether you plan to create an index on a varchar or on a number column, first you need to lay out the selects you are going to run against that table. Not doing so is a waste of time. Next, think about the distribution of the data.
For example, let’s assume we store 11,000,000 users in a table called my_users with the following fields: username, email, first name, last name, nickname, birthday, age, gender, country. Our application will search for users using the following fields:
1. username
2. first name, last name, gender
3. email
For 1 and 3 the indexes are obvious: an index on username and an index on email. If the username is unique, its selectivity equals the primary key’s selectivity. But what is selectivity? A simple definition: the selectivity of a column is the ratio between the number of distinct values and the total number of values. A primary key has a selectivity of 1.
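Selectivity is easy to estimate with a quick query (table and column names taken from the example above):

```sql
-- Selectivity = distinct values / total rows; 1.0 is perfect (unique)
SELECT COUNT(DISTINCT username) / COUNT(*) AS selectivity
FROM my_users;
```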
So, coming back to case number 2, what would be the best indexes? Let’s take the gender column first. Here we have only two possible values, M and F. That means a selectivity of 2/11,000,000, which is practically 0: an awful index. If you have such an index you may as well drop it, because a full table scan could be more efficient than using it.
How about first name and last name? This is a little more complicated; it depends on what names you are storing. If you have 30,000 Johns and 50,000 Smiths, the individual indexes are useless. A simple select distinct on each column will give you the numbers to calculate the selectivity: if it is above 15% the index is good, otherwise drop it. One great thing is that in this case you can create a composite index on first name + last name, which will certainly give you a higher selectivity. Don’t forget that in MySQL an index on string columns is limited to 1000 bytes, which means that with UTF-8 you can only include 333 characters.
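A sketch of that composite index, with explicit prefix lengths to stay under MySQL’s byte limit (the column names are assumed from the example):

```sql
-- The selectivity of the (first_name, last_name) pair is much higher
-- than that of either column alone. Prefix lengths keep the index
-- within MySQL's byte limit for string columns.
CREATE INDEX idx_name ON my_users (first_name(50), last_name(50));
```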
Select the worst-performing indexes by using the following SQL. Note that composite indexes will appear multiple times in the result, once for each column, so you need to pick the last appearance. Anything below 15% needs to be analyzed.
```sql
/* SQL script to grab the worst performing indexes in the whole server */
SELECT t.TABLE_SCHEMA AS `db`,
       t.TABLE_NAME AS `table`,
       s.INDEX_NAME AS `index name`,
       s.COLUMN_NAME AS `field name`,
       s.SEQ_IN_INDEX AS `seq in index`,
       s2.max_columns AS `# cols`,
       s.CARDINALITY AS `card`,
       t.TABLE_ROWS AS `est rows`,
       ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
FROM INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t
    ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
   AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
    SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME,
           MAX(SEQ_IN_INDEX) AS max_columns
    FROM INFORMATION_SCHEMA.STATISTICS
    WHERE TABLE_SCHEMA != 'mysql'
    GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS s2
    ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
   AND s.TABLE_NAME = s2.TABLE_NAME
   AND s.INDEX_NAME = s2.INDEX_NAME
WHERE t.TABLE_SCHEMA != 'mysql'  /* Filter out the mysql system DB */
  AND t.TABLE_ROWS > 10          /* Only tables with some rows */
  AND s.CARDINALITY IS NOT NULL  /* Need at least one non-NULL value in the field */
  AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00
      /* Selectivity < 1.0 b/c unique indexes are perfect anyway */
ORDER BY `sel %`, s.TABLE_SCHEMA, s.TABLE_NAME
      /* Switch to `sel %` DESC for best non-unique indexes */
```
This query is taken from MySQL Forge.
Every time I built a J2EE application, my main concern was how many requests it could handle in the end. In the early days of my career I was in the situation where my app became a victim of its own success: too much load for a single server. We always had a reactive attitude and tried to deal with the problem when it happened, but sometimes it was just too late. To be able to scale, your application must be designed for scale.
But what would be a simple scalable architecture?
Let’s consider the diagram on the left. Our application will be split into three logical clusters: first the application cluster, second the database cluster, and finally the logging and statistics cluster.
The Load Balancer
A load balancer distributes the traffic among your application servers. It can be software or a hardware device. A handy solution is a software balancer such as Apache, but a software solution is not as robust and performant as a hardware balancer.
A dedicated load-balancing device is more suitable and gives you more performance. However, this comes at an additional cost; there are many devices on the market, and you should choose the best one by cost/features. When evaluating a load balancer, keep a few things in mind:
- Maximum connections supported
- RIPS (Real IP Address of a Real Server)
On the software side there are several options:
- Apache mod_rewrite: can be used to provide simple URL-based balancing.
- Apache mod_proxy_balancer: provides load balancing support for HTTP and AJP13 protocols. The best option for a medium web application.
- Apache mod_backhand (with mod_log_spread): considered more performant than mod_proxy_balancer.
- Perlbal from Danga, already used for LiveJournal, Vox and TypePad.
- Varnish, an open-source HTTP accelerator and a relative newcomer. It also has load balancing capabilities; see http://varnish.projects.linpro.no/wiki/FAQ
- Pure Load Balancer (PLB), for HTTP and SMTP.
- Linux Virtual Server, an advanced load balancing solution that can be used to build highly scalable and highly available network services such as web, cache, mail, FTP, media and VoIP services. This solution serves bigger purposes than scaling a simple application, though.
To balance SSL connections, your balancer should provide SSL termination. Otherwise, since SSL is a connection-level protocol, the SSL session has to persist between server and client, which forces the balancer to allocate the same host to the same client.
More about Load Balancing in a future post.
The Application Cluster
To be able to scale, a web application has to be designed to scale. The main issue is session state: depending on the load balancing algorithm, you will need to replicate the session or use sticky sessions.
A stateless design doesn’t require much from a load balancer. It is very common in REST applications, and for a web 2.0 application this could be the standard approach. The application stores everything it needs on the client side. The first user request goes to the first machine while the second hits a different machine; no data has to be shared between web servers. To handle more requests, new servers can be added to the web pool and the system will scale out.
Having a stateless design gives us seamless failover. This can be achieved no matter what language we use to develop our application.
Sometimes our application needs to store user-specific data at session level. This means that every time a user hits the server we need some data to process the request: local data and state. This data must be available on whichever server the user’s requests arrive. To cope with this problem we can use “sticky sessions”, which means we ensure the user always hits the same server as the initial request.
The most common approach to sticky sessions is:
- a cookie that holds the routing information
- configuration that defines this cookie id for the balancer
- configuration that defines routing for each back-end
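With Apache’s mod_proxy_balancer in front of Tomcat, the three items above might look like this (host names and route ids are hypothetical):

```apacheconf
# Each back-end declares a route; Tomcat must set the matching
# jvmRoute ("app1"/"app2") so it is appended to the JSESSIONID cookie.
<Proxy balancer://mycluster>
    BalancerMember ajp://app1.internal:8009 route=app1
    BalancerMember ajp://app2.internal:8009 route=app2
</Proxy>

# Route each request based on the route suffix in the session cookie
ProxyPass / balancer://mycluster/ stickysession=JSESSIONID
```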
The alternative is session replication, which is very common in J2EE, where the Servlet containers provide a way to replicate the session between servers. There are several conditions for a session to be replicated, but it can be a viable alternative to sticky sessions.
The Database Cluster
Obviously, my immediate choice for a database server is MySQL. It’s free, it has a large community, and it serves a lot of well-known web 2.0 sites.
Using MySQL you can scale horizontally by using MySQL’s replication mechanism. When the database grows, it is time to partition the data horizontally into shards.
Master – Slave
This type of configuration helps you separate the reads from the writes. The reads always go to a Slave, while the transactions that alter data go to the Master. You can have one Master and multiple Slaves.
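A minimal master–slave setup, sketched in my.cnf terms (server ids and log names are illustrative):

```ini
# Master my.cnf: enable the binary log so slaves can replicate from it
[mysqld]
server-id = 1
log-bin   = mysql-bin

# Slave my.cnf: unique server id; read_only rejects direct writes
[mysqld]
server-id = 2
read_only = 1
```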
Master – Master
Such an approach distributes the load evenly and also provides high availability. MySQL 5.0 replicates at the statement level, which can often break replication because of conflicts. The most common conflicts I have met involve unique indexes, but you can cope with this problem by using the REPLACE statement instead of INSERT. Anyway, MySQL 5.1 will bring some significant changes to the replication module.
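For example, on a table with a unique index on email (a hypothetical schema), REPLACE resolves the conflict by overwriting the existing row instead of failing:

```sql
-- INSERT would stop replication on a duplicate-key conflict;
-- REPLACE deletes the conflicting row and inserts the new one.
REPLACE INTO users (email, nickname)
VALUES ('john@example.com', 'johnny');
```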
This gives you a lot of possibilities: you can combine master–master with master–slave in a tree structure, and combined with sharding it can result in a finely tuned MySQL cluster. More about tree replication and data partitioning in further posts.
Logging & Batch processing
In the end we have reached the final component of our application: the logging server and the batch processing machine. Why do we need them? Simply because somebody has to do this work. Every application has some batch processing to be done, usually with cron and scripts. I recommend using a scripting language for batch processing, such as bash, ruby or python.
For logging I would suggest syslog, or syslog-ng, which has more advanced features than syslog, better CPU performance, and UTF-8 support.
Next, I would like to walk step by step through designing a J2EE application based on the architecture discussed above, using Apache, Tomcat, Struts, Hibernate and MySQL.