Zend Framework: The Cost of Flexibility is Complexity

The Zend Framework is a very flexible system, in fact, it can be used to build practically anything. But, the problem with using a flexible system is that flexibility costs. And the cost of flexibility is complexity.

Martin Fowler wrote:

Every time you put extra stuff into your code to make it more flexible, you are usually adding more complexity. If your guess about the flexibility needs of your software is correct, then you are ahead of the game. You’ve gained. But if you get it wrong, you’ve only added complexity that makes it more difficult to change your software.

Don’t assume that just because you’re using an object-oriented framework you are writing reusable and maintainable code. You are just structuring your spaghetti code. As Paul Graham put it: “Object-oriented programming offers a sustainable way to write spaghetti code”. The main problem with flexibility is that most developers give up trying to understand. I don’t blame them, no one likes dealing with complexity. However, flexibility is sometimes required and generally always desirable. But, sometimes we forget that frameworks exist to help us solve or address complex issues, not to fix bad design decisions for us. So if your code is difficult to understand or unit test, then you are probably doing something wrong.

Just because you can, doesn’t mean you should

What’s the best way to render a view script in the Zend Framework? Think, don’t give up yet. Is it enabling or disabling the ViewRenderer action helper? Remember, the framework allows you to shape the Zend_Controller_Action class to your application’s needs.

The manual clearly states that:

By default, the front controller enables the ViewRenderer action helper. This helper takes care of injecting the view object into the controller, as well as automatically rendering views.

However, no one stops you from doing it manually:

Zend_Controller_Action provides a rudimentary and flexible mechanism for view integration. Two methods accomplish this, initView() and render(); the former method lazy-loads the $view public property, and the latter renders a view based on the current requested action, using the directory hierarchy to determine the script path.

Developer A

class IndexController extends Zend_Controller_Action
{
    public function searchAction()
    {
        $query = $this->_getParam('q', null);
        if (null === $query) {
            $this->view->error = 'Invalid query';
            return;
        }

        $this->view->query = $query;
        if (is_numeric($query)) {
            $this->setUserDetailsById($query);
        } else {
            $this->setUserDetailsByName($query);
        }
    }

    public function setUserDetailsByName($name)
    {
        $row = $this->findUserByName($name);
        $this->view->userId = $row['id'];
        $this->view->userName = $row['name'];
    }

    public function setUserDetailsById($id)
    {
        $row = $this->findUserById($id);
        $this->view->userId = $row['id'];
        $this->view->userName = $row['name'];
    }
}

Developer B

class IndexController extends Zend_Controller_Action
{
    protected $_viewParams = array(
        'error'    => null,
        'query'    => null,
        'userId'   => null,
        'userName' => null,
    );

    public function init()
    {
        $this->_helper->removeHelper('viewRenderer');
    }

    public function searchAction()
    {
        $query = $this->_getParam('q', null);
        if (null === $query) {
            $this->_viewParams['error'] = 'Invalid query';
            $this->renderView('search');
            return false;
        }

        return $this->displayUserDetails($query);
    }

    public function displayUserDetails($query)
    {
        if (is_numeric($query)) {
            $row = $this->getUserDetailsById($query);
        } else {
            $row = $this->getUserDetailsByName($query);
        }

        if (!$row) {
            $this->_viewParams['error'] = 'User not found';
            $this->renderView('search');
            return false;
        }

        foreach ($this->_viewParams as $key => $value) {
            if (!array_key_exists($key, $row) {
                continue;
            }
            $this->_viewParams[$key] = $row[$key];
        }

        $this->_viewParams['query'] = $query;
        $this->renderView('search');

        return true;
    }

    public function renderView($template)
    {
        $view = $this->initView();
        foreach ($this->_viewParams as $key => $value) {
            $view->$key = $value;
        }
        $this->render($template);
    }

    public function getUserDetailsByName($name)
    {
        return $this->findUserByName($name);
    }

    public function getUserDetailsById($id)
    {
        return $this->findUserById($id);
    }
}

Same problem, different solutions.

Now, guess who is doing test-driven development, has all his code unit tested, puts commonly used code into libraries, abstracts out common patterns of behaviour, writes his code using the top-down fashion and uses the principle of “least astonishment”.

Developer A or B?

Conclusion

Many requirements lie in the future and are unknowable at the time our application is designed and built. To avoid burdensome maintenance costs developers must therefore rely on a system’s ability to change gracefully its flexibility. The combination of flexibility and smart design can reduce the maintenance burden. It’s up to us, the developers, to change our thinking when using a flexible framework. Like I said before, frameworks exist to help us solve complex issues, not to make decisions for us.

Geo Proximity Search: The Haversine Equation

I’m working on a project that requires Geo proximity search. Basically, what I’m doing is plotting a radius around a point on a map, which is defined by the distance between two points on the map given their latitudes and longitudes. To achieve this I’m using the Haversine formula (spherical trigonometry). This equation is important in navigation, it gives great-circle distances between two points on a sphere from their longitudes and latitudes. You can see it in action here: Radius From UK Postcode.

This has already been covered in some blogs, however, I found most of the information to be inaccurate and, in some cases, incorrect. The Haversine equation is very straight forward, so there’s no need to complicate things.

I’ve implemented the solution in SQL, Python and PHP. Use the one that suits you best.

SQL implementation

set @latitude=53.754842;
set @longitude=-2.708077;
set @radius=20;

set @lng_min = @longitude - @radius/abs(cos(radians(@latitude))*69);
set @lng_max = @longitude + @radius/abs(cos(radians(@latitude))*69);
set @lat_min = @latitude - (@radius/69);
set @lat_max = @latitude + (@radius/69);

SELECT * FROM postcode
WHERE (longitude BETWEEN @lng_min AND @lng_max)
AND (latitude BETWEEN @lat_min and @lat_max);

Python implementation

from __future__ import division
import math

longitude = float(-2.708077)
latitude = float(53.754842)
radius = 20

lng_min = longitude - radius / abs(math.cos(math.radians(latitude)) * 69)
lng_max = longitude + radius / abs(math.cos(math.radians(latitude)) * 69)
lat_min = latitude - (radius / 69)
lat_max = latitude + (radius / 69)

print 'lng (min/max): %f %f' % (lng_min, lng_max)
print 'lat (min/max): %f %f' % (lat_min, lat_max)

PHP implementation

$longitude = (float) -2.708077;
$latitude = (float) 53.754842;
$radius = 20; // in miles

$lng_min = $longitude - $radius / abs(cos(deg2rad($latitude)) * 69);
$lng_max = $longitude + $radius / abs(cos(deg2rad($latitude)) * 69);
$lat_min = $latitude - ($radius / 69);
$lat_max = $latitude + ($radius / 69);

echo 'lng (min/max): ' . $lng_min . '/' . $lng_max . PHP_EOL;
echo 'lat (min/max): ' . $lat_min . '/' . $lat_max;

It outputs the same result:

lng (min/max): -3.1983251898421/-2.2178288101579
lat (min/max): 53.464986927536/54.044697072464

That’s it. Happy Geolocating!

Zend Framework Automatic Dependency Tracking

When you develop or deploy an application, dependency tracking is one of the problems you must solve. Keeping track of dependencies for every application you develop is not an easy task. To solve this problem I’ve created Zend_Debug_Include, a Zend Framework component that supports automatic dependency tracking.

We all agree that dependencies cannot be maintained by hand, it’s almost impossible, specially if you are using packages from Zend, Solar, Maintainable and/or Zym (if you don’t know any of these frameworks, check them out, they are pretty cool). Every time you add an include/require statement or create an instance of an object, you introduce a new dependency. And that’s why tracking internal project dependencies can become complex when using a framework.

The concept behind Zend_Debug_Include is that the dependencies for each source file are stored in a separate file. If the source file is modified, the file containing that source file’s dependencies is rebuilt. This concept enables you to determine run-time dependencies of files using arbitrary components. This solution is also useful if you are deploying your application using Linux packages. But, dependency tracking isn’t just useful for deploying applications, it can also be used to evaluate packages. Sometimes packages create unnecessary dependencies and this is something that we need to monitor.

Zend_Debug_Include comes with 3 built-in adapters: File, Package and Url.

Tracking File Dependencies

To track file dependencies you need to create an instance of Zend_Debug_Include_Manager and set the Zend_Debug_Include_Adapter_File adapter. This needs to happen inside your bootstrapper file, before the Front Controller dispatches the request.

$included = new Zend_Debug_Include_Manager();
$included->setAdapter(new Zend_Debug_Include_Adapter_File());
$included->setOutputDir('/var/www/my-app/dependencies');

/* Dispatch request */
$frontController->dispatch();

This creates a zf-files.txt in your output directory containing all the files included or required on that request. Every time the Front Controller dispatches a request, Zend_Debug_Include checks for new dependencies and adds them to the zf-files.txt file:

/var/www/my-app/public/index.php
/var/www/my-app/application/bootstrap.php
/var/www/my-app/library/Zend/Loader.php
/var/www/my-app/library/Zend/Controller/Front.php
/var/www/my-app/library/Zend/Controller/Action/HelperBroker.php
/var/www/my-app/library/Zend/Controller/Action/HelperBroker/PriorityStack.php
/var/www/my-app/library/Zend/Controller/Exception.php
/var/www/my-app/library/Zend/Exception.php
/var/www/my-app/library/Zend/Controller/Plugin/Broker.php
/var/www/my-app/library/Zend/Controller/Plugin/Abstract.php
/var/www/my-app/library/Zend/Controller/Dispatcher/Standard.php
/var/www/my-app/library/Zend/Controller/Dispatcher/Abstract.php
/var/www/my-app/library/Zend/Controller/Dispatcher/Interface.php
/var/www/my-app/library/Zend/Controller/Request/Abstract.php
/var/www/my-app/library/Zend/Controller/Response/Abstract.php
/var/www/my-app/library/Zend/Controller/Router/Route.php
/var/www/my-app/library/Zend/Controller/Router/Route/Abstract.php
/var/www/my-app/library/Zend/Controller/Router/Route/Interface.php
/var/www/my-app/library/Zend/Config.php
... (63 files in total) ...

To change the name of the file:

$included = new Zend_Debug_Include_Manager();
$included->setOutputDir('/var/www/my-app/dependencies');
$included->setFilename('files.dep');
...

Tracking Package Dependencies

Similar to Zend_Debug_Include_Adapter_File, but groups all the files into packages.

$included = new Zend_Debug_Include_Manager();
$included->setAdapter(new Zend_Debug_Include_Adapter_Package());
$included->setOutputDir('/var/www/my-app/dependencies');

The code above creates a zf-packages.txt file and adds the following data:

Zend/Loader.php
Zend/Controller
Zend/Exception.php
Zend/Config.php
Zend/Debug
Zend/View.php
Zend/View
Zend/Loader
Zend/Uri.php
Zend/Filter
Zend/Filter.php

Now, if you introduce a new dependency, for example, Zend_Mail:

class IndexController extends Zend_Controller_Action
{
    public function indexAction()
    {
        $mail = new Zend_Mail();
        $this->view->message  = 'test';
    }
}

The next time the Front Controller dispatches the request and calls the index action, Zend_Debug_Include will automatically add the Zend_Mail package and all its dependencies to the zf-packages.txt file:

Zend/Loader.php
Zend/Controller
Zend/Exception.php
Zend/Config.php
Zend/Debug
Zend/View.php
Zend/View
Zend/Loader
Zend/Uri.php
Zend/Filter
Zend/Filter.php
Zend/Mail.php
Zend/Mail
Zend/Mime.php
Zend/Mime

You can then use this information to keep track of dependencies and tell your build tool the name of the files and directories you need to copy and package.

External Dependencies

Here is where everything starts to make sense. Zend_Debug_Include allows you to search for external dependencies as well, you just need to tell the Adapter the libraries you are using. For example:

$libraries = array('Zend', 'Solar');
$adapter = new Zend_Debug_Include_Adapter_Package($libraries);

$included = new Zend_Debug_Include_Manager();
$included->setAdapter($adapter);
$included->setOutputDir('/var/www/my-app/dependencies');

The Solar packages will also be added to the zf-packages.txt file:

Zend/Loader.php
Zend/Controller
Zend/Exception.php
Zend/Config.php
Zend/Debug
Zend/View.php
Zend/View
Zend/Loader
Zend/Uri.php
Zend/Filter
Zend/Filter.php
Zend/Mail.php
Zend/Mail
Zend/Mime.php
Zend/Mime
Solar/Base.php
Solar/File.php
Solar/Factory.php
Solar/Sql.php
Solar/Sql
Solar/Cache.php
Solar/Cache

URL Adapter

If you want to create a different file for each request, use the URL adapter instead:

$included = new Zend_Debug_Include_Manager();
$included->setAdapter(new Zend_Debug_Include_Adapter_Url());
$included->setOutputDir('/var/www/my-app/dependencies');

The URL adapter maps the URL path to a filename. So, if you request the following URI:

http://my-app/blog/2009/02/01

It creates the file blog_2009_02_01.txt.

SVN

$ cd libraries/Zend
$ svn co http://svn.fedecarg.com/repo/Zend/Debug

Top Posts 2008

I looked through this blog’s statistics to find out which posts got the most page views in 2008. Here are the top 10 posts:

  1. 30 Useful PHP Classes and Components
  2. Where is the include coming from?
  3. Open-source PHP applications that changed the world
  4. 10 great articles for optimizing MySQL queries
  5. 20 MediaWiki Extensions You Should Be Using
  6. Zend Framework Architecture
  7. Loading models within modules in the Zend Framework
  8. Scalable and Flexible Directory Structure for Web Applications
  9. BBC’s New Infrastructure: Java and PHP
  10. Code Refactoring Guidelines

Each post was written for a reason. “Open-source applications that changed the world” is my favourite one. It not only brings back good memories, but also shows our commitment to the open-source community and our passion for developing and supporting open-source software. Zend Framework was the most popular Web application in 2008. Yes, Dow Jones, HSBC and even the BBC is using it. In case you haven’t notice, I spent the last 12 months promoting the Zend Framework, reason why it gets mentioned in 6 of the posts. I’m also glad that “Code Refactoring Guidelines” made it to the top 10. It shows that the PHP community cares.

Running Quercus in Jetty Web Server

Jetty Web server can be invoked and installed as a stand alone application server. It has a flexible component based architecture that allows it to be easily deployed and integrated in a diverse range of instances. The project is supported by a growing community and a team with a history of being responsive to innovations and changing requirements. More info here.

Installing Jetty

First you need to download Jetty. It’s distributed as a platform independent zip file containing source, javadocs and binaries. The most recent distro can be downloaded from Codehaus:

$ wget http://dist.codehaus.org/jetty/jetty-6.1.14/jetty-6.1.14.zip
$ unzip jetty-6.1.14.zip
$ cp -R jetty-6.1.14 /opt/
$ cd /opt
$ ln -s /opt/jetty-6.1.14 jetty

Problems installing Jetty? More info here.

Running Jetty

Running jetty is as simple as going to your jetty installation directory and typing:

$ cd /opt/jetty
$ java -jar start.jar etc/jetty.xml

This will start jetty and deploy a demo application available at:

http://localhost:8080/test

That’s it. Now stop Jetty with cntrl-c in the same terminal window as you started it.

Installing Quercus

Quercus is a complete implementation of the PHP language and libraries in Java. It gives both Java and PHP developers a fast, safe, and powerful alternative to the standard PHP interpreter. Quercus is available for download as a WAR file which can be easily deployed on Jetty:

$ wget -P ~/quercus http://quercus.caucho.com/download/quercus-3.2.1.war
$ jar xf ~/quercus/quercus-3.2.1.war

Unpack the WAR file and copy all the jars to Jetty’s global library directory:

$ cp ~/quercus/WEB-INF/lib/* /opt/jetty/lib

Configuring Jetty

Edit the web.xml file:

$ vi /opt/jetty/webapps/test/WEB-INF/web.xml

Add the following between the web-app tags:

<servlet>
    <servlet-name>Quercus Servlet</servlet-name>
    <servlet-class>com.caucho.quercus.servlet.QuercusServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>Quercus Servlet</servlet-name>
    <url-pattern>*.php</url-pattern>
</servlet-mapping>
<welcome-file-list>
    <welcome-file>index.html</welcome-file>
    <welcome-file>index.php</welcome-file>
    <welcome-file>index.jsp</welcome-file>
</welcome-file-list>

Create a PHP file inside the test application:

$ cat /opt/jetty/webapps/test/index.php
<?php phpinfo(); ?>

This file will be available at:

http://localhost:8080/index.php

It works! You are now ready to:

Instantiate objects by class name

<?php
$a = new Java("java.util.Date", 123);
print $a->time;

Import classes

<?php
import java.util.Date;

$a = new Date(123);
print $a->time;

Call Java methods

<?php
import java.util.Date;

$a = new Date(123);
print $a->getTime();
print $a->setTime(456);

print $a->time;
$a->time = 456;

And much, much more.

The Multimaster Replication Problem

Replication has its problems, specially if you have a multimaster replication system. To make matters worse, none of the PHP frameworks support multimaster replication systems nor handle master failover. Symfony uses Propel and only supports master-slave replication systems. When the master fails, it’s true that you have the slaves ready to replace it, but the process of detecting the failure and acting upon it requires human intervention. Zend Framework, on the other hand, doesn’t support replication at all.

I strongly believe that a master failover needs to be handled appropriately on the application side. Of course, you can always use an SQL proxy or any other server-side solution, but they are either limited or unreliable.

From Digg’s Blog:

The Digg database access layer is written in PHP and lives at the level of the application server. Basically, when the application decides it needs to do a read query, it passes off the query with a descriptor to a method that grabs a list of servers for the database pool that can satisfy the query, then picks one at random, submits the query, and returns the results to the calling method.

If the server picked won’t respond in a very small amount of time, the code moves on to the next server in the list. So if MySQL is down on a database machine in one of the pools, the end-user of Digg doesn’t notice. This code is extremely robust and well-tested. We worry neither that shutting down MySQL on a read slave in the Digg cluster, nor a failure in alerting on a DB slave that dies will cause site degradation.

Every few months we consider using a SQL proxy to do this database pooling, failover, and balancing, but it’s a tough sell since the code is simple and works extremely well. Furthermore, the load balancing function itself is spread across all our Apaches. Hence there is no “single point of failure” as there would be if we had a single SQL proxy.

If you are building your own solution and need a sandbox to test it, I recommend using MySQL Sandbox. Also, you might find this script useful: MySQL Master-Master Replication Manager

Memcached consistent hashing mechanism

If you are using the Memcache functions through a PECL extension, you can set global runtime configuration options by specifying the values within your php.ini file. One of them is memcache.hash_strategy. This option sets the hashing mechanism used to select and specifies which hash strategy to use: Standard (default) or Consistent.

It’s recommended that you set to Consistent to allow servers to be added or removed from the pool without causing the keys to be remapped to other servers. When set to standard, an older strategy is used that potentially uses different servers for storage.

With PHP, the connections to the memcached instances are kept open as long as the PHP and associated Apache instance remain running. When adding a removing servers from the list in a running instance, the connections will be shared, but the script will only select among the instances explicitly configured within the script.

So, to ensure that changes to the server list within a script do not cause problems, make sure to use the consistent hashing mechanism.