Federico Cargnelutti

Simple is better than complex. Complex is better than complicated. | @fedecarg

Archive for the ‘Programming’ Category

Most Visited Posts of 2009

leave a comment »

Written by Federico

April 2, 2010 at 8:50 pm

Posted in Programming

Implementing Dynamic Finders and Parsing Method Expressions

with 3 comments

Most ORMs support the concept of dynamic finders. A dynamic finder looks like a normal method invocation, but the method itself doesn’t exist, instead, it’s generated dynamically and processed via another method at runtime.

A good example of this is Ruby. When you invoke a method that doesn’t exist, it raises a NoMethodError exception, unless you define “method_missing”. Rails ActiveRecord::Base class implements some of its magic thanks to this method. For example, find_by_title(title) and find_by_title_and_date(title, date) are turned into:

find(:first, :conditions => ["title = ?", title])
find(:first, :conditions => ["title = ? AND date = ?", title, date])

What’s nice about Ruby is that the language allows you to define methods dynamically using the “define_method” method. That’s how Rails defines each dynamic finder in the class after it is first invoked, so that future attempts to use it do not run through the “method_missing” method.

Method Expressions

GORM, Grails ORM library, introduces the concept of dynamic method expressions. A method expression is made up of the prefix such as “findBy” followed by an expression that combines one or more properties. Grails takes advantage of Groovy features to provide dynamic methods:

findByTitle("Example")
findByTitleLike("Exa%")

Method expressions can also use a boolean operator to combine two criteria:

findAllByTitleLikeAndDateGreaterThan("Exampl%", '2010-03-23')

In this case we are using AND in the middle of the query to make sure both conditions are satisfied, but you could equally use OR:

findAllByTitleLikeOrDateGreaterThan("Exampl%", '2010-03-23')

Parsing Method Expressions

MethodExpressionParser is a PHP library for parsing method expressions. It’s designed to quickly and easily parse method expressions and construct conditions based on attribute names and arguments.

Description

[finderMethod]([attribute][expression][logicalOperator])?[attribute][expression]

Expressions

  • LessThan: Less than the given value
  • LessThanEquals: Less than or equal a give value
  • GreaterThan: Greater than a given value
  • GreaterThanEquals: Greater than or equal a given value
  • Like: Equivalent to a SQL like expression
  • NotEqual: Negates equality
  • IsNotNull: Not a null value (doesn’t require an argument)
  • IsNull: Is a null value (doesn’t require an argument)

Examples

findByTitleAndDate('Example', date('Y-m-d'));
SELECT * FROM book WHERE title = ? AND date = ?

findByTitleOrDate('Example', date('Y-m-d'))
SELECT * FROM book WHERE title = ? OR date = ?

findByPublisherOrTitleAndDate('Name', 'Example', date('Y-m-d'))
SELECT * FROM book WHERE publisher = ? OR (title = ? AND date = ?)

findByPublisherInAndTitle(array('Name1', 'Name2'), 'Example')
SELECT * FROM book WHERE publisher IN (?, ?) AND date = ?

findByTitleLikeAndDateNotNull('Examp%')
SELECT * FROM book WHERE title LIKE ? AND date NOT NULL

findByIdOrTitleAndDateNotNull(1, 'Example')
SELECT * FROM book WHERE (id = ?) OR (title = ? AND date NOT NULL)

Example 1:

findByTitleLikeAndDateNotNull('Examp%');

Outputs:

array
  0 =>
    array
      0 =>
        array
          'attribute' => string 'title'
          'expression' => string 'Like'
          'format' => string '%s LIKE ?'
          'placeholders' => int 1
          'argument' => string 'Examp%'
      1 =>
        array
          'attribute' => string 'date'
          'expression' => string 'NotNull'
          'format' => string '%s IS NOT NULL'
          'placeholders' => int 0
          'argument' => null

Example 2:

findByTitleAndPublisherNameOrTitleAndPublisherName('Title', 'a', 'Title', 'b');

Outputs:

array
  0 =>
    array
      0 =>
        array
          'attribute' => string 'title'
          'expression' => string 'Equals'
          'format' => string '%s = ?'
          'placeholders' => int 1
          'argument' => string 'Title'
      1 =>
        array
          'attribute' => string 'publisher_name'
          'expression' => string 'Equals'
          'format' => string '%s = ?'
          'placeholders' => int 1
          'argument' => string 'a'
  1 =>
    array
      0 =>
        array
          'attribute' => string 'title'
          'expression' => string 'Equals'
          'format' => string '%s = ?'
          'placeholders' => int 1
          'argument' => string 'Title'
      1 =>
        array
          'attribute' => string 'publisher_name'
          'expression' => string 'Equals'
          'format' => string '%s = ?'
          'placeholders' => int 1
          'argument' => string 'b'

See more examples: Project Wiki

Usage

class EntityRepository
{
    private $methodExpressionParser;

    // Return a single instance of MethodExpressionParser
    public function getMethodExpressionParser() {
    }

    // Finder methods
    public function findBy($conditions) {
        var_dump($conditions);
    }
    public function findAllBy($conditions) {
        var_dump($conditions);
    }

    // Invoke finder methods
    public function __call($method, $args) {
        if ('f' === $method{0}) {
            try {
                $result = $this->getMethodExpressionParser()->parse($method, $args);
                $finderMethod = key($result);
                $conditions = $result[$finderMethod];
            } catch (MethodExpressionParserException $e) {
                $message = sprintf('%s: %s()', $e->getMessage(), $method);
                throw new EntityRepositoryException($message);
            }
            return $this->$finderMethod($conditions);
        }

        $message = 'Invalid method call: ' . __METHOD__;
        throw new BadMethodCallException($message);
    }
}

Performance

PHP doesn’t allow you to define methods dynamically, this means that every time you invoke a finder method the parser has to search, extract and map all the attribute names and expressions. To avoid introducing this performance overhead you can cache the attribute names. For example:

class EntityRepository
{
    private $methodExpressionParser;
    private $classMetadata;

    // Return a single instance of MethodExpressionParser
    public function getMethodExpressionParser() {
    }

    // Return a single instance of ClassMetadata
    public function getClassMetadata() {
    }

    // Invoke finder methods
    public function __call($method, $args) {
        if ('f' === $method{0}) {
            $parser = $this->getMethodExpressionParser();
            $classMetadata = $this->getClassMetadata();
            try {
                $finderMethod = $parser->determineFinderMethod($method);
                if ($classMetadata->hasMissingMethod($method)) {
                    $attributes = $classMetadata->getMethodAttributes($method);
                    $conditions = $parser->map($args, $attributes);
                } else {
                    $expressions = substr($method, strlen($finderMethod));
                    $attributes = $this->extractAttributeNames($expressions);
                    $conditions = $parser->map($args, $attributes);
                    $classMetadata->setMethodAttributes($method, $attributes);
                }
            } catch (MethodExpressionParserException $e) {
                $message = sprintf('%s: %s()', $e->getMessage(), $method);
                throw new EntityRepositoryException($message);
            }
            return $this->$finderMethod($conditions);
        }

        $message = 'Invalid method call: ' . __METHOD__;
        throw new BadMethodCallException($message);
    }
}

The Expression objects are lazy-loaded, depending on the expressions found in the method name.

Extensibility

The MethodExpressionParser class was designed with extensibility in mind, allowing you to add new Expressions to the library.

abstract class Expression {
}
class EqualsExpression extends Expression {
}

Source Code

Browse source code:

http://fedecarg.com/repositories/show/expressionparser

Check out the current development trunk with:

$ svn checkout http://svn.fedecarg.com/repo/Zf/Orm

Written by Federico

March 22, 2010 at 11:51 pm

Database Replication Adapter for Zend Framework Applications

with 10 comments

Last updated: 21 Feb, 2010

Database replication is an option that allows the content of one database to be replicated to another database or databases, providing a mechanism to scale out the database. Scaling out the database allows more activities to be processed and more users to access the database by running multiple copies of the databases on different machines.

The problem with monolithic database designs is that they don’t establish an infrastructure that allows for rapid changes in business requirements. Here is where database replication comes into play. Replication can be used effectively for many different purposes, such as separating data entry and reporting, distributing load across servers, providing high availability, etc.

Zf_Orm_DataSource is a Zend Framework Replication Adapter class flexible enough to support the most commonly used replication scenarios:

Single-Master Replication

In the simplest replication scenario, the master copy of directory data is held in a single read-write replica on one server called the supplier server. The supplier server also maintains changelog for this replica. On another server, called the consumer server, there can be multiple read-only replicas.

Configuration array:

$config = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'username'       => 'root',
    'password'       => 'root',
    'dbname'         => 'test',
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.slave-1.com'),
        array('host' => 'db.slave-2.com')
    )
);

// or ...

$config = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'dbname'         => 'test',
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com', 'username' => 'user1', 'password'=>'pass1'),
        array('host' => 'db.slave-1.com', 'username' => 'user2', 'password' => 'pass2'),
        array('host' => 'db.slave-2.com', 'username' => 'user3', 'password' => 'pass3')
    )
);

In the setup above, all writes will go to the master connection and all reads will be randomly distributed across the available slaves.

Multi-Master Replication

This type of configuration can work with any number of consumer servers. Each consumer server holds a read-only replica. The consumers can receive updates from all the suppliers. The consumers also have referrals defined for all the suppliers to forward any update requests that the consumers receive.

$config = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'username'       => 'root',
    'password'       => 'root',
    'dbname'         => 'test',
    'master_servers' => 2,
    'master_read'    => true,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.master-2.com')
    )
);

Using a distributed memory caching system

Database connections are expensive and it’s very inefficient for an application to try to connect to a server that is down or not responding. A distributed memory caching system can help alleviate this problem by keeping a list of all the failed connections in memory, sharing that information across multiple servers and allowing the application to access it before attempting to open a connection.

To enable this option, you have to pass an instance of the Memcached adapter class:

class Bootstrap extends Zend_Application_Bootstrap_Base
{
    protected function _initCache()
    {
        ...
    }

    protected function _initDatabase()
    {
        $config = include APPLICATION_PATH . '/config/database.php';
        $cache = $this->getResource('cache');
        $dataSource = new Zf_Orm_DataSource($config, $cache, 'cache_tag');
        Zend_Registry::set('dataSource', $dataSource);
    }
}

And here is a short example of how the Replication Adapter might be used in a ZF application:

class TestDao
{
    public function fetchAll()
    {
        $db = Zend_Registry::get('dataSource')->getConnection('slave');
        $query = $db->select()->from('test');
        return $db->fetchAll($query);
    }

    public function insert($data)
    {
        $db = Zend_Registry::get('dataSource')->getConnection('master');
        $db->insert('test', $data);
        return $db->lastInsertId();
    }
}

Source Code:
http://fedecarg.com/repositories/show/replicationadapter

Written by Federico

October 2, 2009 at 3:29 pm

Zend Framework DAL: DAOs and DataMappers

with 15 comments

A Data Access Layer (DAL) is the layer of your application that provides simplified access to data stored in persistent storage of some kind. For example, the DAL might return a reference to an object complete with its attributes instead of a row of fields from a database table.

A Data Access Objects (DAO) is used to abstract and encapsulate all access to the data source. The DAO manages the connection with the data source to obtain and store data. Also, it implements the access mechanism required to work with the data source. The data source could be a persistent store like a database, a file or a Web service.

And finally, the DataMapper pattern is used to move data between the object and a database while keeping them independent of each other. The DataMapper main responsibility is to transfer data between the two and also to isolate them from each other.

Here’s an example of the DataMapper pattern:

Database Table Structure

CREATE TABLE `user` (
 `id` int(11) NOT NULL auto_increment,
 `first_name` varchar(100) NOT NULL,
 `last_name` varchar(100) NOT NULL,
 PRIMARY KEY  (`id`)
)

The User DAO

The DAO pattern provides a simple, consistent API for data access that does not require knowledge of an ORM interface. DAO does not just apply to simple mappings of one object to one relational table, but also allows complex queries to be performed and allows for stored procedures and database views to be mapped into data structures.

A typical DAO design pattern interface is shown below:

interface UserDao
{
    public function fetchRow($id);
    public function fetchAll();
    public function insert($data);
    public function update($id, $data);
    public function delete($id);
}

class UserDatabaseDao implements UserDao
{
    public function fetchRow($id)
    {
        $dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
        $db = $dataSource->getConnection('slave');
        $query = $db->select()->from('user')->where('id = ?', $id);
        return $db->fetchRow($query);
    }

    public function fetchAll()
    {
        $dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
        $db = $dataSource->getConnection('slave');
        $query = $db->select()->from('user');
        return $db->fetchAll($query);
    }

    public function insert($data)
    {
        $dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
        $db = $dataSource->getConnection('master');
        $db->insert('user', $data);
        return $db->lastInsertId();
    }

    public function update($id, $data)
    {
        $dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
        $db = $dataSource->getConnection('master');
        $condition = $db->quoteInto('id = ?', $id);
        return $db->update('user', $data, $condition);
    }

    public function delete($id)
    {
        $dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
        $db = $dataSource->getConnection('master');
        $condition = $db->quoteInto('id = ?', $id);
        return $db->delete('user', $condition);
    }
}

The User Entity

An Entity is anything that has continuity through a life cycle and distinctions independent of attributes that are important to the application’s user.

class User
{
    protected $id;
    protected $firstName;
    protected $lastName;

    public function setId() {
    }
    public function getId() {
    }
    public function setFirstName() {
    }
    public function getFirstName() {
    }
    public function setLastName() {
    }
    public function getLastName() {
    }
    public function toArray() {
    }
}

The User DataMapper

class UserDataMapper extends Zf_Orm_DataMapper
{
    public function __construct()
    {
        $this->setMap(
            array(
                'id'         => 'id',
                'first_name' => 'firstName',
                'last_name'  => 'lastName'
            )
        );
    }
}

Source Code: http://fedecarg.com/repositories/show/datamapper

The User Repository

Repositories play an important part in DDD, they speak the language of the domain and act as mediators between the domain and data mapping layers. They provide a common language to all team members by translating technical terminology into business terminology.

Lets create a UserRepository class to isolate the domain object from details of the DAO:

class UserRepository
{
    private $databaseDao;

    public funciton setDatabaseDao(UserDao $dao)
    {
        $this->databaseDao = $dao;
    }

    public function getDatabaseDao()
    {
        if (null === $this->databaseDao) {
            $this->setDatabaseDao(new UserDatabaseDao());
        }
        return $this->databaseDao;
    }

    public function find($id)
    {
        $row = $this->getDatabaseDao()->fetchRow($id);
        $mapper = new UserDataMapper();
        $user = $mapper->assign(new User(), $row);

        return $user;
    }
}

The User Controller

class UserController extends Zend_Controller_Action
{
    public function viewAction()
    {
        $userRepository = new UserRepository();
        $user = $userRepository->find($this->_getParam('id'));
        if ($user instanceof User) {
            $id = $user->getId();
            $firstName = $user->getFirstName();
            $lastName = $user->getLastName();
            // get an array of key value pairs
            $row = $user->toArray();
        }
    }
}

That’s all, I hope you’ve found this post useful.

Written by Federico

September 19, 2009 at 12:23 pm

Increase speed and reduce bandwidth usage

with 18 comments

Apache’s mod_deflate module provides the DEFLATE output filter that allows output from your server to be compressed before being sent to the client over the network.

There are two ways of enabling gzip compression:

  1. Using Apache’s mod_deflate
  2. Using output buffering

Encoding the output and setting the appropriate headers manually makes the code more portable. Keep in mind that there are hundreds of Linux distributions, each slightly different to significantly different. To allow portability the application should not make assumptions about the OS or config involved.

Using Apache

1. Enable mod_deflate

Debian/Ubuntu:

$ a2enmod deflate
$ /etc/init.d/apache2 force-reload

2. Configure mode_deflate

$ nano /etc/apache2/mods-enabled/deflate.conf

#
# mod_deflate configuration
#
<IfModule mod_deflate.c>
 AddOutputFilterByType DEFLATE text/plain
 AddOutputFilterByType DEFLATE text/html
 AddOutputFilterByType DEFLATE text/xml
 AddOutputFilterByType DEFLATE text/css
 AddOutputFilterByType DEFLATE application/xml
 AddOutputFilterByType DEFLATE application/xhtml+xml
 AddOutputFilterByType DEFLATE application/rss+xml
 AddOutputFilterByType DEFLATE application/javascript
 AddOutputFilterByType DEFLATE application/x-javascript

 DeflateCompressionLevel 9

 BrowserMatch ^Mozilla/4 gzip-only-text/html
 BrowserMatch ^Mozilla/4\.0[678] no-gzip
 BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

 DeflateFilterNote Input instream
 DeflateFilterNote Output outstream
 DeflateFilterNote Ratio ratio
</IfModule>

Using output buffering

Create a gzip compressed string in your bootstrap file:

try {
    $frontController = Zend_Controller_Front::getInstance();
    if (@strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip') !== false) {
        ob_start();
        $frontController->dispatch();
        $output = gzencode(ob_get_contents(), 9);
        ob_end_clean();
        header('Content-Encoding: gzip');
        echo $output;
    } else {
        $frontController->dispatch();
    }
} catch (Exeption $e) {
    if (Zend_Registry::isRegistered('Zend_Log')) {
        Zend_Registry::get('Zend_Log')->err($e->getMessage());
    }
    $message = $e->getMessage() . "\n\n" . $e->getTraceAsString();
    /* trigger event */
}

Reference

Use mod_deflate to compress Web content delivered by Apache

Written by Federico

July 6, 2009 at 11:09 am

Format a time interval with the requested granularity

with 8 comments

This class, a refactored version of Drupal’s format_interval function, makes it relatively easy to format an interval value. The format will automatically format as compactly as possible. For example: if the difference between the two dates is only a few hours and both dates occur on the same day, the year, month, and day parts of the date will be omitted.

class DateIntervalFormat
{
    /**
     * Format an interval value with the requested granularity.
     *
     * @param integer $timestamp The length of the interval in seconds.
     * @param integer $granularity How many different units to display in the string.
     * @return string A string representation of the interval.
     */
    public function getInterval($timestamp, $granularity = 2)
    {
        $seconds = time() - $timestamp;
        $units = array(
            '1 year|:count years' => 31536000,
            '1 week|:count weeks' => 604800,
            '1 day|:count days' => 86400,
            '1 hour|:count hours' => 3600,
            '1 min|:count min' => 60,
            '1 sec|:count sec' => 1);
        $output = '';
        foreach ($units as $key => $value) {
            $key = explode('|', $key);
            if ($seconds >= $value) {
                $count = floor($seconds / $value);
                $output .= ($output ? ' ' : '');
                if ($count == 1) {
                    $output .= $key[0];
                } else {
                    $output .= str_replace(':count', $count, $key[1]);
                }
                $seconds %= $value;
                $granularity--;
            }
            if ($granularity == 0) {
                break;
            }
        }

        return $output ? $output : '0 sec';
    }
}

Usage:

$dateFormat = new DateIntervalFormat();
$timestamp = strtotime('2009-06-21 20:46:11');
print sprintf('Submitted %s ago',  $dateFormat->getInterval($timestamp));

Outputs:

Submitted 3 days 4 hours ago

Written by Federico

June 25, 2009 at 12:49 am

Java, C, Python and nested loops

with 4 comments

Java has no goto statement, to break or continue multiple-nested loop or switch constructs, Java programmers place labels on loop and switch constructs, and then break out of or continue to the block named by the label. The following example shows how to use java break statement to terminate the labeled loop:

public class BreakLabel
{
    public static void main(String[] args)
    {
        int[][] array = new int[][]{{1,2,3,4},{10,20,30,40}};
        boolean found = false;
        System.out.println("Searching 30 in two dimensional int array");

        Outer:
        for (int intOuter = 0; intOuter < array.length ; intOuter++) {
            Inner:
            for (int intInner = 0; intInner < array[intOuter].length; intInner++) {
                if (array[intOuter][intInner] == 30) {
                    found = true;
                    break Outer;
                }
            }
        }

        if (found == true) {
            System.out.println("30 found in the array");
        } else {
            System.out.println("30 not found in the array");
        }
    }
}

Use of labeled blocks in Java leads to considerable simplification in programming effort and a major reduction in maintenance.

On the other hand, the C continue statement can only continue the immediately enclosing block; to continue or exit outer blocks, programmers have traditionally either used auxiliary Boolean variables whose only purpose is to determine if the outer block is to be continued or exited; alternatively, programmers have misused the goto statement to exit out of nested blocks.

What’s interesting is that Python rejected the labeled break and continue proposal a while ago. And here’s why:

Guido van Rossum wrote:

I’m rejecting it on the basis that code so complicated to require this feature is very rare. While I’m sure there are some (rare) real cases where clarity of the code would suffer from a refactoring that makes it possible to use return, this is offset by two issues:

1. The complexity added to the language, permanently.

2. My expectation that the feature will be abused more than it will be used right, leading to a net decrease in code clarity (measured across all Python code written henceforth). Lazy programmers are everywhere, and before you know it you have an incredible mess on your hands of unintelligible code.

But what’s more interesting is that the idea of adding a goto statement was never mentioned. Common sense perhaps?

Written by Federico

June 16, 2009 at 9:19 pm

Google Page Speed: Web Performance Best Practices

with one comment

When you profile a web page with Page Speed, it evaluates the page’s conformance to a number of different rules. These rules are general front-end best practices you can apply at any stage of web development. Google provides documentation of each of the rules, so whether or not you run the Page Speed tool, you can refer to these pages at any time.

The best practices are grouped into five categories that cover different aspects of page load optimization:

  • Optimizing caching: Keeping your application’s data and logic off the network altogether
  • Minimizing round-trip times: Reducing the number of serial request-response cycles
  • Minimizing request size: Reducing upload size
  • Minimizing payload size: Reducing the size of responses, downloads, and cached pages
  • Optimizing browser rendering: Improving the browser’s layout of a page

Web Performance Best Practices

Written by Federico

June 8, 2009 at 9:40 pm

Posted in Programming, Tools

The Little Manual of API Design

leave a comment »

This manual gathers together the key insights into API design that were discovered through many years of software development on the Qt application development framework at Trolltech (now part of Nokia). When designing and implementing a library, you should also keep other factors in mind, such as efficiency and ease of implementation, in addition to pure API considerations. And although the focus is on public APIs, there is no harm in applying the principles described here when writing application code or internal library code.

The Little Manual of API Design (PDF)

Written by Federico

May 13, 2009 at 8:21 pm

E-Books Directory: More than 300 free programming books

with 2 comments

Here is a categorized list of online programming books available for free download. The books cover all major programming languages: Ada, Assembly, Basic, C, C#, C++, CGI, JavaScript, Perl, Delphi, Pascal, Haskell, Java, Lisp, PHP, Prolog, Python, Ruby, as well as some other languages, game programming, and software engineering. The books are in various formats for online reading or downloading.

Free Programming Books

Written by Federico

May 12, 2009 at 7:47 pm

Posted in Programming