Federico Cargnelutti

Simple is better than complex. Complex is better than complicated. | @fedecarg

Archive for the ‘Databases’ Category

Database Replication Adapter for Zend Framework Applications

with 10 comments

Database replication is an option that allows the content of one database to be replicated to another database or databases, providing a mechanism to scale out the database. Scaling out the database allows more activities to be processed and more users to access the database by running multiple copies of the databases on different machines.

The problem with monolithic database designs is that they don’t establish an infrastructure that allows for rapid changes in business requirements. Here is where database replication comes into play. Replication can be used effectively for many different purposes, such as separating data entry and reporting, distributing load across servers, providing high availability, etc.

In 2008, Paul M. Jones announced the release of an SQL adapter that allows Solar users to connect to master/slave database installations. My first reaction was: Great! This will inspire other FOSS developers to create similar components. And guess what, it did. I wrote my own ReplicationAdapter. It’s not great, but it’s flexible enough to support the most commonly used replication scenarios:

Single-Master Replication

In the simplest replication scenario, the master copy of directory data is held in a single read-write replica on one server called the supplier server. The supplier server also maintains changelog for this replica. On another server, called the consumer server, there can be multiple read-only replicas.

Configuration array:

$db = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'username'       => 'root',
    'password'       => 'root',
    'dbname'         => 'test',
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.slave-1.com'),
        array('host' => 'db.slave-2.com')
    )
);

// or ...

$db = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'dbname'         => 'test',
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com', 'username' => 'user1', 'password'=>'pass1'),
        array('host' => 'db.slave-1.com', 'username' => 'user2', 'password' => 'pass2'),
        array('host' => 'db.slave-2.com', 'username' => 'user3', 'password' => 'pass3')
    )
);

Zend_Registry::set('db_config_array', $db);

In the setup above, all writes will go to the master connection and all reads will be randomly distributed across the available slaves.

Multi-Master Replication

This type of configuration can work with any number of consumer servers. Each consumer server holds a read-only replica. The consumers can receive updates from all the suppliers. The consumers also have referrals defined for all the suppliers to forward any update requests that the consumers receive.

$db = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'username'       => 'root',
    'password'       => 'root',
    'dbname'         => 'test',
    'master_servers' => 2,
    'master_read'    => true,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.master-2.com')
    )
);

Zend_Registry::set('db_config_array', $db);

Using a distributed memory caching system

Database connections are expensive and it’s very inefficient for an application to try to connect to a server that is down or not responding. A distributed memory caching system can help alleviate this problem by keeping a list of all the failed connections in memory, sharing that information across multiple servers and allowing the application to access it before attempting to open a connection.

To enable this option, you have to register an instance of the Memcached adapter class:

Zend_Registry::set('db_config_array', $db);
...

$cache = Zend_Cache::factory('Core', 'Zend_Cache_Backend_Memcached', $f, $b);
Zend_Registry::set('Zend_Cache', $cache);

Zend Framework Example

Here is a short and simple example of how the ReplicationAdapter might be used in a ZF application:

class SingleMasterDb extends Zf_Db_ReplicationAdapter
{
    public function fecthAll()
    {
        $db = $this->getConnection('slave');
        $query = $db->select()->from('test');
        return $db->fetchAll($query);
    }

    public function insert($data)
    {
        $db = $this->getConnection('master');
        $db->insert('test', $data);
        return $db->lastInsertId();
    }

    public function update($id, $data)
    {
        $db = $this->getConnection('master');
        $where = $db->quoteInto('id = ?', $id);
        return $db->update('test', $data, $where);
    }
}

Source Code:
http://fedecarg.com/repositories/show/zfreplicationadapter

Written by Federico

October 2, 2009 at 3:29 pm

Zend Framework DAL: DAOs and DataMappers

with 12 comments

A Data Access Layer (DAL) is the layer of your application that provides simplified access to data stored in persistent storage of some kind. For example, the DAL might return a reference to an object complete with its attributes instead of a row of fields from a database table.

A Data Access Objects (DAO) is used to abstract and encapsulate all access to the data source. The DAO manages the connection with the data source to obtain and store data. Also, it implements the access mechanism required to work with the data source. The data source could be a persistent store like a database, a file or a Web service.

And finally, a DataMapper is used to move data between the object and a database while keeping them independent of each other. The DataMapper main responsibility is to transfer data between the two and also to isolate them from each other.

Zend Framework Example

app/
    controllers/
        UserController.php
    views/
lib/
    Project/
        Dao/
            Db/
                User.php
            Service/
                User.php
        DataMapper/
            User.php
        Entity/
            User.php
        Model/
            User.php
    Zf/
        DataSource/
            Dao/
                Mapper.php
        Domain/
            Entity.php
        Db/
            Adapter.php

Database Table Structure

CREATE TABLE `user` (
 `id` int(11) NOT NULL auto_increment,
 `first_name` varchar(100) NOT NULL,
 `last_name` varchar(100) NOT NULL,
 PRIMARY KEY  (`id`)
)

The User DAO

The DAO pattern provides a simple, consistent API for data access that does not require knowledge of an ORM interface. DAO does not just apply to simple mappings of one object to one relational table, but also allows complex queries to be performed and allows for stored procedures and database views to be mapped into data structures.

A typical DAO design pattern interface is shown below:

class Project_Dao_Db_User extends Zf_Db_Adapter
{
    public function find($id)
    {
        $db = $this->getConnection();
        $query = $db->select()->from('user')->where('id = ?', $id);
        return $db->fetchRow($query);
    }

    public function findAll()
    {
        $db = $this->getConnection();
        $query = $db->select()->from('user');
        return $db->fetchAll($query);
    }

    public function insert($data)
    {
        $db = $this->getConnection();
        $db->insert('user', $data);
        return $db->lastInsertId();
    }

    public function update($id, $data)
    {
        $db = $this->getConnection();
        $where = $db->quoteInto('id = ?', $id);
        return $db->update('user', $data, $where);
    }

    public function delete($id)
    {
        $db = $this->getConnection();
        $where = $db->quoteInto('id = ?', $id);
        return $db->delete('user', $where);
    }
}

Source Code: Zf_Db_Adapter

The User Entity

An Entity is anything that has continuity through a life cycle and distinctions independent of attributes that are important to the application’s user.

class Project_Entity_User extends Zf_Domain_Entity
{
    public $id;
    public $firstName;
    public $lastName;
}

Source Code: Zf_Domain_Entity

The User DataMapper

class Project_DataMapper_User extends Zf_DataSource_Dao_Mapper
{
    protected $_map = array(
        'id'         => 'id',
        'first_name' => 'firstName',
        'last_name'  => 'lastName'
    );

    public function find($id)
    {
        $dao = new Project_Dao_Db_User();
        $row = $dao->find($id);
        if (!$row) {
            return false;
        }
        return $this->map($row);
    }
}

Source Code: Zf_DataSource_Dao_Mapper

The User Model

The following class represents the User Relational Model:

class Project_Model_User
{
    public function getUser($id)
    {
        $mapper = new Project_DataMapper_User();
        $mapper->setEntity(new Project_Entity_User());
        $user = $mapper->find($id);

        return $user;
    }
}

The User Controller

class UserController extends Zend_Controller_Action
{
    public function viewAction()
    {
        $model = new Project_Model_User();
        $user = $model->getUser($this->_getParam('id'));
        if ($user) {
            $userId = $user->getId();
            $userFirstName = $user->getFirstName();
            $userLastName = $user->getLastName();

            // get an array of database fields and values
            $row = $user->getRow();
        }
    }
}

Keep in mind that ORM tools such as phpDataMapper and Doctrine offer an alternative way of modeling data. ORMs and ERMs are popular with Web frameworks, and the combination of an ORM and DDD makes DAOs redundant, however, it has not been proven to be better than a straightforward approach of implementing a collection of domain-specific data access functions.

That’s all, I hope you’ve found this post useful.

Written by Federico

September 19, 2009 at 12:23 pm

Redis: The Lamborghini of Databases

without comments

Antonio Cangiano wrote:

Redis is a key-value database written in C. It can be used like memcached, in front of a traditional database, or on its own thanks to the fact that the in-memory datasets are not volatile but instead persisted on disk. Redis provides you with the ability to define keys that are more than mere strings, as well as being able to handle multiple databases, lists, sets and even basic master-slave replication. It’s amazingly fast. On an entry level Linux box, Redis has been benchmarked performing 110,000 SET operations, and 81,000 GETs, per second.

Despite being a very young project, it already has client libraries for several languages: Python and PHP, Erlang, and Ruby. Salvatore Sanfilippo, the creator of Redis, has implemented a Twitter clone known as Retwis to showcase how you can use Redis and PHP to build applications without the need for a database like MySQL or any SQL query.

Salvatore will be publishing a beginner’s article based on the PHP Twitter clone he wrote, soon.

Full article: Introducing Redis: a fast key-value database

Written by Federico

March 9, 2009 at 7:22 pm

Database Migrations: The Next Killer PHP Application

with 16 comments

This post, written by Adam Wiggins, got me thinking:

Migrations bother me. On one hand, migrations are the best solution we have for the problem of versioning databases. The scope of that problem includes merging schema changes from different developers, applying schema changes to production data, and creating a DRY representation of the schema. But even though migrations is the best solution we have, it still isn’t a very good one.

Schema synchronization is the killer feature missing in phpMyAdmin!

This feature can be found in some commercial applications, like SQLyog Enterprise for example. SQLyog’s Database Synchronization Tool (DST) is a visual comparison/synchronization tool designed for developers who work between different MySQL servers or need to keep databases between two MySQL servers in sync. This means reporting the differences between tables, indexes, columns of two databases, and generating the necessary “Up” and “Down” scripts to bring them in sync.

However, SQLyog DST is not free and only runs on Windows.

So, if you want to build the next killer PHP application, port SQLyog’s database synchronization tool to PHP and open source it. People will love you for it.

Links

Written by Federico

March 3, 2009 at 1:16 pm

Server-side Marker Clustering with PHP and Google Maps

with 2 comments

with_cluster2

As maps get busier, marker clustering is likely to be needed. Marker clustering is a technique by which several points of interest can be represented by a single icon when they’re close to one another.

Mika Tuupola wrote a PHP library that divides the map into a given number of cells and represents all the markers present in the same cell by a single icon. This icon shows the number of markers it symbolizes.

He also wrote an excellent post explaining how marker clustering works.

Related Posts

Written by Federico

February 26, 2009 at 3:32 pm

MySQL Split String Function

with 13 comments

MySQL does not include a function to split a delimited string. However, it’s very easy to create your own function.

Create function syntax

A user-defined function is a way to extend MySQL with a new function that works like a native MySQL function.

CREATE [AGGREGATE] FUNCTION function_name
RETURNS {STRING|INTEGER|REAL|DECIMAL}

To create a function, you must have the INSERT privilege for the <mysql> database.

Split delimited strings

The following example function takes 3 parameters, performs an operation using an SQL function, and returns the result.

Function

CREATE FUNCTION SPLIT_STR(
  x VARCHAR(255),
  delim VARCHAR(12),
  pos INT
)
RETURNS VARCHAR(255)
RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),
       LENGTH(SUBSTRING_INDEX(x, delim, pos -1)) + 1),
       delim, '');

Usage

SELECT SPLIT_STR(string, delimiter, position)

Example

SELECT SPLIT_STR('a|bb|ccc|dd', '|', 3) as third;

+-------+
| third |
+-------+
| ccc   |
+-------+

Written by Federico

February 22, 2009 at 12:11 pm

Posted in Databases

ActiveRecord: JavaScript ORM Library

without comments

Aptana has just released a beta version of its ActiveRecord.js which is an ORM JavaScript library that implements the ActiveRecord pattern. It works with AIR and other environments:

ActiveRecord.js is a single file, MIT licensed, relies on no external JavaScript libraries, supports automatic table creation, data validation, data synchronization, relationships between models, life cycle callbacks and can use an in memory hash table to store objects if no SQL database is available.

Example:

var User = ActiveRecord.define('users',{
    username: '',
    email: ''
});
User.hasMany('articles');

var ryan = User.create({
    username: 'ryan',
    email: 'rjohnson@aptana.com'
});

var Article = ActiveRecord.define('articles',{
    name: '',
    body: '',
    user_id: 0
});
Article.belongsTo('user');

var a = Article.create({
    name: 'Announcing ActiveRecord.js',
    user_id: ryan.id
});
a.set('name','Announcing ActiveRecord.js!!!');
a.save();

a.getUser() == ryan;
ryan.getArticleList()[0] == a;

Links

Written by Federico

February 16, 2009 at 8:55 pm

Geo Proximity Search with PHP, Python and SQL

with 12 comments

I’m working on a project that requires Geo proximity search. Basically, what I’m doing is plotting a radius around a point on a map, which is defined by the distance between two points on the map given their latitudes and longitudes. To achieve this I’m using the Haversine formula (spherical trigonometry). This equation is important in navigation, it gives great-circle distances between two points on a sphere from their longitudes and latitudes. You can see it in action here: Radius From UK Postcode.

This has already been covered in some PHP blogs, however, I found most of the information to be inaccurate and, in some cases, incorrect.

Of course, there are always exceptions. In this post, Kevin uses the Haversine equation to develop a postcode radius facility for his website, like the one on freemaptools.com.

He assumes you have the following values:

$lon = (float) -2.708077;
$lat = (float) 53.754842;
$radius = 20; // in miles

And calculates the latitude (min/max) and longitude (min/max) based on that:

(float)$dpmLAT = 1 / 69.1703234283616;

// Latitude calculation
(float)$usrRLAT = $dpmLAT * $radius;
(float)$latMIN = $lat - $usrRLAT;
(float)$latMAX = $lat + $usrRLAT;

// Longitude calculation
(float)$mpdLON = 69.1703234283616 * cos($userLat * (pi/180));

// degrees per mile longintude
(float)$dpmLON = 1 / $mpdLON;
$usrRLON = $dpmLON * $radius;
$lonMIN = $lon - $usrRLON;
$lonMAX = $lon + $usrRLON;

The variable names are a bit confusing, but you can tell straight away he knows what he’s doing. Unfortunately, the script doesn’t work, $userLat is not defined. If you change $userLat to $lat, it works. Also, you can use the deg2rad() function to convert the $lat number to the radian equivalent.

Here is Kevin’s implementation refactored (Codepad)

// Degrees per mile latitude
$dpm_lat = (float) 1 / 69;

// Latitude calculation
$rlat = (float) $dpm_lat * $radius;
$lat_min = (float) $latitude - $rlat;
$lat_max = (float) $latitude + $rlat;

// Longitude calculation
$mpd_lng = (float) abs(cos(deg2rad($latitude)) * 69);
$dpm_lng = (float) 1 / $mpd_lng;

$rlng = $dpm_lng * $radius;
$lng_min = $longitude - $rlng;
$lng_max = $longitude + $rlng; 

echo 'lng (min/max): ' . $lng_min . '/' . $lng_max;
echo 'lat (min/max): ' . $lat_min . '/' . $lat_max;

Outputs:

lng (min/max): -3.1983251898421/-2.2178288101579
lat (min/max): 53.464986927536/54.044697072464

Cool, it works. Now, for the sake of James Inman, lets refactor the script again and simplify the hole thing. The Haversine equation is very straight forward, so there’s no need to complicate things.

I’ve implemented the solution in PHP, Python and SQL. Use the one that suits you best.

PHP implementation (Codepad)

$longitude = (float) -2.708077;
$latitude = (float) 53.754842;
$radius = 20; // in miles

$lng_min = $longitude - $radius / abs(cos(deg2rad($latitude)) * 69);
$lng_max = $longitude + $radius / abs(cos(deg2rad($latitude)) * 69);
$lat_min = $latitude - ($radius / 69);
$lat_max = $latitude + ($radius / 69);

echo 'lng (min/max): ' . $lng_min . '/' . $lng_max . PHP_EOL;
echo 'lat (min/max): ' . $lat_min . '/' . $lat_max;

It outputs the same result:

lng (min/max): -3.1983251898421/-2.2178288101579
lat (min/max): 53.464986927536/54.044697072464

SQL implementation (Codepad)

set @latitude=53.754842;
set @longitude=-2.708077;
set @radius=20;

set @lng_min = @longitude - @radius/abs(cos(radians(@latitude))*69);
set @lng_max = @longitude + @radius/abs(cos(radians(@latitude))*69);
set @lat_min = @latitude - (@radius/69);
set @lat_max = @latitude + (@radius/69);

SELECT * FROM postcode
WHERE (longitude BETWEEN @lng_min AND @lng_max)
AND (latitude BETWEEN @lat_min and @lat_max);

Python implementation (Codepad)

from __future__ import division
import math

longitude = float(-2.708077)
latitude = float(53.754842)
radius = 20

lng_min = longitude - radius / abs(math.cos(math.radians(latitude)) * 69)
lng_max = longitude + radius / abs(math.cos(math.radians(latitude)) * 69)
lat_min = latitude - (radius / 69)
lat_max = latitude + (radius / 69)

print 'lng (min/max): %f %f' % (lng_min, lng_max)
print 'lat (min/max): %f %f' % (lat_min, lat_max)

That’s it. Happy Geolocating!

Written by Federico

February 8, 2009 at 11:48 am

Four Great InfoQ Presentations

with one comment

Hope you like these recommendations and if you know of any other good tech-related video, then please let me know.

1. Developing Expertise: Herding Racehorses, Racing Sheep

One of my favourites. In this presentation Dave Thomas (The Pragmatic Programmer) talks about expanding people’s expertise in their domains of interest by not treating them uniformly as they had the same amount of knowledge and level of experience.

Developing Expertise

2. Real World Web Services

Another good presentation. In this one Scott Davis provides a pragmatic, down-to-earth introduction to Web services as used in the real world by public sites, including SOAP-based and REST examples.

Real World Web Services

3. CouchDB and me

This presentation is different, and that’s why I like it so much. Damien Katz shares his experiences and reminds people how difficult but at the same time gratifying is to be an open source developer. He talks about the history of CouchDB development from a very personal point of view. His inspirations for CouchDB and why he decided to move my wife and kids to a cheaper place and live off savings to build this thing.

CouchDB and me

4. Yahoo! Communities Architectures

In this presentation, Ian Flint tries to explain the infrastructure and architecture employed by Yahoo! to keep going a multitude of servers running of different platforms and offering different services. Very interesting!

Yahoo! Communities Architectures

Written by Federico

February 6, 2009 at 12:02 am

Planet Ubuntu: Mastering phpMyAdmin 2.11

without comments

Matthew Helmke wrote:

I am the sole person responsible for seven websites that use MySQL databases, and have worked on the back end of several others. Over the years I have worked on these databases in many different ways. I’ve done it from the command line, via ssh. I’ve done it using PHP scripts within the software being run on a site. I have also used phpMyAdmin. Each method has applications and moments where I would prefer using it to the others at my disposal.

I have generally used phpMyAdmin only in those moments when I had something quick and simple to do, or when a hosting company did not provide ssh or command line access. This was mainly because I did not realize how powerful and flexible the software can be.

Mastering phpMyAdmin 2.11

Written by Federico

January 19, 2009 at 12:09 am