Archive for the ‘Databases’ Category
Implementing Dynamic Finders and Parsing Method Expressions
Most ORMs support the concept of dynamic finders. A dynamic finder looks like a normal method invocation, but the method itself doesn’t exist, instead, it’s generated dynamically and processed via another method at runtime.
A good example of this is Ruby. When you invoke a method that doesn’t exist, it raises a NoMethodError exception, unless you define “method_missing”. Rails ActiveRecord::Base class implements some of its magic thanks to this method. For example, find_by_title(title) and find_by_title_and_date(title, date) are turned into:
find(:first, :conditions => ["title = ?", title]) find(:first, :conditions => ["title = ? AND date = ?", title, date])
What’s nice about Ruby is that the language allows you to define methods dynamically using the “define_method” method. That’s how Rails defines each dynamic finder in the class after it is first invoked, so that future attempts to use it do not run through the “method_missing” method.
Method Expressions
GORM, Grails ORM library, introduces the concept of dynamic method expressions. A method expression is made up of the prefix such as “findBy” followed by an expression that combines one or more properties. Grails takes advantage of Groovy features to provide dynamic methods:
findByTitle("Example")
findByTitleLike("Exa%")
Method expressions can also use a boolean operator to combine two criteria:
findAllByTitleLikeAndDateGreaterThan("Exampl%", '2010-03-23')
In this case we are using AND in the middle of the query to make sure both conditions are satisfied, but you could equally use OR:
findAllByTitleLikeOrDateGreaterThan("Exampl%", '2010-03-23')
Parsing Method Expressions
MethodExpressionParser is a PHP library for parsing method expressions. It’s designed to quickly and easily parse method expressions and construct conditions based on attribute names and arguments.
Description
[finderMethod]([attribute][expression][logicalOperator])?[attribute][expression]
Expressions
- LessThan: Less than the given value
- LessThanEquals: Less than or equal a give value
- GreaterThan: Greater than a given value
- GreaterThanEquals: Greater than or equal a given value
- Like: Equivalent to a SQL like expression
- NotEqual: Negates equality
- IsNotNull: Not a null value (doesn’t require an argument)
- IsNull: Is a null value (doesn’t require an argument)
Examples
findByTitleAndDate('Example', date('Y-m-d'));
SELECT * FROM book WHERE title = ? AND date = ?
findByTitleOrDate('Example', date('Y-m-d'))
SELECT * FROM book WHERE title = ? OR date = ?
findByPublisherOrTitleAndDate('Name', 'Example', date('Y-m-d'))
SELECT * FROM book WHERE publisher = ? OR (title = ? AND date = ?)
findByPublisherInAndTitle(array('Name1', 'Name2'), 'Example')
SELECT * FROM book WHERE publisher IN (?, ?) AND date = ?
findByTitleLikeAndDateNotNull('Examp%')
SELECT * FROM book WHERE title LIKE ? AND date NOT NULL
findByIdOrTitleAndDateNotNull(1, 'Example')
SELECT * FROM book WHERE (id = ?) OR (title = ? AND date NOT NULL)
Example 1:
findByTitleLikeAndDateNotNull('Examp%');
Outputs:
array
0 =>
array
0 =>
array
'attribute' => string 'title'
'expression' => string 'Like'
'format' => string '%s LIKE ?'
'placeholders' => int 1
'argument' => string 'Examp%'
1 =>
array
'attribute' => string 'date'
'expression' => string 'NotNull'
'format' => string '%s IS NOT NULL'
'placeholders' => int 0
'argument' => null
Example 2:
findByTitleAndPublisherNameOrTitleAndPublisherName('Title', 'a', 'Title', 'b');
Outputs:
array
0 =>
array
0 =>
array
'attribute' => string 'title'
'expression' => string 'Equals'
'format' => string '%s = ?'
'placeholders' => int 1
'argument' => string 'Title'
1 =>
array
'attribute' => string 'publisher_name'
'expression' => string 'Equals'
'format' => string '%s = ?'
'placeholders' => int 1
'argument' => string 'a'
1 =>
array
0 =>
array
'attribute' => string 'title'
'expression' => string 'Equals'
'format' => string '%s = ?'
'placeholders' => int 1
'argument' => string 'Title'
1 =>
array
'attribute' => string 'publisher_name'
'expression' => string 'Equals'
'format' => string '%s = ?'
'placeholders' => int 1
'argument' => string 'b'
See more examples: Project Wiki
Usage
class EntityRepository
{
private $methodExpressionParser;
// Return a single instance of MethodExpressionParser
public function getMethodExpressionParser() {
}
// Finder methods
public function findBy($conditions) {
var_dump($conditions);
}
public function findAllBy($conditions) {
var_dump($conditions);
}
// Invoke finder methods
public function __call($method, $args) {
if ('f' === $method{0}) {
try {
$result = $this->getMethodExpressionParser()->parse($method, $args);
$finderMethod = key($result);
$conditions = $result[$finderMethod];
} catch (MethodExpressionParserException $e) {
$message = sprintf('%s: %s()', $e->getMessage(), $method);
throw new EntityRepositoryException($message);
}
return $this->$finderMethod($conditions);
}
$message = 'Invalid method call: ' . __METHOD__;
throw new BadMethodCallException($message);
}
}
Performance
PHP doesn’t allow you to define methods dynamically, this means that every time you invoke a finder method the parser has to search, extract and map all the attribute names and expressions. To avoid introducing this performance overhead you can cache the attribute names. For example:
class EntityRepository
{
private $methodExpressionParser;
private $classMetadata;
// Return a single instance of MethodExpressionParser
public function getMethodExpressionParser() {
}
// Return a single instance of ClassMetadata
public function getClassMetadata() {
}
// Invoke finder methods
public function __call($method, $args) {
if ('f' === $method{0}) {
$parser = $this->getMethodExpressionParser();
$classMetadata = $this->getClassMetadata();
try {
$finderMethod = $parser->determineFinderMethod($method);
if ($classMetadata->hasMissingMethod($method)) {
$attributes = $classMetadata->getMethodAttributes($method);
$conditions = $parser->map($args, $attributes);
} else {
$expressions = substr($method, strlen($finderMethod));
$attributes = $this->extractAttributeNames($expressions);
$conditions = $parser->map($args, $attributes);
$classMetadata->setMethodAttributes($method, $attributes);
}
} catch (MethodExpressionParserException $e) {
$message = sprintf('%s: %s()', $e->getMessage(), $method);
throw new EntityRepositoryException($message);
}
return $this->$finderMethod($conditions);
}
$message = 'Invalid method call: ' . __METHOD__;
throw new BadMethodCallException($message);
}
}
The Expression objects are lazy-loaded, depending on the expressions found in the method name.
Extensibility
The MethodExpressionParser class was designed with extensibility in mind, allowing you to add new Expressions to the library.
abstract class Expression {
}
class EqualsExpression extends Expression {
}
Source Code
http://fedecarg.com/repositories/show/expressionparser
Check out the current development trunk with:
$ svn checkout http://svn.fedecarg.com/repo/Zf/Orm
Database Replication Adapter for Zend Framework Applications
Last updated: 21 Feb, 2010
Database replication is an option that allows the content of one database to be replicated to another database or databases, providing a mechanism to scale out the database. Scaling out the database allows more activities to be processed and more users to access the database by running multiple copies of the databases on different machines.
The problem with monolithic database designs is that they don’t establish an infrastructure that allows for rapid changes in business requirements. Here is where database replication comes into play. Replication can be used effectively for many different purposes, such as separating data entry and reporting, distributing load across servers, providing high availability, etc.
Zf_Orm_DataSource is a Zend Framework Replication Adapter class flexible enough to support the most commonly used replication scenarios:
Single-Master Replication
In the simplest replication scenario, the master copy of directory data is held in a single read-write replica on one server called the supplier server. The supplier server also maintains changelog for this replica. On another server, called the consumer server, there can be multiple read-only replicas.
Configuration array:
$config = array(
'adapter' => 'Pdo_Mysql',
'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
'username' => 'root',
'password' => 'root',
'dbname' => 'test',
'master_servers' => 1,
'servers' => array(
array('host' => 'db.master-1.com'),
array('host' => 'db.slave-1.com'),
array('host' => 'db.slave-2.com')
)
);
// or ...
$config = array(
'adapter' => 'Pdo_Mysql',
'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
'dbname' => 'test',
'master_servers' => 1,
'servers' => array(
array('host' => 'db.master-1.com', 'username' => 'user1', 'password'=>'pass1'),
array('host' => 'db.slave-1.com', 'username' => 'user2', 'password' => 'pass2'),
array('host' => 'db.slave-2.com', 'username' => 'user3', 'password' => 'pass3')
)
);
In the setup above, all writes will go to the master connection and all reads will be randomly distributed across the available slaves.
Multi-Master Replication
This type of configuration can work with any number of consumer servers. Each consumer server holds a read-only replica. The consumers can receive updates from all the suppliers. The consumers also have referrals defined for all the suppliers to forward any update requests that the consumers receive.
$config = array(
'adapter' => 'Pdo_Mysql',
'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
'username' => 'root',
'password' => 'root',
'dbname' => 'test',
'master_servers' => 2,
'master_read' => true,
'servers' => array(
array('host' => 'db.master-1.com'),
array('host' => 'db.master-2.com')
)
);
Using a distributed memory caching system
Database connections are expensive and it’s very inefficient for an application to try to connect to a server that is down or not responding. A distributed memory caching system can help alleviate this problem by keeping a list of all the failed connections in memory, sharing that information across multiple servers and allowing the application to access it before attempting to open a connection.
To enable this option, you have to pass an instance of the Memcached adapter class:
class Bootstrap extends Zend_Application_Bootstrap_Base
{
protected function _initCache()
{
...
}
protected function _initDatabase()
{
$config = include APPLICATION_PATH . '/config/database.php';
$cache = $this->getResource('cache');
$dataSource = new Zf_Orm_DataSource($config, $cache, 'cache_tag');
Zend_Registry::set('dataSource', $dataSource);
}
}
And here is a short example of how the Replication Adapter might be used in a ZF application:
class TestDao
{
public function fetchAll()
{
$db = Zend_Registry::get('dataSource')->getConnection('slave');
$query = $db->select()->from('test');
return $db->fetchAll($query);
}
public function insert($data)
{
$db = Zend_Registry::get('dataSource')->getConnection('master');
$db->insert('test', $data);
return $db->lastInsertId();
}
}
Source Code:
http://fedecarg.com/repositories/show/replicationadapter
Zend Framework DAL: DAOs and DataMappers
A Data Access Layer (DAL) is the layer of your application that provides simplified access to data stored in persistent storage of some kind. For example, the DAL might return a reference to an object complete with its attributes instead of a row of fields from a database table.
A Data Access Objects (DAO) is used to abstract and encapsulate all access to the data source. The DAO manages the connection with the data source to obtain and store data. Also, it implements the access mechanism required to work with the data source. The data source could be a persistent store like a database, a file or a Web service.
And finally, the DataMapper pattern is used to move data between the object and a database while keeping them independent of each other. The DataMapper main responsibility is to transfer data between the two and also to isolate them from each other.
Here’s an example of the DataMapper pattern:
Database Table Structure
CREATE TABLE `user` ( `id` int(11) NOT NULL auto_increment, `first_name` varchar(100) NOT NULL, `last_name` varchar(100) NOT NULL, PRIMARY KEY (`id`) )
The User DAO
The DAO pattern provides a simple, consistent API for data access that does not require knowledge of an ORM interface. DAO does not just apply to simple mappings of one object to one relational table, but also allows complex queries to be performed and allows for stored procedures and database views to be mapped into data structures.
A typical DAO design pattern interface is shown below:
interface UserDao
{
public function fetchRow($id);
public function fetchAll();
public function insert($data);
public function update($id, $data);
public function delete($id);
}
class UserDatabaseDao implements UserDao
{
public function fetchRow($id)
{
$dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
$db = $dataSource->getConnection('slave');
$query = $db->select()->from('user')->where('id = ?', $id);
return $db->fetchRow($query);
}
public function fetchAll()
{
$dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
$db = $dataSource->getConnection('slave');
$query = $db->select()->from('user');
return $db->fetchAll($query);
}
public function insert($data)
{
$dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
$db = $dataSource->getConnection('master');
$db->insert('user', $data);
return $db->lastInsertId();
}
public function update($id, $data)
{
$dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
$db = $dataSource->getConnection('master');
$condition = $db->quoteInto('id = ?', $id);
return $db->update('user', $data, $condition);
}
public function delete($id)
{
$dataSource = Zf_Orm_Manager::getInstance()->getDataSource();
$db = $dataSource->getConnection('master');
$condition = $db->quoteInto('id = ?', $id);
return $db->delete('user', $condition);
}
}
The User Entity
An Entity is anything that has continuity through a life cycle and distinctions independent of attributes that are important to the application’s user.
class User
{
protected $id;
protected $firstName;
protected $lastName;
public function setId() {
}
public function getId() {
}
public function setFirstName() {
}
public function getFirstName() {
}
public function setLastName() {
}
public function getLastName() {
}
public function toArray() {
}
}
The User DataMapper
class UserDataMapper extends Zf_Orm_DataMapper
{
public function __construct()
{
$this->setMap(
array(
'id' => 'id',
'first_name' => 'firstName',
'last_name' => 'lastName'
)
);
}
}
Source Code: http://fedecarg.com/repositories/show/datamapper
The User Repository
Repositories play an important part in DDD, they speak the language of the domain and act as mediators between the domain and data mapping layers. They provide a common language to all team members by translating technical terminology into business terminology.
Lets create a UserRepository class to isolate the domain object from details of the DAO:
class UserRepository
{
private $databaseDao;
public funciton setDatabaseDao(UserDao $dao)
{
$this->databaseDao = $dao;
}
public function getDatabaseDao()
{
if (null === $this->databaseDao) {
$this->setDatabaseDao(new UserDatabaseDao());
}
return $this->databaseDao;
}
public function find($id)
{
$row = $this->getDatabaseDao()->fetchRow($id);
$mapper = new UserDataMapper();
$user = $mapper->assign(new User(), $row);
return $user;
}
}
The User Controller
class UserController extends Zend_Controller_Action
{
public function viewAction()
{
$userRepository = new UserRepository();
$user = $userRepository->find($this->_getParam('id'));
if ($user instanceof User) {
$id = $user->getId();
$firstName = $user->getFirstName();
$lastName = $user->getLastName();
// get an array of key value pairs
$row = $user->toArray();
}
}
}
That’s all, I hope you’ve found this post useful.
Redis: The Lamborghini of Databases
Antonio Cangiano wrote:
Redis is a key-value database written in C. It can be used like memcached, in front of a traditional database, or on its own thanks to the fact that the in-memory datasets are not volatile but instead persisted on disk. Redis provides you with the ability to define keys that are more than mere strings, as well as being able to handle multiple databases, lists, sets and even basic master-slave replication. It’s amazingly fast. On an entry level Linux box, Redis has been benchmarked performing 110,000 SET operations, and 81,000 GETs, per second.
Despite being a very young project, it already has client libraries for several languages: Python and PHP, Erlang, and Ruby. Salvatore Sanfilippo, the creator of Redis, has implemented a Twitter clone known as Retwis to showcase how you can use Redis and PHP to build applications without the need for a database like MySQL or any SQL query.
Salvatore will be publishing a beginner’s article based on the PHP Twitter clone he wrote, soon.
Full article: Introducing Redis: a fast key-value database
Database Migrations: The Next Killer PHP Application
This post, written by Adam Wiggins, got me thinking:
Migrations bother me. On one hand, migrations are the best solution we have for the problem of versioning databases. The scope of that problem includes merging schema changes from different developers, applying schema changes to production data, and creating a DRY representation of the schema. But even though migrations is the best solution we have, it still isn’t a very good one.
Schema synchronization is the killer feature missing in phpMyAdmin!
This feature can be found in some commercial applications, like SQLyog Enterprise for example. SQLyog’s Database Synchronization Tool (DST) is a visual comparison/synchronization tool designed for developers who work between different MySQL servers or need to keep databases between two MySQL servers in sync. This means reporting the differences between tables, indexes, columns of two databases, and generating the necessary “Up” and “Down” scripts to bring them in sync.
However, SQLyog DST is not free and only runs on Windows.
So, if you want to build the next killer PHP application, port SQLyog’s database synchronization tool to PHP and open source it. People will love you for it.
Links
Server-side Marker Clustering with PHP and Google Maps

As maps get busier, marker clustering is likely to be needed. Marker clustering is a technique by which several points of interest can be represented by a single icon when they’re close to one another.
Mika Tuupola wrote a PHP library that divides the map into a given number of cells and represents all the markers present in the same cell by a single icon. This icon shows the number of markers it symbolizes.
He also wrote an excellent post explaining how marker clustering works.
Related Posts
MySQL Split String Function
MySQL does not include a function to split a delimited string. However, it’s very easy to create your own function.
Create function syntax
A user-defined function is a way to extend MySQL with a new function that works like a native MySQL function.
CREATE [AGGREGATE] FUNCTION function_name
RETURNS {STRING|INTEGER|REAL|DECIMAL}
To create a function, you must have the INSERT privilege for the <mysql> database.
Split delimited strings
The following example function takes 3 parameters, performs an operation using an SQL function, and returns the result.
Function
CREATE FUNCTION SPLIT_STR(
x VARCHAR(255),
delim VARCHAR(12),
pos INT
)
RETURNS VARCHAR(255)
RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),
LENGTH(SUBSTRING_INDEX(x, delim, pos -1)) + 1),
delim, '');
Usage
SELECT SPLIT_STR(string, delimiter, position)
Example
SELECT SPLIT_STR('a|bb|ccc|dd', '|', 3) as third;
+-------+
| third |
+-------+
| ccc |
+-------+
ActiveRecord: JavaScript ORM Library
Aptana has just released a beta version of its ActiveRecord.js which is an ORM JavaScript library that implements the ActiveRecord pattern. It works with AIR and other environments:
ActiveRecord.js is a single file, MIT licensed, relies on no external JavaScript libraries, supports automatic table creation, data validation, data synchronization, relationships between models, life cycle callbacks and can use an in memory hash table to store objects if no SQL database is available.
Example:
var User = ActiveRecord.define('users',{
username: '',
email: ''
});
User.hasMany('articles');
var ryan = User.create({
username: 'ryan',
email: 'rjohnson@aptana.com'
});
var Article = ActiveRecord.define('articles',{
name: '',
body: '',
user_id: 0
});
Article.belongsTo('user');
var a = Article.create({
name: 'Announcing ActiveRecord.js',
user_id: ryan.id
});
a.set('name','Announcing ActiveRecord.js!!!');
a.save();
a.getUser() == ryan;
ryan.getArticleList()[0] == a;
Links
Geo Proximity Search: The Haversine Equation
I’m working on a project that requires Geo proximity search. Basically, what I’m doing is plotting a radius around a point on a map, which is defined by the distance between two points on the map given their latitudes and longitudes. To achieve this I’m using the Haversine formula (spherical trigonometry). This equation is important in navigation, it gives great-circle distances between two points on a sphere from their longitudes and latitudes. You can see it in action here: Radius From UK Postcode.
This has already been covered in some blogs, however, I found most of the information to be inaccurate and, in some cases, incorrect. The Haversine equation is very straight forward, so there’s no need to complicate things.
I’ve implemented the solution in SQL, Python and PHP. Use the one that suits you best.
SQL implementation
set @latitude=53.754842; set @longitude=-2.708077; set @radius=20; set @lng_min = @longitude - @radius/abs(cos(radians(@latitude))*69); set @lng_max = @longitude + @radius/abs(cos(radians(@latitude))*69); set @lat_min = @latitude - (@radius/69); set @lat_max = @latitude + (@radius/69); SELECT * FROM postcode WHERE (longitude BETWEEN @lng_min AND @lng_max) AND (latitude BETWEEN @lat_min and @lat_max);
Python implementation
from __future__ import division import math longitude = float(-2.708077) latitude = float(53.754842) radius = 20 lng_min = longitude - radius / abs(math.cos(math.radians(latitude)) * 69) lng_max = longitude + radius / abs(math.cos(math.radians(latitude)) * 69) lat_min = latitude - (radius / 69) lat_max = latitude + (radius / 69) print 'lng (min/max): %f %f' % (lng_min, lng_max) print 'lat (min/max): %f %f' % (lat_min, lat_max)
PHP implementation
$longitude = (float) -2.708077; $latitude = (float) 53.754842; $radius = 20; // in miles $lng_min = $longitude - $radius / abs(cos(deg2rad($latitude)) * 69); $lng_max = $longitude + $radius / abs(cos(deg2rad($latitude)) * 69); $lat_min = $latitude - ($radius / 69); $lat_max = $latitude + ($radius / 69); echo 'lng (min/max): ' . $lng_min . '/' . $lng_max . PHP_EOL; echo 'lat (min/max): ' . $lat_min . '/' . $lat_max;
It outputs the same result:
lng (min/max): -3.1983251898421/-2.2178288101579 lat (min/max): 53.464986927536/54.044697072464
That’s it. Happy Geolocating!
Four Great InfoQ Presentations
Hope you like these recommendations and if you know of any other good tech-related video, then please let me know.
1. Developing Expertise: Herding Racehorses, Racing Sheep
One of my favourites. In this presentation Dave Thomas (The Pragmatic Programmer) talks about expanding people’s expertise in their domains of interest by not treating them uniformly as they had the same amount of knowledge and level of experience.
2. Real World Web Services
Another good presentation. In this one Scott Davis provides a pragmatic, down-to-earth introduction to Web services as used in the real world by public sites, including SOAP-based and REST examples.
3. CouchDB and me
This presentation is different, and that’s why I like it so much. Damien Katz shares his experiences and reminds people how difficult but at the same time gratifying is to be an open source developer. He talks about the history of CouchDB development from a very personal point of view. His inspirations for CouchDB and why he decided to move my wife and kids to a cheaper place and live off savings to build this thing.
4. Yahoo! Communities Architectures
In this presentation, Ian Flint tries to explain the infrastructure and architecture employed by Yahoo! to keep going a multitude of servers running of different platforms and offering different services. Very interesting!