Domain-Driven Design: Data Access Strategies

Part 1: Domain-Driven Design and MVC Architectures

The Domain Model

Here are some of the features a Domain-driven design framework should support:

  • A domain model that is independent and decoupled from the application.
  • A reusable library that can be used in many different domain-specific applications.
  • Dependency Injection in order to inject Repositories and Services into Domain Objects.
  • Integration with unit testing frameworks, such as PHPUnit.
  • Good integration with other frameworks, such as Zend, Symfony, Doctrine, etc.

The Zend Framework, for example, is part of the infrastructure layer and acts as a supporting library for all the other layers. However, the domain layer should be well isolated from the other layers of the application and should not be dependent on the framework you are using.

Data Access Strategies

When accessing data from a data source, you have to decide how your application will communicate with a data source. Each data access strategy has its own advantages and disadvantages. Here are some design patterns that support DDD:

Generic DAO’s

Data Access Objects (DAO’s) and Repositories play an important role in DDD. The goal of a DAO is to abstract and encapsulate all access to the data and provide an interface. The DAO always connects, reads and saves data to a data source. From the applications point of view, it makes no difference when it accesses a database, XML file or Web service:

$user = new UserDbDao();
$row = $user->findUserById(456);
echo $row->name;

$user = new UserXmlDao();
$row = $user->findUserById(456);
echo $row->name;

Table Data Gateway

An object that acts as a Gateway to a database table. One instance handles all the rows in the table. The Zend_Db_Table solution is an implementation of the Table Data Gateway pattern:

$user = new User();
$rows = $user->find(456);
$row = $rows->current();
echo $row->name;

More info: Table Data Gateway

Row Data Gateway

An object that acts as a Gateway to a single record in a data source. There is one instance per row. Zend_Db_Table_Row is an implementation of the Row Data Gateway pattern:

$user = new User();
$row = $user->fetchRow($user->select()->where('user_id = ?', 1));
echo $row->name;

More info: Row Data Gateway

Active Record

An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data. The Mad_Model component is an  ORM layer that follows the Active Record pattern where tables map to classes, rows map to objects, and columns to object attributes:

$user = new User();
$userFinder = new UserFinder();
$row = $userFinder->find(456);

More info: Active Record

Data Mapper

A layer of Mappers that moves data between objects and a database while keeping them independent of each other and the mapper itself:

$user = new User();
$userMapper = new UserMapper();
$row = $userMapper->findByEmail($email);

More info: Data Mapper

Data Transfer Objects

A Data Transfer Object (DTO) in Domain-driven design is not the same as a Value Object. A DTO is often used in combination with a DAO to encapsulate data. It allows you to reduce communication effort when dealing with a lot of small data entities. For example:

class UserData
{
    public function setId($id) {}
    public function getId() {}
    public function setFirstName($firstName) {}
    public function getFirstName() {}
    public function setLastName($lastName) {}
    public function getLastName() {}
}

$user = new UserData();
$user->setFirstName('Jim');
$user->setLastName('Morrison');

$userDao = new UserDbDao();
$userDao->insert($user);

A DTO is a simple container for a set of aggregated data. It should contain no business logic and limit its behaviour to activities such as internal consistency checking and basic validation. Be careful not to make the DTO depend on any new classes as a result of implementing these methods.

Part 3: Domain-Driven Design: The Repository

Domain-Driven Design and MVC Architectures

According to Eric Evans, Domain-driven design (DDD) is not a technology or a methodology. It’s a different way of thinking about how to organize your applications and structure your code. This way of thinking complements very well the popular MVC architecture. The domain model provides a structural view of the system. Most of the time, applications don’t change, what changes is the domain. MVC, however, doesn’t really tell you how your model should be structured. That’s why some frameworks don’t force you to use a specific model structure, instead, they let your model evolve as your knowledge and expertise grows.

Domain-driven design separates the model layer “M” of MVC into an application, domain and infrastructure layer. The infrastructure layer is used to retrieve and store data. The domain layer is where the business knowledge or expertise is. The application layer is responsible for coordinating the infrastructure and domain layers to make a useful application. Typically, it would use the infrastructure to obtain the data, consult the domain to see what should be done, and then use the infrastructure again to achieve the results. Srini Penchikala explains this in more detail here: “Domain Driven Design and Development In Practice“.

Object-oriented programming is the most important element in the domain implementation. Domain objects are designed using classes and interfaces, and take advantage of OOP concepts like inheritance, encapsulation, and polymorphism. Most of the domain elements are true objects with both State (attributes) and Behaviour (methods or operations that act on the state). Entities and Value Objects in DDD are classic examples of OOP concepts since they have both state and behaviour.

Terminology used by Domain-driven design

  • Entity: An object which has a distinct identity. For example, a User entity has a distinct identity with an ID.
  • Value Object: An object which has no distinct identity, like numbers and dates. For example, if two Users have the same date of birth, you can have multiple copies of an object that represents the date 16 Jan 1982.
  • Factory: Used to create Entities. For example, you can use a Factory to create a Profile entity from a User entity.
  • Repository: Used to store, retrieve and delete domain objects from different storage implementations. For example, you can use a Repository to store the Profile your Factory created.
  • Service: When an operation does not conceptually belong to any object.

The purpose of this post was to provide a brief introduction to Domain-driven design.

Part 2: Domain-Driven Design: Data Access Strategies

Links

If you’re interested in learning more about Domain-driven design or Model-driven design, I recommend the following articles:

Zend Framework: The Cost of Flexibility is Complexity

The Zend Framework is a very flexible system, in fact, it can be used to build practically anything. But, the problem with using a flexible system is that flexibility costs. And the cost of flexibility is complexity.

Martin Fowler wrote:

Every time you put extra stuff into your code to make it more flexible, you are usually adding more complexity. If your guess about the flexibility needs of your software is correct, then you are ahead of the game. You’ve gained. But if you get it wrong, you’ve only added complexity that makes it more difficult to change your software.

Don’t assume that just because you’re using an object-oriented framework you are writing reusable and maintainable code. You are just structuring your spaghetti code. As Paul Graham put it: “Object-oriented programming offers a sustainable way to write spaghetti code”. The main problem with flexibility is that most developers give up trying to understand. I don’t blame them, no one likes dealing with complexity. However, flexibility is sometimes required and generally always desirable. But, sometimes we forget that frameworks exist to help us solve or address complex issues, not to fix bad design decisions for us. So if your code is difficult to understand or unit test, then you are probably doing something wrong.

Just because you can, doesn’t mean you should

What’s the best way to render a view script in the Zend Framework? Think, don’t give up yet. Is it enabling or disabling the ViewRenderer action helper? Remember, the framework allows you to shape the Zend_Controller_Action class to your application’s needs.

The manual clearly states that:

By default, the front controller enables the ViewRenderer action helper. This helper takes care of injecting the view object into the controller, as well as automatically rendering views.

However, no one stops you from doing it manually:

Zend_Controller_Action provides a rudimentary and flexible mechanism for view integration. Two methods accomplish this, initView() and render(); the former method lazy-loads the $view public property, and the latter renders a view based on the current requested action, using the directory hierarchy to determine the script path.

Developer A

class IndexController extends Zend_Controller_Action
{
    public function searchAction()
    {
        $query = $this->_getParam('q', null);
        if (null === $query) {
            $this->view->error = 'Invalid query';
            return;
        }

        $this->view->query = $query;
        if (is_numeric($query)) {
            $this->setUserDetailsById($query);
        } else {
            $this->setUserDetailsByName($query);
        }
    }

    public function setUserDetailsByName($name)
    {
        $row = $this->findUserByName($name);
        $this->view->userId = $row['id'];
        $this->view->userName = $row['name'];
    }

    public function setUserDetailsById($id)
    {
        $row = $this->findUserById($id);
        $this->view->userId = $row['id'];
        $this->view->userName = $row['name'];
    }
}

Developer B

class IndexController extends Zend_Controller_Action
{
    protected $_viewParams = array(
        'error'    => null,
        'query'    => null,
        'userId'   => null,
        'userName' => null,
    );

    public function init()
    {
        $this->_helper->removeHelper('viewRenderer');
    }

    public function searchAction()
    {
        $query = $this->_getParam('q', null);
        if (null === $query) {
            $this->_viewParams['error'] = 'Invalid query';
            $this->renderView('search');
            return false;
        }

        return $this->displayUserDetails($query);
    }

    public function displayUserDetails($query)
    {
        if (is_numeric($query)) {
            $row = $this->getUserDetailsById($query);
        } else {
            $row = $this->getUserDetailsByName($query);
        }

        if (!$row) {
            $this->_viewParams['error'] = 'User not found';
            $this->renderView('search');
            return false;
        }

        foreach ($this->_viewParams as $key => $value) {
            if (!array_key_exists($key, $row) {
                continue;
            }
            $this->_viewParams[$key] = $row[$key];
        }

        $this->_viewParams['query'] = $query;
        $this->renderView('search');

        return true;
    }

    public function renderView($template)
    {
        $view = $this->initView();
        foreach ($this->_viewParams as $key => $value) {
            $view->$key = $value;
        }
        $this->render($template);
    }

    public function getUserDetailsByName($name)
    {
        return $this->findUserByName($name);
    }

    public function getUserDetailsById($id)
    {
        return $this->findUserById($id);
    }
}

Same problem, different solutions.

Now, guess who is doing test-driven development, has all his code unit tested, puts commonly used code into libraries, abstracts out common patterns of behaviour, writes his code using the top-down fashion and uses the principle of “least astonishment”.

Developer A or B?

Conclusion

Many requirements lie in the future and are unknowable at the time our application is designed and built. To avoid burdensome maintenance costs developers must therefore rely on a system’s ability to change gracefully its flexibility. The combination of flexibility and smart design can reduce the maintenance burden. It’s up to us, the developers, to change our thinking when using a flexible framework. Like I said before, frameworks exist to help us solve complex issues, not to make decisions for us.

Four Great InfoQ Presentations

Hope you like these recommendations and if you know of any other good tech-related video, then please let me know.

1. Developing Expertise: Herding Racehorses, Racing Sheep

One of my favourites. In this presentation Dave Thomas (The Pragmatic Programmer) talks about expanding people’s expertise in their domains of interest by not treating them uniformly as they had the same amount of knowledge and level of experience.

Developing Expertise

2. Real World Web Services

Another good presentation. In this one Scott Davis provides a pragmatic, down-to-earth introduction to Web services as used in the real world by public sites, including SOAP-based and REST examples.

Real World Web Services

3. CouchDB and me

This presentation is different, and that’s why I like it so much. Damien Katz shares his experiences and reminds people how difficult but at the same time gratifying is to be an open source developer. He talks about the history of CouchDB development from a very personal point of view. His inspirations for CouchDB and why he decided to move my wife and kids to a cheaper place and live off savings to build this thing.

CouchDB and me

4. Yahoo! Communities Architectures

In this presentation, Ian Flint tries to explain the infrastructure and architecture employed by Yahoo! to keep going a multitude of servers running of different platforms and offering different services. Very interesting!

Yahoo! Communities Architectures

Memcached consistent hashing mechanism

If you are using the Memcache functions through a PECL extension, you can set global runtime configuration options by specifying the values within your php.ini file. One of them is memcache.hash_strategy. This option sets the hashing mechanism used to select and specifies which hash strategy to use: Standard (default) or Consistent.

It’s recommended that you set to Consistent to allow servers to be added or removed from the pool without causing the keys to be remapped to other servers. When set to standard, an older strategy is used that potentially uses different servers for storage.

With PHP, the connections to the memcached instances are kept open as long as the PHP and associated Apache instance remain running. When adding a removing servers from the list in a running instance, the connections will be shared, but the script will only select among the instances explicitly configured within the script.

So, to ensure that changes to the server list within a script do not cause problems, make sure to use the consistent hashing mechanism.

Getting Started With Message Queues

When you’re building an infrastructure that is distributed all over the internet, you’ll come to a point where you can’t rely on synchronous remote calls that, for example, synchronize data on 2 servers:

  1. You don’t have any failover system that resends messages if something went wrong (network outages, software failures).
  2. Messages are processed over time and you have no control if something goes overloaded by too many requests.

Even if you don’t have to send messages all over the Internet there are enough points of failures where something can go wrong. You want a reliable and durable system that fails gracefully and ensure.

Solutions

Dropr

Dropr is a distributed message queue framework written in PHP. The main goals are:

  • Reliable and durable (failsafe)-messaging over networks.
  • Decentralized architecture without a single (point of failure) server instance.
  • Easy to setup and use.
  • Modularity for queue storage and message transports (currently filesystem storage and curl-upload are implemented).

More info

Beanstalkd

Beanstalkd is a fast, distributed, in-memory workqueue service. Its interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running most time-consuming tasks asynchronously.

It was developed to improve the response time for the Causes on Facebook application (with over 9.5 million users). Beanstalkd drastically decreased the average response time for the most common pages to a tiny fraction of the original, significantly improving the user experience.

More info

Zend Platform Job Queues

Job Queues is an approach to streamline offline processing of PHP scripts. Job Queue Server provides the ability to reroute and delay the execution of PHP scripts that are not essential during user interaction with the Web Server. Job Queues improve the response time during interactive web sessions and utilizes unused resources.

More info

Memcached as simple message queue

In this post, Olly explains how to use memcached as a simple message queue:

Some months ago at work we were in the need of a message queue, a very simple one, basically just a message buffer. The idea is simple, the webservers send there messages to the queue, the queue always accepts all messages and waits until the ETL processes request messages for further processing. As the webservers are time critical and the ETL processes aren’t you need something in between.

More info

Links

Make your site run 10 times faster

This is what Mike Peters says he can do: make your site run 10 times faster. His test bed is “half a dozen servers parsing 200,000 pages per hour over 40 IP addresses, 24 hours a day.” Before optimization CPU spiked to 90% with 50 concurrent connections. After optimization each machine “was effectively handling 500 concurrent connections per second with CPU at 8% and no degradation in performance.”

Mike identifies six major bottlenecks