TDD: Checking the return value of a Stub

State verification is used to ensure that, after a method is run, the value returned by the SUT is as expected. To do this you may need to stub a method on a test double or a real object, telling it what value to return in response to a given message.

In Java, a method’s return type is part of its declaration: the value you return must match that declared type, or the compiler rejects the code. In PHP, by contrast, the return type is determined dynamically by whatever the body of the method happens to return. As a result, PHP mocking libraries cannot check the type of a stubbed return value and so cannot provide any guarantee about what is being verified.

This leads to an awkward situation: a refactoring can change the behaviour of the SUT or of one of its collaborators and leave a stub out of date, yet the tests keep passing. For example, consider the following:

Developer (A) creates 2 classes, Presenter and Collaborator:

class Presenter
{
    protected $collaborator;

    public function __construct(Collaborator $obj)
    {
        $this->collaborator = $obj;
    }

    public function doSomething()
    {
        $limit = 1;
        $stories = $this->collaborator->getStories($limit);
        // ...
        return $stories;
    }
}

class Collaborator
{
    public function getStories($limit)
    {
        return array();
    }
}

Then writes a test case:

class PresenterTest extends PHPUnit_Framework_TestCase
{
    // Behaviour verification
    public function testBehaviour()
    {
        $mock = $this->getMock('Collaborator', array('getStories'));
        $mock->expects($this->once())
            ->method('getStories')
            ->with(
                $this->logicalAnd(
                    $this->equalTo(1), $this->isType('integer')
                )
            );

        $presenter = new Presenter($mock);
        $presenter->doSomething();
    }

    // State verification
    public function testState()
    {
        $stub = $this->getMock('Collaborator', array('getStories'));
        $stub->expects($this->once())
            ->method('getStories')
            ->will($this->returnValue(array()));

        $presenter = new Presenter($stub);
        $data = $presenter->doSomething();

        $this->assertEquals(array(), $data);
    }
}

Developer (A), a mockist practitioner, uses a mock to verify behaviour and a stub to verify state, i.e. that the method returned the expected value. The first test asserts that the expectation on the mock is met; the second asserts that the returned value matches. Finally, the developer runs the suite and watches all of the tests pass. Great!

The next day Developer (B) decides to make a change to the Collaborator class so that it returns NULL when there are no stories:

class Collaborator
{
    public function getStories($limit)
    {
        $stories = array();
        if (count($stories) < 1) {
            return;
        }

        return $stories;
    }
}

The implementation of the collaborating method has changed: it now returns a different data type, null instead of an array. Our second test should therefore fail, but it doesn’t. The stub still returns an array, so the assertion keeps passing even though the real return type is now different. This is a problem: the second test is unable to verify the correct state of the SUT (and of its collaborator).

This is because most PHP mocking libraries are heavily influenced by Java (PHPUnit was originally a port of JUnit), and Java simply doesn’t have this problem. In PHP, the return type is not a required element of a method declaration, so a method is free to return whatever type it wants at run time.

The solution

You can use DocBlock annotations to have the stub check that the data type of the returned value matches the @return type declared on the method. For this to work you need to set the return value using ReturnValue instead of PHPUnit_Framework_MockObject_Stub_Return. For example:

class PresenterTest extends PHPUnit_Framework_TestCase
{
    // State verification
    public function testState()
    {
        $stub = $this->getMock('Collaborator', array('getStories'));
        $stub->expects($this->once())
            ->method('getStories')
            ->will(new ReturnValue(array()));

        $presenter = new Presenter($stub);
        $data = $presenter->doSomething();

        $this->assertEquals(array(), $data);
    }
}

Now, if you run the test against the original Collaborator (which declares no @return type in its DocBlock), it fails with the following error message:

PHPUnit_Framework_Exception: Invalid method declaration; return type required

The test also fails if the returned type doesn’t match the expected one defined in the DocBlock:

class Collaborator
{
    /**
     * @return int
     */
    public function getStories($limit)
    {
        // ...
    }
}

Error message:

PHPUnit_Framework_Exception: array does not match expected type "int"

Or, if you specify more than one data type in the DocBlock:

class Collaborator
{
    /**
     * @return array|null
     */
    public function getStories($limit)
    {
        // ...
    }
}

Error message:

PHPUnit_Framework_Exception: getStories cannot return more than one type, 2 given (array, null)
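
Conversely, once the DocBlock declares a single type that matches the stubbed value, the check is satisfied and testState should pass. For the example above, the annotated Collaborator would look like this:

class Collaborator
{
    /**
     * @return array
     */
    public function getStories($limit)
    {
        // ...
    }
}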

This solution is not perfect but should work in most cases.

Check whether your web server is correctly configured

Last year Zone-H reported a record 1.5 million website defacements. One million of those websites were running Apache.

When it comes to configuring a web server, some people tend to turn everything on by default. Developers are happy because the functionality they want is available without extra configuration, and support calls about features not working out of the box go down. This has proven to be a major source of security problems. A web server should start out fully locked down, with access rights then granted only where they are needed.
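
To make that locked-down starting point concrete, here is a minimal sketch of what the beginning of an Apache (2.2-era) configuration might look like; the paths and directives are illustrative, not a complete hardening checklist:

# Deny access to the entire filesystem by default
<Directory />
    Options None
    AllowOverride None
    Order deny,allow
    Deny from all
</Directory>

# Then grant access only where it is actually needed
<Directory /var/www/html>
    Order allow,deny
    Allow from all
</Directory>

# Avoid advertising server and version details
ServerTokens Prod
ServerSignature Off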

You can check whether your web server is correctly configured using Nikto, a great open source vulnerability scanner that checks for a large number of web server issues. From its site:

“Nikto is an Open Source (GPL) web server scanner which performs comprehensive tests against web servers for multiple items, including over 6400 potentially dangerous files/CGIs, checks for outdated versions of over 1200 servers, and version specific problems on over 270 servers. It also checks for server configuration items such as the presence of multiple index files, HTTP server options, and will attempt to identify installed web servers and software. Scan items and plugins are frequently updated and can be automatically updated.”

I’m going to run a default scan by just supplying the IP of the target:

$ cd nikto-2.1.4
$ ./nikto.pl -h 127.0.0.1

- ***** SSL support not available (see docs for SSL install) *****
- Nikto v2.1.4
---------------------------------------------------------------------------
+ Target IP:          127.0.0.1
+ Target Hostname:    localhost.localdomain
+ Target Port:        80
+ Start Time:         2011-12-12 13:06:59
---------------------------------------------------------------------------
+ Server: Apache
+ No CGI Directories found (use '-C all' to force check all possible dirs)
+ 6448 items checked: 0 error(s) and 0 item(s) reported on remote host
+ End Time:           2011-12-12 13:08:07 (68 seconds)
---------------------------------------------------------------------------
+ 1 host(s) tested

By looking at the last section of the Nikto report, I can see that there are no issues that need to be addressed.

Tools like Nikto and Skipfish serve as a foundation for professional web application security assessments. Remember, the more tools you use, the better.


JavaScript: Retrieve and paginate JSON-encoded data

I’ve created a jQuery plugin that allows you to retrieve a large data set in JSON format from a server script and load the data into a list or table with client side pagination enabled. To use this plugin you need to:

Include jquery.min.js and jquery.paginate.min.js in your document:

<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/jquery.paginate.min.js"></script>

Include a small CSS snippet to skin the navigation links:

<style type="text/css">
a.disabled {
    text-decoration: none;
    color: black;
    cursor: default;
}
</style>

Define an ID on the element you want to paginate, for example “listitems”. If you have more than 10 child elements and want to avoid displaying them before the JavaScript executes, you can hide the element by default:

<ul id="listitems" style="display:none"></ul>

Place a div where you want the navigation links to be displayed.

Finally, include an initialization script at the bottom of your page like this:

$(document).ready(function() {
    $.getJSON('data.json', function(data) {
        var items = [];
        $.each(data.items, function(i, item) {
            items.push('<li>' + item + '</li>');
        });
        $('#listitems').append(items.join(''));
        $('#listitems').paginate({itemsPerPage: 5});
    });
});

    You can fork the code on GitHub or download it.

    Building a RESTful Web API with PHP and Apify

    Apify is a small and powerful open source library that delivers new levels of developer productivity by simplifying the creation of RESTful architectures. You can see it in action here. Web services are a great way to extend your web application; however, adding a web API to an existing application can be a tedious and time-consuming task. Apify takes common patterns found in most web services and abstracts them so that you can quickly write web APIs without having to write too much code.

    Apify exposes an API similar to the Zend Framework’s, so if you are familiar with the Zend Framework you already know how to use Apify. Take a look at the UsersController class.

    Building a RESTful Web API

    In Apify, Controllers handle incoming HTTP requests, interact with the model to get data, and direct domain data to the response object for display. The full request object is injected into each action method and is primarily used to query for request parameters, whether they come from a GET or POST request or from the URL.
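
    As a quick illustration, here is a sketch of parameter access inside an action, using the getParam() and getPost() calls that appear later in this article (the action and field names are only examples):

    class UsersController extends Controller
    {
        public function updateAction($request)
        {
            // route parameter taken from the URL, e.g. /users/:id
            $id = $request->getParam('id');

            // form field taken from the POST body
            $name = $request->getPost('name');

            // ... interact with the model, then build a response

            return new Response();
        }
    }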

    Creating a RESTful Web API with Apify is easy. Each action results in a response, which holds the headers and document to be sent to the user’s browser. You are responsible for generating the response object inside the action method.

    class UsersController extends Controller
    {
        public function indexAction($request)
        {
            // 200 OK
            return new Response();
        }
    }

    The response object describes the status code and any headers that are sent. The default response is always 200 OK; however, it is possible to override the default status code and add additional headers:

    class UsersController extends Controller
    {
        public function indexAction($request)
        {
            $response = new Response();
    
            // 401 Unauthorized
            $response->setCode(Response::UNAUTHORIZED);
    
            // Cache-Control header
            $response->setCacheHeader(3600);
    
            // ETag header
            $response->setEtagHeader(md5($request->getUrlPath()));
    
            // X-RateLimit header
            $limit = 300;
            $remaining = 280;
            $response->setRateLimitHeader($limit, $remaining);
    
            // Raw header
            $response->addHeader('Edge-control: no-store');
    
            return $response;
        }
    }

    Content Negotiation

    Apify supports sending responses in HTML, XML, RSS and JSON. In addition, it supports JSONP, which is JSON wrapped in a custom JavaScript function call. There are 3 ways to specify the format you want, as the examples after this list show:

    • Appending a format extension to the end of the URL path (.html, .json, .rss or .xml)
    • Specifying the response format in the query string. This means a format=xml or format=json parameter for XML or JSON, respectively, which will override the Accept header if there is one.
    • Sending a standard Accept header in your request (text/html, application/xml or application/json).
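
    For example, each of the following requests asks for a JSON response (the paths are illustrative):

    GET /users.json                        (format extension)
    GET /users?format=json                 (query string parameter)
    GET /users with an Accept: application/json header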

    The acceptContentTypes method indicates that the request only accepts certain content types:

    class UsersController extends Controller
    {
        public function indexAction($request)
        {
            // only accept JSON and XML
            $request->acceptContentTypes(array('json', 'xml'));
    
            return new Response();
        }
    }

    If an exception is thrown, Apify renders the error message according to the format of the request:

    class UsersController extends Controller
    {
        public function indexAction($request)
        {
            $request->acceptContentTypes(array('json', 'xml'));
    
            $response = new Response();
            if (! $request->hasParam('api_key')) {
                throw new Exception('Missing parameter: api_key', Response::FORBIDDEN);
            }
            $response->api_key = $request->getParam('api_key');
    
            return $response;
        }
    }

    Request

    GET /users.json

    Response

    Status: 403 Forbidden
    Content-Type: application/json
    {
        "code": 403,
        "error": {
            "message": "Missing parameter: api_key",
            "type": "Exception"
        }
    }

    Resourceful Routes

    Apify supports REST style URL mappings where you can map different HTTP methods, such as GET, POST, PUT and DELETE, to different actions in a controller. This basic REST design principle establishes a one-to-one mapping between create, read, update, and delete (CRUD) operations and HTTP methods:

    HTTP Method   URL Path     Action    Used for
    GET           /users       index     display a list of all users
    GET           /users/:id   show      display a specific user
    POST          /users       create    create a new user
    PUT           /users/:id   update    update a specific user
    DELETE        /users/:id   destroy   delete a specific user

    If you wish to enable RESTful mappings, add the following line to the index.php file:

    try {
        $request = new Request();
        $request->enableUrlRewriting();
        $request->enableRestfulMapping();
        $request->dispatch();
    } catch (Exception $e) {
        $request->catchException($e);
    }

    The RESTful UsersController for the above mapping will contain 5 actions as follows:

    class UsersController extends Controller
    {
        public function indexAction($request) {}
        public function showAction($request) {}
        public function createAction($request) {}
        public function updateAction($request) {}
        public function destroyAction($request) {}
    }

    By convention, each action should map to a particular CRUD operation in the database.
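
    For example, a destroy action might look roughly like the sketch below. The User model’s delete() method and the Response::NO_CONTENT constant are assumptions used for illustration, not documented parts of the Apify API:

    class UsersController extends Controller
    {
        // route: DELETE /users/:id
        public function destroyAction($request)
        {
            $id = $request->getParam('id');

            // assumption: the model exposes a delete() method
            $this->getModel('User')->delete($id);

            // assumption: a NO_CONTENT constant exists alongside UNAUTHORIZED and FORBIDDEN
            $response = new Response();
            $response->setCode(Response::NO_CONTENT); // 204 No Content
            return $response;
        }
    }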

    Building a Web Application

    Building a web application can be as simple as adding a few methods to your controller. The only difference is that each method returns a view object.

    class PostsController extends Controller
    {
        /**
         * route: /posts/:id
         *
         * @param $request Request
         * @return View|null
         */
        public function showAction($request)
        {
            $id = $request->getParam('id');
            $post = $this->getModel('Post')->find($id);
            if (! isset($post->id)) {
                return $request->redirect('/page-not-found');
            }
    
            $view = $this->initView();
            $view->post = $post;
            $view->user = $request->getSession()->user;
    
            return $view;
        }
    
        /**
         * route: /posts/create
         *
         * @param $request Request
         * @return View|null
         */
        public function createAction($request)
        {
            $view = $this->initView();
            if ('POST' !== $request->getMethod()) {
                return $view;
            }
    
            try {
                $post = new Post(array(
                    'title' => $request->getPost('title'),
                    'text'  => $request->getPost('text')
                ));
            } catch (ValidationException $e) {
                $view->error = $e->getMessage();
                return $view;
            }
    
            $id = $this->getModel('Post')->save($post);
            return $request->redirect('/posts/' . $id);
        }
    }

    The validation is performed inside the Post entity class. An exception is thrown if any given value causes the validation to fail. This allows you to easily implement error handling for the code in your controller.

    Entity Class

    You can add validation to your entity class to ensure that the values sent by the user are correct before saving them to the database:

    class Post extends Entity
    {
        protected $id;
        protected $title;
        protected $text;
    
        // sanitize and validate title (optional)
        public function setTitle($value)
        {
            $value = htmlspecialchars(trim($value), ENT_QUOTES);
            if (empty($value) || strlen($value) < 3) {
                throw new ValidationException('Invalid title');
            }
            $this->title = $value;
        }
    
        // sanitize text (optional)
        public function setText($value)
        {
            $this->text = htmlspecialchars(strip_tags($value), ENT_QUOTES);
        }
    }

    Routes

    Apify provides a slimmed down version of the Zend Framework router:

    $routes[] = new Route('/posts/:id',
        array(
            'controller' => 'posts',
            'action'     => 'show'
        ),
        array(
            'id'         => '\d+'
        )
    );
    $routes[] = new Route('/posts/create',
        array(
            'controller' => 'posts',
            'action'     => 'create'
        )
    );

    HTTP Request

    GET /posts/1

    Incoming requests are dispatched to the controller “Posts” and action “show”.

    Feedback

    • If you encounter any problems, please use the issue tracker.
    • For updates follow @fedecarg on Twitter.
    • If you like Apify and use it in the wild, let me know.

    JavaScript: Asynchronous Script Loading and Lazy Loading

    Most of the time, remote scripts are included at the end of an HTML document, right before the closing body tag. This is because browsers are single threaded: when they encounter a script tag, they stop everything else until the script has been downloaded and parsed. By including scripts at the end, you let the browser download and render all page elements, style sheets and images without unnecessary delay. It also means that by the time any script executes, the page elements it needs are already available to retrieve.

    However, websites like Facebook use a more advanced technique: they include scripts dynamically via DOM methods. This technique, which I’ll briefly explain here, is known as “Asynchronous Script Loading”.

    Let’s take a look at the script that Facebook uses to download its JS library:

    (function () {
        var e = document.createElement('script');
        e.src = 'http://connect.facebook.net/en_US/all.js';
        e.async = true;
        document.getElementById('fb-root').appendChild(e);
    }());

    When you dynamically append a script to a page, the browser does not halt other processes, so it continues rendering page elements and downloading resources. The best place to put this code is right after the opening body tag. This allows Facebook initialization to happen in parallel with the initialization on the rest of the page.

    Facebook also makes non-blocking loading of the script easy to use by providing the fbAsyncInit hook. If this global function is defined, it will be executed when the library is loaded.

    window.fbAsyncInit = function () {
        FB.init({
            appId: 'YOUR APP ID',
            status: true,
            cookie: true,
            xfbml: true
        });
    };

    Once the library has loaded, Facebook checks the value of window.fbAsyncInit.hasRun and if it’s false it makes a call to the fbAsyncInit function:

    if (window.fbAsyncInit && !window.fbAsyncInit.hasRun) {
        window.fbAsyncInit.hasRun = true;
        fbAsyncInit();
    }

    Now, what if you want to load multiple files asynchronously, or include a small amount of code at page load and download other scripts only when needed? Loading scripts on demand is called “Lazy Loading”. Many libraries exist specifically for this purpose; however, you only need a few lines of JavaScript to do it yourself.

    Here is an example:

    $L = function (c, d) {
        for (var b = c.length, e = b, f = function () {
                if (!(this.readyState
                		&& this.readyState !== "complete"
                		&& this.readyState !== "loaded")) {
                    this.onload = this.onreadystatechange = null;
                    --e || d()
                }
            }, g = document.getElementsByTagName("head")[0], i = function (h) {
                var a = document.createElement("script");
                a.async = true;
                a.src = h;
                a.onload = a.onreadystatechange = f;
                g.appendChild(a)
            }; b;) i(c[--b])
    };

    The best place to put this code is inside the head tag. You can then use the $L function to asynchronously load your scripts on demand. $L takes two arguments: an array of script URLs (c) and a callback function (d) that is invoked once all of the scripts have loaded.

    var scripts = [];
    scripts[0] = 'http://www.google-analytics.com/ga.js';
    scripts[1] = 'http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.js';
    
    $L(scripts, function () {
        console.log("ga and jquery scripts loaded");
    });
    
    $L(['http://connect.facebook.net/en_US/all.js'], function () {
        console.log("facebook script loaded");
        window.fbAsyncInit.hasRun = true;
        FB.init({
            appId: 'YOUR APP ID',
            status: true,
            cookie: true,
            xfbml: true
        });
    });

    You can see this script in action here (right click -> view page source).

    Collective Wisdom from the Experts

    I’ve finally had a chance to read a book I bought a while ago called “97 Things Every Software Architect Should Know – Collective Wisdom from the Experts”. Not the shortest title for a book, but very descriptive. I bought this book at the OSCON Conference in Portland last year. It’s an interesting book and I’m sure anyone involved in software development would benefit from reading it.

    More than 40 architects, including Neal Ford and Michael Nygard, offer advice for communicating with stakeholders, eliminating complexity, empowering developers, and many more practical lessons they’ve learned from years of experience. The book offers valuable information on key development issues that go way beyond technology. Most of the advice given is from personal experience and is good for any project leader involved with software development no matter their job title. However, you have to keep in mind that this is a compilation book, so don’t expect in-depth information or theoretical knowledge about architecture design and software engineering.

    Here are some extracts from the book:

    Simplify essential complexity; diminish accidental complexity – By Neal Ford

    Frameworks that solve specific problems are useful. Over-engineered frameworks add more complexity than they relieve. It’s the duty of the architect to solve the problems inherent in essential complexity without introducing accidental complexity.

    Chances are your biggest problem isn’t technical – By Mark Ramm

    Most projects are built by people, and those people are the foundation for success and failure. So, it pays to think about what it takes to help make those people successful.

    Communication is King – By Mark Richards

    Every software architect should know how to communicate the goals and objectives of a software project. The key to effective communication is clarity and leadership.

    Keeping developers in the dark about the big picture or why decisions were made is a clear recipe for disaster. Having the developer on your side creates a collaborative environment whereby decisions you make as an architect are validated. In turn, you get buy-in from developers by keeping them involved in the architecture process.

    Architecting is about balancing – By Randy Stafford

    When we think of architecting software, we tend to think first of classical technical activities, like modularizing systems, defining interfaces, allocating responsibility, applying patterns, and optimizing performance. Architects also need to consider security, usability, supportability, release management, and deployment options, among other things. But these technical and procedural issues must be balanced with the needs of stakeholders and their interests.

    Software architecting is about more than just the classical technical activities; it is about balancing technical requirements with the business requirements of stakeholders in the project.

    Skyscrapers aren’t scalable – By Michael Nygard

    We cannot easily add lanes to roads, but we’ve learned how to easily add features to software. This isn’t a defect of our software processes, but a virtue of the medium in which we work. It’s OK to release an application that only does a few things, as long as users value those things enough to pay for them.

    Quantify – By Keith Braithwaite

    The next time someone tells you that a system needs to be “scalable” ask them where new users are going to come from and why. Ask how many and by when? Reject “Lots” and “soon” as answers. Uncertain quantitative criteria must be given as a range: the least, the nominal, and the most. If this range cannot be given, then the required behavior is not understood.

    Some simple questions to ask: How many? In what period? How often? How soon? Increasing or decreasing? At what rate? If these questions cannot be answered then the need is not understood. The answers should be in the business case for the system and if they are not, then some hard thinking needs to be done.

    Architects must be hands on – By John Davies

    A good architect should lead by example; he or she should be able to fill any of the positions within the team, from wiring the network and configuring the build process to writing the unit tests and running benchmarks. It is perfectly acceptable for team members to have more in-depth knowledge in their specific areas, but it’s difficult to imagine how team members can have confidence in their architect if the architect doesn’t understand the technology.

    Use uncertainty as a driver – By Kevlin Henney

    Confronted with two options, most people think that the most important thing to do is to make a choice between them. In design (software or otherwise), it is not. The presence of two options is an indicator that you need to consider uncertainty in the design. Use the uncertainty as a driver to determine where you can defer commitment to details and where you can partition and abstract to reduce the significance of design decisions.

    You can purchase “97 Things Every Software Architect Should Know” from Amazon.

    NoSQL solutions: Membase, Redis, CouchDB and MongoDB

    Each database has specific use cases and every solution has a sweet spot in terms of data, hardware, setup and operation. Here are some of the most popular key-value and document data stores:

    Key-value

    Membase

    • Developed by members of the memcached core team.
    • Simple (key value store), fast (low, predictable latency) and elastic (effortlessly grow or shrink a cluster).
    • Extensions are possible through a plug-in architecture (full-text search, backup, etc).
    • Supports the Memcached ASCII and binary protocols, so existing Memcached libraries and clients work unchanged (see the sketch after this list).
    • Guarantees data consistency.
    • High-speed failover (server failures recoverable in under 100ms).
    • User management, alerts and logging and audit trail.
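
    Because Membase is protocol-compatible with memcached, the standard PHP Memcached extension can talk to it unchanged. A minimal sketch (host, port, key and value are illustrative):

    $mb = new Memcached();
    $mb->addServer('127.0.0.1', 11211);

    $mb->set('user:42', json_encode(array('name' => 'Alice')), 3600);
    $user = json_decode($mb->get('user:42'), true);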

    Redis

    • Developed by Salvatore Sanfilippo, who was hired by VMware in 2010 to work on Redis full time.
    • Very fast. Non-blocking I/O. Single threaded.
    • Data is held in memory but can be persisted by writing it to disk asynchronously.
    • Values can be strings, lists or sets.
    • Built-in support for master/slave replication.
    • Distributes the dataset across multiple Redis instances.

    Document-oriented

    The major benefit of using a document database comes from the fact that while it has all the benefits of a key/value store, you aren’t limited to just querying by key. However, document-oriented databases and MapReduce aren’t appropriate for every situation.

    CouchDB

    • High read performance.
    • Supports bulk inserts.
    • Good for consistent master-master replica databases that are geographically distributed and often offline.
    • Good for intense versioning.
    • Android, MeeGo and WebOS include services for syncing locally stored data with a CouchDB non-relational database in the cloud.
    • Better than MongoDB at durability.
    • Uses REST as its interface to the database. It doesn’t have “queries” but instead uses “views”.
    • Makes heavy use of the file system cache (so more RAM is always better).
    • The database must be compacted periodically.
    • Conflicts on transactions must be handled by the programmer manually (e.g. if someone else has updated the document since it was fetched, then CouchDB relies on the application to resolve versioning issues).
    • Scales through asynchronous replication but lacks an auto-sharding mechanism. Reads are distributed to any server while writes must be propagated to all servers.

    MongoDB

    • High write performance. Good for systems with very high update rates.
    • It has the flexibility to replace a relational database in a wider range of scenarios.
    • Supports auto-sharding.
    • More oriented towards master/slave replication.
    • Compaction of the database is not necessary.
    • Both CouchDB and MongoDB support map/reduce operations.
    • Supports dynamic ad hoc queries via a JSON-style query language.
    • The pre-filtering provided by the query attribute doesn’t have a direct counterpart in CouchDB. It also allows post-filtering of aggregated values.
    • Relies on language-specific database drivers for access to the database (a short sketch follows this list).
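
    As a rough illustration of the driver-based access and the JSON-style query language mentioned above, here is a sketch using the legacy Mongo PHP extension of that era (database, collection and field names are made up):

    $conn  = new Mongo(); // connects to localhost:27017 by default
    $posts = $conn->selectDB('blog')->selectCollection('posts');

    $posts->insert(array('author' => 'alice', 'views' => 10));

    // ad hoc query on a non-key field, expressed as a JSON-style array
    $cursor = $posts->find(array('views' => array('$gt' => 5)));
    foreach ($cursor as $doc) {
        echo $doc['author'], "\n";
    }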

    Links