Federico Cargnelutti

Simple is better than complex. Complex is better than complicated. | @fedecarg

PHP Simple HTML DOM Parser (jQuery Style)


Simple HTML DOM Parser

The Simple HTML DOM Parser is implemented as a single PHP class and a few helper functions. It supports jQuery-style CSS selectors for screen scraping, can handle invalid HTML, and provides a familiar interface for manipulating the DOM.

Tutorial: Easy Screen Scraping in PHP with the Simple HTML DOM Library

Zend_Dom_Query

Zend_Dom_Query provides mechanisms for querying XML and (X)HTML documents utilizing either XPath or CSS selectors. It was developed to aid with functional testing of MVC applications, but could also be used for rapid development of screen scrapers.

CSS selector notation is provided as a simpler and more familiar notation for web developers to use when querying documents with XML structures. The notation should be familiar to anybody who has written Cascading Style Sheets or who uses JavaScript toolkits that select nodes with CSS selectors.
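
To make the trade-off between the two notations concrete, here is a minimal sketch that uses only PHP's built-in DOM extension, not Zend_Dom_Query itself. The markup and the `ul.nav a` selector are made up for illustration, and the XPath expression is a hand-written approximation of that selector, not output from any library.

```php
<?php
// Sample markup, invented for this example.
$html = '<html><body><div id="content"><ul class="nav">'
      . '<li><a href="/a">A</a></li><li><a href="/b">B</a></li>'
      . '</ul><a href="/c">C</a></div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Hand-written XPath roughly equivalent to the CSS selector "ul.nav a".
// The concat/normalize-space idiom matches "nav" as a whole class name.
$query = "//ul[contains(concat(' ', normalize-space(@class), ' '), ' nav ')]//a";

$hrefs = array();
foreach ($xpath->query($query) as $node) {
    $hrefs[] = $node->getAttribute('href');
}

echo implode(',', $hrefs), "\n"; // prints "/a,/b"
```

The link to `/c` is not matched because it sits outside the list. The CSS form says the same thing in eight characters, which is the convenience the selector-based libraries on this page are after.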

phpQuery

phpQuery is a server-side, chainable DOM selector and manipulator. CSS selectors are used to fetch nodes. It’s a partial port of the jQuery JavaScript library to PHP 5, and it doesn’t need jQuery to work. The API is compatible with jQuery 1.2.

DomQuery

DomQuery wraps various features of the DOM extension in a jQuery-like API, using the SPL ArrayObject class as the base for storage and iteration.
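
The general idea of building a chainable node set on top of ArrayObject can be sketched as follows. This is not DomQuery's actual API: the `NodeSet` class and its `fromHtml`, `withAttr`, and `attr` methods are hypothetical names chosen for illustration only.

```php
<?php
// Hypothetical class: illustrates the approach, not DomQuery's real interface.
class NodeSet extends ArrayObject
{
    // Build a set from every element with the given tag name in an HTML string.
    public static function fromHtml($html, $tag)
    {
        $doc = new DOMDocument();
        libxml_use_internal_errors(true); // tolerate sloppy markup quietly
        $doc->loadHTML($html);
        libxml_clear_errors();

        return new self(iterator_to_array($doc->getElementsByTagName($tag)));
    }

    // Return a new set holding only the nodes whose attribute equals $value.
    public function withAttr($name, $value)
    {
        $kept = array();
        foreach ($this as $node) {
            if ($node->getAttribute($name) === $value) {
                $kept[] = $node;
            }
        }
        return new self($kept);
    }

    // Collect one attribute from every node in the set.
    public function attr($name)
    {
        $values = array();
        foreach ($this as $node) {
            $values[] = $node->getAttribute($name);
        }
        return $values;
    }
}

$html = '<p class="x" id="one">1</p><p id="two">2</p><p class="x" id="three">3</p>';
$ids  = NodeSet::fromHtml($html, 'p')->withAttr('class', 'x')->attr('id');

echo count(NodeSet::fromHtml($html, 'p')), ' ', implode(',', $ids), "\n"; // prints "3 one,three"
```

Because ArrayObject already implements iteration, counting, and array access, the wrapper gets `foreach`, `count()`, and `$set[0]` for free, and each filtering method can return a new set to keep calls chainable.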

Written by Federico

August 7, 2008 at 7:09 pm

Posted in PHP, Programming

12 Responses


  1. Is this better than just using native DOM and xpath?

    Steve

    August 7, 2008 at 9:00 pm

  2. Very very nice to have! Thanks again!

    (Actually I was coding this myself, but now I have time for other things)

    Alexander Schmidt

    August 7, 2008 at 9:33 pm

  3. Something else you may be interested in trying out is the new Zend_Dom_Query component in ZF 1.6.0. It allows you to use CSS selectors to select DOM nodes from a document, similar to jQuery, Prototype’s bling-bling, and dojo.query. Internally, we’re also using it within the Zend_Test_PHPUnit infrastructure to make it easy to verify the structure of documents generated using ZF’s MVC within unit tests.

    Matthew Weier O'Phinney

    August 8, 2008 at 12:57 am

  4. Added :)

    Federico

    August 8, 2008 at 9:06 am

  5. I developed a similar library that implements programmatic selection of nodes through the API rather than using an expression parser. You can find it here: http://svn.assembla.com/svn/php_domquery/trunk/DomQuery.php. Unit tests are also available there.

    Matthew Turland

    August 9, 2008 at 4:08 pm

  6. Added to the list, great stuff, thanks!

    Federico

    August 10, 2008 at 1:45 am

  7. [...] (jQuery Style) 8 08, 2008 Author: PHPDeveloper.org On the PHP::Impact blog today Federico points out a few HTML DOM parsers that work similar to [...]

    LoveOfPHP

    August 10, 2008 at 10:56 am

  8. I’ve spent a lot of time developing something similar to phpQuery and still haven’t finished it. Thanks guys!

    Vadim Voituk

    August 16, 2008 at 10:46 am

  9. Found Simple HTML DOM Parser on my own, but your other additions are really interesting. I use jQuery a lot so phpQuery looks really nice, though for some reason Simple feels more natural server side. I like where phpQuery is going though and look forward to the AJAX clients that will be developed.

    Travell Perkins

    September 5, 2008 at 11:47 am

  10. For everyone who’s looking forward to Ajax support, it’s now available in phpQuery 0.9.4, together with Events and the WebBrowser plugin.

    If you’re interested, read the release notes:
    http://phpquery-library.blogspot.com/2008/09/phpquery-094-released.html

    Tobiasz Cudnik

    September 22, 2008 at 11:31 am

  11. It’s looking good, thanks Tobiasz.

    Federico

    September 22, 2008 at 11:34 am

  12. Thank you for the list. Very helpful.

    Jakup

    August 12, 2009 at 2:33 pm

