Introducing DomQuery!!!
I've been doing a great deal of work on the latest version of my XML/XSL-based framework, Sauc'd, and came to the conclusion that the DOMDocument class and the other classes in PHP's DOM library, while VERY useful, are fucking terrible to work with. By themselves, they take a serious amount of code to iterate across an XML document just to find the node, or group of nodes, that you're looking for. And when you finally find what you need, you have to implement migraine-inducing logic just to manipulate the resulting nodes.
There's gotta be an easier way and, guess what, THERE IS! It's called XPath. XPath lets you traverse a DOM with ease and select only the data you want using a simple query language. Think of XPath as the counterpart to the CSS selectors you use with popular JavaScript frameworks like jQuery, MooTools or Prototype. PHP 5.x ships with another class that you can use alongside DOMDocument to apply XPath to your DOM.
DOMXPath, apparently, was the answer I was looking for. Using it is simple enough:
$DOM = new DOMDocument();
$DOM->loadXML('<root><item>one</item><item>two</item></root>');
$XPath = new DOMXPath($DOM);
$results = $XPath->query('//root/item[2]');
- Create an instance of the DOMDocument class
- Feed it some XML
- Create an instance of DOMXPath and feed it the instance of DOMDocument
- Query the XML document with an XPath expression. The one above returns the second 'item' node.
See? Simple. The query hands back a DOMNodeList, which you can loop over with foreach much like you would an array.
But while it makes searching through XML documents much easier, it's still lacking in simplicity and ease of use. There is definitely a learning curve. There aren't many convenience methods included, and the poorly put together documentation in the PHP manual pretty much leaves a newbie in the dark. You REALLY have to know what you're doing when it comes to playing with PHP's native DOM functionality.
So, long story short, I thought of how cool it would be if there were a utility that mimicked the functionality and usability of jQuery. So last night I created the DomQuery library. It functions almost exactly like jQuery does, right down to chainable commands!
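To make the iteration step concrete, here is a minimal sketch that uses only PHP's built-in DOM classes. The variable names are just for illustration, and the query is widened to '//root/item' so there is more than one node to loop over.

```php
<?php
// Build the document and the XPath helper, as in the snippet above.
$DOM = new DOMDocument();
$DOM->loadXML('<root><item>one</item><item>two</item></root>');
$XPath = new DOMXPath($DOM);

// query() returns a DOMNodeList: usable in foreach, but not a real array.
$results = $XPath->query('//root/item');

$values = array();
foreach ($results as $node) {
    // Each entry is a DOMElement; nodeValue is its text content.
    $values[] = $node->nodeName . '=' . $node->nodeValue;
}

echo implode(', ', $values), "\n";          // item=one, item=two
echo $results->length, "\n";                // 2
echo $results->item(1)->nodeValue, "\n";    // two
```

Because it isn't a true array, use ->length and ->item($i) instead of count() and square brackets on older PHP versions.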
Here is a small example of what it can do:
The XML:
<root>
<item>Item One</item>
<item>Item Two</item>
<item test="omg">Item Three</item>
<item>Item Four</item>
<item>Item Five</item>
<parent>
<child>omg</child>
<child>
<test/>
</child>
<child test="hai">omg</child>
</parent>
<copy>
<default/>
</copy>
</root>
The Code:
$Xml = new DomQuery;
$Xml->load($xml)
->path('//*[@test]')
->removeAttr('test')
->path('//root/parent/child[3]')
->attr('foo', 'bar')
->replicate('//root/copy/default');
header('Content-Type: text/xml');
die($Xml->saveXml());
The Result:
<root>
<item>Item One</item>
<item>Item Two</item>
<item>Item Three</item>
<item>Item Four</item>
<item>Item Five</item>
<parent>
<child>omg</child>
<child>
<test/>
</child>
<child foo="bar">omg</child>
</parent>
<copy>
<default>
<child foo="bar">omg</child>
</default>
</copy>
</root>
What The?
Yes, it does exactly what it looks like it's doing:
- Create an instance of DomQuery
- Feed it some XML (HTML too!)
- Search for all nodes that have an attribute named 'test'
- REMOVE those attributes!
- Get the 3rd 'child' node within 'parent'
- Give it the attribute 'foo' with the value 'bar'
- Copy the last matched result into '//root/copy/default'
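For comparison, here is a rough sketch of the same steps written against the native DOMDocument and DOMXPath classes. This is not DomQuery's internal code, just one way to get an equivalent document by hand, and it assumes replicate() appends a deep copy of each match to the destination node.

```php
<?php
$xml = <<<XML
<root>
<item>Item One</item>
<item>Item Two</item>
<item test="omg">Item Three</item>
<item>Item Four</item>
<item>Item Five</item>
<parent>
<child>omg</child>
<child>
<test/>
</child>
<child test="hai">omg</child>
</parent>
<copy>
<default/>
</copy>
</root>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// path('//*[@test]') + removeAttr('test')
foreach ($xpath->query('//*[@test]') as $el) {
    $el->removeAttribute('test');
}

// path('//root/parent/child[3]') + attr('foo', 'bar')
$matched = $xpath->query('//root/parent/child[3]');
foreach ($matched as $el) {
    $el->setAttribute('foo', 'bar');
}

// replicate('//root/copy/default'): deep-copy every match into each target
foreach ($xpath->query('//root/copy/default') as $target) {
    foreach ($matched as $el) {
        $target->appendChild($el->cloneNode(true));
    }
}

echo $doc->saveXML();
```

Three loops and a pile of temporary variables versus one chain. That's the whole point of the library.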
How's About This:
Unfortunately, PHP versions before 5.3 don't support lambda functions or closures like JavaScript does, so we can't do something fancy like:
$Xml->load($xml)->path('//root/item/*')->each(function(&$Element){ echo $Element->nodeName; });
In PHP 5.3.x this would apply a callback function to each of the results, allowing you to modify them in any way you wish. Since most of the people who would be using this don't have that version of PHP in production yet, I had to scale the functionality back a bit to accommodate the most popular setups. So, instead of the above, you can do this:
$Xml->load($xml)->path('//root/item/*')->walk('function_name', 'param1', 'param2');
This will apply a callback to every matched node. Simply put, it works by invoking call_user_func_array, which executes the named function and passes along any extra parameters you gave to the 'walk' method. Walk also passes in the current result's context by reference, so you can manipulate the current state of the document, the entire result set of the last pattern and the currently matched element:
$context = array
(
'results' => &$this->results,
'element' => &$Element,
'context' => &$this
);
$context will always be the first parameter passed to your callback. Any other arguments you passed to walk follow it, in the same order.
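Here is a minimal sketch of how a walk-style method can be built on call_user_func_array. WalkSketch and tag_node are hypothetical names made up for this example, not part of DomQuery, and the sketch passes the element and the object by value (objects are handles in PHP 5+) instead of by reference.

```php
<?php
// Hypothetical stand-in for DomQuery: only path() and walk() are sketched.
class WalkSketch
{
    public $results = array();
    private $xpath;

    public function __construct($xml)
    {
        $doc = new DOMDocument();
        $doc->loadXML($xml);
        $this->xpath = new DOMXPath($doc);
    }

    // Run an XPath query and remember the matches for later calls.
    public function path($expression)
    {
        $this->results = array();
        foreach ($this->xpath->query($expression) as $node) {
            $this->results[] = $node;
        }
        return $this; // returning $this is what makes chaining possible
    }

    // Call $callback once per match: context first, extra arguments after.
    public function walk($callback)
    {
        $extra = array_slice(func_get_args(), 1);
        foreach ($this->results as $Element) {
            $context = array(
                'results' => &$this->results,
                'element' => $Element,
                'context' => $this,
            );
            call_user_func_array($callback, array_merge(array($context), $extra));
        }
        return $this;
    }
}

// Example callback: set an attribute on the currently matched element.
function tag_node($context, $name, $value)
{
    $context['element']->setAttribute($name, $value);
}

$sketch = new WalkSketch('<root><item>one</item><item>two</item></root>');
$sketch->path('//root/item')->walk('tag_node', 'seen', 'yes');

echo $sketch->results[0]->getAttribute('seen'), "\n"; // yes
```

On PHP 5.3 and later the same walk() would accept an anonymous function in place of the function name, since call_user_func_array takes any callable.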
What Else?
This library is still in development, but a ton of functionality is already included. Here is a list of fully supported functions:
- after(): Insert content after each of the matched elements.
- append(): Append content to the inside of every matched element.
- appendTo(): Append all of the matched elements to another, specified, set of elements.
- attr(): Returns, adds and edits the specified attributes of a matched element.
- before(): Insert content before each of the matched elements.
- clear(): Clears the contents of a matched element.
- path(): Apply an XPath pattern to a DOM and save the results.
- prepend(): Prepend content to the inside of every matched element.
- prependTo(): Prepend all of the matched elements to another, specified, set of elements.
- remove(): Removes a matched element.
- removeAttr(): Removes a specified attribute from a matched element.
- replace(): Replaces a matched element with another one.
- replicate(): Copies a set of matched elements to a destination represented by another path.
- walk(): Apply a user-defined callback function to a matched element.
I'll be posting more examples and tutorials down the road. Hopefully, I'll get this library to the point where I'll release it for all to use in the next few weeks. It's been a while since I released something fun and useful. I hope you guys will enjoy it.
If you're interested in testing this thing out with me or if you have any suggestions, post them in the comment area of this article.


