Kenjun

May 19, 2008

XML for PHP developers – Advanced XML parsing

Filed under: php, xml — kenjun @ 6:14 am

Reading, manipulating, and writing XML in PHP5

SimpleXML, combined with the DOM where necessary, is the ideal choice for developers who need to read, manipulate, and write straightforward, predictable, and relatively small XML documents in PHP5.

Quick start APIs of choice

Of the many XML APIs available in PHP5, the DOM is the most familiar and SimpleXML is the easiest to code. For the most common situations, like those you are dealing with here, they are also the most functional.

DOM extension

The Document Object Model (DOM) is a W3C standard set of objects for representing HTML and XML documents, a standard model of how you can combine these objects, and a standard interface for accessing and manipulating them. Many vendors support the DOM as an interface to their proprietary data structures and APIs, so most developers already know it, and that familiarity gives the DOM model a lot of authority. The DOM is easy to understand and use because its structure in memory resembles the original XML document. It is a tree-based parser: to pass information on to the application, it creates a tree of objects that exactly duplicates the tree of elements in the XML file, with every XML element being a node in the tree. Because the DOM builds a tree of the entire document, it uses a lot of memory and processor time, which makes it impractical for parsing large documents. In the context of this article, the key use of the DOM extension is its ability to import a SimpleXML object and output DOM-format XML, or the reverse, for use as a string or an XML file.
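To make the tree idea concrete, here is a minimal sketch that loads a small document into the DOM, walks its element nodes, and then hands the same tree to SimpleXML. The sample XML and its element names (library, book, title) are invented for illustration; only the DOMDocument methods and simplexml_import_dom() come from PHP itself.

```php
<?php
// Hypothetical sample data; the element names are made up for illustration.
$xml = <<<XML
<library>
  <book id="b1"><title>PHP Basics</title></book>
  <book id="b2"><title>XML in Practice</title></book>
</library>
XML;

// Parse the whole document into an in-memory tree of node objects.
$dom = new DOMDocument();
$dom->loadXML($xml);

// Every element is a node; walk the <book> nodes and read their children.
$titles = array();
foreach ($dom->getElementsByTagName('book') as $book) {
    $title = $book->getElementsByTagName('title')->item(0);
    $titles[$book->getAttribute('id')] = $title->textContent;
}

// Hand the same tree to SimpleXML without parsing the string again.
$sxe = simplexml_import_dom($dom);
$firstTitle = (string) $sxe->book[0]->title;

print_r($titles);
echo $firstTitle, "\n";
```

Note that the whole document is held in memory for the lifetime of $dom, which is the cost described above.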

SimpleXML

The SimpleXML extension is the tool of choice for parsing an XML document. It requires PHP5, includes built-in XPath support, and interoperates with the DOM for writing XML files. SimpleXML works best with uncomplicated, record-like data, such as XML passed as a document or string from another internal part of the same application. Provided that the XML document isn’t too complicated or too deep and contains no mixed content, SimpleXML is easier to code than the DOM, as its name implies. It is also more reliable if you work with a known document structure.
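As a rough illustration of reading, querying, changing, and writing record-like XML with SimpleXML, consider the sketch below. The orders document and its values are hypothetical; the calls used (simplexml_load_string(), xpath(), dom_import_simplexml(), saveXML()) are standard PHP5 functions.

```php
<?php
// Hypothetical record-like data; names and values are invented for illustration.
$xml = <<<XML
<orders>
  <order id="1001"><customer>Ada</customer><total>19.50</total></order>
  <order id="1002"><customer>Linus</customer><total>42.00</total></order>
</orders>
XML;

$orders = simplexml_load_string($xml);

// Read: child elements are properties, attributes use array syntax.
$firstCustomer = (string) $orders->order[0]->customer;
$firstId = (string) $orders->order[0]['id'];

// Query: the built-in XPath support returns an array of SimpleXMLElement objects.
$matches = $orders->xpath('/orders/order[total > 20]/customer');
$bigSpender = (string) $matches[0];

// Manipulate: assign to a property to change the element's text.
$orders->order[0]->total = '21.00';

// Write: pass the tree to the DOM and serialize it to a string.
$dom = dom_import_simplexml($orders)->ownerDocument;
$out = $dom->saveXML();
echo $out;
```

To write a file instead of a string, the DOM's save() method takes a filename. This style stays short because the structure is known in advance; with deep or mixed content, the property-style access becomes awkward and the DOM is usually the better fit.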

Very handy article for PHP developers on XML parsing.
