DOM (Document Object Model)

What is DOM?

PHP's DOM extension lets you read and manipulate XML and HTML documents through the Document Object Model (DOM) API. The DOM represents a document as a tree of nodes (elements, attributes, text, comments), and each node can be added, modified, or removed at runtime.

This extension is useful for:

  • Parsing and modifying XML and HTML documents.
  • Dynamically generating XML files.
  • Extracting specific data from structured documents.
  • Validating XML files using schemas or DTDs.

The DOM module is built on libxml2 and is often combined with XPath (through DOMXPath) and XSLT (through the separate XSL extension) for advanced XML manipulation.

Features of the PHP DOM Module

The DOM module provides:

  • Loading and parsing XML and HTML from strings (loadXML(), loadHTML()) or from files (load(), loadHTMLFile())
  • Traversing and modifying the DOM structure (getElementById(), getElementsByTagName(), appendChild(), removeChild())
  • Dynamically creating XML documents (createElement(), createTextNode())
  • Validating XML documents against a DTD (validate()) or an XSD schema (schemaValidate(), schemaValidateSource())
  • Using XPath for searching elements (DOMXPath)
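
As a minimal sketch of the validation feature listed above, the snippet below checks a document against an XSD schema with schemaValidateSource(). The schema and the isValidBooks() helper are assumptions made up for this illustration; they are not part of the DOM API or of the original article.

```php
<?php
// Hypothetical schema for a <books> document (an assumption for this sketch).
$xsd = <<<XSD
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="books">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="author" type="xs:string"/>
            </xs:sequence>
            <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: true only if $xml is well-formed and matches $xsd.
function isValidBooks(string $xml, string $xsd): bool
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($xml)) {
        return false; // not well-formed
    }
    return $dom->schemaValidateSource($xsd);
}

$valid   = '<books><book id="1"><title>Advanced PHP</title><author>John Doe</author></book></books>';
$invalid = '<books><book id="1"><title>Advanced PHP</title></book></books>'; // <author> is missing

var_dump(isValidBooks($valid, $xsd));
var_dump(isValidBooks($invalid, $xsd));

// Show why the second document was rejected.
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), "\n";
}
libxml_clear_errors();
```

The first document should pass and the second should fail because its book has no author element. For DTD validation, validate() plays the same role but requires the document to declare a DOCTYPE.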

Example usage:

Load and parse an XML document:

$xml = <<<XML
<?xml version="1.0"?>
<books>
    <book id="1">
        <title>Advanced PHP</title>
        <author>John Doe</author>
    </book>
    <book id="2">
        <title>XML and DOM</title>
        <author>Sophie Martin</author>
    </book>
</books>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Retrieve all book titles
$titles = $dom->getElementsByTagName("title");
foreach ($titles as $title) {
    echo $title->nodeValue . "\n";
}
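
Query the same document with XPath (a minimal sketch; the expressions are illustrative, and the sample XML is repeated so the snippet runs on its own):

```php
<?php
$xml = <<<XML
<?xml version="1.0"?>
<books>
    <book id="1">
        <title>Advanced PHP</title>
        <author>John Doe</author>
    </book>
    <book id="2">
        <title>XML and DOM</title>
        <author>Sophie Martin</author>
    </book>
</books>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);

// query() returns a DOMNodeList of the nodes matching the expression.
$nodes = $xpath->query('//book[@id="2"]/title');
foreach ($nodes as $node) {
    echo $node->textContent, "\n";
}

// evaluate() can also return scalar results, such as a count.
$count = $xpath->evaluate('count(//book)');
echo "Books: ", $count, "\n";
```

XPath is usually the shorter route when the selection depends on attributes or on a node's position, which getElementsByTagName() cannot express.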

Modify an XML element:

// Replace the text of the first <title> element, then print the whole document
$dom->getElementsByTagName("title")->item(0)->nodeValue = "PHP and DOM";
echo $dom->saveXML();
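
Create a document from scratch and remove an element (a sketch; addBook() is a hypothetical helper introduced for this example, not part of the DOM API):

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true; // indent the serialized output

$books = $dom->createElement('books');
$dom->appendChild($books);

// Hypothetical helper: builds one <book> element and appends it to $parent.
function addBook(DOMDocument $dom, DOMElement $parent, string $id, string $title, string $author): DOMElement
{
    $book = $dom->createElement('book');
    $book->setAttribute('id', $id);

    // Text nodes are escaped on output, so characters like & are safe here.
    $titleEl = $dom->createElement('title');
    $titleEl->appendChild($dom->createTextNode($title));
    $book->appendChild($titleEl);

    $authorEl = $dom->createElement('author');
    $authorEl->appendChild($dom->createTextNode($author));
    $book->appendChild($authorEl);

    $parent->appendChild($book);
    return $book;
}

$first  = addBook($dom, $books, '1', 'Advanced PHP', 'John Doe');
$second = addBook($dom, $books, '2', 'XML & DOM', 'Sophie Martin');

// Detach the first book from the tree again.
$books->removeChild($first);

echo $dom->saveXML();
```

Nodes must be created through the document they will belong to (createElement(), createTextNode()) and only become part of the tree once appendChild() attaches them.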

Advantages of DOM

  • Flexible manipulation: Allows dynamic addition, modification, and removal of XML and HTML elements.
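  • Standards support: Works with DTD and XSD validation and XPath queries, and with XSLT through the companion XSL extension.
  • Familiar API: Follows the standard W3C DOM model, so the concepts carry over from other DOM implementations.
  • Structured error handling: Invalid DOM operations throw DOMException, and parse errors can be collected through libxml's error functions, which helps with complex XML documents.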
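
The error-handling point can be illustrated with a short sketch. It assumes the common pattern of switching libxml to internal error collection; the malformed input and the invalid element name are made up for the example.

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$ok  = $dom->loadXML('<books><book></books>'); // mismatched tag: not well-formed

if (!$ok) {
    foreach (libxml_get_errors() as $error) {
        printf("Line %d: %s\n", $error->line, trim($error->message));
    }
    libxml_clear_errors();
}

// DOM-level mistakes raise a DOMException rather than a warning.
$caught = null;
try {
    $fresh = new DOMDocument();
    $fresh->createElement('not a valid name'); // spaces are not allowed in names
} catch (DOMException $e) {
    $caught = $e;
    echo "DOMException code: ", $e->getCode(), "\n";
}

// Restore the previous libxml error mode.
libxml_use_internal_errors($previous);
```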

Disadvantages of DOM

  • High memory usage: Loads the entire document into memory, which can be inefficient for large files.
  • Slower than SAX for large XML files: DOM must build the entire tree before any of it can be used, whereas a SAX parser handles the document as a stream of events.
  • More complex syntax than SimpleXML for basic tasks.
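
To make the syntax comparison concrete, here is a small sketch that reads the same title and id with SimpleXML and with DOM. It assumes the SimpleXML extension is enabled, and the one-book document is invented for the example.

```php
<?php
$xml = '<books><book id="1"><title>Advanced PHP</title></book></books>';

// SimpleXML: elements and attributes map onto properties and array access.
$sx          = simplexml_load_string($xml);
$simpleTitle = (string) $sx->book[0]->title;
$simpleId    = (string) $sx->book[0]['id'];

// DOM: the same lookup goes through explicit node-list and attribute calls.
$dom = new DOMDocument();
$dom->loadXML($xml);
$book     = $dom->getElementsByTagName('book')->item(0);
$domTitle = $book->getElementsByTagName('title')->item(0)->textContent;
$domId    = $book->getAttribute('id');

echo "$simpleTitle ($simpleId)\n";
echo "$domTitle ($domId)\n";
```

Both approaches yield the same values; the DOM version is more verbose for simple reads, but it exposes the full node API needed for structural edits.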

Conclusion

The PHP DOM module is a powerful tool for working with XML and HTML documents, allowing for creation, modification, and validation. While it consumes more memory than SAX and is more complex than SimpleXML, it remains essential for advanced document processing.

