Data types and standard library

XML: DOM, SimpleXML, XMLReader, XMLWriter

XML still appears in feeds, sitemaps, invoices, SOAP integrations, payment providers, government services, publishing systems, and legacy data exports.

PHP gives you several XML tools. Use SimpleXML for small, simple documents; DOM when you need structured editing or XPath; XMLReader for large documents that should be streamed; and XMLWriter when generating XML.

Parse small XML with SimpleXML

SimpleXML is convenient when the document is small and the structure is predictable.

PHP example
<?php

declare(strict_types=1);

$xml = new SimpleXMLElement('<user><email>nia@example.com</email></user>');
$email = strtolower((string) $xml->email);

echo $email . PHP_EOL;

// Prints:
// nia@example.com

External XML is still untrusted input. Parse it, then validate the fields your application needs.

Handle parse failures

Call libxml_use_internal_errors(true) so that libxml collects parse errors instead of emitting PHP warnings. You can then turn invalid XML into an application exception.

PHP example
<?php

declare(strict_types=1);

function loadXml(string $xml): SimpleXMLElement
{
    $previous = libxml_use_internal_errors(true);
    $document = simplexml_load_string($xml);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Compare with false explicitly: a valid empty element such as <user/> is falsy.
    if ($document === false || $errors !== []) {
        throw new InvalidArgumentException('XML could not be parsed.');
    }

    return $document;
}

try {
    loadXml('<user><email>nia@example.com</user>');
} catch (InvalidArgumentException $exception) {
    echo $exception->getMessage() . PHP_EOL;
}

// Prints:
// XML could not be parsed.

Do not let parser warnings leak into a user-facing response.
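
If you still want the parser details for debugging, one option is to collect them and send them to a log rather than the response. The following is a minimal sketch: xmlParseErrors() is a hypothetical helper name, and the exact wording of each message comes from libxml and may differ between versions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns one log-friendly line per libxml error.
 *
 * @return list<string>
 */
function xmlParseErrors(string $xml): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    simplexml_load_string($xml);

    $messages = [];

    foreach (libxml_get_errors() as $error) {
        // Each LibXMLError has public line and message properties.
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

foreach (xmlParseErrors('<user><email>nia@example.com</user>') as $message) {
    error_log($message); // Written to the error log, not the response body.
}
```

The caller still sees only a generic exception message, while the log keeps the line number needed to diagnose a broken feed.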

Validate required fields

Parsing only proves that the XML is well-formed. It does not prove the business data is present.

PHP example
<?php

declare(strict_types=1);

function emailFromUserXml(SimpleXMLElement $xml): string
{
    $email = trim((string) $xml->email);

    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        throw new InvalidArgumentException('User email is invalid.');
    }

    return strtolower($email);
}

$xml = new SimpleXMLElement('<user><email>NIA@example.com</email></user>');

echo emailFromUserXml($xml) . PHP_EOL;

// Prints:
// nia@example.com

The same boundary rule applies here as with JSON and CSV: parse first, then validate the shape.
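
Validating the shape can also mean telling a missing element apart from an invalid value. The sketch below assumes a hypothetical requireEmail() function and a separate "missing" message; neither is part of the lesson's own code.

```php
<?php

declare(strict_types=1);

// Hypothetical variant of emailFromUserXml() with a separate "missing" error.
function requireEmail(SimpleXMLElement $xml): string
{
    // isset() on a SimpleXML child is false when no <email> element exists.
    if (!isset($xml->email)) {
        throw new InvalidArgumentException('User email is missing.');
    }

    $email = trim((string) $xml->email);

    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        throw new InvalidArgumentException('User email is invalid.');
    }

    return strtolower($email);
}

$inputs = [
    '<user><name>Nia</name></user>',
    '<user><email>not-an-email</email></user>',
];

foreach ($inputs as $input) {
    try {
        requireEmail(new SimpleXMLElement($input));
    } catch (InvalidArgumentException $exception) {
        echo $exception->getMessage() . PHP_EOL;
    }
}

// Prints:
// User email is missing.
// User email is invalid.
```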

Use DOM for editing

DOM is better when you need to create, edit, or query a tree more deliberately.

PHP example
<?php

declare(strict_types=1);

$document = new DOMDocument('1.0', 'UTF-8');
$order = $document->createElement('order');
$order->setAttribute('id', '101');
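// Add text through createTextNode() so characters such as & are escaped.
$status = $document->createElement('status');
$status->appendChild($document->createTextNode('paid'));
$order->appendChild($status);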
$document->appendChild($order);

echo $document->saveXML($document->documentElement) . PHP_EOL;

// Prints:
// <order id="101"><status>paid</status></order>

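DOM avoids manual string concatenation. Text added with createTextNode() or textContent is escaped on output. The optional value argument of createElement() is not fully escaped (a bare & must be escaped by hand), so prefer text nodes for untrusted values.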
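
Querying is the other reason to reach for DOM. The following sketch uses DOMXPath to select attributes by a condition on a child element; paidOrderIds() and the sample document are assumptions made up for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the id of every order whose status is "paid".
 *
 * @return list<string>
 */
function paidOrderIds(string $xml): array
{
    $document = new DOMDocument();
    $document->loadXML($xml);

    $xpath = new DOMXPath($document);
    $ids = [];

    // The predicate [status="paid"] filters orders by their <status> child.
    foreach ($xpath->query('/orders/order[status="paid"]/@id') as $attribute) {
        $ids[] = $attribute->nodeValue;
    }

    return $ids;
}

$xml = '<orders>'
    . '<order id="101"><status>paid</status></order>'
    . '<order id="102"><status>pending</status></order>'
    . '<order id="103"><status>paid</status></order>'
    . '</orders>';

echo implode(', ', paidOrderIds($xml)) . PHP_EOL;

// Prints:
// 101, 103
```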

Stream large XML with XMLReader

XMLReader reads forward through a document one node at a time without building the whole tree in memory, which is useful for large feeds.

PHP example
<?php

declare(strict_types=1);

$reader = new XMLReader();
$reader->XML('<orders><order id="101"/><order id="102"/></orders>');

while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'order') {
        echo $reader->getAttribute('id') . PHP_EOL;
    }
}

$reader->close();

// Prints:
// 101
// 102

Use XMLReader when loading the whole file into memory would be wasteful or risky.
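
For a real feed you would usually read from a file and handle one record at a time. The sketch below shows one common pattern: open the file with XMLReader::open(), then hand each record to SimpleXML through readOuterXml(). The streamOrders() name, the temporary file, and the order/total layout are assumptions for illustration only.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: yields one small SimpleXMLElement per <order>.
 *
 * @return Generator<int, SimpleXMLElement>
 */
function streamOrders(string $path): Generator
{
    $reader = new XMLReader();

    if ($reader->open($path) === false) {
        throw new RuntimeException('Feed could not be opened.');
    }

    try {
        // Advance to the first <order> element.
        while ($reader->read() && $reader->name !== 'order') {
            // Skip everything before the first record.
        }

        while ($reader->name === 'order') {
            // Only the current record is turned into an in-memory tree.
            yield new SimpleXMLElement($reader->readOuterXml());
            $reader->next('order');
        }
    } finally {
        $reader->close();
    }
}

// Write a small sample feed to a temporary file so the sketch is runnable.
$path = tempnam(sys_get_temp_dir(), 'orders');
file_put_contents(
    $path,
    '<orders><order id="101"><total>19.99</total></order>'
    . '<order id="102"><total>5.00</total></order></orders>'
);

foreach (streamOrders($path) as $order) {
    echo (string) $order['id'] . ': ' . (string) $order->total . PHP_EOL;
}

unlink($path);

// Prints:
// 101: 19.99
// 102: 5.00
```

Memory use then depends on the size of one record rather than the size of the whole file.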

Generate XML with XMLWriter

XMLWriter emits XML sequentially, either to memory or directly to a file or output stream, which makes it a good choice for exports and feeds.

PHP example
<?php

declare(strict_types=1);

$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('product');
$writer->writeElement('sku', 'KB-101');
$writer->writeElement('name', 'Keyboard');
$writer->endElement();
$writer->endDocument();

echo $writer->outputMemory();

// Prints:
// <?xml version="1.0" encoding="UTF-8"?>
// <product><sku>KB-101</sku><name>Keyboard</name></product>

Generating XML through a writer is safer than building XML with string concatenation, because the writer escapes element text and attribute values for you.
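
To see why, here is a minimal sketch that writes values containing & and <. The productXml() function and the sample values are hypothetical and exist only to demonstrate the escaping.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: builds one <product> element from raw values.
function productXml(string $sku, string $name): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startElement('product');
    $writer->writeAttribute('sku', $sku); // Attribute values are escaped.
    $writer->writeElement('name', $name); // Element text is escaped too.
    $writer->endElement();

    return $writer->outputMemory();
}

$xml = productXml('KB&M-7', 'Cable <2m & adapter');

echo $xml . PHP_EOL;

// Parsing the result returns the original, unescaped text.
$parsed = new SimpleXMLElement($xml);
echo (string) $parsed->name . PHP_EOL;

// Prints:
// <product sku="KB&amp;M-7"><name>Cable &lt;2m &amp; adapter</name></product>
// Cable <2m & adapter
```

Concatenating the same values into a string by hand would produce XML that is not well-formed.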

What to remember

Choose the XML API based on document size and task. Parse XML as untrusted input, validate required fields, avoid manual XML strings, and stream large files instead of loading them fully into memory.

Practice

Task: Import users from XML

Write a small XML importer for user records.

Requirements

  • Use declare(strict_types=1);.
  • Parse a small XML string with SimpleXML.
  • Require each user to have an email element.
  • Validate each email with FILTER_VALIDATE_EMAIL.
  • Return lowercased email addresses.
  • Print the imported emails.
  • Show one invalid XML or invalid email case by catching the exception.
  • Include the expected output as comments in the same PHP code block.

The importer should parse XML and validate the business fields separately.

Solution
PHP example
<?php

declare(strict_types=1);

function importUserEmails(string $xmlText): array
{
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmlText);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Compare with false explicitly: a valid empty element such as <users/> is falsy.
    if ($xml === false || $errors !== []) {
        throw new InvalidArgumentException('XML could not be parsed.');
    }

    $emails = [];

    foreach ($xml->user as $user) {
        $email = trim((string) $user->email);

        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('User email is invalid.');
        }

        $emails[] = strtolower($email);
    }

    return $emails;
}

$emails = importUserEmails('<users><user><email>NIA@example.com</email></user><user><email>lee@example.com</email></user></users>');

echo implode(', ', $emails) . PHP_EOL;

try {
    importUserEmails('<users><user><email>not-an-email</email></user></users>');
} catch (InvalidArgumentException $exception) {
    echo $exception->getMessage() . PHP_EOL;
}

// Prints:
// nia@example.com, lee@example.com
// User email is invalid.

The solution handles parser errors separately from invalid business data. That keeps XML mechanics out of the rest of the application and gives callers a clear failure reason.