
PHP 8.5 URI Extension

PHP 8.5 adds a built-in URI extension for parsing, inspecting, comparing, resolving, and modifying URIs and URLs. It gives PHP a standards-aware API instead of making every project rely on parse_url(), string splitting, or third-party packages for common URI work.

The extension has two main families:

  • Uri\Rfc3986\Uri for RFC 3986 URI handling.
  • Uri\WhatWg\Url for browser-style WHATWG URL handling.

Most server-side application code can start with the RFC 3986 class, which suits APIs, redirects, generated links, service callbacks, and signed URLs. The WHATWG class matters when you need behaviour that matches browsers more closely.

Why parse_url() Is Not Enough

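parse_url() returns an associative array containing only the components it found, or false when the URL is seriously malformed, and it does not validate its input. That can be workable for simple scripts, but it is easy to forget a missing key or accidentally treat a malformed value as acceptable.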

PHP example
<?php

declare(strict_types=1);

$parts = parse_url('https://example.com/docs?page=1');

echo $parts['host'] ?? 'missing host';
echo PHP_EOL;

// Prints:
// example.com

This code is fine for a tiny example, but the result is just an array. It does not give you object methods for normalized components, immutable modification, resolving relative paths, or clear equivalence checks.
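To make the missing-key problem concrete, the short sketch below feeds parse_url() a relative reference and a malformed string. The inputs are illustrative only; the point is that the caller has to handle both an absent key and a false return.

```php
<?php

declare(strict_types=1);

// A relative reference has no host, so the 'host' key is simply absent.
$relative = parse_url('/docs?page=1');
var_dump(isset($relative['host']));

// Seriously malformed input makes parse_url() return false, not an array.
$malformed = parse_url('http:///example.com');
var_dump($malformed);

// Prints:
// bool(false)
// bool(false)
```

Neither case raises an error on its own, so a forgotten check can easily slip through to later code.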

Creating A URI Object

Uri\Rfc3986\Uri parses the string and exposes the URI components through methods.

PHP example
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

$uri = new Uri('https://Example.com/docs/getting-started?page=1#top');

echo $uri->getScheme() . PHP_EOL;
echo $uri->getHost() . PHP_EOL;
echo $uri->getPath() . PHP_EOL;
echo $uri->getQuery() . PHP_EOL;

// Prints:
// https
// example.com
// /docs/getting-started
// page=1

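The getter methods return normalized components, which is why the host above prints as example.com even though the input used Example.com. If you need a component exactly as it was written, the class also has raw getters such as getRawPath() and getRawQuery().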
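The difference between raw and normalized getters is easiest to see side by side. The sketch below uses an illustrative URI with a mixed-case host and a percent-encoded letter, and also calls equals(), which should compare the normalized forms.

```php
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

// %61 is the percent-encoded form of the letter "a".
$uri = new Uri('https://Example.com/foob%61r');

echo $uri->getRawHost() . PHP_EOL;
echo $uri->getHost() . PHP_EOL;
echo $uri->getRawPath() . PHP_EOL;
echo $uri->getPath() . PHP_EOL;

// equals() compares normalized URIs, so these two spellings are equivalent.
var_dump($uri->equals(new Uri('https://example.com/foobar')));

// Prints:
// Example.com
// example.com
// /foob%61r
// /foobar
// bool(true)
```

Use the normalized getters for comparisons and rules, and keep the raw getters for cases where the exact original bytes matter, such as signature checks.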

Handling Invalid Input

The constructor throws Uri\InvalidUriException when the URI is invalid. The static parse() method returns null instead.

PHP example
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

function hostFromInput(string $input): string
{
    $uri = Uri::parse($input);

    if ($uri === null || $uri->getHost() === null) {
        return 'invalid URI';
    }

    return $uri->getHost();
}

echo hostFromInput('https://example.com/account') . PHP_EOL;
echo hostFromInput('not a full URL') . PHP_EOL;

// Prints:
// example.com
// invalid URI

This is the shape you normally want at a request boundary: parse the incoming string, reject values that do not meet your rule, then pass a validated value deeper into the application.
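The constructor route looks slightly different because failure arrives as an exception rather than null. A minimal sketch, where describeInput() is a hypothetical helper name chosen for this example:

```php
<?php

declare(strict_types=1);

use Uri\InvalidUriException;
use Uri\Rfc3986\Uri;

// Hypothetical helper: reports whether the constructor accepted the input.
function describeInput(string $input): string
{
    try {
        $uri = new Uri($input);
    } catch (InvalidUriException) {
        return 'rejected';
    }

    return 'parsed: ' . $uri->toString();
}

echo describeInput('https://example.com/account') . PHP_EOL;
echo describeInput('not a full URL') . PHP_EOL;

// Prints:
// parsed: https://example.com/account
// rejected
```

The constructor tends to fit values that should always be valid, such as configuration, while parse() fits user input where an invalid value is an expected case.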

Modifying URIs

The with*() methods create a changed URI instead of mutating the original object.

PHP example
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

$original = new Uri('http://example.com/old?page=1');
$updated = $original
    ->withScheme('https')
    ->withPath('/docs')
    ->withQuery('page=2');

echo $original->toString() . PHP_EOL;
echo $updated->toString() . PHP_EOL;

// Prints:
// http://example.com/old?page=1
// https://example.com/docs?page=2

Immutable objects are easier to reason about in reviews because a helper cannot secretly change the URI another part of the code is still using.
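As a small illustration of that point, the sketch below passes a URI to a helper that strips the query and fragment. The helper name withoutQueryAndFragment() is an assumption made for this example, and it relies on withQuery() and withFragment() accepting null to remove a component.

```php
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

// Hypothetical helper: returns a copy without query or fragment.
function withoutQueryAndFragment(Uri $uri): Uri
{
    return $uri->withQuery(null)->withFragment(null);
}

$shared = new Uri('https://example.com/docs?page=2#install');
$clean = withoutQueryAndFragment($shared);

// The caller's object is untouched; only the returned copy differs.
echo $shared->toString() . PHP_EOL;
echo $clean->toString() . PHP_EOL;

// Prints:
// https://example.com/docs?page=2#install
// https://example.com/docs
```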

Resolving Relative References

A common job is combining a base URI with a relative reference, such as a relative path.

PHP example
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

$base = new Uri('https://example.com/docs/guides/');
$resolved = $base->resolve('../api/reference');

echo $resolved->toString() . PHP_EOL;

// Prints:
// https://example.com/docs/api/reference

This is safer than manually trimming and concatenating strings, especially when paths contain .., query strings, fragments, or trailing slashes.
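Relative references come in several shapes, and each one replaces a different part of the base. The sketch below resolves a few illustrative references against one base URI, following the rules in RFC 3986.

```php
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

$base = new Uri('https://example.com/docs/guides/intro?lang=en');

$references = [
    'setup',                      // relative path: replaces the last segment
    '/api',                       // absolute path: replaces the whole path
    '?lang=fr',                   // query only: keeps the path
    '//cdn.example.net/app.js',   // network-path reference: replaces the host
    'https://other.example/',     // absolute URI: ignores the base entirely
];

foreach ($references as $reference) {
    echo $reference . ' => ' . $base->resolve($reference)->toString() . PHP_EOL;
}

// Prints:
// setup => https://example.com/docs/guides/setup
// /api => https://example.com/api
// ?lang=fr => https://example.com/docs/guides/intro?lang=fr
// //cdn.example.net/app.js => https://cdn.example.net/app.js
// https://other.example/ => https://other.example/
```

The last two results leave the base host, so a URI resolved from untrusted input still needs the same host checks as any other target.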

Choosing RFC 3986 Or WHATWG

Use Uri\Rfc3986\Uri when your application is dealing with general URI syntax, API links, service callbacks, generated links, or non-browser protocols.

Use Uri\WhatWg\Url when your code needs URL behaviour that matches browsers. That may matter for user-entered web addresses, frontend-facing redirects, or compatibility with JavaScript URL handling.

The important point is to choose deliberately. URL parsing can become security-sensitive around redirects, host allow-lists, signed URLs, and webhook callback validation.
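A short comparison shows why the choice matters. The inputs below are illustrative: the first shows the normalization a WHATWG parser applies during parsing, and the second is a string that browsers accept but RFC 3986 does not.

```php
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;
use Uri\WhatWg\Url;

// WHATWG parsing lowercases the host, drops the default port,
// and removes dot segments.
$url = new Url('https://EXAMPLE.com:443/a/../b');
echo $url->getAsciiHost() . PHP_EOL;
echo $url->toAsciiString() . PHP_EOL;

// A backslash is not valid in RFC 3986, but browsers treat it
// like "/" in https URLs.
$input = 'https://example.com\\docs';
var_dump(Uri::parse($input) === null);
echo Url::parse($input)?->toAsciiString() . PHP_EOL;

// Prints:
// example.com
// https://example.com/b
// bool(true)
// https://example.com/docs
```

If a value will end up in a browser, validating it with the parser that matches browser behaviour avoids a gap between what the server checked and what the browser actually follows.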

Validating Redirect Targets

Redirect validation is a realistic place to use URI parsing. The exact rules depend on the application, but a common rule is: allow only HTTPS URLs on trusted hosts.

PHP example
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

function isAllowedRedirect(string $target): bool
{
    $uri = Uri::parse($target);

    if ($uri === null) {
        return false;
    }

    return $uri->getScheme() === 'https'
        && $uri->getHost() === 'example.com';
}

echo isAllowedRedirect('https://example.com/account') ? 'allowed' : 'blocked';
echo PHP_EOL;
echo isAllowedRedirect('https://evil.example/account') ? 'allowed' : 'blocked';
echo PHP_EOL;

// Prints:
// allowed
// blocked

The parser is only part of the solution. You still need application rules: allowed schemes, allowed hosts, whether relative URLs are permitted, and what to do with fragments or query strings.
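One way those rules might look in code is sketched below. The function name isAllowedRedirectStrict(), the ALLOWED_REDIRECT_HOSTS list, and the decisions to permit same-site relative paths and reject user info or non-default ports are all assumptions for illustration, not a recommended policy.

```php
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

// Hypothetical policy values for illustration only.
const ALLOWED_REDIRECT_HOSTS = ['example.com', 'accounts.example.com'];

function isAllowedRedirectStrict(string $target): bool
{
    $uri = Uri::parse($target);

    if ($uri === null) {
        return false;
    }

    // Same-site relative path: no scheme, no host, path starts with "/".
    if ($uri->getScheme() === null && $uri->getHost() === null) {
        return str_starts_with($uri->getPath(), '/');
    }

    // Absolute target: HTTPS, trusted host, no user info, default port.
    return $uri->getScheme() === 'https'
        && in_array($uri->getHost(), ALLOWED_REDIRECT_HOSTS, true)
        && $uri->getUserInfo() === null
        && in_array($uri->getPort(), [null, 443], true);
}

$samples = [
    'https://accounts.example.com/login',
    '/account',
    '//evil.example/account',
    'https://user@example.com/',
    'https://example.com:8443/',
];

foreach ($samples as $sample) {
    echo $sample . ' => ' . (isAllowedRedirectStrict($sample) ? 'allowed' : 'blocked') . PHP_EOL;
}

// Prints:
// https://accounts.example.com/login => allowed
// /account => allowed
// //evil.example/account => blocked
// https://user@example.com/ => blocked
// https://example.com:8443/ => blocked
```

The //evil.example/account case is worth noting: it has no scheme but does have a host, so checking only for a leading slash would have let it through.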

What You Should Be Able To Do

After this lesson, you should be able to create a URI object, read normalized components, handle invalid input, modify a URI immutably, resolve a relative reference, and explain when URI parsing becomes security-sensitive.

For junior PHP work, this matters because links and redirects are everywhere. The professional habit is to parse and validate them with a real API instead of relying on fragile string checks.

Practice

Practice: Validate Redirect URLs

Create a small redirect validation example using the PHP 8.5 URI extension.

Task

Build an isAllowedRedirect() function that:

  • accepts a string URL
  • parses it with Uri\Rfc3986\Uri::parse()
  • rejects invalid URIs
  • allows only the https scheme
  • allows only the host example.com

Use strict types. Show one allowed URL and two blocked URLs. Keep the expected output inside the PHP code block as printed lines or comments.

Afterward, add a short note explaining why parsing is safer than checking strings with str_contains().

Solution

This solution parses the input first, then applies the application rules to the parsed components.

PHP example
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

function isAllowedRedirect(string $target): bool
{
    $uri = Uri::parse($target);

    if ($uri === null) {
        return false;
    }

    return $uri->getScheme() === 'https'
        && $uri->getHost() === 'example.com';
}

function showResult(string $target): void
{
    echo $target . ' => ';
    echo isAllowedRedirect($target) ? 'allowed' : 'blocked';
    echo PHP_EOL;
}

showResult('https://example.com/account');
showResult('http://example.com/account');
showResult('https://evil.example/account');

// Prints:
// https://example.com/account => allowed
// http://example.com/account => blocked
// https://evil.example/account => blocked

Parsing is safer than str_contains() because redirects depend on specific URI components. A string may contain example.com in the path, query string, username, or as part of a different hostname, and those cases should not automatically be trusted.
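The cases named above can be checked directly. In the sketch below, hostOf() is a hypothetical helper and the URLs are made-up examples; each one contains the text example.com, yet none of them has example.com as its host.

```php
<?php

declare(strict_types=1);

use Uri\Rfc3986\Uri;

// Hypothetical helper: returns the parsed host, or null if parsing fails.
function hostOf(string $target): ?string
{
    return Uri::parse($target)?->getHost();
}

$deceptive = [
    'https://evil.example/example.com',        // in the path
    'https://evil.example/?next=example.com',  // in the query string
    'https://example.com@evil.example/',       // as the username
    'https://example.com.evil.example/',       // inside another hostname
];

foreach ($deceptive as $target) {
    $naive = str_contains($target, 'example.com') ? 'trusted' : 'blocked';
    echo $naive . ' by str_contains, parsed host: ' . hostOf($target) . PHP_EOL;
}

// Prints:
// trusted by str_contains, parsed host: evil.example
// trusted by str_contains, parsed host: evil.example
// trusted by str_contains, parsed host: evil.example
// trusted by str_contains, parsed host: example.com.evil.example
```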