
Upload Validation And Scanning

Upload validation decides whether a received file is acceptable for a feature. Scanning adds another layer for files that may contain malware, unsafe macros, scripts, or content that needs review.

The browser cannot be trusted to describe the file honestly. The original filename, extension, and browser-supplied MIME type are user-controlled. Treat them as hints, not proof.

Validate in layers

A reasonable upload flow checks or enforces each of the following:

  • PHP upload error
  • feature-specific size limit
  • detected MIME type from the temporary file
  • allowed extension for the feature
  • generated server-side filename
  • storage location that cannot execute PHP
  • scanning or review status for risky file types

No single check is enough. A file named avatar.jpg might not be an image. A file with an image MIME type might still be too large for the feature.
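
The first item in the list, the PHP upload error, can be turned into a readable reason before any other check runs. The sketch below maps PHP's UPLOAD_ERR_* constants to messages. The function name describeUploadError and the message wording are assumptions made for illustration, and the match expression assumes PHP 8.0 or later.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: translate a $_FILES[...]['error'] code into a reason.
function describeUploadError(int $code): string
{
    return match ($code) {
        UPLOAD_ERR_OK => 'Upload completed.',
        UPLOAD_ERR_INI_SIZE, UPLOAD_ERR_FORM_SIZE => 'File exceeds the configured upload size limit.',
        UPLOAD_ERR_PARTIAL => 'File was only partially uploaded.',
        UPLOAD_ERR_NO_FILE => 'No file was uploaded.',
        UPLOAD_ERR_NO_TMP_DIR, UPLOAD_ERR_CANT_WRITE => 'Server could not store the temporary file.',
        UPLOAD_ERR_EXTENSION => 'A PHP extension stopped the upload.',
        // Fail closed on any code this sketch does not recognise.
        default => 'Unknown upload error.',
    };
}

echo describeUploadError(UPLOAD_ERR_PARTIAL) . PHP_EOL;

// Prints:
// File was only partially uploaded.
```

Any code other than UPLOAD_ERR_OK should stop processing, because the temporary file may be missing or incomplete.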

Detect content, not just names

Use finfo to inspect the content of the temporary file. Do not trust $_FILES['avatar']['type']: PHP copies that value from the browser's request without checking it.

PHP example
<?php

declare(strict_types=1);

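// Simulate an upload whose content is PHP source, whatever its name claims.
$tmp = tempnam(sys_get_temp_dir(), 'upload_');
if ($tmp === false) {
    exit('Could not create a temporary file.' . PHP_EOL);
}
file_put_contents($tmp, "<?php echo 'not an image';");

// Detect the type from the bytes on disk, not from browser-supplied metadata.
$mime = (new finfo(FILEINFO_MIME_TYPE))->file($tmp);
$allowedMimeTypes = ['image/png', 'image/jpeg'];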

if (!in_array($mime, $allowedMimeTypes, true)) {
    echo 'Rejected upload: ' . $mime . PHP_EOL;
}

unlink($tmp);

// Output on many systems:
// Rejected upload: text/x-php

Different systems may report slightly different MIME strings for unusual content. The rule is the same: compare detected MIME types against a small allow-list for the feature.

Extensions are still useful

Extensions are not proof, but they are useful for user experience and storage. A profile image feature may allow jpg, jpeg, and png; a document feature may allow pdf.

PHP example
<?php

declare(strict_types=1);

function extensionFromName(string $filename): string
{
    return strtolower(pathinfo($filename, PATHINFO_EXTENSION));
}

$extension = extensionFromName('Quarterly Report.PDF');

echo $extension . PHP_EOL;

// Prints:
// pdf

Validate extension and detected MIME together. If the feature expects a PDF, a .pdf extension with application/pdf is more convincing than either signal alone.
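
One way to check the two signals together is to key the allow-list by detected MIME type and list the extensions that may accompany each type. This is a minimal sketch: the function name extensionMatchesMime and the contents of the map are assumptions for illustration, and a real feature would define its own list.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: true only when the extension is one that the
// allow-list pairs with the detected MIME type.
function extensionMatchesMime(string $filename, string $detectedMime): bool
{
    // Assumed allow-list for illustration.
    $allowed = [
        'image/jpeg' => ['jpg', 'jpeg'],
        'image/png' => ['png'],
        'application/pdf' => ['pdf'],
    ];

    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    // An unknown MIME type has no permitted extensions at all.
    return in_array($extension, $allowed[$detectedMime] ?? [], true);
}

var_dump(extensionMatchesMime('Quarterly Report.PDF', 'application/pdf'));
var_dump(extensionMatchesMime('avatar.png', 'image/jpeg'));

// Prints:
// bool(true)
// bool(false)
```

The second call fails even though both values are individually acceptable for an image feature, because the name and the content disagree.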

Return reviewable decisions

Upload validation should produce clear reasons. That makes the behaviour easier to test and easier to explain to users.

PHP example
<?php

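declare(strict_types=1);

// Repeated from the previous example so that this script runs on its own.
function extensionFromName(string $filename): string
{
    return strtolower(pathinfo($filename, PATHINFO_EXTENSION));
}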

/**
 * @param array{name: string, error: int, size: int, detected_mime: string} $file
 * @return list<string>
 */
function uploadProblems(array $file): array
{
    $problems = [];

    if ($file['error'] !== UPLOAD_ERR_OK) {
        $problems[] = 'Upload did not complete successfully.';
    }

    if ($file['size'] > 500_000) {
        $problems[] = 'File is larger than 500 KB.';
    }

    $allowedMimeTypes = ['image/jpeg', 'image/png'];
    if (!in_array($file['detected_mime'], $allowedMimeTypes, true)) {
        $problems[] = 'File is not an allowed image type.';
    }

    $allowedExtensions = ['jpg', 'jpeg', 'png'];
    if (!in_array(extensionFromName($file['name']), $allowedExtensions, true)) {
        $problems[] = 'Filename extension is not allowed for images.';
    }

    return $problems;
}

$file = [
    'name' => 'shell.php',
    'error' => UPLOAD_ERR_OK,
    'size' => 1200,
    'detected_mime' => 'text/x-php',
];

echo implode(PHP_EOL, uploadProblems($file)) . PHP_EOL;

// Prints:
// File is not an allowed image type.
// Filename extension is not allowed for images.

The example uses a supplied detected_mime so it can run without an actual HTTP upload. In real code, detect it from tmp_name with finfo.
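
A real handler could build that array from one $_FILES entry, along the lines of the sketch below. The helper name describeUpload is an assumption for illustration. It leaves detected_mime empty when the upload failed or detection was not possible, so the MIME allow-list check rejects the file.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: build the array used by uploadProblems() from one
 * $_FILES entry, ignoring the browser-supplied 'type' value entirely.
 *
 * @param array{name: string, tmp_name: string, error: int, size: int} $entry
 * @return array{name: string, error: int, size: int, detected_mime: string}
 */
function describeUpload(array $entry): array
{
    $detected = '';

    // tmp_name is only meaningful when PHP reports a completed upload.
    // A real handler would normally also confirm is_uploaded_file() here.
    if ($entry['error'] === UPLOAD_ERR_OK && is_file($entry['tmp_name'])) {
        $mime = (new finfo(FILEINFO_MIME_TYPE))->file($entry['tmp_name']);
        $detected = $mime === false ? '' : $mime;
    }

    return [
        'name' => $entry['name'],
        'error' => $entry['error'],
        'size' => $entry['size'],
        'detected_mime' => $detected,
    ];
}
```

In a request handler the call might look like describeUpload($_FILES['avatar']), with the result passed straight to the validation function.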

Scanning and quarantine

Some files should not become available immediately after upload. Common examples include Office documents, PDFs, archives, and any upload that will be visible to other users.

A safer flow is:

  1. Accept the upload into a private quarantine location.
  2. Store a database row with status pending_scan.
  3. Run a scanner or review process.
  4. Serve the file only after the scan or review has marked it clean.
  5. Delete or isolate files marked infected or failed_scan.

PHP might hand the file off to a scanner service, a queue worker, an antivirus tool such as ClamAV, or a managed storage scanning service. The exact tool varies by company. The important design point is that unscanned files are never treated as safe.

PHP example
<?php

declare(strict_types=1);

function canDownload(string $scanStatus): bool
{
    return $scanStatus === 'clean';
}

foreach (['pending_scan', 'clean', 'infected'] as $status) {
    echo $status . ': ' . (canDownload($status) ? 'download allowed' : 'blocked') . PHP_EOL;
}

// Prints:
// pending_scan: blocked
// clean: download allowed
// infected: blocked
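
The status itself has to come from somewhere. The sketch below maps a scanner outcome to one of the statuses used in this section. The outcome strings are hypothetical, since a real scanner integration defines its own results, and the match expression assumes PHP 8.0 or later. The point it illustrates is failing closed: anything other than an explicit clean result leaves the file blocked.

```php
<?php

declare(strict_types=1);

// Hypothetical outcome strings; a real scanner integration defines its own.
function statusFromScanOutcome(string $outcome): string
{
    return match ($outcome) {
        'no_threat_found' => 'clean',
        'threat_found' => 'infected',
        // Timeouts, crashes, and unrecognised outcomes fail closed.
        default => 'failed_scan',
    };
}

foreach (['no_threat_found', 'threat_found', 'timeout'] as $outcome) {
    echo $outcome . ' -> ' . statusFromScanOutcome($outcome) . PHP_EOL;
}

// Prints:
// no_threat_found -> clean
// threat_found -> infected
// timeout -> failed_scan
```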

Image-specific checks

For image uploads, dimensions often matter. A profile avatar might require a maximum width and height. In a real upload handler, getimagesize() can help inspect image dimensions, but it should be part of a wider validation flow rather than the only check.
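
A dimension check built on getimagesize() might look like the sketch below. The function name imageDimensionProblems and the idea of passing the limits as parameters are assumptions for illustration. It returns a list of problems in the same style as the earlier validation example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: report dimension problems for an image on disk.
 *
 * @return list<string>
 */
function imageDimensionProblems(string $path, int $maxWidth, int $maxHeight): array
{
    // getimagesize() returns false when it cannot parse the file as an image.
    $info = getimagesize($path);

    if ($info === false) {
        return ['File could not be read as an image.'];
    }

    // Index 0 is the width and index 1 is the height, both in pixels.
    [$width, $height] = $info;
    $problems = [];

    if ($width > $maxWidth) {
        $problems[] = 'Image is wider than ' . $maxWidth . ' pixels.';
    }

    if ($height > $maxHeight) {
        $problems[] = 'Image is taller than ' . $maxHeight . ' pixels.';
    }

    return $problems;
}
```

For an avatar feature with an assumed 1024 by 1024 limit, the call would be imageDimensionProblems($tmpPath, 1024, 1024), merged with the problems from the size, MIME, and extension checks.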

If you process images, consider re-encoding them through an image library. Re-encoding can remove unexpected metadata and ensures the stored file is one your application created, not just one it accepted.

Dangerous storage mistakes

Never store untrusted uploads in a directory where the web server can execute PHP. A file upload bug becomes much worse if an attacker can upload something.php and then request it as code.
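
Generating the stored name on the server keeps attacker-chosen names and extensions out of storage altogether. In the sketch below, the helper name generateStoredName and the MIME-to-extension map are assumptions for illustration. The extension is derived from the validated MIME type, never from the uploaded filename.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: the stored name is random, and its extension comes
// from the validated MIME type rather than from the uploaded filename.
function generateStoredName(string $validatedMime): string
{
    // Assumed allow-list for illustration.
    $extensions = [
        'image/jpeg' => 'jpg',
        'image/png' => 'png',
        'application/pdf' => 'pdf',
    ];

    if (!isset($extensions[$validatedMime])) {
        throw new InvalidArgumentException('MIME type is not on the allow-list.');
    }

    // 16 random bytes give 32 hexadecimal characters.
    return bin2hex(random_bytes(16)) . '.' . $extensions[$validatedMime];
}

echo generateStoredName('image/png') . PHP_EOL;

// Prints 32 random hexadecimal characters followed by ".png".
// The value differs on every run.
```

In a real handler, move_uploaded_file() would then move the temporary file under this name into a directory outside the web server's document root.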

Avoid serving user-uploaded files with guessed headers. Store metadata about the validated type and send deliberate Content-Type and Content-Disposition headers when users download private files.
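
As a rough sketch of deliberate headers, the helper below builds the header lines from stored, validated metadata. The function name downloadHeaders is an assumption for illustration, and the X-Content-Type-Options: nosniff header is an addition not mentioned above, included because it asks browsers not to guess a different type.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: build response headers from validated metadata.
 *
 * @return list<string>
 */
function downloadHeaders(string $validatedMime, string $downloadName): array
{
    // Restrict the name to a conservative character set so it cannot break
    // out of the quoted filename parameter.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', $downloadName);

    return [
        'Content-Type: ' . $validatedMime,
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'X-Content-Type-Options: nosniff',
    ];
}

foreach (downloadHeaders('application/pdf', 'Quarterly Report.pdf') as $header) {
    echo $header . PHP_EOL;
}

// Prints:
// Content-Type: application/pdf
// Content-Disposition: attachment; filename="Quarterly_Report.pdf"
// X-Content-Type-Options: nosniff
```

A download script would pass each line to header() before streaming the file contents.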

Do not rely on a block-list such as "reject .php". Attackers can use confusing names, multiple extensions, and server-specific behaviours. Prefer allow-lists for each feature.
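
The difference is easy to see with a few names. The sketch below compares a block-list that rejects only .php with an allow-list for an image feature. The function names are assumptions for illustration. The .phtml and .phar names stand in for extensions that some server configurations are set up to execute as PHP.

```php
<?php

declare(strict_types=1);

function extensionOf(string $filename): string
{
    return strtolower(pathinfo($filename, PATHINFO_EXTENSION));
}

// Block-list: rejects only what someone remembered to list.
function passesBlockList(string $filename): bool
{
    return !in_array(extensionOf($filename), ['php'], true);
}

// Allow-list: accepts only what the feature needs.
function passesAllowList(string $filename): bool
{
    return in_array(extensionOf($filename), ['jpg', 'jpeg', 'png'], true);
}

foreach (['shell.php', 'shell.phtml', 'shell.phar', 'avatar.png'] as $name) {
    echo $name . ': block-list ' . (passesBlockList($name) ? 'passes' : 'rejects')
        . ', allow-list ' . (passesAllowList($name) ? 'passes' : 'rejects') . PHP_EOL;
}

// Prints:
// shell.php: block-list rejects, allow-list rejects
// shell.phtml: block-list passes, allow-list rejects
// shell.phar: block-list passes, allow-list rejects
// avatar.png: block-list passes, allow-list passes
```

An extension allow-list still needs the detected MIME check alongside it, since a name such as shell.php.jpg ends in an allowed extension.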

What to check in a project

Find the allow-list for the feature. It should be specific, such as avatar images or PDF invoices, not "any file".

Check that validation uses detected MIME type from the temporary file, not only the browser-supplied type.

Check storage. Uploaded files should have generated names and should not be executable by the web server.

Check scanning status. If the business needs scanning, unscanned files should stay blocked until a clean result is recorded.

Check failure paths. Scanner timeouts, upload errors, and unsupported types should produce controlled results.

What you should be able to do

After this lesson, you should be able to combine size, extension, and detected MIME checks, explain why browser-provided metadata is untrusted, design a quarantine-and-scan flow, and spot dangerous upload storage choices.

Practice

Task: Validate Upload Metadata

Write a small PHP script that validates upload metadata for an avatar image feature.

Requirements

  • Declare strict types with declare(strict_types=1); at the top of the script.
  • Accept a file array with name, error, size, and detected_mime.
  • Allow only JPEG and PNG MIME types.
  • Allow only jpg, jpeg, and png filename extensions.
  • Reject files larger than 500 KB.
  • Include a valid image case.
  • Include a disguised script case.
  • Include a too-large image case.
  • Print the validation result for each case.

Check Your Work

Run the script and confirm that the disguised script fails for both MIME and extension reasons, while the large image fails for size.

Solution

This solution returns all validation problems instead of stopping at the first one. That makes the result easier to test.

PHP example
<?php

declare(strict_types=1);

function extensionFromName(string $filename): string
{
    return strtolower(pathinfo($filename, PATHINFO_EXTENSION));
}

/**
 * @param array{name: string, error: int, size: int, detected_mime: string} $file
 * @return list<string>
 */
function validateAvatarUpload(array $file): array
{
    $problems = [];

    if ($file['error'] !== UPLOAD_ERR_OK) {
        $problems[] = 'Upload did not complete successfully.';
    }

    if ($file['size'] > 500_000) {
        $problems[] = 'File is larger than 500 KB.';
    }

    if (!in_array($file['detected_mime'], ['image/jpeg', 'image/png'], true)) {
        $problems[] = 'Detected MIME type is not allowed.';
    }

    if (!in_array(extensionFromName($file['name']), ['jpg', 'jpeg', 'png'], true)) {
        $problems[] = 'Filename extension is not allowed.';
    }

    return $problems;
}

$files = [
    'valid' => ['name' => 'avatar.png', 'error' => UPLOAD_ERR_OK, 'size' => 120_000, 'detected_mime' => 'image/png'],
    'script' => ['name' => 'avatar.php', 'error' => UPLOAD_ERR_OK, 'size' => 1_200, 'detected_mime' => 'text/x-php'],
    'large' => ['name' => 'photo.jpg', 'error' => UPLOAD_ERR_OK, 'size' => 900_000, 'detected_mime' => 'image/jpeg'],
];

foreach ($files as $label => $file) {
    $problems = validateAvatarUpload($file);
    // Each problem already ends with a full stop, so a space is enough to separate them.
    echo $label . ': ' . ($problems === [] ? 'accepted' : implode(' ', $problems)) . PHP_EOL;
}

// Prints:
// valid: accepted
// script: Detected MIME type is not allowed. Filename extension is not allowed.
// large: File is larger than 500 KB.

In real upload handling, detected_mime would come from finfo inspecting the temporary upload file, not from the browser.

Why This Works

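The valid case shows that the allow-list accepts intended files. The script case shows that the detected MIME type and the extension are checked independently, so the file is rejected on both signals rather than on the filename alone. The large case shows that the feature-specific size limit is enforced even when the MIME type and extension are allowed.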