Long-Running API Work And Failure Handling

Some API requests cannot finish while the caller waits. Report exports, video processing, large imports, payment reconciliation, bulk email sends, and third-party synchronisation may take seconds, minutes, or hours.

A good API does not hold the HTTP request open while the work runs. It accepts the work, returns a job ID, lets the client check progress, and records failures clearly enough for support and developers to act.

Returning 202 Accepted

202 Accepted means the server accepted the request, but processing has not finished yet. The response should include a job identifier and a URL where the client can check status.

PHP example
<?php

declare(strict_types=1);

function acceptedJobResponse(string $jobId): array
{
    return [
        'status' => 202,
        'headers' => [
            'Content-Type' => 'application/json',
            'Location' => '/v1/jobs/' . $jobId,
        ],
        'body' => [
            'data' => [
                'id' => $jobId,
                'status' => 'queued',
            ],
        ],
    ];
}

print_r(acceptedJobResponse('job_123'));

// Prints (abridged; print_r shows the full nested array):
// [status] => 202
// [Location] => /v1/jobs/job_123
// [status] => queued

Do not return 201 Created unless the resource the client asked for, such as the finished export, actually exists. A queued job is not the same as completed work.

Job states

Job states should be predictable. A simple workflow might be:

  • queued: accepted and waiting for a worker.
  • running: currently being processed.
  • succeeded: finished successfully.
  • failed: stopped because of a non-retryable error or exhausted retries.
  • cancelled: deliberately stopped by a user or system.

PHP example
<?php

declare(strict_types=1);

function canMoveJob(string $from, string $to): bool
{
    $allowed = [
        'queued' => ['running', 'cancelled'],
        'running' => ['succeeded', 'failed', 'cancelled'],
        'failed' => [],
        'succeeded' => [],
        'cancelled' => [],
    ];

    return in_array($to, $allowed[$from] ?? [], true);
}

var_dump(canMoveJob('queued', 'running'));
var_dump(canMoveJob('succeeded', 'running'));

// Prints:
// bool(true)
// bool(false)

Avoid vague states such as done when the client needs to distinguish success from failure.

Status endpoint

The status endpoint should give the client enough information to decide what to do next, without exposing internal stack traces.

PHP example
<?php

declare(strict_types=1);

function jobStatusResponse(array $job): array
{
    $body = [
        'data' => [
            'id' => $job['id'],
            'status' => $job['status'],
            'created_at' => $job['created_at'],
            'updated_at' => $job['updated_at'],
        ],
    ];

    if ($job['status'] === 'failed') {
        $body['data']['error'] = [
            'code' => $job['error_code'] ?? 'job_failed',
            'message' => $job['public_error'] ?? 'The job could not be completed.',
        ];
    }

    return ['status' => 200, 'body' => $body];
}

$response = jobStatusResponse([
    'id' => 'job_123',
    'status' => 'failed',
    'created_at' => '2026-05-28T12:00:00Z',
    'updated_at' => '2026-05-28T12:05:00Z',
    'error_code' => 'export_source_unavailable',
    'public_error' => 'The export source is temporarily unavailable.',
]);

print_r($response);

// Prints (abridged; print_r shows the full nested array):
// [status] => failed
// [code] => export_source_unavailable

Internal exception messages belong in logs, not in public API responses.
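
On the client side, the status field is what drives the next step. The following is a minimal sketch of that decision, assuming the five job states listed earlier. The function name nextClientAction and the action labels are illustrative assumptions, not part of any library.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: maps a job status to what a polling client does next.
function nextClientAction(string $status): string
{
    $actions = [
        'queued' => 'poll_again',
        'running' => 'poll_again',
        'succeeded' => 'fetch_result',
        'failed' => 'show_error',
        'cancelled' => 'stop',
    ];

    // An unknown status stops polling instead of guessing what it means.
    return $actions[$status] ?? 'stop';
}

echo nextClientAction('running') . PHP_EOL;
echo nextClientAction('failed') . PHP_EOL;

// Prints:
// poll_again
// show_error
```

Only queued and running lead to another poll. Every other state is terminal, so the client stops asking.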

Worker retries

Background workers need retry rules just like HTTP clients. Temporary failures can be retried. Invalid input, permission failures, or missing records usually need a clear failed state.

PHP example
<?php

declare(strict_types=1);

function nextJobAction(string $failureType, int $attempt, int $maxAttempts): string
{
    $temporary = in_array($failureType, ['timeout', 'rate_limited', 'service_unavailable'], true);

    if ($temporary && $attempt < $maxAttempts) {
        return 'retry';
    }

    return 'fail';
}

echo nextJobAction('timeout', 1, 3) . PHP_EOL;
echo nextJobAction('invalid_input', 1, 3) . PHP_EOL;
echo nextJobAction('timeout', 3, 3) . PHP_EOL;

// Prints:
// retry
// fail
// fail

Retries should include attempt counts, backoff, and logs. Infinite retries are just hidden failures.
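
Backoff can be sketched as a delay that grows with each attempt and stops growing at a cap. This is one common approach (exponential backoff), shown under assumed values: the function name retryDelaySeconds, the 5-second base, and the 300-second cap are all illustrative.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: seconds to wait before the next attempt.
// The delay doubles with each attempt and never exceeds $maxSeconds.
function retryDelaySeconds(int $attempt, int $baseSeconds = 5, int $maxSeconds = 300): int
{
    // Attempt 1 waits $baseSeconds, attempt 2 waits twice that, and so on.
    $delay = $baseSeconds * (2 ** max(0, $attempt - 1));

    return min($delay, $maxSeconds);
}

echo retryDelaySeconds(1) . PHP_EOL;
echo retryDelaySeconds(3) . PHP_EOL;
echo retryDelaySeconds(10) . PHP_EOL;

// Prints:
// 5
// 20
// 300
```

Many systems also add a small random offset (jitter) so that failed jobs do not all retry at the same moment. That is left out here to keep the output predictable.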

Idempotency and duplicate jobs

Long-running work often needs idempotency. If a client submits the same export request twice after a timeout, the API may need to return the existing job instead of creating another expensive job.

Use an idempotency key, a unique job request hash, or a business constraint such as "one active export per user and report type", depending on the product requirement.
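
The idempotency key option might look like the sketch below. It uses an in-memory array as a stand-in for storage. The function name submitExport, the key format, and the job ID scheme are assumptions for illustration; a real API would more likely rely on a database table with a unique constraint on the key.

```php
<?php

declare(strict_types=1);

// Hypothetical store: idempotency key => job ID, passed by reference.
function submitExport(array &$jobsByKey, string $idempotencyKey): array
{
    if (isset($jobsByKey[$idempotencyKey])) {
        // Same key seen before: return the existing job instead of queuing another.
        return ['job_id' => $jobsByKey[$idempotencyKey], 'created' => false];
    }

    // Illustrative ID scheme only; real IDs would come from the job store.
    $jobId = 'job_' . (count($jobsByKey) + 1);
    $jobsByKey[$idempotencyKey] = $jobId;

    return ['job_id' => $jobId, 'created' => true];
}

$jobs = [];
$first = submitExport($jobs, 'key_abc');
$retry = submitExport($jobs, 'key_abc');

echo $first['job_id'] . ' ' . var_export($first['created'], true) . PHP_EOL;
echo $retry['job_id'] . ' ' . var_export($retry['created'], true) . PHP_EOL;

// Prints:
// job_1 true
// job_1 false
```

The second call returns the same job ID, so a client that retries after a timeout ends up polling the original job.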

Cancellation and time limits

Some jobs should be cancellable. Others may need a maximum runtime so stuck work does not run forever. A worker should update heartbeat or progress information if the job is expected to run for a long time.
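
A supervisor process can use the heartbeat and a runtime limit to spot stuck jobs. The sketch below assumes Unix timestamps in hypothetical started_at and last_heartbeat_at fields. The function name and the default limits (one hour of runtime, two minutes without a heartbeat) are illustrative, not recommendations.

```php
<?php

declare(strict_types=1);

// Hypothetical check for a running job. Returns a problem code, or null if the job looks healthy.
function runningJobProblem(array $job, int $now, int $maxRuntime = 3600, int $heartbeatTimeout = 120): ?string
{
    if ($now - $job['started_at'] > $maxRuntime) {
        return 'max_runtime_exceeded';
    }

    if ($now - $job['last_heartbeat_at'] > $heartbeatTimeout) {
        return 'heartbeat_missed';
    }

    return null;
}

$now = 10000;

echo (runningJobProblem(['started_at' => 9900, 'last_heartbeat_at' => 9990], $now) ?? 'ok') . PHP_EOL;
echo (runningJobProblem(['started_at' => 9000, 'last_heartbeat_at' => 9700], $now) ?? 'ok') . PHP_EOL;
echo (runningJobProblem(['started_at' => 5000, 'last_heartbeat_at' => 9990], $now) ?? 'ok') . PHP_EOL;

// Prints:
// ok
// heartbeat_missed
// max_runtime_exceeded
```

What happens after a problem is detected is a product decision: the job might be moved to failed, or handed back to the queue for another attempt.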

Progress can be exact, such as 42 of 1000 rows processed, or coarse, such as queued, processing, and finalising. Do not fake precision if the system cannot measure it.
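
One way to keep progress honest is to always report a coarse phase and add exact counts only when the worker has them. This is a sketch: the function name jobProgress and the phase, processed, total, and percent fields are assumptions for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: builds a progress field for a status response.
// Exact numbers are included only when both counts are known.
function jobProgress(string $phase, ?int $processed = null, ?int $total = null): array
{
    $progress = ['phase' => $phase];

    if ($processed !== null && $total !== null && $total > 0) {
        $progress['processed'] = $processed;
        $progress['total'] = $total;
        $progress['percent'] = (int) floor($processed / $total * 100);
    }

    return $progress;
}

echo json_encode(jobProgress('processing', 42, 1000)) . PHP_EOL;
echo json_encode(jobProgress('finalising')) . PHP_EOL;

// Prints:
// {"phase":"processing","processed":42,"total":1000,"percent":4}
// {"phase":"finalising"}
```

A client that receives only a phase can show a generic "working" indicator instead of a percentage the system cannot back up.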

What to check

Before moving on, make sure you can:

  • Return 202 Accepted with a job ID and status URL.
  • Design clear job states.
  • Build a status response that is useful but does not leak internals.
  • Decide which failures should retry and which should fail.
  • Explain how idempotency prevents duplicate long-running work.
  • Describe how logs, attempts, and cancellation make jobs maintainable.

Practice: Model A Long-Running Job

Write a PHP example that models accepting a long-running API job and checking its status.

Requirements

  • Return 202 when a job is accepted.
  • Include a job ID and Location status URL.
  • Support job states such as queued, running, succeeded, and failed.
  • Include a status response for a failed job with a public error code and message.
  • Add a retry decision for temporary worker failures.
  • Include examples for accepted, failed status, retryable failure, and exhausted retries.

Solution

This solution separates the initial acceptance response from the later status response.

PHP example
<?php

declare(strict_types=1);

function acceptJob(string $jobId): array
{
    return [
        'status' => 202,
        'headers' => [
            'Location' => '/v1/jobs/' . $jobId,
            'Content-Type' => 'application/json',
        ],
        'body' => [
            'data' => [
                'id' => $jobId,
                'status' => 'queued',
            ],
        ],
    ];
}

function jobStatus(array $job): array
{
    $data = [
        'id' => $job['id'],
        'status' => $job['status'],
    ];

    if ($job['status'] === 'failed') {
        $data['error'] = [
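            // Fall back to generic values so a missing field never leaks a PHP warning.
            'code' => $job['error_code'] ?? 'job_failed',
            'message' => $job['public_error'] ?? 'The job could not be completed.',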
        ];
    }

    return ['status' => 200, 'body' => ['data' => $data]];
}

function workerFailureAction(string $failureType, int $attempt, int $maxAttempts): string
{
    $temporary = in_array($failureType, ['timeout', 'rate_limited', 'service_unavailable'], true);

    if ($temporary && $attempt < $maxAttempts) {
        return 'retry';
    }

    return 'fail';
}

$accepted = acceptJob('job_123');
$failed = jobStatus([
    'id' => 'job_123',
    'status' => 'failed',
    'error_code' => 'source_unavailable',
    'public_error' => 'The export source is temporarily unavailable.',
]);

echo $accepted['status'] . ' ' . $accepted['headers']['Location'] . PHP_EOL;
echo $failed['body']['data']['status'] . ' ' . $failed['body']['data']['error']['code'] . PHP_EOL;
echo workerFailureAction('timeout', 1, 3) . PHP_EOL;
echo workerFailureAction('timeout', 3, 3) . PHP_EOL;

// Prints:
// 202 /v1/jobs/job_123
// failed source_unavailable
// retry
// fail

This is the API contract clients need: fast acceptance, a stable status URL, clear states, and predictable failure handling.