These notes are public, opinionated, and evolving — read abdelkader.ma for the long-form posts.

Path Traversal (PHP)

Any filesystem call with attacker input in its path argument is a candidate. The result is either disclosure (reading files the app shouldn’t expose) or a write somewhere it shouldn’t happen (overwriting config, dropping a webshell, planting a file that an LFI later includes).

For the include-and-execute variant, see LFI. For the metadata-deserialise variant via phar://, see PHAR Deserialization.

Why

readfile("/var/app/files/" . $_GET['name']);

?name=../../../../etc/passwd walks out of the base directory. PHP resolves the path through the OS just like a shell does.
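
A minimal reproduction of the call above, assuming a throwaway tree under the system temp directory. The directory layout, file names and the $name value are hypothetical stand-ins for the real base directory and request parameter:

```php
<?php
// Throwaway tree: <tmp>/pt_demo_<rand>/files/report.txt plus a secret beside files/.
$root = sys_get_temp_dir() . '/pt_demo_' . bin2hex(random_bytes(4));
mkdir($root . '/files', 0700, true);
file_put_contents($root . '/files/report.txt', "inside the base\n");
file_put_contents($root . '/secret.txt', "outside the base\n");

$base = $root . '/files/';
$name = '../secret.txt';   // stand-in for $_GET['name']

// Same shape as the vulnerable call: base directory + raw input.
$leaked = file_get_contents($base . $name);
echo $leaked;

// Clean up the demo tree.
unlink($root . '/files/report.txt');
unlink($root . '/secret.txt');
rmdir($root . '/files');
rmdir($root);
```

The string still starts with the base directory; only the OS-level resolution shows that the file read lives one level above it.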

The variants that catch developers:

  • basename() is not enough. basename("../foo") returns foo, which is fine. But if separators arrive URL-encoded (..%2f) and are decoded after the basename() call, there is nothing for it to strip and the decoded path still escapes.
  • Stripping ../ in a single pass turns ....// back into ../ (the inner ../ is removed and the outer .. and / join up).
  • realpath() returns false on missing files. Code that proceeds on false ends up using the unresolved input.
  • Wrapper schemes (phar://, file://, php://, data://) defeat path-prefix checks unless the scheme is explicitly rejected first.
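
Two of these are easy to confirm from a PHP shell. The sketch below assumes PHP 8+ and uses a randomly named, non-existent directory so that realpath() has nothing to resolve; the `?:` fallback is a made-up example of the "proceeds on false" bug, not code from any particular app:

```php
<?php
// 1. Single-pass stripping: removing the inner "../" joins the outer pieces.
$stripped = str_replace('../', '', '....//....//etc/passwd');
echo $stripped, "\n";   // ../../etc/passwd

// 2. realpath() on a missing path returns false, not a cleaned-up path.
$input    = '../missing-' . bin2hex(random_bytes(4)) . '/file.txt';
$raw      = sys_get_temp_dir() . '/' . $input;
$resolved = realpath($raw);
var_dump($resolved);    // bool(false)

// Bug pattern: on false, fall back to the raw, unresolved input.
$used = $resolved ?: $raw;
echo $used, "\n";       // still contains ".."
```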

Search patterns

# Read-side
rg -n '\b(fopen|fpassthru|file_get_contents|readfile|file|stream_get_contents|md5_file|sha1_file|hash_file)\s*\(' --type=php
 
# Write-side
rg -n '\b(file_put_contents|fwrite|fputs|fputcsv|chmod|chown|touch|tempnam|mkdir|rmdir|copy|rename|move_uploaded_file|symlink|link|unlink)\s*\(' --type=php
 
# Read + write — SplFileObject
rg -n 'SplFileObject|SplFileInfo|DirectoryIterator|RecursiveIteratorIterator' --type=php
 
# Stat-only (still triggers PHAR; see related)
rg -n '\b(file_exists|is_file|is_dir|is_link|is_readable|is_writable|filesize|filemtime|filectime|fileatime|filetype|fileperms|fileowner|filegroup|stat|lstat)\s*\(' --type=php

Test inputs

Disclosure:

  • ../../../../etc/passwd
  • ../../../../etc/hosts
  • ../../../../proc/self/environ
  • ../../../../proc/self/cmdline
  • ../../../../var/www/html/.env
  • ../../../../var/www/html/config.php
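  • /etc/passwd (absolute: works when the input is used as the whole path or the filter only looks for ..; a plain $base . $input concatenation keeps it under the base)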

Encoding bypasses:

  • ..%2f..%2f..%2fetc/passwd
  • ..%252f..%252fetc/passwd (double-encoded — when one decode happens before path validation)
  • ....//....//etc/passwd (filter strips ../ once)
  • ..%c0%af..%c0%afetc/passwd (UTF-8 overlong — rare but real on Windows / legacy stacks)
  • Null byte (legacy, PHP < 5.3.4): ../../etc/passwd%00.jpg
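
The double-encoding case comes down to decode order. A small sketch, assuming the application calls urldecode() itself after its check; the $wire value is a hypothetical request parameter as it appears on the wire:

```php
<?php
// On the wire: ?name=..%252f..%252fetc%252fpasswd
$wire = '..%252f..%252fetc%252fpasswd';

// PHP decodes query parameters once when it fills $_GET.
$name = urldecode($wire);
echo $name, "\n";                       // ..%2f..%2fetc%2fpasswd

// Validation at this point sees no separator at all.
$passesCheck   = !str_contains($name, '../');
$afterBasename = basename($name);       // nothing to strip, returned unchanged

// A second decode later in the request restores the traversal.
$final = urldecode($afterBasename);
echo $final, "\n";                      // ../../etc/passwd
```

Validation belongs after the last decode, on the exact string that is handed to the filesystem call.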

Windows-specific:

  • ..\..\..\Windows\win.ini
  • ..%5c..%5c..%5cWindows%5cwin.ini
  • UNC: \\attacker\share\file

Symlink-based:

  • When uploads accept symlinks (e.g. tar extraction without --no-same-owner and --no-same-permissions), symlink an upload entry to /etc/passwd and read it back via the app.
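
A sketch of why name-based checks miss this case, assuming a POSIX system where symlink() is permitted. The uploads directory and avatar.png are hypothetical; the link stands in for an extracted archive entry:

```php
<?php
$root = sys_get_temp_dir() . '/pt_link_' . bin2hex(random_bytes(4));
mkdir($root . '/uploads', 0700, true);
file_put_contents($root . '/secret.txt', "outside the base\n");

// Stand-in for an extracted archive entry that is a symlink.
symlink($root . '/secret.txt', $root . '/uploads/avatar.png');

$base   = realpath($root . '/uploads');
$target = realpath($base . '/avatar.png');   // follows the link

// A string check on the name finds nothing suspicious.
$nameLooksFine = !str_contains('avatar.png', '..');
// A prefix check on the resolved path shows the target is outside the base.
$insideBase = str_starts_with($target, $base . DIRECTORY_SEPARATOR);

var_dump($nameLooksFine, $insideBase);

// Clean up (unlink on the link removes the link, not its target).
unlink($root . '/uploads/avatar.png');
unlink($root . '/secret.txt');
rmdir($root . '/uploads');
rmdir($root);
```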

Write-side:

  • ?name=../../../../var/www/html/shell.php (drop webshell)
  • ?name=../../../../home/user/.ssh/authorized_keys (key bombing)
  • ?name=../../../../etc/crontab (cron poisoning, write perms permitting)

Audit focus

For each filesystem call with potentially tainted path:

  1. Source of the path — superglobal, body field, header, cookie, stored value from earlier user input?
  2. Filter shape — strip-based or resolve-based?
    • Strip: str_replace("../", "", $p) → vulnerable (....// collapses back to ../, and encoded or absolute input passes untouched)
    • Resolve: realpath($base . '/' . $p) then prefix-check → correct
  3. Scheme reject — does any wrapper (phar://, file://, php://, data://, http://) reach the call?
  4. basename semantics — used and trusted too far? It’s good for stripping directory components from a displayed filename, not a sanitiser for a path passed to another function.
  5. Write surface — write APIs are higher risk per call than read APIs because they’re often left out of audits.
  6. TOCTOU — does the code check (is_writable($p)) and then act later? A symlink swap between the two opens a window.
  7. Race in upload directory — concurrent upload + extract can let one request influence another’s filename.
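
For item 6, one way to narrow the window is to open first and run the checks on the handle rather than on the path. This is a sketch only: open_regular_file() is a hypothetical helper, and since PHP's fopen() has no O_NOFOLLOW equivalent it does not replace the path confinement described elsewhere on this page.

```php
<?php
// Hypothetical helper: open once, then validate the handle that was opened.
function open_regular_file(string $path) {
  $h = @fopen($path, 'rb');
  if ($h === false) {
    throw new RuntimeException("cannot open");
  }
  // fstat() describes the file behind the handle, so a later swap of the
  // path cannot change what these checks looked at.
  $st = fstat($h);
  $isRegular = $st !== false && ($st['mode'] & 0170000) === 0100000; // S_IFREG
  if (!$isRegular) {
    fclose($h);
    throw new RuntimeException("not a regular file");
  }
  return $h;
}
```

The caller then reads from the returned handle instead of reopening the path, so the check and the act refer to the same file.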

Reject anything containing :// in the path before validation. Scheme prefixes are not legitimate input for “filename” parameters and rejecting them removes the PHAR, RFI, and data:// variants in one line.
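
As a sketch, that one line could look like the following (PHP 8+; rejects_scheme() is a hypothetical name). One caveat, stated as an assumption rather than a verified fact: PHP appears to accept the short RFC 2397 form data: without the slashes as well, which this test does not match, so the base-directory prefix in the Fix is still what keeps that form from being treated as a wrapper.

```php
<?php
// Hypothetical helper: true when the input should be refused outright.
function rejects_scheme(string $input): bool {
  return str_contains($input, '://');
}

$candidates = [
  'phar://uploads/a.jpg/x',
  'php://filter/resource=index.php',
  'data://text/plain,hi',
  'data:text/plain,hi',     // short form: not caught by this test
  'report.pdf',
];
foreach ($candidates as $candidate) {
  printf("%-34s %s\n", $candidate, rejects_scheme($candidate) ? 'reject' : 'continue');
}
```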

Fix

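// Requires PHP 8.0+ (str_contains, str_starts_with).
function safe_read(string $input, string $base_dir): string {
  // Reject stream wrappers (phar://, php://, data://, ...) before touching the filesystem.
  if (preg_match('#^[a-zA-Z][a-zA-Z0-9+\-.]*://#', $input)) {
    throw new RuntimeException("scheme not allowed");
  }
  if (str_contains($input, "\0")) {
    throw new RuntimeException("null byte");
  }

  // realpath() returns false for a missing path. Without this guard a false
  // $base would reduce the prefix below to a bare separator and match everything.
  $base = realpath($base_dir);
  if ($base === false) {
    throw new RuntimeException("base directory missing");
  }
  $target = realpath($base . DIRECTORY_SEPARATOR . $input);

  if ($target === false || !str_starts_with($target, $base . DIRECTORY_SEPARATOR)) {
    throw new RuntimeException("path escapes base");
  }

  // file_get_contents() returns false on failure, which the string return type rejects.
  $data = file_get_contents($target);
  if ($data === false) {
    throw new RuntimeException("read failed");
  }
  return $data;
}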

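realpath() resolves .., symlinks, and redundant separators; on case-insensitive filesystems it may also normalise case. The prefix check, with its trailing separator, confines the result to $base (and keeps a sibling such as /var/app/files-old from matching /var/app/files). The scheme reject + null-byte reject close the wrapper class.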
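
The same idea needs one adjustment for writes: the target usually does not exist yet, so realpath() on the full path returns false. A sketch of a write-side counterpart, assuming PHP 8+ and that subdirectories already exist; safe_write() and its error messages are hypothetical, and the is_link() test is itself a check-then-act step that only narrows the symlink window:

```php
<?php
function safe_write(string $input, string $base_dir, string $data): int {
  if (str_contains($input, '://') || str_contains($input, "\0")) {
    throw new RuntimeException("scheme or null byte");
  }
  $base = realpath($base_dir);
  if ($base === false) {
    throw new RuntimeException("base directory missing");
  }
  // Resolve the parent directory, which must already exist.
  $parent = realpath($base . DIRECTORY_SEPARATOR . dirname($input));
  $inside = $parent !== false
    && ($parent === $base || str_starts_with($parent, $base . DIRECTORY_SEPARATOR));
  if (!$inside) {
    throw new RuntimeException("path escapes base");
  }
  // The final component is taken on its own and must be a plain name.
  $name = basename($input);
  if ($name === '' || $name === '.' || $name === '..') {
    throw new RuntimeException("bad file name");
  }
  $target = $parent . DIRECTORY_SEPARATOR . $name;
  if (is_link($target)) {
    throw new RuntimeException("refusing to write through a symlink");
  }
  $written = file_put_contents($target, $data, LOCK_EX);
  if ($written === false) {
    throw new RuntimeException("write failed");
  }
  return $written;
}
```

Here basename() is used for what it is good at, isolating the last component, while the directory part still goes through realpath() and the prefix check.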