|
|
- claude contributed src
- Opens the zip with std.zip.ZipArchive (reads the whole file into
memory)
- Locates pod.manifest inside the archive to discover document paths
and languages
- Extracts markup files (.sst/.ssm/.ssi) as in-memory strings
- Extracts images as in-memory byte arrays
- Extracts conf/dr_document_make if present
- Presents these to the existing pipeline as if they were read from
the filesystem
- Some security mitigations:
- Zip Slip / Path Traversal: Reject entries containing `..` or
starting with `/`; canonicalize resolved paths and verify they
fall within extraction root
- Zip Bomb: Check `ArchiveMember.expandedSize` before extracting;
enforce per-file (50MB) and total (500MB) size limits
- Entry Count: Limit number of entries (a pod should have at most
~100 files)
- Path Depth: Limit paths to a maximum of 10 components
- Symlinks: Verify no symlinks in extracted content before
processing (post-extraction recursive scan)
- Filename Validation: Only allow expected characters; reject null
bytes
- Malformed Zips: Catch `ZipException` from `std.zip.ZipArchive`
constructor
- Cleanup on error
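
A minimal D sketch of how the name, depth, entry-count and size checks listed above might fit together. It is not the contributed source: `isSafeEntryName`, `extractChecked` and the `max*` constants are hypothetical names, the limits are copied from the list, and rejecting backslashes is an added assumption. It reads the declared uncompressed size from `expandedSize`, the property Phobos exposes on `ArchiveMember`.

```d
import std.algorithm.iteration : splitter;
import std.algorithm.searching : canFind, startsWith;
import std.exception : enforce;
import std.zip : ZipArchive;

// Limits taken from the list above; the constant names are hypothetical.
enum size_t maxEntries    = 100;
enum size_t maxPathDepth  = 10;
enum ulong  maxFileBytes  = 50UL * 1024 * 1024;
enum ulong  maxTotalBytes = 500UL * 1024 * 1024;

/// True when an entry name is relative, contains no `..` component,
/// null byte or backslash, and has at most maxPathDepth components.
bool isSafeEntryName(string name)
{
    if (name.length == 0 || name.startsWith("/")) return false;
    if (name.canFind('\0') || name.canFind('\\')) return false;
    size_t depth = 0;
    foreach (part; name.splitter('/'))
    {
        if (part == "..") return false;
        if (part.length > 0) ++depth;   // ignore empty parts such as a trailing '/'
    }
    return depth <= maxPathDepth;
}

/// Expands every member into memory, checking the limits before expanding.
ubyte[][string] extractChecked(void[] zipBytes)
{
    // The constructor throws ZipException when the archive is malformed.
    auto zip = new ZipArchive(zipBytes);
    enforce(zip.directory.length <= maxEntries, "too many entries");

    ulong total = 0;
    ubyte[][string] files;
    foreach (name, member; zip.directory)
    {
        enforce(isSafeEntryName(name), "unsafe entry name: " ~ name);
        // expandedSize is the size declared in the archive metadata.
        enforce(member.expandedSize <= maxFileBytes, "entry too large: " ~ name);
        total += member.expandedSize;
        enforce(total <= maxTotalBytes, "archive too large");
        files[name] = zip.expand(member);
    }
    return files;
}
```

A caller would pass the raw archive bytes, for example `extractChecked(std.file.read(path))`, and catch `ZipException` alongside the plain `Exception` raised by `enforce`. Because the declared sizes come from the archive itself, a stricter version might also re-check the length of each expanded buffer.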
|