Lambda S3 Cache
An AWS Lambda setup that caches files from public URLs in S3. When a URL is requested, the service returns a 302 redirect to either the cached S3 copy (via presigned URL) or the original source. Cache misses trigger asynchronous downloads to S3, ensuring future requests are served from the cache.
Caching uses the original URL’s host and path as the S3 key, so each unique URL maps to a single cache entry that persists until it expires. Each cache hit extends the entry’s lifetime, so frequently accessed content stays cached.
Note: Cached content is immutable; changes at the origin won’t be reflected until the cache expires.
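The key mapping described above can be sketched as follows. This is a minimal illustration, assuming the key is simply the host followed by the path. The function name `cache_key` is hypothetical, and the project's actual handling of ports, query strings, or percent-encoding may differ.

```python
from urllib.parse import urlsplit


def cache_key(url: str) -> str:
    """Derive an S3 object key from a URL's host and path (illustrative only)."""
    # Requests carry the URL without a scheme, so add one for parsing if missing.
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)
    # Assumption: scheme, port, query string and fragment are ignored.
    # urlsplit() lowercases the hostname.
    return f"{parts.hostname}{parts.path}"
```

Under these assumptions, `example.com/path/to/file.rpm` and `https://example.com/path/to/file.rpm?x=1` resolve to the same key.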
Features
- Streaming Upload: Handles files efficiently by streaming directly from source to S3 without loading into memory
- Presigned S3 URLs: Returns short-lived signed URLs for cached content
- Automatic Expiration: Cached content expires after CacheExpirationDays (default: 14), with lifetime extended on access
- URL Prefix Allowlist: Only caches content matching explicitly allowed host+path prefixes
- Extension Filtering: Limit caching to specific file extensions
- Custom Domain: Optional custom domain with automatic TLS certificate management via ACM and Route53
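As an illustration of the Streaming Upload feature, the sketch below passes the HTTP response straight to S3 as a file-like object. It is not the project's uploader: the function name `stream_to_s3` and its parameters are hypothetical, and `s3_client` is assumed to be a boto3 S3 client, whose `upload_fileobj` method reads the source in chunks.

```python
import urllib.request


def stream_to_s3(url, bucket, key, s3_client, opener=urllib.request.urlopen):
    """Stream an HTTP response body into S3 without buffering the whole file."""
    # The response object is file-like, so the client can read it chunk by
    # chunk instead of holding the full body in memory.
    with opener(url) as response:
        s3_client.upload_fileobj(response, bucket, key)
```

With boto3 installed this could be called as `stream_to_s3(url, bucket, key, boto3.client("s3"))`. The `opener` argument exists only to make the function easy to test without network access.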
Architecture
Three Lambda functions handle the caching workflow:
- Handler - API Gateway endpoint that checks cache, returns 302 redirects, and triggers async operations. Implements touch cooldown to prevent S3 throttling by only touching objects after a configurable time period has elapsed since last modification.
- Uploader - Downloads from origin and streams to S3 on cache misses (invoked asynchronously)
- Touch - Updates S3 object timestamps to extend cache lifetime on cache hits (invoked asynchronously only when cooldown period has elapsed)
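The division of work between the three functions can be summarized as a small decision function. This is a sketch under stated assumptions rather than the actual handler code: `plan_request` and its return values are hypothetical names, and the cached object's last-modified timestamp is assumed to be available from a metadata lookup on S3.

```python
from datetime import datetime, timedelta
from typing import Optional


def plan_request(last_modified: Optional[datetime], now: datetime,
                 cooldown_minutes: int) -> tuple[str, Optional[str]]:
    """Return (redirect_target, async_action) for one request.

    last_modified is the cached object's timestamp, or None on a cache miss.
    """
    if last_modified is None:
        # Cache miss: send the client to the origin and fetch in the background.
        return ("origin", "upload")
    if now - last_modified >= timedelta(minutes=cooldown_minutes):
        # Cache hit, cooldown elapsed: refresh the timestamp in the background.
        return ("s3", "touch")
    # Cache hit within the cooldown window: no extra S3 write.
    return ("s3", None)
```

Keeping the decision free of AWS calls makes the cooldown rule easy to unit test. The caller would then issue the redirect and invoke the Uploader or Touch function asynchronously.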
Prerequisites
- AWS CLI configured with appropriate credentials (IAM permissions for Lambda, API Gateway, S3, CloudFormation, CloudWatch Logs, and optionally Route53/ACM for custom domain)
- Podman or Docker (for containerized SAM build/deploy)
- Python 3.11 or later (for development)
Deployment
Build and deploy using containerized AWS SAM CLI:
make lambda-s3-cache/build # Build Lambda package
make lambda-s3-cache/deploy # First deployment (interactive/guided)
make lambda-s3-cache/redeploy # Subsequent deployments (non-interactive)
Deployment output: ApiEndpoint - the API Gateway URL to use for requests
Custom Domain
Optional custom domain with automatic TLS certificate management (ACM +
Route53). Requires a Route53 hosted zone. Deploy with CustomDomainName and
HostedZoneId parameters - CloudFormation handles certificate creation, DNS
validation, and configuration. Certificate validation takes 5-30 minutes; allow
up to 1 hour for DNS propagation.
Parameters
| Parameter | Description | Default |
|---|---|---|
| ResourcePrefix | Prefix for all resource names | myapp-prod |
| PresignedUrlExpiration | Presigned URL expiration (seconds) | 3600 |
| AllowedPrefixes | Whitespace-separated list of allowed URL prefixes | example.com/path/ |
| AllowedExtensions | Whitespace-separated file extensions | .rpm |
| CacheExpirationDays | Days to keep cached content (minimum: 1) | 14 |
| TouchCooldownMinutes | Minimum minutes between touch operations | 60 |
| CustomDomainName | Optional custom domain name | (empty) |
| HostedZoneId | Route53 hosted zone ID (required if custom domain) | (empty) |
| SentryDsn | Optional Sentry DSN for error tracking | (empty) |
Resource naming: All AWS resources follow {ResourcePrefix}-{type}-{name}
pattern (e.g., myapp-prod-bucket, myapp-prod-lambda-handler).
Usage
Make GET requests to the API endpoint (or custom domain if configured). The URL
to cache is encoded in the path (without the https:// prefix):
curl -L "https://<api-endpoint>/example.com/path/to/file.rpm"
Behavior:
- First request (cache miss): Redirects to original URL while triggering async S3 upload
- Subsequent requests (cache hit): Redirects to presigned S3 URL and, once the touch cooldown has elapsed, resets cache expiration
- Disallowed prefixes/extensions: URLs not matching allowed prefixes or extensions are transparently redirected to the original URL (302 pass-through)
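The allowlist rule in the last bullet might look like the following. This is a minimal sketch: `is_cacheable` is a hypothetical name, both lists are assumed to arrive as whitespace-separated strings as in the Parameters table, and an empty extension list is assumed to match nothing, which may not be how the project treats it.

```python
def is_cacheable(target: str, allowed_prefixes: str, allowed_extensions: str) -> bool:
    """Check a scheme-less URL (host + path) against both allowlists."""
    prefixes = allowed_prefixes.split()  # whitespace-separated host+path prefixes
    extensions = tuple(allowed_extensions.split())
    matches_prefix = any(target.startswith(p) for p in prefixes)
    # str.endswith accepts a tuple; an empty tuple never matches.
    matches_extension = target.endswith(extensions)
    return matches_prefix and matches_extension
```

A request that fails this check would be redirected to the original URL without touching S3.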
Usage as Repository Proxy
The cache can be used as a DNF/yum baseurl for RPM repositories. Non-RPM files
(repository metadata like repomd.xml, primary.xml.gz) pass through
transparently via 302 redirects, while .rpm files get cached:
[myrepo]
name=My Repository
baseurl=https://koji-s3-cache.example.com/download.example.org/pub/repo/$basearch/
enabled=1
This provides caching benefits for RPM downloads while maintaining full repository functionality.
Development
See the main README for development workflows.
make lambda-s3-cache/setup # Install dependencies
make check # Lint code
make fmt # Format code
make test # Run unit tests
make coverage # Run tests with coverage
Configuration
Lambda functions receive configuration via environment variables (automatically set by CloudFormation):
| Variable | Handler | Uploader | Touch | Description |
|---|---|---|---|---|
| S3_BUCKET_NAME | ✅ | ✅ | ✅ | S3 bucket name |
| PRESIGNED_URL_EXPIRATION | ✅ | | | Presigned URL expiration (sec) |
| UPLOADER_LAMBDA_ARN | ✅ | | | Uploader Lambda ARN |
| TOUCH_LAMBDA_ARN | ✅ | | | Touch Lambda ARN |
| TOUCH_COOLDOWN_MINUTES | ✅ | | | Min minutes between touch ops |
| ALLOWED_PREFIXES | ✅ | | | Allowed URL prefixes |
| ALLOWED_EXTENSIONS | ✅ | | | Allowed file extensions |
| SENTRY_DSN | ✅ | ✅ | ✅ | Optional Sentry DSN |
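For illustration, the Handler's variables from the table above could be read into a typed object at cold start. The `HandlerConfig` class and `load_handler_config` function are hypothetical names, and the sketch assumes the numeric values arrive as decimal strings and the two allowlists as whitespace-separated strings.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlerConfig:
    bucket: str
    presigned_url_expiration: int
    touch_cooldown_minutes: int
    allowed_prefixes: list[str]
    allowed_extensions: list[str]


def load_handler_config(env=os.environ) -> HandlerConfig:
    """Build a config object from the Handler's environment variables."""
    return HandlerConfig(
        bucket=env["S3_BUCKET_NAME"],
        presigned_url_expiration=int(env["PRESIGNED_URL_EXPIRATION"]),
        touch_cooldown_minutes=int(env["TOUCH_COOLDOWN_MINUTES"]),
        allowed_prefixes=env["ALLOWED_PREFIXES"].split(),
        allowed_extensions=env["ALLOWED_EXTENSIONS"].split(),
    )
```

Parsing once and failing fast on a missing variable (a `KeyError` here) surfaces misconfiguration at startup instead of on the first request that needs the value.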
Security & Limitations
Security:
- S3 bucket has public access blocked; all objects encrypted at rest (AES256)
- Presigned URLs expire after PresignedUrlExpiration seconds (default: 3600)
- IAM policies follow least privilege principle
- URL prefix allowlist prevents caching arbitrary URLs
- Extension allowlist restricts which file types can be cached
Limitations:
- Lambda timeout: 15 min (uploader), 30 sec (handler, touch)
- Lambda memory: 1024 MB (uploader), 256 MB (handler, touch)
- S3 object size: Up to 5 TB (AWS limit)
License
This project is licensed under the GNU General Public License v3.0 or later - see the LICENSE file for details.