Commit 28d7345

Revamp cache housekeeping logic
1 parent 2f1c5d2 commit 28d7345

4 files changed

Lines changed: 266 additions & 84 deletions

README.md

Lines changed: 13 additions & 1 deletion
@@ -5,7 +5,7 @@ It supports multiple backends: Azure Blob Storage and local filesystem, to provi
 It caches the files in a local directory and serves them from there.
 Range requests are supported, but only for start offset, end limit is not implemented yet.
 Files are protected by upload locking to prevent concurrent uploads to the same path.
-To limit disk usage, a background housekeeping loop samples disk space every five minutes and deletes the oldest cached files (older than an hour) whenever free space drops below 12%. A housekeeping summary is logged every five minutes showing the current free space, cache size, and any clean-up actions taken.
+To limit disk usage, a background housekeeping loop samples disk space every five minutes, keeps the cache below 1,000,000 items, and deletes the oldest cached files in configurable batches (default 100,000 files) whenever the file-count or disk-space thresholds are exceeded. A housekeeping summary is logged every five minutes showing the current free space, cache size, and any clean-up actions taken.
 
 ## Configuration
 
@@ -42,6 +42,18 @@ name = "user@email.com"
 prefixes = ["/allowed/path"]
 ```
 
+### Cache Housekeeping
+
+Cache housekeeping enforces a hard limit of 1,000,000 cached artifacts (`*.content` files) and frees disk space whenever free capacity drops below 12% (stopping once it rises above 13%). Files are always deleted in least-recently-updated order. The size of each deletion batch is configurable via the optional `[cache]` section:
+
+```toml
+[cache]
+# Number of cached files deleted per housekeeping iteration when limits are exceeded
+cleanup_chunk_size = 100000 # Defaults to 100000
+```
+
+Larger chunk sizes reclaim space faster when the cache is far above the limit, while smaller chunks reduce the amount of data removed per iteration.
+
 ## Creating user tokens
 
 The server uses JWT token based authentication. The token is passed in the `Authorization` header as a Bearer token.
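
The housekeeping rules described in the README change (a 1,000,000-file cap, cleanup starting below 12% free space and stopping above 13%, oldest files first, one batch per iteration) can be sketched roughly as follows. This is an illustrative Python sketch, not the code from this commit: the function names, the `(path, mtime)` tuple layout, and the idea of carrying a `cleaning_for_space` flag between iterations are all assumptions.

```python
MAX_CACHED_FILES = 1_000_000  # hard limit stated in the README
LOW_FREE_PCT = 12.0           # cleanup starts below this free-space percentage
STOP_FREE_PCT = 13.0          # cleanup stops once free space rises above this


def needs_cleanup(file_count, free_pct, cleaning_for_space):
    """Return (run_cleanup, cleaning_for_space) for one housekeeping iteration.

    Hypothetical helper: the flag remembers that a disk-space cleanup is in
    progress, so deletion continues between 12% and 13% free space.
    """
    if free_pct < LOW_FREE_PCT:
        cleaning_for_space = True
    elif free_pct > STOP_FREE_PCT:
        cleaning_for_space = False
    run_cleanup = file_count > MAX_CACHED_FILES or cleaning_for_space
    return run_cleanup, cleaning_for_space


def pick_batch(files, chunk_size=100_000):
    """Pick up to chunk_size least-recently-updated files.

    `files` is a list of (path, mtime) tuples; this layout is an assumption.
    """
    oldest_first = sorted(files, key=lambda item: item[1])
    return [path for path, _mtime in oldest_first[:chunk_size]]
```

The gap between the 12% and 13% thresholds keeps the worker from flapping on and off when free space hovers near a single cutoff.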

config-local.toml

Lines changed: 3 additions & 0 deletions
@@ -3,3 +3,6 @@ jwt_secret="test-secret-for-local-storage-testing"
 
 [local]
 storage_path="./test-storage"
+
+[cache]
+cleanup_chunk_size=100000

docs/_index.md

Lines changed: 12 additions & 1 deletion
@@ -90,7 +90,7 @@ curl https://files.kernelci.org/metrics
 
 - **JWT Authentication**: Secure token-based authentication for uploads
 - **Multiple Storage Backends**: Currently supports Azure Blob Storage with extensible driver architecture
-- **Local Caching**: Files are cached locally with automatic cleanup when disk space is low; disk space is sampled every five minutes, a housekeeping summary is logged at the same cadence, and the least recently updated cached files older than an hour are removed until space recovers
+- **Local Caching**: Files are cached locally with automatic cleanup rules; housekeeping enforces a hard limit of 1,000,000 cached entries, deletes the oldest files in configurable batches (default 100,000 files), and frees disk space whenever available space falls below 12%
 - **Range Request Support**: Partial content downloads using HTTP range requests
 - **File Locking**: Prevents concurrent uploads to the same file path
 - **Prometheus Metrics**: System monitoring and metrics collection
@@ -157,6 +157,17 @@ name = "admin@example.com"
 prefixes = [""] # Empty prefix allows access to all paths
 ```
 
+### Cache Housekeeping
+
+The cache directory is capped at 1,000,000 cached artifacts (`*.content` files). When the limit is exceeded—or when disk space drops below 12%—the housekeeping worker deletes the oldest entries in batches. The batch size defaults to 100,000 files and can be overridden in the configuration file:
+
+```toml
+[cache]
+cleanup_chunk_size = 100000
+```
+
+Raising the value makes each cleanup iteration more aggressive; lowering it favors smaller, more frequent deletions.
+
 ## Environment Variables
 
 - `STORAGE_DEBUG`: Enable debug logging
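
To make the trade-off behind `cleanup_chunk_size` concrete, the following Python sketch estimates how many deletion batches are needed to bring the file count back to the 1,000,000 cap. It assumes that each batch removes exactly `chunk_size` files and that no new files are cached in the meantime; neither assumption is confirmed by the commit, and the function name is hypothetical.

```python
def batches_to_reach_limit(file_count, max_files=1_000_000, chunk_size=100_000):
    """Estimate how many deletion batches bring file_count to max_files or below."""
    excess = max(0, file_count - max_files)
    # Ceiling division with integers only, avoiding float rounding.
    return -(-excess // chunk_size)
```

With the default batch size, a cache holding 1,250,000 files needs three batches and ends at 950,000 files, since the last batch overshoots the cap. Halving the batch size to 50,000 takes five batches and ends exactly at 1,000,000.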
