# Cache Architecture {#cache-architecture}

::: info
This page provides a technical overview of the Tuist cache service architecture. It is primarily intended for **self-hosting users** and **contributors** who need to understand the internal workings of the service. General users who only want to use the cache do not need to read this.
:::

The Tuist cache service is a standalone service that provides Content Addressable Storage (CAS) for build artifacts and a key-value store for cache metadata.

## Overview {#overview}

The service uses a two-tier storage architecture plus local SQLite metadata:

- **Local disk**: Primary storage for low-latency cache hits
- **S3**: Durable storage that persists artifacts and allows recovery after eviction
- **SQLite**: Local metadata for artifact access tracking, orphan cleanup, background jobs, and key-value cache data

```mermaid
flowchart LR
    CLI[Tuist CLI] --> NGINX[Nginx]
    NGINX --> APP[Cache service]
    NGINX -->|X-Accel-Redirect| DISK[(Local Disk)]
    APP --> S3[(S3)]
    APP -->|auth| SERVER[Tuist Server]
```

## Components {#components}

### Nginx {#nginx}

Nginx serves as the entry point and handles efficient file delivery using `X-Accel-Redirect`:

- **Downloads**: The cache service validates authentication, then returns an `X-Accel-Redirect` header. Nginx serves the file directly from disk or proxies it from S3.
- **Uploads**: Nginx proxies requests to the cache service, which streams the data to disk.

### Content Addressable Storage {#cas}

Artifacts are stored on local disk in a sharded directory structure:

- **Path**: `{account}/{project}/cas/{shard1}/{shard2}/{artifact_id}`
- **Sharding**: The first four characters of the artifact ID create a two-level shard (e.g., `ABCD1234` → `AB/CD/ABCD1234`)

### SQLite Metadata {#sqlite}

The cache service uses two SQLite databases:

- **Primary metadata DB**: Stores `cache_artifacts`, orphan scan cursors, Oban jobs, and other service metadata.
- **Key-value DB**: Stores `key_value_entries` and `key_value_entry_hashes` in a dedicated SQLite file.

The key-value store is split into its own database so it can use SQLite incremental auto-vacuum without affecting artifact metadata and orphan cleanup state.

### S3 Integration {#s3}

S3 provides durable storage:

- **Background uploads**: After writing to disk, artifacts are queued for upload to S3 via a background worker that runs every minute
- **On-demand hydration**: When a local artifact is missing, the request is served immediately via a presigned S3 URL while the artifact is queued for background download to local disk

### Disk Eviction {#eviction}

The service manages disk space using multiple background processes:

- **CAS disk eviction** uses LRU semantics backed by `cache_artifacts`:
  - When disk usage exceeds 85%, the oldest artifacts are deleted until usage drops to 70%
  - Artifacts remain in S3 after local eviction
- **KV eviction** removes old key-value entries by retention and can also shrink the dedicated KV database when it grows past its configured size budget

### Orphan Cleanup {#orphan-cleanup}

The service also runs an orphan cleanup worker for disk artifacts:

- It scans the storage tree for files that exist on disk but have no corresponding `cache_artifacts` row. This can happen if a file is written to disk but the metadata write is lost before the SQLite buffer flush completes.
- Files newer than a safety window are ignored to avoid racing with in-flight uploads.
- If an orphan is deleted and later requested again, the next cache miss causes it to be uploaded again, so the system self-heals.

### Authentication {#authentication}

The cache delegates authentication to the Tuist server by calling the `/api/projects` endpoint and caching results (10 minutes for success, 3 seconds for failure).
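The authentication cache described above can be sketched as a small TTL map. This is a minimal illustration under assumed names (`AuthCache`, `TTL_SUCCESS`, `TTL_FAILURE` are hypothetical, not the service's actual API); only the TTL values come from this document:

```python
import time

# TTLs stated in the docs: 10 minutes for a successful auth check,
# 3 seconds for a failed one. Names are illustrative.
TTL_SUCCESS = 600.0
TTL_FAILURE = 3.0


class AuthCache:
    """Caches auth results from the Tuist server, keyed by credential."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (authorized: bool, expires_at: float)

    def get(self, key):
        """Return the cached result, or None if absent or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        authorized, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return authorized

    def put(self, key, authorized: bool) -> None:
        """Cache a result; failures expire much sooner than successes."""
        ttl = TTL_SUCCESS if authorized else TTL_FAILURE
        self._entries[key] = (authorized, self._clock() + ttl)
```

The asymmetric TTLs mean a revoked or mistyped token is re-checked against the server within seconds, while valid credentials avoid a round trip to `/api/projects` for ten minutes.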
## Request Flows {#request-flows}

### Download {#download-flow}

```mermaid
sequenceDiagram
    participant CLI as Tuist CLI
    participant N as Nginx
    participant A as Cache service
    participant D as Disk
    participant S as S3
    CLI->>N: GET /api/cache/cas/:id
    N->>A: Proxy for auth
    A-->>N: X-Accel-Redirect
    alt On disk
        N->>D: Serve file
    else Not on disk
        N->>S: Proxy from S3
    end
    N-->>CLI: File bytes
```

### Upload {#upload-flow}

```mermaid
sequenceDiagram
    participant CLI as Tuist CLI
    participant N as Nginx
    participant A as Cache service
    participant D as Disk
    participant S as S3
    CLI->>N: POST /api/cache/cas/:id
    N->>A: Proxy upload
    A->>D: Stream to disk
    A-->>CLI: 201 Created
    A->>S: Background upload
```

## API Endpoints {#api-endpoints}

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/up` | GET | Health check |
| `/metrics` | GET | Prometheus metrics |
| `/api/cache/cas/:id` | GET | Download CAS artifact |
| `/api/cache/cas/:id` | POST | Upload CAS artifact |
| `/api/cache/keyvalue/:cas_id` | GET | Get key-value entry |
| `/api/cache/keyvalue` | PUT | Store key-value entry |
| `/api/cache/module/:id` | HEAD | Check if module artifact exists |
| `/api/cache/module/:id` | GET | Download module artifact |
| `/api/cache/module/start` | POST | Start multipart upload |
| `/api/cache/module/part` | POST | Upload part |
| `/api/cache/module/complete` | POST | Complete multipart upload |
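The `/api/cache/cas/:id` endpoints address artifacts by ID, and the service maps that ID to the sharded on-disk layout described in the Components section. The mapping can be sketched as follows (`cas_path` is a hypothetical helper for illustration, not the service's actual code):

```python
from pathlib import PurePosixPath


def cas_path(account: str, project: str, artifact_id: str) -> PurePosixPath:
    """Derive the on-disk location of a CAS artifact.

    The first four characters of the artifact ID form two shard
    directories, e.g. ABCD1234 -> AB/CD/ABCD1234, which keeps any
    single directory from accumulating every artifact.
    """
    if len(artifact_id) < 4:
        raise ValueError("artifact ID too short to shard")
    shard1, shard2 = artifact_id[:2], artifact_id[2:4]
    return PurePosixPath(account, project, "cas", shard1, shard2, artifact_id)


# Example: cas_path("acme", "app", "ABCD1234")
# -> acme/app/cas/AB/CD/ABCD1234
```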