
Kubernetes

Install ricochet using the OCI Helm chart:

helm install ricochet oci://ghcr.io/ricochet-rs/ricochet-helm

The ricochet server creates Kubernetes resources for your workloads:

  • Each ricochet app runs as a Kubernetes Deployment
  • Each ricochet task runs as a Kubernetes Job

Configure deployment specs for apps using the launcher.deployment settings.
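As a sketch, such a configuration could look like the following. The exact keys available under launcher.deployment depend on your ricochet version, so treat the field names below as assumptions to verify against your instance's configuration reference:

```yaml
# Hypothetical example of launcher.deployment settings.
# The nesting and field names are illustrative assumptions, not a verified schema.
launcher:
  deployment:
    resources:
      requests:
        cpu: "250m"
        memory: "512Mi"
      limits:
        memory: "1Gi"
```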

When you deploy a new app, ricochet creates a deploy-<ulid> job that installs all required dependency packages.

When an app starts, an init-container (check-dependencies) verifies that all dependencies are installed. This is necessary for every app start because the initial dependency installation runs on a node with a specific architecture. If the app pod starts on a node with a different architecture, the dependencies may not (yet) be available. In this case, the init-container installs the dependencies for that architecture.
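Conceptually, the generated pod spec resembles the following sketch. Container names other than check-dependencies, the image, and the volume names are illustrative assumptions; ricochet generates the actual spec:

```yaml
# Illustrative sketch of an app pod with the dependency check.
# Image and volume names are assumptions for explanation only.
spec:
  initContainers:
    - name: check-dependencies      # verifies (and, if needed, installs) dependencies
      image: ghcr.io/ricochet-rs/ricochet:latest   # assumed image
      volumeMounts:
        - name: cache
          mountPath: /opt/ricochet/cache
  containers:
    - name: app
      image: ghcr.io/ricochet-rs/ricochet:latest   # assumed image
      volumeMounts:
        - name: cache
          mountPath: /opt/ricochet/cache
        - name: data
          mountPath: /opt/ricochet/data
```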

Ricochet requires two persistent volumes:

  • “Data”: /opt/ricochet/data
  • “Cache”: /opt/ricochet/cache

Both volumes must use the “ReadWriteMany” access mode so they can be mounted by pods running on multiple nodes.
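For reference, a minimal PersistentVolumeClaim with ReadWriteMany might look like this. The storageClassName is an assumption; substitute a class in your cluster that supports ReadWriteMany (for example, one backed by NFS or CephFS):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ricochet-cache
spec:
  accessModes:
    - ReadWriteMany              # required so multiple pods can mount the volume
  storageClassName: nfs-client   # assumption: replace with an RWX-capable class
  resources:
    requests:
      storage: 10Gi
```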

Here are some recommendations for your volumes based on the expected number of active apps:

# of Apps    Cache    Data
< 5          10 Gi    20 Gi
5 - 20       25 Gi    50 Gi
20+          30 Gi    75 Gi

The “cache” PVC does not scale linearly because apps share dependencies: the more apps you deploy, the greater the overlap in packages.

The “data” PVC scales more linearly but depends on bundle sizes. Each bundle can be several megabytes, and a single app can accumulate many deployments during development.

  • The “cache” PVC stores all R, Julia, and Python packages across apps and tasks. It grows over time as different interpreter versions are used.

  • The “data” PVC stores all app and user content. Size depends on bundle sizes and deployment frequency. You can start with smaller volumes and expand as needed, though resizing may involve some operational overhead.

Ricochet includes built-in horizontal scaling for app deployments.

Each app deployment can scale across multiple replicas based on these parameters:

  • min_instances: Minimum number of replicas to maintain
  • max_instances: Maximum number of replicas to run
  • spawn_threshold: Connection occupancy percentage that triggers scale-up
  • max_connections: Maximum connections per app replica
  • max_connection_age: Maximum duration before closing a connection
  • inactive_timeout: Maximum idle time before closing a connection

Together, these parameters bound the replica count between min_instances and max_instances and determine when ricochet scales up: additional replicas are spawned once connection occupancy exceeds spawn_threshold.
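As a sketch, such a scaling configuration might look like the following. Only the parameter names come from the list above; the surrounding structure, value formats, and units are assumptions to verify against your instance:

```yaml
# Hypothetical scaling configuration; parameter names are from the list above,
# the nesting and value formats are assumptions.
scaling:
  min_instances: 1
  max_instances: 5
  spawn_threshold: 80        # scale up when 80% of connections are occupied
  max_connections: 20        # per replica
  max_connection_age: 3600   # seconds (unit is an assumption)
  inactive_timeout: 600      # seconds (unit is an assumption)
```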

Apps run as Kubernetes deployments with support for multiple replicas, providing high availability. You can strengthen this by enforcing a minimum replica count distributed across multiple nodes.

You can extend the deployment spec per app with custom YAML configurations. This is an advanced feature: take care not to modify sections that are already managed through instance-wide configuration set via the Helm chart.

The following fields can be set instance-wide through the Helm chart and should not be overwritten on a per-app basis:

  • imagePullPolicy
  • imagePullSecrets
  • strategy
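A per-app override should therefore touch only fields that are not managed instance-wide. A hypothetical example that adds scheduling constraints without touching the reserved fields:

```yaml
# Hypothetical per-app deployment extension.
# Safe: these fields are not in the instance-wide list above.
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "ricochet"
          effect: "NoSchedule"
# Avoid: imagePullPolicy, imagePullSecrets, strategy — set these via the Helm chart.
```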