Pausing and Resuming
OpenKruise Agents allows you to pause a running sandbox so that it stops consuming CPU / memory, and later
resume it back to a running state, keeping the sandbox identity (same sandboxID, same Pod) intact.
⚠️ This area is still evolving. What a pause actually captures (whether memory state is preserved, whether the filesystem is checkpointed, and so on) depends on the cluster environment; for example, memory-state preservation is currently supported only on Alibaba Cloud ACS. This document focuses on how to use the API; for state-preservation guarantees, see your platform runbook.
Overview
Two parallel interfaces are provided, both acting on the same underlying Sandbox resource:
| Interface | Audience | Typical scenarios |
|---|---|---|
| E2B SDK (pause/connect) | Application code running on top of the E2B Python / JavaScript SDKs | Programmatic pause before idle, resume on demand |
| Kubernetes CRD | Cluster ops, declarative GitOps, kubectl or custom control | Toggle spec.paused or schedule spec.pauseTime |
Pause/Resume is one-to-one: the sandbox ID stays the same across the pause → resume cycle. If you need a one-to-many "snapshot and fork" workflow, see Snapshot Management.
How It Works (summary)
- Pause freezes the sandbox Pod. Active WebSocket / PTY / command-stream connections are dropped; clients must reconnect after resume.
- Resume brings the Pod back to the running state. The sandbox ID is preserved.
- The exact capture scope (memory / filesystem) depends on the runtime platform and the configuration on the backing SandboxSpec.
Pausing a Sandbox
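Because a pause drops every active stream, client code usually wraps its reconnect in a small retry loop. The following is a minimal sketch of that pattern, not part of OpenKruise Agents or the E2B SDK: `connect` is a hypothetical zero-argument callable standing in for whatever your client uses to re-attach, and treating `ConnectionError` as the "stream dropped" signal is an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def reconnect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, backing off exponentially between tries."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as err:  # assumed error type for a dropped stream
            last_error = err
            if attempt < attempts - 1:
                # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
                sleep(base_delay * (2 ** attempt))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

Injecting `sleep` keeps the helper easy to test and lets callers swap in their own scheduling.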
- E2B SDK
- Kubernetes CRD
The E2B SDK exposes a pause method on a sandbox handle (pause() in the Python example, betaPause() in the JavaScript one). It calls the POST /sandboxes/{sandboxID}/pause endpoint under the hood.
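Python:

from e2b_code_interpreter import Sandbox

# Create the sandbox without a `with` block: the context manager kills the
# sandbox on exit, which would discard the paused sandbox.
sbx = Sandbox.create(template="code-interpreter", timeout=300)
sbx.run_code("a = 1")
sbx.pause()  # sandbox is now paused; sandboxID is retained

JavaScript:

import { Sandbox } from 'e2b'

const sbx = await Sandbox.create({ template: 'code-interpreter', timeoutMs: 300_000 })
await sbx.betaPause() // sandbox is now paused; sandboxID is retained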
Notes:
- Pausing a sandbox that is not in running state returns 409 Conflict.
- Paused sandboxes are kept indefinitely: the auto-shutdown timer is disabled while paused.
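To make the HTTP call behind the pause method concrete, here is a minimal sketch that builds (but does not send) the pause request and interprets the status code. Only the path and the 409 behavior come from the text above; the base URL, the `X-API-Key` header name, and the helper names are assumptions for illustration.

```python
import urllib.parse
import urllib.request


def build_pause_request(base_url: str, sandbox_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /sandboxes/{sandboxID}/pause request."""
    quoted_id = urllib.parse.quote(sandbox_id, safe="")
    url = f"{base_url.rstrip('/')}/sandboxes/{quoted_id}/pause"
    # The header name is an assumption; use whatever your deployment expects.
    return urllib.request.Request(url, method="POST", headers={"X-API-Key": api_key})


def describe_pause_status(status: int) -> str:
    """Map the HTTP status of a pause call to a short outcome label."""
    if 200 <= status < 300:
        return "paused"
    if status == 409:
        # Per the note above: only a sandbox in running state can be paused.
        return "not running"
    return "unexpected"
```

A caller that wants an idempotent "make sure it is paused" operation could treat "not running" as a prompt to check the current state rather than as a hard failure.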
With the Kubernetes CRD, set spec.paused: true on the Sandbox CR. The controller drives the Pod into the paused state.
kubectl patch sbx my-sandbox -n default --type=merge -p '{"spec":{"paused":true}}'
You can also schedule an automatic pause via spec.pauseTime:
apiVersion: agents.kruise.io/v1alpha1
kind: Sandbox
metadata:
  name: my-sandbox
  namespace: default
spec:
  pauseTime: "2026-05-13T10:00:00Z" # RFC 3339; absolute time to auto-pause
Check the status phase:
kubectl get sbx my-sandbox -n default -o jsonpath='{.status.phase}'
# → Paused
Auto Pause
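Setting spec.paused is declarative, so automation usually needs to wait until status.phase reports the new state. The sketch below polls for that; `wait_for_phase` and `kubectl_phase` are hypothetical helper names, and `kubectl_phase` simply shells out to the same `kubectl get` command shown above (it assumes `kubectl` is on the PATH and configured for the cluster).

```python
import subprocess
import time
from typing import Callable


def kubectl_phase(name: str, namespace: str = "default") -> str:
    """Read status.phase of a Sandbox via kubectl (same command as above)."""
    result = subprocess.run(
        ["kubectl", "get", "sbx", name, "-n", namespace,
         "-o", "jsonpath={.status.phase}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def wait_for_phase(
    get_phase: Callable[[], str],
    target: str = "Paused",
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `get_phase` until it returns `target`; return False on timeout."""
    deadline = clock() + timeout
    while True:
        if get_phase() == target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Typical use would be `wait_for_phase(lambda: kubectl_phase("my-sandbox"))` right after the patch; the same helper works for resume with `target="Running"`.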
In addition to calling pause explicitly, you can declare at creation time that the sandbox should transition into the paused state when its timeout expires. When the timer fires, the sandbox is not killed; it moves into the paused state with its identity preserved and waits for a later resume.
- E2B SDK
- Kubernetes CRD
Follow the Auto-pause section of the E2B docs: set lifecycle.on_timeout to "pause" when creating the sandbox.
Python:

from e2b_code_interpreter import Sandbox

sbx = Sandbox.create(
    template="demo",
    timeout=600,  # 10 minutes; on expiry go to paused instead of being killed
    lifecycle={
        "on_timeout": "pause",
        "auto_resume": False,  # see note below
    },
)

JavaScript:

import { Sandbox } from 'e2b'

const sandbox = await Sandbox.create({
  template: 'demo',
  timeoutMs: 10 * 60 * 1000, // 10 minutes; on expiry go to paused
  lifecycle: {
    onTimeout: 'pause',
    autoResume: false, // see note below
  },
})
⚠️ auto_resume is not yet implemented in OpenKruise Agents. Even if you set it to true, a paused sandbox will not be woken up automatically; clients must still call Sandbox.connect(sandbox_id, ...) explicitly when they need it (see the next section). Set it to false to make the semantics explicit.
With the Kubernetes CRD, set spec.pauseTime on a SandboxClaim (or on the underlying Sandbox CR). When the absolute time is reached, the controller drives the sandbox into the paused state.
apiVersion: agents.kruise.io/v1alpha1
kind: SandboxClaim
metadata:
  name: demo-sandbox-claim
  namespace: default
spec:
  templateName: demo
  # RFC 3339 absolute time; the controller auto-pauses the sandbox on expiry.
  # It is recommended to set this field programmatically, for example:
  # sbc.Spec.PauseTime = metav1.NewTime(time.Now().Add(5 * time.Minute))
  pauseTime: "2026-02-06T07:33:30Z"
The Sandbox CR itself also has a spec.pauseTime field (see the previous section). The CRD path uses an absolute time rather than the "pause when the timeout expires" offset semantics of the E2B SDK; use the E2B SDK if you need the latter.
Resuming a Sandbox
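Since spec.pauseTime is an absolute RFC 3339 timestamp, an "N minutes from now" offset has to be converted by the caller. The Go one-liner in the manifest comment does this; a minimal Python equivalent might look like the sketch below, where `pause_time_after` and `pause_time_patch` are hypothetical helper names and the output is normalized to UTC with second precision.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def pause_time_after(delay: timedelta, now: Optional[datetime] = None) -> str:
    """Return an RFC 3339 UTC timestamp `delay` after `now` (default: current time)."""
    start = now if now is not None else datetime.now(timezone.utc)
    if start.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    target = (start + delay).astimezone(timezone.utc)
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")


def pause_time_patch(delay: timedelta, now: Optional[datetime] = None) -> str:
    """Build a JSON merge-patch body that sets spec.pauseTime."""
    return json.dumps({"spec": {"pauseTime": pause_time_after(delay, now)}})
```

The string returned by `pause_time_patch(timedelta(minutes=5))` can be passed as the `-p` argument of a `kubectl patch ... --type=merge` command like the ones used elsewhere on this page.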
The recommended interface on the E2B SDK side is Sandbox.connect(...): it implicitly resumes a paused sandbox and refreshes its timeout at the same time. The legacy resume endpoint still exists for backward compatibility but should not be used by new code.
- E2B SDK
- Kubernetes CRD
Python:

from e2b_code_interpreter import Sandbox

sbx = Sandbox.connect(sandbox_id, timeout=300)  # resumes if paused, and extends the timeout
sbx.run_code("print(a)")

JavaScript:

import { Sandbox } from 'e2b'

const sbx = await Sandbox.connect(sandboxId, { timeoutMs: 300_000 })
Notes:
- If the sandbox is already running, connect only refreshes the timeout. The refresh is extend-only: the timeout will never be shortened. (Exception: a paused → running resume applies the requested timeout directly.)
- connect on a sandbox that does not exist or is owned by another API key returns 404 Not Found.
With the Kubernetes CRD, clear spec.paused (set it back to false) and the controller resumes the Pod. You can also adjust spec.pauseTime / spec.shutdownTime in the same patch to effectively "refresh the timeout while resuming":
kubectl patch sbx my-sandbox -n default --type=merge -p '{"spec":{"paused":false}}'
# Resume and also push the next auto-shutdown one hour out (example)
kubectl patch sbx my-sandbox -n default --type=merge \
-p '{"spec":{"paused":false,"shutdownTime":"2026-05-13T11:00:00Z"}}'
The CRD path has no extend-only guard. The E2B SDK's connect + timeoutMs only extends, never shortens, the remaining lifetime while running; the CRD path takes the user-written value verbatim and can shorten it. Pick the path that matches your intent.
Capability Matrix
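The difference between the two paths is easiest to see as a small model. The sketch below only restates the semantics described on this page in code; it is not the controller's or the SDK's implementation, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def sdk_refresh(state: str, current_deadline: datetime,
                now: datetime, timeout: timedelta) -> datetime:
    """Model of connect(..., timeout=...): extend-only while running."""
    requested = now + timeout
    if state == "paused":
        # A paused -> running resume applies the requested timeout directly.
        return requested
    # While running, the deadline can only move later, never earlier.
    return max(current_deadline, requested)


def crd_write(current_deadline: datetime, written: datetime) -> datetime:
    """Model of patching spec.shutdownTime: the written value wins verbatim."""
    return written
```

For example, with 30 minutes of lifetime left, `sdk_refresh("running", ...)` with a 5-minute timeout leaves the deadline unchanged, while `crd_write` with a time 5 minutes out shortens it.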
| Capability | E2B SDK | Kubernetes CRD |
|---|---|---|
| Pause a running sandbox | ✅ sbx.beta_pause() | ✅ spec.paused: true |
| Resume a paused sandbox | ✅ Sandbox.connect(id, ...) | ✅ spec.paused: false |
| Auto-pause when the timeout expires | ✅ lifecycle.on_timeout='pause' | ❌ |
| Auto-pause at a specific absolute time | ❌ | ✅ spec.pauseTime |
| Auto-resume a paused sandbox | ❌ not yet supported | ❌ not yet supported |
| Set / refresh the sandbox timeout together with resume | ✅ Sandbox.connect(id, timeout=...) | ✅ write spec.shutdownTime / spec.pauseTime in the same patch |
| Extend-only guard on timeout refresh while running | ✅ | ❌ user-written value may shorten |
| Observe paused/running state | via SDK response | status.phase (Paused / Running) |
If you need "extend-only, never shorten" timeout semantics, use the E2B SDK. The CRD path is better suited for declarative / GitOps control over the paused/running bit plus absolute scheduling times.
Notes
- Connection drop. The Pod is frozen on pause; all active streams (WebSocket / PTY / command streams) disconnect. Clients must reconnect after resume.
- Timeout during pause. Paused sandboxes are not auto-deleted by the idle timeout. Auto-delete is controlled separately by spec.shutdownTime.
- Old SDKs. The legacy POST /sandboxes/{sandboxID}/resume endpoint is kept for old SDK compatibility only. New code should always use Sandbox.connect(...).
- State-preservation caveats. Whether memory is preserved across pause/resume depends on the runtime platform. If you need explicit memory + filesystem snapshots that can also be cloned into brand-new sandboxes, use Snapshot Management instead.