Supervision
In most languages, supervision requires a framework or a special runtime construct. In Saga, supervision is a handler pattern: catch a failure, log it, and restart the computation. This falls directly out of the effect system with no special language support.
The Basic Pattern
A supervisor is a handler for Fail that restarts the computation instead of
propagating the error:

    fun supervise : (Unit -> Unit needs {Fail, ..e}) -> Unit needs {..e}
    supervise f = f () with {
      fail reason = {
        dbg $"crashed: {reason}"
        supervise f
      }
    }

When the computation calls fail!, the handler catches it, logs the reason,
and calls supervise f again. The computation restarts from scratch.
Because the supervisor doesn't call resume, the failed computation is
abandoned. A fresh invocation starts clean.

    fun unreliable_worker : Unit -> Unit needs {Fail}
    unreliable_worker () = {
      dbg "working..."
      fail! "something broke"
    }

    main () = supervise unreliable_worker
    # working...
    # crashed: something broke
    # working...
    # crashed: something broke
    # (loops forever)

Retry Limits
An infinite restart loop is rarely what you want. Add a retry counter:

    fun supervise_n : Int -> (Unit -> a needs {Fail, ..e}) -> Result a String needs {..e}
    supervise_n 0 _ = Err "retries exhausted"
    supervise_n n f = {
      let result = f () with {
        fail reason = {
          dbg $"attempt failed: {reason}"
          supervise_n (n - 1) f
        }
        return value = Ok value
      }
      result
    }

After n failures, the supervisor gives up and returns Err.
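
As a rough usage sketch, assuming supervise_n is defined exactly as above: always_fails is a hypothetical worker introduced only for illustration, and the commented output is what tracing the definition by hand predicts (three failed attempts, then the base case), not captured output.

```saga
# Hypothetical worker that fails on every attempt (illustration only).
fun always_fails : Unit -> Int needs {Fail}
always_fails () = fail! "boom"

main () = {
  # Three attempts; each failure recurses with n - 1 until n reaches 0.
  let result = supervise_n 3 always_fails
  dbg $"result: {debug result}"
}
# attempt failed: boom
# attempt failed: boom
# attempt failed: boom
# result: Err("retries exhausted")
```

Note that this version reports a fixed "retries exhausted" message rather than the last failure reason.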
Using Std.Supervisor
The standard library provides supervised, which wraps this pattern:

    import Std.Supervisor (supervised)

    fun unreliable : Unit -> Int needs {Fail String}
    unreliable () = fail! "something went wrong"

    main () = {
      let result = supervised 3 (fun () -> unreliable ())
      dbg $"result: {debug result}"
    }
    # result: Err("something went wrong")

supervised catches failures and retries up to the given number of times.
It returns Ok value on success or Err reason with the last failure if
all retries are exhausted.
Backoff
Since supervision is just a function, adding backoff is straightforward.
Use the Timer effect to add a delay between retries:

    import Std.Actor (Timer)

    fun supervise_backoff : Int -> Int -> (Unit -> a needs {Fail, ..e})
      -> Result a String needs {Timer, ..e}
    supervise_backoff 0 _ _ = Err "retries exhausted"
    supervise_backoff n delay f = {
      let result = f () with {
        fail reason = {
          dbg $"failed: {reason}, retrying in {show delay}ms"
          sleep! delay
          supervise_backoff (n - 1) (delay * 2) f
        }
        return value = Ok value
      }
      result
    }

    # Usage: exponential backoff starting at 100ms, up to 5 attempts
    supervise_backoff 5 100 my_worker

The delay doubles after each failure: 100ms, 200ms, 400ms, 800ms, 1600ms. Because
the Timer effect is just another effect in the needs clause, no special
integration is required.
Let It Crash
The BEAM's "let it crash" philosophy says: don't try to handle every possible error defensively. Instead, let processes fail and have a supervisor restart them. This works because BEAM processes are isolated. One crashing process cannot corrupt another's state.
In Saga, this philosophy maps directly to effects:
- Write your logic assuming everything works. Use fail! when something goes wrong.
- Wrap it in a supervisor handler at the boundary.
- The supervisor decides the restart policy. The business logic doesn't know or care.

    fun server_loop : Unit -> Unit needs {Fail, Actor Request, Process}
    server_loop () = {
      let req = receive { r -> r }
      let response = process_request req
      send! req.reply_to response
      server_loop ()
    }

    main () = {
      supervise (fun () -> server_loop ())
    } with beam_actor

If process_request fails, the supervisor restarts the loop. The requesting
process gets no response (it should have its own timeout), but the server
keeps running.
Supervision and Resources
When a supervised computation acquires resources, combine supervision with scoped cleanup to ensure resources are released on restart:

    fun worker : Unit -> Unit needs {Fail, Scope}
    worker () = {
      let db = acquire_scoped! (fun () -> connect "postgres") disconnect
      do_work db
    }

    main () = {
      supervise (fun () -> {
        worker () with run_scoped
      })
    }

Each restart acquires a fresh connection, and each failure cleans up the old
one through the finally block in run_scoped. The patterns compose because
they are all just handlers.