Thursday, July 24, 2025
Scaling Phoenix Clusters on Fly.io—One GenServer per App with Swarm

Setting up a Phoenix application to run a single GenServer instance per “app” across an entire cluster of nodes is a common requirement: you want exactly one process responsible for each logical unit no matter how many machines you spin up. In this post we’ll walk through:
- Adding the necessary dependencies
- Configuring libcluster for service discovery on Fly.io
- Booting Swarm and libcluster under your supervision tree
- Defining and registering your GenServer via Swarm
- Deploying to Fly.io (with DNS‑poll clustering)
- Testing locally with multiple nodes
1. Add dependencies
In your mix.exs, add both libcluster and swarm to your deps:
defp deps do
  [
    {:phoenix, "~> 1.7"},
    # … your other deps …
    {:libcluster, "~> 3.3"},
    {:swarm, "~> 3.4"}
  ]
end
Then, in the same file, list them under extra_applications. Mix already starts dependencies automatically, so this step is optional, but it makes the intent explicit:
def application do
  [
    mod: {SettleeasyApi.Application, []},
    extra_applications: [:logger, :runtime_tools, :libcluster, :swarm]
  ]
end
Run:
mix deps.get
2. Configure libcluster for Fly.io
Fly.io gives each app an internal DNS domain that returns the private IPs of all your app’s VMs. We’ll use libcluster’s Cluster.Strategy.DNSPoll to discover peers every few seconds.
In config/runtime.exs, so that FLY_APP_NAME is read when the release boots rather than at compile time:
if config_env() == :prod do
  app = System.fetch_env!("FLY_APP_NAME")

  config :libcluster,
    topologies: [
      fly6pn: [
        strategy: Cluster.Strategy.DNSPoll,
        config: [
          query: "#{app}.internal",
          node_basename: app,
          polling_interval: 5_000
        ]
      ]
    ]
end
And in your fly.toml, be sure to set:
[env]
  FLY_APP_NAME = "app-name"
(Replace "app-name" with whatever you named your Fly.io app.)
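DNSPoll combines node_basename with each IP address returned by the DNS query to build the node names it tries to connect to, so every node in the release has to be named basename@ip for the connection to succeed. The sketch below only illustrates that naming scheme. NodeNames is a hypothetical helper written for this post, not libcluster’s own code, and the IP address is made up.

```elixir
defmodule NodeNames do
  @moduledoc """
  Hypothetical helper (not part of libcluster): shows how a basename and a
  list of resolved IP tuples turn into candidate node names.
  """

  # ips is a list of :inet IP tuples, e.g. the result of a DNS lookup.
  def from_ips(basename, ips) do
    for ip <- ips do
      # :inet.ntoa/1 renders an IP tuple as a charlist such as ~c"10.0.0.1".
      host = ip |> :inet.ntoa() |> to_string()
      :"#{basename}@#{host}"
    end
  end
end

# Example with a made-up private address:
IO.inspect(NodeNames.from_ips("app-name", [{10, 0, 0, 1}]))
```

If a node boots under a different name (for example the default short name), peers will resolve its IP but fail to connect, which is worth checking first when the cluster does not form.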
3. Define and register your GenServer
Here’s a minimal pattern for a named server:
defmodule App.Server do
  use GenServer

  # Public API

  def start_link(%{name: name}) do
    # Use a via‐tuple so Swarm can track us globally
    key = {:via, :swarm, name}

    case GenServer.start_link(__MODULE__, %{}, name: key) do
      {:ok, pid} ->
        {:ok, pid}

      {:error, {:already_started, _pid}} ->
        # If somebody else on the cluster already started it, do nothing
        :ignore

      {:error, reason} ->
        {:error, reason}
    end
  end

  # GenServer callbacks

  @impl true
  def init(_) do
    {:ok, []}
  end

  # … your handle_call, handle_cast, etc. …
end
Why this works
- {:via, :swarm, name} tells GenServer to register and look up the name through Swarm, so calls made with that name are routed to whichever node is hosting the process.
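To see the start_link pattern in isolation, here is a single-node sketch that swaps Swarm for Elixir’s built-in Registry. That swap is an assumption made so the example runs without a cluster or extra dependencies; the via-tuple mechanics are the same and only the registry module differs. Demo.Server, Demo.Registry, and the :ping message are hypothetical names.

```elixir
defmodule Demo.Server do
  use GenServer

  # Same shape as App.Server.start_link/1, but registered in a local Registry
  # instead of Swarm (stand-in for illustration only).
  def start_link(%{name: name}) do
    case GenServer.start_link(__MODULE__, %{}, name: via(name)) do
      {:ok, pid} -> {:ok, pid}
      # The name is already taken: report :ignore instead of an error.
      {:error, {:already_started, _pid}} -> :ignore
      {:error, reason} -> {:error, reason}
    end
  end

  # Callers address the process by name, never by pid.
  def ping(name), do: GenServer.call(via(name), :ping)

  defp via(name), do: {:via, Registry, {Demo.Registry, name}}

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:ping, _from, state), do: {:reply, {:pong, self()}, state}
end

{:ok, _registry} = Registry.start_link(keys: :unique, name: Demo.Registry)

# The first start wins; the second attempt for the same name is ignored.
{:ok, pid} = Demo.Server.start_link(%{name: :email_agent})
:ignore = Demo.Server.start_link(%{name: :email_agent})

# The call is resolved through the registry to the single running process.
{:pong, ^pid} = Demo.Server.ping(:email_agent)
```

With Swarm the equivalent call would use {:via, :swarm, name}, and the process that answers may live on another node.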
4. Add your GenServer to your application.ex children
Supervisor.child_spec({App.Server, %{name: :email_agent}},
  id: :email_agent
)
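For context, libcluster 3.x does not act on the :libcluster config by itself: its Cluster.Supervisor has to be started in your supervision tree with the topologies passed in. A minimal sketch of how the pieces could fit together follows. The module names App.Application, App.ClusterSupervisor, and App.Supervisor are assumptions (use your own application module), and the snippet expects libcluster and swarm to be installed as dependencies.

```elixir
defmodule App.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Read the topologies configured for :libcluster; default to none so the
    # app still boots in environments without clustering config.
    topologies = Application.get_env(:libcluster, :topologies, [])

    children = [
      # Start node discovery first so peers can connect before the
      # GenServer tries to register its name.
      {Cluster.Supervisor, [topologies, [name: App.ClusterSupervisor]]},
      # … your endpoint and other children …
      Supervisor.child_spec({App.Server, %{name: :email_agent}}, id: :email_agent)
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: App.Supervisor)
  end
end
```

Swarm runs as its own OTP application, so it needs no entry in the children list.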
5. Deploying to Fly.io
- Build & deploy as usual:
  fly deploy
- Fly will launch multiple VMs (you control the count with fly scale count N).
- Each VM boots, libcluster polls DNS for app-name.internal, discovers its peers, and forms a mesh.
- Swarm sees new nodes come up and shares the registry with them.
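After a deploy it helps to confirm where the process actually ended up. The helper below is a hedged sketch: App.ClusterCheck is a hypothetical module, and the lookup function is passed in as an argument (defaulting to Swarm.whereis_name/1) so the logic can also be exercised without Swarm installed.

```elixir
defmodule App.ClusterCheck do
  @moduledoc "Hypothetical helper for inspecting a clustered registration."

  # lookup must return a pid or :undefined, as Swarm.whereis_name/1 does.
  def locate(name, lookup \\ &Swarm.whereis_name/1) do
    case lookup.(name) do
      :undefined ->
        {:error, :not_registered}

      pid when is_pid(pid) ->
        # node(pid) is the node hosting the process; Node.list() lists the
        # peers this node is currently connected to.
        {:ok, %{hosted_on: node(pid), connected_peers: Node.list()}}
    end
  end
end
```

From an IEx session attached to a running node, App.ClusterCheck.locate(:email_agent) should report the same hosting node regardless of which machine you ask.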
Your Phoenix app now keeps a single GenServer per registered name running somewhere in the cluster, whatever your use case for it is.
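To try the same setup locally with multiple nodes, one option is to swap the DNS strategy for libcluster’s Cluster.Strategy.Epmd in development. The fragment below is a sketch for config/dev.exs; the topology name :local and the node names a@127.0.0.1 and b@127.0.0.1 are assumptions, chosen to match two nodes started with iex --name a@127.0.0.1 -S mix and iex --name b@127.0.0.1 -S mix.

```elixir
import Config

# Development-only topology: connect to a fixed list of local nodes instead
# of polling Fly.io's internal DNS. The node names are illustrative.
config :libcluster,
  topologies: [
    local: [
      strategy: Cluster.Strategy.Epmd,
      config: [hosts: [:"a@127.0.0.1", :"b@127.0.0.1"]]
    ]
  ]
```

Once both nodes are up, Node.list() on either one should show the other, and only one of them should be hosting :email_agent. If you also start the Phoenix endpoint on both nodes, each needs its own HTTP port.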
Conclusion
By combining:
- libcluster for automatic node discovery,
- Swarm for a distributed process registry and handoff,
- {:via, :swarm, name} naming for your GenServers,
you get one process per key across an arbitrarily sized cluster in normal operation. Swarm’s registry is eventually consistent by default, so two copies can briefly coexist during a network partition until the conflict is resolved. This pattern scales from a couple of local nodes all the way up to dozens of machines on Fly.io (or any other platform that supports Erlang clustering). Happy clustering!