Thursday, July 24, 2025

Scaling Phoenix Clusters on Fly.io—One GenServer per App with Swarm

Liam Killingback

Tutorial

Setting up a Phoenix application to run a single GenServer instance per “app” across an entire cluster of nodes is a common requirement: you want exactly one process responsible for each logical unit no matter how many machines you spin up. In this post we’ll walk through:

Adding the necessary dependencies
Configuring libcluster for service discovery on Fly.io
Booting Swarm and libcluster under your supervision tree
Defining and registering your GenServer via Swarm
Deploying to Fly.io (with DNS‑poll clustering)
Testing locally with multiple nodes

1. Add dependencies

In your mix.exs, add both libcluster and swarm to your deps:

defp deps do
  [
    {:phoenix, "~> 1.7"},
    # … your other deps …
    {:libcluster, "~> 3.3"},
    {:swarm, "~> 3.4"}
  ]
end

Because both are listed in deps, Mix starts them automatically alongside your app. extra_applications is only needed for applications that are not dependencies, such as :logger and :runtime_tools:

def application do
  [
    mod: {SettleeasyApi.Application, []},
    extra_applications: [:logger, :runtime_tools]
  ]
end

Run:

mix deps.get

2. Configure libcluster for Fly.io

Fly.io gives each app an internal DNS domain that returns the private IPs of all your app’s VMs. We’ll use libcluster’s Cluster.Strategy.DNSPoll to discover peers every few seconds.

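In config/runtime.exs (not config/prod.exs, which is evaluated at compile time, when FLY_APP_NAME is not yet available):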

if config_env() == :prod do
  app = System.fetch_env!("FLY_APP_NAME")

  config :libcluster,
    topologies: [
      fly6pn: [
        strategy: Cluster.Strategy.DNSPoll,
        config: [
          query: "#{app}.internal",
          node_basename: app,
          polling_interval: 5_000
        ]
      ]
    ]
end
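To make the discovery step concrete, here is a rough sketch of what a DNS-poll strategy does on each tick: resolve the query, turn every returned IP into a node name of the form basename@ip, and skip the local node. DnsPollSketch and peers/3 are hypothetical names used only for illustration; they are not libcluster's API, and the real strategy also handles A records, errors, and the actual connect/disconnect calls.

```elixir
defmodule DnsPollSketch do
  # Hypothetical helper for illustration only; not part of libcluster.
  # `resolver` is injectable so the function can be exercised without DNS.
  def peers(query, node_basename, resolver \\ &resolve/1) do
    query
    |> resolver.()
    # :inet.ntoa/1 turns an IP tuple into its textual form (a charlist).
    |> Enum.map(fn ip -> :"#{node_basename}@#{:inet.ntoa(ip)}" end)
    # A node never needs to connect to itself.
    |> Enum.reject(&(&1 == node()))
  end

  # Look up AAAA records; Fly.io's private network addresses are IPv6.
  defp resolve(query) do
    :inet_res.lookup(String.to_charlist(query), :in, :aaaa)
  end
end

# Example with a stubbed resolver instead of a real DNS lookup:
DnsPollSketch.peers("app-name.internal", "app-name", fn _query -> [{10, 0, 0, 1}] end)
```

The practical consequence is that each node must itself be named basename@ip, or the names built from DNS will not match any running node. With Elixir releases this is commonly arranged by setting RELEASE_DISTRIBUTION=name and RELEASE_NODE in the release environment; how you do that depends on your release setup.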

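Fly.io sets FLY_APP_NAME automatically in each Machine's runtime environment, so the lookup above should work without extra configuration. If you prefer to pin it explicitly, add it to fly.toml:

[env]
  FLY_APP_NAME = "app-name"

(Replace "app-name" with whatever you named your Fly.io app.)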

3. Define and register your GenServer

Here’s a pattern for a basic named server:

defmodule App.Server do
  use GenServer

  # Public API
  def start_link(%{name: name}) do
    # Use a via‐tuple so Swarm can track us globally
    key = {:via, :swarm, name}

    case GenServer.start_link(__MODULE__, %{}, name: key) do
      {:ok, pid} ->
        {:ok, pid}
      {:error, {:already_started, _pid}} ->
        # If somebody else on the cluster already started it, do nothing
        :ignore
      {:error, reason} ->
        {:error, reason}
    end
  end

  

  # GenServer callbacks

  @impl true
  def init(_) do
    {:ok, []}
  end

  # … your handle_call, handle_cast, etc. …
end

Why this works

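{:via, :swarm, name} tells GenServer to register and look up the process through the :swarm module rather than the node-local process registry. Swarm tracks the name across the cluster, so a call addressed to the via tuple reaches the process on whichever node hosts it, and a second start_link for the same name returns {:error, {:already_started, pid}}, which the code above maps to :ignore.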
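The via-tuple mechanics can be tried without any dependencies. The sketch below swaps in :global, the registry that ships with OTP, as a stand-in for :swarm, purely so the snippet runs in a plain script; ViaDemo.Server and ping/1 are hypothetical names, and :global's behaviour under netsplits differs from Swarm's.

```elixir
defmodule ViaDemo.Server do
  use GenServer

  # Same start_link shape as App.Server, but registered through :global
  # (a stand-in for :swarm so this runs without extra dependencies).
  def start_link(%{name: name}) do
    case GenServer.start_link(__MODULE__, %{}, name: {:via, :global, name}) do
      {:ok, pid} -> {:ok, pid}
      # The name is already taken somewhere in the cluster: skip this child.
      {:error, {:already_started, _pid}} -> :ignore
      {:error, reason} -> {:error, reason}
    end
  end

  # Callers address the process by name; the registry resolves the pid.
  def ping(name), do: GenServer.call({:via, :global, name}, :ping)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:ping, _from, state), do: {:reply, {:pong, node()}, state}
end

# First start wins the name; the second attempt is ignored.
{:ok, pid} = ViaDemo.Server.start_link(%{name: :email_agent})
:ignore = ViaDemo.Server.start_link(%{name: :email_agent})
^pid = :global.whereis_name(:email_agent)
{:pong, _node} = ViaDemo.Server.ping(:email_agent)
```

Returning :ignore keeps the supervisor on the losing node from crashing, at the cost that this supervisor no longer tracks the child.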

4. Add your GenServer to your application.ex children

Supervisor.child_spec({App.Server, %{name: :email_agent}}, id: :email_agent)
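For context, here is a minimal sketch of how the whole application.ex might look. With libcluster 3.x the topologies from your config are not acted on by themselves; they are passed to a Cluster.Supervisor child in your own tree. The supervisor names SettleeasyApi.ClusterSupervisor and SettleeasyApi.Supervisor are assumptions, and your real children list will also contain your endpoint, repo, and so on.

```elixir
defmodule SettleeasyApi.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Read the topologies defined under `config :libcluster`; default to
    # none so dev and test boot without clustering.
    topologies = Application.get_env(:libcluster, :topologies, [])

    children = [
      # Start node discovery first so peers are connecting before the
      # cluster-wide process tries to register its name.
      {Cluster.Supervisor, [topologies, [name: SettleeasyApi.ClusterSupervisor]]},
      # … your other children (endpoint, repo, etc.) …
      Supervisor.child_spec({App.Server, %{name: :email_agent}}, id: :email_agent)
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: SettleeasyApi.Supervisor)
  end
end
```

Giving each child_spec its own id lets you start several App.Server instances, one per logical app, under the same supervisor.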

5. Deploying to Fly.io

Build & deploy as usual:
```
fly deploy
```
Fly will launch multiple VMs (you control count with fly scale count N).
Each VM boots, libcluster polls DNS for app-name.internal, discovers its peers, and forms a mesh.
Swarm notices the new nodes as they connect and syncs its registry with them.

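Each node now tries to start the GenServer on boot, but only one registration per name succeeds, so in a healthy, fully connected cluster a single :email_agent process runs somewhere in it. Two caveats apply. Swarm's registry is eventually consistent, so duplicates can exist briefly while nodes are still connecting or during a network partition. And because the other nodes returned :ignore, their supervisors will not start a replacement on their own if the node hosting the process goes down.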
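To check this on a running cluster, or locally with two named nodes joined via Node.connect/1, a few calls from an IEx session are usually enough. This is a sketch that assumes the :email_agent name used earlier; it is meant to be pasted into IEx on any node rather than run as a standalone script.

```elixir
# Peers this node is currently connected to (empty if clustering failed).
Node.list()

# Ask Swarm where the named process lives; it returns a pid or :undefined.
case Swarm.whereis_name(:email_agent) do
  :undefined -> IO.puts("email_agent is not registered anywhere yet")
  pid -> IO.puts("email_agent is running on #{node(pid)}")
end

# Every {name, pid} pair Swarm currently tracks across the cluster.
Swarm.registered()
```

Running the same calls on a second node should report the same hosting node for :email_agent.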

Conclusion

By combining:

libcluster for automatic node discovery,
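Swarm for a distributed process registry (handoff between nodes is also available, but only for processes registered through Swarm.register_name with a module, function, and arguments rather than a via tuple),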
{:via, :swarm, name} naming for your GenServers,

you get one process per name across the cluster during normal operation, without any node needing to know where that process lives. This pattern works from a couple of local nodes up to dozens of machines on Fly.io (or any other platform that supports Erlang clustering), as long as you plan for node failures and netsplits. Happy clustering!
