Monday, July 07, 2025

Deploying systems with Fly.io

Liam Killingback


Introduction

NorthCape’s platform uses Fly.io to deploy globally distributed, clustered services connected by Fly’s built-in private WireGuard mesh. By running applications as Fly Machines in multiple regions and connecting them over the encrypted 6PN network, NorthCape lets databases, APIs, and web frontends communicate privately with minimal operational overhead (fly.io).

Architecture & Private Networking

Every Fly.io organization comes with an IPv6 “6PN” private network that lets Machines resolve each other’s .internal hostnames and communicate over WireGuard. Because this VPN needs no setup, microservices and data stores can talk privately without exposing ports to the public Internet (fly.io, fly.io). You can also create on-premises or cross-org WireGuard peers with fly wireguard create, which produces a preconfigured tunnel file to plug into your existing network (fly.io).
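To make the .internal naming concrete, here is a minimal Python sketch of how a service might look up its peers over 6PN. The app name northcape-db and both helper functions are hypothetical, and the lookup only succeeds from inside the private network (on a Machine or through a WireGuard peer).

```python
import socket
from typing import List, Optional


def internal_hostname(app: str, region: Optional[str] = None) -> str:
    """Build a 6PN name: <app>.internal, or <region>.<app>.internal."""
    return f"{region}.{app}.internal" if region else f"{app}.internal"


def resolve_internal(app: str, port: int, region: Optional[str] = None) -> List[str]:
    """Return the private IPv6 addresses behind an .internal name.

    Assumption: this runs inside the private network; elsewhere the
    lookup raises socket.gaierror.
    """
    host = internal_hostname(app, region)
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the address string.
    return sorted({info[4][0] for info in infos})
```

From a Machine, resolve_internal("northcape-db", 5432) would list every private address registered for that app, while passing region="iad" narrows the lookup to one region.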

Sample private-service configuration (fly.toml)

```toml
app = "northcape-service"
primary_region = "iad"

[build]
# build settings…

[[services]]
internal_port = 4000
protocol = "tcp"
# enable autostop/autostart (see Scalability)
auto_stop_machines = true
auto_start_machines = true

[[services.ports]]
handlers = ["tls", "http"]
port = 443
tls_options = { alpn = ["h2", "http/1.1"], versions = ["TLSv1.2", "TLSv1.3"] }

# DNS name within 6PN
[services.concurrency]
type = "requests"
soft_limit = 20
hard_limit = 25
```

Scalability & Auto-Scaling

Fly.io supports two primary autoscaling modes for NorthCape’s clusters:

Proxy-based autostop/autostart: Fly Proxy tracks concurrency per Machine, stops Machines when the app has more capacity than current traffic needs, and starts them again when requests arrive. This lets NorthCape scale down to as few as one machine per region when traffic drops, keeping costs aligned with demand (fly.io, fly.io). You can also set min_machines_running so the app never scales below a threshold (community.fly.io).
Metrics-based autoscaling: For more custom rules (e.g., queue depth or CPU usage), NorthCape can deploy Fly’s autoscaler app, which polls metrics from Prometheus, Redis, or other sources and dynamically creates or destroys Machines based on defined thresholds (fly.io).
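The metrics-based mode boils down to turning a metric into a target Machine count. The Python sketch below illustrates that calculation for a queue-depth rule. It is not the autoscaler app's implementation; the function name, parameters, and default bounds are all assumptions made for illustration.

```python
import math


def desired_machines(
    queue_depth: int,
    jobs_per_machine: int,
    min_machines: int = 1,
    max_machines: int = 10,
) -> int:
    """Return how many Machines a queue-depth rule would ask for.

    Hypothetical rule: one Machine per `jobs_per_machine` queued jobs,
    clamped to the range [min_machines, max_machines].
    """
    if jobs_per_machine <= 0:
        raise ValueError("jobs_per_machine must be positive")
    wanted = math.ceil(queue_depth / jobs_per_machine)
    return max(min_machines, min(max_machines, wanted))
```

The clamp matters in both directions: the lower bound keeps a baseline available when the queue is empty, and the upper bound caps spend during a backlog.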

Load Balancing

Fly Proxy provides global Anycast edge load balancing, routing each request to the closest, least-loaded Machine based on RTT measurements and concurrency settings. Within each region, it follows this strategy:

Below soft limit → route normally
Between soft & hard limits → deprioritize until others fill up
At or above hard limit → no new traffic is routed to that Machine until its load drops (fly.io, fly.io)

Cross-region routing only occurs if the local region is at hard capacity or unhealthy, ensuring localized performance by default.
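The routing rules described in this section can be modeled in a few lines of Python. This is a simplified sketch of the strategy as stated here, not Fly Proxy's actual algorithm: it ignores RTT measurements, and the Machine class and pick_machine function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Machine:
    id: str
    region: str
    load: int  # current concurrent requests
    healthy: bool = True


def pick_machine(
    machines: Iterable[Machine],
    local_region: str,
    soft_limit: int = 20,
    hard_limit: int = 25,
) -> Optional[Machine]:
    """Pick a target Machine, preferring the local region."""
    pool = list(machines)

    def best(candidates):
        # Machines at the hard limit or failing checks get no traffic.
        usable = [m for m in candidates if m.healthy and m.load < hard_limit]
        if not usable:
            return None
        # Machines under the soft limit sort first (False < True),
        # then the least-loaded Machine wins.
        return min(usable, key=lambda m: (m.load >= soft_limit, m.load))

    local = best(m for m in pool if m.region == local_region)
    if local is not None:
        return local
    # Fall back to other regions only when the local region has nothing usable.
    return best(m for m in pool if m.region != local_region)
```

Sorting on the tuple (over soft limit, load) is what makes a Machine between the soft and hard limits a last resort within its region rather than an excluded one.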

Region Deployment & Global Distribution

Fly.io currently runs in 35 regions worldwide, from Tokyo to São Paulo, with Anycast routing users to a nearby region to keep latency low. NorthCape specifies a primary_region in fly.toml (e.g. "iad"), and can add or remove Machines in a region on the fly with:

fly scale count <num> --region <region-code>

This lets NorthCape bring services closer to end users or comply with data sovereignty requirements in minutes (fly.io, fly.io).

Ease of Deployment

Getting started is as simple as:

fly launch: scaffolds a new fly.toml, prompting for app name, region, and port using Fly Launch’s interactive UI (fly.io).
fly deploy: builds the Docker image (or Nixpacks), pushes it, and spins up Machines across configured regions in one command (fly.io).
fly status / fly logs: provide real-time insights into running Machines, health checks, and autoscaling events.

Security Benefits

WireGuard Encryption: All east–west traffic between Machines travels over WireGuard, which uses Curve25519 for key exchange and ChaCha20-Poly1305 for authenticated encryption (en.wikipedia.org).
Private Networking: Services on the 6PN mesh are never exposed publicly unless explicitly configured, reducing attack surface (fly.io).
TLS Termination: For north–south traffic, Fly Proxy terminates TLS at the edge using Let’s Encrypt-managed certificates, then forwards the decrypted request to backend Machines over the encrypted WireGuard mesh (fly.io).

Configuration Options & Best Practices

Service Concurrency: Tune soft_limit and hard_limit under [services.concurrency] to match application throughput requirements.
Auto-Stop/Start: Enable auto_stop_machines = true and auto_start_machines = true for cost-efficient, request-based scaling. Use min_machines_running to keep a minimum number of Machines running during low-traffic periods; note that this setting applies to the primary region.
Regions: Set primary_region as the default region for new Machines, add Machines in other regions with fly scale count --region, and pass --regions to fly deploy to limit a deploy to specific regions.
Health Checks: Configure HTTP or TCP health checks under [[services]] (for example [[services.http_checks]] or [[services.tcp_checks]]) so that Fly Proxy only routes to healthy Machines.
Environment Management: Use multiple fly.toml files or environment overrides (e.g., staging vs production) to spin up per-environment clusters quickly.

With Fly.io’s global network, out-of-the-box WireGuard mesh, autoscaling, and edge load balancing, NorthCape can confidently deploy resilient, secure, and cost-optimized clusters spanning databases, APIs, and web frontends, all with minimal DevOps overhead.
