Monday, July 07, 2025

Deploying systems with Fly.io

Liam Killingback

Introduction

NorthCape’s platform leverages Fly.io to deploy globally distributed, clustered services interconnected via Fly’s built-in private WireGuard mesh. By running applications as Fly Machines in multiple regions and peering them over an encrypted 6PN network, NorthCape achieves seamless communication between databases, APIs, and web frontends with minimal operational overhead(fly.io).

Architecture & Private Networking

Every Fly.io organization comes with an IPv6 “6PN” private network that allows Machines to resolve each other’s .internal hostnames and communicate over WireGuard. This zero-config VPN ensures that microservices and data stores talk privately without exposing ports to the public Internet(fly.io, fly.io). You can also create on-premise or cross-org WireGuard peers via fly wireguard create, downloading a preconfigured tunnel file to plug into your existing network(fly.io).

Sample private-service configuration (fly.toml)

```toml
app = "northcape-service"
primary_region = "iad"

[build]
  # build settings…

[[services]]
  internal_port = 4000
  protocol = "tcp"
  # enable autostop/autostart (see Scalability)
  auto_stop_machines = true
  auto_start_machines = true

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443
    tls_options = { alpn = ["h2", "http/1.1"], versions = ["TLSv1.2", "TLSv1.3"] }

  # DNS name within 6PN
  [services.concurrency]
    type = "requests"
    soft_limit = 20
    hard_limit = 25
```

Scalability & Auto-Scaling

Fly.io supports two primary autoscaling modes for NorthCape’s clusters:

  1. Proxy-based autostop/autostart: Fly Proxy watches concurrency per Machine, stops Machines when the app has excess capacity (load is below the soft limits), and starts them again on incoming requests. This lets NorthCape scale down to as few as one machine per region when traffic drops, keeping costs aligned with demand(fly.io, fly.io). You can also set min_machines_running so the app never scales below a threshold(community.fly.io).

  2. Metrics-based autoscaling: For more custom rules (e.g., queue depth or CPU usage), NorthCape can deploy Fly’s autoscaler app, which polls metrics from Prometheus, Redis, or other sources and dynamically creates or destroys Machines based on defined thresholds(fly.io).
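As a rough sketch of the proxy-based mode, the settings named above might be combined in fly.toml as shown here. The port and the min_machines_running value are illustrative assumptions, not NorthCape's actual settings.

```toml
# Illustrative sketch; values are assumptions, not production settings.
[[services]]
  internal_port = 4000
  protocol = "tcp"
  auto_stop_machines = true    # let Fly Proxy stop Machines when capacity exceeds demand
  auto_start_machines = true   # let Fly Proxy start stopped Machines when requests arrive
  min_machines_running = 1     # floor on running Machines (assumed value)
```

Fly's documentation describes min_machines_running as applying to the primary region, so it is worth checking how the floor behaves for apps spread across several regions.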

Load Balancing

Fly Proxy provides global Anycast edge load balancing, routing each request to the closest, least-loaded Machine based on RTT measurements and concurrency settings. Within each region, it follows this strategy:

  • Below soft limit → route normally
  • Between soft & hard limits → deprioritize until others fill up
  • At or above hard limit → no new requests are sent to that Machine; clients may see 503 responses if no other Machine has capacity(fly.io, fly.io)

Cross-region routing only occurs if the local region is at hard capacity or unhealthy, ensuring localized performance by default.

Region Deployment & Global Distribution

Fly.io currently runs in 35 regions worldwide—from Tokyo to São Paulo—connected via Anycast for ultra-low latency. NorthCape specifies a primary_region in fly.toml (e.g. "iad"), and can add or remove regions on the fly with:

fly scale count <num> --region <region-code>

This lets NorthCape bring services closer to end users or comply with data sovereignty requirements in minutes(fly.io, fly.io).

Ease of Deployment

Getting started is as simple as:

  1. fly launch — scaffolds a new fly.toml, prompts for app name, region, and port using Fly Launch’s interactive UI(fly.io).
  2. fly deploy — builds the Docker image (or Nixpacks), pushes it, and spins up Machines across configured regions in one command(fly.io).
  3. fly status / fly logs provide real-time insights into running Machines, health checks, and autoscaling events.

Security Benefits

  • WireGuard Encryption: All east–west traffic between Machines travels over WireGuard, which uses Curve25519 for key exchange and ChaCha20-Poly1305 for authenticated encryption(en.wikipedia.org).
  • Private Networking: Services on the 6PN mesh are never exposed publicly unless explicitly configured, reducing attack surface(fly.io).
  • TLS Termination: For north–south traffic, Fly Proxy handles Let’s Encrypt-managed certificates and terminates TLS at the edge, then forwards the decrypted request to backend Machines over the WireGuard-encrypted mesh(fly.io).
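To illustrate the private-networking point, here is a minimal sketch of an internal-only app. It assumes that leaving out the [[services]] and [http_service] sections keeps the app off the public proxy; the app names, port, and environment variables are hypothetical.

```toml
# Hypothetical internal-only app: with no [[services]] or [http_service]
# section, Fly Proxy has nothing to route public traffic to.
app = "northcape-internal-api"   # hypothetical app name
primary_region = "iad"

[env]
  # The process should listen on an IPv6 address to be reachable over 6PN
  # (port 4000 is an assumption).
  PORT = "4000"
  # Peers are addressed by .internal hostname; "northcape-db" is a hypothetical app.
  DATABASE_HOST = "northcape-db.internal"
```

Under these assumptions, other Machines in the same organization would reach this app at northcape-internal-api.internal:4000, while nothing is published on the public Internet.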

Configuration Options & Best Practices

  • Service Concurrency: Tune soft_limit and hard_limit under [services.concurrency] to match application throughput requirements.
  • Auto-Stop/Start: Enable auto_stop_machines = true and auto_start_machines = true for cost-efficient, request-based scaling. Use min_machines_running to maintain high availability during low traffic periods.
  • Regions: Set primary_region to the default region for new Machines and use --regions with fly deploy to target multiple locations.
  • Health Checks: Configure HTTP or TCP health checks under [[services]] (as [[services.http_checks]] or [[services.tcp_checks]]) so that Fly Proxy routes only to healthy Machines.
  • Environment Management: Use multiple fly.toml files or environment overrides (e.g., staging vs production) to spin up per-environment clusters quickly.
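The practices above could come together in a per-environment file along these lines. This is a sketch only: the file name, app name, health-check path, and every numeric value are assumptions chosen for illustration.

```toml
# fly.staging.toml: hypothetical per-environment config
app = "northcape-service-staging"   # hypothetical app name
primary_region = "iad"

[[services]]
  internal_port = 4000
  protocol = "tcp"
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0   # assumed: staging may scale to zero when idle

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [services.concurrency]
    type = "requests"
    soft_limit = 20
    hard_limit = 25

  # HTTP health check; "/health" is a hypothetical endpoint
  [[services.http_checks]]
    interval = "10s"
    timeout = "2s"
    grace_period = "5s"
    method = "get"
    path = "/health"
```

A file like this can be selected at deploy time with fly deploy --config fly.staging.toml, leaving the default fly.toml for production.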

With Fly.io’s global network, out-of-the-box WireGuard mesh, autoscaling, and edge load balancing, NorthCape can confidently deploy resilient, secure, and cost-optimized clusters spanning databases, APIs, and web frontends—all with minimal DevOps overhead.

Frequently Asked Questions

Do you build custom AI integrations?

Yes, we specialize in creating tailored AI solutions to meet your specific needs. Examples include chatbots, recommendation systems, and predictive analytics.

What industries do you serve?

We serve a wide range of industries, including healthcare, finance, retail, and anything else that requires tech, which is basically everyone.

How do you ensure data security?

We implement industry-standard security measures, including encryption and access controls, to protect your data. We recommend all our builds follow an on-shore private cloud network, ensuring compliance with data protection regulations.

Do you build full-stack applications?

Yes, we have expertise in building full-stack applications using modern frameworks and technologies. Our go-to stack is Phoenix/Elixir for fast, highly scalable concurrent applications that can be deployed and maintained easily.