A digital drawing of a Gopher storing stuff in a distributed database.

Distributed SQLite with LiteFS

Using distributed SQLite and Go in the cloud with LiteFS.

Last updated November 2022 for Go 1.19.

SQLite? Distributed?!

You heard me right. We’re taking SQLite out of the realm of embedded databases and into the clouds of the world wide web. ☁️🕸 (Embrace mixing metaphors.)

In the article Go and SQLite in the Cloud I showed you how to get up and running with a Go web app using SQLite as the storage backend. This article builds on that, so consider reading it first if you haven’t already.

In this article, I’ll show you how to extend the sqlite-app to run in multiple cloud edge regions. Why is that cool? Because then we can have our app running really close to our users. In combination with Go and SQLite, we can make things really freakin’ fast for them. 🏃💨 So they can get to read your sweet articles ASAP.

We’ll be using fly.io again, but you could probably adapt it to your favourite cloud provider by doing a few configuration and code changes.

We’ll also be using LiteFS. It’s important to note that LiteFS is still in beta, so probably shouldn’t be used for production workloads quite yet. I’ll try to keep this example up to date with any changes in the beta.

We’ll learn about:

  • using LiteFS in a primary + read-replicas setup,
  • forwarding writes to the primary, and
  • tradeoffs involved (including eventual consistency and write latency).

And a nice BONUS: A few short questions to and answers from Ben Johnson, creator of LiteFS, at the end of the article. 😎

Like last time, I’ll show you the relevant parts of the app on this page. Check out all the code of the litefs-app on GitHub, and see the online demo here. There’s one demo instance running in Frankfurt, Germany ("fra" region) and one running close to New York, USA ("ewr" region).

What’s LiteFS?

In short, LiteFS is a system that sits between your app and the SQLite database files, and enables nifty things like read replicas and changing which app instance is the primary (the only one that can write to the database).

It does this by mounting a FUSE filesystem where the app accesses the database. It then basically finds database transactions and ships those to read replicas. If you want more details than this, head over to the official "how it works" page.

LiteFS currently doesn’t do write forwarding (that’s being worked on), so we’ll be using a feature in Fly to do HTTP request replays with a special fly-replay HTTP header for all POST requests, which are the only ones that do writes.

Being a primary

One of your app instances is going to be the primary. Which one exactly is determined through a shared lock (in a system called Consul) which has a timeout. The instance that holds the lock is the primary. If the primary needs to go offline for a reboot or similar, it can release the lock, and another instance can take it. If the instance unexpectedly crashes, the timeout ensures that another instance can take the lock shortly after and become the primary.

How does LiteFS communicate this to your app? Through the filesystem, like SQLite. If there’s a special file called .primary in the directory where your database is FUSE-mounted, the instance is currently a replica. The file content is a reference to the primary and can be used to communicate with it.

Bonus: there’s another special file, app.db-pos, which contains the database’s current replication position (a transaction ID and checksum).

To make this easy to use in the app, I’ve added this method directly to the database:

sql/database.go
package sql

// …

// GetPrimary instance name if this is a replica, otherwise the empty string.
func (d *Database) GetPrimary() (string, error) {
	basePath := strings.TrimPrefix(d.baseURL, "file:")
	primaryPath := filepath.Join(filepath.Dir(basePath), ".primary")
	primary, err := os.ReadFile(primaryPath)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return "", nil
		}
		return "", err
	}
	return strings.TrimSpace(string(primary)), nil
}

Why do we need to know whether the current app instance is the primary? Because only the primary can write to the database. Luckily, we’ve made it easy for ourselves in our app, because everything that can change state goes through HTTP POST requests. So we can just create some middleware to check whether the incoming request is a POST request, check whether the current instance is a replica, and redirect to the primary with the fly-replay HTTP header if so.

To make it easier to view data from a specific region, there’s also middleware to check for an HTTP URL query parameter (unsurprisingly called region), and redirect to it if the current instance is not in that region.

http/middleware.go
package http

import (
	"log"
	"net/http"
)

type Middleware = func(http.Handler) http.Handler

// RedirectRegion using the fly-replay HTTP header if "region" is set in the URL query params.
func RedirectRegion(currentRegion string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if region := r.URL.Query().Get("region"); region != "" && region != currentRegion {
				w.Header().Set("fly-replay", "region="+region)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

type primaryGetter interface {
	GetPrimary() (string, error)
}

// RedirectToPrimary if the request is POST and this is not the primary instance.
func RedirectToPrimary(db primaryGetter, log *log.Logger) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// We don't need to do anything if this is not a POST request
			if r.Method != http.MethodPost {
				next.ServeHTTP(w, r)
				return
			}

			// If region is forced in the URL query params, don't do anything
			if region := r.URL.Query().Get("region"); region != "" {
				next.ServeHTTP(w, r)
				return
			}

			primary, err := db.GetPrimary()
			if err != nil {
				log.Println("Error getting primary:", err)
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}

			// If primary is not empty, redirect
			if primary != "" {
				w.Header().Set("fly-replay", "instance="+primary)
				return
			}

			next.ServeHTTP(w, r)
		})
	}
}

The current region is set in the FLY_REGION environment variable and passed to the HTTP server at app startup. The middleware is then applied like this:

http/routes.go
package http

import (
	"github.com/go-chi/chi/v5"
	"github.com/go-chi/chi/v5/middleware"
)

func (s *Server) setupRoutes() {
	s.mux.Use(middleware.Recoverer, middleware.Compress(5))
	s.mux.Use(RedirectRegion(s.region), RedirectToPrimary(s.database, s.log))

	s.mux.Group(func(r chi.Router) {
		r.Use(middleware.SetHeader("Content-Type", "text/html; charset=utf-8"))

		Home(r, s.log, s.database, s.region)
		Articles(r, s.log, s.database)
		NewArticle(r, s.log, s.database)
	})

	Migrate(s.mux, s.database)
}
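
For completeness, reading FLY_REGION at startup could look something like this. It’s a self-contained sketch rather than the actual cmd/server code in litefs-app, and the stand-in server type and handler are made up for illustration:

package main

import (
	"log"
	"net/http"
	"os"
)

// server is a stand-in for the app's real Server type, which also holds
// the database and the router.
type server struct {
	region string
	log    *log.Logger
}

func main() {
	logger := log.New(os.Stdout, "", log.LstdFlags)

	// FLY_REGION is set by the Fly.io runtime, for example "fra" or "ewr".
	region := os.Getenv("FLY_REGION")
	if region == "" {
		region = "local" // running outside Fly, for example during development
	}

	s := &server{region: region, log: logger}

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("serving from region " + s.region))
	})

	s.log.Println("starting server in region", s.region)
	s.log.Fatal(http.ListenAndServe(":8080", mux))
}

In litefs-app, the equivalent wiring lives in cmd/server, where the region is passed to the server together with the database and logger.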

And don’t worry: If something goes wrong and you’re trying to write to the database on a replica, you’ll get an error:

$ http post 'https://litefs-app.fly.dev/new?region=ewr' title=foo content=bar
HTTP/1.1 500 Internal Server Error
content-encoding: gzip
content-length: 399
content-type: text/html; charset=utf-8
date: Wed, 23 Nov 2022 12:24:53 GMT
fly-request-id: 01GJJ75F6RRAV9AVC2KQD2C5T6-fra
server: Fly/b76dc087 (2022-11-18)
vary: Accept-Encoding
via: 1.1 fly.io
Meanwhile, the app logs on the replica show the underlying cause:

2022-11-23T12:24:53Z app[8c94534f] ewr [info]fuse: write(): wal error: read only replica
2022-11-23T12:24:53Z app[8c94534f] ewr [info]2022/11/23 12:24:53 articles.go:99: Error creating article: disk I/O error

Running LiteFS

LiteFS runs as a separate process that sits between your app and its storage; in this setup, it’s actually LiteFS that starts your app as a subprocess. It’s easy to set up with just a small change to the Dockerfile and a configuration file called litefs.yml:

Dockerfile
FROM flyio/litefs:0.3.0-beta5 AS litefs

FROM golang AS builder

WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . ./
RUN GOOS=linux GOARCH=amd64 go build -tags "sqlite_fts5 sqlite_foreign_keys" -ldflags="-s -w" -o /bin/server ./cmd/server

FROM debian:bullseye-slim AS runner

WORKDIR /app

RUN mkdir -p /data /mnt/data

RUN set -x && apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y ca-certificates sqlite3 fuse && \
    rm -rf /var/lib/apt/lists/*

ADD litefs.yml /etc/litefs.yml
COPY --from=litefs /usr/local/bin/litefs ./
COPY --from=builder /bin/server ./

CMD ["./litefs", "mount"]
litefs.yml
# The path to where the SQLite database will be accessed.
mount-dir: "/data"

# The path to where the underlying volume mount is.
data-dir: "/mnt/data"

# Execute this subprocess once LiteFS connects to the cluster.
exec: "/app/server"

# These environment variables will be available in your Fly.io application.
# You must specify "experimental.enable_consul" for FLY_CONSUL_URL to be available.
consul:
  url: "${FLY_CONSUL_URL}"
  advertise-url: "http://${HOSTNAME}.vm.${FLY_APP_NAME}.internal:20202"

For this to work, there are also a few small changes in the fly.toml configuration file. We need to enable Consul (for the shared lock) and mount the underlying persistent volume somewhere other than /data, because that’s where the FUSE mount will be.

fly.toml
# fly.toml file generated for litefs-app on 2022-11-22T10:42:41+01:00

app = "litefs-app"
kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[env]
  DATABASE_URL = "file:/data/app.db"

[experimental]
  allowed_public_ports = []
  auto_rollback = true
  enable_consul = true

[mounts]
  destination = "/mnt/data"
  source = "data"

[[services]]
  http_checks = []
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

…and that’s really all there is to it! LiteFS now takes care of replication from the primary to replicas as well as primary selection, and all you need to do is make sure you never change state outside of POST requests. (If you need PUT/PATCH/DELETE/WHATEVER, just change the middleware.)
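
If you do want to use other write methods, the POST check in RedirectToPrimary could be generalized with a small helper along these lines (just a sketch, not part of litefs-app as it is today):

package http

import "net/http"

// isWrite reports whether the request method may change state and therefore
// must be handled by the primary. Adjust the set to match your app's routes.
func isWrite(r *http.Request) bool {
	switch r.Method {
	case http.MethodPost, http.MethodPut, http.MethodPatch, http.MethodDelete:
		return true
	default:
		return false
	}
}

The middleware’s check for http.MethodPost then simply becomes a check for !isWrite(r).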

Tradeoffs

After running fly launch and deploying your new app, you’ve got a globally distributed, SQLite-backed cloud app. Pretty cool, ay? 😄

But there are always tradeoffs. This setup is simpler than most distributed databases and super duper fast, but you need to be aware of a few things:

  • This replication type is what’s often called asynchronous replication, which is eventually consistent. What does that mean?

    It means that once changes are written to the primary, there’s a (hopefully small) delay for those changes to be propagated to the read replicas. So if you’re unlucky, the subsequent read (from a replica) just after a write (on the primary) will not have the write yet, leaving the user with a confusing, inconsistent world view. (WHERE’S MY ARTICLE?!, they will hopefully not yell at you.)

    There are ways around this, including something called sticky cookies (where you set a cookie in the client’s browser so they read from the primary until the state change has been replicated to their local replica), or not acknowledging the write on the primary until all replicas have received the change. Those are mostly outside the scope of this article (DuckDuckGo is your friend!), but there’s a rough sketch of the cookie approach right after this list.

  • Your users’ writes will be slower than their reads if the primary is not the app instance closest to them, because light is so slow. Come on, light, move it! 🌞😄

  • Currently, if things go wrong in LiteFS due to a bug or unforeseen circumstance, there probably aren’t a lot of people in the world who can help you. This will get better over time if/when LiteFS sees more usage, but it’s something to be aware of at the moment.
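
To make the sticky cookie idea a little more concrete, here’s a rough sketch of what such a middleware could look like. It’s not part of litefs-app; the cookie name and the five-second lifetime are made up, it reuses the Middleware and primaryGetter types from middleware.go above, and a production version would check the actual replication position instead of guessing at a delay:

package http

import (
	"net/http"
	"time"
)

// stickyCookie is a hypothetical cookie marking that this client recently
// wrote to the database and should read from the primary for a short while.
const stickyCookie = "read-from-primary"

// SetSticky is called after a successful write on the primary.
func SetSticky(w http.ResponseWriter) {
	http.SetCookie(w, &http.Cookie{
		Name:    stickyCookie,
		Value:   "1",
		Expires: time.Now().Add(5 * time.Second), // a guess at replication lag
		Path:    "/",
	})
}

// StickToPrimary replays reads to the primary while the cookie is present.
func StickToPrimary(db primaryGetter) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if _, err := r.Cookie(stickyCookie); err != nil {
				// No cookie, so serve the read locally.
				next.ServeHTTP(w, r)
				return
			}
			primary, err := db.GetPrimary()
			if err != nil || primary == "" {
				// On error, or if this instance is the primary, serve locally.
				next.ServeHTTP(w, r)
				return
			}
			w.Header().Set("fly-replay", "instance="+primary)
		})
	}
}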

Those tradeoffs are worth it for a lot of read-heavy web apps out there, especially a blog app like this one. Whether they’re worth it for your particular app is something to consider carefully.

Bonus: Three questions for Ben

I reached out to Ben Johnson, tech lead on LiteFS, with a few questions about LiteFS and its future. He graciously accepted the request. Thank you, Ben! ❤️

Markus: Why use distributed SQLite instead of a more common setup like Postgres + read replicas?

Ben: Distributed Postgres can be a great option but I see a few trade-offs with it. First, Postgres can be resource intensive. You typically need larger machines than what SQLite requires and that gets expensive once you start to have multiple regions. You also need to manage which application nodes connect to which Postgres nodes and how they are meant to failover.

Second, built-in Postgres replication doesn’t support more modern deployment strategies where nodes are ephemeral so you typically need third-party support with something like Stolon for high availability and wal-g for high durability. LiteFS is built for dynamic clusters and we’ll have built-in high durability features coming soon.

Finally, SQLite has the benefit of moving data close to the application. Like, really close. Query latency overhead against Postgres within the same region can be as high as one millisecond, whereas latency from an application to SQLite is typically closer to 10-20 microseconds. That difference means that N+1 query performance issues aren’t typically a problem with SQLite like they are with client/server databases.

Markus: Why did you choose async replication for LiteFS, and what are your thoughts on the tradeoffs involved for an average web app?

Ben: We chose to support async replication initially because synchronous replication is difficult, if not impossible, to implement in a dynamic cluster since it can’t support distributed consensus. We are adding synchronous replication soon; however, it’ll require an external, high-durability service such as S3 to act as a data authority.

Synchronous replication can have high throughput costs for a serializable database like SQLite though so I think a better balance would be supporting async replication with an upper time bound for unreplicated writes. For example, a primary node could stop accepting writes if it hasn’t been able to replicate to S3 for more than a few seconds.

Markus: What’s in the future for LiteFS and distributed SQLite in general?

Ben: Right now we’re in a phase of focusing on stability and correctness with LiteFS and that’s been going well. The biggest near term feature will be replicating to external storage such as S3. That should make LiteFS easier to run in a production environment while knowing that data is stored with extremely high durability.

More long term, we plan to release WASM-based clients for purely ephemeral services such as Vercel & Deno. LiteFS’ architecture lends itself to being able to have thin clients that page in data on-demand from a point-in-time snapshot by using a transaction ID. SQLite does limit write concurrency so this architecture doesn’t work well for a single database. However, SQLite databases are just files so they’re very lightweight. A SaaS service could isolate their data so that each customer has their own SQLite database which has security & compliance benefits.

I’m looking forward to following the development of LiteFS. Thanks again Ben!

