In December 2025 I went on a self-hosting spree at Farmako. Typesense Cloud first, then Metabase, then an observability stack to watch it all. The spree was triggered by a pricing page, and pricing pages make people do rash things.
Typesense on GKE
Typesense powers our medicine search. Someone types "dolo" and we need the right paracetamol brand in milliseconds, with typo tolerance, because people type drug names from memory of a doctor's handwriting. Typesense Cloud was fine technically. It was just expensive for what is a single stateful binary with a data directory.
The Kubernetes manifest is a StatefulSet with a single replica (we don't need HA; the search index rebuilds from Postgres in under a minute). The PVC uses an SSD-backed storage class (a custom class named ssd that provisions pd-ssd disks on GKE) because Typesense mmaps its index files and read latency directly affects p99 query latency:
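apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: typesense
spec:
  serviceName: typesense
  replicas: 1
  selector:                        # required; must match the pod template labels
    matchLabels:
      app: typesense
  template:
    metadata:
      labels:
        app: typesense
    spec:
      containers:
        - name: typesense
          image: typesense/typesense:27.1
          env:
            - name: TYPESENSE_API_KEY        # expanded by $(TYPESENSE_API_KEY) in args
              valueFrom:
                secretKeyRef:
                  name: typesense-secrets    # assumed Secret name
                  key: api-key               # assumed key within that Secret
          args:
            - --data-dir=/data
            - --api-key=$(TYPESENSE_API_KEY)
            - --memory-limit-mb=768
          resources:
            requests:
              memory: 512Mi
              cpu: 250m
            limits:
              memory: 1Gi
              cpu: 1000m
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ssd
        resources:
          requests:
            storage: 10Gi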
The --memory-limit-mb=768 flag is the one that matters. Without it, Typesense mmaps aggressively and the resident set grows until the kernel OOM-kills the process. With it, Typesense triggers internal compaction before hitting the limit. The container limit is set to 1Gi to give 256MB headroom above the Typesense limit for the OS page cache and process overhead.
The re-index job is a CronJob that runs nightly. The strategy is alias-based atomic swaps:
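import time

# Assumes `client` (a typesense.Client), `pg_conn` (a database connection whose
# execute() returns iterable rows) and `format_for_typesense` are defined
# elsewhere in the job.

# 1. Create new collection with timestamp suffix
new_name = f"medicines_{int(time.time())}"
client.collections.create({
    "name": new_name,
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "generic", "type": "string", "optional": True},
        {"name": "manufacturer", "type": "string", "facet": True},
        {"name": "mrp", "type": "float"},
        {"name": "in_stock", "type": "bool", "facet": True},
    ],
})

# 2. Bulk import from Postgres
rows = pg_conn.execute("SELECT * FROM medicines WHERE active = true")
documents = [format_for_typesense(row) for row in rows]
results = client.collections[new_name].documents.import_(documents, {"action": "create"})

# import_ reports per-document failures in its return value instead of raising,
# so stop before the swap if any document was rejected.
failed = [r for r in results if not r.get("success")]
if failed:
    raise RuntimeError(f"{len(failed)} documents failed to import into {new_name}")

# 3. Atomic alias swap
client.aliases.upsert("medicines", {"collection_name": new_name})

# 4. Drop old collections (including leftovers from earlier failed runs)
old_collections = [c for c in client.collections.retrieve()
                   if c["name"].startswith("medicines_") and c["name"] != new_name]
for old in old_collections:
    client.collections[old["name"]].delete()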
Search always queries the medicines alias. During re-index, the alias still points to the old collection. The swap is atomic from the query path's perspective. If the import fails, the alias is untouched.
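One variation on the cleanup step, sketched here as an illustration and not as what the job above does: choose the collections to drop with a small pure function that never returns the collection the alias currently points to, and that can optionally keep the most recent previous build around for a quick rollback. The function name collections_to_drop and its keep_previous parameter are hypothetical.

```python
from typing import Iterable, List


def collections_to_drop(
    names: Iterable[str],
    live: str,
    prefix: str = "medicines_",
    keep_previous: int = 0,
) -> List[str]:
    """Return timestamp-suffixed collections that are safe to delete.

    `live` is the collection the alias currently resolves to; it is never
    returned. `keep_previous` retains that many of the newest non-live
    collections as rollback targets.
    """
    # Only consider names of the form <prefix><integer timestamp>.
    candidates = [
        n for n in names
        if n.startswith(prefix) and n[len(prefix):].isdigit() and n != live
    ]
    # Newest first, so slicing skips the builds we want to keep.
    candidates.sort(key=lambda n: int(n[len(prefix):]), reverse=True)
    return candidates[keep_previous:]
```

The job would pass in the names from client.collections.retrieve() and delete whatever comes back. Names that merely share the prefix without a numeric suffix are left alone.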
Metabase self-hosted
Metabase prices per seat. A pharmacy operations team has a lot of people who need to look at one dashboard once a day. Self-hosted Metabase runs in a Deployment (not StatefulSet, because Metabase's state is in its own Postgres database, not local disk).
The migration from Metabase Cloud was a pg_dump of the application database, pg_restore into the self-hosted instance, and connection string updates. The tricky part was SAML SSO with Keycloak. Metabase's SAML integration expects specific attribute names: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress for email, http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname for first name. Keycloak's default SAML mappers use different URIs.
The fix is Keycloak client mapper configuration. For each attribute, you create a "SAML User Attribute" mapper with SAML Attribute NameFormat set to URI Reference and the SAML Attribute Name set to the full URI Metabase expects. The debugging was painful because Metabase silently ignores attributes that don't match its expected URIs. No error, no warning, just an empty user profile. I ended up intercepting the SAML response with a browser extension to verify the attribute names in the XML.
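The same check can be scripted instead of read by eye. The sketch below is a minimal example, assuming the response arrives over the HTTP-POST binding (plain base64, no deflate) and that the assertion is not encrypted. It lists the Attribute names in a captured SAMLResponse and reports which of the two URIs named above are missing. The function names are illustrative.

```python
import base64
import xml.etree.ElementTree as ET

SAML_ASSERTION_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
CLAIMS = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/"

# The two attribute URIs mentioned above; extend as needed.
EXPECTED = {CLAIMS + "emailaddress", CLAIMS + "givenname"}


def attribute_names(saml_response_b64: str) -> set:
    """Decode a base64 SAMLResponse and return the Name of every Attribute."""
    root = ET.fromstring(base64.b64decode(saml_response_b64))
    tag = f"{{{SAML_ASSERTION_NS}}}Attribute"
    return {el.get("Name") for el in root.iter(tag) if el.get("Name")}


def missing_attributes(saml_response_b64: str) -> set:
    """Return the expected attribute URIs that the response does not carry."""
    return EXPECTED - attribute_names(saml_response_b64)
```

Paste the SAMLResponse form field from the browser's network tab into missing_attributes; an empty set means both URIs are present in the response.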
Observability
The stack is Grafana, Prometheus, Loki, and Alertmanager. Grafana, Prometheus, and Alertmanager are deployed via the kube-prometheus-stack Helm chart; Loki is not bundled in that chart, so it is installed separately. Prometheus scrapes Typesense's /metrics endpoint (request latency histograms, index size, memory RSS, active connections) and Metabase's JMX metrics via a jmx_exporter sidecar.
The alert that justified the whole stack: a Prometheus rule on container_memory_working_set_bytes{container="typesense"} / on(pod) kube_pod_container_resource_limits{resource="memory"} > 0.8 fired three weeks after the migration. The Typesense process was climbing toward 1Gi because the --memory-limit-mb flag wasn't set yet (I added it after this incident). Grafana showed a steady upward ramp over 12 hours. We set the flag, deployed, and the memory curve flattened to a sawtooth pattern as compaction kicked in. No customer-visible impact.
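For a sense of how the 0.8 threshold sits relative to the two limits, here is the arithmetic as a small sketch. It assumes the flag's "MB" means MiB and that the rule divides by the 1Gi container limit from the manifest. The variable names are illustrative.

```python
MIB = 1024 * 1024

container_limit = 1024 * MIB   # limits.memory: 1Gi
typesense_limit = 768 * MIB    # --memory-limit-mb=768, assuming MB means MiB
alert_ratio = 0.8              # threshold in the Prometheus rule

# Working-set size at which the rule starts firing.
alert_threshold = alert_ratio * container_limit
# Room between the Typesense limit and the container limit.
headroom = container_limit - typesense_limit
# How far the working set can sit above the Typesense limit before alerting.
gap_before_alert = alert_threshold - typesense_limit

print(f"alert threshold: {alert_threshold / MIB:.1f} MiB")
print(f"container headroom: {headroom / MIB:.0f} MiB")
print(f"gap before alert: {gap_before_alert / MIB:.1f} MiB")
```

Under those assumptions the rule fires at roughly 819 MiB, about 51 MiB above the Typesense limit, so a working set that settles near 768 MiB stays under the alert as long as page cache and process overhead fit in that gap.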
Would I recommend self-hosting in general? It depends on team size. The cloud versions exist so you don't need a person who thinks about disks. We had one (me), so the math worked. If your infra person is also your only backend person and your only on-call, pay the SaaS bill.