Anurag Gupta

ConOps: I wanted Argo CD without the Kubernetes

2026-02-06

I spend my workdays inside Kubernetes, and the thing I miss most when I leave it is Argo CD. Push to Git, the cluster converges, drift gets corrected, done. Then I'd go home to my own servers, which run plain Docker Compose, and deployment was "ssh in and run docker compose up like a caveman."

The gap annoyed me enough that over a few weeks in January and February I built ConOps: point it at a Git repo containing a compose file and it clones, deploys, polls for new commits, and reconciles. One Go binary. No cluster.

The reconciliation loop

The core is a goroutine running an infinite loop with a configurable tick interval (default 30 seconds):

func (c *Controller) Run(ctx context.Context) error {
    ticker := time.NewTicker(c.interval)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            return nil
        case <-ticker.C:
            if err := c.reconcile(ctx); err != nil {
                c.logger.Error("reconcile failed", "err", err)
            }
        }
    }
}

Each reconcile call does four things:

  1. Pull the local clone (via go-git, in-process; go-git's pull is fast-forward only and has no rebase mode)
  2. Parse the compose file into a normalized service map
  3. Hash each service's config and compare against the stored hashes from the last apply
  4. For any mismatch: stop the old container, remove it, recreate from the compose definition

State is stored in SQLite (embedded via modernc.org/sqlite, pure Go, no CGO). Each reconciliation writes a record with the commit SHA, per-service config hashes, and the outcome. This gives you a full deployment history.
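Steps 3 and 4 boil down to a map comparison. Here is a stripped-down sketch of that idea, not the real ConOps code: planRecreates and the two maps (service name to config hash) are names invented for illustration, and it ignores services that were deleted from the compose file.

```go
package main

import (
	"fmt"
	"sort"
)

// planRecreates returns the services whose desired config hash differs from
// the hash recorded at the last apply, or that have no recorded hash at all.
// The result is sorted so the plan is deterministic.
func planRecreates(desired, applied map[string]string) []string {
	var out []string
	for name, want := range desired {
		if got, ok := applied[name]; !ok || got != want {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	desired := map[string]string{"web": "h2", "db": "h1", "cache": "h9"}
	applied := map[string]string{"web": "h1", "db": "h1"}
	// web changed, cache is new, db is untouched.
	fmt.Println(planRecreates(desired, applied)) // [cache web]
}
```

Only the services in the returned list get stopped and recreated; everything else keeps running.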

Config hashing for drift detection

The hardest single problem was knowing whether a container is "different." You can docker inspect a running container, but the output includes runtime-generated fields (container ID, created timestamp, network endpoint IDs) that change on every docker run. Comparing full inspect output gives phantom drift on every iteration.

The solution is to hash only the fields that define desired state:

type ServiceConfig struct {
    ImageDigest string            `json:"image_digest"`
    Env         map[string]string `json:"env"`
    Ports       []string          `json:"ports"`
    Volumes     []string          `json:"volumes"`
    Labels      map[string]string `json:"labels"`
    Networks    []string          `json:"networks"`
    Resources   Resources         `json:"resources"`
}

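func (s *ServiceConfig) Hash() string {
    // encoding/json already writes map keys in sorted order, so Env and
    // Labels serialize deterministically. Slices keep their order, so
    // Ports, Volumes, and Networks have to be sorted before this call.
    // The error is dropped on the assumption that Resources holds only
    // plain scalar fields, which json.Marshal cannot fail on.
    data, _ := json.Marshal(s)
    h := sha256.Sum256(data)
    return hex.EncodeToString(h[:])
}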

The image field uses the digest (sha256:abc123...), not the tag (latest), because tags are mutable. ConOps resolves the tag to a digest via the Docker registry API (HEAD /v2/{name}/manifests/{tag} with Accept: application/vnd.docker.distribution.manifest.v2+json). This means a docker push to the same tag triggers a redeploy on the next reconciliation, which is the behavior you want.
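For the curious, the lookup is roughly the following. This is a sketch under assumptions, not the ConOps source: resolveDigest and demoResolve are made-up names, it reads the digest from the Docker-Content-Digest response header (which the registry API defines), and it skips the bearer-token handshake that real registries usually require. Multi-arch images are served under different media types, which this sketch also ignores.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// resolveDigest asks a registry which digest a tag currently points to.
// baseURL, name, and tag are illustrative parameters.
func resolveDigest(ctx context.Context, client *http.Client, baseURL, name, tag string) (string, error) {
	url := fmt.Sprintf("%s/v2/%s/manifests/%s", baseURL, name, tag)
	req, err := http.NewRequestWithContext(ctx, http.MethodHead, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Accept", "application/vnd.docker.distribution.manifest.v2+json")
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("registry returned %s", resp.Status)
	}
	digest := resp.Header.Get("Docker-Content-Digest")
	if digest == "" {
		return "", fmt.Errorf("no Docker-Content-Digest header for %s:%s", name, tag)
	}
	return digest, nil
}

// demoResolve runs resolveDigest against a stand-in registry so the sketch
// works without network access.
func demoResolve() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodHead || r.URL.Path != "/v2/library/nginx/manifests/latest" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Docker-Content-Digest", "sha256:0123abcd")
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	return resolveDigest(context.Background(), srv.Client(), srv.URL, "library/nginx", "latest")
}

func main() {
	digest, err := demoResolve()
	fmt.Println(digest, err)
}
```

A HEAD request keeps the check cheap: no manifest body is downloaded, only the headers.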

Labels are filtered to exclude Docker-injected ones (com.docker.compose.project, com.docker.compose.service, etc.) before hashing. Without this filter, every container would appear drifted because Docker adds its own metadata at creation time.
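A small sketch of that filter, with hypothetical names (ignoredPrefixes, filterLabels, hashLabels). The "conops." prefix in the ignore list is my assumption for this example: it keeps the controller's own bookkeeping label from feeding back into the hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// ignoredPrefixes lists label namespaces left out of the hash.
// The "conops." entry is an assumption made for this sketch.
var ignoredPrefixes = []string{"com.docker.compose.", "conops."}

// filterLabels returns a copy of labels without the ignored namespaces.
func filterLabels(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		skip := false
		for _, p := range ignoredPrefixes {
			if strings.HasPrefix(k, p) {
				skip = true
				break
			}
		}
		if !skip {
			out[k] = v
		}
	}
	return out
}

// hashLabels hashes the filtered labels. json.Marshal writes map keys in
// sorted order, so the result does not depend on map iteration order.
func hashLabels(labels map[string]string) string {
	data, err := json.Marshal(filterLabels(labels))
	if err != nil {
		panic(err) // not reachable for map[string]string
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	fromCompose := map[string]string{"traefik.enable": "true"}
	fromInspect := map[string]string{
		"traefik.enable":             "true",
		"com.docker.compose.project": "home",
		"com.docker.compose.service": "web",
	}
	fmt.Println(hashLabels(fromCompose) == hashLabels(fromInspect)) // true
}
```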

The hash is stored as a label on the running container: conops.config-hash=abc123. On the next loop, the controller reads this label and compares it against the hash of the current compose file's service definition. Match means no drift. Mismatch means recreate.

Handling human interference

If someone SSHs into the box and runs docker stop myapp, the next reconciliation detects that the container isn't running (the Docker API returns no container matching the service name with a running status) and recreates it from the compose definition.

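If someone replaces the container by hand with different settings, say docker rm -f myapp followed by docker run --name myapp -e FOO=bar <image>, the config hash won't match because the hand-made container no longer carries a matching conops.config-hash label. (docker update can't do this: it only changes resource limits and the restart policy, not environment variables.) The container gets killed and recreated from Git state. The loop doesn't care why state diverged, only that it did. This is the same principle that makes Kubernetes controllers robust: desired state is declarative, actual state is observed, and the controller drives actual toward desired.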
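Both cases collapse into one small decision per service. A sketch of that decision follows; the Observed type and decide function are invented for this example, and only the label name comes from the description above.

```go
package main

import "fmt"

// hashLabel is the container label that carries the applied config hash.
const hashLabel = "conops.config-hash"

// Observed is the slice of container state the decision needs.
// It is a hypothetical type, not part of the Docker SDK.
type Observed struct {
	Exists  bool
	Running bool
	Labels  map[string]string
}

// decide maps observed state to an action: "create", "recreate", or "none".
// It never asks why the state diverged, only whether it did.
func decide(obs Observed, desiredHash string) string {
	switch {
	case !obs.Exists:
		return "create"
	case !obs.Running:
		return "recreate" // someone stopped it by hand
	case obs.Labels[hashLabel] != desiredHash:
		return "recreate" // config drifted or the label is missing
	default:
		return "none"
	}
}

func main() {
	want := "abc123"
	synced := Observed{Exists: true, Running: true, Labels: map[string]string{hashLabel: want}}
	stopped := Observed{Exists: true, Running: false, Labels: map[string]string{hashLabel: want}}
	handMade := Observed{Exists: true, Running: true}

	fmt.Println(decide(synced, want))     // none
	fmt.Println(decide(stopped, want))    // recreate
	fmt.Println(decide(handMade, want))   // recreate
	fmt.Println(decide(Observed{}, want)) // create
}
```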

In-process Git via go-git

Git operations use go-git rather than shelling out to the git binary:

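func (r *Repo) Pull(ctx context.Context) (bool, error) {
    w, err := r.repo.Worktree()
    if err != nil {
        return false, err
    }

    // Record HEAD before pulling so we can tell whether anything changed.
    beforeHead, err := r.repo.Head()
    if err != nil {
        return false, err
    }

    err = w.PullContext(ctx, &git.PullOptions{
        RemoteName:    "origin",
        ReferenceName: r.branch,
        SingleBranch:  true,
        Auth:          r.auth,
        Force:         true,
    })
    if errors.Is(err, git.NoErrAlreadyUpToDate) {
        return false, nil
    }
    if err != nil {
        // Don't compare heads after a failed pull.
        return false, err
    }

    afterHead, err := r.repo.Head()
    if err != nil {
        return false, err
    }
    return beforeHead.Hash() != afterHead.Hash(), nil
}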

The benefit is no dependency on the host's git installation. The cost is that go-git doesn't support every git feature (sparse checkout doesn't work, for example), so the full repo gets cloned. For most compose repos this is a few hundred KB, so it's not a problem.

SSH authentication uses ssh.NewPublicKeysFromFile for key-based auth. HTTPS uses http.BasicAuth with a token. The credentials are passed once at startup via flags or environment variables.

Embedded web UI

The UI is vanilla HTML/JS embedded in the Go binary via embed.FS:

//go:embed ui/dist/*
var uiFS embed.FS

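func (s *Server) setupRoutes() {
    // Serve the embedded files from ui/dist at the site root.
    stripped, err := fs.Sub(uiFS, "ui/dist")
    if err != nil {
        panic(err) // fs.Sub only fails on a malformed path
    }
    s.mux.Handle("/", http.FileServer(http.FS(stripped)))
    // JSON and log-streaming endpoints used by the frontend.
    s.mux.HandleFunc("/api/services", s.handleServices)
    s.mux.HandleFunc("/api/history", s.handleHistory)
    s.mux.HandleFunc("/api/logs/", s.handleLogs)
}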

The /api/services endpoint returns the current state of each service: name, status (synced/drifted/error), current image, config hash, uptime. The /api/history endpoint returns the deployment timeline from SQLite. /api/logs/{service} streams container logs via the Docker API's ContainerLogs endpoint with follow=true.
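To make the response shape concrete, here is a sketch of what a handler like that could look like. The JSON field names, the ServiceStatus type, and the servicesHandler and demoRequest functions are my guesses for illustration, not the actual ConOps wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ServiceStatus is one row of the status grid. Field names are assumed.
type ServiceStatus struct {
	Name       string `json:"name"`
	Status     string `json:"status"` // "synced", "drifted", or "error"
	Image      string `json:"image"`
	ConfigHash string `json:"config_hash"`
	UptimeSec  int64  `json:"uptime_seconds"`
}

// servicesHandler serves the list returned by the supplied function as JSON.
func servicesHandler(list func() []ServiceStatus) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		// Headers are already sent at this point, so an encode error can
		// only be logged; this sketch drops it.
		_ = json.NewEncoder(w).Encode(list())
	}
}

// demoRequest performs one in-memory GET and returns the status code and body.
func demoRequest() (int, string) {
	h := servicesHandler(func() []ServiceStatus {
		return []ServiceStatus{{
			Name:       "web",
			Status:     "synced",
			Image:      "nginx@sha256:0123abcd",
			ConfigHash: "abc123",
			UptimeSec:  3600,
		}}
	})
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/api/services", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := demoRequest()
	fmt.Printf("%d %s", code, body)
}
```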

The frontend polls /api/services every 5 seconds and updates a status grid. Watching a red "drifted" badge turn green after the loop fixes things is stupidly satisfying. There's also a manual sync button that triggers an immediate reconciliation outside the normal interval, for when you've just pushed and don't want to wait 30 seconds.
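One way to wire a manual sync into a ticker loop is an extra channel in the select. This is a sketch with invented names (Loop, Trigger, syncNow, demo), not the ConOps implementation; resetting the ticker after a manual sync is a design choice I'm assuming here so a scheduled run doesn't fire right after a manual one.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Loop reconciles on every tick and whenever a manual sync is requested.
type Loop struct {
	interval  time.Duration
	syncNow   chan struct{} // buffer of 1: repeated clicks coalesce
	reconcile func(context.Context)
}

func NewLoop(interval time.Duration, reconcile func(context.Context)) *Loop {
	return &Loop{interval: interval, syncNow: make(chan struct{}, 1), reconcile: reconcile}
}

// Trigger requests an immediate reconcile without blocking the caller.
func (l *Loop) Trigger() {
	select {
	case l.syncNow <- struct{}{}:
	default: // a sync is already pending
	}
}

// Run blocks until ctx is cancelled.
func (l *Loop) Run(ctx context.Context) {
	ticker := time.NewTicker(l.interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			l.reconcile(ctx)
		case <-l.syncNow:
			l.reconcile(ctx)
			ticker.Reset(l.interval) // restart the wait after a manual sync
		}
	}
}

// demo uses an interval long enough that only the manual trigger fires,
// then reports how many reconciles ran.
func demo() int {
	ctx, cancel := context.WithCancel(context.Background())
	count := 0
	ran := make(chan struct{}, 1)
	l := NewLoop(time.Hour, func(context.Context) {
		count++
		ran <- struct{}{}
	})
	done := make(chan struct{})
	go func() { l.Run(ctx); close(done) }()

	l.Trigger()
	<-ran // wait for the manual reconcile
	cancel()
	<-done // Run has returned, so reading count is safe
	return count
}

func main() {
	fmt.Println(demo()) // 1
}
```

An HTTP handler behind the sync button would only need to call Trigger and return; the reconcile itself still runs on the loop's goroutine, so manual and scheduled runs never overlap.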

It found a small audience among homelab people, picked up a few dozen stars, and people run it for things I'd never have tested, which is how I learned about a bug with private registries within a week of telling anyone it existed.