Anurag Gupta

Migrating live auth from SuperTokens to Keycloak

2026-01-29

Auth migrations are the dental surgery of backend work. Nobody wants one, you can't skip it once it's needed, and the patient is awake the whole time. Early this year I moved Farmako's authentication from SuperTokens to Keycloak while the app kept serving orders.

Why move

SuperTokens had been fine for a single app with phone-OTP login. But we'd grown into a small fleet: customer app, POS, kiosk, internal tools, a referral site. Each one was doing its own session handling against the same SuperTokens core, and every new product meant re-answering the same questions about session storage, token validation across origins, and refresh rotation.

Keycloak is heavy and the admin UI was clearly designed for enterprise procurement demos, but realms, clients, and OIDC-standard token exchange map exactly onto the multi-product problem we had. One identity, many products, one login.

The SPI

Out of the box, Keycloak's OTP support covers authenticator-app codes (TOTP and HOTP), not SMS. We needed SMS OTP as the primary authentication method, because that's what a pharmacy customer base uses. So I wrote a custom Service Provider Interface (SPI) implementation in Java.

The SPI implements Authenticator and AuthenticatorFactory. The flow:

@Override
public void authenticate(AuthenticationFlowContext context) {
    String phone = context.getHttpRequest()
        .getDecodedFormParameters()
        .getFirst("phone");

    // Generate OTP
    String otp = String.format("%06d", secureRandom.nextInt(1_000_000));
    String hash = BCrypt.hashpw(otp, BCrypt.gensalt(10));

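    // Store in auth session with TTL. The phone is saved too, because
    // action() reads it back to look up the user.
    context.getAuthenticationSession()
        .setAuthNote("phone", phone);
    context.getAuthenticationSession()
        .setAuthNote("otp_hash", hash);
    context.getAuthenticationSession()
        .setAuthNote("otp_expiry",
            String.valueOf(System.currentTimeMillis() + 300_000)); // 5 min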

    // Send via SMS gateway
    smsGateway.send(phone, "Your code: " + otp);

    context.challenge(
        context.form()
            .setAttribute("phone", phone)
            .createForm("otp-verify.ftl")
    );
}

On the verify step:

@Override
public void action(AuthenticationFlowContext context) {
    String submitted = context.getHttpRequest()
        .getDecodedFormParameters()
        .getFirst("otp");
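    String storedHash = context.getAuthenticationSession()
        .getAuthNote("otp_hash");
    String expiryNote = context.getAuthenticationSession()
        .getAuthNote("otp_expiry");

    // A missing note means no OTP was issued in this session; treat it as expired
    if (storedHash == null || expiryNote == null
            || System.currentTimeMillis() > Long.parseLong(expiryNote)) {
        context.failureChallenge(AuthenticationFlowError.EXPIRED_CODE, ...);
        return;
    }
    // A missing form field fails cleanly instead of reaching checkpw as null
    if (submitted == null || !BCrypt.checkpw(submitted, storedHash)) {
        context.failureChallenge(AuthenticationFlowError.INVALID_CREDENTIALS, ...);
        return;
    }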

    // Look up or create user by phone
    String phone = context.getAuthenticationSession().getAuthNote("phone");
    UserModel user = findOrCreateUser(context, phone);
    context.setUser(user);
    context.success();
}

findOrCreateUser is the migration bridge. It first queries Keycloak's user store by the phone attribute. If no user exists, it queries the old Postgres user table by phone number. If found there, it creates the Keycloak user with the same UUID (passed as the explicit id argument to UserProvider.addUser(), since UserModel has no ID setter), copies the relevant attributes, and returns it. Users get adopted by Keycloak one login at a time, with their old IDs preserved, so downstream services that reference user IDs don't break.
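
The adoption logic is easier to see stripped of the Keycloak API. Here is a minimal sketch in JavaScript, not the production Java. keycloakUsers and legacyUsers are hypothetical in-memory Maps standing in for Keycloak's user store and the old Postgres table, createId is a hypothetical ID generator, and the new-signup branch is an assumption of this sketch about what happens when neither store knows the phone number.

```javascript
// Sketch only: Maps stand in for Keycloak's user store and the legacy table.
function findOrCreateUser({ keycloakUsers, legacyUsers, createId }, phone) {
    // 1. Already adopted: a Keycloak user carries this phone attribute.
    for (const user of keycloakUsers.values()) {
        if (user.attributes.phone === phone) return user;
    }

    // 2. Known to the old system: adopt the user under the same ID.
    const legacy = legacyUsers.get(phone);
    const user = legacy
        ? { id: legacy.id, attributes: { ...legacy.attributes, phone } }
        // 3. Assumption: a phone number seen nowhere is a new signup.
        : { id: createId(), attributes: { phone } };

    keycloakUsers.set(user.id, user);
    return user;
}
```

A second login for the same phone returns at step 1 and never reads the legacy table.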

The SPI JAR is built with Maven and mounted into the Keycloak container via an init container that copies it to /opt/keycloak/providers/. On startup, Keycloak scans that directory and registers the provider factories each JAR declares in its META-INF/services files.

Frontend cutover

Each product app switched from the SuperTokens SDK to oidc-client-ts, a standard OIDC client library. The change per app was roughly:

// Before: SuperTokens
import SuperTokens from 'supertokens-web-js';
SuperTokens.init({ apiDomain: 'https://api.farmako.ai', apiBasePath: '/auth' });

// After: OIDC
import { UserManager } from 'oidc-client-ts';
const mgr = new UserManager({
    authority: 'https://auth.farmako.ai/realms/farmako',
    client_id: 'customer-app',
    redirect_uri: 'https://farmako.ai/callback',
    scope: 'openid profile phone',
});

The backend switched from SuperTokens session verification middleware (which checked its proprietary session format) to standard JWT validation against Keycloak's JWKS endpoint:

import jwt from 'jsonwebtoken';
import jwksClient from 'jwks-rsa';

const client = jwksClient({
    jwksUri: 'https://auth.farmako.ai/realms/farmako/protocol/openid-connect/certs',
    cache: true,
    rateLimit: true,
});

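function getKey(header, callback) {
    client.getSigningKey(header.kid, (err, key) => {
        // Propagate lookup failures (unknown kid, JWKS fetch error) to jwt.verify
        if (err) return callback(err);
        callback(null, key.getPublicKey());
    });
}

// In middleware. With a key-lookup function, jwt.verify must be given a callback.
jwt.verify(token, getKey, {
    issuer: 'https://auth.farmako.ai/realms/farmako',
    algorithms: ['RS256'], // Keycloak's default signing algorithm
}, (err, claims) => {
    // err: bad signature, expired token, or wrong issuer; otherwise claims is the decoded payload
});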
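
For completeness, here is one way the verification could sit inside an Express-style middleware. This is a sketch, not our actual middleware: requireAuth and the req.user shape are hypothetical names, and verifyToken is injected so the example runs without jsonwebtoken installed.

```javascript
// verifyToken(token, cb) is injected. In a real service it could be:
//   (token, cb) => jwt.verify(token, getKey, { issuer, algorithms: ['RS256'] }, cb)
function requireAuth(verifyToken) {
    return function (req, res, next) {
        // Expect "Authorization: Bearer <token>"; Node lowercases header names.
        const header = req.headers.authorization || '';
        const [scheme, token] = header.split(' ');
        if (!/^Bearer$/i.test(scheme) || !token) {
            res.status(401).json({ error: 'missing bearer token' });
            return;
        }

        verifyToken(token, (err, claims) => {
            if (err) {
                res.status(401).json({ error: 'invalid token' });
                return;
            }
            // sub is the Keycloak user ID by default.
            req.user = { id: claims.sub, claims };
            next();
        });
    };
}
```

Because adopted users keep their old UUIDs, claims.sub lines up with the IDs downstream services already store.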

The logout bug

There's a closed PR in our repo titled "fix: logout" that represents a genuinely confusing week. During the transition period, both session systems were active. SuperTokens sessions lived in httpOnly cookies scoped to our domain. Keycloak sessions lived in Keycloak's own cookies plus an OIDC id_token. Signing out of one didn't sign you out of the other.

Users ended up in a half-logged-in state: the frontend thought they were authenticated (Keycloak token valid) but the backend rejected requests (SuperTokens session expired, and some middleware was still checking it).

The fix was making logout explicitly kill both:

async function logout() {
    const user = await oidcManager.getUser();

    // 1. SuperTokens session revoke (during transition only)
    await fetch('/auth/signout', { method: 'POST', credentials: 'include' });

    // 2. Clear script-visible cookies for good measure
    //    (httpOnly cookies can only be cleared by the server, as in step 1)
    document.cookie.split(';').forEach(c => {
        document.cookie = c.trim().split('=')[0] +
            '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/';
    });

    // 3. Keycloak logout goes last: the redirect leaves the page,
    //    so nothing placed after it would run
    await oidcManager.signoutRedirect({
        id_token_hint: user?.id_token,
        post_logout_redirect_uri: window.location.origin,
    });
}

This dual-kill ran for about three weeks until the last client was fully migrated, at which point the SuperTokens code path and the supertokens-web-js dependency were deleted entirely.

By March, SuperTokens was gone. The SPI handles about 15k logins per day. Multi-product SSO works as expected: log into the customer app, and you're authenticated on the kiosk and POS too.