KeyLM Project Documentation

KeyLM is a hybrid free-tier and BYOK multi-provider chat app built with Next.js App Router, Prisma, and Postgres. This page documents the product flow, backend APIs, and data model in one place.

Overview

A single workspace where users can start on a shared Groq free pool, then move to their own OpenAI, Gemini, or Anthropic keys.

Hybrid Access

New accounts get a shared Groq fallback before switching to personal keys.

Model Catalog

Models are normalized across providers and cached for 24 hours per provider key.

Streaming Chat

Server-sent events deliver token deltas with stop and retry safety.

Threaded History

Threads persist provider, model, settings, message history, and token usage.

Quick Start

1. Install dependencies

npm install

2. Create the environment file

cp .env.example .env
# set DATABASE_URL, APP_AUTH_SECRET, APP_ENCRYPTION_KEY, NEXT_PUBLIC_SUPABASE_URL, NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY, GROQ_API_KEY

3. Generate an encryption key

node -e "console.log(require('crypto').randomBytes(32).toString('base64'))"

4. Run database migrations

npm run prisma:migrate

5. Start the dev server

npm run dev

Environment

Variables for local development and production. Entries with a default or marked optional can be omitted.

DATABASE_URL: Postgres connection string used by Prisma.
APP_AUTH_SECRET: HMAC secret for signing session tokens.
NEXT_PUBLIC_SUPABASE_URL: Supabase project URL used for passwordless Email Auth.
NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY: Supabase publishable/anon key used to request Magic Links and verify OTP codes.
NEXT_PUBLIC_TURNSTILE_SITE_KEY: Cloudflare Turnstile public site key rendered on the passwordless login/register form.
APP_PUBLIC_BASE_URL: Public app origin used to build the /auth/callback Magic Link redirect URL.
APP_ENCRYPTION_KEY: 32-byte base64 key for encrypting provider secrets.
GROQ_API_KEY: Server-only API key used for the shared KeyLM Free Groq pool.
GROQ_BASE_URL: Groq base URL, defaults to https://api.groq.com/openai/v1.
GROQ_FREE_MODEL: Fixed shared free model, defaults to moonshotai/kimi-k2-instruct-0905.
GROQ_FREE_FALLBACK_MODELS: Optional comma-separated Groq fallback models if the primary free model is unavailable.
FREE_USER_DAILY_LIMIT: Per-user daily free request limit, defaults to 50.
FREE_GLOBAL_DAILY_LIMIT: Global daily free request limit, defaults to 1000.
RATE_LIMIT_PER_MINUTE: Optional request limit for chat and password reset endpoints.
PASSWORD_RESET_TTL_MINUTES: Legacy password reset TTL. Passwordless OTP/link expiry is configured in Supabase Auth as 900 seconds.

Supabase setup:

Enable Email Auth.
Add http://localhost:3000/auth/callback and your production callback URL to the Auth redirect URLs.
Set the Email OTP expiry to 900 seconds.
Enable Captcha with Cloudflare Turnstile, and add TURNSTILE_SECRET_KEY only in the Supabase Dashboard.
To let users choose either method, include both the Magic Link and the OTP token in the Supabase email template.
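
The defaults listed above can be applied in one place when the server starts. The following is a minimal sketch, not the project's actual loader: the names loadFreeTierConfig, FreeTierConfig, and readPositiveInt are hypothetical, and only the variable names and default values come from this page.

```typescript
// Hypothetical config loader; only the env variable names and defaults
// are taken from the Environment list, everything else is illustrative.
type Env = Record<string, string | undefined>;

interface FreeTierConfig {
  groqBaseUrl: string;
  groqFreeModel: string;
  groqFallbackModels: string[];
  freeUserDailyLimit: number;
  freeGlobalDailyLimit: number;
}

// Reads a positive integer, falling back to the default when unset or blank.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadFreeTierConfig(env: Env): FreeTierConfig {
  return {
    groqBaseUrl: env["GROQ_BASE_URL"] ?? "https://api.groq.com/openai/v1",
    groqFreeModel: env["GROQ_FREE_MODEL"] ?? "moonshotai/kimi-k2-instruct-0905",
    // Comma-separated list; empty entries and surrounding spaces are dropped.
    groqFallbackModels: (env["GROQ_FREE_FALLBACK_MODELS"] ?? "")
      .split(",")
      .map((model) => model.trim())
      .filter((model) => model.length > 0),
    freeUserDailyLimit: readPositiveInt(env, "FREE_USER_DAILY_LIMIT", 50),
    freeGlobalDailyLimit: readPositiveInt(env, "FREE_GLOBAL_DAILY_LIMIT", 1000),
  };
}
```

In a Node.js process this would typically be called once as loadFreeTierConfig(process.env), so a malformed limit fails at startup rather than on the first free request.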

User Flow

Create an account or sign in.
Use KeyLM Free immediately while the per-user and global daily Groq quotas still have capacity.
Add a provider key and validate it with a lightweight request when you want BYOK mode.
Load the model list for connected providers and create BYOK threads.
Send a message and stream responses via SSE.
Persist the assistant output and token usage, then continue the thread.

Streaming responses use server-sent events from the messages endpoint, with idempotency on requestId.
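
To make the wire format concrete, here is a small sketch of how token deltas could be framed and parsed as server-sent events. The event name "delta", the payload shape, and both function names are assumptions for illustration; only the use of SSE comes from this page.

```typescript
// Illustrative SSE helpers. The "delta" event name and the payload shape
// used in the usage note are assumptions, not KeyLM's documented schema.
interface SseEvent {
  event: string;
  data: string;
}

// Serializes one event in the standard text/event-stream framing.
function encodeSseEvent(event: string, payload: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Splits buffered text into complete events plus the unconsumed remainder,
// so a client can keep appending network chunks to `rest`.
function parseSseChunk(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? "";
  for (const frame of frames) {
    let event = "message"; // default event type per the SSE format
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

A client would append each network chunk to its buffer, call parseSseChunk, render the parsed deltas, and carry rest forward to the next chunk.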

Architecture

The app is split into route handlers under src/app/api and reusable services under src/lib.

Auth and sessions

Supabase passwordless email auth with Magic Links/OTP and signed, httpOnly app session cookies.
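
As an illustration of a signed session token, the sketch below signs a small payload with HMAC-SHA256 using Node's built-in crypto module. The token layout (base64url body, a dot, base64url signature), the payload fields, and the function names are assumptions; this page only states that APP_AUTH_SECRET is an HMAC secret for session tokens.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload; the real session contents are not documented here.
interface SessionPayload {
  userId: string;
  expiresAt: number; // Unix time in milliseconds
}

function signSession(payload: SessionPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const signature = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${signature}`;
}

// Returns the payload when the signature matches and the token has not expired.
function verifySession(token: string, secret: string, now: number = Date.now()): SessionPayload | null {
  const dot = token.indexOf(".");
  if (dot < 0) return null;
  const body = token.slice(0, dot);
  const given = Buffer.from(token.slice(dot + 1), "base64url");
  const expected = createHmac("sha256", secret).update(body).digest();
  // Constant-time comparison avoids leaking how many bytes matched.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
  if (typeof payload.userId !== "string" || typeof payload.expiresAt !== "number") return null;
  return payload.expiresAt > now ? { userId: payload.userId, expiresAt: payload.expiresAt } : null;
}
```

The resulting string would be placed in a cookie set with the httpOnly flag so client-side scripts cannot read it.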

Key management

Provider keys are stored encrypted, masked in UI, and audited.
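
The sketch below shows one way to encrypt a provider key at rest and derive a display mask. It assumes AES-256-GCM with the 32-byte APP_ENCRYPTION_KEY; the cipher choice, the iv/tag/ciphertext layout, the mask format, and the function names are all assumptions, since this page does not specify them.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Assumed layout of the stored value: 12-byte IV | 16-byte auth tag | ciphertext.
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Throws if the key is wrong or the stored value was modified.
function decryptSecret(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Hypothetical mask format: first three and last four characters.
function maskKey(secret: string): string {
  if (secret.length <= 8) return "****";
  return `${secret.slice(0, 3)}...${secret.slice(-4)}`;
}
```

Under these assumptions, keyCiphertext would hold the output of encryptSecret and keyMask the output of maskKey, so the UI never needs the plaintext.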

Provider adapters

OpenAI, Gemini, Anthropic, and Groq adapters normalize models, streaming, and usage.
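
Usage normalization is the easiest part of an adapter to show in isolation. The sketch below maps each provider's usage object onto one shape. The field names follow the providers' public APIs as I understand them at the time of writing and should be checked against current documentation; NormalizedUsage, USAGE_FIELDS, and normalizeUsage are hypothetical names.

```typescript
type Provider = "openai" | "gemini" | "anthropic" | "groq";

interface NormalizedUsage {
  promptTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Provider-specific field names (assumed from public API docs).
// Groq is treated as OpenAI-compatible, matching its /openai/v1 base URL.
const USAGE_FIELDS: Record<Provider, { prompt: string; output: string; total?: string }> = {
  openai: { prompt: "prompt_tokens", output: "completion_tokens", total: "total_tokens" },
  groq: { prompt: "prompt_tokens", output: "completion_tokens", total: "total_tokens" },
  anthropic: { prompt: "input_tokens", output: "output_tokens" },
  gemini: { prompt: "promptTokenCount", output: "candidatesTokenCount", total: "totalTokenCount" },
};

function readNumber(raw: Record<string, unknown>, field: string | undefined): number | undefined {
  if (field === undefined) return undefined;
  const value = raw[field];
  return typeof value === "number" ? value : undefined;
}

// Returns null when the provider did not report usage for this response.
function normalizeUsage(provider: Provider, raw: Record<string, unknown> | undefined): NormalizedUsage | null {
  if (raw === undefined) return null;
  const fields = USAGE_FIELDS[provider];
  const promptTokens = readNumber(raw, fields.prompt);
  const outputTokens = readNumber(raw, fields.output);
  if (promptTokens === undefined || outputTokens === undefined) return null;
  // Fall back to the sum when the provider omits a total.
  const totalTokens = readNumber(raw, fields.total) ?? promptTokens + outputTokens;
  return { promptTokens, outputTokens, totalTokens };
}
```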

Model service

Model lists are cached per key and refreshed on demand.
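
The caching rules (a 24-hour lifetime, on-demand refresh, and stale fallback when a refresh fails) can be expressed as two pure functions that a service would call around the actual provider request. This is a sketch under those stated rules; the type and function names are hypothetical, and the timestamp fields mirror fetchedAt and expiresAt from the data model.

```typescript
const MODEL_CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

interface CachedModels {
  models: string[];
  fetchedAt: number;
  expiresAt: number;
}

interface ModelListResult {
  models: string[];
  stale: boolean;
}

type FetchOutcome = { ok: true; models: string[] } | { ok: false };

// Decides whether the provider must be called for this request.
function needsFetch(cached: CachedModels | undefined, refresh: boolean, now: number): boolean {
  return refresh || cached === undefined || cached.expiresAt <= now;
}

// Combines the previous cache entry with the outcome of a fetch attempt.
// A failed fetch serves the old entry flagged as stale; with no old entry
// the result is null and the caller should surface the error.
function applyFetchOutcome(
  cached: CachedModels | undefined,
  outcome: FetchOutcome,
  now: number,
): { entry: CachedModels | undefined; result: ModelListResult | null } {
  if (outcome.ok) {
    const entry = { models: outcome.models, fetchedAt: now, expiresAt: now + MODEL_CACHE_TTL_MS };
    return { entry, result: { models: outcome.models, stale: false } };
  }
  if (cached !== undefined) {
    return { entry: cached, result: { models: cached.models, stale: true } };
  }
  return { entry: undefined, result: null };
}
```

Keeping the decision logic free of I/O makes the expiry and stale paths easy to unit test without mocking provider calls.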

Thread service

Threads and messages are persisted with idempotent request IDs.
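
The idempotency rule can be illustrated with an in-memory store keyed by thread and request ID. This is a sketch only: MessageStore is a hypothetical stand-in for the database, where a unique constraint on the thread and client request ID would presumably play the same role.

```typescript
interface StoredMessage {
  id: number;
  threadId: string;
  clientRequestId: string;
  content: string;
}

// In-memory stand-in for the messages table (illustrative only).
class MessageStore {
  private byRequest = new Map<string, StoredMessage>();
  private nextId = 1;

  // Saves a message once per (threadId, clientRequestId) pair. A retry with
  // the same pair returns the original row instead of creating a duplicate.
  save(threadId: string, clientRequestId: string, content: string): { message: StoredMessage; created: boolean } {
    const key = `${threadId}:${clientRequestId}`;
    const existing = this.byRequest.get(key);
    if (existing !== undefined) {
      return { message: existing, created: false };
    }
    const message: StoredMessage = { id: this.nextId, threadId, clientRequestId, content };
    this.nextId += 1;
    this.byRequest.set(key, message);
    return { message, created: true };
  }
}
```

With this shape, a client that retries after a dropped connection can resend the same request ID and receive the already persisted message.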

Free-tier quotas

Per-user and global daily counters gate the shared Groq fallback.
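
A minimal sketch of the two-counter gate follows, using in-memory maps in place of the UserDailyFreeUsage and GlobalDailyFreeUsage tables. The class name, method name, and reason strings are hypothetical; the default limits of 50 and 1000 and the UTC day boundary come from this page.

```typescript
// Day bucket in UTC, e.g. "2024-01-01". toISOString always reports UTC.
function utcDayKey(date: Date): string {
  return date.toISOString().slice(0, 10);
}

type ReserveResult =
  | { allowed: true }
  | { allowed: false; reason: "user_limit" | "global_limit" };

// In-memory illustration; a real service would do this atomically in Postgres.
class FreeQuota {
  private userCounts = new Map<string, number>();
  private globalCounts = new Map<string, number>();
  private userLimit: number;
  private globalLimit: number;

  constructor(userLimit: number = 50, globalLimit: number = 1000) {
    this.userLimit = userLimit;
    this.globalLimit = globalLimit;
  }

  // Reserves one free request if both the user and global buckets have room.
  reserve(userId: string, now: Date = new Date()): ReserveResult {
    const day = utcDayKey(now);
    const userKey = `${userId}:${day}`;
    const userUsed = this.userCounts.get(userKey) ?? 0;
    const globalUsed = this.globalCounts.get(day) ?? 0;
    if (userUsed >= this.userLimit) return { allowed: false, reason: "user_limit" };
    if (globalUsed >= this.globalLimit) return { allowed: false, reason: "global_limit" };
    this.userCounts.set(userKey, userUsed + 1);
    this.globalCounts.set(day, globalUsed + 1);
    return { allowed: true };
  }
}
```

Because the day key is part of every counter key, both buckets start from zero at 00:00 UTC without an explicit reset job.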

Project Structure

src/app

App Router pages and API route handlers.

src/lib

Core services, providers, crypto, auth, and utilities.

prisma

Database schema and migrations.

src/app/globals.css

Shared theme and component styles.

API Endpoints

Auth

POST

/api/auth/register

Send a Supabase passwordless signup Magic Link or OTP.

POST

/api/auth/login

Send a Supabase passwordless login Magic Link or OTP.

POST

/api/auth/verify-otp

Verify an email OTP and start the app session.

GET

/auth/callback

Handle Magic Link callback, sync the user, and start the app session.

POST

/api/auth/logout

Clear the session cookie.

GET

/api/auth/me

Return the current session user.

POST

/api/auth/password-reset/request

Legacy password reset endpoint; passwordless auth does not require it.

POST

/api/auth/password-reset/confirm

Legacy password reset confirmation endpoint.

Provider keys

POST

/api/providers/:provider/keys

Validate and store a new key.

GET

/api/providers/:provider/keys

List keys for a provider.

POST

/api/providers/:provider/keys/:keyId/validate

Re-validate a stored key.

DELETE

/api/providers/:provider/keys/:keyId

Revoke a key.

Models

GET

/api/providers/:provider/models

Return cached models, with optional refresh=true.

POST

/api/providers/:provider/models/refresh

Force a model refresh and update cache.

Free usage

GET

/api/usage/free

Return the current user/global Groq free quota snapshot.

Threads and messages

POST

/api/threads

Create a BYOK or KeyLM Free thread.

GET

/api/threads

List threads for the user.

GET

/api/threads/:threadId

Get a thread and its messages.

DELETE

/api/threads/:threadId

Delete a thread.

POST

/api/threads/:threadId/messages

Send a message, stream SSE deltas, and persist token usage.

Data Model

User

id, email, passwordHash?, supabaseUserId, lastLoginAt, createdAt

ProviderKey

provider, keyCiphertext, keyMask, status, lastValidatedAt, lastUsedAt

ProviderModelCache

provider, keyId, models, fetchedAt, expiresAt

Thread

provider, model, systemPrompt, settings, status, updatedAt

Message

threadId, role, content, providerMessageId, clientRequestId, metadata.usage

AuditLog

action, provider, keyId, metadata, createdAt

PasswordResetToken

tokenHash, expiresAt, usedAt

UserDailyFreeUsage

userId, day, count

GlobalDailyFreeUsage

day, count
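
As a reading aid for the Message entity, the sketch below gives it a TypeScript shape and sums token usage across a thread. The field types, the role values, and the TokenUsage layout are assumptions; this page lists only the field names.

```typescript
// Assumed shape of metadata.usage; the page does not define it.
interface TokenUsage {
  promptTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Field names follow the Message entity; types and role values are guesses.
interface Message {
  threadId: string;
  role: "system" | "user" | "assistant";
  content: string;
  providerMessageId?: string;
  clientRequestId?: string;
  metadata: { usage?: TokenUsage };
}

// Sums total tokens over a thread, skipping messages without reported usage.
function threadTokenTotal(messages: Message[]): number {
  return messages.reduce((sum, message) => sum + (message.metadata.usage?.totalTokens ?? 0), 0);
}
```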

UX Behavior

Users without active keys can start on KeyLM Free while quota remains.
The model dropdown appears only after a provider key is active.
Model lists are cached for 24 hours and can be refreshed manually.
Threads are locked to the provider and model chosen at creation, including Groq free threads.
After 5 free requests, the UI shows a persistent reminder to connect a personal key for better output.
Streaming responses show deltas in real time with stop support.
Each assistant reply shows prompt, output, and total token usage when available.

Security

Provider keys are encrypted at rest and never returned in plaintext.
The shared Groq key stays server-side and is never exposed to clients.
Supabase verifies Magic Links/OTPs; the app then issues its existing signed httpOnly session cookie.
Rate limiting protects chat streaming and password reset requests.
Audit logs track key lifecycle events for traceability.
Model and thread access is scoped to the authenticated user.

Edge Cases

A key that was valid can be revoked later; validation endpoints update status.
If a model refresh fails, cached models are served with a stale flag.
Duplicate message requests are deduped via clientRequestId.
Free quota resets at 00:00 UTC for both the user bucket and the global pool.
Rate-limited requests return a retryable HTTP 429 response.
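
The rate-limit behavior can be sketched as a fixed-window counter that reports how long a client should wait before retrying. The fixed-window algorithm, the class name, and the result shape are assumptions; this page only states that a per-minute limit exists and that limited requests receive 429.

```typescript
// Illustrative fixed-window limiter; the real algorithm is not documented here.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private limit: number;
  private windowMs = 60000; // one minute, matching RATE_LIMIT_PER_MINUTE

  constructor(limitPerMinute: number) {
    this.limit = limitPerMinute;
  }

  // Returns 200 while under the limit, otherwise 429 plus the number of
  // seconds until the current window ends (suitable for a Retry-After header).
  check(clientId: string, now: number = Date.now()): { status: 200 | 429; retryAfterSeconds: number } {
    const current = this.windows.get(clientId);
    if (current === undefined || now - current.start >= this.windowMs) {
      this.windows.set(clientId, { start: now, count: 1 });
      return { status: 200, retryAfterSeconds: 0 };
    }
    if (current.count < this.limit) {
      current.count += 1;
      return { status: 200, retryAfterSeconds: 0 };
    }
    const retryAfterSeconds = Math.ceil((current.start + this.windowMs - now) / 1000);
    return { status: 429, retryAfterSeconds };
  }
}
```

A route handler could copy retryAfterSeconds into the Retry-After response header so clients know when a retry is likely to succeed.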

Testing

Unit: provider adapters, crypto helpers, and validation schemas.
Integration: key validation, free quota reservation, model caching, and thread persistence.
E2E: use KeyLM Free, exhaust quota, connect a key, stream chat, and save history.
Security: verify secrets never leak to logs or responses.

Roadmap

Tool calling and structured output support.
Vision attachments with capability gating.
Usage analytics and per-model cost reporting.
Team workspaces with shared key vaults.