HealthLab Management Platform - Engineering Case Study

Problem

Pathology labs struggle with appointment scheduling and patient communication. Manual booking systems lead to:

Phone tag - Staff spending hours on calls for simple bookings
Double bookings - No real-time capacity tracking across time slots
Report delivery delays - Patients calling repeatedly to check if results are ready
No centralized dashboard - Lab admins managing everything in spreadsheets or paper

I wanted to build a system where patients could book tests via Telegram (where they already spend time), and lab admins could manage everything through a web dashboard. I got this idea while scheduling a blood test for myself. Labs already have partial digital workflows (WhatsApp reports, phone bookings), but these systems are fragmented.

Constraints

Multi-channel booking - Support Telegram bot today, extensible to WhatsApp/Web later
Real-time capacity tracking - Prevent overbooking of time slots
Secure API - JWT auth for admin dashboard, API keys for bot integrations
Production-ready - Dockerized, with proper migrations and Swagger docs
Minimal infrastructure - PostgreSQL + a single Go binary, no message queues or Redis (for the V1 MVP)

Architecture

flowchart TB
    subgraph clients["Client Layer"]
        direction LR
        telegram["Telegram Bot<br/>(Patient UX)"]
        react["React Admin<br/>Dashboard"]
        api_clients["API Clients<br/>(Future)"]
        telegram ~~~ react ~~~ api_clients
    end
    subgraph api["Go/Gin REST API"]
        subgraph auth["Authentication Layer"]
            direction LR
            jwt["JWT Auth<br/>(Dashboard)"]
            apikey["API Key Auth<br/>(Bots)"]
            webhook["Webhook Auth<br/>(Telegram)"]
            jwt ~~~ apikey ~~~ webhook
        end
        flow["Handlers → Services → Repositories → GORM"]
    end
    subgraph db["PostgreSQL Database"]
        tables["Labs | Tests | TimeSlots | Bookings | Reports"]
    end
    clients --> api
    api --> db

Key components:

Telegram Bot - Conversation state machine for guided booking flow
Go/Gin API - Clean architecture with repository pattern
React Dashboard - Admin UI with real-time stats
Docker Compose - One-command local development and production deployment

Decisions & Tradeoffs

Why Go over Node.js or Python?

Go was chosen for several reasons:

Single binary deployment - No runtime dependencies, easy Docker images
Concurrency - Goroutines handle multiple bot conversations efficiently
Strong typing - Catches errors at compile time, not in production
Gin framework - Fast, well-documented, great middleware ecosystem
Personal choice - I wanted to work with Go and build something production-ready.

Tradeoff: Steeper learning curve, but the type safety and deployment simplicity are worth it.

Why Telegram bot over WhatsApp Business API?

Free tier - Telegram Bot API is free, WhatsApp charges per conversation
No approval process - Can start building immediately
Rich interactions - Inline keyboards, callback queries, and message editing
Sufficient reach in target market - Many users in India already use Telegram for group communications

Tradeoff: WhatsApp has higher market penetration, but Telegram let me iterate faster.

Why conversation state machine for booking flow?

The booking flow requires multiple steps: select test → pick slot → enter name → enter phone → confirm. Options considered:

Stateless - Pass all context in callback data (limited to 64 bytes)
Database state - Persist conversation state to PostgreSQL
In-memory state - Store in a sync.Map with application-managed expiry (sync.Map has no built-in TTL)

I chose in-memory state in a sync.Map because:

Booking conversations are short-lived (< 5 minutes)
No need for persistence across restarts (user can start over)
Simple to implement and debug

Why repository pattern over direct GORM calls?

Testability - Can mock repositories for unit tests
Single responsibility - Handlers don't know about GORM
Query reuse - Complex queries like slot availability checks are centralized

Tradeoff: More boilerplate, but the separation pays off as the codebase grows.

Why Cloudflare R2 for report storage?

Reports are uploaded as PDFs and need to be accessible via stable URLs:

S3-compatible API - Works with existing AWS SDKs
Zero egress fees - Patients downloading reports don't incur costs
Simple integration - The AWS CLI works once it is pointed at the R2 endpoint URL

Implementation Details

Telegram Bot State Machine

The bot maintains per-user conversation state:

type ConversationState struct {
    Step          string    // "select_test", "select_slot", "enter_name", etc.
    LabID         uint
    TestID        uint
    SlotID        uint
    SlotStartTime time.Time
    Name          string
    Phone         string
    Gender        string
}

// State stored in sync.Map with chatID as key
func (b *Bot) getState(chatID int64) *ConversationState {
    if s, ok := b.conversations.Load(chatID); ok {
        return s.(*ConversationState)
    }
    return &ConversationState{Step: "idle"}
}
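
The transitions between steps can be expressed as a lookup table. This is a sketch under assumptions: only "idle", "select_test", "select_slot", and "enter_name" appear above, so "enter_phone" and "confirm" are guessed names, and advance is a hypothetical helper.

```go
package main

// nextStep maps each step to the one that follows it in the booking flow.
// "enter_phone" and "confirm" are assumed names for the remaining steps.
var nextStep = map[string]string{
	"idle":        "select_test",
	"select_test": "select_slot",
	"select_slot": "enter_name",
	"enter_name":  "enter_phone",
	"enter_phone": "confirm",
	"confirm":     "idle",
}

// advance returns the step that follows current. Unknown steps reset to
// "idle" so a stale or corrupted state cannot strand the user mid-flow.
func advance(current string) string {
	if next, ok := nextStep[current]; ok {
		return next
	}
	return "idle"
}
```

Keeping the transitions in one table makes it easy to see the whole flow at a glance and to add a step later without touching every handler.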

Slot Capacity Tracking

Each time slot has a capacity limit, and its booked counter increments atomically on booking:

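// In repository - atomic conditional update to prevent race conditions.
func (r *TimeSlotRepository) IncrementBooked(ctx context.Context, slotID uint) error {
    res := r.db.WithContext(ctx).Model(&model.TimeSlot{}).
        Where("id = ? AND booked < capacity", slotID).
        Update("booked", gorm.Expr("booked + 1"))
    if res.Error == nil && res.RowsAffected == 0 {
        return ErrSlotUnavailable // assumed sentinel error: slot is full or missing
    }
    return res.Error
}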

Multi-Auth Middleware Stack

Three authentication strategies for different clients:

// JWT for dashboard (admin users)
v1.Use(middleware.JWTAuth())

// API Key for bot integrations
botGroup := v1.Group("/bot")
botGroup.Use(middleware.APIKeyAuth())

// Secret token for Telegram webhooks
telegramGroup.Use(middleware.TelegramSecretToken())

Natural Language Date Parsing

The bot supports natural language date input for slot selection:

// "next monday", "tomorrow", "15th jan" → time.Time
func parseNaturalDate(input string) (time.Time, error) {
    // Uses github.com/olebedev/when for parsing
}

Failure Modes & Mitigations

Slot Overbooking Race Condition

Problem: Two users selecting the same slot simultaneously
Mitigation: Atomic UPDATE with a WHERE booked < capacity condition; zero rows affected means the slot is already full
Result: Second booking fails with "slot no longer available"

Telegram Webhook Downtime

Problem: Server restarts cause missed webhook updates
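Mitigation: Telegram retries failed webhook deliveries and keeps undelivered updates for up to 24 hours, so short restarts do not lose messages
Future: Implement getUpdates polling as a fallback (this requires deleting the webhook first, because getUpdates does not work while a webhook is set)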

Database Connection Pool Exhaustion

Problem: Too many concurrent bookings
Mitigation: Connection pool limits (database/sql settings, configured through GORM) + request timeouts
Monitoring: Health endpoint checks DB connectivity

Results & Metrics

Bot Response Time: < 200ms for all commands
Booking Flow Completion: 5-step flow completes in under 2 minutes
API Documentation: Full Swagger/OpenAPI specs auto-generated
Deployment: Single docker-compose up for complete stack

Lessons Learned

What I'd do differently

Add a webhook retry queue - The system currently relies on Telegram's retries; a Redis-backed queue would be more robust
Use database for conversation state - Would enable multi-instance deployment
Add rate limiting earlier - Should have implemented from day one to prevent abuse
Skip Docker - Go already compiles to a single efficient binary, so Docker adds unnecessary overhead in production. I chose Docker because of my previous experience with it.

What worked well

Repository pattern - Made unit testing straightforward
Telegram inline keyboards - Good UX for multi-step flows
Goose migrations - Database schema version control works well
Swagger auto-generation - API documentation stays in sync