Next-Gen Candidate Judge
A full-stack candidate evaluation platform with Docker-based sandboxed workspaces, multi-type automated judging (AI, Quiz, Text, Auto), real-time WebSocket progress tracking, and role-based dashboards — built with Laravel 12, Inertia.js, and React.
What is Next-Gen Candidate Judge?
Next-Gen Candidate Judge is a candidate evaluation platform that provisions isolated Docker-based workspaces for each task attempt, evaluates submissions through multiple judging strategies, and provides real-time feedback via WebSocket — all wrapped in a modern, role-based UI.
Think of it as a self-hosted HackerRank alternative where admins define tasks with Docker environments, and candidates get sandboxed workspaces with SSH access, timers, and automated scoring with attempt-based penalty systems.
The Problem
Traditional candidate evaluation workflows suffer from:
- No isolation — candidates run code on shared environments, risking interference
- Manual grading — evaluators spend hours checking submissions by hand
- No real-time feedback — candidates wait indefinitely for results
- Rigid evaluation — one-size-fits-all question formats don’t test diverse skills
I wanted a platform where each candidate gets their own ephemeral Docker workspace with SSH access, works on real-world tasks in a sandboxed environment, and gets evaluated automatically through configurable judging pipelines.
Architecture Overview
┌─────────────────────────────────────────────────────────┐
│ Frontend (React + Inertia.js) │
│ ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌────────┐ │
│ │ Dashboard│ │ Task CRUD │ │ Workspace│ │ Users │ │
│ │ (Admin/ │ │ (Admin) │ │ (User) │ │ Mgmt │ │
│ │Candidate)│ │ │ │ │ │(Admin) │ │
│ └──────────┘ └───────────┘ └──────────┘ └────────┘ │
└────────────────────────┬────────────────────────────────┘
│ Inertia Protocol
┌────────────────────────┴────────────────────────────────┐
│ Backend (Laravel 12) │
│ ┌───────────────┐ ┌──────────────┐ ┌───────────────┐ │
│ │ Controllers │ │ Services │ │ Job Chains │ │
│ │ (6 controllers│ │ (Workspace, │ │ (Server Prov, │ │
│ │ + Dashboard)│ │ Judge, AI, │ │ Workspace │ │
│ │ │ │ Provision) │ │ Provisioning)│ │
│ └───────────────┘ └──────────────┘ └───────┬───────┘ │
│ │ │
│ ┌───────────────┐ ┌──────────────┐ ┌───────┴───────┐ │
│ │ Broadcasting │ │ Models │ │ Queue Worker │ │
│ │ (Pusher/ │ │ (12 models, │ │ (Redis + │ │
│ │ Soketi) │ │ 3 enums) │ │ Horizon) │ │
│ └───────────────┘ └──────────────┘ └───────────────┘ │
└────────────────────────┬────────────────────────────────┘
│
┌────────────────────────┴────────────────────────────────┐
│ Infrastructure │
│ ┌──────────┐ ┌───────────┐ ┌────────────┐ │
│ │ Docker │ │ Traefik │ │ Remote SSH │ │
│ │ Compose │ │ (Reverse │ │ Servers │ │
│ │ Per-Task │ │ Proxy) │ │ │ │
│ └──────────┘ └───────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────┘
Key Features
1. Docker-Based Sandboxed Workspaces
Each task can define a docker-compose.yaml template with placeholders. When a candidate starts a task, the system:
- Creates a Linux user on the target server
- Finds a free port for the container
- Runs pre-provisioning scripts
- Fills the Docker Compose template with dynamic data (domain, ports, paths)
- Starts the Docker Compose stack
- Runs post-provisioning scripts
- Optionally configures SSH access into the container
- Finalizes and marks the workspace as ready
All of this happens through a Laravel job chain with real-time progress tracking via WebSocket.
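The chain behaviour described above can be sketched in TypeScript. This is a minimal illustration, not the platform's PHP job chain: `Step`, `ProgressEvent`, and `runChain` are hypothetical names, and the `report` callback stands in for whatever broadcasts progress over WebSocket.

```typescript
// Hypothetical names throughout; the real pipeline is a Laravel job chain.
type StepStatus = "running" | "completed" | "failed";

interface ProgressEvent {
  step: string;
  index: number; // zero-based position in the chain
  total: number;
  status: StepStatus;
}

interface Step {
  name: string;
  run: () => Promise<void>;
}

// Runs steps strictly in order and stops at the first failure,
// reporting a progress event before and after each step.
async function runChain(
  steps: Step[],
  report: (event: ProgressEvent) => void,
): Promise<boolean> {
  for (const [index, step] of steps.entries()) {
    const base = { step: step.name, index, total: steps.length };
    report({ ...base, status: "running" });
    try {
      await step.run();
      report({ ...base, status: "completed" });
    } catch {
      report({ ...base, status: "failed" });
      return false; // remaining steps are skipped
    }
  }
  return true;
}
```

Stopping at the first failed step keeps a half-provisioned workspace from being marked as ready, and the per-step events give the UI enough information to render a step list with a current position.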
2. Multi-Type Judging System
The platform supports four distinct judging strategies, each implementing a JudgeInterface:
| Judge Type | How It Works | Use Case |
|---|---|---|
| QuizJudge | Multiple-choice questions with correct answers stored server-side | Knowledge checks, MCQs |
| TextJudge | Exact text matching against expected answers | CLI output verification, specific commands |
| AiJudge | Sends answers to OpenAI API with custom prompts for semantic evaluation | Open-ended questions, code review |
| AutoJudge | Runs a custom judge script on the server (WIP) | Custom automated testing |
Each judge returns a standardized result with score, details per question, and a lock recommendation.
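To make the shared contract concrete, here is a rough TypeScript sketch of a judge interface and an exact-match text judge. The real JudgeInterface is a PHP contract and may look different; every type and field name below (`JudgeResult`, `shouldLock`, `ExactTextJudge`, and so on) is an assumption for illustration, as is trimming whitespace before comparing.

```typescript
// Hypothetical shapes; the platform's PHP JudgeInterface may differ.
interface QuestionResult {
  questionId: number;
  correct: boolean;
  awarded: number;
}

interface JudgeResult {
  score: number;
  maxScore: number;
  details: QuestionResult[];
  shouldLock: boolean; // lock recommendation
}

interface Judge {
  evaluate(answers: Map<number, string>): JudgeResult;
}

interface TextQuestion {
  id: number;
  expected: string;
  points: number;
}

class ExactTextJudge implements Judge {
  private readonly questions: TextQuestion[];

  constructor(questions: TextQuestion[]) {
    this.questions = questions;
  }

  evaluate(answers: Map<number, string>): JudgeResult {
    const details = this.questions.map((q): QuestionResult => {
      // Assumption: surrounding whitespace is ignored before the exact comparison.
      const correct = (answers.get(q.id) ?? "").trim() === q.expected.trim();
      return { questionId: q.id, correct, awarded: correct ? q.points : 0 };
    });
    const score = details.reduce((sum, d) => sum + d.awarded, 0);
    const maxScore = this.questions.reduce((sum, q) => sum + q.points, 0);
    // Assumption: recommend a lock only when every answer is correct.
    const shouldLock = details.length > 0 && details.every((d) => d.correct);
    return { score, maxScore, details, shouldLock };
  }
}
```

Because every judge returns the same result shape, the code that records scores and applies locks does not need to know which strategy produced them.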
3. Attempt-Based Penalty System
Scores degrade with each attempt to incentivize preparation:
- Attempt 1: 100% of max score available
- Attempt 2: 90% of max score
- Attempt 3: 80% of max score
- …and so on
When the next attempt’s max possible score drops below 20% of the task’s total score, the task gets locked for that candidate. Tasks also lock on 100% correct answers (success lock).
Within each attempt, candidates get a configurable number of submissions (default: 3) before the attempt is marked as failed.
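The arithmetic behind the penalty can be sketched as follows. This assumes the decay stays linear at 10 percentage points per attempt (the "and so on" above) and that the lock check runs against the next attempt's ceiling. The function names are hypothetical.

```typescript
// Assumptions: linear decay of 10 percentage points per attempt, 20% lock threshold.
const DECAY_PERCENT = 10;
const LOCK_THRESHOLD_PERCENT = 20;

// Percentage of the task's total score available on a 1-based attempt number.
// Integer percentages avoid floating-point drift (1 - 0.1 * 8 is not exactly 0.2).
function availablePercent(attempt: number): number {
  return Math.max(0, 100 - DECAY_PERCENT * (attempt - 1));
}

// Highest score a candidate can earn on the given attempt.
function maxScoreForAttempt(totalScore: number, attempt: number): number {
  return (totalScore * availablePercent(attempt)) / 100;
}

// True when the attempt after `finishedAttempt` would fall below the threshold.
function shouldPenaltyLock(finishedAttempt: number): boolean {
  return availablePercent(finishedAttempt + 1) < LOCK_THRESHOLD_PERCENT;
}
```

Under these assumptions, attempt 9 is still allowed at exactly 20% of the total score, and the task locks once attempt 9 ends because attempt 10 would offer only 10%.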
4. Real-Time WebSocket Progress
The platform uses Laravel Broadcasting with Pusher/Soketi for real-time updates:
- Server provisioning progress: admins see each provisioning step update live
- Workspace provisioning progress: candidates see their workspace being set up
- Job run status updates: script execution status is broadcast in real time
Three dedicated broadcast channels with authorization:
- server-updates.{serverId}: admin only
- workspace-updates.{attemptId}: owner or admin
- job-runs-updated: admin only
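The authorization rules for these channels can be expressed as a single check. The sketch below is illustrative TypeScript rather than the platform's Laravel channel definitions. `ChannelUser`, `canJoin`, and the `attemptOwnerId` lookup are hypothetical, and numeric IDs are an assumption.

```typescript
interface ChannelUser {
  id: number;
  isAdmin: boolean;
}

// `attemptOwnerId` is a hypothetical lookup from attempt ID to owning user ID.
function canJoin(
  channel: string,
  user: ChannelUser,
  attemptOwnerId: (attemptId: number) => number | undefined,
): boolean {
  if (channel === "job-runs-updated") return user.isAdmin;
  if (/^server-updates\.\d+$/.test(channel)) return user.isAdmin;

  const match = /^workspace-updates\.(\d+)$/.exec(channel);
  if (match) {
    // Owner or admin may listen to a workspace's provisioning progress.
    return user.isAdmin || attemptOwnerId(Number(match[1])) === user.id;
  }
  return false; // unknown channels are denied by default
}
```

Denying unknown channel names by default means a newly added channel is closed until a rule is written for it.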
5. Server Provisioning Pipeline
Admins can register remote Linux servers and provision them automatically. The provisioning pipeline:
- StartProvisioningJob: Validates SSH connectivity
- UpdateServerPackageJob: apt update && apt upgrade
- InstallNecesseryPackagesJob: Core system packages
- InstallDockerJob: Docker Engine + Docker Compose
- UpdateServerFirewallJob: UFW firewall configuration
- InstallAndSetupTraefikJob: Traefik reverse proxy with Cloudflare DNS + Let’s Encrypt SSL
Each step uses Blade templates for script generation and executes via SSH using the ScriptEngine service.
6. Role-Based Dashboard
The dashboard renders different views based on user role:
Admin Dashboard:
- Total users, tasks, and submissions at a glance
- Active vs inactive task breakdown
- Recent activity table showing all candidate submissions
Candidate Dashboard:
- Remaining, completed, and locked task counts
- Total score earned
- Completion progress bar
- Recent submission history with attempt counts
Tech Stack
Backend
- Laravel 12 — PHP framework with Fortify auth, Horizon queues, Inertia SSR
- Laravel Horizon — Redis-powered queue dashboard for monitoring background jobs
- Spatie Laravel Permission — Role & permission management (admin/user)
- Pusher / Soketi — WebSocket broadcasting for real-time events
- Predis — Redis client for queues and caching
Frontend
- React 19 — UI library with TypeScript
- Inertia.js — SPA-like navigation without building an API
- Tailwind CSS 4 — Utility-first styling
- Radix UI — Accessible component primitives (Dialog, Select, Tabs, Progress, etc.)
- CodeMirror — In-browser YAML editor for Docker Compose templates
- date-fns — Lightweight date formatting
- Sonner — Toast notifications
- Lucide React — Icon library
Infrastructure
- Docker Compose — Per-task container orchestration with templated YAML
- Traefik — Reverse proxy with automatic SSL via Cloudflare DNS challenge
- SSH/Script Engine — Remote command execution for server and workspace provisioning
Database Schema
The system uses 12 Eloquent models backed by 26 migrations:
User
├── hasMany → UserTaskAttempt (attempts)
├── hasMany → TaskUserLock (task locks)
├── hasMany → Server (servers)
└── hasMany → ScriptJobRun (job runs)
Task
├── belongsTo → Server
├── hasMany → UserTaskAttempt (attempts)
├── hasMany → QuizJudge → hasMany → QuizQuestionAnswer
├── hasMany → TextJudge
├── hasMany → AiJudge
├── hasOne → AutoJudge
├── hasMany → TaskUserLock (locked users)
└── hasMany → ScriptJobRun (job runs)
UserTaskAttempt
├── belongsTo → User
├── belongsTo → Task
├── hasMany → ScriptJobRun (job runs)
└── hasMany → UserTaskAttemptAnswer
Server
└── belongsTo → User
Key enums:
- AttemptTaskStatus: pending, preparing, started, running, evaluating, completed, done, failed, attempted_failed, terminated, locked
- ScriptJobStatus: pending, running, in-progress, completed, failed
- TaskUserLockStatus: completed, penalty
Project Structure
app/
├── Contracts/ # JudgeInterface, TracksProgressInterface
├── Enums/ # AttemptTaskStatus, ScriptJobStatus, TaskUserLockStatus
├── Events/ # 6 broadcast events (Server, Workspace, ScriptJobRun)
├── Http/Controllers/ # Dashboard, Task, UserTask, Server, User, ScriptJobRun
├── Interfaces/ # SolutionCheckerInterface
├── Jobs/Scripts/
│ ├── Server/ # 7 server provisioning jobs
│ └── Workspace/ # 8 workspace provisioning jobs
├── Models/ # 12 Eloquent models
├── Services/
│ ├── AI/ # OpenAIService (GPT integration)
│ ├── JudgeServices/ # Quiz, Text, AI, Auto judge services
│ ├── Progress/ # WorkflowRegistry
│ └── ... # ScriptEngine, WorkspaceService, ServerProvisionService
└── Traits/ # HasMeta, TracksProgress, NotesAccessor
resources/js/
├── components/ # 60+ React components (UI primitives + feature components)
│ ├── ui/ # Radix-based shadcn/ui components
│ ├── AdminDashboard.tsx
│ ├── CandidateDashboard.tsx
│ ├── WorkflowProgressTracker.tsx
│ ├── ServerProvisionProgressTracker.tsx
│ ├── WorkspaceProgressTracker.tsx
│ ├── SubmissionResultModal.tsx
│ ├── QuizProgressModal.tsx
│ └── AiJudgeProgressModal.tsx
├── pages/
│ ├── auth/ # Login, Register, 2FA, Password reset
│ ├── dashboard.tsx # Role-based dashboard routing
│ ├── tasks/ # CRUD pages (admin)
│ ├── user/tasks/ # Task list + attempt workspace (candidate)
│ ├── servers/ # Server CRUD + provisioning view
│ ├── users/ # User management + attempt analytics
│ ├── jobs/ # Script job run viewer
│ └── settings/ # Profile, password, appearance, 2FA
└── layouts/ # App shell with sidebar navigation
Development Timeline
| Date | Milestone |
|---|---|
| Nov 3, 2025 | Project initialized with Laravel 12 + React starter kit |
| Nov 5 | Server provisioning with Docker integration, SSH credential management |
| Nov 6 | Script job execution engine, workspace provisioning pipeline, traits system |
| Nov 7 | Traefik reverse proxy working, judge configuration support (Quiz, Text, AI, Auto) |
| Nov 9 | HTTPS routing fix, SSH access to containers, task show page redesign, TextJudge evaluation working |
| Nov 10 | QuizJudge + AiJudge services, server provision progress tracking, edit page fixes |
| Nov 11 | Enum-based status system, phone registration, code cleanup (removed 6+ unused files), Pusher/Soketi WebSocket |
| Nov 12 | Full WebSocket integration, Jobs page redesign, User Management module |
| Nov 18 | Real-time workflow progress tracking via WebSocket, CodeMirror YAML editor |
| Nov 19 | Pre/post script execution steps, ScriptJobStatus enum migration |
| Nov 23-24 | Timer system, submission count tracking, user task view + analytics |
| Feb 17, 2026 | Role-based dashboard (Admin + Candidate views) |
Challenges & Learnings
Dynamic Docker Compose Templating
Each task defines a Docker Compose YAML with {{placeholders}}. The system fills these at runtime with attempt-specific data (domains, ports, paths). Getting Traefik routing to work correctly with dynamic subdomains and Cloudflare DNS was one of the trickier parts.
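A minimal sketch of the placeholder substitution, written in TypeScript for illustration. The `fillTemplate` helper and the placeholder names `port` and `workspace_path` are assumptions, not the platform's actual template variables.

```typescript
// Replaces {{name}} placeholders with values. Unknown placeholders throw,
// so a typo in a template cannot produce a half-filled compose file.
function fillTemplate(
  template: string,
  values: Record<string, string | number>,
): string {
  return template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (_match, key: string) => {
      if (!Object.prototype.hasOwnProperty.call(values, key)) {
        throw new Error(`Missing value for placeholder "${key}"`);
      }
      return String(values[key]);
    },
  );
}

// Hypothetical template; placeholder names are illustrative only.
const composeTemplate = [
  "services:",
  "  app:",
  "    image: nginx:alpine",
  "    ports:",
  '      - "{{port}}:80"',
  "    volumes:",
  "      - {{workspace_path}}:/workspace",
].join("\n");

const composeFile = fillTemplate(composeTemplate, {
  port: 20443,
  workspace_path: "/home/candidate_17/workspace",
});
```

Failing loudly on a missing value is a design choice for this sketch. Leaving the placeholder untouched would instead surface the problem later, as an invalid compose file at container start.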
Job Chain Orchestration with Progress Tracking
Laravel’s Bus::chain() dispatches jobs sequentially, but I needed each job to report its progress back to the frontend in real time. The solution was a TracksProgress trait that stores workflow state in a JSON metadata column and broadcasts updates via WebSocket after each step.
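The persisted state could look roughly like the following TypeScript sketch. The actual TracksProgress trait is PHP and its JSON layout is not shown here, so `WorkflowState`, `markStep`, and the `percent` field are all hypothetical.

```typescript
type StepState = "pending" | "running" | "completed" | "failed";

interface WorkflowStep {
  name: string;
  state: StepState;
}

interface WorkflowState {
  steps: WorkflowStep[];
}

// Every step starts as pending when the workflow is registered.
function createWorkflow(stepNames: string[]): WorkflowState {
  return {
    steps: stepNames.map((name): WorkflowStep => ({ name, state: "pending" })),
  };
}

// Returns a new state object; the previous one is left untouched.
function markStep(
  workflow: WorkflowState,
  name: string,
  state: StepState,
): WorkflowState {
  return {
    steps: workflow.steps.map((step): WorkflowStep =>
      step.name === name ? { ...step, state } : step,
    ),
  };
}

function percentComplete(workflow: WorkflowState): number {
  if (workflow.steps.length === 0) return 0;
  const done = workflow.steps.filter((s) => s.state === "completed").length;
  return Math.round((done / workflow.steps.length) * 100);
}

// The same JSON string could be stored in a metadata column and broadcast.
function toPayload(workflow: WorkflowState): string {
  return JSON.stringify({ ...workflow, percent: percentComplete(workflow) });
}
```

Storing the whole step list, rather than only a percentage, lets a client that connects mid-run rebuild the full progress view from a single read.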
Attempt Penalty & Locking Logic
The scoring system needed to handle multiple edge cases: success locks versus penalty locks, submission limits within attempts, and preventing race conditions when starting new attempts (solved with lockForUpdate()).
What’s Next
- AutoJudge implementation — Run custom judge scripts inside the container to validate solutions
- Leaderboard — Real-time ranking of candidates by total score
- Bulk task import/export — JSON/YAML format for task definitions
- Container resource limits — CPU/memory constraints per workspace
- Automated cleanup — Cron-based cleanup of expired workspaces and containers