Smart learning engine for a driving theory platform

The client was building a driving theory exam prep app for a Central Asian market — think millions of potential users in a region where passing the theory test is a genuine barrier to getting a license. The existing study options were PDFs of outdated question banks and a handful of apps that felt like they were built in 2012. The goal was straightforward: build a learning engine that actually helps people pass, tracks their progress honestly, and works in both Russian and Uzbek from day one.

Topic-based engine and tickets

[Figure: exam simulation screen showing a question with answer options and a timer]

The core of what we built is a topic-based learning system backed by a full exam simulation engine. Users study questions organized by traffic rule topics (road signs, right of way, emergency procedures) and then test themselves with exam tickets. Each ticket is a randomized set of questions, but "randomized" here doesn't mean pulling questions from a hat. The real exam follows a strict distribution: a fixed number of questions per topic, weighted toward areas that matter most for road safety. We replicated that logic exactly, so practice tickets feel identical to the real thing. The generation algorithm fills each topic's quota from the question bank using weighted random selection within that topic's bucket, and it's fast enough to generate a fresh ticket on every request without caching the combinations themselves.
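A minimal sketch of per-topic ticket generation, assuming a hypothetical `Question` record with a per-question `weight` and a `distribution` mapping of topic to question count. The names and the sampling loop are illustrative, not the production code: each topic bucket is sampled without replacement, with higher-weight questions more likely to be picked.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    id: int
    topic: str
    weight: float = 1.0  # hypothetical per-question selection weight


def weighted_sample(pool, k, rng):
    """Pick k distinct questions; a higher weight means a higher chance."""
    if k > len(pool):
        raise ValueError("not enough questions in this topic bucket")
    remaining = list(pool)
    picked = []
    for _ in range(k):
        weights = [q.weight for q in remaining]
        choice = rng.choices(remaining, weights=weights, k=1)[0]
        picked.append(choice)
        remaining.remove(choice)  # without replacement
    return picked


def generate_ticket(bank, distribution, rng=None):
    """Build one ticket: a fixed number of questions from each topic bucket.

    bank: dict mapping topic -> list of Question
    distribution: dict mapping topic -> number of questions required
    """
    rng = rng or random.Random()
    ticket = []
    for topic, count in distribution.items():
        ticket.extend(weighted_sample(bank[topic], count, rng))
    rng.shuffle(ticket)  # mix topics so the order is not predictable
    return ticket
```

Because the per-topic counts are fixed up front, every ticket has the same shape no matter which questions are drawn, and nothing needs to be cached between requests.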

Progress without choking the DB

Progress tracking was where things got interesting, and expensive if done naively. Every answer a user submits updates their profile: correct/wrong counts per topic, time spent per question, rolling accuracy trends, improvement curves over the last week. Multiply that by thousands of concurrent users, and you're looking at a write-heavy workload that can easily choke a database. We solved this with a two-tier approach: individual answer events go into a lightweight append-only log, and aggregated statistics are recomputed asynchronously using batch queries on a schedule. The user sees near-real-time stats, but the database isn't drowning in UPDATE statements on hot rows. On the read side, cachetools caches topic-level stats per user with short TTLs, so a repeated dashboard request within a few seconds is served from memory instead of hitting the database again.
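The two-tier write path can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's schema: the table and column names are hypothetical, and Python's built-in `sqlite3` stands in for PostgreSQL so the example runs on its own. Answers are only ever inserted into the log, and a scheduled job folds new events into the per-topic aggregates in a single batch statement, using a watermark to remember how far it has read.

```python
import sqlite3

# Hypothetical schema; sqlite3 is a stand-in for PostgreSQL here.
SCHEMA = """
CREATE TABLE answer_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    topic TEXT NOT NULL,
    is_correct INTEGER NOT NULL
);
CREATE TABLE topic_stats (
    user_id INTEGER NOT NULL,
    topic TEXT NOT NULL,
    correct INTEGER NOT NULL,
    wrong INTEGER NOT NULL,
    PRIMARY KEY (user_id, topic)
);
CREATE TABLE aggregation_state (last_event_id INTEGER NOT NULL);
INSERT INTO aggregation_state VALUES (0);
"""


def record_answer(conn, user_id, topic, is_correct):
    """Hot path: one append-only INSERT, no UPDATE on shared rows."""
    conn.execute(
        "INSERT INTO answer_events (user_id, topic, is_correct) VALUES (?, ?, ?)",
        (user_id, topic, int(is_correct)),
    )
    conn.commit()


def aggregate_new_events(conn):
    """Scheduled job: fold events past the watermark into topic_stats."""
    (last_id,) = conn.execute(
        "SELECT last_event_id FROM aggregation_state"
    ).fetchone()
    (max_id,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM answer_events"
    ).fetchone()
    if max_id == last_id:
        return  # nothing new since the last run
    conn.execute(
        """
        INSERT INTO topic_stats (user_id, topic, correct, wrong)
        SELECT user_id, topic, SUM(is_correct), SUM(1 - is_correct)
        FROM answer_events
        WHERE id > ? AND id <= ?
        GROUP BY user_id, topic
        ON CONFLICT (user_id, topic) DO UPDATE SET
            correct = correct + excluded.correct,
            wrong = wrong + excluded.wrong
        """,
        (last_id, max_id),
    )
    conn.execute("UPDATE aggregation_state SET last_event_id = ?", (max_id,))
    conn.commit()
```

Each aggregate row is touched at most once per run, regardless of how many answers arrived in between. The read-side cache is left out of the sketch.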

Append-only events, batch aggregates

Per-answer writes land in a log while scheduled recomputation keeps hot rows from melting the database.

Bilingual content integrity

The bilingual challenge was more subtle than it sounds. Every question, every answer option, every explanation exists in both Russian and Uzbek. That's not just a translation layer — it's a consistency problem. When a question gets updated (because traffic laws change), both language versions need to update atomically. We enforced this at the database level: question content lives in a paired structure where both translations share a single version counter. If a Russian text gets edited but the Uzbek translation hasn't caught up yet, the system flags it as "pending review" rather than serving stale content. It's a small thing, but in a domain where one wrong word in a question can flip the correct answer, precision matters.
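To make the shared version counter concrete, here is a small in-memory sketch. The real enforcement lives in the database, so the class names, field names, and status strings below are assumptions for illustration only. Each translation records the version it was last written at, and a question is only servable when both languages match the shared counter.

```python
from dataclasses import dataclass

LANGS = ("ru", "uz")


@dataclass
class Translation:
    text: str
    version: int  # shared-counter value this text was written against


@dataclass
class PairedQuestion:
    question_id: int
    version: int       # single counter shared by both languages
    translations: dict  # language code -> Translation

    def edit(self, lang, new_text):
        """Editing one language bumps the shared counter; the other now lags."""
        self.version += 1
        self.translations[lang] = Translation(new_text, self.version)

    def confirm(self, lang, text):
        """A reviewer brings the lagging translation up to the current version."""
        self.translations[lang] = Translation(text, self.version)

    @property
    def status(self):
        in_sync = all(
            self.translations[lang].version == self.version for lang in LANGS
        )
        return "published" if in_sync else "pending_review"
```

With this shape, an edit to the Russian text alone moves the question to `pending_review` until the Uzbek text is confirmed against the same version.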

Stateful exam sessions

The exam simulation itself runs with a countdown timer, question navigation, the ability to flag questions for review, and a detailed results breakdown at the end — split by topic, showing exactly which areas need more work. We built this as a stateful session on the backend rather than relying on the client, so closing the app mid-exam doesn't lose progress. The session persists in PostgreSQL with a TTL, and the user can resume within the time window.
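A rough sketch of the server-side session idea, with an in-memory dictionary standing in for the PostgreSQL table. PostgreSQL has no built-in row expiry, so the sketch assumes the TTL is stored as an `expires_at` timestamp and checked on every access. All names here are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ExamSession:
    session_id: str
    question_ids: list
    expires_at: datetime
    answers: dict = field(default_factory=dict)  # question_id -> chosen option
    flagged: set = field(default_factory=set)    # question_ids marked for review


class SessionStore:
    """In-memory stand-in for a sessions table keyed by session_id."""

    def __init__(self):
        self._rows = {}

    def start(self, session_id, question_ids, duration, now):
        session = ExamSession(session_id, list(question_ids), now + duration)
        self._rows[session_id] = session
        return session

    def resume(self, session_id, now) -> Optional[ExamSession]:
        """Return the session if it exists and its time window is still open."""
        session = self._rows.get(session_id)
        if session is None or now >= session.expires_at:
            return None
        return session

    def answer(self, session_id, question_id, option, now) -> bool:
        """Record an answer; rejected once the session has expired."""
        session = self.resume(session_id, now)
        if session is None or question_id not in session.question_ids:
            return False
        session.answers[question_id] = option
        return True


def remaining_seconds(session, now) -> int:
    """Countdown derived from the server clock, not from the client."""
    return max(int((session.expires_at - now).total_seconds()), 0)
```

Because the deadline and the answers live on the server, a client that reconnects only needs the session id to pick up where it left off, and the countdown cannot be reset by restarting the app.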

Paired RU/UZ content versions

A shared version counter flags incomplete translation pairs so mismatched questions never ship after law updates.

Results and lessons

The result is a learning engine that handles the full cycle from first-time study to exam-ready confidence. According to post-exam surveys the client started running, users who complete at least 80% of the question bank in the app pass the real exam at a noticeably higher rate. The honest takeaway from this build: the hardest part wasn't the algorithms or the database optimization. It was the bilingual content integrity. Technical debt in translations compounds silently: you don't notice a mismatched question until a user reports it, and by then trust is already damaged. If we were starting over, we'd build the translation review pipeline before the learning engine, not after.

Stack

Backend: FastAPI, psycopg (async), Pydantic, cachetools

Database: PostgreSQL

Architecture: stateful exam sessions, batch aggregation for statistics, paired bilingual content model