Case Study
ZisuGen: Architecting a Production-Grade Multi-Agent Swarm for Global EdTech
Inside SoftGen's submission to Google's elite agentic hackathon. Stress-testing multi-model reasoning, building self-healing JSON repair pipelines, and structuring localized RAG vector spaces for enterprise scale.

The Challenge: Hardening Agents for the Real World
The Google for Startups AI Agents Challenge 2026 (April 22 - June 12, 2026) brought together over 1,560 developers to build autonomous systems driving real business results. Among three competitive tracks, SoftGen Solutions entered ZisuGen under Track 2: Optimize (Existing Agents).
Track 2 was designed for systems that run successfully in sandbox environments but face the unpredictable edge cases of real-world deployment. Our focus was treating AI quality as a rigorous engineering discipline: stress-testing multi-step reasoning, programmatically refining system instructions, eliminating hallucinations on localized curricula, and optimizing network costs.
Custom Orchestration Over Abstraction Frameworks
For an enterprise EdTech platform that demands high concurrency and sub-second latency, we found that standard orchestration wrappers (such as LangChain or CrewAI) add too much latency overhead, waste tokens, and give up granular execution control.
We therefore engineered a fully custom Multi-Agent Orchestrator (Swarm Logic) directly on the `@google/generative-ai` SDK and Vertex AI. The orchestrator routes dynamically between Gemini 3.1 Pro (for deep reasoning) and Gemini 2.5 Flash (for high-speed retrieval and OCR), uses the Groq SDK (LLaMA 3.3 70B) for drafting, and issues raw pgvector queries on Cloud SQL, bypassing vector abstraction libraries. The result is a cognitive engine tuned more tightly to our workload than we could achieve with a standard agent development kit (ADK).
Multi-Model Orchestration & Cognitive Routing
ZisuGen routes user queries to four distinct cognitive modes based on task complexity, cost matrices, and latency targets:
- Quick answers and standard RAG queries are routed to `Gemini 2.5 Flash`, minimizing compute overhead and cost.
- Complex problems, most often A/L Physics, are routed to `Gemini 3.1 Pro`, which surfaces its extended reasoning steps in the UI before outputting the final answer.
- Guided learning is handled via `guided-learning-handler.ts`. Instead of providing direct answers, it invokes `Gemini 3.1 Pro` to analyze the student's step-by-step mathematical progression.
- Image-based problem solving is handled via `problem-solver-handler.ts`. When an image is uploaded, the pipeline decodes the base64 payload and routes it to a Pro Vision model.
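The routing decision described above can be sketched as a small pure function. This is a minimal illustration, not ZisuGen's actual orchestrator: the mode names (`fast`, `thinking`, `guided`, `vision`), the `QuerySignals` fields, and the priority order are all assumptions, and the model strings are the labels used in this article rather than verified API model IDs.

```typescript
// Mode names are hypothetical labels; the article does not name the four modes.
type CognitiveMode = "fast" | "thinking" | "guided" | "vision";

// Hypothetical signals extracted from an incoming query.
interface QuerySignals {
  hasImage: boolean;           // an image was uploaded with the query
  wantsGuidance: boolean;      // the student asked to be coached, not answered
  needsDeepReasoning: boolean; // e.g. a multi-step A/L Physics problem
}

interface Route {
  mode: CognitiveMode;
  model: string;          // label as written in the article, not a verified model ID
  handler: string | null; // handler file named in the article, or null if none is named
}

const ROUTES: Record<CognitiveMode, Route> = {
  fast: { mode: "fast", model: "Gemini 2.5 Flash", handler: null },
  thinking: { mode: "thinking", model: "Gemini 3.1 Pro", handler: null },
  guided: { mode: "guided", model: "Gemini 3.1 Pro", handler: "guided-learning-handler.ts" },
  vision: { mode: "vision", model: "Pro Vision model", handler: "problem-solver-handler.ts" },
};

// Assumed priority: image input first, then guidance, then deep reasoning,
// falling back to the cheapest mode.
function selectMode(signals: QuerySignals): CognitiveMode {
  if (signals.hasImage) return "vision";
  if (signals.wantsGuidance) return "guided";
  if (signals.needsDeepReasoning) return "thinking";
  return "fast";
}

function routeQuery(signals: QuerySignals): Route {
  return ROUTES[selectMode(signals)];
}
```

Keeping the decision in a pure function makes the cost and latency trade-off easy to unit test before any model call is made.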
Overcoming LLM Limitations: The 4-Stage JSON Repair Pipeline
During high-compute generation tasks, such as producing 50-question MCQ examination papers, LLMs occasionally output truncated JSON (when the token limit is exhausted) or malformed LaTeX math formatting that breaks parsing.
To solve this, we engineered a custom 4-Stage JSON Repair & Recovery Pipeline that acts as middleware between the raw LLM output and the UI renderer:
- 01/Bracket Balancing: Dynamically scans the response string and auto-appends missing closing tags, curly braces, and square brackets.
- 02/LaTeX Sanitizer: Escapes backslashes and normalizes mathematical notation to prevent rendering crashes.
- 03/Truncation Parsing: Extracts partial objects from incomplete JSON streams and preserves the academic integrity of successfully generated content.
- 04/Validation Check: Ensures the output conforms strictly to the TypeScript interface before passing it to the frontend.
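As an illustration of the first stage, the sketch below closes whatever a truncated response left open. It is a simplified assumption of how bracket balancing might work, not the production middleware; the function name `balanceBrackets` is hypothetical. It tracks string literals so that brackets inside question text are not counted as structure.

```typescript
// Stage 1 sketch: append the closing characters a truncated JSON string is missing.
function balanceBrackets(raw: string): string {
  const closers: string[] = []; // expected closing characters, innermost last
  let inString = false;
  let escaped = false;

  for (let i = 0; i < raw.length; i++) {
    const ch = raw.charAt(i);
    if (inString) {
      // Inside a string literal, brackets are data, not structure.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if ((ch === "}" || ch === "]") && closers[closers.length - 1] === ch) {
      closers.pop();
    }
  }

  let repaired = raw;
  if (inString) {
    // The stream ended mid-string: drop a dangling backslash, then close the quote.
    if (escaped) repaired = repaired.slice(0, -1);
    repaired += '"';
  } else {
    // The stream ended between values: drop a trailing comma and whitespace.
    repaired = repaired.replace(/[\s,]+$/, "");
  }
  // Close the open containers from the innermost outwards.
  return repaired + closers.slice().reverse().join("");
}
```

Balancing alone does not cover every truncation. If the stream ends in the middle of an object key, the balanced string is still invalid JSON, which is the case a truncation-parsing stage would have to handle by discarding the incomplete object.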
JSON Auto-Healing Telemetry
Raw LLM output (truncated, with an unescaped LaTeX backslash):
{ "questions": [ { "id": 1, "text": "f = G\frac{m_1m_2}{r^2}", "options": [ "A", "B"...
Repaired output:
{ "questions": [ { "id": 1, "text": "f = G\\frac{m_1m_2}{r^2}", "options": [ "A", "B" ] } ] }
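The backslash in the example above is worth a closer look: `\f` is a valid JSON escape (form feed), so the unrepaired string parses without error and silently corrupts `\frac`. The sketch below shows one way a LaTeX sanitizer and a validation check could work. It is an assumption, not SoftGen's implementation: the command allowlist is illustrative and incomplete, the function names are hypothetical, and the `ExamPaper` shape is inferred only from the fields visible in the telemetry sample.

```typescript
// Stage 2 sketch: double lone backslashes that start a LaTeX command.
const LATEX_BACKSLASH = new RegExp(
  [
    // An already-escaped pair: matched first so it is left unchanged.
    /\\\\/.source,
    // A letter that is not a JSON escape (b, f, n, r, t, u): must be LaTeX.
    /\\(?=[ac-eg-mo-qsv-zA-Z])/.source,
    // Ambiguous first letter: only escape known commands (illustrative allowlist).
    /\\(?=(?:beta|frac|nabla|neq|rho|tau|text|theta|times)(?![A-Za-z]))/.source,
    // "\u" that is not followed by four hex digits is not a Unicode escape.
    /\\(?=u(?![0-9a-fA-F]{4}))/.source,
  ].join("|"),
  "g"
);

function escapeLatexBackslashes(rawJson: string): string {
  return rawJson.replace(LATEX_BACKSLASH, function (match: string): string {
    return match.length === 2 ? match : "\\\\";
  });
}

// Stage 4 sketch: shape inferred from the telemetry sample (an assumption).
interface Question {
  id: number;
  text: string;
  options: string[];
}

interface ExamPaper {
  questions: Question[];
}

function isExamPaper(value: unknown): value is ExamPaper {
  if (typeof value !== "object" || value === null) return false;
  const questions = (value as { questions?: unknown }).questions;
  if (!Array.isArray(questions)) return false;
  return questions.every(function (q: unknown): boolean {
    if (typeof q !== "object" || q === null) return false;
    const c = q as { id?: unknown; text?: unknown; options?: unknown };
    return (
      typeof c.id === "number" &&
      typeof c.text === "string" &&
      Array.isArray(c.options) &&
      c.options.every(function (o: unknown): boolean {
        return typeof o === "string";
      })
    );
  });
}
```

The allowlist is a heuristic. A tab followed by the word "ext", for example, is indistinguishable from `\text`, so a real sanitizer would need a policy for such collisions.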
The Sri Lankan Data Moat: pgvector & Temporal Logic
Ingesting localized data presented unique engineering hurdles. Sri Lanka's national syllabus textbooks and past papers are largely stored in legacy, non-Unicode encodings (such as FM Abaya and Bamini) that break standard LLM ingestion.
SoftGen built a custom Transcoding Middleware that converts FM Abaya strings into Unicode (UTF-8) before vectorization. The resulting chunk embeddings are stored in Cloud SQL for PostgreSQL using pgvector.
To manage curriculum iterations, we architected a Bitemporal Data Model using `validFromYear` and `validUntilYear` parameters. When the National Institute of Education (NIE) updates a subject syllabus, ZisuGen performs a Delta Ingestion. Legacy vectors are expired and new curriculum namespaces are injected with temporal validity parameters, ensuring the Socratic tutor never references deprecated concepts.
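A minimal sketch of the temporal filter and the Delta Ingestion step follows. Only `validFromYear` and `validUntilYear` come from the description above. The table name `curriculum_chunks`, the other column names, the function names, and the convention that `validUntilYear` is inclusive (with `null` meaning "still current") are assumptions made for illustration. The query uses pgvector's `<=>` cosine-distance operator.

```typescript
interface ChunkValidity {
  id: string;
  subject: string;
  validFromYear: number;
  validUntilYear: number | null; // null = still current (assumed convention)
}

interface ParameterizedQuery {
  text: string;
  values: Array<string | number>;
}

// Build a raw pgvector similarity query restricted to chunks valid in `year`.
function buildCurriculumQuery(
  embedding: number[],
  subject: string,
  year: number,
  limit: number
): ParameterizedQuery {
  const vectorLiteral = "[" + embedding.join(",") + "]"; // pgvector text format
  const text = [
    "SELECT id, content",
    "FROM curriculum_chunks", // hypothetical table name
    "WHERE subject = $2",
    '  AND "validFromYear" <= $3',
    '  AND ("validUntilYear" IS NULL OR "validUntilYear" >= $3)',
    "ORDER BY embedding <=> $1::vector", // cosine distance, nearest first
    "LIMIT $4",
  ].join("\n");
  return { text: text, values: [vectorLiteral, subject, year, limit] };
}

function isValidIn(chunk: ChunkValidity, year: number): boolean {
  return (
    chunk.validFromYear <= year &&
    (chunk.validUntilYear === null || chunk.validUntilYear >= year)
  );
}

// Delta ingestion: expire current chunks of the updated subjects, then add the new ones.
function applyDelta(
  existing: ChunkValidity[],
  incoming: ChunkValidity[],
  effectiveYear: number
): ChunkValidity[] {
  const updatedSubjects = incoming.map(function (c) { return c.subject; });
  const expired = existing.map(function (c): ChunkValidity {
    const affected = updatedSubjects.indexOf(c.subject) !== -1 && c.validUntilYear === null;
    if (!affected) return c;
    return {
      id: c.id,
      subject: c.subject,
      validFromYear: c.validFromYear,
      validUntilYear: effectiveYear - 1, // last year the old syllabus applies
    };
  });
  return expired.concat(incoming);
}
```

Because expired rows are kept rather than deleted, a query for an earlier exam year can still retrieve the syllabus that applied at the time.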
Scale, Cost Governance & Market Validation
Running autonomous swarms and multimedia generation pipelines requires enterprise-grade financial guardrails. We engineered a Dynamic AI Service Costs Matrix to protect our billing accounts:
Achieved by keeping CAC at an absolute minimum (Rs. 150.00) via our Programmatic SEO Flywheel.
Horizontal scaling on Google Cloud Run handles the Sunday 9:00 AM thundering herd.
Real active students sitting for exams, generating micro-psychometric telemetry in production.
Google Cloud & Antigravity Synergy
By combining Google Cloud Run serverless agility with Firebase real-time synchronization and Gemini multi-model agent capabilities, ZisuGen demonstrates what is possible when human ingenuity meets cutting-edge developer platforms.
Under the leadership of Founder & CEO Shehan Chamika, the entire enterprise-grade platform was conceptualized, coded, and deployed to production in a two-month sprint using only the Google Antigravity AI Agent IDE. ZisuGen stands ready to scale from Sri Lanka to emerging markets globally.

