
Soban Ejaz

Generative AI Engineer — Autonomous Agents, RAG Systems, LLM Orchestration

Generative AI Engineer specializing in autonomous agents, RAG systems, and LLM orchestration. Designs scalable multi-agent architectures and production-ready AI pipelines that carry prototypes through to deployment.

Technical Skills

AI & LLMs:
OpenAI, Groq, Gemini, Anthropic, OpenClaw, MCP
Frameworks:
CrewAI, LangGraph, LangChain, LlamaIndex, PydanticAI, FastMCP
Data & RAG:
Pinecone, ChromaDB, Crawl4AI, Firecrawl, Hybrid Search, Reranking
Frontend:
React, Tailwind CSS, Next.js, Streamlit, Chainlit
Backend/Ops:
PostgreSQL, Supabase, Firebase, Docker, Linux

Education

BS Computer Science

NASTP Institute of Information Technology

Sep 2025 — Present

Relevant Coursework: Machine Learning, Data Science, Deep Learning

Experience

AI Engineer (Intern)

Spiral Lab, Lahore

Dec 2025 — Mar 2026
  • Designed and optimized multi-agent systems using CrewAI, LangChain, and LlamaIndex, improving reliability of LLM-powered applications
  • Built and deployed RAG pipelines and AI prototypes with Streamlit and Chainlit for production-ready workflows
  • Integrated APIs from Groq, Gemini, and OpenAI to deliver scalable, multi-model AI solutions
  • Collaborated with remote, cross-functional agile teams on AI-driven software

AI Engineer (Intern)

Skill2Success, Lahore

Oct 2025 — Nov 2025
  • Integrated Generative AI into core EdTech products to enhance student engagement and personalized learning
  • Implemented a RAG pipeline to deliver personalized course recommendations to students at scale
  • Collaborated with the technical lead to refine AI features around student engagement and learning outcomes

Key Projects

Narrate-AI

Python, Groq, Pinecone, ElevenLabs, OpenCLIP

  • Built a multiphase autonomous pipeline (Research → Script → Image Retrieval → Video) that converts any topic into a narrated documentary with zero manual input
  • Engineered a full RAG stack: web crawling with Crawl4AI, chunked indexing into Pinecone, and semantic retrieval to ground the LLM script writer in real sources
  • Integrated OpenCLIP for vision-language image ranking and ElevenLabs TTS for high-quality narration; output renders as a final MP4 at 1280×720
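The chunked-indexing step mentioned above can be illustrated with a minimal sketch. It assumes fixed-size character windows with overlap; the function name `chunk_text` and its default sizes are hypothetical and not taken from the project, which may chunk by tokens or sentences instead.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: sizes are illustrative, not the project's settings.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Each chunk shares `overlap` characters with the previous one, so a
    # sentence cut at a boundary still appears whole in one of the chunks.
    print(chunk_text("abcdefghij", size=4, overlap=2))
```

Each chunk would then be embedded and upserted into the vector index; that part depends on the embedding model and client in use, so it is left out here.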

Agentic Employee Tracking & QA System

Python, GPT-4o, OpenPhone, Monday.com

  • Refactored 20+ scripts into a modular in-memory pipeline, cutting end-to-end execution time by ~80%
  • Automated two-way sync between OpenPhone call logs and Monday.com boards, using GPT-4o Structured Outputs to audit calls for billing compliance and transcript relevance
  • Implemented Pydantic validation and asynchronous requests for robust, production-grade data handling
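The validate-at-the-boundary idea behind the Pydantic bullet can be sketched as follows. To keep the example dependency-free it uses standard-library dataclasses as a stand-in for Pydantic; the `CallRecord` fields and the `parse_records` helper are hypothetical and do not reflect the project's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical call-log row; field names are assumptions."""
    call_id: str
    duration_seconds: int
    transcript: str

    def __post_init__(self):
        # Reject bad rows at construction time, before they enter the pipeline.
        if not self.call_id:
            raise ValueError("call_id must be non-empty")
        if self.duration_seconds < 0:
            raise ValueError("duration_seconds must be >= 0")


def parse_records(raw_rows):
    """Return (valid records, rejected rows paired with the failure reason)."""
    valid, rejected = [], []
    for row in raw_rows:
        try:
            valid.append(CallRecord(**row))
        except (TypeError, ValueError) as exc:  # missing keys or failed checks
            rejected.append((row, str(exc)))
    return valid, rejected
```

With Pydantic, the same shape would be a `BaseModel` subclass with field constraints, which also coerces and type-checks values; the dataclass version only runs the explicit checks written above.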

AI Voice Agent

Python, Whisper, Gemini, ElevenLabs

  • Built a real-time voice AI using faster-whisper STT, Gemini for reasoning, and ElevenLabs streaming TTS for low-latency spoken conversation
  • Achieved a seamless microphone-to-speaker loop with minimal perceptible delay via optimized async audio streaming
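The microphone-to-speaker loop can be pictured as a chain of async stages connected by bounded queues. The sketch below is an assumption about the general shape, not the project's code: the three lambdas are placeholders for the STT, reasoning, and TTS calls, and `stage`, `feed`, and `run_loop` are hypothetical names.

```python
import asyncio


async def stage(inbox, outbox, transform):
    """Forward transformed items until the None sentinel arrives."""
    while True:
        item = await inbox.get()
        if item is None:
            await outbox.put(None)  # propagate shutdown downstream
            return
        await outbox.put(transform(item))


async def feed(frames, mic):
    """Push audio frames into the first queue, then signal end of input."""
    for frame in frames:
        await mic.put(frame)
    await mic.put(None)


async def run_loop(frames):
    # Bounded queues apply backpressure so a slow stage cannot be flooded.
    mic, text, reply, speaker = (asyncio.Queue(maxsize=8) for _ in range(4))
    tasks = [
        asyncio.create_task(feed(frames, mic)),
        asyncio.create_task(stage(mic, text, lambda f: f"heard:{f}")),        # STT placeholder
        asyncio.create_task(stage(text, reply, lambda t: f"reply({t})")),    # LLM placeholder
        asyncio.create_task(stage(reply, speaker, lambda r: f"audio[{r}]")),  # TTS placeholder
    ]
    played = []
    while (chunk := await speaker.get()) is not None:
        played.append(chunk)  # a real agent would write this to the speaker
    await asyncio.gather(*tasks)
    return played


if __name__ == "__main__":
    print(asyncio.run(run_loop(["frame-1", "frame-2"])))
```

Because every stage runs concurrently, the first frame can already be "spoken" while later frames are still being transcribed, which is what keeps perceived latency low in a streaming design.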

Gemini MCP Server

Python, FastMCP, SSE

  • Developed an MCP server exposing Gemini API documentation lookups and ready-to-use code snippets to any MCP-compatible client
  • Supports local and remote (VM) deployment, served via SSE over HTTP with token-based authentication
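Token-based authentication for an HTTP endpoint often comes down to a check like the one sketched here. It assumes an `Authorization: Bearer <token>` header, which the project description does not specify, and the helper name `is_authorized` is hypothetical.

```python
import hmac


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Check a Bearer token from request headers in constant time.

    Assumes header names arrive exactly as "Authorization"; real frameworks
    usually provide case-insensitive header lookups.
    """
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

In a deployed server the expected token would typically come from an environment variable or secret store, and the check would run before the SSE stream is opened.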

AI Competitor Analysis Agent

Python, Firecrawl, Groq, Pinecone, Streamlit

  • Built a high-performance RAG tool using async parallel crawling (asyncio + ThreadPoolExecutor) to reduce data acquisition time by ~50% vs. sequential scrapers
  • Used Groq LPU inference (Llama 3 70B) and ephemeral Pinecone namespaces for near-instant competitor analysis with automatic context cleanup
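The asyncio + ThreadPoolExecutor pattern named above can be sketched as follows. The blocking `fetch_page` function is a hypothetical stand-in for a synchronous scraper call (it sleeps instead of hitting the network), and `crawl_all` and the worker count are illustrative assumptions.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(url: str) -> str:
    """Hypothetical blocking scraper call; sleeps to simulate network latency."""
    time.sleep(0.1)
    return f"<html>{url}</html>"


async def crawl_all(urls, max_workers: int = 8):
    """Run blocking fetches in worker threads and await them together."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, fetch_page, url) for url in urls]
        # gather returns results in the same order as the input URLs.
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    urls = [f"https://example.com/page-{i}" for i in range(4)]
    start = time.perf_counter()
    pages = asyncio.run(crawl_all(urls))
    print(len(pages), "pages in", round(time.perf_counter() - start, 2), "s")
```

Offloading to threads suits scraper clients that only expose synchronous calls; the event loop stays free while the fetches overlap, so total time approaches that of the slowest page, not the sum of all pages.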