Soban Ejaz

Generative AI Engineer — Autonomous Agents, RAG Systems, LLM Orchestration

sobanpythonista@gmail.com · github.com/SOBANEJAZ · linkedin.com/in/sobanejaz · Lahore, Pakistan

Generative AI Engineer specializing in autonomous agents, RAG systems, and LLM orchestration. Expert in designing scalable multi-agent architectures and production-ready AI pipelines that bridge the gap between prototyping and deployment.

Technical Skills

AI & LLMs:

OpenAI, Groq, Gemini, Anthropic, OpenClaw, MCP

Frameworks:

CrewAI, LangGraph, LangChain, LlamaIndex, PydanticAI, FastMCP

Data & RAG:

Pinecone, ChromaDB, Crawl4AI, Firecrawl, Hybrid Search, Reranking

Frontend:

React, Tailwind CSS, Next.js, Streamlit, Chainlit

Backend/Ops:

PostgreSQL, Supabase, Firebase, Docker, Linux

Education

BS Computer Science

NASTP Institute of Information Technology

Sep 2025 — Present

Relevant Coursework: Machine Learning, Data Science, Deep Learning

Experience

AI Engineer

Dec 2025 — Apr 2026

Designed and optimized multi-agent systems using CrewAI, LangChain, and LlamaIndex, improving reliability of LLM-powered applications
Built and deployed RAG pipelines and AI prototypes with Streamlit and Chainlit for production-ready workflows
Integrated APIs from Groq, Gemini, and OpenAI to deliver scalable, multi-model AI solutions
Collaborated with remote, cross-functional agile teams to deliver AI-driven software

AI Engineer

Sep 2025 — Dec 2025

Integrated Generative AI into core EdTech products to enhance student engagement and personalized learning
Implemented a RAG pipeline to deliver personalized course recommendations to students at scale
Collaborated with the technical lead to refine AI features around student engagement and learning outcomes
Pitched the GenAI-powered EdTech platform to the National Incubation Center, presenting its AI-driven, curriculum-aligned learning features

Key Projects

Narrate-AI

Python, Groq, Pinecone, ElevenLabs, OpenCLIP

Built a multi-phase autonomous pipeline (Research → Script → Image Retrieval → Video) that converts any topic into a narrated documentary with zero manual input
Engineered a full RAG stack: web crawling with Crawl4AI, chunked indexing into Pinecone, and semantic retrieval to ground the LLM script writer in real sources
Integrated OpenCLIP for vision-language image ranking and ElevenLabs TTS for high-quality narration; output renders as a final MP4 at 1280×720

Agentic Employee Tracking & QA System

Python, GPT-4o, OpenPhone, Monday.com

Refactored 20+ scripts into a modular in-memory pipeline, cutting end-to-end execution time by ~80%
Automated two-way sync between OpenPhone call logs and Monday.com boards, using GPT-4o Structured Outputs to audit calls for billing compliance and transcript relevance
Implemented Pydantic validation and asynchronous requests for robust, production-grade data handling

AI Voice Agent

Python, Whisper, Gemini, ElevenLabs

Built a real-time voice agent using faster-whisper for STT, Gemini for reasoning, and ElevenLabs streaming TTS for low-latency spoken conversation
Achieved a seamless microphone-to-speaker loop with minimal perceptible delay via optimized async audio streaming

Gemini MCP Server

Python, FastMCP, SSE

Developed an MCP server exposing Gemini API documentation lookups and ready-to-use code snippets to any MCP-compatible client
Added support for both local and remote VM deployment via SSE over HTTP, secured with token-based authentication

AI Competitor Analysis Agent

Python, Firecrawl, Groq, Pinecone, Streamlit

Built a high-performance RAG tool using async parallel crawling (asyncio + ThreadPoolExecutor) to reduce data acquisition time by ~50% vs. sequential scrapers
Leveraged Groq LPU inference (Llama 3 70B) and ephemeral Pinecone namespaces for near-instant competitor analysis with automatic context cleanup