Video Processing Pipeline
Built a scalable video platform supporting upload, processing, live streaming, moderation, and AI-powered video interaction.

Overview
Developed a comprehensive video infrastructure platform that enables video upload, processing, streaming, moderation, and AI-powered content interaction. The system leverages a microservices architecture with Spring Boot, Kafka, and RabbitMQ to orchestrate asynchronous media workflows. It supports RTMP live stream ingestion, adaptive bitrate HLS generation, automated video processing pipelines, and a Retrieval-Augmented Generation (RAG) chatbot that allows users to query video content using generated transcripts.
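As a rough illustration of how such asynchronous workflows can be chained, the sketch below wires pipeline stages together through topics. It is a minimal, synchronous in-memory stand-in and not the platform's actual code: `MediaEventBus`, `MediaPipelineSketch`, the topic names, and the stage order are all hypothetical, and a real deployment would publish to Kafka or RabbitMQ instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a broker such as Kafka or RabbitMQ.
class MediaEventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // Register a handler that runs whenever an event is published to the topic.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // Deliver the video id to every handler subscribed to the topic.
    void publish(String topic, String videoId) {
        for (Consumer<String> handler : subscribers.getOrDefault(topic, List.of())) {
            handler.accept(videoId);
        }
    }
}

class MediaPipelineSketch {
    // Wires the assumed stages together and returns the order in which they ran.
    static List<String> run(String videoId) {
        MediaEventBus bus = new MediaEventBus();
        List<String> log = new ArrayList<>();

        // Each stage does its work, then emits an event that triggers the next stage.
        bus.subscribe("video.uploaded", id -> {
            log.add("transcode:" + id);
            bus.publish("video.transcoded", id);
        });
        bus.subscribe("video.transcoded", id -> {
            log.add("moderate:" + id);
            bus.publish("video.moderated", id);
        });
        bus.subscribe("video.moderated", id -> {
            log.add("transcribe:" + id);
            bus.publish("video.transcribed", id);
        });
        bus.subscribe("video.transcribed", id -> log.add("embed:" + id));

        bus.publish("video.uploaded", videoId);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run("v1"));
    }
}
```

Because each stage only knows the topic it listens to and the topic it emits, a stage can be scaled or replaced without touching the others, which is the property the broker-based design is meant to provide.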
System Architecture Diagrams

Microservices Architecture & Live Ingestion Flow

Adaptive Transcoding Worker Pipeline
Key Features
- Implemented 20+ REST APIs for media upload, processing, streaming, moderation, and content management
- Designed 5+ asynchronous event-driven workflows using Kafka and RabbitMQ for video ingestion and background processing
- Built RTMP live streaming ingestion pipelines with FFmpeg transcoding to adaptive bitrate HLS streams
- Developed AI-powered video chatbot using LangChain4j, RAG architecture, vector embeddings, and Whisper transcript generation
- Integrated cloud-based media storage and content delivery for scalable video distribution
- Implemented adaptive streaming to support smooth playback across multiple devices and network conditions
- Containerized services using Docker to enable independent deployment and horizontal scalability
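Adaptive bitrate HLS relies on a master playlist that lists one variant stream per rendition, so the player can switch between them as network conditions change. The sketch below shows one way to generate such a playlist. The `HlsMasterPlaylist` class, the rendition values, and the `<name>/index.m3u8` path layout are assumptions for illustration; only the `#EXTM3U`, `#EXT-X-VERSION`, and `#EXT-X-STREAM-INF` tags come from the HLS specification.

```java
import java.util.List;

class HlsMasterPlaylist {
    // One rendition in the bitrate ladder; the field values used below are illustrative.
    record Rendition(String name, int width, int height, int bandwidthBps) {}

    // Builds a master playlist with one variant entry per rendition.
    static String build(List<Rendition> ladder) {
        StringBuilder sb = new StringBuilder("#EXTM3U\n#EXT-X-VERSION:3\n");
        for (Rendition r : ladder) {
            sb.append("#EXT-X-STREAM-INF:BANDWIDTH=").append(r.bandwidthBps())
              .append(",RESOLUTION=").append(r.width()).append('x').append(r.height())
              .append('\n')
              // Assumed layout: each rendition's media playlist lives in its own directory.
              .append(r.name()).append("/index.m3u8\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Rendition> ladder = List.of(
            new Rendition("360p", 640, 360, 800_000),
            new Rendition("720p", 1280, 720, 2_800_000),
            new Rendition("1080p", 1920, 1080, 5_000_000));
        System.out.print(build(ladder));
    }
}
```

In a pipeline like the one described, the transcoding worker would write each rendition's segments and media playlist first, then publish the master playlist once every rendition is available.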
The Engineering Challenge
The central challenge was coordinating multiple asynchronous media workflows while maintaining low processing latency and system reliability. This was addressed through an event-driven architecture using Kafka and RabbitMQ, which separates ingestion, transcoding, moderation, transcript generation, embedding creation, and media delivery into independently scalable services. Building the video RAG pipeline additionally required efficient transcript generation, embedding storage, and retrieval mechanisms to provide accurate, context-aware responses from video content.
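The retrieval step of such a RAG pipeline can be sketched as a nearest-neighbour search over transcript chunk embeddings. The example below is a minimal plain-Java version that assumes cosine similarity and an in-memory list. `TranscriptRetriever` and its methods are hypothetical names rather than LangChain4j APIs, and the embeddings are assumed to be computed elsewhere (for instance, from Whisper transcripts passed through an embedding model).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TranscriptRetriever {
    // A transcript chunk paired with its precomputed embedding vector.
    record Chunk(String text, double[] embedding) {}

    private final List<Chunk> chunks = new ArrayList<>();

    void add(String text, double[] embedding) {
        chunks.add(new Chunk(text, embedding));
    }

    // Cosine similarity between two vectors of equal length; returns 0 for zero vectors.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Embedding dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the text of the k chunks most similar to the query embedding.
    List<String> topK(double[] query, int k) {
        return chunks.stream()
            .sorted(Comparator.comparingDouble((Chunk c) -> -cosine(query, c.embedding())))
            .limit(k)
            .map(Chunk::text)
            .toList();
    }
}
```

The returned chunks would then be placed in the prompt as context for the language model. A linear scan like this is only reasonable for small collections; a dedicated vector store would normally take over at scale.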