Profile picture of Deepak Research Blogs

Research Interests

Reinforcement Learning

Multi-agent Systems & AI Agents.

Vision & Speech AI

Multimodal Emotion & Affective Computing.

LLMs & NLP

Alignment, Reasoning, & Safety.

I am a Machine Learning Engineer and Researcher focused on the intersection of Multimodal AI, Computer Vision, and Trustworthy AI Agents.

Currently, I work on enhancing LLM collaboration and reasoning capabilities in Medical AI, with a particular interest in speech and pattern recognition to help machines understand human affect. I have contributed to several research problems in these domains. Previously, I worked as a Senior Software Engineer at Infosys, where I focused on building complex AI agents for enterprise applications.

I hold an Honors Master's in Computer Science from the University of Wollongong. My research there explored lexical complexity and natural language understanding, and achieved top performance on NAACL shared tasks. My goal is to build AI that doesn't just process data, but truly understands the nuances of human communication.

Projects & Research

Speech Emotion Recognition project visual

Multimodal Speech Emotion Recognition (SER)

Summary: Developed a robust affect detection system that fuses wav2vec 2.0 acoustic features with textual sentiment embeddings. Optimized for low-latency inference to enable real-time emotional feedback in AI tutors.

Code
Speech AI Transformers
Emotion-Aware Voice Assistant project visual

End-to-End Emotion-Aware Voice Assistants

Summary: Built an integrated pipeline that modifies LLM response style based on the user's detected emotional state. The system uses a gated fusion mechanism to adjust 'empathy' levels in generated dialogue.
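A minimal sketch of what a gated fusion step can look like, assuming the emotion features and the text features have already been projected to the same dimension. The function name `gated_fusion`, the weight layout, and the toy vectors are hypothetical illustrations and are not taken from the project code. The gate is a per-dimension sigmoid over the concatenated inputs, and it decides how much of the emotion signal versus the text signal passes through.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    emotion_vec: Sequence[float],
    text_vec: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    gate_bias: Sequence[float],
) -> List[float]:
    """Blend two equal-length feature vectors with a learned gate.

    For each output dimension i:
        gate_i  = sigmoid(gate_weights[i] . [emotion; text] + gate_bias[i])
        fused_i = gate_i * emotion_i + (1 - gate_i) * text_i

    gate_weights has one row per output dimension, each row as long as
    the concatenated input. In a trained model these would be learned;
    here they are plain numbers supplied by the caller (an assumption).
    """
    if len(emotion_vec) != len(text_vec):
        raise ValueError("emotion_vec and text_vec must have the same length")

    concat = list(emotion_vec) + list(text_vec)
    fused = []
    for i, (e, t) in enumerate(zip(emotion_vec, text_vec)):
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(z)
        fused.append(g * e + (1.0 - g) * t)
    return fused


# Toy example: zero weights and zero bias give a gate of 0.5,
# so the result is the plain average of the two inputs.
emotion = [1.0, 0.0]
text = [0.0, 1.0]
zero_weights = [[0.0] * 4, [0.0] * 4]
balanced = gated_fusion(emotion, text, zero_weights, [0.0, 0.0])
```

A large positive bias pushes the gate toward 1, so the fused vector follows the emotion features almost entirely. This is one simple way an 'empathy' level could be raised or lowered.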

Demo
HCI NLP
Facial Recognition Project

Advanced Facial Recognition & Emotion Detection

Summary: A high-performance system for real-time facial recognition and nuanced emotion detection. Includes a model architecture designed for resource-constrained AR/VR environments.

Code
Computer Vision
3D Image Models & Building LLMs project visual

3D Image Models & Building LLMs with Object Detection

Summary: This research investigates the intersection of 3D vision and large language models (LLMs). The project includes a series of Neural Radiance Fields (NeRF) experiments, a novel method for object-aware image retrieval, and detailed integration notes for enhancing Retrieval-Augmented Generation (RAG) pipelines with structured visual data.

PDF Code
AI Machine Learning Computer Vision
Neural Radiance Fields (NeRF) project visual

Neural Radiance Fields — PyTorch

Summary: Implementation details, tips for fast training, and example object detection models.

AI Machine Learning Computer Vision