GenAI and LLMs for Developers

Program: Generative AI and LLMs for Developers
Provider: NVIDIA
Industry: GenAI
Timeline: In progress (completed courses listed below)

Overview

After pitching Warp Speed and knowing I would gain access to an 8-GPU inference server, I decided to take a deep dive into using NVIDIA hardware and inference endpoints for GenAI applications.

This training provided hands-on experience with transformer architectures, retrieval-augmented generation (RAG), AI agents, and sizing LLM inference systems, equipping me with the skills to design, optimize, and deploy AI-powered solutions at scale.

Program Description

NVIDIA’s Generative AI and LLM training is a comprehensive program covering the latest advancements in deep learning, large language models (LLMs), and AI-driven applications.

This multi-course program focused on developing, optimizing, and deploying generative AI and LLM-based systems, with an emphasis on practical deployment.

I also gained expertise in NVIDIA NIM™, deep learning with PyTorch, and building RAG agents and ML pipelines using LangChain.

Through hands-on projects, I applied these skills to develop AI-native applications, optimize inference pipelines, and integrate LLMs into real-world AI solutions.


Projects

LLM-Driven RAG System for Research Papers
• Built a retrieval-augmented generation (RAG) agent to enable AI-powered document analysis.
• Designed an embedding-based search system for semantic similarity and guardrailing.
• Deployed a vector-based knowledge retrieval system, enabling efficient and context-aware responses.
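
As an illustration of the embedding-search idea, here is a minimal sketch using an open sentence-embedding model (the labs used NVIDIA endpoints; the model name and passages below are stand-ins):

    # Minimal embedding-search sketch; any sentence-embedding model shows the idea.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # example open model

    passages = [
        "Attention mechanisms weigh token relevance across a sequence.",
        "Vector databases store embeddings for nearest-neighbor search.",
        "Guardrails deflect queries that fall outside the supported domain.",
    ]
    doc_vecs = model.encode(passages, normalize_embeddings=True)

    def search(query, k=2):
        q = model.encode([query], normalize_embeddings=True)[0]
        scores = doc_vecs @ q  # cosine similarity, since vectors are normalized
        return [(float(scores[i]), passages[i]) for i in np.argsort(-scores)[:k]]

    print(search("How do I find similar documents?"))

The same similarity scores double as a guardrail: a query whose best match falls below a threshold can be deflected as out of scope.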

Low-Latency AI Chatbot using NVIDIA NIM™
• Deployed an AI chatbot using NVIDIA NIM™ microservices for real-time AI model inference.
• Integrated LLM inference pipelines, optimizing response times for high-performance AI applications.
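
NIM microservices expose an OpenAI-compatible API, so a deployed container can be queried with the standard openai client. A minimal sketch, assuming a local NIM instance on port 8000 (the URL and model id are examples, not this project's exact configuration):

    from openai import OpenAI

    # Point the standard client at the local NIM endpoint; no real key is needed.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

    stream = client.chat.completions.create(
        model="meta/llama-3.1-8b-instruct",  # example NIM model id
        messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
        stream=True,  # stream tokens to cut perceived latency
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

Streaming keeps time-to-first-token low for the user even when the full generation takes longer.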

Building RAG Agents with LLMs
• Composed an LLM system that interacts predictably with users by combining internal and external reasoning components.
• Designed a dialog management and document reasoning system that maintains state and coerces information into structured formats.
• Leveraged embedding models for efficient similarity queries, content retrieval, and dialog guardrailing.
• Implemented, modularized, and evaluated a RAG agent that answers questions about the research papers in its dataset without any fine-tuning.
• Explored LLM inference interfaces and microservices.
• Designed LLM pipelines using LangChain, Gradio, and LangServe (a minimal pipeline sketch follows this list).
• Managed dialog states and integrated knowledge extraction.
• Applied strategies for working with long-form documents.
• Used embeddings for semantic similarity and guardrailing.
• Implemented vector stores for efficient document retrieval.
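
A minimal sketch of such a pipeline, assuming the langchain-nvidia-ai-endpoints and faiss-cpu packages (the file path and model ids are illustrative, not the exact course configuration):

    from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
    from langchain_community.vectorstores import FAISS
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough

    # Assumes NVIDIA_API_KEY is set for the hosted endpoints.
    # Chunk long-form documents so retrieval returns focused passages.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(open("paper.txt").read())  # placeholder document

    vectorstore = FAISS.from_texts(chunks, NVIDIAEmbeddings(model="nvidia/nv-embed-v1"))
    retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

    prompt = ChatPromptTemplate.from_template(
        "Answer using only this context:\n{context}\n\nQuestion: {question}"
    )
    llm = ChatNVIDIA(model="meta/llama-3.1-8b-instruct")

    chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt | llm | StrOutputParser()
    )
    print(chain.invoke("What problem does the paper address?"))

Chunk size and overlap trade retrieval precision against context length; treating them as tunable parameters was a recurring theme in the course.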

Focus Areas

Deep Learning Fundamentals – Training and fine-tuning deep learning models using PyTorch and transfer-learning techniques (a frozen-backbone sketch follows this list).
Transformer-Based LLMs – Understanding text generation, text classification, summarization, named-entity recognition (NER), question answering, and NLP model optimization.
Retrieval-Augmented Generation (RAG) – Designing AI-powered knowledge retrieval systems for enhanced chatbot and search capabilities. We explored preparing datasets, splitting data into chunks, loading vector databases, and retrieval strategies.
Synthetic Data Generation – Worked through an end-to-end workflow for generating synthetic data with Transformers, including data preprocessing, model pre-training, fine-tuning, inference, and evaluation.
Sizing LLM Inference Systems – Learned about model optimization and deployment, covering prefill and decode phases, latency and throughput trade-offs, tensor parallelism, and in-flight batching. We also covered benchmarking, scaling strategies, and optimizing total cost of ownership (TCO) for on-prem and cloud deployments (a back-of-envelope sizing sketch also follows this list).
AI Model Deployment & Optimization – Scaling multi-GPU AI workloads using NVIDIA NIM™ microservices and TensorRT-LLM.
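
For the transfer-learning workflow, here is a minimal PyTorch sketch of the frozen-backbone pattern (torchvision's ResNet-18 and the dummy batch stand in for whatever backbone and data a given lab used):

    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in model.parameters():
        p.requires_grad = False  # freeze the pretrained backbone
    model.fc = nn.Linear(model.fc.in_features, 10)  # new 10-class head

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(8, 3, 224, 224)  # dummy image batch
    y = torch.randint(0, 10, (8,))   # dummy labels
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()  # gradients flow only into the new head
    optimizer.step()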
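
And for inference sizing, a back-of-envelope sketch of the first-order approximations the course covered: decode throughput is roughly bounded by aggregate memory bandwidth, and prefill time by compute. The hardware numbers are assumptions for an H100-class 8-GPU server, not measured results:

    def decode_tokens_per_sec(params_b, bytes_per_param, hbm_tb_s, n_gpus):
        # Decode re-reads all weights per generated token: bandwidth-bound ceiling.
        weight_bytes = params_b * 1e9 * bytes_per_param
        return (hbm_tb_s * 1e12 * n_gpus) / weight_bytes

    def prefill_seconds(params_b, prompt_tokens, gpu_tflops, n_gpus, mfu=0.5):
        # ~2 FLOPs per parameter per prompt token, discounted by utilization (MFU).
        flops = 2 * (params_b * 1e9) * prompt_tokens
        return flops / (gpu_tflops * 1e12 * n_gpus * mfu)

    # Example: a 70B-parameter model in FP8 on 8 GPUs at ~3.35 TB/s HBM each.
    print(f"{decode_tokens_per_sec(70, 1, 3.35, 8):.0f} tok/s decode ceiling")
    print(f"{prefill_seconds(70, 4096, 1979, 8):.2f} s prefill for a 4K prompt")

These two ceilings frame the latency/throughput trade-off: batching more requests raises aggregate throughput while lengthening each request's decode time.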


