Mandatory Prerequisites
- Participants should have basic knowledge of Python, containerized environments, and experience working in Jupyter/Colab or similar notebook workflows.
- Languages/tools: Python
- Frameworks: PyTorch, TensorRT-LLM, Triton Inference Server™, SGLang, vLLM
- Get ready with build.nvidia.com.
- Bring a laptop with internet access (ideal minimum: 5 Mbps download / 1–2 Mbps upload) to ensure consistent access to the lab.
Get Started With Your AI Inference Journey
Discover how Tech Mahindra is collaborating with NVIDIA to be at the forefront of generative AI innovation.
From developing therapeutic molecules to building India’s sovereign LLM in Hindi and 37+ dialects, Tech Mahindra is using NVIDIA’s hardware and software stack to build the Nemotron-4-Mini-Hindi-4B model. Tech Mahindra’s work on Indus 2.0, which extends to Bahasa Indonesia, is state of the art and built on NVIDIA AI inference software.
Speakers
Bharat Giddwani
Senior Solutions Architect
NVIDIA
Bharat is a seasoned senior solutions architect specializing in enterprise-scale generative AI solutions, with deep expertise in large language models (LLMs), multimodal AI, and retrieval-augmented generation (RAG) optimizations. His proficiency lies in designing and implementing robust, secure AI architectures that deliver measurable business impact. His technical prowess extends to advanced LLM techniques, including inference and training optimization. His solutions emphasize production readiness, incorporating robust monitoring and security controls, enabling organizations such as cloud providers, ISVs, and enterprises to successfully navigate their AI transformation journey.
Agenda
NSUT, Dwarka, Delhi | Anish Mukherjee
NSUT, Dwarka, Delhi | Prof. Anand Srivastava, Vice Chancellor
NVIDIA AI Blueprint: Bring Your LLM to NIM (Hands-On) | NSUT, Dwarka, Delhi | Anish Mukherjee and Bharat Giddwani
(Hands-On) | NSUT, Dwarka, Delhi | Bharat Giddwani and Anish Mukherjee
NSUT, Dwarka, Delhi | Anish Mukherjee and Bharat Giddwani
Event Details
NVIDIA Hands-On Training on Inference
Friday, October 31, 2025

Venue
Netaji Subhas University of Technology, Azad Hind Fauj Marg
Dwarka
Delhi DL 110078
India
Additional Resources
NVIDIA AI Enterprise Solutions
Explore the most advanced AI, ready for enterprise, and the latest breakthroughs made possible with NVIDIA AI.
NVIDIA AI Inference Solutions
Greater AI performance, compounded returns. Think SMART. Think NVIDIA Inference.
NVIDIA Inference Performance
Inference can be deployed in many ways, depending on the use case. Offline processing of data is best done at larger batch sizes, which deliver optimal GPU utilization and throughput. Interactive use cases, by contrast, call for lower latency to deliver great user experiences.
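The throughput-versus-latency trade-off can be made concrete with a little arithmetic. The sketch below uses a deliberately simple cost model with made-up numbers; real figures depend on the GPU, the model, and the serving stack.

```python
# Illustrative model of the batch-size trade-off in LLM serving.
# All numbers here are invented for illustration, not measurements.

def serving_stats(batch_size: int,
                  base_step_ms: float = 20.0,
                  per_seq_step_ms: float = 1.0,
                  tokens_per_request: int = 100):
    """Return (throughput in tokens/s, per-request latency in s)
    under a simple linear cost model: each decode step pays a fixed
    overhead plus a small per-sequence increment."""
    step_ms = base_step_ms + per_seq_step_ms * batch_size
    latency_s = tokens_per_request * step_ms / 1000.0
    throughput = batch_size * tokens_per_request / latency_s
    return throughput, latency_s

for bs in (1, 8, 64):
    tput, lat = serving_stats(bs)
    print(f"batch={bs:3d}  throughput={tput:7.0f} tok/s  latency={lat:5.2f} s")
```

With these toy numbers, batch 64 delivers roughly 16× the throughput of batch 1 at 4× the per-request latency — the reason offline processing favors large batches while interactive serving favors small ones.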
NVIDIA TensorRT
NVIDIA TensorRT is an ecosystem of tools for developers to achieve high-performance deep learning inference.
NVIDIA TensorRT-LLM
NVIDIA TensorRT-LLM is an open-source library built to deliver high-performance, real-time inference optimization for large language models (LLMs) on NVIDIA GPUs—whether on a desktop or in a data center.
NVIDIA Developer Program
Access free tools, extensive learning opportunities, and expert help with the NVIDIA Developer Program.
NVIDIA NIM Microservices
NVIDIA NIM™ is a set of microservices for deploying AI models. Tap into the latest AI foundation models—like Stable Diffusion, ESMFold, and Llama 3—with downloadable NIM microservices for your application deployment.
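Hosted NIM endpoints on build.nvidia.com expose an OpenAI-compatible chat-completions API. The sketch below shows the shape of such a request; the model name is one example from the catalog, the `NVIDIA_API_KEY` environment variable is an assumption, and the call is only attempted when a key is configured.

```python
import json
import os
import urllib.request

# Hosted NIM endpoints speak the OpenAI chat-completions protocol.
# The model name below is one example; substitute any model you use.
URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "meta/llama-3.1-8b-instruct"

def build_request(prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
        "temperature": 0.2,
    }

payload = build_request("What is batching in LLM inference?")

api_key = os.environ.get("NVIDIA_API_KEY")
if api_key:  # only reach the network when a key is configured
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
else:
    print("NVIDIA_API_KEY not set; payload only:")
    print(json.dumps(payload, indent=2))
```

Because the protocol is OpenAI-compatible, the same payload works against a locally deployed NIM container by pointing `URL` at that container instead of the hosted endpoint.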
NVIDIA Run:ai Tech Blog
Cut model deployment costs while keeping performance with GPU memory swap.
Large Language Models
Large language models (LLMs) are deep learning algorithms that can recognize, summarize, translate, predict, and generate content using very large datasets.
NGC Containers: Phind-CodeLlama-34B-v2-Instruct, Llama-3.1-Nemotron-70B-Instruct, and Llama-3-Taiwan-70B-Instruct
All you need to build AI—GPU-optimized containers, pretrained models, SDKs, and Helm charts—unified in one catalog for cloud, data center, or edge.
NVOD—LLM Inference Benchmarking
Learn how to benchmark end-to-end LLM inference systems and choose the right path for your AI initiatives by understanding the key metrics in large language model (LLM) inference sizing. Watch the video.

