Implementing AI Data Center Networks Workshop - ILO

Classroom
3 days
$3,000 USD
Includes eBook

Implementing Artificial Intelligence Data Center Networks Workshop
Instructor-led Online
Monday, June 23, 2025 10:30 AM EST

Event Information
Starting: Monday, June 23, 2025 10:30 AM EST
Ending: Wednesday, June 25, 2025 6:30 PM EST
Region: Americas (AMER)
Language: English
Subject: Implementing Artificial Intelligence Data Center Networks Workshop
Location: Instructor-led Online
Facilitator: Juniper Networks
Registration Limit: 16
Number Currently Registered: 3
Registration Deadline: Jun 19, 2025
Course Details
This three-day, intermediate-level workshop gives students the background they need to build and work with an artificial intelligence data center (AI data center) network using Juniper Apstra™. It explains the purpose of the four networks described in the Juniper Validated Design (JVD) titled AI Data Center Network with Juniper Apstra, NVIDIA GPUs, and WEKA Storage: the out-of-band (OOB), front-end, back-end graphics processing unit (GPU), and back-end storage networks. Students will learn to train AI models using the PyTorch framework on:
  • a single server with one GPU;
  • a single server with multiple GPUs (covering NVIDIA’s NVSwitch and AMD’s Infinity Fabric technology); and
  • multiple servers, each with multiple GPUs.
Students will gain familiarity with network interface cards (NICs) for AI (NVIDIA ConnectX-7 and Broadcom P2200G), NVIDIA GPUs (A100, H100, H200, B200), AMD GPUs (MI300X), and compute platform architectures (NVIDIA DGX and AMD MI300X Platform). The workshop includes a deep dive into the JVD for the AI data center (primarily NVIDIA-focused). For the back-end GPU network, students will learn how the NVIDIA Collective Communication Library (NCCL), remote direct memory access (RDMA) over Converged Ethernet version 2 (RoCEv2), and a rail-optimized network design together provide an optimal communication path for NCCL's collective operations. For both back-end networks, students will learn how to use data center quantized congestion notification (DCQCN) and dynamic load balancing (DLB) to ensure lossless data transfer over an Ethernet-based network. Students will also learn how to use Apstra to deploy the AI data center (DC) networks and how to orchestrate the training cluster using Slurm.
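As a rough illustration of how the three training setups relate to a rail-optimized back-end GPU network, the short Python sketch below lays out worker ranks for one server with one GPU, one server with eight GPUs, and two servers with eight GPUs each. It is not taken from the course material or the JVD. The function names `rank_layout` and `ranks_by_rail` are hypothetical, the global rank formula follows the common launcher convention (server index times GPUs per server plus local rank), and the mapping of GPU n on every server to rail n is a simplifying assumption about how a rail-optimized design is usually cabled.

```python
from collections import defaultdict


def rank_layout(num_servers, gpus_per_server):
    """Return one (global_rank, server, local_rank, rail) tuple per GPU."""
    layout = []
    for server in range(num_servers):
        for local_rank in range(gpus_per_server):
            # Common launcher convention: ranks are numbered server by server.
            global_rank = server * gpus_per_server + local_rank
            # Assumption: GPU n on every server attaches to the same leaf (rail n).
            rail = local_rank
            layout.append((global_rank, server, local_rank, rail))
    return layout


def ranks_by_rail(layout):
    """Group global ranks by the rail (leaf switch) their NIC attaches to."""
    rails = defaultdict(list)
    for global_rank, _server, _local_rank, rail in layout:
        rails[rail].append(global_rank)
    return dict(rails)


if __name__ == "__main__":
    # The three setups named in the course description (sizes are examples).
    for servers, gpus in [(1, 1), (1, 8), (2, 8)]:
        layout = rank_layout(servers, gpus)
        print(f"{servers} server(s) x {gpus} GPU(s): world size {len(layout)}")
        print("  ranks per rail:", ranks_by_rail(layout))
```

Under these assumptions, ranks that share a local GPU index on different servers land on the same rail, so traffic between them crosses a single leaf switch. Traffic between different GPU indexes on the same server would normally stay inside the server on NVSwitch or Infinity Fabric.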

Through lectures only, students will gain knowledge of deploying and training AI models in a data center based on the JVD titled AI Data Center Network with Juniper Apstra, NVIDIA GPUs, and WEKA Storage.
