FlexPod Datacenter for AI/ML with Cisco UCS 480 ML for Deep Learning
Deployment Guide for FlexPod Datacenter for AI/ML with Cisco UCS 480 ML for Deep Learning
Published: January 16, 2020

Table of Contents

Executive Summary
Solution Overview
  Introduction
  Audience
  What’s New in this Release?
Solution Design
  Architecture
  Physical Topology
  Base Infrastructure
  Hardware and Software Revisions
  Required VLANs
  Physical Infrastructure
  FlexPod Cabling
Network Switch Configuration
  vGPU-only Deployment in Existing VMware Environment
  Enable Features
    Cisco Nexus A and Cisco Nexus B
  Global Configurations
    Cisco Nexus A and Cisco Nexus B
  Create VLANs
    Cisco Nexus A and Cisco Nexus B
  Configure Virtual Port-Channel Parameters
    Cisco Nexus A
    Cisco Nexus B
  Configure Virtual Port-Channels
    Cisco UCS 6454 Fabric Interconnect to Nexus 9336C-FX2 Connectivity
    NetApp A800 to Nexus 9336C-FX2 Connectivity
Storage Configuration
  Remove Ports from Default Broadcast Domain
  Disable Flow Control on 100GbE Ports
  Disable Auto-Negotiate on 100GbE Ports
  Enable Cisco Discovery Protocol
  Enable Link-layer Discovery Protocol on all Ethernet Ports
  Create Management Broadcast Domain
  Create NFS Broadcast Domain
  Create Interface Groups
  Change MTU on Interface Groups
  Create VLANs
  Configure Network Time Protocol
  Configure SNMP
  Configure SNMPv1 Access
  Create SVM
  Create Load-Sharing Mirrors of SVM Root Volume
  Create iSCSI Service
  Configure HTTPS Access
  Configure NFSv3
  Create ONTAP FlexGroup Volume
  Create FlexVol Volumes for Bare-Metal Hosts
  Create FlexVol Volumes for VMware ESXi Hosts
  Create Bare-Metal Server Boot LUNs
  Create ESXi Boot LUNs
  Modify Volume Efficiency
  Schedule Deduplication
  Create NFS LIFs
  Create iSCSI LIFs
  Add AI-ML SVM Administrator
Cisco UCS Configuration for VMware with vGPU
  Cisco UCS Base Configuration
  Create NFS VLAN
  Add VLAN to vNIC Template
VMware Setup and Configuration for vGPU
  Obtaining and Installing NVIDIA vGPU Software
  NVIDIA Licensing
  Download NVIDIA vGPU Software
  Setup NVIDIA vGPU Software License Server
  Register License Server to NVIDIA Software Licensing Center
  Install NVIDIA vGPU Manager in ESXi
  Set the Host Graphics to SharedPassthru
  (Optional) Enabling vMotion with vGPU
  Add a Port-Group to access AI/ML NFS Share
  Red Hat Enterprise Linux VM Setup
  VM Hardware Setup
  Download RHEL 7.6 DVD ISO
  Operating System Installation
  Network and Hostname Setup
  RHEL VM – Base Configuration
  Log into RHEL Host using SSH
  Setup Subscription Manager
  Enable Repositories
  Install Net-Tools and Verify MTU
  Install FTP
  Enable EPEL Repository
  Install NFS Utilities and Mount NFS Share
  Setup NTP
  Disable Firewall
  Disable IPv6 (Optional)
  Install Kernel Headers
  Install gcc
  Install wget
  Install DKMS
  NVIDIA and CUDA Drivers Installation
  Add vGPU to the VM
  Install NVIDIA Driver
  Install CUDA Toolkit
  Verify the NVIDIA and CUDA Installation
  Verify CUDA Driver
  Verify NVIDIA Driver
  Setup NVIDIA vGPU Licensing on the VM
Cisco UCS Configuration for Bare Metal Workload
  Cisco UCS Base Configuration
  Cisco UCS C220 M5 Connectivity
    Enable Server Ports
  Cisco UCS C240 M5 Connectivity
    Enable Server Ports
  Cisco UCS C480 ML M5 Connectivity
    Enable Server Ports
  Create an IQN Pool for iSCSI Boot
  Create iSCSI Boot IP Address Pools
  Create MAC Address Pools
  Create UUID Suffix Pool
  Create Server Pool
  Create VLANs
  Modify Default Host Firmware Package
  Set Jumbo Frames in Cisco UCS Fabric
  Create Local Disk Configuration Policy
  Create Network Control Policy to Enable Link Layer Discovery Protocol (LLDP)
  Create Power Control Policy
  Create Server BIOS Policy
  Update the Default Maintenance Policy
  Create vNIC Templates
    Create Management vNIC Template
    Create iSCSI Boot vNIC Templates
    Create NFS vNIC Template
    (Optional) Create Traffic vNIC Template
  Create LAN Connectivity Policy for iSCSI Boot
  Create iSCSI Boot Policies
  Create Service Profile Template
    Configure Storage Provisioning
    Configure Networking Options
    Configure SAN Connectivity Options
    Configure Zoning Options
    Configure vNIC/HBA Placement
    Configure vMedia Policy
    Configure Server Boot Order
    Configure Maintenance Policy
    Configure Server Assignment
    Configure Operational Policies
  Create Service Profiles
Storage Configuration – Boot LUNs
  ONTAP Boot Storage Setup
    Create igroups
    Map Boot LUNs to igroups
Bare Metal Server Setup and Configuration
  Red Hat Enterprise Linux (RHEL) Bare Metal Installation
    Download RHEL 7.6 DVD ISO
    Log into Cisco UCS Manager
    Operating System Installation

