VCF and VVF 9 New GPU Management Scripting with PowerShell and Python [CODEQT1767LV]

VCF and VVF 9 introduce new methods for managing GPUs, both in the GUI and through code such as PowerShell and Python. I presented on this at VMware Explore in VMware {Code} session CODEQT1767LV. Here are the details on what’s new with GPU management in vSphere 9 from my session.

The biggest GPU addition in VCF and VVF 9 is DirectPath Profiles, or DPPs, which had vGPU profiles added to them in 9.0. They provide a view of how the GPUs in your environment are being consumed. For example, let’s say I have 10 NVIDIA GPUs in a cluster. At the cluster level I can see the combined consumption of vGPU profiles across everything in the cluster, or I can drill down and look at consumption at the host level.

Below is a screen capture of what this looks like in vCenter.

In this image from my home lab you can see that DPPs are located under the monitoring section for both clusters and hosts. One GPU in the environment is in use, consuming an NVIDIA L4-3B profile. We can see how that impacts the other profiles, as well as how many of each defined profile remain available.

Justin Murray wrote a fantastic blog post on DPPs. To get a full understanding of what one is and how it works, I recommend a quick read of his post.

CODEQT1767LV

Now let’s dig into why you are probably here and what I presented on: interacting with DPPs. Below is a link to the slide deck I presented at Explore, and when the video posts, I will link that here as well.

Again the session will be linked once it is posted.

The MOB

With 9.0 the Managed Object Browser or MOB is disabled by default. This may not sound like a big deal to those who don’t do much coding for virtual environments. But for those of us who relish this, you’ll find that you spend a lot of time in the MOB looking up structures, functions, and paths.

Let’s quickly talk about enabling it. Broadcom KB 401669 provides details on how to enable the MOB. It is a rather straightforward task of modifying the vpxd.cfg file on the vCenter where you want to enable it.

There are a couple of things that need to be done first, though. You’ll need to enable Bash shell access and (optionally) SSH access to the vCenter so you can actually get to the vpxd.cfg file. Then open it in your favorite editor, in my case vi, and change the value of <enableDebugBrowse> to true. My instance didn’t have <enableDebugBrowse> in vpxd.cfg, so I had to add it. I added it right under <hostnameUrl>, and it appears to work just fine. Here’s a screenshot of what that looks like.

Once the value has been set to true or added, save the vpxd.cfg file and restart the vpxd service. Then you’ll have MOB access.
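The edit described above (set <enableDebugBrowse> to true, or add it under <hostnameUrl> if it is missing) can be sketched in Python. This is only an illustration under stated assumptions: the sample XML is a hypothetical, trimmed-down stand-in for vpxd.cfg, and the function name enable_debug_browse is my own. Back up the real file before touching it, and note that ElementTree drops XML comments when it rewrites a document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for vpxd.cfg; a real file has many more elements.
SAMPLE_CFG = """<config>
  <vpxd>
    <hostnameUrl>vcenter.example.local</hostnameUrl>
  </vpxd>
</config>"""


def enable_debug_browse(xml_text: str) -> str:
    """Return xml_text with <enableDebugBrowse> set to true, adding it if missing."""
    root = ET.fromstring(xml_text)

    # Case 1: the element already exists somewhere in the tree, so just flip it.
    existing = root.find(".//enableDebugBrowse")
    if existing is not None:
        existing.text = "true"
        return ET.tostring(root, encoding="unicode")

    # Case 2: the element is missing, so add it right after <hostnameUrl>.
    for parent in root.iter():
        for index, child in enumerate(list(parent)):
            if child.tag == "hostnameUrl":
                new_elem = ET.Element("enableDebugBrowse")
                new_elem.text = "true"
                new_elem.tail = child.tail  # reuse the sibling's trailing whitespace
                parent.insert(index + 1, new_elem)
                return ET.tostring(root, encoding="unicode")

    raise ValueError("no <hostnameUrl> element found to anchor the new setting")


print(enable_debug_browse(SAMPLE_CFG))
```

The function only transforms text, so you can try it against a copy of the file before deciding whether to script the change or keep doing it by hand in vi.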

To find DirectPath Profiles in the MOB, start at the [home] location, go to the content section, and select directPathProfileManager. You can then see all the functions exposed for it. It should look like the screenshot below.

VCF PowerCLI Scripting

As you can see above, there are five functions for DirectPath Profiles. One creates, one modifies, and one destroys; the other two are the interesting ones. I will focus on those two, the Capacity and List functions, because they are the most valuable for scripting and automation. The other three matter less for development purposes, as they tend to be one and done.

DirectPathProfileManagerList()

Let’s start simple with getting a list of DPPs. The code to do this is straightforward. I establish a connection to my vSphere environment and use the following code:

#------------- List all DPP specs -------------
# Assumes a connection to vCenter has already been established
$DPP_MGR = Get-View -Id 'DirectPathProfileManager-DirectPathProfileMgr'
# A filter spec with nothing set returns every DPP
$Spec = New-Object VMware.Vim.DirectPathProfileManagerFilterSpec
$DPP_MGR.DirectPathProfileManagerList($Spec)

This code block has three statements. First we create a view of the DirectPathProfileManager and assign it to a variable. Next we create a new DirectPathProfileManagerFilterSpec, or specification, and assign it to the $Spec variable. This is basically a filter object that allows us to specify what we are looking for. Lastly we call the DirectPathProfileManagerList function of our view and pass it the spec we created in the previous line. When we run this we get the following output:

Id           : QML5Zhf++WjN2WcyOgEeNyzTdatxaAHj
Name         : NVIDIA L4-4Q profile
Description  : DPP for L4-4Q profile
VendorName   : Nvidia
DeviceConfig : VMware.Vim.DirectPathProfileManagerVmiopDirectPathConfig


Id           : LZYJZ1jfbn9zSli2xwatXrvO7FICWh+z
Name         : NEW --> NVIDIA L4-3B profile
Description  : NEW vGPU Profile in 19.0
               DPP for L4-3B profile
VendorName   : Nvidia
DeviceConfig : VMware.Vim.DirectPathProfileManagerVmiopDirectPathConfig


Id           : 3QuKKcBqBXRj1oF/fAwUAcjUXX5qWpio
Name         : L4-1Q profile
Description  :
VendorName   : Nvidia
DeviceConfig : VMware.Vim.DirectPathProfileManagerVmiopDirectPathConfig


Id           : ibQrFoQHGzIfYVVhQt9YZLTLyi2FYGnU
Name         : NVIDIA L4-2Q profile
Description  : DPP for L4-2Q profile
VendorName   : Nvidia
DeviceConfig : VMware.Vim.DirectPathProfileManagerVmiopDirectPathConfig


Id           : ucbsCQSDIl3CHG5b2WGDvfADuCUorM8t
Name         : NVIDIA L4-6Q profile
Description  : DPP for L4-6Q profile
VendorName   : Nvidia
DeviceConfig : VMware.Vim.DirectPathProfileManagerVmiopDirectPathConfig

It’s all the DPPs defined in the environment. This can be helpful when you need to know names or objects in the environment. You can iterate through the list and get details for each DPP.

Wouldn’t it be great if we could filter the list, find just the vGPU profile we are interested in? We can do that! Remember that $Spec variable we created in the previous PowerShell script? We’re going to use it now to filter our list to just a single DPP.

#------------- List specific DPP specs -------------
$DPP_MGR = Get-View -Id 'DirectPathProfileManager-DirectPathProfileMgr'
$Spec = New-Object VMware.Vim.DirectPathProfileManagerFilterSpec
$Spec.Names = 'NEW --> NVIDIA L4-3B profile'  #Accepts single items or arrays
$DPP_MGR.DirectPathProfileManagerList($Spec)  

In the above code block you’ll see that we added one new line. We added the $Spec.Names and set the value to a profile name. This allows filtering for only a specific DPP. This will come into play in a moment. The output from this code block looks like the following:

Id           : LZYJZ1jfbn9zSli2xwatXrvO7FICWh+z
Name         : NEW --> NVIDIA L4-3B profile
Description  : NEW vGPU Profile in 19.0
               DPP for L4-3B profile
VendorName   : Nvidia
DeviceConfig : VMware.Vim.DirectPathProfileManagerVmiopDirectPathConfig
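To make the filtering behavior seen above concrete, here is a small Python model of it. It is a sketch of what the two outputs show (no names returns everything, a name returns only the matching DPP), not the server-side implementation. The record layout, the exact-match rule, and the filter_profiles helper are assumptions for illustration.

```python
from typing import Dict, Iterable, List, Optional, Union

# Hypothetical records mirroring the Name and VendorName fields in the output above.
PROFILES = [
    {"Name": "NVIDIA L4-4Q profile", "VendorName": "Nvidia"},
    {"Name": "NEW --> NVIDIA L4-3B profile", "VendorName": "Nvidia"},
    {"Name": "L4-1Q profile", "VendorName": "Nvidia"},
    {"Name": "NVIDIA L4-2Q profile", "VendorName": "Nvidia"},
    {"Name": "NVIDIA L4-6Q profile", "VendorName": "Nvidia"},
]


def filter_profiles(
    profiles: List[Dict[str, str]],
    names: Optional[Union[str, Iterable[str]]] = None,
) -> List[Dict[str, str]]:
    """Return profiles whose Name is in names; no names means no filtering."""
    if names is None:
        return list(profiles)
    if isinstance(names, str):
        names = [names]  # a single name is treated as a one-item list
    wanted = set(names)
    return [p for p in profiles if p["Name"] in wanted]  # assumption: exact match


for profile in filter_profiles(PROFILES, "NEW --> NVIDIA L4-3B profile"):
    print(profile["Name"])
```

Accepting either a single string or a list mirrors the "Accepts single items or arrays" comment in the PowerShell block.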

DirectPathProfileManagerQueryCapacity(Target, Capacity_Query)

What we did was pretty cool, but still not very useful on its own. Let’s step it up and do some capacity queries. This is where the usefulness starts to show. The capacity queries give us the same GPU utilization that we see in the GUI. Let’s start at the cluster level and see what’s happening across all the hosts. This code snippet is a bit longer than the previous one but uses much of what we just saw.

$DPP_MGR = Get-View -Id 'DirectPathProfileManager-DirectPathProfileMgr'
$Spec = New-Object VMware.Vim.DirectPathProfileManagerFilterSpec
$Spec.Names = 'NEW --> NVIDIA L4-3B profile' 
#vSphere Cluster
$Cluster_View = Get-View -ViewType ClusterComputeResource -Filter @{"Name" = "GPU Cluster"}
#DPP Cluster
$DPP_Target_Entity = new-object VMware.Vim.DirectPathProfileManagerTargetCluster
$DPP_Target_Entity.Cluster = $Cluster_View.MoRef
#Target
$DPP_QC = New-Object VMware.Vim.DirectPathProfileManagerCapacityQueryByName
$DPP_QC.Name = $spec.Names
$DPP_MGR.DirectPathProfileManagerQueryCapacity($DPP_Target_Entity, $DPP_QC)

The first three lines should look familiar from the previous code blocks. Next we need to get the cluster object that we want DPP details from. We do this with a Get-View call. Then we create a target object of type DirectPathProfileManagerTargetCluster and set its Cluster value to the MoRef of our cluster.

Using the MoRef of our cluster is important, as the query capacity function operates on a managed object reference to the cluster, not just its name.

Lastly we create our capacity query object, set its Name from the $Spec we created on line two and populated on line three, and pass it along with the target to DirectPathProfileManagerQueryCapacity. This gives us the following output:

Profile           : VMware.Vim.DirectPathProfileInfo
Consumed          : 1
Remaining         : 7
Max               : 8
UnusedReservation : 0

It shows how the specific vGPU profile is being consumed across the cluster. Now we’re getting somewhere! Next, let’s see how to query a specific host. That’s what this next block of code does.

$DPP_MGR = Get-View -Id 'DirectPathProfileManager-DirectPathProfileMgr'
$Spec = New-Object VMware.Vim.DirectPathProfileManagerFilterSpec
$Spec.Names = 'NEW --> NVIDIA L4-3B profile'  
#vSphere Host
$Host_View = Get-View -ViewType HostSystem -Filter @{"Name" = "ESX04.wondernerd.local"}
#DPP Host
$DPP_Target_Entity = new-object VMware.Vim.DirectPathProfileManagerTargetHost
$DPP_Target_Entity.Host = $Host_View.MoRef
#Target
$DPP_QC = New-Object VMware.Vim.DirectPathProfileManagerCapacityQueryByName
$DPP_QC.Name = $spec.Names
$DPP_MGR.DirectPathProfileManagerQueryCapacity($DPP_Target_Entity, $DPP_QC)

If you look at this code block, at first glance it probably appears identical to the previous block. In reality, we made a few minor changes to look at DPPs for a specific host: we changed lines 5, 7, and 8. First we get a view containing our host on line 5. Then we create a DirectPathProfileManagerTargetHost object on line 7 and set its .Host value on line 8. Everything else is the same as at the cluster level. This provides the following output:

Profile           : VMware.Vim.DirectPathProfileInfo
Consumed          : 1
Remaining         : 7
Max               : 8
UnusedReservation : 0

We can see that this host has one L4-3B profile consumed on it and can support 7 more. Now we have a complete set of tools to work with. Now we are going to make this do some cool stuff.
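As a quick illustration of reading these numbers, the sketch below turns a capacity record into a one-line utilization summary. The Capacity class and summarize function are hypothetical names of my own; the field values are copied from the host output above.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    """Hypothetical container for the fields shown in the capacity output."""
    consumed: int
    remaining: int
    maximum: int
    unused_reservation: int = 0


def summarize(cap: Capacity) -> str:
    """Format consumed, maximum, and remaining slots as a short summary."""
    # Guard against a profile that cannot be placed at all (maximum of 0).
    used_pct = 100 * cap.consumed / cap.maximum if cap.maximum else 0.0
    return f"{cap.consumed}/{cap.maximum} in use ({used_pct:.1f}%), {cap.remaining} remaining"


host = Capacity(consumed=1, remaining=7, maximum=8)
print(summarize(host))  # prints: 1/8 in use (12.5%), 7 remaining
```

A summary like this is handy in logs when a script polls capacity repeatedly.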

Something Useful

This is sort of useful already, and there are several things we can do at this point. We can use it to optimize VM placement for vGPUs: for example, I may want all the L4-3B vGPUs consolidated on the same host, or I may want to use it in a VDI-by-day, compute-by-night scenario, or something else. The possibilities are really endless. In this case I’m just going to share a simple script that powers up VMs until GPU capacity reaches a specific level and then shuts them down.

$DPP_MGR = Get-View -Id 'DirectPathProfileManager-DirectPathProfileMgr'
$Spec = New-Object VMware.Vim.DirectPathProfileManagerFilterSpec
$Spec.Names = 'NEW --> NVIDIA L4-3B profile'  #Accepts blank, single, or arrays
#vSphere Host
$Host_View = Get-View -ViewType HostSystem -Filter @{"Name" = "ESX04.wondernerd.local"}
#DPP Host
$DPP_Target_Entity = new-object VMware.Vim.DirectPathProfileManagerTargetHost
$DPP_Target_Entity.Host = $Host_View.MoRef
#Target
$DPP_QC = New-Object VMware.Vim.DirectPathProfileManagerCapacityQueryByName
$DPP_QC.Name = $spec.Names

#Priming Read
$DPP_Usage = $DPP_MGR.DirectPathProfileManagerQueryCapacity($DPP_Target_Entity, $DPP_QC)

Write-Output("Setting Vars:")
$Reserve_VM_Space = 2
$VM_Floor = 0
$Starting_VM = $DPP_Usage.Consumed + 1

Write-Output("Starting Loop")
Write-Output("DPP Remaining: " + $DPP_Usage.remaining)
Write-Output("Reserved Space: " + $Reserve_VM_Space)
while ($DPP_Usage.remaining -gt $Reserve_VM_Space) {
    #Create VM Name
    $VM_Name = "VM_GPU_0" + $Starting_VM
    Write-Output("Starting VM: " + $VM_Name)
    $Next_VM = Get-VM -Name $VM_Name
    Start-VM -VM $Next_VM
    $DPP_Usage = $DPP_MGR.DirectPathProfileManagerQueryCapacity($DPP_Target_Entity, $DPP_QC)
    $Starting_VM = $Starting_VM + 1
}
Write-Output("No more VMs to start. Cleaning up") 
Start-Sleep -Seconds 20
Write-Output("Starting Value is: $Starting_VM")
while ($DPP_Usage.remaining -gt 1 -and $Starting_VM -gt 1){
    $Starting_VM = $Starting_VM - 1
    #Create VM Name
    $VM_Name = "VM_GPU_0" + $Starting_VM
    Write-Output("Stopping VM: " + $VM_Name)
    $Next_VM = Get-VM -Name $VM_Name
    Stop-VMGuest -VM $Next_VM -Confirm:$false
    if ($Starting_VM -gt $VM_Floor){
        do {
            Start-Sleep -Seconds 5
            $Next_VM = Get-VM -Name $VM_Name
            $VM_Power = $Next_VM.PowerState
            Write-Output("The power state is: " + $VM_Power)
        } until ($VM_Power -eq "PoweredOff")
    }
    else {
        Write-Output("Reached the floor. Bye!")
        break
    }
    $DPP_Usage = $DPP_MGR.DirectPathProfileManagerQueryCapacity($DPP_Target_Entity, $DPP_QC)
}

Everything until the priming read looks identical to what we had seen previously. From there it gets into basic coding. We have a loop that goes through and counts up as we start more VMs. Each time it checks to see what the new DPP numbers are and goes until it reaches the capacity set.

We then wait 20 seconds for all the VMs to fully come online, at which point we start powering them down. The second loop works back through the VMs, shutting each one down. Here’s a video of what running the code looks like.
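The start-up loop in the script above can be dry-run without touching any VMs. The Python sketch below models only that first loop, under two stated assumptions: each powered-on VM consumes exactly one profile slot, and remaining capacity equals maximum minus consumed (which matches the output shown earlier, where UnusedReservation is 0). The plan_startup function is a hypothetical helper, not part of PowerCLI.

```python
from typing import List


def plan_startup(consumed: int, maximum: int, reserve: int) -> List[str]:
    """Return the VM names the start-up loop would power on, in order."""
    started = []
    next_vm = consumed + 1            # mirrors $Starting_VM = $DPP_Usage.Consumed + 1
    remaining = maximum - consumed    # assumption: no unused reservations
    while remaining > reserve:        # mirrors: remaining -gt $Reserve_VM_Space
        started.append("VM_GPU_0" + str(next_vm))
        remaining -= 1                # assumption: one profile slot per VM
        next_vm += 1
    return started


# Values from the host output above: 1 consumed, 8 max, 2 slots held in reserve.
print(plan_startup(consumed=1, maximum=8, reserve=2))
# prints: ['VM_GPU_02', 'VM_GPU_03', 'VM_GPU_04', 'VM_GPU_05', 'VM_GPU_06']
```

One thing the dry run makes visible is that the name is built by plain concatenation, so a tenth VM would be named VM_GPU_010 rather than VM_GPU_10. That is worth keeping in mind when naming the VMs the script expects to find.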

That brings us to the end of this blog. I’m going to cover the Python version of this code in another blog, as there are still some things being worked out. In summary, DPPs are an awesome addition to VCF and VVF 9.0. You should check them out!

Permanent link to this article: https://www.wondernerd.net/vcf-and-vvf-9-new-gpu-management-scripting-with-powershell-and-python-codeqt1767lv/
