Using IMDS to reallocate a replica
Using the IMDS, you can reallocate a replica from within the running container. Here, we’ll write a Python script using our IMDS SDK that reallocates the running replica if it doesn’t have enough free VRAM. You can also use the JSON request endpoint.
Running nvidia-smi to retrieve the free VRAM
In this example, we’re running nvidia-smi to check how much VRAM is available and piping the output to a log file. Then, we parse the log and compare the free VRAM against a minimum. If the replica doesn’t have enough free VRAM (we require at least 2GB free here), it fails the test.
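The check described above can be sketched roughly as follows. This is a minimal sketch, not the original script: the log file name `vram_check.log` and the helper names `parse_free_vram_mib` and `has_enough_vram` are hypothetical, and the body of `nodeFail` is an assumption about what failing the test looks like. The `nvidia-smi` query flags report free memory in MiB, one line per GPU.

```python
import subprocess
import sys

MIN_FREE_MIB = 2048          # the 2GB minimum chosen in the text
LOG_PATH = "vram_check.log"  # hypothetical log file name


def parse_free_vram_mib(smi_output: str) -> list[int]:
    """Parse one free-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def has_enough_vram(free_mib: list[int], minimum: int = MIN_FREE_MIB) -> bool:
    """True only if at least one GPU was reported and every GPU meets the minimum."""
    return bool(free_mib) and min(free_mib) >= minimum


def nodeFail(free_mib: list[int]) -> None:
    """Assumed failure behavior: report the failed check and exit non-zero."""
    print(f"VRAM check failed: {free_mib} MiB free, need {MIN_FREE_MIB} MiB")
    sys.exit(1)


def main() -> None:
    # Query free memory only, without header or units, so each line is a number.
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Write the raw output to the log file, then parse it back from the log.
    with open(LOG_PATH, "w") as log:
        log.write(result.stdout)
    with open(LOG_PATH) as log:
        free_mib = parse_free_vram_mib(log.read())
    if not has_enough_vram(free_mib):
        nodeFail(free_mib)
    print("VRAM check passed")
```

Calling `main()` inside the container runs the check. Keeping the parsing and the comparison in separate functions makes it possible to test them without a GPU present.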
In this example, the machine has only 500MB of VRAM free. As we would expect, the check fails because that is below the minimum requirement of 2048MB.
Using the IMDS SDK to reallocate a node
Now, using the Python IMDS SDK, we can automatically reallocate this replica so that it can land on a node that has enough free VRAM. We’ll change the nodeFail function to call the IMDS SDK instead.
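As a rough illustration of that change, the sketch below has `nodeFail` delegate to a client object instead of simply failing. The `ImdsClient` protocol, its `reallocate(reason)` method, and the `RecordingClient` stand-in are all hypothetical placeholders for the real SDK client; check the SDK reference for the actual class and method names before adapting this.

```python
from typing import Protocol

MIN_FREE_MIB = 2048  # the 2GB minimum chosen in the text


class ImdsClient(Protocol):
    """Hypothetical shape of an IMDS SDK client; not the real SDK interface."""

    def reallocate(self, reason: str) -> None: ...


def nodeFail(client: ImdsClient, free_mib: int, minimum: int = MIN_FREE_MIB) -> str:
    """Ask the IMDS to reallocate the replica and return the reason that was sent."""
    reason = f"Insufficient free VRAM: {free_mib} MiB free, {minimum} MiB required"
    client.reallocate(reason)
    return reason


class RecordingClient:
    """Stand-in client for local testing; it only records the reasons it receives."""

    def __init__(self) -> None:
        self.reasons: list[str] = []

    def reallocate(self, reason: str) -> None:
        self.reasons.append(reason)
```

Passing the client in as an argument keeps the VRAM check independent of the SDK, so the same `nodeFail` can be exercised locally with `RecordingClient` and wired to the real client in the container.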
Now, when a replica runs the check and fails, it will call the IMDS SDK and automatically reallocate the replica. Below are examples of IMDS SDK usage in other languages.
Using the JSON request endpoint to reallocate a node
We can also reallocate the node by sending a POST request to the IMDS reallocate endpoint. Here, we’ll use the same Python example, but with the POST request instead of the IMDS SDK.
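A minimal sketch of the POST variant, using only the Python standard library, might look like this. The endpoint URL, the `Metadata: true` header, and the `{"reason": ...}` body shape are assumptions made for illustration; confirm them against the IMDS API reference. `build_reallocate_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: placeholder endpoint; replace with the documented IMDS reallocate URL.
IMDS_REALLOCATE_URL = "http://169.254.169.254/v1/reallocate"
MIN_FREE_MIB = 2048  # the 2GB minimum chosen in the text


def build_reallocate_request(reason: str, url: str = IMDS_REALLOCATE_URL) -> urllib.request.Request:
    """Build the JSON POST request without sending it."""
    body = json.dumps({"reason": reason}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Metadata": "true",  # assumed header for metadata-service requests
        },
        method="POST",
    )


def nodeFail(free_mib: int, minimum: int = MIN_FREE_MIB) -> int:
    """Send the reallocate request and return the HTTP status code."""
    reason = f"Insufficient free VRAM: {free_mib} MiB free, {minimum} MiB required"
    request = build_reallocate_request(reason)
    # The timeout keeps the check from hanging if the endpoint is unreachable.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Building the request separately from sending it makes the payload easy to inspect outside the container, where the IMDS endpoint is not reachable.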
Now, when the test fails, the code will send a JSON POST request to the IMDS endpoint to reallocate the node.