Deploying Nutanix Enterprise AI (NAI) NVD Reference Application
Version 2.1.0
This version of the NAI deployment is based on the Nutanix Enterprise AI (NAI) v2.1.0
release.
```mermaid
stateDiagram-v2
    direction LR

    state DeployNAI {
        [*] --> DeployNAIAdmin
        DeployNAIAdmin --> InstallSSLCert
        InstallSSLCert --> DownloadModel
        DownloadModel --> CreateNAI
        CreateNAI --> [*]
    }

    [*] --> PreRequisites
    PreRequisites --> DeployNAI
    DeployNAI --> TestNAI : next section
    TestNAI --> [*]
```
Prepare NAI Docker Download Credentials
All NAI Docker images will be downloaded from the public Docker Hub registry. To download the images, you will need to log on to Nutanix Portal - NAI and create a Docker ID and access token.
- Login to Nutanix Portal - NAI using your credentials
- Click on Generate Access Token option
- Copy the generated Docker ID and access token
Warning
Currently there are issues with the Nutanix Portal when creating a Docker ID and access token. This will be fixed soon.
Until then, click on the Manage Access Token option and use the credentials listed there.
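Once you have a Docker ID and access token, you can optionally verify them from a workstation before pasting them into the NKP configuration later in this section. This is only a sketch; the `DOCKER_ID` and `DOCKER_TOKEN` values are placeholders for the credentials copied from the portal:

```shell
# Placeholders: substitute the Docker ID and access token copied from the portal
export DOCKER_ID="<your-docker-id>"
export DOCKER_TOKEN="<your-access-token>"

# Log in against the same registry endpoint used later in the
# Workspace Application Configuration Override
echo "${DOCKER_TOKEN}" | docker login https://index.docker.io/v1/ \
  --username "${DOCKER_ID}" --password-stdin
```

A successful `Login Succeeded` message confirms the token is usable as an image pull credential.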
Deploy NAI
In this section we will use the NKP Applications Catalog to deploy NAI.
Warning
The NAI catalog currently has to be added manually to the NKP Application Catalog. Future releases of NKP (v2.14 and beyond) may ship with the NAI application ready to set up.
- Add the `nutanix-apps-catalog` to NKP
- In the NKP GUI, go to Clusters
- Click on Management Cluster Workspace
- Go to Applications
- Search for Nutanix Enterprise AI
- Click on Enable
- Click on the Configuration tab
- Click on Workspace Application Configuration Override and paste the following YAML content:
```yaml
imagePullSecret:
  # Name of the image pull secret
  name: nai-iep-secret
  # Image registry credentials
  credentials:
    registry: https://index.docker.io/v1/
    username: _GA_release_docker_username
    password: _GA_release_docker_password
    email: _Email_associated_with_service_account
storageClassName: nai-nfs-storage
```
- Click on Enable on the top right-hand corner to enable Nutanix Enterprise AI
- Wait until the Nutanix Enterprise AI operator is Enabled
- Verify that the NAI Core Pods are running and healthy:
```shell
kubens nai-system
✔ Active namespace is "nai-system"

kubectl get po,deploy
NAME                                            READY   STATUS      RESTARTS   AGE
pod/nai-api-5bc49f5445-2bsrj                    1/1     Running     0          119m
pod/nai-api-db-migrate-9z4s5-fsks5              0/1     Completed   0          119m
pod/nai-db-0                                    1/1     Running     0          119m
pod/nai-iep-model-controller-7fd4cfc5cc-cxh2g   1/1     Running     0          119m
pod/nai-ui-77b7b789d5-wf67d                     1/1     Running     0          119m
pod/prometheus-nai-0                            2/2     Running     0          119m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nai-api                    1/1     1            1           119m
deployment.apps/nai-iep-model-controller   1/1     1            1           119m
deployment.apps/nai-ui                     1/1     1            1           119m
```
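If some pods are still starting, you can block until the core deployments finish rolling out instead of polling manually. A sketch using standard kubectl commands, with the deployment names taken from the output above:

```shell
# Wait for each NAI core deployment to become Available (names from `kubectl get deploy`)
for deploy in nai-api nai-iep-model-controller nai-ui; do
  kubectl rollout status deployment/"${deploy}" -n nai-system --timeout=300s
done
```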
Install SSL Certificate
In this section we will install an SSL certificate to access the NAI UI. This is required as the endpoint will only work over SSL with a valid certificate.
NAI UI is accessible using the Ingress Gateway.
The following steps show how cert-manager can be used to generate a self-signed certificate using the default `selfsigned-issuer` present in the cluster.
If you are using a Public Certificate Authority (CA) for the NAI SSL Certificate
If your organization generates certificates using a different mechanism, obtain the certificate and key and create a Kubernetes secret manually using the following command:
Skip the steps in this section that create a self-signed certificate resource.
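The exact command is not reproduced here; as a sketch, a CA-issued certificate and key can be loaded into the secret name that the self-signed flow below would otherwise create (assumed here to be `nai-cert` in `istio-system`, matching the Certificate resource used later):

```shell
# cert.pem / key.pem are placeholder paths to the CA-issued certificate and private key
kubectl create secret tls nai-cert \
  --cert=cert.pem --key=key.pem \
  -n istio-system
```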
- Get the Ingress host using the following command:
- Get the value of the `INGRESS_HOST` environment variable
- We will use the command output (e.g. `10.x.x.216`) as the IP address for NAI, as reserved in this section
- Construct the FQDN of the NAI UI using nip.io; we will use this FQDN as the certificate's Common Name (CN)
- Create the ingress resource certificate using the following command:
```shell
cat << EOF | k apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: nai-cert
  namespace: istio-system
spec:
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
  secretName: nai-cert
  commonName: nai.${INGRESS_HOST}.nip.io
  dnsNames:
  - nai.${INGRESS_HOST}.nip.io
  ipAddresses:
  - ${INGRESS_HOST}
EOF
```
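The nip.io FQDN used in the certificate is derived mechanically from the ingress IP. A minimal, locally runnable sketch (the IP below is a placeholder for your reserved ingress address):

```shell
# Placeholder ingress IP; use the value reserved for NAI earlier in this guide
INGRESS_HOST=10.0.0.216

# nip.io resolves any name of the form <label>.<ip>.nip.io to <ip>,
# so no DNS records are needed for this FQDN
NAI_FQDN="nai.${INGRESS_HOST}.nip.io"
echo "${NAI_FQDN}"   # → nai.10.0.0.216.nip.io
```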
- Create the certificate using the following command
- Patch the ingress gateway's IP address into the certificate file.
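For reference, the ingress gateway consumes the generated certificate through its TLS configuration. A hedged sketch of what the relevant Istio Gateway server block might look like (the gateway name, selector, and hosts are assumptions for a default Istio install; the `credentialName` must match the Certificate's `secretName`, `nai-cert`):

```yaml
# Sketch only: gateway name/selector are assumptions, not the NKP-managed resource
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: nai-ingress-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: nai-cert   # must match the Certificate's secretName
    hosts:
    - "nai.*.nip.io"
```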
Accessing the UI
- In a browser, open the following URL to connect to the NAI UI
- Change the password for the `admin` user
- Log in using the `admin` user and password.
Download Model
We will download and use the Llama 3.1 8B model that we sized for in the previous section.
- In the NAI GUI, go to Models
- Click on Import Model from Hugging Face
- Choose the `meta-llama/Meta-Llama-3.1-8B-Instruct` model
- Input your Hugging Face token that was created in the previous section and click Import
- Provide the Model Instance Name as `Meta-Llama-3.1-8B-Instruct` and click Import
- Go to the VSC Terminal to monitor the download
Get jobs in the `nai-admin` namespace:

```shell
kubens nai-admin
✔ Active namespace is "nai-admin"

kubectl get jobs
NAME                                       COMPLETIONS   DURATION   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-model-job   0/1           4m56s      4m56s
```
Validate creation of pods and PVC:

```shell
kubectl get po,pvc
NAME                                             READY   STATUS    RESTARTS   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-model-job-9nmff   1/1     Running   0          4m49s

NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-pvc-claim   Bound    pvc-a63d27a4-2541-4293-b680-514b8b890fe0   28Gi       RWX            nai-nfs-storage   <unset>                 2d
```
Verify the download of the model using the pod logs:

```shell
kubectl logs -f nai-c0d6ca61-1629-43d2-b57a-9f-model-job-9nmff
/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:983: UserWarning: Not enough free disk space to download the file. The expected file size is: 0.05 MB. The target location /data/model-files only has 0.00 MB free disk space.
  warnings.warn(
tokenizer_config.json: 100%|██████████| 51.0k/51.0k [00:00<00:00, 3.26MB/s]
tokenizer.json: 100%|██████████| 9.09M/9.09M [00:00<00:00, 35.0MB/s]
model-00004-of-00004.safetensors: 100%|██████████| 1.17G/1.17G [00:12<00:00, 94.1MB/s]
model-00001-of-00004.safetensors: 100%|██████████| 4.98G/4.98G [04:23<00:00, 18.9MB/s]
model-00003-of-00004.safetensors: 100%|██████████| 4.92G/4.92G [04:33<00:00, 18.0MB/s]
model-00002-of-00004.safetensors: 100%|██████████| 5.00G/5.00G [04:47<00:00, 17.4MB/s]
Fetching 16 files: 100%|██████████| 16/16 [05:42<00:00, 21.43s/it]
## Successfully downloaded model_files
Deleting directory : /data/hf_cache
```
- Optional: verify the events in the namespace for the PVC creation:

```shell
$ k get events | awk '{print $1, $3}'
3m43s Scheduled
3m43s SuccessfulAttachVolume
3m36s Pulling
3m29s Pulled
3m29s Created
3m29s Started
3m43s SuccessfulCreate
90s Completed
3m53s Provisioning
3m53s ExternalProvisioning
3m45s ProvisioningSucceeded
3m53s PvcCreateSuccessful
3m48s PvcNotBound
3m43s ModelProcessorJobActive
90s ModelProcessorJobComplete
```
The model is downloaded to the Nutanix Files PVC volume.
After a successful model import, the model will appear with an Active status in the NAI UI under the Models menu.
Create and Test Inference Endpoint
In this section we will create an inference endpoint using the downloaded model.
- Navigate to the Inference Endpoints menu and click on the Create Endpoint button
- Fill in the following details:
    - Endpoint Name: `llama-8b`
    - Model Instance Name: `Meta-Llama-3.1-8B-Instruct`
    - Use GPUs for running the models: Checked
    - No of GPUs (per instance):
    - GPU Card: `NVIDIA-L40S` (or other available GPU)
    - No of Instances: `1`
    - API Keys: Create a new API key or use an existing one
- Click on Create
- Monitor the `nai-admin` namespace to check if the services are coming up
- Check the events in the `nai-admin` namespace to make sure all resources are healthy:

```shell
$ kubectl get events -n nai-admin --sort-by='.lastTimestamp' | awk '{print $1, $3, $5}'
110s FinalizerUpdate Updated
110s FinalizerUpdate Updated
110s RevisionReady Revision
110s ConfigurationReady Configuration
110s LatestReadyUpdate LatestReadyRevisionName
110s Created Created
110s Created Created
110s Created Created
110s InferenceServiceReady InferenceService
110s Created Created
```
- Once the services are running, check the status of the inference service
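The events above (RevisionReady, InferenceServiceReady) suggest the endpoint is served as a KServe InferenceService. A hedged sketch of checking its status with standard kubectl commands (the generated resource names are assumptions; look for READY to be True):

```shell
# List InferenceServices in the endpoint's namespace; READY should show True
kubectl get inferenceservice -n nai-admin

# Watch the endpoint pods come up
kubectl get po -n nai-admin -w
```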