Deploying Nutanix Enterprise AI (NAI) NVD Reference Application

Version 2.1.0

This version of the NAI deployment is based on the Nutanix Enterprise AI (NAI) v2.1.0 release.

stateDiagram-v2
    direction LR

    state DeployNAI {
        [*] --> DeployNAIAdmin
        DeployNAIAdmin -->  InstallSSLCert
        InstallSSLCert --> DownloadModel
        DownloadModel --> CreateNAI
        CreateNAI --> [*]
    }

    [*] --> PreRequisites
    PreRequisites --> DeployNAI 
    DeployNAI --> TestNAI : next section
    TestNAI --> [*]

Prepare NAI Docker Download Credentials

All NAI Docker images will be downloaded from the public Docker Hub registry. In order to download the images, you will need to logon to Nutanix Portal - NAI and create a Docker ID and access token.

Login to Nutanix Portal - NAI using your credentials
Click on Generate Access Token option
Copy the generated Docker ID and access token

Warning

Currently there are issues with the Nutanix Portal to create a Docker ID and access token. This will be fixed soon.

Click on the Manage Access Token option and use the credentials listed there until the Nutanix Portal is fixed.

Deploy NAI

In this section we will use the NKP Applications Catalog to deploy NAI.

Warning

NAI catalog has to be added manually for now to NKP Application Catalog. However future releases of NKP v2.14 and beyond may be released with NAI application ready to setup.

Add the nutanix-apps-catalog to NKP

CommandCommand output

nkp create catalog nutanix-apps-catalog -w kommander-workspace --branch main \
--url https://github.com/nutanix-cloud-native/nkp-nutanix-product-catalog

Catalog GitRepository nutanix-apps-catalog created. Use 'nkp edit gitrepository -n kommander nutanix-apps-catalog' to change its configuration.

Validate if the catalog is added. Make sure the status of thenutanix-apps-catalog is Ready as shown in the command output

CommandCommand output

kubectl get gitrepo -A

NAMESPACE        NAME                   URL                                                                                                   AGE     READY   STATUS
kommander-flux   management             https://git-operator-git.git-operator-system.svc.cluster.local/repositories/kommander/kommander.git   5h47m   True    stored artifact for revision 'main@sha1:d5638636da96c752eac79b0d1710eea3f6b57215'
kommander        nutanix-apps-catalog   https://github.com/nutanix-cloud-native/nkp-nutanix-product-catalog                                   2s      True    stored artifact for revision 'main@sha1:662482cce8cc5f22189e09a85a0dce5d69266145'

In the NKP GUI, Go to Clusters
Click on Management Cluster Workspace
Go to Applications
Search for Nutanix Enterprise AI
Click on Enable
Click on Configuration tab

Click on Workspace Application Configuration Override and paste the following yaml content

Template yamlSample yaml

imagePullSecret:
  # Name of the image pull secret
  name: nai-iep-secret
  # Image registry credentials
  credentials:
    registry: https://index.docker.io/v1/
    username: _GA_release_docker_username
    password: _GA_release_docker_password
    email: _Email_associated_with_service_account
storageClassName: nai-nfs-storage

imagePullSecret:
  # Name of the image pull secret
  name: nai-iep-secret
  # Image registry credentials
  credentials:
    registry: https://index.docker.io/v1/
    username: ntnxsvcgpt
    password: dckr_pat_Yxxxxxxxxxxxxxxxxxxxx
    email: first.last@company.com
storageClassName: nai-nfs-storage

Click on Enable on the top right-hand corner to enable Nutanix Enterprise AI
Wait until Nutanix Enterprise AI operator is Enabled.

Verify that the NAI Core Pods are running and healthy

CommandCommand output

kubens nai-system
kubectl get po,deploy

kubens nai-system
kubectl get po,deploy
✔ Active namespace is "nai-system"
NAME                                            READY   STATUS      RESTARTS   AGE
pod/nai-api-5bc49f5445-2bsrj                    1/1     Running     0          119m
pod/nai-api-db-migrate-9z4s5-fsks5              0/1     Completed   0          119m
pod/nai-db-0                                    1/1     Running     0          119m
pod/nai-iep-model-controller-7fd4cfc5cc-cxh2g   1/1     Running     0          119m
pod/nai-ui-77b7b789d5-wf67d                     1/1     Running     0          119m
pod/prometheus-nai-0                            2/2     Running     0          119m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nai-api                    1/1     1            1           119m
deployment.apps/nai-iep-model-controller   1/1     1            1           119m
deployment.apps/nai-ui                     1/1     1            1           119m

Install SSL Certificate

In this section we will install SSL Certificate to access the NAI UI. This is required as the endpoint will only work with a ssl endpoint with a valid certificate.

NAI UI is accessible using the Ingress Gateway.

The following steps show how cert-manager can be used to generate a self signed certificate using the default selfsigned-issuer present in the cluster.

If you are using Public Certificate Authority (CA) for NAI SSL Certificate

If an organization generates certificates using a different mechanism then obtain the certificate + key and create a kubernetes secret manually using the following command:

kubectl -n istio-system create secret tls nai-cert --cert=path/to/nai.crt --key=path/to/nai.key

Skip the steps in this section to create a self-signed certificate resource.

Get the Ingress host using the following command:

INGRESS_HOST=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

Get the value of INGRESS_HOST environment variable
CommandCommand output
echo $INGRESS_HOST
10.x.x.216
We will use the command output e.g: 10.x.x.216 as the IP address for NAI as reserved in this section
Construct the FQDN of NAI UI using nip.io and we will use this FQDN as the certificate's Common Name (CN).
Template URLSample URL
nai.${INGRESS_HOST}.nip.io
nai.10.x.x.216.nip.io

Create the ingress resource certificate using the following command:

cat << EOF | k apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: nai-cert
  namespace: istio-system
spec:
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
  secretName: nai-cert
  commonName: nai.${INGRESS_HOST}.nip.io
  dnsNames:
  - nai.${INGRESS_HOST}.nip.io
  ipAddresses:
  - ${INGRESS_HOST}
EOF

Create the certificate using the following command

kubectl apply -f $HOME/airgap-nai/nai-cert.yaml

Patch the ingress gateway's IP address to the certificate file.

CommandCommand output

kubectl patch  gateways.networking.istio.io -n knative-serving knative-ingress-gateway --type merge --patch-file=/dev/stdin <<EOF
spec:
  servers:
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: nai-cert
EOF

gateway.networking.istio.io/knative-ingress-gateway patched

Accessing the UI

In a browser, open the following URL to connect to the NAI UI
```
https://nai.10.x.x.216.nip.io
```
Change the password for the admin user
Login using admin user and password.

Download Model

We will download and user llama3 8B model which we sized for in the previous section.

In the NAI GUI, go to Models
Click on Import Model from Hugging Face
Choose the meta-llama/Meta-Llama-3.1-8B-Instruct model
Input your Hugging Face token that was created in the previous section and click Import
Provide the Model Instance Name as Meta-Llama-3.1-8B-Instruct and click Import

Go to VSC Terminal to monitor the download

CommandCommand output

Get jobs in nai-admin namespace

kubens nai-admin

kubectl get jobs

Validate creation of pods and PVC

kubectl get po,pvc

Verify download of model using pod logs

kubectl logs -f _pod_associated_with_job

Get jobs in nai-admin namespace

kubens nai-admin

✔ Active namespace is "nai-admin"

kubectl get jobs

NAME                                       COMPLETIONS   DURATION   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-model-job   0/1           4m56s      4m56

Validate creation of pods and PVC

kubectl get po,pvc

NAME                                             READY   STATUS    RESTARTS   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-model-job-9nmff   1/1     Running   0          4m49s

NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-pvc-claim   Bound    pvc-a63d27a4-2541-4293-b680-514b8b890fe0   28Gi       RWX            nai-nfs-storage   <unset>                 2d

Verify download of model using pod logs

kubectl logs -f nai-c0d6ca61-1629-43d2-b57a-9f-model-job-9nmff 

/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:983: UserWarning: Not enough free disk space to download the file. The expected file size is: 0.05 MB. The target location /data/model-files only has 0.00 MB free disk space.
warnings.warn(
tokenizer_config.json: 100%|██████████| 51.0k/51.0k [00:00<00:00, 3.26MB/s]
tokenizer.json: 100%|██████████| 9.09M/9.09M [00:00<00:00, 35.0MB/s]<00:30, 150MB/s]
model-00004-of-00004.safetensors: 100%|██████████| 1.17G/1.17G [00:12<00:00, 94.1MB/s]
model-00001-of-00004.safetensors: 100%|██████████| 4.98G/4.98G [04:23<00:00, 18.9MB/s]
model-00003-of-00004.safetensors: 100%|██████████| 4.92G/4.92G [04:33<00:00, 18.0MB/s]
model-00002-of-00004.safetensors: 100%|██████████| 5.00G/5.00G [04:47<00:00, 17.4MB/s]
Fetching 16 files: 100%|██████████| 16/16 [05:42<00:00, 21.43s/it]:33<00:52, 9.33MB/s]
## Successfully downloaded model_files|██████████| 5.00G/5.00G [04:47<00:00, 110MB/s] 

Deleting directory : /data/hf_cache

Optional - verify the events in the namespace for the pvc creation

CommandCommand output

k get events | awk '{print $1, $3}'

$ k get events | awk '{print $1, $3}'

3m43s Scheduled
3m43s SuccessfulAttachVolume
3m36s Pulling
3m29s Pulled
3m29s Created
3m29s Started
3m43s SuccessfulCreate
90s   Completed
3m53s Provisioning
3m53s ExternalProvisioning
3m45s ProvisioningSucceeded
3m53s PvcCreateSuccessful
3m48s PvcNotBound
3m43s ModelProcessorJobActive
90s   ModelProcessorJobComplete

The model is downloaded to the Nutanix Files pvc volume.

After a successful model import, you will see it in Active status in the NAI UI under Models menu

Create and Test Inference Endpoint

In this section we will create an inference endpoint using the downloaded model.

Navigate to Inference Endpoints menu and click on Create Endpoint button
Fill the following details:
- Endpoint Name: llama-8b
- Model Instance Name: Meta-LLaMA-8B-Instruct
- Use GPUs for running the models : Checked
- No of GPUs (per instance):
- GPU Card: NVIDIA-L40S (or other available GPU)
- No of Instances: 1
- API Keys: Create a new API key or use an existing one
Click on Create

Monitor the nai-admin namespace to check if the services are coming up

CommandCommand output

kubens nai-admin
kubectl get po,deploy

kubens nai-admin
get po,deploy
NAME                                                     READY   STATUS        RESTARTS   AGE
pod/llama8b-predictor-00001-deployment-9ffd786db-6wkzt   2/2     Running       0          71m

NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/llama8b-predictor-00001-deployment   1/1     1            0           3d17h

Check the events in the nai-admin namespace for resource usage to make sure all

CommandCommand output

kubectl get events -n nai-admin --sort-by='.lastTimestamp' | awk '{print $1, $3, $5}'

$ kubectl get events -n nai-admin --sort-by='.lastTimestamp' | awk '{print $1, $3, $5}'

110s FinalizerUpdate Updated
110s FinalizerUpdate Updated
110s RevisionReady Revision
110s ConfigurationReady Configuration
110s LatestReadyUpdate LatestReadyRevisionName
110s Created Created
110s Created Created
110s Created Created
110s InferenceServiceReady InferenceService
110s Created Created

Once the services are running, check the status of the inference service

CommandCommand output

kubectl get isvc

kubectl get isvc

NAME      URL                                          READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION       AGE
llama8b   http://llama8b.nai-admin.svc.cluster.local   True           100                              llama8b-predictor-00001   3d17h