Deploying Nutanix Enterprise AI (NAI) NVD Reference Application
Version 2.6.0
This version of the NAI deployment is based on the Nutanix Enterprise AI (NAI) v2.6.0 release.
```mermaid
stateDiagram-v2
    direction LR
    state DeployNAI {
        [*] --> DeployNAIAdmin
        DeployNAIAdmin --> InstallSSLCert
        InstallSSLCert --> DownloadModel
        DownloadModel --> CreateNAI
        CreateNAI --> [*]
    }
    [*] --> PreRequisites
    PreRequisites --> DeployNAI
    DeployNAI --> TestNAI : next section
    TestNAI --> [*]
```
Prepare for NAI Deployment
Changes in NAI v2.6.0

- KServe is at least v0.15.0
- Cert-manager is at least v1.17.2
- OpenTelemetry Operator is at least v0.102.0
Enable NKP Applications through NKP GUI
Enable these NKP Operators from NKP GUI.
Note
In this lab, we will use the Management Cluster Workspace to deploy Nutanix Enterprise AI (NAI). In a customer environment, however, it is recommended to use a separate workload NKP cluster.
- In the NKP GUI, Go to Clusters
- Click on Management Cluster Workspace
- Go to Applications
- Search for and enable the following applications in this order to install the dependencies for the NAI application:
    - Kube-prometheus-stack: version 71.0.0 or later (pre-installed on NKP cluster)
Enable Pre-requisite Applications
Early Access (EA) / Technical Preview (TP) Software with NAI v2.6.0

In this lab, we will deploy EA and TP versions of the following software to test these capabilities:

- Envoy Gateway in AI Gateway mode
- Nutanix Enterprise AI
    - Unified Endpoints: multiple endpoints for HA and token-based rate limiting
    - Providers: add remote endpoints from providers to utilize their models in Nutanix Enterprise AI workloads
We will enable the following pre-requisite applications through the command line:

- Envoy Gateway v1.6.3
- KServe v0.15.0 in raw deployment mode
Note
The following applications are pre-installed on NKP clusters with a Pro license:

- Cert-Manager v1.17.2 or higher
Check if Cert-Manager is installed (it is pre-installed on NKP clusters). If it is not installed, use the following command to install it.
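A minimal sketch of the check and install, assuming the standard jetstack Helm chart is acceptable in your environment (the repository alias, release name, and `crds.enabled` flag are assumptions, not values taken from this lab):

```shell
# Check whether cert-manager is already present (pre-installed on NKP Pro clusters)
kubectl get pods -n cert-manager

# If it is missing, install cert-manager v1.17.2 from the jetstack chart repository
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --version v1.17.2 \
  --set crds.enabled=true
```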
- Open Terminal in VSCode
- Run the command to load the environment variables
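The exact load command is not shown above; assuming the lab keeps its variables in `$HOME/.env` (as later steps reference), a likely form is:

```shell
# Load the lab environment variables into the current shell session
source $HOME/.env
```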
- Install Envoy Gateway CRDs v1.6.3

```text
Pulled: docker.io/envoyproxy/gateway-crds-helm:v1.6.3
Digest: sha256:e94d3fdf5d4cb08e2c8efa8c1da133b9804c2e88a3acb4d0e20adb8755a60174
customresourcedefinition.apiextensions.k8s.io/backendtlspolicies.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/gatewayclasses.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/gateways.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/grpcroutes.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/httproutes.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/referencegrants.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/tcproutes.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/tlsroutes.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/udproutes.gateway.networking.k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/xbackendtrafficpolicies.gateway.networking.x-k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/xlistenersets.gateway.networking.x-k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/xmeshes.gateway.networking.x-k8s.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/backends.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/backendtrafficpolicies.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/clienttrafficpolicies.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/envoyextensionpolicies.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/envoypatchpolicies.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/envoyproxies.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/httproutefilters.gateway.envoyproxy.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/securitypolicies.gateway.envoyproxy.io serverside-applied
```
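The command that produced the output above is not captured in this page. Given the `Pulled:` line and the `serverside-applied` results, a plausible form (the release name `eg-crds` is an assumption) is:

```shell
# Render the Envoy Gateway CRD chart and server-side apply the CRDs
helm template eg-crds oci://docker.io/envoyproxy/gateway-crds-helm \
  --version v1.6.3 | kubectl apply --server-side -f -
```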
- Create the AI Gateway mode configuration template file `eg-config-for-gateway-mode.yaml` to enable advanced features:

```yaml
config:
  envoyGateway:
    gateway:
      controllerName: "gateway.envoyproxy.io/gatewayclass-controller"
    logging:
      level:
        default: "info"
    provider:
      kubernetes:
        rateLimitDeployment:
          container:
            image: "docker.io/envoyproxy/ratelimit:99d85510"
          patch:
            type: "StrategicMerge"
            value:
              spec:
                template:
                  spec:
                    containers:
                    - imagePullPolicy: "IfNotPresent"
                      name: "envoy-ratelimit"
                      image: "docker.io/envoyproxy/ratelimit:99d85510"
      type: "Kubernetes"
    extensionApis:
      enableEnvoyPatchPolicy: true
      enableBackend: true
    extensionManager:
      maxMessageSize: 11Mi
      backendResources:
      - group: inference.networking.k8s.io
        kind: InferencePool
        version: v1
      hooks:
        xdsTranslator:
          translation:
            listener:
              includeAll: true
            route:
              includeAll: true
            cluster:
              includeAll: true
            secret:
              includeAll: true
          post:
          - "Translation"
          - "Cluster"
          - "Route"
      service:
        fqdn:
          hostname: "ai-gateway-controller.nai-system.svc.cluster.local"
          port: 1063
    rateLimit:
      backend:
        type: "Redis"
        redis:
          url: "redis-sentinel.nai-system.svc.cluster.local:6379"
```
- Install Envoy Gateway v1.6.3

```text
Pulled: docker.io/envoyproxy/gateway-helm:v1.6.3
Digest: sha256:6dca101fdc0d41c702c1070eb42db119a2768a33388ba28041ae615cbe262aaf
Release "eg" has been upgraded. Happy Helming!
NAME: eg
LAST DEPLOYED: Sat Apr 4 08:34:21 2026
NAMESPACE: envoy-gateway-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
**************************************************************************
*** PLEASE BE PATIENT: Envoy Gateway may take a few minutes to install ***
**************************************************************************

Envoy Gateway is an open source project for managing Envoy Proxy as a standalone or Kubernetes-based application gateway.

Thank you for installing Envoy Gateway! 🎉

Your release is named: eg. 🎉
Your release is in namespace: envoy-gateway-system. 🎉

To learn more about the release, try:

  $ helm status eg -n envoy-gateway-system
  $ helm get all eg -n envoy-gateway-system

To have a quickstart of Envoy Gateway, please refer to https://gateway.envoyproxy.io/latest/tasks/quickstart.

To get more details, please visit https://gateway.envoyproxy.io and https://github.com/envoyproxy/gateway.
```
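The install command itself is likewise not captured; based on the release name `eg`, the namespace, and the chart shown in the output, a likely form, passing the AI Gateway configuration file created in the previous step, is:

```shell
# Install/upgrade Envoy Gateway v1.6.3 with the AI Gateway mode configuration
helm upgrade --install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.6.3 \
  -n envoy-gateway-system --create-namespace \
  -f eg-config-for-gateway-mode.yaml
```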
- Check if Envoy Gateway resources are ready
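A sketch of the readiness check; the deployment name `envoy-gateway` is the upstream chart's default and is an assumption here:

```shell
# Wait for the Envoy Gateway controller to become Available, then list its pods
kubectl wait --timeout=5m -n envoy-gateway-system \
  deployment/envoy-gateway --for=condition=Available
kubectl get pods -n envoy-gateway-system
```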
- Open the `$HOME/.env` file in VSCode
- Add (append) the following line and save it
- Load the `.env` variables
- Install KServe using the following commands

```text
Pulled: ghcr.io/kserve/charts/kserve-crd:v0.15.0
Digest: sha256:57ad1a5475fd625cb558214ba711752aa77b7d91686a391a5f5320cfa72f3fa8
Release "kserve-crd" has been upgraded. Happy Helming!
NAME: kserve-crd
LAST DEPLOYED: Sat Apr 4 08:40:05 2026
NAMESPACE: kserve
STATUS: deployed
REVISION: 2
TEST SUITE: None
```
- Check if KServe pods are running
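For example (the `kserve` namespace is taken from the install output above):

```shell
# The KServe controller pods should all be Running
kubectl get pods -n kserve
```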
- Install OpenTelemetry Operator:

```text
Release "opentelemetry-operator" has been upgraded. Happy Helming!
NAME: opentelemetry-operator
LAST DEPLOYED: Sat Apr 4 08:42:43 2026
NAMESPACE: opentelemetry
STATUS: deployed
REVISION: 2
NOTES:
[WARNING] No resource limits or requests were set. Consider setting resource requests and limits via the `resources` field.
opentelemetry-operator has been installed. Check its status by running:
kubectl --namespace opentelemetry get pods -l "app.kubernetes.io/instance=opentelemetry-operator"

Visit https://github.com/open-telemetry/opentelemetry-operator for instructions on how to create & configure OpenTelemetryCollector and Instrumentation custom resources by using the Operator.
```
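The command that produced this output is not shown; a likely form, assuming the upstream `open-telemetry` Helm chart repository (the release name and namespace match the output above):

```shell
# Install the OpenTelemetry Operator from the upstream Helm charts
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm upgrade --install opentelemetry-operator open-telemetry/opentelemetry-operator \
  -n opentelemetry --create-namespace
```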
Note
It may take a few minutes for each application to be up and running. Monitor the deployment to make sure that these applications are running before moving on to the next section.
Deploy NAI
We will use the Docker login credentials we created in the previous section to download the NAI Docker images.
Change the Docker login credentials

The following environment variable values need to be changed from your own Docker credentials to the credentials downloaded from the Nutanix Portal:

- `$DOCKER_NAI_USERNAME`
- `$DOCKER_NAI_PASSWORD`
- `$DOCKER_NAI_EMAIL`
- Open the `$HOME/.env` file in VSCode
- Add (append) the following environment variables and save it. Template:

```bash
export REGISTRY_SECRET_NAME=_k8s_secret_for_nai
export DOCKER_SERVER=https://index.docker.io/v1/
export DOCKER_NAI_USERNAME=_GA_release_docker_username
export DOCKER_NAI_PASSWORD=_GA_release_docker_password
export DOCKER_NAI_EMAIL=_GA_release_docker_email
export NAI_CORE_VERSION=_GA_release_nai_core_version
export NAI_API_RWX_STORAGECLASS=_nkp_rwx_storage_class
export NAI_DEFAULT_RWO_STORAGECLASS=_nkp_rwo_storage_class
export NKP_WORKSPACE_NAMESPACE=_nkp_workspace_name
```

Example values:

```bash
export REGISTRY_SECRET_NAME=nai-regcred
export DOCKER_SERVER=https://index.docker.io/v1/
export DOCKER_NAI_USERNAME=ntnxsvcgpt
export DOCKER_NAI_PASSWORD=dckr_pat_XXXXXXXXXXXXXXXXXXXXXXXXX
export DOCKER_NAI_EMAIL=ntnxsvcgpt
export NAI_CORE_VERSION=2.6.0
export NAI_API_RWX_STORAGECLASS=nai-nfs-storage
export NAI_DEFAULT_RWO_STORAGECLASS=nutanix-volume
export NKP_WORKSPACE_NAMESPACE=kommander-workspace
```

- Source the environment variables
- Create the `nai-system` namespace to install Nutanix Enterprise AI
- Create docker registry secrets in both the `nai-system` and `envoy-gateway-system` namespaces
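A sketch of these two steps using the variables sourced from `$HOME/.env`; reusing the same pull credentials for the `envoy-gateway-system` namespace is an assumption:

```shell
# Create the namespace that will host NAI
kubectl create namespace nai-system

# Create the image-pull secret in both namespaces
for ns in nai-system envoy-gateway-system; do
  kubectl create secret docker-registry ${REGISTRY_SECRET_NAME} \
    -n ${ns} \
    --docker-server=${DOCKER_SERVER} \
    --docker-username=${DOCKER_NAI_USERNAME} \
    --docker-password=${DOCKER_NAI_PASSWORD} \
    --docker-email=${DOCKER_NAI_EMAIL}
done
```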
- Add the NAI helm charts
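The repository URL is not shown in this page; the `ntnx-charts/nai-core` reference used by the install command later in this lab suggests something like the following (the URL is an assumption):

```shell
# Add the Nutanix chart repository under the alias used later in this lab
helm repo add ntnx-charts https://nutanix.github.io/helm-releases
helm repo update ntnx-charts
```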
- Install the NAI operator
- Check if all NAI operator pods are running
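For example:

```shell
# The operator pods (e.g. the clickhouse operator) should be Running
kubectl get pods -n nai-system
```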
- Install NAI core in AI gateway mode

Info

This installation takes about 5-10 minutes depending on the available resources.

```bash
helm upgrade --install nai-core ntnx-charts/nai-core --version=$NAI_CORE_VERSION -n nai-system --create-namespace --wait \
  --set "global.imagePullSecrets[0].name=${REGISTRY_SECRET_NAME}" \
  --set "naiAIGateway.enabled=true" \
  --set "naiApi.storageClassName=${NAI_API_RWX_STORAGECLASS}" \
  --set "defaultStorageClassName=${NAI_DEFAULT_RWO_STORAGECLASS}" \
  --set "naiMonitoring.opentelemetry.storageClassName=${NAI_API_RWX_STORAGECLASS}" \
  --set "nai-clickhouse-keeper.clickhouseKeeper.storage.storageClass=${NAI_DEFAULT_RWO_STORAGECLASS}" \
  --set "nai-clickhouse-server.clickhouse.storage.storageClass=${NAI_DEFAULT_RWO_STORAGECLASS}" \
  --set "naiMonitoring.nodeExporter.serviceMonitor.namespaceSelector.matchNames[0]=${NKP_WORKSPACE_NAMESPACE}" \
  --set "naiMonitoring.dcgmExporter.serviceMonitor.namespaceSelector.matchNames[0]=${NKP_WORKSPACE_NAMESPACE}" \
  --insecure-skip-tls-verify
```

With the variables expanded, the command looks like this:

```bash
helm upgrade --install nai-core ntnx-charts/nai-core --version=2.6.0 -n nai-system --create-namespace --wait \
  --set "global.imagePullSecrets[0].name=nai-regcred" \
  --set "naiAIGateway.enabled=true" \
  --set "naiApi.storageClassName=nai-nfs-storage" \
  --set "defaultStorageClassName=nutanix-volume" \
  --set "naiMonitoring.opentelemetry.storageClassName=nai-nfs-storage" \
  --set "nai-clickhouse-keeper.clickhouseKeeper.storage.storageClass=nutanix-volume" \
  --set "nai-clickhouse-server.clickhouse.storage.storageClass=nutanix-volume" \
  --set "naiMonitoring.nodeExporter.serviceMonitor.namespaceSelector.matchNames[0]=kommander" \
  --set "naiMonitoring.dcgmExporter.serviceMonitor.namespaceSelector.matchNames[0]=kommander" \
  --insecure-skip-tls-verify
```
- Verify that the NAI core pods are running and healthy; additional jobs will complete and pods will come up to establish NAI functionality

```text
Active namespace is "nai-system".
NAME                                                    READY   STATUS      RESTARTS   AGE
ai-gateway-controller-6b786974b5-6gqt6                  1/1     Running     0          13m
chi-nai-clickhouse-server-chcluster1-0-0-0              1/1     Running     0          2m1s
chk-nai-clickhouse-keeper-chkeeper-0-0-0                1/1     Running     0          105s
iam-database-bootstrap-vaszw-s7mpb                      0/1     Completed   0          2m16s
iam-proxy-686fff8f6d-cbgjs                              1/1     Running     0          2m16s
iam-proxy-control-plane-854c76b8cc-tblt8                1/1     Running     0          2m16s
iam-themis-757776777b-mq9pz                             1/1     Running     0          2m15s
iam-themis-bootstrap-oeiws-7vmcr                        0/1     Completed   0          2m16s
iam-ui-7f6bb5b477-cb9tz                                 1/1     Running     0          2m16s
iam-user-authn-78d6b7d8df-tsgs4                         1/1     Running     0          2m16s
nai-api-557d94c66f-hxx57                                1/1     Running     0          2m16s
nai-api-db-migrate-g2bni-mbdt4                          0/1     Completed   0          2m16s
nai-clickhouse-schema-job-1775300590-4gfnh              0/1     Completed   0          2m16s
nai-db-0                                                1/1     Running     0          2m16s
nai-iep-model-controller-77f44f88c-ndpp2                1/1     Running     0          2m16s
nai-labs-86cb964886-xhdxk                               1/1     Running     0          2m16s
nai-oauth2-proxy-bdf7f85cf-2nxp5                        1/1     Running     0          2m15s
nai-oidc-client-registration-fbb1j-jdwt5                0/1     Completed   0          2m16s
nai-operators-nai-clickhouse-operator-f8f666db9-z8vfc   2/2     Running     0          13m
nai-otel-collector-collector-94mm9                      1/1     Running     0          2m14s
nai-otel-collector-collector-cpdx6                      1/1     Running     0          2m14s
nai-otel-collector-collector-pqk5q                      1/1     Running     0          2m14s
nai-otel-collector-collector-qxg5r                      1/1     Running     0          2m14s
nai-otel-collector-collector-rpl4f                      1/1     Running     0          2m14s
nai-otel-collector-collector-wf2fl                      1/1     Running     0          2m14s
nai-otel-collector-collector-wz5qz                      1/1     Running     0          2m14s
nai-otel-collector-targetallocator-fbc8688d7-c9vm5      1/1     Running     0          2m14s
nai-ui-8648bd7dbc-mgf5z                                 1/1     Running     0          2m16s
redis-standalone-684f6dd8f7-7rz2v                       2/2     Running     0          13m
```
- The Prometheus monitoring from the NKP catalog has specific RBAC rules applied. Create the required `clusterRole` rule to enable Nutanix Enterprise AI to fetch metrics:

```bash
kubectl patch clusterrole nai-otel-role --type='json' -p='[
  {
    "op": "add",
    "path": "/rules/-",
    "value": {
      "apiGroups": [""],
      "resources": ["services/kube-prometheus-stack-prometheus-node-exporter"],
      "verbs": ["get"]
    }
  }
]'
```

```bash
kubectl patch servicemonitor nai-node-exporter-monitor -n nai-system --type='json' -p='[
  {"op": "add", "path": "/spec/endpoints/0/bearerTokenFile", "value": "/var/run/secrets/kubernetes.io/serviceaccount/token"},
  {"op": "replace", "path": "/spec/endpoints/0/scheme", "value": "https"},
  {"op": "add", "path": "/spec/endpoints/0/tlsConfig", "value": {"insecureSkipVerify": true}}
]'
```
Install SSL Certificate and Gateway Elements
In this section, we will install an SSL certificate to access the NAI UI. This is required because the endpoint only works over SSL with a valid certificate.
NAI UI is accessible using the Ingress Gateway.
The following steps show how cert-manager can be used to generate a self-signed certificate using the default selfsigned-issuer present in the cluster.
If you are using Public Certificate Authority (CA) for NAI SSL Certificate
If your organization generates certificates using a different mechanism, obtain the certificate and key and create a Kubernetes secret manually using the following command:
Skip the steps in this section to create a self-signed certificate resource.
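A sketch of that manual secret creation, assuming the certificate and key are saved locally as `tls.crt` and `tls.key` (the filenames are assumptions; the secret name `nai-cert` matches the one referenced by the gateway later in this section):

```shell
# Create the TLS secret that the ingress gateway will reference
kubectl create secret tls nai-cert -n nai-system \
  --cert=tls.crt --key=tls.key
```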
- Get the NAI UI ingress gateway host using the following command:

```bash
NAI_UI_ENDPOINT=$(kubectl get svc -n envoy-gateway-system \
  -l "gateway.envoyproxy.io/owning-gateway-name=nai-ingress-gateway,gateway.envoyproxy.io/owning-gateway-namespace=nai-system" \
  -o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}' | grep -v '^$' || \
kubectl get svc -n envoy-gateway-system \
  -l "gateway.envoyproxy.io/owning-gateway-name=nai-ingress-gateway,gateway.envoyproxy.io/owning-gateway-namespace=nai-system" \
  -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}')
```
- Get the value of the `NAI_UI_ENDPOINT` environment variable:

```bash
echo $NAI_UI_ENDPOINT
```
- We will use the command output (e.g. `10.x.x.216`) as the IP address for NAI, as reserved in this section
- Construct the FQDN of the NAI UI using nip.io; we will use this FQDN as the certificate's Common Name (CN)
- Create the ingress resource certificate using the following command:

```bash
cat << EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: nai-cert
  namespace: nai-system
spec:
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
  secretName: nai-cert
  commonName: nai.${NAI_UI_ENDPOINT}.nip.io
  dnsNames:
  - nai.${NAI_UI_ENDPOINT}.nip.io
  ipAddresses:
  - ${NAI_UI_ENDPOINT}
EOF
```
- Patch the Envoy gateway with the `nai-cert` certificate details
- Create EnvoyProxy
- Patch the `nai-ingress-gateway` resource with the new `EnvoyProxy` details
Accessing the UI
- In a browser, open the following URL to connect to the NAI UI
- Change the password for the `admin` user
- Login using the `admin` user and password
Download Model
We will download and use the Llama 3.1 8B model that we sized for in the previous section.
- In the NAI GUI, go to Models
- Click on Import Model from Hugging Face
- Choose the `meta-llama/Meta-Llama-3.1-8B-Instruct` model
- Input your Hugging Face token that was created in the previous section and click Import
- Provide the Model Instance Name as `Meta-Llama-3.1-8B-Instruct` and click Import
- Go to the VSCode Terminal to monitor the download
Get the jobs in the `nai-admin` namespace:

```bash
$ kubens nai-admin
✔ Active namespace is "nai-admin"
$ kubectl get jobs
NAME                                       COMPLETIONS   DURATION   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-model-job   0/1           4m56s      4m56s
```

Validate the creation of the pod and PVC:

```bash
$ kubectl get po,pvc
NAME                                             READY   STATUS    RESTARTS   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-model-job-9nmff   1/1     Running   0          4m49s

NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
nai-c0d6ca61-1629-43d2-b57a-9f-pvc-claim   Bound    pvc-a63d27a4-2541-4293-b680-514b8b890fe0   28Gi       RWX            nai-nfs-storage   <unset>                 2d
```

Verify the download of the model using the pod logs:

```bash
$ kubectl logs -f nai-c0d6ca61-1629-43d2-b57a-9f-model-job-9nmff
/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:983: UserWarning: Not enough free disk space to download the file. The expected file size is: 0.05 MB. The target location /data/model-files only has 0.00 MB free disk space.
warnings.warn(
tokenizer_config.json: 100%|██████████| 51.0k/51.0k [00:00<00:00, 3.26MB/s]
tokenizer.json: 100%|██████████| 9.09M/9.09M [00:00<00:00, 35.0MB/s]
model-00004-of-00004.safetensors: 100%|██████████| 1.17G/1.17G [00:12<00:00, 94.1MB/s]
model-00001-of-00004.safetensors: 100%|██████████| 4.98G/4.98G [04:23<00:00, 18.9MB/s]
model-00003-of-00004.safetensors: 100%|██████████| 4.92G/4.92G [04:33<00:00, 18.0MB/s]
model-00002-of-00004.safetensors: 100%|██████████| 5.00G/5.00G [04:47<00:00, 17.4MB/s]
Fetching 16 files: 100%|██████████| 16/16 [05:42<00:00, 21.43s/it]
## Successfully downloaded model_files
Deleting directory : /data/hf_cache
```
- Optional: verify the events in the namespace for the PVC creation

```bash
$ kubectl get events | awk '{print $1, $3}'
3m43s Scheduled
3m43s SuccessfulAttachVolume
3m36s Pulling
3m29s Pulled
3m29s Created
3m29s Started
3m43s SuccessfulCreate
90s Completed
3m53s Provisioning
3m53s ExternalProvisioning
3m45s ProvisioningSucceeded
3m53s PvcCreateSuccessful
3m48s PvcNotBound
3m43s ModelProcessorJobActive
90s ModelProcessorJobComplete
```
The model is downloaded to the Nutanix Files PVC volume.

After a successful model import, you will see it in Active status in the NAI UI under the Models menu.

Create and Test Inference Endpoint
In this section we will create an inference endpoint using the downloaded model.
- Navigate to Inference Endpoints menu and click on Create Endpoint button
- Fill in the following details based on GPU or CPU availability:
Tip

NAI from v2.3 can host a model of up to 7 billion parameters on CPU-only Nutanix nodes.

With GPUs:

- Endpoint Name: `llama-8b`
- Model Instance Name: `Meta-Llama-3.1-8B-Instruct`
- Use GPUs for running the models: Checked
- No of GPUs (per instance):
- GPU Card: `NVIDIA-L40S` (or other available GPU)
- No of Instances: `1`
- API Keys: Create a new API key or use an existing one

CPU only:

- Endpoint Name: `llama-8b`
- Model Instance Name: `Meta-Llama-3.1-8B-Instruct`
- Use GPUs for running the models: leave unchecked
- No of Instances: `1`
- API Keys: Create a new API key or use an existing one
- Click on Create
- Monitor the `nai-admin` namespace to check if the services are coming up
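For example:

```shell
# Watch the endpoint's pods come up in the nai-admin namespace
kubectl get pods -n nai-admin -w
```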
- Check the events in the `nai-admin` namespace for resource usage to make sure there are no errors

```bash
$ kubectl get events -n nai-admin --sort-by='.lastTimestamp' | awk '{print $1, $3, $5}'
110s FinalizerUpdate Updated
110s FinalizerUpdate Updated
110s RevisionReady Revision
110s ConfigurationReady Configuration
110s LatestReadyUpdate LatestReadyRevisionName
110s Created Created
110s Created Created
110s Created Created
110s InferenceServiceReady InferenceService
110s Created Created
```
- Once the services are running, check the status of the inference service
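A sketch of the status check, using the KServe InferenceService resource that backs the endpoint:

```shell
# READY should report True once the endpoint is serving
kubectl get inferenceservice -n nai-admin
```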