# Deploying GPT-in-a-Box NVD Reference Application using GitOps (FluxCD)
```mermaid
stateDiagram-v2
    direction LR
    state TestNAI {
        [*] --> CheckInferencingService
        CheckInferencingService --> TestChatApp
        TestChatApp --> [*]
    }
    [*] --> PreRequisites
    PreRequisites --> DeployNAI
    DeployNAI --> TestNAI : previous section
    TestNAI --> [*]
```
## Test Querying Inference Service API

- Prepare the API key that was created in the previous section
- Construct your `curl` command using the API key obtained above, and run it in the terminal:

```shell
curl -k -X 'POST' 'https://nai.10.x.x.216.nip.io/api/v1/chat/completions' \
  -H "Authorization: Bearer $API_KEY" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "llama-8b",
        "messages": [
          {
            "role": "user",
            "content": "What is the capital of France?"
          }
        ],
        "max_tokens": 256,
        "stream": false
      }'
```
```json
{
  "id": "9e55abd1-2c91-4dfc-bd04-5db78f65c8b2",
  "object": "chat.completion",
  "created": 1728966493,
  "model": "llama-8b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris. It is a historic city on the Seine River in the north-central part of the country. Paris is also the political, cultural, and economic center of France."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 17,
    "completion_tokens": 41,
    "total_tokens": 58
  },
  "system_fingerprint": ""
}
```
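To pull just the model's answer out of a response like the one above, a minimal sketch (assuming `jq` is installed; the field names match the response shown here):

```shell
# Save a response like the one above, then extract the assistant's reply.
cat > response.json <<'EOF'
{
  "choices": [
    { "message": { "role": "assistant", "content": "The capital of France is Paris." } }
  ]
}
EOF
# -r prints the raw string without surrounding quotes
jq -r '.choices[0].message.content' response.json
```

The same filter can be piped directly onto the `curl` command, e.g. `curl ... | jq -r '.choices[0].message.content'`.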
We have a successful NAI deployment.
## Accessing LLM Frontend UI

- In the NAI GUI, under Endpoints, click on the `llama8b` endpoint
- Click on **Test**
- Provide a sample prompt and check the output
## Sample Chat Application
We have a sample chat application that uses NAI to provide chatbot capabilities. We will install and use the chat application in this section.
- Download and push the chat application container image from the upstream registry to the internal Harbor registry
- Run the following commands to deploy the chat application
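A sketch of the pull/tag/push step, assuming a hypothetical internal Harbor registry at `harbor.example.com` with a `nai` project (substitute your own registry URL and project):

```shell
# Pull the chat app image from the upstream registry
docker pull johnugeorge/nai-chatapp:0.12

# Re-tag it for the internal Harbor registry (hypothetical URL and project)
docker tag johnugeorge/nai-chatapp:0.12 harbor.example.com/nai/nai-chatapp:0.12

# Log in to Harbor and push the re-tagged image
docker login harbor.example.com
docker push harbor.example.com/nai/nai-chatapp:0.12
```

If you push to Harbor, reference the Harbor image path in the Deployment manifest instead of the upstream one.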
Create the namespace
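For example (the `chat` namespace name matches the manifests that follow):

```shell
# Namespace used by the chat app Deployment, Service, and HTTPRoute
kubectl create namespace chat
```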
Create the application
```shell
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nai-chatapp
  namespace: chat
  labels:
    app: nai-chatapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nai-chatapp
  template:
    metadata:
      labels:
        app: nai-chatapp
    spec:
      containers:
      - name: nai-chatapp
        image: johnugeorge/nai-chatapp:0.12
        ports:
        - containerPort: 8502
EOF
```
Create the service
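The Service manifest is not shown here; below is a minimal sketch, inferred from the Deployment's `app: nai-chatapp` label and the port 8502 that the HTTPRoute targets (verify it against your environment):

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: nai-chatapp
  namespace: chat
spec:
  # Select the pods created by the nai-chatapp Deployment
  selector:
    app: nai-chatapp
  ports:
  - port: 8502
    targetPort: 8502
EOF
```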
- Insert `chat` as the subdomain in the `nai.10.x.x.216.nip.io` main domain. Example complete URL: `chat.nai.10.x.x.216.nip.io`
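The subdomain insertion above can be sketched in shell (same base domain as in the manifests):

```shell
# Build the chat URL by prefixing the subdomain to the main domain
BASE_DOMAIN="nai.10.x.x.216.nip.io"
CHAT_HOST="chat.${BASE_DOMAIN}"
echo "https://${CHAT_HOST}"   # prints https://chat.nai.10.x.x.216.nip.io
```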
```shell
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: nai-chatapp-route
  namespace: chat           # Same namespace as your chat app service
spec:
  parentRefs:
  - name: nai-ingress-gateway
    namespace: nai-system   # Namespace of the Gateway
  hostnames:
  - "chat.nai.10.x.x.216.nip.io"  # Input Gateway IP address
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: nai-chatapp
      kind: Service
      port: 8502
EOF
```
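A quick verification pass, assuming `kubectl` access to the cluster and the hostnames used above:

```shell
# Wait for the chat app Deployment to finish rolling out
kubectl -n chat rollout status deployment/nai-chatapp

# Confirm the HTTPRoute was created and accepted
kubectl -n chat get httproute nai-chatapp-route

# Fetch the chat UI through the gateway (-k: self-signed certificate)
curl -k -I https://chat.nai.10.x.x.216.nip.io
```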
- We should be able to see the chat application running on the NAI cluster.
- Input the following:
    - Endpoint URL - e.g. `https://nai.10.x.x.216.nip.io/api/v1/chat/completions` (can be found under Endpoints in the NAI GUI)
    - Endpoint Name - e.g. `llama-8b`
    - API key - created during Endpoint creation
We have successfully deployed the following:
- Inferencing endpoint
- A sample chat application that uses NAI to provide chatbot capabilities