# Build Your ChatBot with Open Platform for Enterprise AI

## Generative AI: A Transformational Force for Enterprises

Generative AI demonstrates immense potential in enhancing productivity and driving innovation across various industries. Its ability to address enterprise challenges by offering innovative and efficient solutions makes it a powerful tool for businesses seeking a competitive edge.

Here are several ways in which generative AI can assist enterprises:

* Data Analysis and Insights: By analyzing vast amounts of enterprise data, generative AI can uncover patterns, provide actionable insights, and support better decision-making processes.

* Document Management: Generative AI streamlines the organization, summarization, and retrieval of documents, enhancing efficiency in knowledge management systems.

* Customer Support and Chatbots: AI-driven chatbots can provide 24/7 customer service, respond to inquiries, and even handle complex issues by understanding user intents and offering personalized solutions.

* Code Generation and Software Development: AI models can write code snippets, debug software, and even recommend solutions to programming challenges, accelerating the software development lifecycle.

* Fraud Detection and Risk Management: By analyzing transaction patterns and detecting anomalies, generative AI helps enterprises identify and mitigate potential risks or fraudulent activities.

* Healthcare and Well-being: In enterprises with healthcare initiatives, generative AI can support mental health programs by generating therapeutic content or helping manage employee well-being through tailored recommendations.

By leveraging generative AI in these areas, enterprises can not only solve existing problems but also unlock new opportunities for innovation and growth.

In this blog, we introduce the Open Platform for Enterprise AI (OPEA), a powerful GenAI framework to help you build your GenAI applications. First, we explore the features and attributes of OPEA, and then we show you how to build your ChatBot with OPEA step by step.

## Open Platform for Enterprise AI

Open Platform for Enterprise AI (OPEA) is an open platform project that allows you to create open, multi-provider, robust, and composable GenAI solutions that harness the best innovations across the ecosystem.

The OPEA platform includes:

* A detailed framework of composable building blocks for state-of-the-art generative AI systems, including LLMs, data stores, and prompt engines.
* Architectural blueprints of the retrieval-augmented generative AI component stack structure and end-to-end workflows.
* A four-step assessment for grading generative AI systems on performance, features, trustworthiness, and enterprise-grade readiness.

OPEA is designed with the following considerations:

**Efficient**
Infrastructure Utilization: Harnesses existing infrastructure, including AI accelerators or other hardware of your choosing.
OPEA supports a wide range of hardware, including Intel Xeon, Gaudi Accelerator, Intel Arc GPU, NVIDIA GPU, and AMD ROCm.

**Seamless**
Enterprise Integration: Seamlessly integrates with enterprise software, providing heterogeneous support and stability across systems and networks.

**Open**
Innovation and Flexibility: Brings together best-of-breed innovations and is free from proprietary vendor lock-in, ensuring flexibility and adaptability.

**Ubiquitous**
Versatile Deployment: Runs everywhere through a flexible architecture designed for cloud, data center, edge, and PC environments.

**Trusted**
Security and Transparency: Features a secure, enterprise-ready pipeline with tools for responsibility, transparency, and traceability.

**Scalable**
Ecosystem and Growth: Access to a vibrant ecosystem of partners to help build and scale your solution.

## Build Your ChatBot with OPEA

OPEA [GenAIExamples](https://github.com/opea-project/GenAIExamples) are designed to give developers an easy entry into generative AI, featuring microservice-based samples that simplify the processes of deploying, testing, and scaling GenAI applications.
All examples are fully compatible with Docker and Kubernetes and support a wide range of hardware platforms such as Gaudi, Xeon, and NVIDIA GPU, ensuring flexibility and efficiency for your GenAI adoption.

In this section, we deploy a GenAIExample, ChatQnA, on Amazon Web Services (AWS) in two different ways: **Docker** and **Kubernetes**.

ChatQnA is a Retrieval-Augmented Generation (RAG) chatbot, which integrates the power of retrieval systems, fetching relevant domain-specific knowledge, with generative AI to produce human-like responses. The ChatQnA dataflow is shown in Figure 1.

RAG chatbots can address various use cases by providing highly accurate and context-aware interactions, and can be used in customer support, internal knowledge management, finance and accounting, and technical support.

![chatbot_dataflow](assets/chatqna-flow.png)
<div align="center">
Figure 1. ChatQnA Dataflow
</div>

### Prerequisites

**Hardware**

* CPU: 4th Gen (or later) Intel Xeon with Intel Advanced Matrix Extensions (AMX)
* Minimum Memory Size: 64 GB
* Storage: 100 GB disk space

The recommended configurations are the Amazon EC2 c7i.8xlarge and c7i.16xlarge instance types. These instances run on 4th Generation (and later) Intel Xeon Scalable processors with AMX, which are optimized for demanding workloads.

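Before deploying, you can verify that the instance's CPU actually exposes AMX by inspecting the CPU flags; a quick check using standard Linux tools:

```
# Print the AMX-related CPU flags; no output means AMX is unavailable.
grep -o 'amx[^ ]*' /proc/cpuinfo | sort -u
```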
**Software**

* OS: Ubuntu 22.04 LTS

**Required Models:**

By default, the embedding, reranking, and LLM models are set to the following values:

| Service   | Model                     |
|-----------|---------------------------|
| Embedding | BAAI/bge-base-en-v1.5     |
| Reranking | BAAI/bge-reranker-base    |
| LLM       | Intel/neural-chat-7b-v3-3 |

### Deploy with Docker on an AWS EC2 Instance

Here are the steps to deploy ChatQnA using Docker:

1. Download the code and set up the environment variables.
2. Run Docker Compose.
3. Consume the ChatQnA service.

#### 1. Download Code and Set Up Environment Variables

Follow these steps to download the code and set up environment variables:

```
git clone https://github.com/opea-project/GenAIExamples.git
```

Set the required environment variables:
```
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
source set_env.sh
```
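If you want to serve different models than the defaults listed above, the compose files read the model IDs from environment variables. A hedged sketch, assuming the variable names `EMBEDDING_MODEL_ID`, `RERANK_MODEL_ID`, and `LLM_MODEL_ID` (check `set_env.sh` for the exact names in your version); export any override after sourcing `set_env.sh` so its defaults do not overwrite it:

```
# Hypothetical override -- verify the variable name in set_env.sh first.
export LLM_MODEL_ID="Your_Preferred_LLM_Model"
```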
#### 2. Start the Docker Containers

```
docker compose up -d
```
This command automatically downloads the following Docker images and starts the containers.

| Image name                                    | Tag                   |
|-----------------------------------------------|-----------------------|
| redis/redis-stack                             | 7.2.0-v9              |
| opea/dataprep-redis                           | latest                |
| ghcr.io/huggingface/text-embeddings-inference | cpu-1.5               |
| opea/retriever-redis                          | latest                |
| ghcr.io/huggingface/text-embeddings-inference | cpu-1.5               |
| ghcr.io/huggingface/text-generation-inference | sha-e4201f4-intel-cpu |
| opea/chatqna                                  | latest                |
| opea/chatqna-ui                               | latest                |
| opea/nginx                                    | latest                |

#### Check Docker Container Status

Run this command to check the Docker container status:
`docker ps -a`

Make sure the status of every container is `Up`, as in the following output:
```
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ef155b97ef13 opea/nginx:latest "/docker-entrypoint.…" 3 minutes ago Up 3 minutes 0.0.0.0:80->80/tcp, :::80->80/tcp chatqna-xeon-nginx-server
79173ee7a359 opea/chatqna-ui:latest "docker-entrypoint.s…" 3 minutes ago Up 3 minutes 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp chatqna-xeon-ui-server
bdb99b1263cd opea/chatqna:latest "python chatqna.py" 3 minutes ago Up 3 minutes 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp chatqna-xeon-backend-server
7e5c3f8c2bba opea/retriever-redis:latest "python retriever_re…" 3 minutes ago Up 3 minutes 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server
7e8254869ee4 opea/dataprep-redis:latest "python prepare_doc_…" 3 minutes ago Up 3 minutes 0.0.0.0:6007->6007/tcp, :::6007->6007/tcp dataprep-redis-server
135e0e180ce5 ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu "text-generation-lau…" 3 minutes ago Up 41 seconds 0.0.0.0:9009->80/tcp, [::]:9009->80/tcp tgi-service
ffefc6d4ada2 ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "text-embeddings-rou…" 3 minutes ago Up 3 minutes 0.0.0.0:6006->80/tcp, [::]:6006->80/tcp tei-embedding-server
17b22a057002 ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "text-embeddings-rou…" 3 minutes ago Up 3 minutes 0.0.0.0:8808->80/tcp, [::]:8808->80/tcp tei-reranking-server
cf91b1a4f5d2 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 3 minutes ago Up 3 minutes 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db
```
#### Check TGI Service Is Ready

It takes a few minutes for the TGI service to download the LLM model and warm up for inference.

Check the TGI service log to make sure it is ready.

Run this command to check the log:
`docker logs tgi-service | grep Connected`

The following log line indicates the TGI service is ready:
```
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
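If you prefer not to check the log by hand, you can poll it until the `Connected` line appears; a minimal sketch using only `docker logs` and `grep`:

```
# Poll the TGI log every 10 seconds until the router reports "Connected".
until docker logs tgi-service 2>&1 | grep -q Connected; do
    echo "Waiting for tgi-service to be ready..."
    sleep 10
done
echo "tgi-service is ready."
```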
#### Consume the ChatQnA Service

Wait until the **tgi-service is ready** before consuming the ChatQnA service.

Open the following URL in your browser:

```
http://{Public-IPv4-address}:80
```
Make sure to access the AWS EC2 instance through its `Public-IPv4-address`.

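You can also exercise the pipeline from the command line against the backend server, which the `docker ps` output above shows listening on port 8888. A hedged example, assuming the `/v1/chatqna` route used by the ChatQnA backend (adjust if your version differs):

```
# Replace {Public-IPv4-address} with your instance's public IPv4 address.
curl http://{Public-IPv4-address}:8888/v1/chatqna \
    -H "Content-Type: application/json" \
    -d '{"messages": "What is OPEA?"}'
```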
![consume chatqna](assets/output.gif)
<div align="center">
Figure 2. Access ChatQnA
</div>

### Deploy with Kubernetes on an AWS EC2 Instance

This section assumes you have already set up Kubernetes on your EC2 instance. If not, please refer to [k8s_install_kubespray](https://github.com/opea-project/docs/blob/main/guide/installation/k8s_install/k8s_install_kubespray.md) to set it up.

Here are the steps to deploy ChatQnA using Kubernetes:

1. Download the code and set up the environment variables.
2. Start the Kubernetes services.
3. Consume the ChatQnA service.

#### 1. Download Code and Set Up Environment Variables

Follow these steps to download the code and set up environment variables.

(Skip this if you have already downloaded the code)
```
git clone https://github.com/opea-project/GenAIExamples.git
```

Set the required environment variables:
```
cd GenAIExamples/ChatQnA/kubernetes/intel/cpu/xeon/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s|insert-your-huggingface-token-here|${HUGGINGFACEHUB_API_TOKEN}|g" chatqna.yaml
```
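Before applying the manifest, it is worth confirming that the placeholder token was actually replaced; for example:

```
# If this prints nothing, the placeholder was replaced successfully.
grep "insert-your-huggingface-token-here" chatqna.yaml
```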
#### 2. Start Kubernetes Services

```
kubectl apply -f chatqna.yaml
```

##### Check Kubernetes Status

1. Check the service status to get the port number for accessing ChatQnA:
```
kubectl get services
```

![kubernetes services](assets/kube_service.png)
<div align="center">
Figure 3. Kubernetes Service
</div>
The nginx NodePort is **31146**.

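The NodePort value is assigned by Kubernetes and will likely differ in your cluster. Rather than reading it off the listing, you can query it directly (assuming the service is named `chatqna-nginx`, as shown in Figure 3):

```
# Query the NodePort assigned to the chatqna-nginx service.
kubectl get svc chatqna-nginx -o jsonpath='{.spec.ports[0].nodePort}'
```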
2. Check the pod status:
```
kubectl get pods
```
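To block until everything is up instead of re-running `kubectl get pods`, `kubectl wait` can be used; a minimal sketch (default namespace assumed):

```
# Wait up to 10 minutes for all pods in the namespace to become Ready.
kubectl wait --for=condition=Ready pod --all --timeout=600s
```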
Make sure all the pods are running and ready:
![kubernetes pods](assets/kube_pod.png)
<div align="center">
Figure 4. Kubernetes Pod Status
</div>

#### Consume the ChatQnA Service

Open the following URL in your browser:
```
http://{Public-IPv4-address}:31146
```

Here the port number `31146` is the NodePort exposed by the Kubernetes service `chatqna-nginx`, as shown in Figure 3.

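If the NodePort is not reachable from your machine (for example, because the EC2 security group blocks it), `kubectl port-forward` is a quick alternative for local testing, assuming the `chatqna-nginx` service listens on port 80:

```
# Forward local port 8080 to the chatqna-nginx service, then open http://localhost:8080
kubectl port-forward svc/chatqna-nginx 8080:80
```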
For an example interaction with ChatQnA, please refer to the [consume service section](#consume-the-chatqna-service) above for details.
