Commit c1c1d6a

fix: updates for RedisAI v1.2 - switch backend to PyTorch, use Hummingbird.ml for scikit-learn -> TorchScript
1 parent 7983181 commit c1c1d6a

5 files changed: +67 -47 lines changed
.gitignore

Lines changed: 3 additions & 1 deletion
@@ -1,3 +1,5 @@
 .DS_Store
 .vscode
-venv
+.venv
+target
+iris.pt

README.md

Lines changed: 29 additions & 23 deletions
@@ -2,33 +2,29 @@

 ## Step 0: Setup RedisAI

-To use RedisAI, well, you need RedisAI. I've found the easiest way to do this is with Docker. First, pull the redismod image—it contians Redis with several popular modules ready to go:
+To use RedisAI, well, you need RedisAI. I've found the easiest way to do this is with Docker. First, pull the redismod image—it contains Redis with several popular modules ready to go:

     $ docker image pull redislabs/redismod

 Then run the image:

-    $ docker run \
-    -p 6379:6379 \
-    redislabs/redismod \
-    --loadmodule /usr/lib/redis/modules/redisai.so \
-    ONNX redisai_onnxruntime/redisai_onnxruntime.so
+    $ docker run -p 6379:6379 --name redismod redislabs/redismod

 And, you've got RedisAI up and running!

 ## Step 1: Setup Python Environment

-You need a Python environment to make this all work. I used Python 3.8—the latest, greatest, and most updatest at the time of this writing. I also used `venv` to manage my environment.
+You need a Python environment to make this all work. I used Python 3.9—the latest, greatest, and most updatest at the time of this writing. I also used `venv` to manage my environment.

-I'll assume you can download and install Python 3.8 on your own. So lets go ahead and setup the environment:
+I'll assume you can download and install Python 3.9 on your own. So lets go ahead and setup the environment:

-    $ python3.8 -m venv venv
+    $ python3.9 -m venv .venv

 Once `venv` is installed, you need to activate it:

-    $ . venv/bin/activate
+    $ . ./.venv/bin/activate

-Now when you run `python` from the command line, it will always point to Python3.8 and any libraries you install will only be for this specific environment. Usually, this includes a dated version of pip so go ahead an update that as well:
+Now when you run `python` from the command line, it will always point to Python3.9 and any libraries you install will only be for this specific environment. Usually, this includes a dated version of pip so go ahead an update that as well:

     $ pip install --upgrade pip

@@ -45,17 +41,18 @@ Next, let's install all the dependencies. These are all listed in `requirements.

 Run that command, and you'll have all the dependencies installed and will be ready to run the code.

-## Step 3: Build the ONNX Model
+## Step 3: Build the TorchScript Model

-This is as easy as running the following:
+Load and train a Sklearn LogisticRegression model using the Iris Data Set. Use Microsoft's Hummingbird.ml to convert the Sklearn model into a TorchScript model for loading into RedisAI. Run the `build.py` Python script to generate the `iris.pt` model file:

     $ python build.py

 ## Step 4: Deploy the Model into RedisAI

 NOTE: This requires redis-cli. If you don't have redis-cli, I've found the easiest way to get it is to download, build, and install Redis itself. Details can be found at the [Redis quickstart](https://redis.io/topics/quickstart) page:

-    $ redis-cli -x AI.MODELSET iris ONNX CPU BLOB < iris.onnx
+    $ redis-cli -x AI.MODELSTORE iris TORCH CPU BLOB < iris.pt
+    OK

 ## Step 5: Make Some Predictions

@@ -67,21 +64,30 @@ Set the input tensor with 2 sets of inputs of 4 values each:

     > AI.TENSORSET iris:in FLOAT 2 4 VALUES 5.0 3.4 1.6 0.4 6.0 2.2 5.0 1.5

-Make the predictions:
+Make the predictions (inferences) by executing the model:

-    > AI.MODELRUN iris INPUTS iris:in OUTPUTS iris:inferences iris:scores
+    > AI.MODELEXECUTE iris INPUTS 1 iris:in OUTPUTS 2 iris:inferences iris:scores

 Check the predictions:

-    > AI.TENSORGET iris_out:predictions VALUES
-
+    > AI.TENSORGET iris:inferences VALUES
     1) (integer) 0
     2) (integer) 2

 Check the scores:

-    > AI.TENSORGET iris_out:scores VALUES
-
-    (error) ERR tensor key is empty
-
-What? The output tensor for the scores is required to run the model, but nothing is written to it. I'm still trying to track down this bug. `¯\_(ツ)_/¯`
+    > AI.TENSORGET iris:scores VALUES
+    1) "0.96567678451538086"
+    2) "0.034322910010814667"
+    3) "3.4662525649764575e-07"
+    4) "0.00066925224382430315"
+    5) "0.45369619131088257"
+    6) "0.54563456773757935"
+
+### References
+
+* https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html
+* https://pytorch.org
+* https://pytorch.org/docs/stable/jit.html
+* https://microsoft.github.io/hummingbird/
+* https://github.com/microsoft/hummingbird
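
Reviewer note on the README change: the new `AI.MODELEXECUTE` form puts a key count before the input and output key names, and the scores reply comes back as one flat list. The sketch below is only an illustration of those two points, not part of the commit. The helper names (`tensorset_args`, `modelexecute_args`, `predicted_classes`) are hypothetical, and the argument lists it builds could be passed to any Redis client that accepts raw commands, which is an assumption about your setup rather than something the README shows.

```python
from typing import List, Sequence


def tensorset_args(key: str, shape: Sequence[int], values: Sequence[float]) -> List[str]:
    """Build the arguments for AI.TENSORSET with a FLOAT tensor, as in Step 5."""
    return ["AI.TENSORSET", key, "FLOAT", *map(str, shape), "VALUES", *map(str, values)]


def modelexecute_args(model: str, inputs: Sequence[str], outputs: Sequence[str]) -> List[str]:
    """Build the arguments for AI.MODELEXECUTE; each key list is preceded by its count."""
    return [
        "AI.MODELEXECUTE", model,
        "INPUTS", str(len(inputs)), *inputs,
        "OUTPUTS", str(len(outputs)), *outputs,
    ]


def predicted_classes(flat_scores: Sequence[str], n_classes: int) -> List[int]:
    """Group a flat AI.TENSORGET ... VALUES reply into rows and take each row's argmax."""
    scores = [float(s) for s in flat_scores]
    rows = [scores[i:i + n_classes] for i in range(0, len(scores), n_classes)]
    return [row.index(max(row)) for row in rows]


# Scores copied from the README output in this diff (2 samples x 3 iris classes).
scores_reply = [
    "0.96567678451538086", "0.034322910010814667", "3.4662525649764575e-07",
    "0.00066925224382430315", "0.45369619131088257", "0.54563456773757935",
]

print(modelexecute_args("iris", ["iris:in"], ["iris:inferences", "iris:scores"]))
print(predicted_classes(scores_reply, 3))  # [0, 2]
```

The argmax of each row of scores gives classes 0 and 2, which agrees with the `iris:inferences` values shown in the README.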

build.py

Lines changed: 21 additions & 11 deletions
@@ -1,26 +1,36 @@
 from sklearn.datasets import load_iris
 from sklearn.model_selection import train_test_split
 from sklearn.linear_model import LogisticRegression
+from hummingbird.ml import convert
+from zipfile import ZipFile
+import shutil
+import os

-from skl2onnx import convert_sklearn
-from skl2onnx.common.data_types import FloatTensorType
+TORCH_FILE = 'iris.torch'
+TORCH_ARCHIVE = f'{TORCH_FILE}.zip' # the output of torch model save()
+TORCHSCRIPT_BLOB_SRC = 'deploy_model.zip' # internal (in zip) torchscript blob
+TORCHSCRIPT_BLOB_DEST = 'iris.pt' # output name for extracted torchscript blob

 # prepare the train and test data
 iris = load_iris()
 X, y = iris.data, iris.target
 X_train, X_test, y_train, y_test = train_test_split(X, y)

-# train a model
+# train the model - using logistic regression classifier
 model = LogisticRegression(max_iter=5000)
 model.fit(X_train, y_train)

-# convert the model to ONNX
-initial_types = [
-    ('input', FloatTensorType([None, 4]))
-]
-
-onnx_model = convert_sklearn(model, initial_types=initial_types)
+# use hummingbird.ml to convert sklearn model to torchscript model (torch.jit backend)
+torch_model = convert(model, 'torch.jit', test_input=X_train, extra_config={})

 # save the model
-with open("iris.onnx", "wb") as f:
-    f.write(onnx_model.SerializeToString())
+torch_model.save(TORCH_FILE)
+
+# extract the TorchScript binary payload
+with ZipFile(TORCH_ARCHIVE) as z:
+    with z.open(TORCHSCRIPT_BLOB_SRC) as zf, open(TORCHSCRIPT_BLOB_DEST, 'wb') as f:
+        shutil.copyfileobj(zf, f)
+
+# clean up - remove the zip file
+if os.path.exists(TORCH_ARCHIVE):
+    os.remove(TORCH_ARCHIVE)
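
Reviewer note on the extraction step: the new `build.py` copies a single member out of the archive written by `save()` instead of unpacking everything. A minimal stdlib-only sketch of that pattern follows, run against a stand-in archive so it needs neither Hummingbird nor PyTorch. The member name `deploy_model.zip` is taken from the diff; the helper names `extract_member` and `demo`, the fake payload, and the extra `other.txt` member are hypothetical, and the real archive layout may differ between Hummingbird versions.

```python
import shutil
import tempfile
from pathlib import Path
from zipfile import ZipFile


def extract_member(archive: Path, member: str, dest: Path) -> Path:
    """Stream one archive member to dest without unpacking the rest."""
    with ZipFile(archive) as z:
        with z.open(member) as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    return dest


def demo() -> bytes:
    """Build a stand-in archive, extract one member, and return its bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        archive = tmp_path / "iris.torch.zip"
        # Stand-in for the archive that save() writes; the contents are fake.
        with ZipFile(archive, "w") as z:
            z.writestr("deploy_model.zip", b"fake torchscript payload")
            z.writestr("other.txt", b"ignored")
        dest = extract_member(archive, "deploy_model.zip", tmp_path / "iris.pt")
        return dest.read_bytes()


print(demo())  # b'fake torchscript payload'
```

Streaming with `shutil.copyfileobj` avoids reading the whole member into memory, which matters little for a model this small but scales to larger blobs.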

iris.onnx

-685 Bytes
Binary file not shown.

requirements.txt

Lines changed: 14 additions & 12 deletions
@@ -1,12 +1,14 @@
-joblib==0.17.0
-numpy==1.19.2
-onnx==1.7.0
-onnxconverter-common==1.7.0
-protobuf==3.13.0
-scikit-learn==0.23.2
-scipy==1.5.2
-six==1.15.0
-skl2onnx==1.7.0
-sklearn==0.0
-threadpoolctl==2.1.0
-typing-extensions==3.7.4.3
+dill==0.3.4
+hummingbird-ml==0.4.0
+joblib==1.0.1
+numpy==1.21.1
+onnx==1.9.0
+onnxconverter-common==1.8.1
+protobuf==3.17.3
+psutil==5.8.0
+scikit-learn==0.24.2
+scipy==1.7.0
+six==1.16.0
+threadpoolctl==2.2.0
+torch==1.9.0
+typing-extensions==3.10.0.0
