Triton 24.04: Incorrect input type passed to GTIL predict() #391
Comments
Thanks for raising the issue. I'd like to start working on a fix soon. Question: In the model repository, did you put the LightGBM model file (`model.txt`)?
@hcho3 It's an exported `model.txt`. In case it's useful, here's the head of the file:
@casassg When you say "export", do you mean exporting the model from LightGBM, or exporting it from Treelite? Can you post the code snippet for exporting the model?
I mean it's saving the LightGBM model:

```python
model: lgb.LGBMModel
model.booster_.save_model(tmp_model_file)
```
@casassg Apologies for the delay. I recently began troubleshooting the issue. I am currently having trouble reproducing the error on my end. Can you look at my setup and see how it differs from yours?

Training script, using LightGBM 4.4.0 and scikit-learn 1.4.1:

```python
import lightgbm
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=235, n_informative=200)
print(X.dtype, y.dtype)  # Prints: float64 int64

dtrain = lightgbm.Dataset(X, label=y)
params = {
    "num_leaves": 31,
    "metric": "binary_logloss",
    "objective": "binary",
}
bst = lightgbm.train(
    params,
    dtrain,
    num_boost_round=10,
    valid_sets=[dtrain],
    valid_names=["train"],
    callbacks=[lightgbm.log_evaluation()],
)
bst.save_model("example/1/model.txt")
```

First few lines of `model.txt`:
Triton-FIL model configuration (`config.pbtxt`):
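The configuration itself was not captured above. For readers following along, here is a minimal sketch of what a FIL-backend `config.pbtxt` for this 235-feature binary LightGBM model might look like; every value below is an assumption for illustration, not the author's actual file:

```
backend: "fil"
max_batch_size: 32768
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 235 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1 ]
  },
  {
    # Optional Shapley-value output: one value per feature plus a bias term.
    name: "treeshap_output"
    data_type: TYPE_FP32
    dims: [ 236 ]
  }
]
instance_group [{ kind: KIND_AUTO }]
parameters [
  { key: "model_type" value: { string_value: "lightgbm" } },
  { key: "output_class" value: { string_value: "true" } },
  { key: "threshold" value: { string_value: "0.5" } }
]
dynamic_batching {}
```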
I launched the Triton server locally using the Docker container:
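The exact command is not preserved in this comment; a plausible invocation, mirroring the one shown later in the thread (the image tag and mount path are assumptions), would be:

```shell
# Mount the directory that contains example/1/model.txt and example/config.pbtxt
docker run --rm -d -p 8000:8000 \
  -v $PWD:/models \
  nvcr.io/nvidia/tritonserver:24.05-py3 \
  tritonserver --model-repository=/models
```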
Using the following client inference script, I was able to get the result:

```python
import numpy as np
import tritonclient.http as triton_http

x = np.zeros((1, 235), dtype=np.float32)

client = triton_http.InferenceServerClient(url="localhost:8000")
triton_input = triton_http.InferInput("input__0", x.shape, "FP32")
triton_input.set_data_from_numpy(x)
output0 = triton_http.InferRequestedOutput("output__0")
output_treeshap = triton_http.InferRequestedOutput("treeshap_output")

r = client.infer(
    "example",
    model_version="1",
    inputs=[triton_input],
    outputs=[output0, output_treeshap],
)
print(r.as_numpy("output__0"))
print(r.as_numpy("treeshap_output"))
```
Trying to run your example, one thing I notice is that I run LightGBM 3.3.5 rather than 4.4.0:

```python
import lightgbm as lgb
from sklearn.datasets import make_classification

print(lgb.__version__)  # Prints: 3.3.5

X, y = make_classification(n_samples=1000, n_features=235, n_informative=200)
print(X.dtype, y.dtype)  # Prints: float64 int64

m = lgb.LGBMClassifier()
m.fit(X, y)

import os
os.makedirs("models/example/1", exist_ok=True)
m.booster_.save_model("models/example/1/model.txt")
```

```shell
docker run --rm -d -p 8000:8000 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:24.05-py3 tritonserver --model-repository=/models
```

```python
import numpy as np
import tritonclient.http as triton_http

x = np.zeros((1, 235), dtype=np.float32)

client = triton_http.InferenceServerClient("localhost:8000")
triton_input = triton_http.InferInput("input__0", x.shape, "FP32")
triton_input.set_data_from_numpy(x)
output0 = triton_http.InferRequestedOutput("output__0")
output_treeshap = triton_http.InferRequestedOutput("treeshap_output")

r = client.infer(
    "example",
    model_version="1",
    inputs=[triton_input],
    outputs=[output0, output_treeshap],
)
print(r.as_numpy("output__0"))
print(r.as_numpy("treeshap_output"))
```

This errors out:
@casassg Interesting. I was able to reproduce the error when I turned off the GPU and ran the inference on the CPU. Error:
Maybe something in the CPU GTIL functions vs. the GPU ones?
@casassg Yes, the bug only affects GTIL. I will make a bug fix soon. In the meantime, you can work around the bug by modifying

Note the addition of
The fix is available at #394. It will be part of the upcoming release (24.06).
Treelite 4.3.0 contains the following improvements:

* Support XGBoost 2.1.0, including the UBJSON format (dmlc/treelite#572, dmlc/treelite#578)
* [GTIL] Allow inferencing with FP32 input + FP64 model (dmlc/treelite#574). Related: triton-inference-server/fil_backend#391
* Prevent integer overflow for deep LightGBM trees by using DFS order (dmlc/treelite#570)
* Support building with latest RapidJSON (dmlc/treelite#567)

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- James Lamb (https://github.com/jameslamb)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #5968
We are finding an issue in the FIL backend after the 24.04 release. The same model artifact works on 24.01 with no issue.

Model: exported LightGBM model (`model.txt`)

`config.pbtxt`:

Our suspicion is that this has to do with the Treelite 4.0 release, as the relevant code was introduced in dmlc/treelite#528.

The theory is that 4.0 implements FP64 compatibility but removes compatibility with FP32.
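One way to probe this theory outside Triton would be to call GTIL directly on the exported model with FP32 input. A rough sketch, assuming the model path from the repro above (the exact Treelite API may vary slightly across 4.x versions):

```python
import numpy as np
import treelite

# Load the LightGBM text dump; LightGBM models use float64 thresholds and leaves.
model = treelite.frontend.load_lightgbm_model("models/example/1/model.txt")

# FP32 input, matching what the Triton FIL backend passes to GTIL.
# Per this thread, Treelite 4.x versions before 4.3.0 reject this combination
# with an incorrect-input-type error, while 4.3.0 accepts it (dmlc/treelite#574).
x = np.zeros((1, 235), dtype=np.float32)
print(treelite.gtil.predict(model, x))
```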