
FEA: engine accepts dpnp.ndarray and dpt.usm_ndarray objects as input data. #62

Merged: @fcharras merged 9 commits into main from accept_dpnp_dpt_inputs on Nov 29, 2022

Conversation

@fcharras (Collaborator) commented Nov 23, 2022

I think I'm happy with the result. The code is a little messy, but no more than sklearn's, and the PR basically succeeds in mimicking almost all aspects of sklearn's UX: what inputs are accepted, how they are cast under the hood, and the cautiousness about memory copies that matters for performance. On top of that, it adds support for dpnp.ndarray and dpt.usm_ndarray inputs while reusing as much of sklearn's existing code as possible.

The PR still needs some polishing, and a few tests with dpnp/dpt tensors.

I think one (minor) difference is that, while sklearn tries to convert numpy arrays with object dtype to float64, our engine will error out in this case because dpt.asarray refuses object dtype as input. These kinds of edge cases are complicated to unify because different array libraries can make different choices, and I think it's fine to let it fail in this case.
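A minimal sketch of that difference, assuming dpctl is installed (the exact exception type raised by dpt.asarray may vary across versions):

    import numpy as np
    import dpctl.tensor as dpt

    X = np.asarray([["1", "2"], ["3", "4"]], dtype=object)

    # sklearn's validation coerces object dtype to float64 when possible
    X_np = np.asarray(X, dtype=np.float64)

    # dpt.asarray refuses object dtype, so the engine errors out instead
    try:
        dpt.asarray(X)
    except Exception as e:
        print(type(e).__name__, e)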

Commit: …o pass dpnp.ndarray and dpt.usm_ndarray objects as input data.
benchmark/ext_helpers/daal4py.py (outdated comment thread, resolved)

@lru_cache
def make_sum_reduction_2d_axis1_kernel(
    size0, size1, work_group_size, device, dtype, fused_unary_func=None
@fcharras (Collaborator, author) commented Nov 23, 2022

The changes to this kernel enable the addition of the new argument fused_unary_func, which allows fusing a unary op on the elements of the input before summing. It's used in this PR to compute the variance for scaling the tolerance.
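For context, a NumPy-level sketch of what fusing means here (reference semantics only, not the numba_dpex kernel: the real fused kernel applies the op element by element inside the reduction instead of materializing the intermediate array):

    import numpy as np

    def sum_reduction_2d_axis1(x, fused_unary_func=None):
        # reference semantics only: the fused kernel applies fused_unary_func
        # inside the reduction, avoiding the extra pass over the data (and the
        # temporary array) that this NumPy version creates
        if fused_unary_func is not None:
            x = fused_unary_func(x)
        return x.sum(axis=1)

    X = np.arange(12, dtype=np.float64).reshape(3, 4)
    print(sum_reduction_2d_axis1(X, fused_unary_func=np.square))  # row-wise sum of squares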

@@ -1,14 +1,29 @@
import warnings
@fcharras (Collaborator, author) commented Nov 23, 2022

With these changes (which amount to almost rewriting the whole file, but that was expected: the previous sequence of refactorings was building toward this), sklearn's behavior regarding input validation is implemented almost identically.

The old behavior is removed (in particular, the warning that was printed when avoidable copies were implicitly triggered).

@@ -364,7 +366,7 @@ def _get_score_with_centers(centers):
[
-1827.22702,
-1027.674243,
-865.257397,
-865.257501,
@fcharras (Collaborator, author) commented:

The difference in those values comes from numerical instability after including the X_mean removal step.
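A small NumPy illustration of this kind of round-off (assumed float32 data; pairwise squared distances are translation-invariant algebraically, but not bitwise in floating point):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 10)).astype(np.float32)
    centers = X[:10]

    # the same quantity computed directly and on mean-centered data
    d_raw = ((X[:, None, :] - centers[None, :, :]) ** 2).sum()
    X_mean = X.mean(axis=0)
    Xc = X - X_mean
    cc = centers - X_mean
    d_centered = ((Xc[:, None, :] - cc[None, :, :]) ** 2).sum()

    print(d_raw, d_centered)  # equal up to float32 round-off, not bitwise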

), override_attr_context(
    sklearn_validation,
    get_namespace=_get_namespace,
    _asarray_with_order=_asarray_with_order,
@fcharras (Collaborator, author) commented Nov 23, 2022

Monkey-patching here enables an efficient workflow overall. I wonder whether, rather than keeping those two functions limited to the Array API spec, we should also consider making them pluggable.
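A hedged usage sketch of the pattern (the two replacement helpers below are stand-ins, and patching these attribute names on sklearn.utils.validation assumes a sklearn version that defines them there, as this PR does):

    import dpctl.tensor as dpt
    import sklearn.utils.validation as sklearn_validation
    from sklearn_numba_dpex.testing.config import override_attr_context

    def _get_namespace(*arrays):
        # stand-in: report dpt as the array namespace for usm_ndarray inputs
        return dpt, True

    def _asarray_with_order(array, dtype=None, order=None, copy=None, xp=None):
        # stand-in: route conversions through dpt.asarray
        return dpt.asarray(array, dtype=dtype, order=order)

    with override_attr_context(
        sklearn_validation,
        get_namespace=_get_namespace,
        _asarray_with_order=_asarray_with_order,
    ):
        pass  # sklearn validation code called here sees the patched helpers
    # the original attributes are restored on exit, even if an error was raised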

@fcharras (Collaborator, author) commented:

Out of WIP

@fcharras fcharras requested review from ogrisel and jjerphan November 24, 2022 14:18
@fcharras changed the title from "[Draft] FEA: engine accepts dpnp.ndarray and dpt.usm_ndarray objects as input data." to "FEA: engine accepts dpnp.ndarray and dpt.usm_ndarray objects as input data." on Nov 24, 2022
@jjerphan (Member) left a comment

Thank you, @fcharras. Here is a first pass.

Comment on lines +34 to +45
try:
    attrs_before = dict()
    for attr_name, attr_value in attrs.items():
        # raise AttributeError if obj does not have the attribute attr_name
        attrs_before[attr_name] = getattr(obj, attr_name)
        setattr(obj, attr_name, attr_value)

    yield

finally:
    for attr_name, attr_value in attrs_before.items():
        setattr(obj, attr_name, attr_value)
@jjerphan (Member) commented:

How about this?

Suggested change (replacing the try/finally above):

    attrs_before = dict()
    for attr_name, attr_value in attrs.items():
        attribute = getattr(obj, attr_name, None)
        if attribute is not None:
            # Only replace the value if the attribute `attr_name` does exist.
            attrs_before[attr_name] = attribute
            setattr(obj, attr_name, attr_value)

    yield

    for attr_name, attr_value in attrs_before.items():
        setattr(obj, attr_name, attr_value)

@fcharras (Collaborator, author) commented Nov 25, 2022

The try/finally is not only about the AttributeError that was documented, but about any error that can be raised after yield and before exiting the context. I initially thought that the code after yield was always executed before exiting, the way __exit__ is for explicit context managers, but when such errors occur that turns out not to be true, and it caused failing tests (in the sklearn pipeline). Adding the try/finally fixes that.

Regarding only replacing the value if attr_name does exist: I'd prefer not to. Raising an AttributeError is a safeguard against typos or unseen changes in the input objects. The context manager is only intended to replace attributes that exist, and I like it to fail otherwise.
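A minimal, self-contained sketch of the failure mode described here: without the try/finally, an exception raised in the with body is thrown into the generator at the yield point, so the restore loop after yield never runs.

    import types
    from contextlib import contextmanager

    @contextmanager
    def broken_override(obj, **attrs):
        # no try/finally: cleanup below is skipped if the with body raises
        before = {name: getattr(obj, name) for name in attrs}
        for name, value in attrs.items():
            setattr(obj, name, value)
        yield
        for name, value in before.items():  # skipped on exception!
            setattr(obj, name, value)

    cfg = types.SimpleNamespace(flag=False)
    try:
        with broken_override(cfg, flag=True):
            raise RuntimeError("error after yield")
    except RuntimeError:
        pass
    print(cfg.flag)  # True: the original value was never restored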

Comment on lines 346 to 347
engine_kmeans_plusplus_centers = engine.init_centroids(X_prepared)
engine_kmeans_plusplus_centers = dpt.asnumpy(engine_kmeans_plusplus_centers.T)
@jjerphan (Member) commented:

Do we need to _t-suffix one of them?

Comment on lines 401 to 402
centers, indices = engine._kmeans_plusplus(X_prepared)
centers = dpt.asnumpy(centers.T)
@jjerphan (Member) commented:

Similarly, do we need to _t-suffix one of them?

@@ -172,12 +171,16 @@ def make_sum_reduction_2d_axis1_kernel(size0, size1, work_group_size, device, dt
minus_one_idx = np.int64(-1)
two_as_a_long = np.int64(2)

is_1d = size1 is None
if fused_unary_func is None:
@jjerphan (Member) commented:

Can you document fused_unary_func, please?

@fcharras (Collaborator, author) commented Nov 25, 2022

It's documented on the public function that follows. For readability it would be better, I think, to swap those two definitions, but that would have increased the diff and masked the true changes.

Comment on lines 422 to 423
if (X_mean == 0).astype(int).sum() == len(X_mean):
    X_mean = None
@jjerphan (Member) commented:

Can you document why None is conventionally used in this case?
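For context, a hedged illustration of the None-as-sentinel convention presumably at play here (maybe_center is a hypothetical helper, mirroring how sklearn's KMeans skips centering when there is nothing to subtract):

    import numpy as np

    def maybe_center(X_t, X_mean):
        if X_mean is None:
            return X_t  # nothing to do: no extra kernel launch, no copy
        return X_t - X_mean[:, None]

    X_t = np.zeros((3, 5))
    X_mean = X_t.mean(axis=1)
    if (X_mean == 0).astype(int).sum() == len(X_mean):
        X_mean = None  # signals downstream code to skip centering entirely
    assert maybe_center(X_t, X_mean) is X_t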

Comment on lines 446 to 448
n_features * n_samples,
None,
max_work_group_size,
@jjerphan (Member) commented:

Suggested change:

    size0=n_features * n_samples,
    size1=None,
    work_group_size=max_work_group_size,

Comment on lines 479 to 484
def _minus(x, y):
    return x - y


def _plus(x, y):
    return x + y
@jjerphan (Member) commented:

Can you group and document these helper functions for ops?
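A hedged sketch of what such a grouping could look like (the docstring wording is illustrative, not what was merged):

    # Elementwise binary operators passed to kernel factories. Kept as plain
    # functions so that numba_dpex can compile them inside generated kernels.

    def _plus(x, y):
        """Return x + y; used to build kernels that add a broadcast operand."""
        return x + y

    def _minus(x, y):
        """Return x - y; used to build kernels that subtract a broadcast operand."""
        return x - y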

sklearn_numba_dpex/kmeans/drivers.py (comment thread resolved)
Comment on lines 114 to 117
# NB: inplace. Optimized for C-contiguous array and for
# size1 >> preferred_work_group_size_multiple
@dpex.kernel
def broadcast_ops(left_operand_array, right_operand_vector):
@jjerphan (Member) commented:

I think it's worth indicating that the left operand is modified in place and that the right one isn't modified at all.

fcharras and others added 2 commits November 25, 2022 15:04
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@fcharras (Collaborator, author) commented:

Pushed your suggestions, and also amended some changes to enable compute follows data (the preferred device for compute is the device that stores the data, when the data is already on-device).

@jjerphan (Member) left a comment

LGTM up to a few suggestions and questions.

Thank you, @fcharras!

# future instances. It is only used for testing purposes, using
# `sklearn_numba_dpex.testing.config.override_attr_context` context, for instance
# in the benchmark script.
# For normal usage, the compute will follow the __compute_follow_data__ principle.
@jjerphan (Member) commented:

Is "__compute_follow_data__" an emphasis, here?
Do you have a reference for this concept?

@fcharras (Collaborator, author) commented Nov 29, 2022

Yes, let me replace it with *compute follows data*, which would be more correct Markdown.

The concept is anchored in the design of both dpctl and numba_dpex.
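A minimal dpctl sketch of the principle, assuming a CPU SYCL device is available:

    import dpctl.tensor as dpt

    # Compute follows data: the queue the operands were allocated on
    # determines where the computation runs; there is no per-call device
    # argument.
    x = dpt.ones(1024, dtype=dpt.float32, device="cpu")
    y = x + 1  # executes on the same device/queue as x
    print(y.device)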

Comment on lines +75 to +84
# NB: numba_dpex kernels currently only support working with a C memory layout
# (see https://github.com/IntelPython/numba-dpex/issues/767), but our KMeans
# implementation is hypothesized to be more efficient with an F memory layout.
# As a workaround, the kernels work with the transpose of X, X_t, where X_t
# is created with a C layout, which results in memory access patterns
# equivalent to those of an F layout for X.
# TODO: when numba_dpex supports inputs with F-layout:
# - use X rather than X_t and adapt the codebase (better for readability and
#   more consistent with sklearn notations)
# - test the performances with both layouts and use the best performing one.
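A NumPy sketch of the layout workaround this comment describes (NumPy stands in for dpt here just to show the layout equivalence):

    import numpy as np

    X = np.random.rand(1000, 4)        # n_samples x n_features, C layout
    X_t = np.ascontiguousarray(X.T)    # n_features x n_samples, C layout
    assert X_t.flags.c_contiguous
    # X_t[j] is the j-th feature across all samples, stored contiguously:
    # the same access pattern as the column X[:, j] of an F-ordered X.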
@jjerphan (Member) commented:

Thanks!

)
# subsequent kernel calls only sum the data.
nofunc_kernel = _make_partial_sum_reduction_2d_axis1_kernel(
    n_rows, work_group_size, None, dtype
@jjerphan (Member) commented:

Suggested change:

    n_rows, work_group_size, fused_unary_func=None, dtype=dtype

Comment on lines 423 to 424

if (X_mean == 0).astype(int).sum() == len(X_mean):
@jjerphan (Member) commented:

Suggested change:

    X_mean_is_zeroed = (X_mean == 0).astype(int).sum() == len(X_mean)
    if X_mean_is_zeroed:

@@ -401,6 +404,80 @@ def _relocate_empty_clusters(
)


def prepare_data_for_lloyd(X_t, init, tol, copy_x):
@jjerphan (Member) commented:

Can you document prepare_data_for_lloyd indicating that this is centering X for numerical stability?

@fcharras (Collaborator, author) commented:

✔️ mostly carried over from the sklearn docstring
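For reference, a hedged sketch of what such a docstring covers, based on sklearn's KMeans preprocessing rather than the exact text merged here:

    def prepare_data_for_lloyd(X_t, init, tol, copy_x):
        """Prepare input data for the Lloyd iterations.

        Mirrors sklearn's KMeans preprocessing:
        - center the data on its feature-wise mean to improve the numerical
          stability of the distance computations (X_mean is set to None when
          the mean is already zero, so downstream code can skip un-centering);
        - scale `tol` by the mean variance of the features, so the convergence
          criterion does not depend on the scale of the data;
        - honor `copy_x`: only center in place when a copy is allowed or
          has already been made.
        """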

Comment on lines +160 to +161
# NB: sampling without replacement must be executed sequentially so
# it's better done on CPU
@jjerphan (Member) commented:

👍

Comment on lines 165 to 170
# Poor man's fancy indexing
# TODO: write a kernel ? or replace with better equivalent when available ?
centers_t = dpt.concat(
    [dpt.expand_dims(X[center_idx], axes=1) for center_idx in centers_idx],
    axis=1,
)
@jjerphan (Member) commented:

Should we open an issue on dpctl? What do you think?

@fcharras (Collaborator, author) commented:

dpctl implements the Array API, and the Array API doesn't list such tools. The best thing to have would be take; here is the list of supported functions.

I'm not aware enough of where the Array API is going to say what the best action would be here 🤔

@fcharras (Collaborator, author) commented:

Actually, take has been added recently to the Array API; there is a dpctl issue tracking it.
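Assuming a dpctl version that ships dpt.take (it landed after this discussion), the concat-based workaround above could reduce to a single call:

    import dpctl.tensor as dpt

    X = dpt.reshape(dpt.arange(20, dtype=dpt.float32), (5, 4))
    centers_idx = dpt.asarray([3, 0, 4])
    # select the sampled rows in one call, then transpose as before
    centers_t = dpt.take(X, centers_idx, axis=0).T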

Comment on lines 186 to 188
use_uniform_weights = (sample_weight == sample_weight[0]).astype(
    int
).sum() == len(sample_weight)
@jjerphan (Member) commented:

Suggested change:

    use_uniform_weights = (
        (sample_weight == sample_weight[0]).astype(int).sum()
        == len(sample_weight)
    )

@fcharras (Collaborator, author) commented Nov 29, 2022

black doesn't like this, but it does accept

        use_uniform_weights = (
            (sample_weight == sample_weight[0]).astype(int).sum()
        ) == len(sample_weight)

I'll go for it.

def get_labels(self, X, sample_weight):
    labels, _ = self._get_labels_inertia(X, with_inertia=False)
    # TODO: sample_weight actually not used for get_labels. Fix in sklearn ?
@jjerphan (Member) commented:

Is there an issue already open for it?


Comment on lines 304 to 306
    sample_weight = dpt.ones(n_samples, dtype=dtype, device=device)
elif isinstance(sample_weight, numbers.Number):
    sample_weight = dpt.full(n_samples, 1, dtype=dtype, device=device)
@jjerphan (Member) commented:

What are the reasons not to have only one branch here?

@fcharras (Collaborator, author) commented:

A mistake! The latter line should read:

Suggested change:

        sample_weight = dpt.ones(n_samples, dtype=dtype, device=device)
    elif isinstance(sample_weight, numbers.Number):
        sample_weight = dpt.full(n_samples, sample_weight, dtype=dtype, device=device)

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@fcharras (Collaborator, author) commented:

Thanks for the review, @jjerphan. I've pushed your suggestions and answered the other comments. I'll go ahead and merge when the pipeline is green.

@fcharras fcharras merged commit 4489f5a into main Nov 29, 2022
@fcharras fcharras deleted the accept_dpnp_dpt_inputs branch November 29, 2022 12:54