Normalization #281


Merged: 4 commits into tensorly:main on Nov 29, 2021

Conversation

caglayantuna (Member)

This pull request is related to issue #264. We suggest some updates to the _nn_cp.py and _cp.py files in order to provide consistent normalization options for the non_negative_parafac, non_negative_parafac_hals and parafac functions. In these three functions, we added the weights to the mttkrp and pseudo_inverse (accum for non_negative_parafac) computations:

mttkrp = unfolding_dot_khatri_rao(tensor, (weights, factors), mode)
pseudo_inverse = tl.reshape(weights, (-1, 1)) * pseudo_inverse * tl.reshape(weights, (1, -1))

Since we now use the weights to compute mttkrp, we removed the weights from the iprod computation:

iprod = tl.sum(tl.sum(mttkrp * factors[-1], axis=0))
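
A quick way to see why no extra weights are needed in iprod (a sketch of the reasoning; here $A_m$ denotes the last-updated factor, $w$ the weights, and $X_{(m)}$ the corresponding unfolding of the tensor):

$$\langle \text{tensor}, \widehat{\text{tensor}} \rangle
= \operatorname{trace}\!\Big(X_{(m)}\,\big(\textstyle\bigodot_{k \neq m} A_k\big)\,\operatorname{diag}(w)\,A_m^\top\Big)
= \sum_{i,r} \mathrm{mttkrp}_{ir}\,[A_m]_{ir},$$

so the weights are already carried by mttkrp and must not be applied a second time in iprod.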

Finally, we suggest calling the cp_normalize function after the error computation in all three functions.

With these modifications, our experiments no longer show the error reported in issue #264.
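
For illustration, a minimal usage sketch of the behaviour targeted here (a sketch only; the shapes and rank are arbitrary, and it relies on the existing normalize_factors option of parafac):

import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# random third-order tensor; sizes are only illustrative
tensor = tl.tensor(np.random.random_sample((10, 11, 12)))

# with normalize_factors=True, every factor should come back with
# unit-norm columns and the scale absorbed into `weights`
weights, factors = parafac(tensor, rank=3, normalize_factors=True)

for factor in factors:
    print(tl.norm(factor, axis=0))  # expected: all close to 1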


codecov bot commented Jun 17, 2021

Codecov Report

Merging #281 (cff18e7) into main (c41092d) will increase coverage by 0.03%.
The diff coverage is 97.56%.


@@            Coverage Diff             @@
##             main     #281      +/-   ##
==========================================
+ Coverage   88.09%   88.12%   +0.03%     
==========================================
  Files         103      103              
  Lines        5963     5988      +25     
==========================================
+ Hits         5253     5277      +24     
- Misses        710      711       +1     
Impacted Files Coverage Δ
tensorly/decomposition/_nn_cp.py 85.83% <94.44%> (+1.39%) ⬆️
tensorly/decomposition/_cp.py 87.22% <100.00%> (-0.32%) ⬇️
tensorly/decomposition/tests/test_cp.py 100.00% <100.00%> (ø)
tensorly/tenalg/proximal.py 66.98% <0.00%> (-0.48%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

JeanKossaifi (Member)

I'm not sure this is necessarily better: it will be slightly heavier computationally. It's also not strictly necessary, since the weights would just be naturally absorbed in the last factor and, as we discussed, I don't have a strong intuition about whether one approach is better than the other. In Kolda's seminal paper(s) (and in the tensor-toolbox), I believe they do something similar to what we already have.

I think a discussion is needed to simplify and make both the code and the API more uniform; I'll post that in the issue.

It would be helpful to benchmark both approaches and see how they affect convergence, performance and numerical stability (e.g. are the factors in a better range?).
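
As an illustration, one possible minimal benchmark along these lines (a sketch only; the tensor size, rank and iteration count are arbitrary, and it assumes the normalize_factors flag discussed in this PR is available on non_negative_parafac):

import time
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_parafac

tensor = tl.tensor(np.random.random_sample((50, 60, 70)))

for normalize in (False, True):
    start = time.perf_counter()
    cp = non_negative_parafac(tensor, rank=10, n_iter_max=200,
                              tol=0, normalize_factors=normalize)
    elapsed = time.perf_counter() - start
    # relative reconstruction error, to compare how well both variants converge
    rel_error = tl.norm(tensor - tl.cp_to_tensor(cp)) / tl.norm(tensor)
    print(f"normalize_factors={normalize}: {elapsed:.2f}s, relative error {rel_error:.2e}")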

cohenjer (Contributor)

@JeanKossaifi Sure, we should study normalization even further, but for now the nonnegative Parafac normalization is buggy (#264), and with this PR we add a class method for normalization, which we can always tinker with later on. I think it would be better to merge the current PR and maybe open an issue for further discussion?

JeanKossaifi (Member) left a comment

Do you mean an attribute for normalization? I don't see a class method in the PR.
I left a few comments in the code.

The first priority is of course correctness. Once that is achieved, however, I'm wary of adding complexity to the basic methods / algorithms that are used often, as this can quickly result in much slower algorithms. For instance, if the weights don't influence convergence or numerical stability, why incorporate them in the mttkrp / update calculations when they would otherwise automatically get absorbed in the last factor?


-mttkrp = unfolding_dot_khatri_rao(tensor, (None, factors), mode)
+pseudo_inverse = tl.reshape(weights, (-1, 1)) * pseudo_inverse * tl.reshape(weights, (1, -1))
+mttkrp = unfolding_dot_khatri_rao(tensor, (weights, factors), mode)
Member

I don't think we need all that, since the weights are automatically absorbed in the last factor. If there is no advantage, it's just additional computation (a slower algorithm) for no strong reason.

Contributor

I guess we can compare the two versions on examples of various sizes to check whether this change effectively makes the algorithm slower.
If it does run slower, we can revert to ignoring the weights when the user does not request normalization. However, if the user asks for normalization, then we must store the weights somewhere. The whole idea of normalization, I think, was to avoid factors exploding in norm or, conversely, becoming extremely small, so we should not pull the norms back into the factors when the user asks for normalization.
But this would mean using the no-weights updates when normalization is off and the weighted updates when normalization is on (see the sketch below). I would argue that if the proposed change does not lead to any noticeable difference in computation time, we should keep it so that the code is easier to understand and maintain.
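
A rough sketch of that branching, for illustration only (the helper name update_terms is made up; it reuses the variable names from the diff above and assumes a normalize_factors flag):

import tensorly as tl
from tensorly.cp_tensor import unfolding_dot_khatri_rao

def update_terms(tensor, weights, factors, pseudo_inverse, mode, normalize_factors):
    # hypothetical helper: return the (mttkrp, pseudo_inverse) pair used in one ALS update
    if normalize_factors:
        # weighted updates: factors keep unit-norm columns, the scale lives in `weights`,
        # so the weights enter the Gram term on both sides and the mttkrp once
        pseudo_inverse = tl.reshape(weights, (-1, 1)) * pseudo_inverse * tl.reshape(weights, (1, -1))
        mttkrp = unfolding_dot_khatri_rao(tensor, (weights, factors), mode)
    else:
        # unweighted updates: the scale is simply absorbed in the last updated factor
        mttkrp = unfolding_dot_khatri_rao(tensor, (None, factors), mode)
    return mttkrp, pseudo_inverse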

Member

I agree with all these points :)

factors[mode] = factor
if normalize_factors and mode != modes_list[-1]:
    weights, factors = cp_normalize((weights, factors))
Member

This is much better - thanks for uniformizing it

@@ -243,29 +245,28 @@ def non_negative_parafac(tensor, rank, n_iter_max=100, init='svd', svd='numpy_sv
        accum *= tl.dot(tl.transpose(factors[e]), factors[e])
    else:
        accum = tl.dot(tl.transpose(factors[e]), factors[e])

accum = tl.reshape(weights, (-1, 1)) * accum * tl.reshape(weights, (1, -1))
Member

Is this correct? The weights are for the full tensor, so this is scaling the factors up every time.
E.g. the full tensor is sum_r weights[r] * factors[0][:, r] ∘ ... ∘ factors[-1][:, r];
in other words, each element of weights is used only once.

Contributor

It is correct, I think:

  • If there is no normalization, the weights are always 1 and this line does nothing.
  • If there is normalization, then in the current version all the factors have unit-norm columns and the weights must be accounted for in the gradient computation. Just like each non-updated factor appears twice in accum, the weights also appear twice (see the sketch below).
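
As a sanity check of the "weights appear twice" argument, a minimal NumPy sketch (sizes and names are illustrative, not taken from the PR):

import numpy as np

rank, J, K = 3, 4, 5
rng = np.random.default_rng(0)

# fixed (non-updated) factors with unit-norm columns, as produced by cp_normalize
B = rng.random((J, rank)); B /= np.linalg.norm(B, axis=0)
C = rng.random((K, rank)); C /= np.linalg.norm(C, axis=0)
w = rng.random(rank)  # the weights pulled out by the normalization

# Khatri-Rao (column-wise Kronecker) product of the weighted factors
kr_weighted = np.einsum('kr,jr->kjr', C, B).reshape(-1, rank) * w

# its Gram matrix, as it appears in the normal equations of the weighted model ...
gram_direct = kr_weighted.T @ kr_weighted

# ... equals the unweighted Hadamard product of Grams scaled by the weights on both sides,
# i.e. exactly accum = reshape(w, (-1, 1)) * accum * reshape(w, (1, -1))
accum = (B.T @ B) * (C.T @ C)
gram_scaled = w.reshape(-1, 1) * accum * w.reshape(1, -1)

assert np.allclose(gram_direct, gram_scaled)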

JeanKossaifi (Member)

Thanks @cohenjer, I agree with all your points.

JeanKossaifi (Member)

Thanks for the great work @caglayantuna and @cohenjer.
Is everyone happy with merging this?

caglayantuna (Member, Author)

Thanks @JeanKossaifi. From my side, it is ok.

cohenjer (Contributor)

Let's go!

JeanKossaifi (Member)

Awesome, merging!

JeanKossaifi merged commit c2fa4ec into tensorly:main on Nov 29, 2021.
caglayantuna deleted the normalization branch on Nov 30, 2021.