[SPARK-35357][GRAPHX] Allow to turn off the normalization applied by static PageRank utilities #32485
…nk with a 'normalized' parameter to trigger or not the normalization
ok to test
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #138334 has finished for PR 32485 at commit
I think it's fine. cc @srowen FYI
Looks OK, only one tiny comment about 'since'
graphx/src/test/scala/org/apache/spark/graphx/lib/PageRankSuite.scala
Test build #138375 has finished for PR 32485 at commit
Thank you @Ayushsunny @HyukjinKwon @srowen for the review 🙏.
Kubernetes integration test starting
Kubernetes integration test status failure
Merged to master
What changes were proposed in this pull request?
Overload the methods PageRank.runWithOptions and PageRank.runWithOptionsWithPreviousPageRank (so as not to break any user-facing signature) with a normalized parameter that describes "whether or not to normalize the rank sum".
Why are the changes needed?
https://issues.apache.org/jira/browse/SPARK-35357
When a graph contains a non-negligible proportion of sinks, algorithms based on incremental updates of ranks can get a precision gain for free if they are allowed to manipulate non-normalized ranks.
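To illustrate the point, here is a minimal plain-Scala sketch (no Spark involved) of a simplified static PageRank-style update on a toy graph with one sink. It is not GraphX's implementation: the object SinkNormalizationSketch, the graph, the helper names, and the iteration counts are all assumptions made for this example. It compares resuming from raw ranks against resuming from ranks that were normalized between runs.

```scala
object SinkNormalizationSketch {
  // Hypothetical toy graph: 0 -> 1, 1 -> 0, 1 -> 2. Vertex 2 is a sink (no out-edges).
  val edges: Seq[(Int, Int)] = Seq((0, 1), (1, 0), (1, 2))
  val numVertices: Int = 3
  val resetProb: Double = 0.15

  private val outDegree: Map[Int, Int] =
    edges.groupBy(_._1).map { case (src, out) => src -> out.size }

  // One simplified update: rank = resetProb + (1 - resetProb) * incoming rank mass.
  // Mass that reaches the sink is not redistributed, so the rank sum shrinks.
  def step(ranks: Vector[Double]): Vector[Double] = {
    val incoming = Array.fill(numVertices)(0.0)
    for ((src, dst) <- edges) incoming(dst) += ranks(src) / outDegree(src)
    incoming.toVector.map(mass => resetProb + (1.0 - resetProb) * mass)
  }

  // Apply `step` numIter times, starting from the given ranks.
  def iterate(ranks: Vector[Double], numIter: Int): Vector[Double] =
    (1 to numIter).foldLeft(ranks)((r, _) => step(r))

  // Rescale the ranks so that they sum to the number of vertices.
  def normalize(ranks: Vector[Double]): Vector[Double] = {
    val scale = numVertices / ranks.sum
    ranks.map(_ * scale)
  }

  // Largest absolute per-vertex difference between two rank vectors.
  def maxDiff(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (x, y) => math.abs(x - y) }.max

  def main(args: Array[String]): Unit = {
    val init = Vector.fill(numVertices)(1.0)

    // Reference: 6 iterations in a row, normalized once at the end.
    val oneShot = normalize(iterate(init, 6))

    // Resume twice from raw (non-normalized) ranks, normalize only at the end.
    val resumedRaw = normalize(iterate(iterate(iterate(init, 2), 2), 2))

    // Resume twice from ranks that were normalized after each partial run.
    val resumedNormalized =
      normalize(iterate(normalize(iterate(normalize(iterate(init, 2)), 2)), 2))

    println(s"raw resume vs one shot:        ${maxDiff(oneShot, resumedRaw)}")
    println(s"normalized resume vs one shot: ${maxDiff(oneShot, resumedNormalized)}")
  }
}
```

In this toy model the update is affine rather than linear in the ranks, so rescaling between runs changes the trajectory, while resuming from raw ranks replays exactly the same steps as the uninterrupted run.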
Does this PR introduce any user-facing change?
No
How was this patch tested?
By adding a unit test that verifies that (even when dealing with a graph containing a sink) we end up with the same result for both of these scenarios:
a) PageRank.runWithOptions with normalization enabled
b) PageRank.runWithOptions with normalization disabled; then, from preRankGraph1, run 2 more iterations using PageRank.runWithOptionsWithPreviousPageRank with normalization disabled; then, from preRankGraph2, run 2 more iterations using PageRank.runWithOptionsWithPreviousPageRank with normalization enabled
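Scenarios a) and b) could look roughly like the following sketch against the overloads this PR adds. It is not the unit test from the PR. The toy graph, the object and method names, and the initial iteration counts (6 and 2, chosen so that both scenarios total the same number of iterations) are assumptions, and the positional argument order shown for the new normalized parameter should be checked against the merged code.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.graphx.lib.PageRank

object PageRankResumeSketch {
  // Returns the largest per-vertex rank difference between scenarios a) and b).
  def maxRankDifference(sc: SparkContext): Double = {
    // Hypothetical toy graph: 1 -> 2, 2 -> 1, 2 -> 3. Vertex 3 is a sink.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 1L, 1), Edge(2L, 3L, 1)))
    val graph = Graph.fromEdges(edges, 1)
    val resetProb = 0.15

    // a) One uninterrupted run with normalization enabled (last argument).
    val oneShot = PageRank.runWithOptions(graph, 6, resetProb, None, true)

    // b) Partial run with normalization disabled ...
    val preRankGraph1 = PageRank.runWithOptions(graph, 2, resetProb, None, false)
    // ... 2 more iterations from preRankGraph1, still without normalization ...
    val preRankGraph2 = PageRank.runWithOptionsWithPreviousPageRank(
      graph, 2, resetProb, None, false, preRankGraph1)
    // ... and 2 final iterations from preRankGraph2 with normalization enabled.
    val resumed = PageRank.runWithOptionsWithPreviousPageRank(
      graph, 2, resetProb, None, true, preRankGraph2)

    oneShot.vertices
      .join(resumed.vertices)
      .map { case (_, (x, y)) => math.abs(x - y) }
      .reduce((x, y) => math.max(x, y))
  }
}
```

Called with a local SparkContext, maxRankDifference should return a value close to zero if both scenarios agree, which is the property the PR description says the new unit test checks.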