Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Fix performance issue with nanstd and nanvar #2308

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zhujun98
Copy link
Contributor

Checklist

  • The title and commit message(s) are descriptive.
  • Small commits made to fix your PR have been squashed to avoid history pollution.
  • Tests have been added for new features or bug fixes.
  • API of new functions and classes are documented.

Description

@JohanMabille
Copy link
Member

I think the failures are due to missing overloads of isnan in xsimd. These functions were not called before your optimization.

I have on going work on xsimd to push these functions, I'll try to push them quickly so that we can merge this PR.

@zhujun98
Copy link
Contributor Author

I think the failures are due to missing overloads of isnan in xsimd. These functions were not called before your optimization.

I have on going work on xsimd to push these functions, I'll try to push them quickly so that we can merge this PR.

Thanks a lot @JohanMabille ! Actually, the unittest without SIMD failed too. I will investigate it in the meanwhile.

@zhujun98 zhujun98 force-pushed the improve_performance_of_nanvar_and_nanstd branch from bc16cdd to 622647d Compare April 7, 2021 21:01
@zhujun98 zhujun98 force-pushed the improve_performance_of_nanvar_and_nanstd branch from 622647d to d01be62 Compare May 31, 2021 09:51
return nanmean<result_type>(square(cast<result_type>(sc) - std::move(mrv)), std::forward<X>(axes), es);
// note: otherwise the result is wrong with 'immediate' evaluation strategy
auto sc_shifted = eval(cast<result_type>(sc) - std::move(mrv));
return nanmean<result_type>(square(sc_shifted), std::forward<X>(axes), es);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am completely lost and I summarize my findings here:

Test code:

xt::xarray<double> aN = {{ nanv, nanv, 123, 3 }, { 1, 2, nanv, 3 }, { 1, 1, nanv, 3 }};

  xt::xarray<double> ret2 = xt::nanvar(aN, {1});
  std::cout << "Lazy: " << ret2 << std::endl;

  xt::xarray<double> ret = xt::nanvar(aN, {1}, xt::evaluation_strategy::immediate);
  std::cout << "Immediate: " << ret << std::endl;
  • First case:
auto sc_shifted = eval(cast<result_type>(sc) - std::move(mrv));
return nanmean<result_type>(square(sc_shifted), std::forward<X>(axes), es);

Result with gcc8:

Lazy: { 2400.      ,     0.666667,     0.888889}
Immediate: { 3600.      ,     0.666667,     0.888889}

Result with Clang7: segfault

	je     0x4198a0                   ; <xt::xreducer_stepper<xt::xreducer_functors<xt::detail::nan_plus, xt::const_value<double>, xt::detail::nan_plus>, xt::xshared_expression<xt::xfunction<xt::detail::lambda_adapt<xt::square_fct>, xt::xarray_container<xt::uvector<double, std::allocator<double> >, (xt::layout_type)1, xt::svector<unsigned long, 4ul, std::allocator<unsigned long>, true>, xt::xtensor_expression_tag> const&> >, std::array<unsigned long, 1ul>, xt::reducer_options<double, std::tuple<xt::evaluation_strategy::lazy_type> > >::aggregate_impl(unsigned long, std::integral_constant<bool, false>) const+784>
	shl    $0x3,%rcx
	neg    %rax
	mov    %rdi,%rsi
	nopw   0x0(%rax,%rax,1)
	vmovsd (%rsi),%xmm0
  • Second case:
return nanmean<result_type>(square(cast<result_type>(sc) - std::move(mrv)), std::forward<X>(axes), es);

Result with gcc8:

Lazy: { 3600.      ,     0.666667,     0.888889}
Immediate: { 7365.388889,     4.666667,    12.      }

Result with clang7:

Lazy: { 3600.      ,     0.666667,     0.888889}
Immediate: { 7365.388889,    12.      ,   363.      }

@JohanMabille @tdegeus Can you give some hint? Thank you!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants