Implement S/C/D/Z AXPBY #1048


Open: 7 commits to merge into base: master

grisuthedragon (Contributor)

Description

This PR implements the AXPBY operation

$$y \leftarrow \alpha x + \beta y$$

which extends the axpy operation with a second scaling factor, β, just like in gemm or gemv.

This reduces memory transfers in algorithms such as the conjugate gradient (CG) method, where one step is

$$\mathbf{p}_{k+1} := \mathbf{r}_{k+1} + \beta_k \mathbf{p}_k$$

Until now, this update had to be implemented as one scal step followed by one axpy step, so the vector p is read from and written to memory twice. With the axpby routine, p_k is read and p_{k+1} is written only once. The subroutine is useful in other iterative algorithms as well, such as BiCGStab.
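To make the memory-traffic argument concrete, here is a minimal C sketch of the CG direction update done both ways. All function names (`ref_dscal`, `ref_daxpy`, `ref_daxpby`, `cg_update_two_pass`, `cg_update_fused`) are hypothetical and only for illustration; they are not the routines or signatures added by this PR, and the sketch assumes unit strides and skips the argument checks a real BLAS routine would have.

```c
#include <stddef.h>

/* Hypothetical reference kernels, unit stride only (an assumption made
   for brevity; real BLAS routines take incx/incy arguments). */

/* y <- a * y : one read and one write of y. */
void ref_dscal(size_t n, double a, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * y[i];
}

/* y <- a * x + y : one read of x, one read and one write of y. */
void ref_daxpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* y <- alpha * x + beta * y in a single pass over x and y. */
void ref_daxpby(size_t n, double alpha, const double *x,
                double beta, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + beta * y[i];
}

/* p <- r + beta * p using scal followed by axpy: p is swept twice. */
void cg_update_two_pass(size_t n, const double *r, double beta, double *p)
{
    ref_dscal(n, beta, p);     /* p <- beta * p */
    ref_daxpy(n, 1.0, r, p);   /* p <- r + p    */
}

/* p <- r + beta * p using the fused kernel: p is swept once. */
void cg_update_fused(size_t n, const double *r, double beta, double *p)
{
    ref_daxpby(n, 1.0, r, beta, p);
}
```

Both variants compute the same update. The fused one loads and stores each element of p once instead of twice, which is where the saving for long vectors would come from.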

The routine already exists, for example, in

Checklist

  • The documentation has been updated.
  • Tests for Fortran
  • Tests for CBLAS

@langou (Contributor) left a comment

There are some changes related to TFSM (both in SRC and LAPACKE). I do not think this should be part of this commit.

@martin-frbg (Collaborator)

Please rebase; it looks like your fork is about two weeks out of date.

@grisuthedragon (Contributor, Author)

Rebased onto the current state of master.
