Using hessian-vector products in TRU method #348

Closed
eweine opened this issue Dec 13, 2024 · 4 comments


@eweine

eweine commented Dec 13, 2024

Hello,

I'm interested in using a trust-region optimizer, where I have access to fast gradients and Hessian-vector products due to autograd. I would like to use some sort of incomplete Cholesky preconditioner, as is discussed in Nocedal and Wright Algorithm 7.3 (which led me here). However, in the Python documentation for TRU, I don't see any way to feed in a Hessian-vector product function. Am I misreading the documentation, or is this currently not possible? If it's not possible in Python, is it perhaps possible in C, so that I could port this over to Python and/or R?

Thanks,

Eric.
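
For readers following the thread, a Hessian-vector product function is a callable that returns H(x) v without ever forming H. The post above gets these from autograd. The sketch below swaps in a central difference of gradients instead, purely so that it runs without an autodiff library. The test function, `grad`, and `hvp_fd` are hypothetical names chosen for illustration and are not part of GALAHAD.

```python
def grad(x):
    # Gradient of the illustrative function f(x) = x0^2 * x1 + x1^3 (an assumption).
    return [2.0 * x[0] * x[1], x[0] ** 2 + 3.0 * x[1] ** 2]


def hvp_fd(grad_fn, x, v, eps=1e-6):
    """Approximate H(x) v by a central difference of gradients along v."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad_fn(x_plus), grad_fn(x_minus)
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]


x = [1.0, 2.0]
v = [1.0, -1.0]
hv = hvp_fd(grad, x, v)  # exact Hessian here is [[4, 2], [2, 12]]
```

Each product costs two gradient evaluations here. An autodiff-based product, as the post describes, would avoid the dependence on the step size `eps`.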

@nimgould
Contributor

Thanks for your question, Eric. What you ask for is possible in the Fortran, C and Julia versions of the package, but not currently in Python. There is, of course, no reason not to provide this, except for 24-hour days. We'll add this to the list, but in the meantime, I hope the C version is enough for you (the fact that the interface from C to Julia works does at least give some hope!).

@eweine
Author

eweine commented Dec 13, 2024

Thanks for your quick response!

I may actually start with the Julia version (I am writing most of my code in R, and there is an interface there for calling Julia).

I see where I can provide my Hessian-vector product function.

However, I have a question about creating the preconditioner. Ideally, I would do something similar to Algorithm 7.3 in Nocedal and Wright, which seems to require only Hessian-vector products to form the preconditioner. Is there a built-in method in GALAHAD to help me do this? Looking at PSLS, it looks like I have to provide the full matrix to get the preconditioner. Or does TRU create this preconditioner using a sparse Cholesky factorization by default?

Thanks,

Eric.

@nimgould
Contributor

Eric, there is considerable freedom in how the preconditioner is applied. The key is the value of the norm component of the control structure. This lets the code itself build and apply a variety of preconditioners based on the (sparse) Hessian or an LBFGS approximation, or lets you do this externally (what is often called reverse communication). In the latter case, control is passed back from the minimizer to the user with a request to form u = P^-1 v for a given v, and then to return this u. This flexibility allows you to use whatever preconditioner you can provide, but there is undoubtedly a cost associated with popping to and from the minimizer. A number of incomplete factorizations are also provided; see norm=6,7.
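
The reverse-communication pattern described above can be illustrated with a small, self-contained sketch. This is not GALAHAD's interface: `pcg_reverse`, `drive`, and the request labels "hvp" and "prec" are hypothetical, and the solver is plain preconditioned conjugate gradients with no trust-region boundary handling. The point is only the control flow: the solver hands a vector v back to the caller, the caller forms H v or u = P^-1 v, and returns the result.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def pcg_reverse(b, tol=1e-10, max_iter=50):
    """Preconditioned CG for H x = b, written as a generator.

    Yields ("prec", v) to request u = P^-1 v and ("hvp", v) to request H v.
    The caller answers with send(); the solution is the generator's return value.
    """
    x = [0.0] * len(b)
    r = list(b)
    u = yield ("prec", r)          # hand control back: apply the preconditioner
    p = list(u)
    ru = dot(r, u)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        hp = yield ("hvp", p)      # hand control back: Hessian-vector product
        alpha = ru / dot(p, hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        u = yield ("prec", r)
        ru_new = dot(r, u)
        p = [ui + (ru_new / ru) * pi for ui, pi in zip(u, p)]
        ru = ru_new
    return x


def drive(gen, hvp, prec):
    """Caller-side loop: answer each request until the solver finishes."""
    try:
        request, v = next(gen)
        while True:
            reply = hvp(v) if request == "hvp" else prec(v)
            request, v = gen.send(reply)
    except StopIteration as done:
        return done.value


H = [[4.0, 1.0], [1.0, 3.0]]       # small SPD matrix, for illustration only
hvp = lambda v: [dot(row, v) for row in H]
prec = lambda v: [vi / H[i][i] for i, vi in enumerate(v)]  # Jacobi (diagonal) P^-1
x = drive(pcg_reverse([1.0, 2.0]), hvp, prec)
```

Every request is a round trip between solver and caller, which is the cost of "popping to and from the minimizer" mentioned above. In exchange, `prec` can be any operator the user is able to apply.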

@nimgould
Contributor

nimgould commented Jan 3, 2025

I hope that the above explains how things work, so I am now closing this.

@nimgould nimgould closed this as completed Jan 3, 2025