Using hessian-vector products in TRU method #348

eweine · 2024-12-13T15:03:49Z

Hello,

I'm interested in using a trust region optimizer, where I have access to fast gradients and Hessian-vector products due to autograd. I would like to use some sort of incomplete Cholesky preconditioner, as is discussed in Nocedal and Wright Algorithm 7.3 (which led me here). However, in the Python documentation for TRU, I don't see any way to feed in a Hessian-vector product function. Am I misreading the documentation, or is this currently not possible? If it's not possible in Python, is it perhaps possible in C, so that I could port this over to Python and/or R?

Thanks,

Eric.

nimgould · 2024-12-13T15:15:05Z

Thanks for your question, Eric. What you ask for is possible in the Fortran, C and Julia versions of the package, but not currently in Python. There is, of course, no reason not to provide this, except for 24-hour days. We'll add this to the list, but in the meantime, I hope the C version is enough for you (the fact that the interface from C to Julia works does at least give some hope!)

eweine · 2024-12-13T15:58:29Z

Thanks for your quick response!

I may actually start with the Julia version (I am actually writing most of my code in R, and there exists an interface there to call Julia).

I see where I can provide my Hessian-vector product function.

However, I have a question about creating the preconditioner. Ideally, I would do something similar to Algorithm 7.3 in Nocedal and Wright, which seems to only require Hessian-vector products to form the preconditioner. Is there a built-in method in GALAHAD to help me do this? Looking at PSLS, it looks like I have to provide the full matrix to get the preconditioner. Or does TRU create this preconditioner using a sparse Cholesky factorization by default?

Thanks,

Eric.

nimgould · 2024-12-13T16:15:50Z

Eric, there is considerable freedom in how the preconditioner is applied. The key is the value of the norm component of the control structure. This enables you to allow the code itself to build and apply a variety of preconditioners based on the (sparse) Hessian or an LBFGS approximation, or to allow you to do this externally (what is often called reverse communication). For this, control is passed back from the minimizer to the user with a request to form u = P^-1 v for a given v, and then to return this u. This flexibility allows you to use whatever preconditioner you can provide, but there is undoubtedly a cost associated with popping to and from the minimizer. There are a number of incomplete factorizations provided; see norm=6,7.
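For readers unfamiliar with reverse communication, here is a minimal toy sketch of the pattern in plain Python. It is not the GALAHAD interface: the names `pcg_reverse` and `solve`, the task labels "hessvec" and "precond", and the 2x2 test problem are all made up for illustration. The solver is a preconditioned conjugate gradient loop written as a generator; whenever it needs H v or u = P^-1 v it hands control back to the caller with the vector v and waits for the result.

```python
# Toy illustration of reverse communication; NOT the GALAHAD API.
# All names below (pcg_reverse, solve, task labels) are hypothetical.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pcg_reverse(b, tol=1e-10, max_iter=50):
    """Preconditioned CG for H x = b, written as a generator.

    Yields ("hessvec", v) or ("precond", v); the caller must send back
    H v or P^-1 v respectively. The solution is the generator's return value.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - H x for the start x = 0
    u = yield ("precond", r)         # ask the caller for u = P^-1 r
    p = list(u)
    ru = dot(r, u)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:  # converged: residual norm is small
            break
        hp = yield ("hessvec", p)    # ask the caller for H p
        alpha = ru / dot(p, hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        u = yield ("precond", r)     # ask the caller for u = P^-1 r again
        ru_new = dot(r, u)
        p = [ui + (ru_new / ru) * pi for ui, pi in zip(u, p)]
        ru = ru_new
    return x

def solve(b, hessvec, precond):
    """Drive the generator, answering each request with the user's functions."""
    gen = pcg_reverse(b)
    try:
        task, v = next(gen)
        while True:
            reply = hessvec(v) if task == "hessvec" else precond(v)
            task, v = gen.send(reply)
    except StopIteration as done:
        return done.value

# Made-up example: H = [[4, 1], [1, 3]] with a Jacobi preconditioner P = diag(H).
H = [[4.0, 1.0], [1.0, 3.0]]
hessvec = lambda v: [dot(row, v) for row in H]
precond = lambda v: [v[0] / 4.0, v[1] / 3.0]
x = solve([1.0, 2.0], hessvec, precond)
```

The solver never sees H or P, only the vectors sent back to it, which is why any preconditioner can be plugged in. The cost mentioned above shows up here as one round trip through `solve` for every product.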

nimgould · 2025-01-03T14:19:08Z

I hope that the above explained how things work, so I am now closing this.

nimgould closed this as completed Jan 3, 2025
