Sparse Higher Order Derivatives
I was reading your documentation and the ability to create sparse Hessians caught my attention. I was wondering if it would be possible to leverage sparsity for higher order derivatives? In my use case I'm after the 3rd and 4th order analogs of a Hessian, and I know these tensors will be fairly sparse (99.5% zero). My feeling is that the answer is no, since sparse formats for tensors are not really supported in Julia, but beyond that is there a reason this is infeasible? Auto-diff is already much faster than finite differences for my code, and if I could take advantage of sparsity that would be even better.
Thanks!
FastDifferentiation can compute sparsity for tensors of any order, but I'm not sure there is a Julia package that represents sparse tensors efficiently. If there is one it would be straightforward for me to add support for it. An alternative would be to pack the sparse tensor into a sparse matrix or vector, something like the sketch below.
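For instance, a third order tensor could be packed into a standard `SparseMatrixCSC` by flattening two of its indices. This is just a minimal sketch of the packing idea, assuming the nonzero entries are available as `(i, j, k, value)` tuples; the `entries` variable and the flattening scheme are illustrative, not part of FastDifferentiation's API.

```julia
using SparseArrays

# Pack a sparse n x n x n tensor into an n x n^2 sparse matrix by
# flattening the (j, k) indices into a single column index.
n = 4
entries = [(1, 2, 3, 0.5), (4, 1, 1, -1.2)]   # hypothetical (i, j, k, value) nonzeros

rows = [i for (i, j, k, v) in entries]
cols = [(k - 1) * n + j for (i, j, k, v) in entries]  # column = linear index of (j, k)
vals = [v for (i, j, k, v) in entries]

T_packed = sparse(rows, cols, vals, n, n^2)

# Entry (i, j, k) of the tensor is recovered as T_packed[i, (k - 1) * n + j].
@assert T_packed[1, (3 - 1) * n + 2] == 0.5
```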
FastDifferentiation is not optimized for higher order derivatives, so derivative size increases rapidly as order goes up. It might still be reasonable if you only want 3rd or 4th order. Adding this is on my list, but there are a lot of other things on that list as well, so it could be a while.
Do you have an example you'd like to try Fast Differentiation on?
I realized my problem is block sparse, so I can handle the sparsity manually in my own code. And like you said, there really is not a good sparse tensor representation in Julia (or most places).
The higher order derivatives are something I have to do though. Based on the speed of the Hessian in FastDiff, I'd be surprised if TaylorDiff competes on third order. I will have to test that, and if you have any recommendations for maintaining speed at 3rd and 4th (or higher) order, I'd love to hear them.
The specific problem I'm working on is the calculation of force constants, whose equation is here. Basically I just define the potential energy function U and take derivatives for all combinations of i, j, k (the third order analog of a Hessian). I can describe it more, but I don't think this really requires an interface change as long as I can chain gradient calls.
There shouldn't be any problem chaining gradient calls, at least if I understand what you mean by this. Maintaining speed on really high order derivatives will probably require algorithmic changes to the FastDifferentiation algorithm to avoid recomputing redundant terms, so there's not much you could do to fix that on your end.
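For reference, chaining could look roughly like this. It's a minimal sketch, assuming `hessian`, `jacobian`, and `make_function` compose on symbolic expressions as described in the FastDifferentiation docs; the toy potential `U` and the variable names are made up for illustration.

```julia
using FastDifferentiation

@variables x1 x2 x3
vars = [x1, x2, x3]

# Toy "potential energy" function standing in for the real U.
U = x1^2 * x2 + sin(x2 * x3) + x3^4

H = hessian(U, vars)          # 3 x 3 symbolic Hessian
T = jacobian(vec(H), vars)    # 9 x 3 matrix: the 3rd order tensor with the
                              # first two indices flattened

# Compile the symbolic result to an executable function and evaluate it.
T_exec = make_function(T, vars)
@show T_exec([1.0, 2.0, 3.0])
```

Another `jacobian(vec(...), vars)` pass on `T` would give the 4th order derivative in the same flattened layout.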
The trend is that each extra derivative order roughly doubles the number of operations, so a 4th order derivative will be roughly 8x more ops than the first derivative. It is probably still competitive with other algorithms because FastDifferentiation first derivatives are so efficient.
I'll test it out and compare against TaylorDiff. Will share results here if I remember.
I'm converting this to a discussion item rather than closing the issue. This will leave it available for others to see without having to search through closed issues.