Slow deserialization
notEvil opened this issue · 5 comments
Consider the following
KERNEL_PATH = '/opt/Mathematica/bin/WolframKernel'
import wolframclient.evaluation as w_evaluation
_ = 'Times[Exp[Times[Global`thetarho12, Plus[Times[0.5, Global`thetarho12, Plus[Power[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Times[Plus[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Times[-1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Global`thetatheta1, Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]], Global`thetasigma2], Global`y20]]]], 2.], Power[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]], 2.]]], Times[-1, Times[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Times[Plus[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Times[-1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Global`thetatheta1, Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]], Global`thetasigma2], Global`y20]]]], Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]]]]], Power[Plus[Power[Global`thetarho12, 2.], Times[-1, 1.]], -1]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho12, 2.]]], 0.5], -1], Exp[Times[Global`thetarho13, Plus[Times[0.5, Global`thetarho13, Plus[Power[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Plus[1., Times[Plus[Times[-1, Global`thetatheta2, Exp[Times[-1, Global`thetatheta1, Global`thetax]]], Times[Global`thetatheta1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]]], Global`thetasigma3], Global`y30]]]], 2.], Power[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]], 2.]]], Times[-1, Times[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Plus[1., Times[Plus[Times[-1, 
Global`thetatheta2, Exp[Times[-1, Global`thetatheta1, Global`thetax]]], Times[Global`thetatheta1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]]], Global`thetasigma3], Global`y30]]]], Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]]]]], Power[Plus[Power[Global`thetarho13, 2.], Times[-1, 1.]], -1]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho13, 2.]]], 0.5], -1], Exp[Times[Global`thetarho23g1, Plus[Times[0.5, Global`thetarho23g1, Plus[Power[Times[Plus[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Plus[1., Times[Plus[Times[-1, Global`thetatheta2, Exp[Times[-1, Global`thetatheta1, Global`thetax]]], Times[Global`thetatheta1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]]], Global`thetasigma3], Global`y30]]]], Times[-1, Times[Global`thetarho13, Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]]]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho13, 2.]]], 0.5], -1]], 2.], Power[Times[Plus[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Times[Plus[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Times[-1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Global`thetatheta1, Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]], Global`thetasigma2], Global`y20]]]], Times[-1, Times[Global`thetarho12, Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]]]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho12, 2.]]], 0.5], -1]], 2.]]], Times[-1, Times[Plus[Quantile[NormalDistribution[0, 1], Plus[1., 
Times[-1, CDF[NormalDistribution[Plus[1., Times[Plus[Times[-1, Global`thetatheta2, Exp[Times[-1, Global`thetatheta1, Global`thetax]]], Times[Global`thetatheta1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]]], Global`thetasigma3], Global`y30]]]], Times[-1, Times[Global`thetarho13, Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]]]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho13, 2.]]], 0.5], -1], Plus[Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Times[Plus[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Times[-1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Global`thetatheta1, Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]], Global`thetasigma2], Global`y20]]]], Times[-1, Times[Global`thetarho12, Quantile[NormalDistribution[0, 1], Plus[1., Times[-1, CDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10]]]]]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho12, 2.]]], 0.5], -1]]]], Power[Plus[Power[Global`thetarho23g1, 2.], Times[-1, 1.]], -1]]], Power[Power[Plus[1., Times[-1, Power[Global`thetarho23g1, 2.]]], 0.5], -1], PDF[NormalDistribution[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Global`thetasigma1], Global`y10], PDF[NormalDistribution[Times[Plus[Exp[Times[-1, Global`thetatheta1, Global`thetax]], Times[-1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Global`thetatheta1, Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], -1]], Global`thetasigma2], Global`y20], PDF[NormalDistribution[Plus[1., Times[Plus[Times[-1, Global`thetatheta2, Exp[Times[-1, Global`thetatheta1, Global`thetax]]], Times[Global`thetatheta1, Exp[Times[-1, Global`thetatheta2, Global`thetax]]]], Power[Plus[Global`thetatheta2, Times[-1, Global`thetatheta1]], 
-1]]], Global`thetasigma3], Global`y30]]'
_ = 'D[Log[{}], {{{{Global`thetatheta1, Global`thetatheta2, Global`thetasigma1, Global`thetasigma2, Global`thetasigma3, Global`thetarho12, Global`thetarho13, Global`thetarho23g1}}, 2}}]'.format(_)
with w_evaluation.WolframLanguageSession(kernel=KERNEL_PATH) as session:
    result = session.evaluate(_)
    print(result)
In the profile I see 50.4s in session.evaluate and 49.9s in binary_deserialize. I assume there is not much asynchronous execution, so the timing should be correct. Is there any chance to speed things up? In this case, the result is composed of many similar subexpressions (due to the chain rule), which could be cached in transmission.
The resulting expression has a LeafCount above one million. It certainly takes some time to deserialize even if 50s seems a little long.
To help you get a better experience with the library, I need to understand the next step. What will you do with such a big Wolfram Language expression?
If you plan to use it in a subsequent evaluation, just store it in a variable and use it directly (don't forget the semicolon). This prevents the entire expression from making the round trip. The second command becomes:
_ = 'result = D[Log[{}], {{{{Global`thetatheta1, Global`thetatheta2, Global`thetasigma1, Global`thetasigma2, Global`thetasigma3, Global`thetarho12, Global`thetarho13, Global`thetarho23g1}}, 2}}];'.format(_)
You can also store the result somewhere, in which case I recommend using Export on the intermediary variable above, directly from the kernel:
from wolframclient.language import wl
with w_evaluation.WolframLanguageSession(kernel=KERNEL_PATH) as session:
    session.evaluate(_)
    session.evaluate(wl.Export('/tmp/test.wxf', wl.Global.result, "WXF"))
This creates a new file, /tmp/test.wxf, containing the resulting expression. It can later be loaded using Import["/tmp/test.wxf"].
When cached (CSE), the result is not that big. My initial goal was to get the derivatives (and use Mathematica's implicit simplification) as an alternative to my current approach (R Deriv). That is, I would transform the result into a different representation (for instance Python code based on NumPy) and use it as a drop-in replacement. Of course, if I did the entire evaluation in Mathematica, the deserialization wouldn't be necessary.
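To make the transformation step above concrete, here is a minimal sketch of turning an expression tree into Python source. It assumes the expression has already been converted to nested tuples of the form (head, arg1, arg2, ...), which is a hypothetical stand-in and not the objects wolframclient returns. The function name to_python_source and the small set of supported heads are also assumptions. It emits math.* calls so that it runs with the standard library alone; swapping the prefix for np. would give the NumPy variant.

```python
import math

def to_python_source(expr):
    """Render a nested-tuple expression (hypothetical format) as Python source."""
    if isinstance(expr, str):               # symbol name, e.g. 'x'
        return expr
    if isinstance(expr, (int, float)):      # numeric literal
        text = repr(expr)
        # Parenthesize negatives so that e.g. (-1) ** 2 keeps its meaning.
        return '(' + text + ')' if expr < 0 else text
    head, *args = expr
    parts = [to_python_source(arg) for arg in args]
    if head == 'Plus':
        return '(' + ' + '.join(parts) + ')'
    if head == 'Times':
        return '(' + ' * '.join(parts) + ')'
    if head == 'Power':
        base, exponent = parts
        return '(' + base + ' ** ' + exponent + ')'
    if head == 'Exp':
        return 'math.exp(' + parts[0] + ')'
    if head == 'Log':
        return 'math.log(' + parts[0] + ')'
    raise ValueError('unsupported head: ' + head)

# 2 * Exp[-x], written in the assumed tuple format.
expr = ('Times', 2, ('Exp', ('Times', -1, 'x')))
source = to_python_source(expr)

# Compile the generated source into a callable of one variable.
f = eval('lambda x: ' + source, {'math': math})
```

The generated string can be written to a module or compiled once and reused, so the tree only has to be traversed a single time.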
Can you explain what you mean by:
When cached (CSE)
I'm not familiar with this acronym.
If I understand correctly, there is a repeated pattern in the result that artificially inflates its leaf count. If you plan to traverse the resulting Python expression and turn it into something else, why not do a Replace of the repeated expression with something short and easy to identify (e.g. a unique symbol name like myRepeatedPattern), which you can later expand to its value (in the Python NumPy representation)?
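The expansion step on the Python side could look roughly like the sketch below. It assumes the expression is held as nested tuples (head, arg1, ...) with symbols as plain strings, which is a hypothetical format rather than what wolframclient produces, and the helper name expand is made up for illustration.

```python
def expand(expr, values):
    """Replace placeholder symbols with their stored values, recursively."""
    if isinstance(expr, str):
        # A placeholder may itself contain other placeholders.
        return expand(values[expr], values) if expr in values else expr
    if isinstance(expr, tuple):
        # Keep the head (index 0) as is and expand only the arguments.
        return (expr[0],) + tuple(expand(arg, values) for arg in expr[1:])
    return expr  # numeric literal

# The short form transmitted from the kernel, plus the placeholder's value.
values = {'myRepeatedPattern': ('Exp', ('Times', -1, 'x'))}
compact = ('Plus', 'myRepeatedPattern', ('Power', 'myRepeatedPattern', 2))
full = expand(compact, values)
```

In practice it is probably better not to expand at all and to emit the placeholder as a local variable in the generated code, so the repeated value is computed only once.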
Sorry, CSE is common subexpression elimination. The repeated expressions are a result of the chain rule and only to a minor extent of the base function.
I looked at the profile more closely and couldn't find obviously inefficient parts. Therefore I will try to do CSE in Mathematica (like https://community.wolfram.com/groups/-/m/t/394421) before transmission.
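For readers unfamiliar with the technique, here is a minimal Python sketch of common subexpression elimination. It is not the Mathematica approach from the linked post. It assumes expressions are nested tuples (head, arg1, ...) with string symbols, and the names eliminate and cse0, cse1, ... are made up for illustration.

```python
from collections import Counter

def count_subtrees(expr, counts):
    """Count how often each compound subtree occurs in expr."""
    if isinstance(expr, tuple):
        counts[expr] += 1
        for arg in expr[1:]:
            count_subtrees(arg, counts)

def eliminate(expr):
    """Return (definitions, reduced): shared subtrees become temporaries."""
    counts = Counter()
    count_subtrees(expr, counts)
    names = {}        # original subtree -> temporary name
    definitions = []  # (name, rewritten subtree), in dependency order

    def rewrite(node):
        if not isinstance(node, tuple):
            return node                      # symbol or number
        if node in names:
            return names[node]               # already has a temporary
        rewritten = (node[0],) + tuple(rewrite(arg) for arg in node[1:])
        if counts[node] > 1:
            name = 'cse{}'.format(len(definitions))
            names[node] = name
            definitions.append((name, rewritten))
            return name
        return rewritten

    return definitions, rewrite(expr)

# Exp[-x] + Exp[-x]^2: the Exp[-x] subtree occurs twice.
expr = ('Plus', ('Exp', ('Times', -1, 'x')),
        ('Power', ('Exp', ('Times', -1, 'x')), 2))
definitions, reduced = eliminate(expr)
```

Each definition only refers to temporaries defined before it, so the list can be emitted as a sequence of assignments. The sketch is recursive and rehashes tuples, so a tree with a million leaves would need an iterative version with interned nodes.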
Closing this issue. Feel free to reopen it if needed.