finmath/finmath-lib

Fourier Methods - Major rewrite needed.

AlessandroGnoatto opened this issue · 9 comments

The structure of this part needs to be revisited. Issues I see at the moment:

Models feature a riskFreeRate and a dividendYield. It makes much more sense to use a DiscountCurve (standalone, without any AnalyticModel overkill) for this purpose and to refactor the characteristic function as a fundamental transform, using the terminology of Alan Lewis.

Secondly, I can currently instantiate a Heston model where the correlation is 3, or where the starting instantaneous variance is negative, which does not make any sense. I can do that only because the parameters are currently plain doubles. Models need to be equipped with constraint classes for the parameters, and the library should complain when I feed it with nonsense.

In QuantLib this is solved e.g. in the class HestonModel, where in the constructor you see something like

arguments_[3] = ConstantParameter(process->rho(),BoundaryConstraint(-1.0, 1.0));
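
As an illustration only, here is a minimal Java sketch of how such a boundary constraint could be enforced in a constructor. The class and parameter selection are hypothetical, not existing finmath-lib code:

public class ConstrainedHestonParameters {
    private final double rho;    // correlation, constrained to [-1, 1]
    private final double theta;  // long-term variance, constrained to be non-negative

    public ConstrainedHestonParameters(double rho, double theta) {
        if(rho < -1.0 || rho > 1.0) throw new IllegalArgumentException("rho must be in [-1, 1], but was " + rho);
        if(theta < 0.0) throw new IllegalArgumentException("theta must be non-negative, but was " + theta);
        this.rho = rho;
        this.theta = theta;
    }

    public double getRho() { return rho; }
    public double getTheta() { return theta; }
}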

I see your points (but as usual I try to challenge them a bit :-)

For the first one: I am not sure if this will work, e.g. for more general products. For the Black-Scholes and Heston CF the scalars r, sigma, xi, etc. are model parameters. They are not market data. So if you supply a discount curve or forward curve - what is the model then? Do we have a term structure (r is replaced by r(t))? - I believe that this is a new model (a new class) then. The model takes a forward curve because it calibrates to that curve. I view the current Heston CF as similar to the BS formula and the SABR formula in AnalyticFormulas - quick and easy formulas.
W.r.t. the AnalyticModel: I would not see it as overkill. The curves cannot live without it, because it provides the context. Otherwise the ForwardCurveFromDiscountCurve mapping to a single curve would not work.

For the second, with respect to constraints: It is good to have constraints/exceptions, but they should focus on parameter specifications which would otherwise result in NaN or other issues. In other cases I would go for no constraints, even if they look reasonable. Examples of situations where a constraint looks reasonable but is not:

  1. I read a paper (not too old!) claiming that interest rates should be constrained to be positive, otherwise we will have arbitrage in the model.
  2. Constraints are a problem for root finders which like to calibrate a model. Sometimes it is good to let the root finder push over the limit. For example: the volatility in the Heston model can be negative. -sigma dW is like +sigma dW with a negative correlation. Unbounded root finders are usually much better than bounded ones. If a bound throws an exception, it is difficult to recover within the calibration.
  3. In a credit curve the hazard rate can be calibrated negative. Calibrating a negative hazard rate implies that the calibrated hazard curve has a better quality than the base curve exp(-(r+lambda) t) with a negative lambda. For example, many bond spreads are quoted over LIBOR 3M, so you will have negative bond spreads. If you have a root finder wrapping around a curve and that curve throws an exception, that might be harder to handle.
  4. The dividend rate can be negative. It appears unreasonable that a stock pays a negative dividend, so one might be tempted to set a restriction for positivity, but in our Heston CF we specify repoRate and discountRate, and if the derivative is collateralized, discountRate < repoRate.

So with respect to constraints I believe that it is good to have unconstrained models at the lower level of the code, run the calibration without too many constraints, and then report and/or constrain parameters in the user interface.

PS: In the Monte-Carlo Heston we would have rho sigma dW1 + sqrt(1-rho^2) dW2 and we have to constrain rho to prevent NaN! Agreed. But funny: I am not sure about the Heston rho in the CF. There rho only appears as rho*xi, and if I run the model the price is continuous/smooth even for rho > 1. For rho >> 1 the model breaks down. In a calibration I would constrain not inside the model, but inside the root finder - maybe. (I wonder what rho > 1 does in the Heston CF. The formula still works.)
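
To make the asymmetry concrete, a small illustrative snippet (plain Java, not library code): the Monte-Carlo factor loading needs sqrt(1-rho^2), which is NaN for |rho| > 1, while in the CF rho only enters through products such as rho*xi:

public class RhoOutsideUnitInterval {
    public static void main(String[] args) {
        double rho = 1.5, xi = 0.3;
        System.out.println("sqrt(1 - rho^2) = " + Math.sqrt(1.0 - rho * rho));  // NaN for |rho| > 1: the dW2 loading is undefined
        System.out.println("rho * xi        = " + (rho * xi));                  // finite for any rho, as rho appears in the Heston CF
    }
}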

Hi and thanks for the answer.

I disagree with your interpretation: you are mixing model parameters and market data. The repo and the dividend curve are not part of the model; they are features of the stock itself. You can decide at a later point whether you use a geometric Brownian motion or a stoch-vol process or whatever for the stock, but the repo and the dividend yield are not part of the model. Your interpretation is semantically wrong in my opinion. In your view, one might be tempted to also calibrate the dividend yield and the risk-free rate… Do you see it? ;-)

The separation I am proposing can easily be built if you allow for a simple stand-alone DiscountCurve which simply provides interpolation and extrapolation. A constant dividend yield or repo rate is then simply a special case of the more general construct. In principle, to price an option you would then need three different DiscountCurves:

  1. Repo Curve
  2. Dividend Yield Curve
  3. Collateral/Funding Curve

where 1) and 2) are grouped together with the spot price in a general stock class.

  3. is a property of the product.

The parameters are then part of a stochastic process. You couple an instance of say HestonProcess and a Stock instance to obtain a HestonModel for the stock.
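
A rough sketch of that separation in Java (all class names here are hypothetical, the HestonModel below is not the existing finmath-lib class of the same name, DiscountCurveInterface stands for a simple stand-alone curve, and each class would live in its own file):

class Stock {
    private final double spot;
    private final DiscountCurveInterface repoCurve;      // property of the stock
    private final DiscountCurveInterface dividendCurve;  // property of the stock

    Stock(double spot, DiscountCurveInterface repoCurve, DiscountCurveInterface dividendCurve) {
        this.spot = spot;
        this.repoCurve = repoCurve;
        this.dividendCurve = dividendCurve;
    }
}

class HestonProcess {
    private final double kappa, theta, xi, rho, v0;  // model parameters only, no market data
    HestonProcess(double kappa, double theta, double xi, double rho, double v0) {
        this.kappa = kappa; this.theta = theta; this.xi = xi; this.rho = rho; this.v0 = v0;
    }
}

class HestonModel {
    private final Stock stock;                          // market data
    private final HestonProcess process;                // model parameters to be calibrated
    private final DiscountCurveInterface fundingCurve;  // collateral/funding curve, a property of the product

    HestonModel(Stock stock, HestonProcess process, DiscountCurveInterface fundingCurve) {
        this.stock = stock;
        this.process = process;
        this.fundingCurve = fundingCurve;
    }
}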

Semantically speaking, constraints are part of the model/process from the very moment you write down the dynamics; hiding them in the root finder is, for me, poor style.

Some of your examples are odd: while the positivity of rates has always been debated (Vasicek has been out there since the seventies), there is no doubt that a variance (this is what you model in Heston) should be positive, since it is a square, or that a correlation should be in the interval [-1,1]. Anyway, I want to stay focused on the main point without starting an annoying discussion on the single examples: a class should correctly represent a concept. If I define an instance of Person with a negative age because I am using a double for the age, I am doing poor programming. My 2 cents.

A small comment on rho: if we specify a model by
dX = sigma1 dW1
dY = sigma2 (rho dW1 + sqrt(1-rho^2) dW2)
we have to ensure |rho| <= 1 to avoid NaN. But if we specify the model by
dX = a dW1
dY = b dW1 + c dW2
then all parameters are unconstrained. Even negative values are allowed. And we have
sigma_X = abs(a)
sigma_Y = sqrt(b^2 + c^2)
rho_XY = a*b / (abs(a) * sqrt(b^2 + c^2))
with -1 <= rho_XY <= 1.
I wonder which of the two parameter spaces performs better when used in a calibration (say an LM algorithm fitting a smile). In the sqrt(1-rho^2) case it is maybe just an imaginary vol from which we take the real part in the FFT - maybe it is just using sqrt(max(1-rho^2, 0)).
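
As a small numerical illustration of the second parametrization (plain Java, not library code), the vols and the correlation implied by the unconstrained parameters a, b, c can be recovered as above, and the correlation automatically lands in [-1, 1]:

public class CorrelationFromUnconstrainedParameters {
    public static void main(String[] args) {
        double a = -0.2, b = 0.15, c = 0.25;                                 // unconstrained, may be negative
        double sigmaX = Math.abs(a);
        double sigmaY = Math.sqrt(b * b + c * c);
        double rhoXY  = (a * b) / (Math.abs(a) * Math.sqrt(b * b + c * c));  // always in [-1, 1]
        System.out.println("sigma_X = " + sigmaX + ", sigma_Y = " + sigmaY + ", rho_XY = " + rhoXY);
    }
}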

I have to think about this. I am not sure if we agree or disagree.

For me a repo curve, dividend curve, etc. is market data encoded in a model of its own - a curve (you have quotes for repo rates and interpolate them on a curve).
The r_c and r_d in the Heston model are model parameters.
Now, there is an additional step - calibration - which determines how r_c and r_d are derived from the curve.

So it is still not answered to me what it means if we replace r_c and r_d in the constructor by the curves: we have to specify how the calibration is done. On average (one r_c to fit the whole curve)? Or a product-specific calibration (r_c = r(T), where T is the product maturity)? Or do we have a term-structure model? Is it then reasonable to separate the core model (with the curve for r) from the forward curve and perform a "calibration"?
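
For example, the "per maturity" variant could simply extract a constant rate reproducing the discount factor at the product maturity, r_c = -log(P(0,T))/T. A hedged sketch (the discountCurve object and its getDiscountFactor method are assumptions here):

double maturity = 2.0;                                              // product maturity T in years
double discountFactor = discountCurve.getDiscountFactor(maturity);  // P(0, T) read from the curve
double rC = -Math.log(discountFactor) / maturity;                   // constant r_c matching P(0, T)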

PS: A person with a negative age would be an unborn child. W.r.t. life: "t=0 <=> birth" is a definition (like t=0 <=> spot, and spot is not today). Assume you have a computer system maintaining data for persons and you would like to enter data of unborn children. Then the field age in the UI should display -0.5 (expected birth in 6M)... - I am just picking up your example, but what I wanted to say is that adding constraints too lightheartedly will remove functionality which could be useful. I know that some of my examples are odd, but if a few are reasonable then they are maybe sufficient to make my point that constraints at a low level should be used only if required for the numerics. For example: negative intensities of survival probabilities are reasonable (if the base curve is of poorer quality than the target curve) and work just fine for the numerics in the model.
I am not sure if I could make my point with the root finder... I do not want to hide something. It is just that constrained optimisation is much more complicated/inefficient and it should be avoided if not required. Also: what does constrain mean here - do you consider throwing an exception if someone initialises the model violating the constraint?

Also: I am not objecting to writing a Heston model which takes curves instead of r_c, r_d. I am fine with that. I would just like to keep the old one with r_c and r_d as a separate class. So instead of a rewrite (deleting the old simple formulas) please just add a separate class/framework (possibly using the simple models).

To be clear, you write the dynamics of an asset (equity or FX rate) in general as

$S_T = S_t \frac{P^{x}(t,T)}{P^{y}(t,T)} M_T$

x = dividend or foreign curve
y = repo or domestic curve
P^{z}(t,T), z \in {x,y}, is a discount factor (see DiscountCurve),
and M_T is a positive true martingale starting at 1 which depends on the model parameters (BS, local vol, stoch vol, jump diffusion, whatever). This is the process whose parameters are to be calibrated. In the case of Black-Scholes the process is fairly trivial:

$M_T = \exp\left\{-\frac{1}{2}\int_t^T \sigma^2_s \, ds + \int_t^T \sigma_s \, dW_s\right\}$
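
Note that since M_T is a martingale equal to 1 at time t, the decomposition above already pins down the model-independent forward F(t,T) = S_t P^{x}(t,T) / P^{y}(t,T). A small sketch of that computation (the two curve objects and their getDiscountFactor method are assumptions here):

double spot = 100.0;
double maturity = 1.0;
double dfDividendOrForeign = dividendCurve.getDiscountFactor(maturity);  // P^{x}(t,T)
double dfRepoOrDomestic    = repoCurve.getDiscountFactor(maturity);      // P^{y}(t,T)
double forward = spot * dfDividendOrForeign / dfRepoOrDomestic;          // F(t,T) = S_t P^{x}/P^{y}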

The question on the calibration of the curves P^{x}(t,T) and P^{y}(t,T) is easily answered:

  1. When you consider an Equity option you consider Equity forwards as calibration instruments.
  2. When you have an FX option you have a liquid market for FX forwards which delivers you the curves you need. You do not want to jointly calibrate on FX forwards and FX options (!!!).

If you think about it for a second, this is exactly what happens in the whole rest of the library with interest rate models like the LIBOR Market Model: you bootstrap on linear instruments so that you reprice them exactly, and then set the model in motion with a martingale which is calibrated to non-linear instruments.

If you give me some time I hope to get MaFinLib to the point where this is implemented and then we can hopefully merge the implementations: the calibration would then be automatically retriggered by the observer pattern as market information changes.

I know! - This is clear. I am sorry that I could not explain my point:
The current Heston model assumes dS(t) = r S(t) dt + ... with a scalar r and scalar model parameters.
If you provide a curve you have to consider dS(t) = r(t) S(t) dt + ... with a function t -> r(t).
If you provide a volatility surface you may consider a term-structure of volatility, etc...

All I wanted to say is that I would prefer to keep the old simple r = const model and rather add a new model instead of rewriting the old one.
If the user is forced to provide curves to the Heston model, he has to understand curve bootstrapping etc. (but maybe he just likes to calculate implied vols/smiles for a given r).

(Thinking of Heston here like SABR is used - without a term structure, one model for one forward).

I have added a constructor which takes a discount curve and the CF now uses it. I have done this only for the Heston model, but will also add it to Bates and BS. But I am not sure if this is the "major rewrite" you wanted.

Note: I keep the internal storage of a constant rate, if desired, because that makes a 10%-15% performance difference and, in case someone likes to calibrate "per maturity", he may calculate the rate outside without a curve.

Apart from the additional constructor/fields, only one line of code has changed in the CF:

return A.add(B.multiply(volatility*volatility)).add(iargument.multiply(Math.log(initialValue) - getLogDiscountFactorForForward(time))).add(getLogDiscountFactorForDiscounting(time)).exp();

instead of

return A.add(B.multiply(volatility*volatility)).add(iargument.multiply(Math.log(initialValue) + riskFreeRate*time)).add(-discountRate*time).exp();
where getLogDiscountFactor... reads the value from the curve.
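
For reference, a sketch of how such a helper could read the value, assuming the model keeps an optional discountCurve field and falls back to the constant discountRate when no curve is set (this is an illustration, not necessarily the committed implementation):

private double getLogDiscountFactorForDiscounting(double time) {
    if(discountCurve != null) return Math.log(discountCurve.getDiscountFactor(time));
    return -discountRate * time;  // fall back to the constant rate (faster, see the note above)
}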

I think this is well addressed and understood so we should close it.