- Calculus
- Probability and statistics
- Linear algebra
- Convex optimization
Derivative
f′(x_0) = lim_{x→x_0} (f(x)−f(x_0)) / (x−x_0) ==> f(x) ≈ f(x_0) + f′(x_0)(x−x_0)
- Geometrically, the derivative is the slope of the tangent line at x_0
- The Taylor series extends this idea: it approximates the function with a polynomial instead of a linear term
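As a small illustration (a sketch, not from the notes), a truncated Taylor series of e^x at 0 shows how adding polynomial terms improves on the linear approximation:

```python
import math

def taylor_exp(x, n_terms):
    """Approximate e^x with its degree-(n_terms - 1) Taylor polynomial at 0."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

# Two terms (1 + x) is exactly the linear, derivative-based approximation;
# more terms converge toward the true value.
print(taylor_exp(1.0, 2))   # 2.0 (linear approximation of e at x = 1)
print(taylor_exp(1.0, 10))  # close to e ≈ 2.71828
```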
- Newton's method solves f(x) = 0
Setting f(x)=0 in the linear approximation: 0 = f(x_0) + f′(x_0)(x−x_0) ==> f′(x_0)(x−x_0) = −f(x_0) ==> x − x_0 = −f(x_0)/f′(x_0) ==> x = x_0 − f(x_0)/f′(x_0), iterated until convergence
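The iteration above can be sketched directly in code (a minimal version; tolerance and iteration cap are illustrative choices):

```python
def newton(f, f_prime, x0, tol=1e-10, max_iter=100):
    """Find a root of f via the iteration x <- x - f(x)/f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x -= step
        if abs(step) < tol:  # stop once updates become negligible
            break
    return x

# Example: solve x^2 - 2 = 0, i.e. approximate sqrt(2).
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
print(root)  # ~1.41421356...
```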
Integral
- Geometrically, the integral is the (signed) area under the curve
Newton-Leibniz formula (fundamental theorem of calculus)
∫_a^b f(x)dx = F(b) − F(a), where F is an antiderivative of f (F′ = f)
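A quick numerical sanity check of the formula (illustrative; the integrand and interval are arbitrary choices): a Riemann sum of f(x) = x² on [0, 1] should match F(b) − F(a) with F(x) = x³/3.

```python
def riemann_sum(f, a, b, n=100_000):
    """Midpoint Riemann sum approximating the definite integral of f on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x**2       # integrand
F = lambda x: x**3 / 3   # an antiderivative of f
a, b = 0.0, 1.0

print(riemann_sum(f, a, b))  # ≈ 0.333333...
print(F(b) - F(a))           # exactly 1/3
```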
- Discrete and continuous random variables
- For a continuous random variable, the probability of any single exact value is zero. For example, even if the forecast says it will rain this afternoon, the probability that rain starts at exactly 13:05:00 is zero
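This can be seen numerically (a sketch with an assumed Uniform(0, 1) variable): P(X = c) is the limit of P(c − ε < X < c + ε) as ε → 0.

```python
def prob_interval(lo, hi):
    """P(lo < X < hi) for X ~ Uniform(0, 1): interval length clipped to [0, 1]."""
    return max(0.0, min(hi, 1.0) - max(lo, 0.0))

c = 0.5
for eps in (0.1, 0.01, 0.001):
    print(prob_interval(c - eps, c + eps))  # each value shrinks toward 0
print(prob_interval(c, c))                  # the degenerate interval: exactly 0
```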
- Parameter estimation
- Maximum likelihood
- Principle: choose the parameter that maximizes the likelihood of the observed data
θ^* = argmax_θ p(D; θ)
- Solution: set the partial derivative of the log-likelihood ln p(D; θ) with respect to the parameter to zero
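For a Gaussian with known variance, setting ∂/∂μ ln p(D; μ, σ) = 0 gives the sample mean in closed form. A small check (the data distribution and σ are illustrative assumptions):

```python
import math
import random

random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(1000)]  # simulated observations

def log_likelihood(mu, sigma, xs):
    """Log-likelihood of i.i.d. Gaussian data under N(mu, sigma^2)."""
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu)**2 / (2 * sigma**2) for x in xs)

# Closed-form MLE from d/dmu ln p(D; mu) = 0: the sample mean.
mu_mle = sum(data) / len(data)

# Sanity check: the closed-form solution beats nearby candidates.
for mu in (mu_mle - 0.1, mu_mle + 0.1):
    assert log_likelihood(mu_mle, 2.0, data) > log_likelihood(mu, 2.0, data)
print(mu_mle)  # close to the true mean 5.0
```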
- Maximum a posteriori
- Incorporates the prior distribution p(θ) of the parameter
- Posterior: p(θ|D)
- Bayes' rule: p(θ|D) = p(D|θ) * p(θ) / p(D), i.e., posterior = likelihood * prior / evidence
θ^* = argmax_θ p(θ|D) = argmax_θ p(D|θ) * p(θ) / p(D) = argmax_θ p(D|θ) * p(θ), since the evidence p(D) does not depend on θ
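In the conjugate Gaussian case the argmax has a closed form: with likelihood N(μ, σ²) and prior μ ~ N(μ₀, τ²), the MAP estimate is a precision-weighted average of the prior mean and the sample mean. The data and prior values below are illustrative assumptions:

```python
# MAP estimate of a Gaussian mean with a Gaussian prior (conjugate case).
data = [4.8, 5.1, 5.3, 4.9, 5.0]  # assumed observations
sigma2 = 1.0                       # assumed known likelihood variance
mu0, tau2 = 0.0, 10.0              # assumed prior mean and variance

n = len(data)
# argmax_mu p(D|mu) p(mu): weighted average of sample information and prior.
mu_map = (tau2 * sum(data) + sigma2 * mu0) / (n * tau2 + sigma2)
mu_mle = sum(data) / n

print(mu_map)  # pulled slightly from the sample mean toward mu0
print(mu_mle)
```

With a broad prior (large τ²) the MAP estimate approaches the MLE; a tight prior pulls it toward μ₀.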
- Bayesian estimation
- Principle: rather than committing to a point estimate, take into account all possible values of θ by fully computing (or approximating) the posterior distribution p(θ|D)
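The Beta-Bernoulli model is the standard case where the full posterior is available in closed form (the prior and counts below are illustrative assumptions):

```python
# Bayesian estimation of a coin's heads probability theta:
# a Beta(a, b) prior with a Bernoulli likelihood gives a
# Beta(a + heads, b + tails) posterior (conjugacy).
a, b = 1.0, 1.0      # uniform prior (assumed)
heads, tails = 7, 3  # observed data (illustrative)

a_post, b_post = a + heads, b + tails
posterior_mean = a_post / (a_post + b_post)          # uses the whole posterior
map_estimate = (a_post - 1) / (a_post + b_post - 2)  # posterior mode, for contrast

print(posterior_mean)  # 8/12 ≈ 0.667
print(map_estimate)    # 7/10 = 0.7
```

Unlike MLE/MAP, the posterior here also quantifies uncertainty: its variance shrinks as more data arrive.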
- A basis is a set of linearly independent vectors in a linear space such that every vector can be uniquely represented as a linear combination of them
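Finding the combination coefficients means solving a linear system; a tiny 2×2 sketch via Cramer's rule (the basis and vector are assumed examples):

```python
# Express v in the basis {b1, b2} of R^2: solve c1*b1 + c2*b2 = v.
b1, b2 = (1.0, 1.0), (1.0, -1.0)  # an assumed basis of R^2
v = (3.0, 1.0)

det = b1[0] * b2[1] - b2[0] * b1[1]  # nonzero because b1, b2 are independent
c1 = (v[0] * b2[1] - b2[0] * v[1]) / det
c2 = (b1[0] * v[1] - v[0] * b1[1]) / det

print(c1, c2)  # v = c1*b1 + c2*b2
# Verify the reconstruction component-wise.
assert all(abs(c1 * x + c2 * y - vi) < 1e-12
           for (x, y), vi in zip(zip(b1, b2), v))
```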
- Lagrange duality
- The Lagrange dual function is always concave (a pointwise infimum of affine functions of the multipliers), even if the primal problem is not convex; maximizing it, the dual problem, is therefore a convex optimization problem
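A numerical sketch of this fact on an assumed toy problem: minimize x² subject to x ≥ 1 (primal optimum 1 at x = 1). The dual function g(λ) = inf_x [x² + λ(1 − x)] is computed by brute force and checked for concavity.

```python
def g(lam):
    """Dual function, approximated by minimizing the Lagrangian over a grid."""
    grid = (i / 1000 for i in range(-3000, 3001))  # x in [-3, 3], step 0.001
    return min(x * x + lam * (1 - x) for x in grid)

lams = [i / 50 for i in range(0, 201)]  # lam in [0, 4]
vals = [g(l) for l in lams]

# Concavity check: second differences of a concave function are <= 0.
assert all(vals[i - 1] + vals[i + 1] - 2 * vals[i] <= 1e-9
           for i in range(1, len(vals) - 1))

# Strong duality holds here: the dual maximum equals the primal optimum.
print(max(vals))  # ≈ 1, attained at lam = 2
```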