/lagrange-for-gini-croatia

Program that estimates the Gini coefficient of a country using Lagrange Interpolation for Lorenz curve approximation.

Primary language: Python. License: MIT.

Estimating Croatia's GINI Coefficient using Lagrange Interpolation Method for Lorenz Curve Approximation

This project consists of a program that estimates the Gini coefficient of Croatia for 2018. The program first reads a CSV file containing the income distribution by tenths of the population, taken from Eurostat. It then builds the coordinates needed to plot the Lorenz curve. The Lorenz curve itself is approximated by a polynomial, obtained by applying the Lagrange interpolation method to those coordinates. Finally, the program computes the Gini coefficient by integrating x minus the Lorenz function. Two other methods of calculating the Gini coefficient are included, but only for comparison.
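The pipeline described above can be sketched in a few lines of plain Python. This is only an illustration, not the code in lagrange.py: the function names are hypothetical, the decile shares are made-up numbers rather than the Eurostat data, and the Gini coefficient is taken under the usual convention G = 2 * integral of (x - L(x)) over [0, 1], which equals 1 - 2 * integral of L(x).

```python
from fractions import Fraction

def lorenz_points(decile_shares):
    """Cumulative (population share, income share) points, starting at (0, 0)."""
    total = sum(decile_shares)
    n = len(decile_shares)
    points = [(Fraction(0), Fraction(0))]
    running = Fraction(0)
    for i, share in enumerate(decile_shares, start=1):
        running += Fraction(share)
        points.append((Fraction(i, n), running / total))
    return points

def poly_mul_linear(coeffs, root):
    """Multiply a polynomial (ascending coefficients) by (x - root)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        out[k + 1] += c
        out[k] -= root * c
    return out

def lagrange_coeffs(points):
    """Ascending coefficients of the Lagrange interpolating polynomial."""
    total = [Fraction(0)] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul_linear(basis, xj)
                denom *= xi - xj
        for k, c in enumerate(basis):
            total[k] += yi * c / denom
    return total

def integrate_0_1(coeffs):
    """Exact integral over [0, 1] of a polynomial with ascending coefficients."""
    return sum(c / (k + 1) for k, c in enumerate(coeffs))

def gini_from_shares(decile_shares):
    """Gini = 1 - 2 * integral of the interpolated Lorenz polynomial."""
    coeffs = lagrange_coeffs(lorenz_points(decile_shares))
    return 1 - 2 * integrate_0_1(coeffs)

# Made-up income shares (percent per tenth of population), NOT Eurostat data.
example_shares = [3, 5, 6, 7, 8, 9, 10, 12, 15, 25]
print(float(gini_from_shares(example_shares)))
```

Exact fractions are used here so that the degree-10 polynomial is not affected by floating-point round-off; a floating-point or SymPy version would follow the same steps.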

The PDF document included in this repo explains the theory and the procedure in more detail. Its LaTeX source is in NM_SeminarPaper.zip.

Specifications

Python version: Python 3.6.2
Matplotlib version: 2.1.2
SymPy version: 1.5.1
NumPy version: 1.14.1

Usage

  1. Download this repo and store it on your computer.
  2. Go to the directory where the repo is stored.
  3. From the project's directory, run lagrange.py by typing in PowerShell: python lagrange.py

Results

The Lorenz curve polynomial approximation using Lagrange Interpolation Method is the following:

[Plot: Lorenz curve polynomial approximation]


If we plot a line from point to point (gray line) and compare it to the approximated Lorenz curve (red line), we get:

[Plot: point-to-point line (gray) compared with the approximated Lorenz curve (red)]

Further Experiments

Additionally, and outside the project's strict scope, three methods were used to integrate the Lorenz curve and obtain the Gini coefficient: SymPy's integrate function, Monte Carlo simulation, and Riemann sums. The following plot shows the time performance of the three. Interestingly, Monte Carlo simulation gave the most accurate result, with 97.91% accuracy.

[Plot: time performance of SymPy's integrate function, Monte Carlo simulation and Riemann sums]
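For reference, a Riemann-sum estimate of the Gini coefficient might look like the sketch below. It is not taken from lagrange.py: the function names are hypothetical, a midpoint rule is assumed, and the stand-in curve L(x) = x**2 replaces the fitted Croatian polynomial so that the result can be checked against the known value 1/3.

```python
def lorenz(x):
    """Stand-in Lorenz curve L(x) = x**2 (hypothetical, not the fitted polynomial)."""
    return x * x

def gini_riemann(curve, n_rect=1000):
    """Midpoint Riemann sum of 2 * (x - L(x)) over [0, 1]."""
    width = 1.0 / n_rect
    area = 0.0
    for k in range(n_rect):
        mid = (k + 0.5) * width          # midpoint of the k-th rectangle
        area += (mid - curve(mid)) * width
    return 2.0 * area

# The estimate approaches 1/3 as the number of rectangles grows.
for n in (10, 100, 1000):
    print(n, gini_riemann(lorenz, n))
```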

Monte Carlo simulation is shown here integrating the Lorenz curve to obtain the Gini coefficient, with the number of trials increasing to illustrate how the method works.

[Plot: Monte Carlo simulation of the Lorenz curve integral as the number of trials increases]
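One way such a simulation could work is hit-or-miss sampling: draw random points in the unit square and count those that fall between the Lorenz curve and the line of equality. The sketch below assumes that approach, which may differ from the one used in this repo. The function names, the seed parameter and the stand-in curve L(x) = x**2 (true Gini 1/3) are all illustrative assumptions.

```python
import random

def lorenz(x):
    """Stand-in Lorenz curve L(x) = x**2 (hypothetical, not the fitted polynomial)."""
    return x * x

def gini_monte_carlo(curve, n_samples, seed=0):
    """Hit-or-miss estimate: Gini = 2 * area between y = x and y = L(x).

    Assumes curve(x) <= x on [0, 1], which holds for a valid Lorenz curve.
    """
    rng = random.Random(seed)            # seeded for reproducible runs
    hits = 0
    for _ in range(n_samples):
        x = rng.random()
        y = rng.random()
        if curve(x) < y < x:             # point lies between the two curves
            hits += 1
    return 2.0 * hits / n_samples

# The estimate settles toward 1/3 as the number of trials increases.
for n in (100, 1000, 10000, 100000):
    print(n, gini_monte_carlo(lorenz, n))
```

The error of this kind of estimate shrinks roughly with the square root of the number of trials, so each extra digit of accuracy costs about a hundred times more samples.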

Making this project made me so happy.