This Python package provides tools for calculating and visualizing Gaussian and Binomial distributions. It was developed as part of the AWS AI/ML Scholarship program and is intended for educational and practical use in Machine Learning and Artificial Intelligence.
- Data Modeling: Both Gaussian and Binomial distributions are crucial in the statistical modeling of data, which is a fundamental aspect of training and evaluating AI models.
- Feature Engineering: Understanding the distribution of various features can help in creating better features that improve machine learning models.
- Algorithm Assumptions: Many machine learning algorithms assume that the data is normally distributed. This package can help verify that assumption.
- Performance Metrics: The Binomial distribution is particularly useful in classification tasks where the outcome is binary, for example when computing the probability of success or failure.
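As a rough illustration of the "Algorithm Assumptions" point, the sketch below summarizes how close a sample is to a normal shape using skewness and excess kurtosis, both of which are near 0 for normally distributed data. It uses only the Python standard library. The function name `shape_summary` is hypothetical and is not part of this package.

```python
import random


def shape_summary(data):
    """Return (skewness, excess kurtosis) computed from population moments.

    Hypothetical helper for illustration; not part of this package.
    """
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3 and 4
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Synthetic data drawn from a normal distribution (mu=5, sigma=2)
random.seed(0)
sample = [random.gauss(5, 2) for _ in range(10_000)]

skewness, excess_kurtosis = shape_summary(sample)
print(f"skewness={skewness:.3f}, excess kurtosis={excess_kurtosis:.3f}")
```

Values far from 0 suggest the data departs from a normal shape. This is a quick heuristic rather than a formal statistical test.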
Clone this repository to your local machine:
git clone git@github.com:snufkinwa/gaussian-binomial-distribution.git
Navigate into the package directory:
cd gaussian-binomial-distribution
Install the package:
pip install .
To use the Gaussian distribution module:
from distributions import Gaussian
# Create a Gaussian distribution instance
gaussian = Gaussian(mu=5, sigma=2)
# Read data from a file
gaussian.read_data_file('gaussian_data.txt')
# Calculate mean and standard deviation
mean = gaussian.calculate_mean()
stdev = gaussian.calculate_stdev()
# Plot histogram and PDF
gaussian.plot_histogram_pdf()
# Plot 3D histogram and PDF
gaussian.plot_3d_histogram_pdf()
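For readers who want to see the underlying math, here is a minimal standalone sketch of the quantities involved, written with the standard library only. It does not use this package. The helper `gaussian_pdf` and the sample values are made up for illustration, and whether the package's `calculate_stdev` uses the sample (n - 1) or population (n) form is an assumption you should check against the source.

```python
import math
import statistics


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x (illustrative helper)."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coefficient * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


data = [1, 3, 5, 7, 9]  # made-up values for illustration

mean = statistics.mean(data)
sample_stdev = statistics.stdev(data)        # divides by n - 1
population_stdev = statistics.pstdev(data)   # divides by n

print(mean, sample_stdev, population_stdev)
print(gaussian_pdf(5, mu=5, sigma=2))        # density at the mean
```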
To use the Binomial distribution module:
from distributions import Binomial
# Create a Binomial distribution instance
binomial = Binomial(p=0.4, n=20)
# Read data from a file
binomial.read_data_file('binomial_data.txt')
# Calculate mean and standard deviation
mean = binomial.calculate_mean()
stdev = binomial.calculate_stdev()
# Plot a bar chart and the PDF
binomial.plot_bar()
binomial.plot_pdf()
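The standard formulas behind a Binomial distribution with `n` trials and success probability `p` can be sketched as follows. This is a standalone illustration using only the standard library (Python 3.8+ for `math.comb`). The function names are hypothetical and may not match how the package computes these values internally.

```python
import math


def binomial_mean(n, p):
    """Expected number of successes: n * p."""
    return n * p


def binomial_stdev(n, p):
    """Standard deviation: sqrt(n * p * (1 - p))."""
    return math.sqrt(n * p * (1 - p))


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 20, 0.4  # same parameters as the usage example above
print(binomial_mean(n, p))
print(binomial_stdev(n, p))
print(binomial_pmf(8, n, p))  # probability of exactly 8 successes
```

The probabilities over all `k` from 0 to `n` sum to 1, which is a handy sanity check when plotting.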
To use the Bivariate Gaussian distribution module:
from distributions import BivariateGaussian
# Create a Bivariate Gaussian distribution instance
bivariate = BivariateGaussian(mu_x=5, mu_y=5, sigma_x=1, sigma_y=1, rho=0)
# Generate a grid of points
x_values, y_values, z_values = bivariate.generate_grid(0, 10, 0, 10, 0.1)
# Plot the Bivariate Gaussian distribution
bivariate.plot_bivariate_normal(x_values, y_values, z_values)
Note: This class is still under development.
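For reference, the density of a bivariate normal distribution with correlation `rho` can be sketched as below. This is a standalone illustration of the textbook formula, valid for `-1 < rho < 1`. The function `bivariate_gaussian_pdf` is hypothetical and may differ from what `generate_grid` computes inside the package.

```python
import math


def bivariate_gaussian_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Density of a bivariate normal at (x, y); requires -1 < rho < 1."""
    # Standardize each coordinate
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_rho_sq = 1.0 - rho ** 2
    normalizer = 1.0 / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho_sq))
    exponent = -(zx ** 2 - 2.0 * rho * zx * zy + zy ** 2) / (2.0 * one_minus_rho_sq)
    return normalizer * math.exp(exponent)


# Density at the center of the distribution used in the example above
center = bivariate_gaussian_pdf(5, 5, mu_x=5, mu_y=5, sigma_x=1, sigma_y=1, rho=0)
print(center)
```

With `rho=0` the two variables are independent, so the density is the product of two one-dimensional Gaussian densities.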
This project was developed with the help of hints and foundational code provided by Udacity as part of its AI Programming with Python program.