
410243: Data Analytics Lab Practicals Repository

Primary Language: Python

Data Analytics Lab

LAB 1. IRIS Dataset Analysis

Download the Iris flower dataset, or any other dataset, into a DataFrame (e.g., https://archive.ics.uci.edu/ml/datasets/Iris).
Use Python or R to perform the following:

  • How many features are there and what are their types (e.g., numeric, nominal)?
  • Compute and display summary statistics for each feature in the dataset (e.g., minimum value, maximum value, mean, range, standard deviation, variance, and percentiles).
  • Data visualization: create a histogram for each feature in the dataset to illustrate the feature distributions. Plot each histogram.
  • Create a boxplot for each feature in the dataset. All of the boxplots should be combined into a single plot. Compare distributions and identify outliers.
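
The LAB 1 steps above can be sketched in Python with pandas roughly as follows. This is a minimal sketch, not the repository's own solution: the column names, the helper functions (load_iris, feature_types, summary_statistics, plot_distributions) and the six inline sample rows are assumptions made for illustration. The UCI iris.data file has no header row, so names are supplied by hand. Plotting assumes matplotlib is installed and is kept in a separate function that is only run when called.

```python
import io

import pandas as pd

# Assumed column names: the UCI iris.data file ships without a header row.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# A few rows in the iris.data layout so the sketch runs offline.
# For the full dataset, pass the path of a downloaded iris.data file instead.
SAMPLE = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""


def load_iris(source):
    """Read a headerless Iris CSV (path or file-like object) into a DataFrame."""
    return pd.read_csv(source, header=None, names=COLUMNS)


def feature_types(df):
    """Label each column as numeric or nominal based on its pandas dtype."""
    return {
        col: "numeric" if pd.api.types.is_numeric_dtype(df[col]) else "nominal"
        for col in df.columns
    }


def summary_statistics(df):
    """Return min, max, mean, std, percentiles, range and variance per numeric feature."""
    numeric = df.select_dtypes(include="number")
    stats = numeric.describe().T  # count, mean, std, min, 25%, 50%, 75%, max
    stats["range"] = stats["max"] - stats["min"]
    stats["variance"] = numeric.var()  # sample variance (ddof=1)
    return stats


def plot_distributions(df):
    """Draw one histogram per numeric feature, then all boxplots on a single axes."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    numeric = df.select_dtypes(include="number")
    numeric.hist(bins=10, figsize=(8, 6))  # one subplot per feature
    fig, ax = plt.subplots(figsize=(8, 4))  # separate figure for the boxplots
    numeric.boxplot(ax=ax)  # points beyond the whiskers are candidate outliers
    ax.set_title("Feature distributions")
    plt.show()


df = load_iris(io.StringIO(SAMPLE))
print(len(df.columns), "columns:", feature_types(df))
print(summary_statistics(df))
```

Calling plot_distributions(df) afterwards opens the histogram grid and the combined boxplot figure. With only six sample rows the plots are not meaningful, so load the full dataset before comparing distributions.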

LAB 2. Pima Indians Diabetes Dataset Analysis (W.I.P)

Download the Pima Indians Diabetes dataset. Use the Naive Bayes algorithm for classification.

  • Load the data from CSV file and split it into training and test datasets.
  • Summarize the properties of the training dataset so that we can calculate probabilities and make predictions.
  • Classify samples from the test dataset using the summarized training dataset.
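
The three LAB 2 steps above can be sketched with the standard library only. This is a minimal sketch under stated assumptions, not the repository's implementation: it assumes a Gaussian Naive Bayes model (each feature treated as normally distributed within a class), a headerless numeric CSV whose last column is the class label, and hypothetical helper names (load_csv, train_test_split, summarize_by_class, predict, accuracy). The demo at the bottom uses synthetic data, not the Pima dataset, so it runs without a download.

```python
import csv
import math
import random
from collections import defaultdict


def load_csv(path):
    """Read a headerless numeric CSV; the last column is assumed to be the label."""
    with open(path, newline="") as f:
        return [[float(x) for x in row] for row in csv.reader(f) if row]


def train_test_split(rows, test_ratio=0.33, seed=1):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]


def summarize_by_class(train):
    """Per class: prior probability plus (mean, std) for every feature."""
    by_class = defaultdict(list)
    for row in train:
        by_class[row[-1]].append(row[:-1])
    summaries = {}
    for label, rows in by_class.items():
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            var = sum((x - mean) ** 2 for x in col) / max(len(col) - 1, 1)
            stats.append((mean, math.sqrt(var) or 1e-9))  # avoid a zero std
        summaries[label] = (len(rows) / len(train), stats)
    return summaries


def log_gaussian(x, mean, std):
    """Log of the normal density; logs avoid underflow when multiplying."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def predict(summaries, features):
    """Return the class with the highest log prior plus log likelihood."""
    scores = {
        label: math.log(prior)
        + sum(log_gaussian(x, m, s) for x, (m, s) in zip(features, stats))
        for label, (prior, stats) in summaries.items()
    }
    return max(scores, key=scores.get)


def accuracy(summaries, test):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict(summaries, row[:-1]) == row[-1] for row in test)
    return hits / len(test)


# Synthetic stand-in data (NOT the Pima dataset): two well-separated classes.
rng = random.Random(0)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1), 0.0] for _ in range(100)]
rows += [[rng.gauss(5, 1), rng.gauss(5, 1), 1.0] for _ in range(100)]

train, test = train_test_split(rows)
model = summarize_by_class(train)
print(f"accuracy on synthetic data: {accuracy(model, test):.2f}")
```

To run this on the real data, replace the synthetic rows with load_csv called on the path of the downloaded file. Some distributions of the dataset include a header row, which would have to be skipped first.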