/Data-Mining

Individual and group projects for this course



Individual Project

1-1 Random number generation:

1-1.1 Generate three streams (1D, 2D, and 3D) of random numbers with 1,000 samples. You may use the Matlab command rand.

1-1.2 Visualize the generated samples. You may use a scatterplot.

1-1.3 Compute the histogram of each of the three streams, then normalize each histogram so that it becomes a probability density function (pdf).

1-1.4 Visualize the pdfs of the three streams. Are the samples uniformly distributed? Do the pdfs represent a standard uniform distribution? Comment.
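Steps 1-1.1 and 1-1.3 can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the course's reference solution: the helper names `uniform_stream` and `histogram_pdf`, the choice of 10 bins per axis, and the fixed seed are all illustrative. The histogram is taken jointly over all dimensions and divided by (number of samples x cell volume), so that the density integrates to 1.

```python
import random
from collections import Counter

def uniform_stream(n_samples, n_dims, seed=0):
    """Return n_samples points, each a tuple of n_dims draws from U(0, 1)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(n_dims)) for _ in range(n_samples)]

def histogram_pdf(points, n_bins=10):
    """Joint histogram over [0, 1)^d, normalized to a density.

    Returns a dict mapping each non-empty cell (a tuple of bin indices)
    to its estimated density. Empty cells are omitted (density 0).
    """
    n_dims = len(points[0])
    cell_volume = (1.0 / n_bins) ** n_dims
    counts = Counter(
        tuple(min(int(x * n_bins), n_bins - 1) for x in p) for p in points
    )
    return {cell: c / (len(points) * cell_volume) for cell, c in counts.items()}

# Three streams of 1,000 samples: 1D, 2D and 3D (hypothetical seed values).
streams = {d: uniform_stream(1000, d, seed=d) for d in (1, 2, 3)}
pdfs = {d: histogram_pdf(pts) for d, pts in streams.items()}

# Print the 1D density estimate, one value per bin.
print([round(pdfs[1].get((i,), 0.0), 2) for i in range(10)])
```

A standard uniform density equals 1 everywhere on the unit cube, so the estimates should scatter around 1. With 10 bins per axis, the 3D histogram has 1,000 cells for 1,000 samples, so its per-cell estimates are expected to be much noisier than the 1D ones; this is worth keeping in mind when commenting on 1-1.4.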

1-2 Image manipulation – the image LenaGrey consists of 512x512 pixels with intensities from 0 to 255

1-2.1 Import LenaGrey and display the image.

1-2.2 Calculate its mean, standard deviation, median, min, max, and mode.

1-2.3 Plot the histogram of LenaGrey.

1-2.4 With the normalized intensity as the third dimension, plot its 3D shape (this is not the true 3D shape, but it gives some 3D impression).
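The statistics, histogram, and normalized height map from 1-2.2 to 1-2.4 can be sketched as follows. Because loading the actual image needs an image library, this sketch uses a synthetic 512x512 image of random intensities as a stand-in for LenaGrey; that stand-in, and the variable names, are assumptions for illustration only. Standard deviation is computed as the population standard deviation, which is one reasonable reading of the task.

```python
import random
import statistics

# Stand-in for LenaGrey (assumption): a synthetic 512x512 image with
# integer intensities in 0..255. Replace with the real pixel data.
rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(512)] for _ in range(512)]

# Flatten to a single list of pixel intensities.
pixels = [p for row in image for p in row]

stats = {
    "mean": statistics.fmean(pixels),
    "std": statistics.pstdev(pixels),   # population standard deviation
    "median": statistics.median(pixels),
    "min": min(pixels),
    "max": max(pixels),
    "mode": statistics.mode(pixels),    # first mode if there are ties
}

# 256-bin histogram: histogram[v] is the number of pixels with intensity v.
histogram = [0] * 256
for p in pixels:
    histogram[p] += 1

# Normalized intensity in [0, 1], usable as the z-coordinate of a surface
# plot over the (row, column) grid.
surface = [[p / 255.0 for p in row] for row in image]

for name, value in stats.items():
    print(name, value)
```

The `histogram` list can be passed to any bar-plot routine, and `surface` to any 3D surface-plot routine, to complete 1-2.3 and 1-2.4.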

1-3 Image range reduction – partition the image intensity range into several bins and check how the image appearance changes

1-3.1 Partition the image intensity into 2 bins, i.e., change the image to a 1-bit image (binary image).

1-3.2 Partition the image intensity into 3, 4, 5, 6, and 7 bins to check how the image quality changes compared with the original Lena image (8-bit image with intensity range from 0 to 255).
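One possible way to implement the range reduction in 1-3 is equal-width binning, sketched below. The function name `quantize`, the choice to replace each pixel by the midpoint of its bin, and the small gradient test image are assumptions made for illustration; mapping the two bins of a binary image to 0 and 255 instead would be an equally valid choice.

```python
def quantize(image, n_bins, max_value=255):
    """Map each intensity to the midpoint of one of n_bins equal-width bins."""
    bin_width = (max_value + 1) / n_bins
    result = []
    for row in image:
        new_row = []
        for p in row:
            # Index of the bin that contains p, clamped to the last bin.
            idx = min(int(p / bin_width), n_bins - 1)
            # Representative intensity: the (truncated) midpoint of that bin.
            new_row.append(int((idx + 0.5) * bin_width))
        result.append(new_row)
    return result

# Stand-in image (assumption): 4 rows of a horizontal gradient 0, 17, ..., 255.
image = [list(range(0, 256, 17)) for _ in range(4)]

for k in (2, 3, 4, 5, 6, 7):
    reduced = quantize(image, k)
    levels = sorted({p for row in reduced for p in row})
    print(k, "bins ->", levels)
```

With 2 bins the threshold falls at 128, which gives the binary image of 1-3.1. Applying the same function with 3 to 7 bins to the real image shows how quickly the visible banding fades as the number of intensity levels grows.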


Group Project

  1. The input should contain more than 1,000 data points or samples.

  2. You need to write your own code for at least one classification or clustering method; that is, you may use any library or function except for the one method you are required to implement yourself.

  3. Documentation includes title, names/student IDs, aim, method, implementation result, discussion, conclusion, and code.

What we choose to do

Clustering Australian weather using minimum and maximum temperatures (1,000 data samples).
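The README does not say which clustering method the group implemented. As one illustration of a hand-written method that would satisfy requirement 2, the sketch below implements k-means (Lloyd's algorithm) on two-dimensional (min temperature, max temperature) points. Everything here is an assumption: the choice of k-means, the function name `kmeans`, k = 2, and the synthetic data, which is randomly generated and is not the Australian weather dataset.

```python
import math
import random

def kmeans(points, k, n_iters=100, seed=0):
    """Plain k-means: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initialize from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids did not move
        centroids = new_centroids
    return centroids, labels

# Synthetic stand-in data (NOT real weather records): 1,000 points drawn
# from a "cool" and a "warm" group of (min_temp, max_temp) pairs.
rng = random.Random(1)
points = (
    [(rng.gauss(5, 2), rng.gauss(15, 2)) for _ in range(500)]
    + [(rng.gauss(20, 2), rng.gauss(32, 2)) for _ in range(500)]
)

centroids, labels = kmeans(points, k=2)
for c in sorted(centroids):
    print("centroid (min_temp, max_temp):", tuple(round(x, 1) for x in c))
```

To use it on the real data, replace `points` with the list of (min temperature, max temperature) pairs read from the dataset; libraries can still be used for loading and plotting, since only the clustering step itself is written by hand.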