/Statistical-Methods-and-Machine-Learning-in-R

This is an initiative to help readers understand statistical methods and machine learning in a simple, beginner-friendly manner. You will find scripts and theoretical content that clarify the concepts, especially for Bio-Informatics students.


Statistical Methods & Machine Learning with R

Click on the link above to visit our website.

This repo has been created and organized to serve as a mini guide for students with little or no background in Computer Science, especially those in Bio-Informatics. It aims to give a general idea of the underlying concepts of Statistics, leading on to Multivariate Statistics and ultimately to Machine Learning algorithms. We wrote R scripts for each step of this pathway as we proceeded, and we provide them in our repository along with the corresponding CSV files used while scripting.

  • Download our code and see for yourself how the process works in our elaborately commented scripts.
  • Visit our wiki, which is designed to deepen your understanding of the concepts. It lists the various sources we learnt from and shows actual images of the results we obtained while working on the concepts & scripts.

We have tried to acknowledge every source that helped us create this repository. All of our own content is open source; feel free to use it for learning, teaching, or creating repositories.

Happy Learning!!


TABLE OF CONTENTS:

  • Tools(RScripts) for Data Analysis in Bio-Informatics.
  • Theoretical Concepts behind the Analyses being performed.
  • Lectures + Tutorials for Deeper Understanding.

Tools (RScripts) for Data Analysis in Bio-Informatics:

An important aspect of data in Bio-Informatics is that it is often not organized with Rows representing Instances and Columns representing Variables. Rather, it is mostly the other way round, for example in population biology and sample studies. Our code is therefore tailored to work with datasets that are configured as follows:

                   Column(Instance) 1      Column(Instance) 2     . . . .    Column(Instance) n
  
  Row(Variable) 1      (Value)                  (Value)           . . . .         (Value)
  
  Row(Variable) 2      (Value)                  (Value)           . . . .         (Value)
  
  .                       .                        .                                 .
  .                       .                        .                                 .
  .                       .                        .                                 .
  .                       .                        .                                 .
  
  Row(Variable) m      (Value)                  (Value)           . . . .         (Value)
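
The layout above can be illustrated with a small R example. This is a minimal sketch and is not taken from the repository's scripts: the data values and the names `raw`, `tidy`, `Protein_A`, `Sample_1` and so on are hypothetical. It builds a toy dataset with variables in rows and instances in columns, then transposes it for functions that expect instances in rows.

```r
# Hypothetical toy data in the layout described above:
# rows are variables, columns are instances (samples).
raw <- data.frame(
  Sample_1 = c(5.1, 0.2, 13.0),
  Sample_2 = c(4.8, 0.4, 11.5),
  Sample_3 = c(6.0, 0.1, 14.2),
  Sample_4 = c(5.5, 0.3, 12.7),
  row.names = c("Protein_A", "Protein_B", "Protein_C")
)

# t() swaps rows and columns and returns a matrix;
# as.data.frame() converts the result back into a data frame.
tidy <- as.data.frame(t(raw))

# raw has 3 variables (rows) and 4 instances (columns);
# tidy has 4 instances (rows) and 3 variables (columns).
print(dim(raw))
print(dim(tidy))
print(tidy)
```

For a CSV file in the same layout, one option is to load it with `read.csv(path, row.names = 1)` before transposing, assuming the first column holds the variable names; `path` is a placeholder for your own file.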

Our team has uploaded all the scripts in the R language. This may help you understand how programming is used in Bio-Informatics. Moreover, under the provided links you will find subfolders, each containing a dataset file, an R script, and a Read_me file (a description of the program). To access the scripts, click on the links mentioned below.

Theoretical Concepts:

Apart from the code, our team has also gathered the theoretical background for all the analyses that we automated using R. To reach that section, click on the link below:

Theory and Roadmap

Lectures + Tutorials:

We have also created a series of presentations & R Scripts which can act as a complete tutorial for an individual or a group to learn Statistical Methods & Machine Learning with R.

The tutorial consists of presentation files along with an RScript that can be run while going through the slides. Each exercise also comes with a task that can serve as an assessment for the learner.

  • Click on the following link to download the Lecture + Tutorial folder: Statistical Methods & Machine Learning with R (open it in a new tab if clicking does not start the download, or visit http://www.mpa.ovgu.de/index.php/statistical-analysis/).

  • The files inside the downloaded folder are password protected. To obtain the password and to receive the solutions for the tasks in each exercise, please send an e-mail to heyer@mpi-magdeburg.mpg.de.

  • There are Read Me files in each exercise to guide you through the folders.

  • Many thanks to Julian Lange, Daniel Walke & Max Wolf for their valuable feedback and for making this tutorial possible.

The tutorial has the following content (the download link for our tutorial is provided after the content):


ALL THE PERMALINKS BELOW WILL GUIDE YOU TO OUR RSCRIPTS FOR YOUR UNDERSTANDING


Visit our page at the MPA website

http://www.mpa.ovgu.de/index.php/statistical-analysis/

  • Special thanks to Mr Kay Schallert for making this webpage possible.

Do not forget to clone the repository to your GitHub Desktop.
This way you will get the folders on your desktop (with all the datasets and scripts).


About Team:

Dr.-Ing. Robert Heyer
E-mail: robert.heyer@ovgu.de
LinkedIn: https://www.linkedin.com/in/dr-ing-robert-heyer-a288a219b/

Faizan Ali
M.Sc. Digital Engineering (OVGU-Magdeburg, Germany)
E-mail: faizan1.ali@st.ovgu.de
LinkedIn: https://www.linkedin.com/in/faizanalidataengineer/

Rahul Mondal
M.Sc. Digital Engineering (OVGU-Magdeburg, Germany)
E-mail: rahulmondal415@gmail.com
LinkedIn: https://www.linkedin.com/in/rahul-mondal-0241b719b/

Syed Abdullah Rizvi
M.Sc. Digital Engineering (OVGU-Magdeburg, Germany)
E-mail: syedabdullahrizvi@gmail.com
LinkedIn: linkedin.com/in/abdullahrizvi/

Ammar Ateeq
M.Sc. Data Knowledge Engineering (OVGU-Magdeburg, Germany)
E-mail: ammarateeq@hotmail.com
LinkedIn: https://www.linkedin.com/in/ammar-ateeq-291b94a7/