aqmen_data_science_dw
This is an AQMEN repository for:
Data Wrangling & Munging
Organising, Managing and Enabling Data for Analysis
Stata Workshop (July 2018)
A two-day, hands-on workshop led by Professor John MacInnes and Dr Diarmuid McDonnell, University of Edinburgh.
Topics:
The course introduces participants to the skills required to discover data sources and organise raw data for data analysis. Practical operations such as matching and merging files, recoding measures, documenting data and storing data will be covered. There will be an emphasis on developing accurate, efficient, transparent and reproducible working practices when analysing data.
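As a rough illustration of the kind of operations listed above, the following is a minimal sketch in Stata. It is not taken from the workshop materials. It assumes Stata's built-in auto dataset, and the variable name mpg_band and the output file name auto_wrangled are hypothetical choices made only for this example.

```stata
* Minimal sketch only: uses Stata's built-in auto dataset, not workshop data.
sysuse auto, clear

* Create two files that share the key variable "make".
preserve
keep make price mpg
tempfile prices
save `prices'
restore
keep make weight foreign

* Matching and merging: one-to-one merge on the key variable.
merge 1:1 make using `prices'
assert _merge == 3      // every record should match in this example
drop _merge

* Recoding a measure: group mpg into three bands (cut points are illustrative).
recode mpg (min/19 = 1 "Low") (20/24 = 2 "Medium") (25/max = 3 "High"), generate(mpg_band)

* Documenting data: label the new variable and attach a note to it.
label variable mpg_band "Fuel economy band (recoded from mpg)"
notes mpg_band: Derived from mpg using illustrative cut points.

* Storing data: save under a hypothetical file name.
save auto_wrangled, replace
```

Saving these commands in a do-file, rather than typing them interactively, keeps a record of every step and supports the transparent and reproducible working practices the course emphasises.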
Rationale:
There are many organisations that increasingly require employees to understand data, and to analyse data using statistical methods.
The known universe of data with potential value for organisations is constantly expanding. New forms of data are not usually in a format that is ready to analyse. Datasets are often unsystematic and messy, and they require a great deal of work before they can be used for data analysis. Data wrangling, sometimes referred to as data munging or data enabling, is the process of organising raw data and transforming it into formats that enable data analysis.
Many organisations report that they lack data analysis skills. Other organisations report that some employees have skills but lack confidence in exercising them. This lack of skills and capacity is currently a major obstacle to some organisations undertaking data analysis.
This course will provide a fast-track, comprehensive introduction for employees wishing to rapidly improve their data wrangling skills.
Reading List:
These links are to Amazon (https://www.amazon.co.uk) but there are a number of other reputable academic booksellers (e.g. https://blackwells.co.uk/bookshop/shops/).
Stata
Pevalin, D. and Robson, K., 2009. The Stata survival manual. McGraw-Hill Education (UK).
https://www.amazon.co.uk/stata-survival-manual-Pevalin/dp/0335223885/ref=sr_1_1?ie=UTF8&qid=1530725105&sr=8-1&keywords=stata+survival+manual
Many students like this introductory textbook.
Mehmetoglu, M. and Jakobsen, T.G., 2016. Applied statistics using Stata: a guide for the social sciences. Sage.
https://www.amazon.co.uk/Applied-Statistics-Using-Stata-Sciences/dp/1473913233/ref=sr_1_2?s=books&ie=UTF8&qid=1530725649&sr=1-2&keywords=Stata
This is a first-class textbook. It is clearly written and very comprehensive.
Kohler, U. and Kreuter, F., 2012. Data analysis using Stata. Stata press.
https://www.amazon.co.uk/Data-Analysis-Using-Stata-Third/dp/1597181102/ref=sr_1_1?s=books&ie=UTF8&qid=1530725888&sr=1-1&keywords=kohler+and+kreuter
This is a first-class textbook. It is clearly written and very comprehensive, and it has been used successfully as a core textbook on several courses that I have taught.
Workflow
Gayle, V.J. and Lambert, P.S., 2017. The Workflow: A Practical Guide to Producing Accurate, Efficient, Transparent and Reproducible Social Survey Data Analysis. NCRM Working Paper. NCRM.
http://eprints.ncrm.ac.uk/4000/
A practical guide to the data analysis workflow.
Long, J.S., 2009. The workflow of data analysis using Stata. College Station, TX: Stata Press.
https://www.amazon.co.uk/Workflow-Data-Analysis-Using-Stata/dp/1597180475/ref=sr_1_1?s=books&ie=UTF8&qid=1530726163&sr=1-1&keywords=stata+workflow
A fantastic book. This is the 'bible' of good data analysis workflow practices.
Longitudinal Data Analysis
Gayle, V. and Lambert, P., 2018. What is Quantitative Longitudinal Data Analysis? Bloomsbury Publishing.
https://www.amazon.co.uk/Quantitative-Longitudinal-Analysis-Research-Methods/dp/1472515404/ref=sr_1_1?s=books&ie=UTF8&qid=1530726864&sr=1-1&keywords=vernon+gayle
My recent book on longitudinal data analysis (using Stata).
Useful Stata Related Websites
https://stats.idre.ucla.edu/stata/
https://www.stata.com
https://www.stata.com/links/resources-for-learning-stata/
http://www.restore.ac.uk/Longitudinal/
Cheat Sheets (Stata)
https://www.stata.com/links/resources-for-learning-stata/
https://geocenter.github.io/StataTraining/portfolio/01_resource/
https://www.datasciencecentral.com/group/resources/forum/topics/stata-cheat-sheet
Two Page Guide to Stata
http://www-personal.umich.edu/~agrogan/stata/TwoPageStata.pdf