AllenDowney/ThinkStats2

Utility of Exercise 1.2

julienstark opened this issue · 2 comments

The book is great and the concepts are well explained. However, I struggle to understand the educational value of exercise 1.2, which basically asks us to write a function that reads the respondent file, 2002FemResp.dat.gz.

This is, more or less, a copy-paste of the file nsfg.py, which we already had the opportunity to review in the introduction notes. One can argue that we shouldn't review nsfg.py before writing functions for this exercise, but the "ReadFemResp" function is actually quite hard to design without any hints. This is not because of the body of the function itself, but because of its dependencies on thinkstats2, which does some heavy parsing.

So I'm still a bit confused about the purpose of this exercise. Am I supposed to write the program while reviewing nsfg.py (in which case I literally just need to copy-paste a huge portion of the code)? Or should I design the functions from scratch? That would take me a lot of time spent reviewing the available classes and methods in thinkstats2.py and trying to understand how the dict file is parsed, which seems a bit too overwhelming for a chapter 1 exercise targeted at people with no prior experience with Pandas.

Should additional information be provided, or am I, most likely, missing something?
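
For context on the dict-file parsing mentioned above, here is a minimal standard-library sketch of the general idea: read a column dictionary, then slice each fixed-width line accordingly. It is not the code from thinkstats2.py or nsfg.py. The sample dictionary, the sample data lines, and the helper names `parse_dictionary` and `read_fixed_width` are all made up for illustration, and the dictionary layout is only assumed to resemble a Stata-style `.dct` file.

```python
import re

# Made-up dictionary in the general style of a Stata .dct file
# (an assumption for illustration, not copied from 2002FemResp.dct).
SAMPLE_DCT = """\
infile dictionary {
    _column(1)   str12  caseid    %12s  "RESPONDENT ID NUMBER"
    _column(13)  byte   pregnum   %2f   "NUMBER OF PREGNANCIES"
    _column(15)  byte   age       %2f   "AGE AT INTERVIEW"
}
"""

# Made-up fixed-width records matching the dictionary above.
SAMPLE_DATA = [
    "        2298 327",
    "        5012 542",
]

# Captures: 1-based start column, type, variable name, field width, description.
LINE_RE = re.compile(r'_column\((\d+)\)\s+(\S+)\s+(\S+)\s+%(\d+)\S*\s+"([^"]*)"')


def parse_dictionary(text):
    """Return a list of (name, type, start, end) tuples with 0-based slices."""
    variables = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip the header and closing-brace lines
        start, vtype, name, width, _desc = match.groups()
        start = int(start) - 1  # dictionary columns are 1-based
        variables.append((name, vtype, start, start + int(width)))
    return variables


def read_fixed_width(lines, variables):
    """Slice each line into a dict of values, converting non-string fields to int."""
    rows = []
    for line in lines:
        row = {}
        for name, vtype, start, end in variables:
            raw = line[start:end].strip()
            if vtype.startswith("str"):
                row[name] = raw
            else:
                row[name] = int(raw) if raw else None  # blank field -> missing
        rows.append(row)
    return rows


if __name__ == "__main__":
    variables = parse_dictionary(SAMPLE_DCT)
    for row in read_fixed_width(SAMPLE_DATA, variables):
        print(row)
```

In the book's own code this work is delegated to thinkstats2, so a solution to the exercise would not need to reimplement it; the sketch is only meant to show what kind of parsing sits behind that dependency.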

I got a lot out of this exercise. Although it is somewhat of a repeat, it forced me to think things through and helped me put a lot of pieces together.

I'm new to data science, so it may not be as helpful for more experienced folks, but I found it very useful.