resbaz/lessons

General thoughts/suggestions on NLTK content

A few comments based on experience/reflection from day 1 of ResBaz - feel free to discuss, discard, or modify as appropriate...

  • when introducing significant whitespace, should we use the concept of a 'code block' eg see http://en.wikipedia.org/wiki/Block_%28programming%29
  • when talking about lists the exercise/demo uses a list sent4 or sent7 that was defined elsewhere and invisible to the learner. Would it be better to have the user create two new lists from scratch, and then see how you can join and manipulate them?
  • some examples use the Python print and others just type the variable name and have the interpreter display the contents. We should be consistent and (I suggest) always use print variable - this way learners get used to the idea of using lots of print statements to look inside their variables, a useful debugging skill.
  • defining variables challenge 1 - some people tried to solve the more generic problem, and write a function that will recognise a ; in (any) array and store/print anything before it - but were stumped because of lack of skills/practice yet - perhaps reword this example so it's clear we 'know' what is in the array and you want us to count and slice given a known content.
  • similar to the last point - in the first challenge that follows, some folks tried to write a general function that will take four corpora as input and compare them (which proved to be difficult with current practice/skills/knowledge) - but I think the aim is just to write one function that will take one input text and return the top 15 most common words.
  • occasionally an example in the explanation/notes is one that you wouldn't want the user to actually type because the output is too large - eg sorted(whole_corpora) or array[8:] type things - check that all examples are "runnable" if the user tries them on the loaded texts.
  • the Python construct [len(w) for w in text] etc - I think it might be better to write these out longform - especially as learners haven't been introduced to for loops yet.
  • the challenge to write code that will find all the words in a text that are more than seven letters long and occur more than seven times - this requires the use of an and conditional, which wasn't introduced earlier...
  • variable name - sometimes we use w for "word" and sometimes we use word - be consistent. Perhaps the longer form is easier for learners to follow than w
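
For the "top 15 most common words" point, here is a rough sketch of the single-text version of the challenge. It is only an illustration: `top_words`, `how_many` and `toy_text` are names made up here, and `collections.Counter` from the standard library stands in for whatever frequency tool the lesson actually uses.

```python
from collections import Counter

def top_words(text, how_many=15):
    """Return the most frequent words in ONE text (a list of word strings)."""
    counts = Counter(text)  # maps each word to how often it occurs
    top = []
    for word, count in counts.most_common(how_many):
        top.append(word)
    return top

# Hypothetical toy text, small enough that the output is easy to read.
toy_text = ['the', 'cat', 'sat', 'on', 'the', 'mat', 'the', 'end', 'cat']
print(top_words(toy_text, 2))
```

Comparing four corpora then just means calling the same function four times, once per text, rather than writing anything more general.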

On 16 February 2015 at 20:27, Cameron McLean notifications@github.com
wrote:

A few comments based on experience/reflection from day 1 of ResBaz - feel
free to discuss, discard, or modify as appropriate...

  • when introducing significant whitespace, should we use the concept
    of a 'code block' eg see
    http://en.wikipedia.org/wiki/Block_%28programming%29
  • when talking about lists the exercise/demo uses a list sent4 or sent7
    that was defined elsewhere and invisible to the learner. Would it be better
    to have the user create two new lists from scratch, and then see how you
    can join and manipulate them?

I think we could repeat the lines declaring the variables, but commented out,
as a reminder that they had been declared previously.
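
Something like the sketch below, perhaps. The list names and contents are invented for illustration (they are not the real sent4 or sent7), and the slice assumes we already know where the ; sits, as suggested for the reworded challenge.

```python
# Earlier in the lesson (hypothetical lists, not the real sent4 or sent7):
first_list = ['one', 'two', ';', 'three']
second_list = ['four', 'five']

# Later in the lesson, repeat the declarations as comments, as a reminder:
# first_list = ['one', 'two', ';', 'three']
# second_list = ['four', 'five']

# Join the two lists and look at the result.
joined = first_list + second_list
print(joined)
print(len(joined))

# We know the ';' is the third item, so slice out everything before it.
before_semicolon = first_list[:2]
print(before_semicolon)
```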

  • some examples use the Python print and others just type the variable
    name and have the interpreter display the contents. We should be consistent
    and (I suggest) always use print variable - this way learners get used
    to the idea of using lots of print statements to look inside their
    variables, a useful debugging skill.

You are right, I was removing the print statements from the day-1 material
because I am a lazy programmer. But it's probably better to put them in as
you say.
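
A small sketch of the difference, with a made-up variable `greeting`. It is written as `print(greeting)` with parentheses so that it also runs on Python 3; the `print variable` form mentioned above is the Python 2 spelling of the same thing.

```python
greeting = ['hello', 'world']  # hypothetical variable for illustration

# In the interactive interpreter, a bare name echoes its value,
# but the same line in a script file displays nothing.
greeting

# print displays the value both in the interpreter and in a script.
print(greeting)
```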

  • defining variables challenge 1 - some people tried to solve the more
    generic problem, and write a function that will recognise a ; in the
    array and store/print anything before it - but were stumped because of lack
    of skills/practice yet - perhaps reword this example so it's clear we
    'know' what is in the array and you want us to count and slice given a
    known content.
  • similar to the last point - in the first challenge that follows, some
    folks tried to write a general function that will take four corpora as
    input and compare (which proved to be difficult with current
    practice/skills/knowledge) - but I think the aim is just to write one
    function that will take one input text and return the top 15 most common
    words.
  • occasionally an example in the explanation/notes is one that you
    wouldn't want the user to actually type because the output is too large -
    eg sorted(whole_corpora) or array[8:] type things - check that all
    examples are "runnable" if the user tries them on the loaded texts.
  • the Python construct [len(w) for w in text] etc - I think it might be
    better to write these out longform - especially as learners haven't been
    introduced to for loops yet etc.

That's fair. I suggested to Daniel that the most idiomatic python be used,
but that isn't always the easiest to understand.
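
For comparison, a sketch of the two forms side by side. `text` here is a small invented list standing in for a loaded text, and the loop uses `word` rather than `w`, following the naming suggestion in this issue.

```python
text = ['a', 'short', 'toy', 'sentence']  # hypothetical stand-in for a loaded text

# Longform: build the list of word lengths one step at a time.
lengths = []
for word in text:
    lengths.append(len(word))
print(lengths)

# Compact, idiomatic form that produces the same list.
compact = [len(word) for word in text]
print(compact)
```

The longform version could come first, with the compact one introduced later as a shorthand once for loops are familiar.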

  • the challenge to write code that will find all the words in a text
    that are more than seven letters long and occur more than seven times -
    this requires the use of an and conditional, which wasn't introduced
    earlier...

Frame it as the intro to the conditional?
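
If so, the challenge solution might look roughly like this. It is a sketch on invented toy data, with `collections.Counter` standing in for whatever frequency counting the lesson uses at that point.

```python
from collections import Counter

# Hypothetical toy text: one long frequent word, one long rare word, one short frequent word.
text = ['whitespace'] * 8 + ['interpreter'] * 3 + ['the'] * 10

counts = Counter(text)  # maps each word to how often it occurs

long_and_frequent = []
for word in counts:
    # Both tests must be true for the word to be kept.
    if len(word) > 7 and counts[word] > 7:
        long_and_frequent.append(word)

print(long_and_frequent)
```

Only 'whitespace' passes both tests here, which makes it easy to show what happens when either half of the `and` is false.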

  • variable name - sometimes we use w for "word" and sometimes we use
    word - be consistent. Perhaps the longer form is easier for learners
    to follow than w

Agreed.

cheers
L.