freeCodeCamp/CurriculumExpansion

Probability Calculator Calculator (Certification Project)

Closed this issue ยท 31 comments

Create project from the Python scientific computing certification.

Completed project: https://repl.it/@borntofrappe/fcc-probability-calculator


The README file is quite exhaustive and considerably longer than the previous challenges. It is overall understandable, but it might take a couple of reads to sink in.

Two minor corrections:

This method should remove two balls at random from contents and return those balls as a list of strings.

The text should refer to the number of balls specified by the input arguments, not two.

balls_to_draw: An object indicating the exact group of balls to attempt to draw from the bag for the experiment.

Here we refer to a bag, while the README sets up the ball in a hat. Not that big of an issue, but worth a small change.

For the contents of the README, I have also two suggestions:

  1. the last test performs a draw in which the number of balls exceeds the number of balls available in the contents list. Should we make specific mention of this case when describing the Hat class? Something along the lines of "if the number of balls exceeds the available quantity, return the entire list"

  2. prob_calculator.py begins by importing two modules in random and copy. Similarly to the number of draws, should we make specific mention of this? I foolishly developed the project locally and didn't realize I needed a random function or a copy function until late in the development.

My only fear is that the README will be excessively long, but especially the first point might help campers understanding the error message presented by the testing suite.

ValueError: empty range for randrange() (0, 0, 0)

Thank you for your review, @borntofrappe. Great catch with the ball/hat mismatch. That's something I totally missed in my initial review. We'll definitely fix this before release.

If the number of balls being drawn is greater than the total number in the hat, having the Hat class return the whole list or a specific message is a good idea. What do you think @beaucarnes?

For your second point, we may not want to mention this in the README. We generally expect people to develop these projects on Repl.it, and don't want to require any modules specifically unless it's something like Pandas. Did you end up rolling your own random and copy? If so, did they work when you ran the tests?

Also, when did you see that error message? I wasn't able to trigger it with the included solution.

Thanks for the quick response @scissorsneedfoodtoo

I didn't roll up my own version. I found the random module in the official docs, but I felt guilty importing the library (espcially since the previous problems didn't use any external code). I went back to the REPL to give the project another read and saw the two lines at the top. The rest, I guess, is documented in my code :)

The error message appears if you comment out these lines of code.

contents = self.contents
        if(len(contents) <= num_balls_drawn):
            return contents

This is where I return the entire list if the number exceeds the available comment.

@borntofrappe, no need to feel guilty! We generally encourage people to use whatever modules or libraries they want while working on these required projects, especially if they're part of the language.

Thank you for that code snippet. We'll look more into it and consider changing the requirements / adding a test if num_balls_drawn exceeds the total number of balls in the hat.

@borntofrappe Thanks for your suggestions. I implemented most of them.

As far as mentioning the random and copy modules, those are actually specifically mentioned but maybe not in the location you were expecting. Here are the first three lines of the starter code (prob_calculator.py):

import copy
import random
# Consider using the modules imported above.

Do you think it would be better to move that comment to the readme instead of the starter code file?

@beaucarnes I eventually noticed the modules in the .py file, and I'm going back and forth on whether or not there should be a specific mention in the README.

As @scissorsneedfoodtoo said, campers develop the project in the REPL. If we consider there won't be the solution file either, prob_calculator.py is going to be the first file opened after the markdown file.

That being said, it is also something novel with respect to previous problems, and personally, after I saw the two import statements I was able to plan out my approach much better.

Sorry not to have a clear suggestion. It's tough to consider how campers reach the problem and how they develop the project from the README.

@borntofrappe I added a hint in the readme about the imported modules.

Only a little spelling correction in the test_module.py.test_prob_experiment() function.
In both .assertEqual() calls, the string message reads "Expected experiemnt" where it should "Expected experiment".

I currently find it very hard to understand what is meant by

contents should be a list containing one item for each ball in the hat.

What is one item? a tuple, a new class, a dictionary?

@borntofrappe I added a hint in the readme about the imported modules.

Might it make sense to move this hint up to the draw method description so that the two are easier to mentally link?

This method should remove balls at random from contents and return those balls as a list of strings

To me, it is currently unclear if this is with or without put-back

I currently find it very hard to understand what is meant by

contents should be a list containing one item for each ball in the hat.

What is one item? a tuple, a new class, a dictionary?

@borntofrappe I added a hint in the readme about the imported modules.

Might it make sense to move this hint up to the draw method description so that the two are easier to mentally link?

This method should remove balls at random from contents and return those balls as a list of strings

To me, it is currently unclear if this is with or without put-back

Hi @lefthand3r , I'll give you my take on this.

contents must be a list with one item per ball/color. For example, if your hat is {"red": 2, "blue": 1}, contents should be ["red", "red", "blue"], a list of strings.

About your last question, yes, you do need to remove the extracted balls while running each single experiment. When you finish an experiment, you need to update your counters and somehow reset the hat by putting all the balls back. This way you will start a new experiment afresh.

Hope this helps.

Hey @jdelarubia - thank you for explaining- I was able to piece most of this together using the test cases.
However, I think it is not clearly written in the readme, so possibly the readme should be changed to clarify

  1. what specifically is expected to be an item- maybe add an example such as the one you presented - or is this part of the 'figure it out?'
  2. IMHO it could also be phrased differently in the readme what an experiment and what the draw function should do individually - right now it feels unclear if the draw method should do a copy or the experiment function - if wanted I would change some things in the readme and let others decide if it is more clear now?

Hi @lefthand3r , Adding examples to the readme it is indeed a good idea. However, the specification for a problem (any problem, really) is never complete. On the positive side, we have the project issues page and the Discord #help channel. In my view, we better off incorporate those help channels to our toolbox rather than trying to solo-code through all the problems (hey, a solo-coder talking here! ;) . Cheers!

edit: my solution - https://repl.it/@MaikRo/fcc-probability-calculator

I think I found a 'bug', which might be an edge case:

def draw_not_working(self, num_balls: int):
        # if the number of balls that should be drawn
        # exceed the number of available balls, return all balls
        if num_balls >= len(self.contents):
            return self.contents
        # randomly take out num_balls balls from the hat
        # and delete them from the hat subsequently
        balls_drawn = random.sample(self.contents, num_balls)
        for ball in balls_drawn:
            self.contents.remove(ball)
        return balls_drawn
def draw_working(self, num):
        num = min(num, len(self.contents))
        balls = []
        for _ in range(num):
            index = random.randint(0, len(self.contents) - 1)
            balls.append(self.contents.pop(index))
        return balls

both functions kind of do the same, the difference is that the lower one specifically deletes the randomly id'ed balls, while the upper one does delete one random ball of the appropriate type.
Since the deletion of the balls has no influence on an experiment (correct?) I would have assumed that both functions with the same random.seed would result in the same probability, however this is not the case.

The non_working draw method gives a 0.267, which results in a test failure.
image

I think most users will use the randint version and thus never encounter this problem however it is really hard to debug (IMHO) since the code does basically the same as the random.sample version.

Thanks for catching that typo, @jdelarubia. I fixed it in the main project here: https://repl.it/@freeCodeCamp/fcc-probability-calculator

Also, thanks to both you and @lefthand3r for your discussion. These are all points we should consider before the release.

Thanks, I added clarification in the readme to what contents should be. I also addressed the edge case of creating draw function differently by changing the test to "assertAlmostEqual" so the probability can be off by 0.01.

@lefthand3r Do you have a wording suggestion on how to better clarify the experiment and draw functions?

I thought about this for a while and this is what I came up with for now - my changes are bold :

draw:

The Hat class should have a draw method that accepts an argument indicating the number of balls to draw from the hat. This method should remove balls at random from contents and return those balls as a list of strings, the balls should not go back into the hat during the draw (similar to an urn experiment without replacement). If the number of balls to draw exceeds the available quantity, return all the balls.

experiment:

  1. the experiment function should be called a function and not a method since it is not associated with a class, right?
  2. I would rename the balls_to_draw to expected_balls or experimental_balls? - I changed it to expected_balls for now

Next, create an experiment function in prob_calculator.py (not inside the Hat class). This function should accept the following arguments:
hat: A hat object containing balls that should be copied inside the function.
expected_balls: An object indicating the exact group of balls to attempt to draw from the hat for the experiment. For example, to determine the probability of drawing 2 blue balls and 1 red ball from the hat, set expected_balls to {"blue":2, "red":1}.
num_balls_drawn: The number of balls to draw out of the hat in each experiment.
num_experiments: The number of experiments to perform. (The more experiments performed, the more accurate the approximate probability will be.)

hat = Hat(black=6, red=4, green=3)
probability = experiment(hat=hat,
expected_balls={"red":2,"green":1},
num_balls_drawn=5,
num_experiments=2000)

@lefthand3r, thanks for your patience and for your suggestions. I believe you're right, experiment should be called a function rather than a method. Also, renaming balls_to_draw to expected_balls is a good idea since we're trying to determine the probability of drawing those exact colors.

The one change I would make is to break your first addition out into its own sentence:

The Hat class should have a draw method that accepts an argument indicating the number of balls to draw from the hat. This method should remove balls at random from contents and return those balls as a list of strings. The balls should not go back into the hat during the draw, similar to an urn experiment without replacement. If the number of balls to draw exceeds the available quantity, return all the balls.

What do you think @beaucarnes?

@lefthand3r Thanks for your suggestions. I implemented them, along with @scissorsneedfoodtoo 's idea about breaking up the sentences in that paragraph.

Hi there!

I've really enjoyed this project, although I've found some parts are a little bit tricky to understand.

I've found a format typo. The word highlighted in yellow should have the same font format like the rest of "contents" words.
image
Moreover, I would clarify that "contents must be a list with one o more items per ball/color. For example, if your hat is {"red": 2, "blue": 1}, contents should be ["red", "red", "blue"], a list of strings.".

Finally, I had trouble to get the correct probability until I realized I didn't make the class object copy and that was affecting to the calculated probability. I don't know what you think about giving some additional clue.

Solution code below:
https://repl.it/@casraysa/fcc-probability-calculator

@casraysa, thanks for your feedback. I'll go in and fix the formatting now.

About the sentence you point out, I agree that it could use a little more clarification. How about "contents must be a list of strings where each item is the color of a ball in the hat. For example, if your hat is {"red": 2, "blue": 1}, contents should be ["red", "red", "blue"]."

Or maybe "contents must be a list with one item per color per ball. For example, if your hat is {"red": 2, "blue": 1}, contents should be ["red", "red", "blue"]."

I liked the ideas of @casraysa and @scissorsneedfoodtoo about changing the wording of that sentence. Below is what I came up with and I already updated the project with this wording (though I'm open to changing it again).

"contents should be a list of strings containing one item for each ball in the hat. Each item in the list should be a color name representing a single ball of that color. For example, if your hat is {"red": 2, "blue": 1}, contents should be ["red", "red", "blue"]."

Another nice but tricky project. Have to read a couple of times to understand the experiment and draw function.

Link : https://repl.it/@ManshulArora/fcc-probability-calculator#prob_calculator.py

Revisiting the project at the new URL, I've found the changes have helped describing the assignment. I've only a few notes for the README.

  • I found the line introducing the project to be mildly confusing

    For this project, you will write a program to determine the approximate probability of drawing certain balls randomly from a hat.

    Personally, I'd feel the following would be clearer

    In this project, you will write a program to determine the approximate probability of drawing at random balls of a certain color.

  • when describing the arguments of the experiment function, I found the description of the hat object to be quite peculiar

    hat: A hat object containing balls that should be copied inside the function

    Unlike the other arguments, the description is very specific and redirects toward the copy module. When encountering the problem for the first time, it might be better to just describe the argument as follows

    hat: A hat object that will be used to draw the balls

  • also in the list of arguments, I'd suggest to describe expected_balls not as an object, but as a dictionary

    expected_balls: An object A dictionary indicating the exact group of balls to attempt to draw from the hat for the experiment.

    Using the word object could create confusion especially considering the hat object which comes earlier.

Aside from the last point, these are mostly suggestions, but I hope you'll find some value.


Forked project:
https://repl.it/@borntofrappe/fcc-probability-calculator

I would second previous comments about multiple ways to (correctly if regardless of efficiency) implement random draws that would not satisfy the final test, even with the current delta fuzziness. My recommendation would be some combination of 1) increasing N by some degree for greater convergence; and 2) deriving the delta allowance from the theoretical standard deviation to target a specific confidence interval, which should be calculable from the experiment specifications. Otherwise you would need a bit more hand-holding than is currently provided (i.e. "from random import randrange" instead of "import random" along with direction on number of times the function should appear).

For reference, one approach is to use random.shuffle(contents) once after initialization and an argumentless contents.pop(). Relative efficiency of those two different approaches (shuffle v randrange) depends on ratio of number of balls drawn per trial to number of balls in hat -- I think, if I'm getting the integral from 0 to N of log n as k approaches N correct (economist background, not CS, so I could be very wrong).

Hi @imkleats, thanks for your patience and your detailed recommendations. They sound good if they make the project more flexible, though if I'm being honest I don't fully understand your suggestions.

Could you give us an example of your preferred changes and post a link to a project on Repl.it?

I'll see if I can find the time soon. Apologies for the stats jargon. It's basically saying that, rather than expanding the range of acceptable results each time we come across an alternative approach, we could use what we know about the general/theoretical probability formulas to make a range that works for a lot of possible solutions.

No need to apologize @imkleats, thank you for clarifying that for me. I'm sure what you originally wrote makes total sense if you have a better grasp of math than I do!

Sure, please feel free to write back when you have some time. We'd really appreciate your help opening up this project to more possible solutions.

@sevketcansefer please use the forum (forum.freecodecamp.org) to ask for help with coding, this is not the place