janboone/applied-economics

Reading health insurance data into the notebook


Hi,

I am having trouble loading the insurance data CSV file into the notebook. When I use the code that was provided for us, I get the following error: FileNotFoundError: [Errno 2] File data/Vektis Open Databestand Zorgverzekeringswet 2014 - gemeente.csv does not exist: 'data/Vektis Open Databestand Zorgverzekeringswet 2014 - gemeente.csv'

Loading the file from the Downloads folder on my MacBook does not work either; that gives the following error: OSError: Initializing from file failed.

So far I haven't had any trouble loading data, so how should this be solved?

Thank you in advance.

Floris

Hi Floris,

I am assuming you are working on the uvt server (not your own anaconda installation). In that case you need to upload the data to the server, in the same way I showed this week for uploading a notebook from your Downloads folder. The server cannot see the local folders on your computer.

If you want to use the code in the notebook, create a folder named data (in the folder that contains the notebook itself) and upload the data file into that folder.
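As a rough sketch of how to check this before reading the file: the snippet below looks for the file relative to the notebook's working directory and reports where it actually looked. The helper name find_data_file is hypothetical and not part of the course notebook.

```python
from pathlib import Path


def find_data_file(filename, data_dir="data"):
    """Return the path to `filename` inside `data_dir`, or raise a descriptive error.

    Hypothetical helper: `data_dir` is resolved relative to the current
    working directory, which on the server is normally the notebook's folder.
    """
    path = Path(data_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found (working directory: {Path.cwd()}). "
            "Create the folder next to the notebook and upload the file there."
        )
    return path
```

The returned path can then be passed to the notebook's existing read call in place of the hard-coded string.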

Does this help?

Regards,

Jan.

Thank you, this helps. I did not create a 'data' folder in the notebook folder, so that was the problem.

I have another question about the same cell. The code did not work and raised this error: ValueError: invalid literal for int() with base 10: ' 0 t/m 4 jaar'. However, when I deleted the line df['age'] = df['age'].astype(int), the cell ran without any problems.

Is the deleted line necessary, and why did it not work?

Could it be that you downloaded the wrong dataset? The municipality-level files (databestanden op gemeenteniveau) use age groups such as '0 t/m 4 jaar' (0 to 4 years) in the age column. In the postcode3-level files (databestanden op postcode3-niveau), age is 0, 1, 2, 3, etc. The only problem there is the age category '90+', but the code deals with this.
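To illustrate why astype(int) fails on one file and not the other, here is a minimal sketch of a converter for age labels. It is an assumption, not the notebook's actual code: the thread does not show how the notebook handles '90+', and mapping it to 90 is just one possible choice. The name parse_age is hypothetical.

```python
def parse_age(value):
    """Convert an age label such as '12' or '90+' to an int.

    Assumption: '90+' is mapped to 90; the course notebook may handle
    this category differently.
    """
    text = str(value).strip()
    if text.endswith("+"):
        text = text[:-1]  # drop the trailing '+' of the open-ended category
    try:
        return int(text)
    except ValueError:
        # Age groups like '0 t/m 4 jaar' cannot be read as a single year.
        raise ValueError(
            f"age value {value!r} is not a single year; this looks like "
            "the gemeente file rather than the postcode3 file"
        ) from None
```

With pandas, something like df['age'] = df['age'].map(parse_age) could stand in for the astype(int) line, assuming the column is named age as in the thread.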

Does this help?

Yes, thank you. I had downloaded the other dataset!