elephant-map-adventure

Learning Hadoop and its prerequisites


Hello there!

I've wanted to really dive into data science, and as part of my journey to becoming a better data scientist, I'm learning to work with Hadoop and Scala.

I find that I learn best by doing, so I'm writing notes on the resources I have found most helpful for a learn-by-doing approach.

If you have any suggestions on how the notes could be better formatted, or if you spot a mistake in a code block, feel free to let me know!


My current setup

I run a Windows 10 personal computer at home with roughly:

  • 700 GB of storage
  • 16 GB of RAM
  • an i5 processor

I have a Linux partition that I use for running Hadoop:

  • Linux Mint 18.x, Cinnamon edition
  • always running things as superuser (root)

Quick note: if you forget your root/admin password, you can google "Lost linux password".

I found this tutorial to be the most helpful. I might have been misreading previous tutorials, but you have to replace the text starting at ro... and ending at ...handoff.