Just studying data analysis with data from a rock music portal

Primary language: Python

OndaNet

My purpose is to learn data analysis and data mining using data from Onda Rock, an Italian music portal.

Step 1: A network of the music

Each review page on Ondarock is connected to the others through hyperlinks. I would like to build this network, parsing the pages with:

  1. Requests
  2. BeautifulSoup
  3. html5lib
  4. NLTK

To store and analyse the network:

  1. NetworkX
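
The parsing and network steps above could be sketched roughly as follows. This is only a minimal illustration: the base URL, the page paths and the HTML snippets are hypothetical placeholders, not real Onda Rock content, and the sketch uses the built-in "html.parser" backend to stay light (passing "html5lib" to BeautifulSoup instead would use the parser listed above).

```python
from urllib.parse import urljoin, urlparse

import networkx as nx
from bs4 import BeautifulSoup

# Hypothetical base URL and page contents, used only for illustration.
BASE_URL = "https://ondarock.example/"
SAMPLE_PAGES = {
    "recensioni/album_a.htm": (
        '<p><a href="album_b.htm">B</a> '
        '<a href="https://other.example/x">external</a></p>'
    ),
    "recensioni/album_b.htm": '<p><a href="/recensioni/album_a.htm">A</a></p>',
}


def extract_internal_links(page_url, html):
    """Return the set of absolute same-site URLs linked from one page."""
    soup = BeautifulSoup(html, "html.parser")
    site = urlparse(BASE_URL).netloc
    links = set()
    for anchor in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page they appear on.
        target = urljoin(page_url, anchor["href"])
        if urlparse(target).netloc == site:
            links.add(target)
    return links


def build_graph(pages):
    """Build a directed graph: one node per page, one edge per hyperlink."""
    graph = nx.DiGraph()
    for path, html in pages.items():
        source = urljoin(BASE_URL, path)
        graph.add_node(source)
        for target in extract_internal_links(source, html):
            if target != source:  # skip self-links
                graph.add_edge(source, target)
    return graph


graph = build_graph(SAMPLE_PAGES)
```

With real pages, the HTML strings would come from something like `requests.get(url).text` instead of the inline samples.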

To plot the data:

  1. D3
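
One possible way to hand the network to D3 is to dump it as JSON with a list of nodes and a list of links, a shape that D3 force layouts can read when links refer to nodes by id. The sketch below builds that dictionary by hand; the node names and the extra `in_degree` field are assumptions for illustration.

```python
import json

import networkx as nx

# Tiny hypothetical graph standing in for the review network.
graph = nx.DiGraph()
graph.add_edge("album_a", "album_b")
graph.add_edge("album_b", "album_c")


def to_d3_dict(g):
    """Convert a graph into a {"nodes": [...], "links": [...]} dictionary."""
    return {
        "nodes": [{"id": n, "in_degree": g.in_degree(n)} for n in g.nodes],
        "links": [{"source": u, "target": v} for u, v in g.edges],
    }


# JSON text that could be written to a file and loaded on the D3 side.
payload = json.dumps(to_d3_dict(graph), indent=2)
```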

Step 1.5: Find clusters

Are there clusters? And do they follow the division by music genre?
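
A first attempt at this question could use one of the community detection algorithms shipped with NetworkX and then check how uniform the genre is inside each community. The sketch below uses greedy modularity maximisation, which is just one option among several. The toy graph and the genre labels are invented for illustration, and the graph is treated as undirected.

```python
from collections import Counter

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy graph: two triangles joined by a single bridge edge (c-d).
graph = nx.Graph()
graph.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
])

# Hypothetical genre label for each node.
genres = {"a": "rock", "b": "rock", "c": "rock",
          "d": "jazz", "e": "jazz", "f": "rock"}

communities = [set(c) for c in greedy_modularity_communities(graph)]


def genre_purity(community):
    """Return the dominant genre and the fraction of nodes that carry it."""
    counts = Counter(genres[node] for node in community)
    genre, count = counts.most_common(1)[0]
    return genre, count / len(community)


# One (members, dominant genre, purity) tuple per detected community.
summary = [(sorted(c), *genre_purity(c)) for c in communities]
```

A purity close to 1 for most communities would suggest that the clusters follow the genre division; lower values would suggest that the links cut across genres.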

Step 2: Better to store the data

I will choose a way to store and organize the data, for example a database such as MongoDB or CouchDB. Any information is valuable, such as the ratings or the reviewer of each page.
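
Both MongoDB and CouchDB store JSON-like documents keyed by an `_id` field, so each review could become one document. The record below is only a guess at a useful shape: every field name and value is a hypothetical placeholder, and no database is contacted (the record is just round-tripped through JSON).

```python
import json

# Hypothetical record for one review page; all field names and values are
# assumptions made for illustration, not real Onda Rock data.
review = {
    "_id": "recensioni/album_a.htm",      # page path used as the document key
    "artist": "Example Artist",
    "album": "Example Album",
    "genre": "rock",
    "rating": 7.5,
    "reviewer": "Example Reviewer",
    "links": ["recensioni/album_b.htm"],  # outgoing links to other reviews
}

# Serialize to JSON text and read it back, as a document store would.
serialized = json.dumps(review, indent=2, sort_keys=True)
restored = json.loads(serialized)
```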

Step 3: Other data analysis

I would also use Pandas to load the data for analysis. After that, I can look for correlations in the data, or develop a method to suggest music that I don't know but that I would like.
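
As a rough idea of what this step could look like, the sketch below loads a few records into a Pandas DataFrame, averages the ratings by genre, and computes a Pearson correlation between the rating and the number of inbound links. The column names and all the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical records; in practice these would come from the stored reviews.
records = [
    {"album": "album_a", "genre": "rock", "rating": 8.0, "inbound_links": 12},
    {"album": "album_b", "genre": "rock", "rating": 6.0, "inbound_links": 4},
    {"album": "album_c", "genre": "jazz", "rating": 7.0, "inbound_links": 9},
    {"album": "album_d", "genre": "jazz", "rating": 5.0, "inbound_links": 3},
]

df = pd.DataFrame(records)

# Average rating per genre.
mean_by_genre = df.groupby("genre")["rating"].mean()

# Pearson correlation between rating and number of inbound links.
rating_link_corr = df["rating"].corr(df["inbound_links"])
```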