== Nokogiri / Postgres Practice
- Fork and clone this repo as per usual
- Change into your project directory
- Run bundle
- Rename (or copy) the database.yml.example file to database.yml and update it with your Postgres settings
- Uncompress the files in the data directory
- Run rake; it should report that no specs are defined.
= TODO
- Write a unit test for a Nokogiri parser, using machinist blueprints to simulate a data import. The parser needs to set the title and body given a string input collated from a number of Article models.
- The parser should output a file in your /tmp directory called articles.sql, containing a SQL string that will add your dummy models' data to Postgres.
- Refactor your parser to work with an input file. Then run it against data/small.xml and make sure it brings in 703 articles and saves them into your database.
- Try to get it to work with the large XML file. It should bring in 7151 results.
- Visit http://dumps.wikimedia.org/enwiki/20130503 and bring in more -pages-meta-current.xml files. Prizes for the most articles imported.
= Tips
- After you've got your unit tests running, run the sample parser with:
    ruby lib/wiki_parser.rb data/small.xml