This gem allows you to interface with the Guardian-Content-API via Ruby.
Installation is as simple as typing the following text into a command prompt (such as Terminal):
gem install guardian-content
Alternatively, the gem can be directly downloaded from rubygems.org/gems/guardian-content
To use the library within a project, simply include the line require 'guardian-content'
- you can then use any of the methods listed below.
Whilst the Guardian Content API doesn’t require an API key for basic usage, for increased rate limits, and the ability to retrieve full text content, it is recommended that you register for an API key at guardian.mashery.com/. Once authorised, you can use the key via either declaring a GUARDIAN_CONTENT_API_KEY
constant, or via the key
param when initialising the class.
To do a basic free text search of Guardian documents, simply use the following command:
articles = GuardianContent::Content.search("your search term")
If successful, this will the return an array of results, which you can loop through and access via some standard attributes:
articles.each do |article| puts article.title puts article.url end
To include additional meta-data in the results, use the :select
parameter. There are two types of data you can request, tags and fields. To select all of both, use :select => {:tags => :all, :fields => :all}
. You can then access the tags and keywords via attributes:
articles = GuardianContent::Content.search(“your search term”, :select => {:tags => :all, :fields => :all}) articles.first.tags.each do |tag| puts tag puts tag end puts articles.first.fields puts articles.first.fields
Each article or content item has a unique id. This can be accessed either via the #id attribute on a returned search result, or you can determine it from the Guardian website itself - the id is simply everything in an article’s URL after the host name - eg for www.guardian.co.uk/world/2010/may/17/iran-nuclear-uranium-swap-turkey the id is world/2010/may/17/iran-nuclear-uranium-swap-turkey
.
To retrieve an article based on its id, use the #find_by_id method:
article = GuardianContent::Content.find_by_id(“world/2010/may/17/iran-nuclear-uranium-swap-turkey”) The article’s content and meta-data can then be accessed in the same way as for search results.
-
Fork the project.
-
Make your feature addition or bug fix.
-
Add tests for it. This is important so I don’t break it in a future version unintentionally.
-
Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
-
Send me a pull request. Bonus points for topic branches.
Copyright © 2010 Copyright 2010 Guardian News and Media Ltd. See LICENCE for details.