DocWrapper

DSL for creating wrappers for DOM documents.

DocWrapper is a simple DSL for creating wrappers around DOM objects.

Installation

% gem install doc_wrapper

Example Usages

DocWrapper lets you declare a wrapper that reads data from an HTML or XML Document Object Model (DOM) document and optionally transforms it.

DocWrapper works with any underlying "document" that has a search method, such as a DOM generated by Nokogiri or Hpricot. As a result, DocWrapper supports any selector your DOM library supports. With Nokogiri, for example, you can use either XPath or CSS selectors, which makes property definitions very flexible.

You use DocWrapper by declaring properties, each with a name, a type, and the search path that locates the raw data in the DOM.
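To illustrate the idea of a declarative property backed by any object with a search method, here is a minimal sketch in plain Ruby. It is not DocWrapper's implementation: MiniWrapper, StubDocument, StubNode and PersonSketch are hypothetical names invented for this example, the sketch omits property types, and it assumes each matched node exposes a text method.

```ruby
# Stand-in for a DOM node: it only needs to expose its text.
StubNode = Struct.new(:text)

# Stand-in for a DOM document. The single requirement is a `search`
# method that takes a selector and returns the matching nodes.
class StubDocument
  def initialize(nodes_by_selector)
    @nodes_by_selector = nodes_by_selector
  end

  def search(selector)
    @nodes_by_selector.fetch(selector, [])
  end
end

# Hypothetical mini-DSL (an assumption for illustration, NOT DocWrapper's code).
module MiniWrapper
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Defines a reader that runs `selector` against the document on each call
    # and returns the stripped text of the first match, or nil if none matches.
    def property(name, selector)
      define_method(name) do
        node = @document.search(selector).first
        node && node.text.strip
      end
    end
  end

  def initialize(document)
    @document = document
  end
end

class PersonSketch
  include MiniWrapper

  property :first_name, 'p.first_name'
  property :last_name,  'p.last_name'
end

doc = StubDocument.new({
  'p.first_name' => [StubNode.new(' Mark ')],
  'p.last_name'  => [StubNode.new('Menard')]
})
person = PersonSketch.new(doc)
puts person.first_name  # prints "Mark"
puts person.last_name   # prints "Menard"
```

Because the wrapper only ever calls search on the document, swapping the stub for a real parsed document changes the selectors you can use but not the wrapper class itself.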

Basic Example

require 'nokogiri'
require 'doc_wrapper'

html = %{
  <html>
  <body>
  <p class="first_name">Mark</p>
  <p class="last_name">Menard</p>
  </body>
  </html>
}

class PersonWrapper
  include DocWrapper::Base
  include DocWrapper::Properties

  property :first_name, :string, '//p[@class="first_name"]'
  property :last_name, :string, '//p[@class="last_name"]'
end

person_wrapper = PersonWrapper.new(Nokogiri::HTML(html))
person_wrapper.first_name # => 'Mark'
person_wrapper.last_name # => 'Menard'

Supported Property Types

Currently, DocWrapper supports the property types :string, :date, :time, :boolean, :float and :raw. Additionally, DocWrapper supports embedded wrappers through has_one and has_many, which work much like their ActiveRecord counterparts. See the specs for example usages.
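As a rough illustration of what a typed property implies, the sketch below converts raw strings into Ruby values for each type name listed above. The parsing rules are assumptions made for this example (Date.parse, Time.parse, Float, and a small list of truthy strings), and CONVERTERS and convert are hypothetical names. DocWrapper's actual conversion behavior may differ, in particular for :boolean and :raw.

```ruby
require 'date'
require 'time'

# Hypothetical conversion table keyed by property type.
# The rules here are assumptions, not DocWrapper's actual behavior.
CONVERTERS = {
  string:  ->(value) { value.strip },
  date:    ->(value) { Date.parse(value) },
  time:    ->(value) { Time.parse(value) },
  boolean: ->(value) { %w[true yes 1].include?(value.strip.downcase) },
  float:   ->(value) { Float(value.strip) },
  raw:     ->(value) { value }  # passed through unchanged in this sketch
}.freeze

# Looks up the converter for `type` and applies it to `value`.
# Raises KeyError for an unknown type.
def convert(type, value)
  CONVERTERS.fetch(type).call(value)
end

convert(:float, ' 19.99 ')         # => 19.99
convert(:boolean, 'Yes')           # => true
convert(:date, '2010-01-15').to_s  # => "2010-01-15"
```

Keeping the conversions in one table means a new type only needs one new entry, which is one plausible way to back a type argument like the one property takes.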

Access to Node Attributes

String, Date, Time and Boolean properties can reference an attribute on a node.

Given the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<feed>
  <link type="text/html" href="http://search.twitter.com/search?q=yahoo.com" rel="alternate"/>
</feed>

You can access the link's href attribute with the following property definition:

class FeedWrapper
  include DocWrapper::Base
  include DocWrapper::Properties

  property :link, :string, '//feed/link', :use_attribute => :href
end
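To show why an attribute lookup is needed for this document, the sketch below performs the equivalent lookup by hand. It deliberately uses REXML, the XML library that ships with Ruby, instead of DocWrapper or Nokogiri, so that it runs without installing extra gems. It shows the kind of lookup that :use_attribute => :href stands for, not how DocWrapper performs it.

```ruby
require 'rexml/document'

xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <feed>
    <link type="text/html" href="http://search.twitter.com/search?q=yahoo.com" rel="alternate"/>
  </feed>
XML

doc  = REXML::Document.new(xml)
link = REXML::XPath.first(doc, '//feed/link')

# The element is self-closing, so it has no text content to read.
puts link.text.to_s.empty?    # prints "true"

# The URL lives in the href attribute instead.
puts link.attributes['href']  # prints "http://search.twitter.com/search?q=yahoo.com"
```

Since the link element carries its data only in attributes, a property that read the node's text would come back empty, which is the case the :use_attribute option addresses.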