`hparse` is a Microformats 2 DOM parser written in JavaScript.
It natively supports the new prefix-based parsing rules of µf2, and provides a mechanism to import old, known µf1 vocabularies and parse them according to the rules of µf2.
Items marked "(not implemented)" are not implemented yet.
- Microformats v2 parsing rules
- Microformats v2 singleton objects
- HTML5 `pubdate` as `dt-published` mapping
- HTML5 `time` element
- HTML5 `data` element
- Date-Time Pattern
- Separated Date-Time Pattern
- Value-Title Pattern
- Parsing of image `alt` text
- Markdown conversion for plain-text values
- Include Pattern (not implemented)
- Include Pattern via Microdata `itemref` (not implemented)
- Support for HTML tables (not implemented)
`hparse` requires browser-DOM-compatible objects (`Node`, `DOMElement`, etc.); given those, it should function in both browser and standalone environments.
`hparse` relies on a number of ECMAScript 5 features (`[].forEach`, `Object.keys`, `[].map`, and the `Node.nodeType` enumeration), so it will not function in Internet Explorer without also including an appropriate polyfill.
`hparse` does not require any additional JavaScript framework.
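
As a rough way to check an environment before loading the parser, the sketch below tests for the features listed above. `missingEs5Features` is a hypothetical helper written for this example and is not part of `hparse`; it takes the environment as an argument so it can be exercised outside a browser.

```javascript
// Hypothetical helper (not part of hparse): reports which of the
// features named in this README are absent from the given environment.
function missingEs5Features(env) {
  var missing = [];
  if (typeof env.Array.prototype.forEach !== 'function') {
    missing.push('Array.prototype.forEach');
  }
  if (typeof env.Array.prototype.map !== 'function') {
    missing.push('Array.prototype.map');
  }
  if (typeof env.Object.keys !== 'function') {
    missing.push('Object.keys');
  }
  // Node.ELEMENT_NODE is one of the nodeType enumeration constants.
  if (!env.Node || env.Node.ELEMENT_NODE !== 1) {
    missing.push('Node.nodeType enumeration');
  }
  return missing;
}

// Check the current environment; `Node` is undefined outside a DOM.
var report = missingEs5Features({
  Array: Array,
  Object: Object,
  Node: typeof Node !== 'undefined' ? Node : undefined
});
if (report.length > 0) {
  console.log('Polyfills needed for: ' + report.join(', '));
}
```

Any feature the helper reports would need a polyfill, or a DOM implementation in the case of `Node`, before the parser is used.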
Create a new instance of the parser.
Run the parser against the configured node. Returns `Results`.
Returns all microformats parsed from the tree.
Returns only those microformats that stand alone and are not also a property value of a parent microformat. For example, an element with root `h-event` would be returned, but the object for `h-card p-organizer` contained within it would not.
Get a parsed microformat object by its `id` from the original document.
Returns all microformats with a matching type (e.g. `h-card`, `h-event`). Pass `true` to also include microformats that are property values in other microformat objects.
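
The difference between standalone microformats and those that are property values can be illustrated with plain objects. Everything in this sketch is an assumption made for illustration: the object shape, the `isPropertyValue` flag, and the `standalone` and `byType` helpers are hypothetical and do not describe `hparse`'s actual result format or method names.

```javascript
// Hypothetical parsed objects for an h-event whose p-organizer
// property is itself an h-card.
var organizer = {
  type: ['h-card'],
  properties: { name: ['Ben Ward'] },
  isPropertyValue: true // nested inside the event as p-organizer
};
var event = {
  type: ['h-event'],
  properties: { organizer: [organizer] },
  isPropertyValue: false // a root microformat
};
var all = [event, organizer];

// Keep only microformats that are not a property value of a parent.
function standalone(list) {
  return list.filter(function (mf) {
    return !mf.isPropertyValue;
  });
}

// Filter by type; nested microformats are included only on request.
function byType(list, type, includePropertyValues) {
  return list.filter(function (mf) {
    var matches = mf.type.indexOf(type) !== -1;
    return matches && (includePropertyValues || !mf.isPropertyValue);
  });
}
```

With this data, the standalone filter keeps only the event, and filtering for `h-card` finds the organizer only when property values are included.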
Shorthand to instantiate a parser and return all standalone microformats from an object tree; returns `Results`.
Add a legacy microformat vocabulary to HParse to parse old, unprefixed microformats as if they were using the new syntax. See `src/vocabularies` for example mapping files.
Returns the current global parser settings. Change them by passing in an object with alternate values.
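
The combined getter and setter behaviour described here could look roughly like the following. This is a stand-in written for illustration, not `hparse`'s code: `sketchSettings` is a hypothetical name, only two of the documented settings are included, and ignoring unknown names is an assumption.

```javascript
// Stand-in for illustration only; hparse's real implementation may differ.
var current = {
  parseV1Microformats: false,
  forceValidUrls: true
};

// With no argument, returns the current settings. With an object,
// applies any known boolean settings first, then returns them.
function sketchSettings(overrides) {
  Object.keys(overrides || {}).forEach(function (name) {
    // Assumption: names that are not existing settings are ignored.
    if (Object.prototype.hasOwnProperty.call(current, name)) {
      current[name] = Boolean(overrides[name]);
    }
  });
  return current;
}
```

Calling `sketchSettings({ parseV1Microformats: true })` changes that one value and leaves `forceValidUrls` at its default.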
There are some other publicly exposed parsing and text conversion methods that are presently undocumented since they might change.
Global setting names and their functions are shown here with default values. To change a setting, use `HParse.settings({ settingOne: true, settingTwo: false });`. All settings are boolean.
- `parseSingletonRootNodes`: `true`. Enable parsing of `<a class=h-card ...><img src=#photo alt="Ben Ward"></a>` as a full microformat with `name`, `url` and `photo` properties.
- `parsePubDateAttr`: `true`. Enable parsing of the HTML `<time>` element `pubdate` attribute as `dt-published`.
- `parseRelAttr`: `true`. Enable parsing of `rel` attributes to the `relationships` collection.
- `parseItemRefAttr`: `true`. Enable use of microdata's `itemref` attribute as per the include-pattern. Not implemented yet.
- `parseV1Microformats`: `false`. Enable parsing of legacy syntax microformats as per microformats-2. Requires adding individual vocabularies with `HParse.defineLegacyVocabulary`.
- `forceValidUrls`: `true`. Filters `u-` URL property values for valid URLs only.
- `forceValidDates`: `true`. Filters `dt-` date-time property values for valid ISO 8601 dates only.
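
To give a sense of what filtering for valid values means, here is a minimal sketch. It is not `hparse`'s validation logic, which this README does not describe: both helper names are hypothetical, the URL check relies on the standard `URL` constructor found in current browsers and Node.js, and the date pattern deliberately covers only a narrow subset of ISO 8601.

```javascript
// Hypothetical helper: a value is kept if the standard URL
// constructor accepts it as an absolute URL.
function isValidUrl(value) {
  try {
    new URL(value);
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: accepts YYYY-MM-DD, optionally followed by a
// time (HH:MM or HH:MM:SS) and a timezone (Z or a +/-HH:MM offset).
// This is a narrow subset of ISO 8601, chosen for brevity.
var ISO_DATE_SUBSET =
  /^\d{4}-\d{2}-\d{2}(?:[T ]\d{2}:\d{2}(?::\d{2})?(?:Z|[+-]\d{2}:?\d{2})?)?$/;

function isValidDate(value) {
  return ISO_DATE_SUBSET.test(value);
}

// Filtering a list of candidate property values.
var urls = ['http://example.com/', 'not a url'].filter(isValidUrl);
var dates = ['2013-03-14', 'yesterday'].filter(isValidDate);
```

The pattern only checks the shape of the string, so an impossible date such as month 13 would still pass; a stricter check would need real date parsing.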
- `expandPlainTextUrls`: `true`. When converting an `<a>` element to plain text in property output, include the URL in parentheses.
- `markdownPlainTextUrls`: `false`. When `expandPlainTextUrls` is `true`, output URLs in plaintext using Markdown syntax, `[Text](http://example.com "Title")`.
- `expandPlainTextAbbreviations`: `false`. When converting an `<abbr>` to plain text, append the `title` expansion in parentheses.
- `markdownPlainTextPhrases`: `true`. When converting `<b>`, `<i>`, `<strong>`, `<em>` and `<code>` phrases to plaintext, wrap in Markdown syntax.
- `markdownPlainTextImages`: `false`. When converting inline images to text, use Markdown syntax, `![Alt Text](http://example.com/foo.jpg "Title")`, instead of the raw `alt` text.
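
The interaction between `expandPlainTextUrls` and `markdownPlainTextUrls` can be sketched as a small function. This is an illustration of the documented output formats, not `hparse`'s converter: `linkToPlainText` and the `{ text, href, title }` input shape are hypothetical, and the exact spacing of the parenthesised form is an assumption.

```javascript
// Hypothetical helper: converts a link description to plain text
// according to the two URL-related settings described above.
function linkToPlainText(link, settings) {
  if (!settings.expandPlainTextUrls) {
    return link.text; // URL omitted entirely
  }
  if (settings.markdownPlainTextUrls) {
    // Markdown form: [Text](http://example.com "Title")
    var title = link.title ? ' "' + link.title + '"' : '';
    return '[' + link.text + '](' + link.href + title + ')';
  }
  // Default form: text followed by the URL in parentheses.
  return link.text + ' (' + link.href + ')';
}

var link = { text: 'Text', href: 'http://example.com', title: 'Title' };
var plain = linkToPlainText(link, {
  expandPlainTextUrls: true,
  markdownPlainTextUrls: false
});
var markdown = linkToPlainText(link, {
  expandPlainTextUrls: true,
  markdownPlainTextUrls: true
});
```

Here `plain` holds the parenthesised form and `markdown` holds the Markdown link, matching the example in the `markdownPlainTextUrls` description.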