To generate the metadata that the public uses to search and browse govinfo, the govinfo team has developed a number of specialized parsers that use regular expressions to extract relevant information from the source documents. To help third-party developers and other users understand how this metadata is generated, GPO is providing documentation on the regular expressions we use and the types of metadata available within each collection.
A collection is a set of content that has a consistent format. For example, the Congressional Record and the Federal Register are each considered a collection within govinfo. A number of these collections have collection codes, which are a useful way to search specific metadata fields.
For the initial release, we are providing information on our Congressional Hearings (CHRG) collection.
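As a rough illustration of the regular-expression approach described above, the following Python sketch pulls a serial number out of a snippet of hearing text. The pattern, the function name `extract_serial_number`, and the sample text are all hypothetical and chosen only for illustration; they are not taken from the expressions GPO actually uses.

```python
import re

# Hypothetical pattern for illustration only; not one of GPO's actual expressions.
# Matches text such as "Serial No. 115-42" and captures the number.
SERIAL_RE = re.compile(r"Serial\s+No\.\s*(?P<serial>\d+-\d+)", re.IGNORECASE)


def extract_serial_number(text):
    """Return the first serial number found in the text, or None if absent."""
    match = SERIAL_RE.search(text)
    return match.group("serial") if match else None


# Made-up sample text, not drawn from a real hearing.
sample = "HEARING BEFORE THE COMMITTEE ON EXAMPLE\nSerial No. 115-42\n"
print(extract_serial_number(sample))
```

A production parser would likely need many such patterns, plus handling for the formatting variations found across source documents.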