It has a built in plugin for the geman KBA (Kraftfahrt - Bundesamt)
because, why not? That is what I needed in form of structured data.
This you can use as a template for your own plugins.
GLOB uses express
to provide the endpoints of the structured data
and cheerio
to scrape whatewer you request with the request
module.
This version of the framework still reelies on your natural intelligence
for understanding how the website you try to scrape is structured and then
use cheerio
to parse that structure: https://github.com/cheeriojs/cheerio
You should encapsulate your cheerio
logic into its own module like the
one provided for the KBA website: ./plugins/kba.js
After your module is done, expose the methods throu a new express
endpoint.
- clone this project
- run
npm install
- run
npm start
Machine learning
Artificial intelligence