/vandelay

Imports, exports, and ETL

Primary language: JavaScript. License: MIT.

Dead simple data pipeline utility belt.

vandelay

Install

npm install vandelay --save

Example - Flat File

import { tap, fetch, transform, parse } from 'vandelay'

fetch({
  url: 'http://google.com/example.geojson',
  parser: parse('geojson')
})
  .pipe(transform(async (row) => {
    // otherApi is a placeholder for your own async lookup
    const external = await otherApi(row.field)
    return {
      ...row,
      external
    }
  }))
  .pipe(tap(async (row, meta) => {
    // send row to an external api, db, or whatever!
  }))

Example - API

import { tap, fetch, transform, parse } from 'vandelay'

fetch({
  url: 'http://google.com/api/example',
  parser: parse('json', { selector: 'results.*' }),
  pagination: {
    offsetParam: 'offset',
    limitParam: 'limit'
  }
})
  .pipe(transform(async (row, meta) => {
    // otherApi is a placeholder for your own async lookup
    const external = await otherApi(row.field)
    return {
      ...row,
      external
    }
  }))
  .pipe(tap(async (row, meta) => {
    // send row to an external api, db, or whatever!
  }))

API

fetch(source[, options])

source

  • url - Required String
  • parser - Required Function
  • pagination - Optional Object
    • offsetParam - Required String (if not using pageParam)
    • pageParam - Required String (if not using offsetParam)
    • limitParam - Required String
    • startPage - Optional Number, defaults to 0
    • limit - Required Number
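To make the pagination fields concrete, here is a minimal sketch of how they could map onto successive request URLs. It does not use vandelay's internals: `pageUrl` is a hypothetical helper, and the arithmetic (offset as request index times `limit`, page as `startPage` plus request index) is an assumption about how the parameters are filled.

```javascript
// Hypothetical helper, not part of vandelay: builds the URL for the nth request
// (n counts from 0) from a `pagination` object shaped like the one documented above.
const pageUrl = (url, pagination, n) => {
  const { offsetParam, pageParam, limitParam, startPage = 0, limit } = pagination
  const out = new URL(url)
  if (pageParam) {
    // Assumption: pages count up from startPage.
    out.searchParams.set(pageParam, String(startPage + n))
  } else {
    // Assumption: the offset is the number of rows already requested.
    out.searchParams.set(offsetParam, String(n * limit))
  }
  out.searchParams.set(limitParam, String(limit))
  return out.toString()
}

const pagination = { offsetParam: 'offset', limitParam: 'limit', limit: 100 }
console.log(pageUrl('http://google.com/api/example', pagination, 0))
console.log(pageUrl('http://google.com/api/example', pagination, 2))
```

Under these assumptions the third request asks for `offset=200&limit=100`; a `pageParam` configuration would send a page number instead.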

options

  • modifyRequest - Optional Function
    • Receives a superagent request object prior to execution, so you can add any additional headers or querystring parameters.
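As a sketch of what a modifyRequest function might look like: `set` and `query` are standard superagent request methods, while the header value and query parameter below are placeholders. Whether vandelay uses the return value is not stated here, so this version returns the request to cover both cases.

```javascript
// Hypothetical modifyRequest function: adds an auth header and one extra
// querystring parameter to the superagent request before it is sent.
const modifyRequest = (req) =>
  req
    .set('Authorization', 'Bearer YOUR_TOKEN') // placeholder token
    .query({ format: 'json' }) // placeholder parameter

// Assumed usage, following the fetch(source[, options]) signature above:
// fetch({ url, parser }, { modifyRequest })
```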

parse(format[, options])

Returns a function that creates a parser stream. Parser streams receive text as input, and output objects.

format

Built-in parsers are:

  • csv
    • Optional autoParse option, to automatically infer types of values and convert them.
    • Optional camelcase option, to camelcase and normalize header keys.
  • excel
    • Optional autoParse option, to automatically infer types of values and convert them.
    • Optional camelcase option, to camelcase and normalize header keys.
  • shp
  • json
    • Requires a selector option that specifies where to grab rows in the data.
  • xml
    • Requires a selector option that specifies where to grab rows in the data.
      • Note that the selector applies to the xml2js output.
    • Optional autoParse option, to automatically infer types of values and convert them.
    • Optional camelcase option, to camelcase and normalize header keys.
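To illustrate what a selector such as 'results.*' picks out, here is a small sketch that resolves a dotted selector against an already-parsed object. `selectRows` is a hypothetical helper, not vandelay's parser, and treating `*` as "every element at this level" is an assumption based on the API example above.

```javascript
// Hypothetical helper: walks a dotted selector through parsed data.
// `*` expands to every element (or property value) at that level.
const selectRows = (data, selector) =>
  selector.split('.').reduce((nodes, key) => {
    if (key === '*') return nodes.flatMap((node) => Object.values(node ?? {}))
    return nodes.map((node) => node?.[key]).filter((node) => node !== undefined)
  }, [ data ])

const body = { total: 2, results: [ { id: 1 }, { id: 2 } ] }
console.log(selectRows(body, 'results.*'))
```

With this reading, each element of `results` becomes one row, and sibling fields such as `total` are ignored.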

transform(transformer[, options])

transformer(row, meta)

  • Asynchronous function, receives the current row and the meta information object.
  • If transformer is a string, it is compiled and run in a sandbox using vm2.
  • Returning an object passes it on; returning null or undefined removes the item from the stream (a skip).
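A transformer is just an async function, so the pass-on and skip rules can be shown without a stream. The field name `name` and the trimming step are illustrative assumptions.

```javascript
// Hypothetical transformer: normalizes a `name` field and skips rows without one.
const transformer = async (row, meta) => {
  if (row.name == null) return null // null or undefined removes the row (skip)
  return { ...row, name: String(row.name).trim() } // an object is passed on
}

// Usage with vandelay would look like: .pipe(transform(transformer))
```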

options

  • sandbox - Optional Object
    • Creates a frozen global context, used for sandboxed transformers
  • timeout - Optional Number
  • compiler - Optional Function
  • concurrency - Optional Number, defaults to 50
  • onBegin(row, meta) - Optional Function
  • onError(err, row, meta) - Optional Function
  • onSkip(row, meta) - Optional Function
  • onSuccess(row, meta) - Optional Function
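The hooks can be used for simple bookkeeping. The sketch below builds an options object that counts outcomes; the counters are illustrative, and exactly when each hook fires is inferred from the hook names rather than confirmed here.

```javascript
// Counters updated by the hooks; purely illustrative.
const stats = { started: 0, succeeded: 0, skipped: 0, failed: 0 }

const options = {
  concurrency: 10, // lower than the documented default of 50
  onBegin: (row, meta) => { stats.started++ },
  onSuccess: (row, meta) => { stats.succeeded++ },
  onSkip: (row, meta) => { stats.skipped++ },
  onError: (err, row, meta) => { stats.failed++ }
}

// Assumed usage: .pipe(transform(transformer, options))
```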

tap(fn[, options])

fn(row, meta)

  • Asynchronous function, receives the current row and the meta information object.
  • Returning an object passes it on; returning null or undefined removes the item from the stream.
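A tap function typically performs a side effect and hands the row back. In this sketch the side effect is recording ids in an array; the `id` field and the `seen` array are illustrative assumptions standing in for a database write or API call.

```javascript
// Hypothetical tap function: records each row's id, then passes the row on.
const seen = []
const fn = async (row, meta) => {
  seen.push(row.id) // stand-in for sending the row to an external api or db
  return row // returning the row keeps it in the stream
}

// Usage with vandelay would look like: .pipe(tap(fn))
```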

options

  • concurrency - Optional Number, defaults to 50