A Singer tap built with TypeScript/JavaScript that runs in Node and produces JSON-formatted data following the Singer spec.

Primary language: TypeScript. License: MIT.

tap-ts-starter

This is a Singer tap built with TypeScript/JavaScript that runs in Node and produces JSON-formatted data following the Singer spec. Most of the spec is reflected in tap-types.ts.

This tap:

  • Scans a local folder, treating the files it finds there as emails (MIME) and parsing them into JSON with Nodemailer.Mailparser
  • Outputs a schema along with the resulting JSON for each file
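As a rough illustration of the output described above, a Singer tap writes one JSON message per line, typically a SCHEMA message followed by RECORD messages. The sketch below uses a hypothetical stream name (`emails`) and a hypothetical `subject` field; the actual types in tap-types.ts and the fields produced by the parser may differ.

```typescript
// Minimal Singer message shapes (SCHEMA and RECORD), following the Singer spec.
interface SchemaMessage {
  type: "SCHEMA";
  stream: string;
  schema: object;
  key_properties: string[];
}

interface RecordMessage {
  type: "RECORD";
  stream: string;
  record: Record<string, unknown>;
}

type SingerMessage = SchemaMessage | RecordMessage;

// Singer messages are written one JSON document per line.
function toSingerLine(message: SingerMessage): string {
  return JSON.stringify(message);
}

// Hypothetical stream name and fields, for illustration only.
const schemaLine = toSingerLine({
  type: "SCHEMA",
  stream: "emails",
  schema: { type: "object", properties: { subject: { type: "string" } } },
  key_properties: [],
});

const recordLine = toSingerLine({
  type: "RECORD",
  stream: "emails",
  record: { subject: "Hello" },
});

console.log(schemaLine);
console.log(recordLine);
```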

This tap is also meant as a template to be forked for other uses. It separates scanning a resource collection (e.g. a folder) from parsing the individual resources (e.g. MIME files), putting each in its own module for easy drop-in replacement. One scanner module (scan-dir.ts, for scanning local folders) and one parser module (parse-mime.ts, for parsing emails) are included.
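To make the scanner/parser split concrete, here is a minimal sketch of how two such modules could plug together. The `Scanner` and `Parser` types, `runTap`, `memoryScanner` and `parseSubject` are all hypothetical names invented for this example; the real signatures in scan-dir.ts and parse-mime.ts may look different.

```typescript
// Hypothetical shapes: the real scan-dir.ts and parse-mime.ts signatures may differ.
type Parser<T> = (raw: string) => Promise<T>;
type Scanner = (handleItem: (raw: string) => Promise<void>) => Promise<void>;

// Glue code only knows about the two interfaces, so either side can be swapped.
async function runTap<T>(scan: Scanner, parse: Parser<T>): Promise<T[]> {
  const records: T[] = [];
  await scan(async (raw) => {
    records.push(await parse(raw));
  });
  return records;
}

// Stand-in scanner that walks an in-memory list instead of a folder.
function memoryScanner(items: string[]): Scanner {
  return async (handleItem) => {
    for (const item of items) {
      await handleItem(item);
    }
  };
}

// Stand-in parser that extracts only the Subject header from a raw message.
const parseSubject: Parser<{ subject: string }> = async (raw) => {
  const match = /^Subject:\s*(.*)$/m.exec(raw);
  return { subject: match ? match[1] : "" };
};

const demo = runTap(memoryScanner(["Subject: Hello\n\nBody"]), parseSubject);
demo.then((records) => console.log(records));
```

Replacing `memoryScanner` with a scanner for another kind of collection, or `parseSubject` with another parser, leaves `runTap` unchanged, which is the point of keeping the two concerns in separate modules.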

This code path is documented here.

New-School Code

If you're used to JavaScript code, here are a few newer ES6/ES7/TypeScript code features we use that might be new to you:

  • Arrow functions are largely interchangeable with the more familiar function syntax:

let aFunction = () => {...

is roughly equal to

function aFunction() {...

  • Promises replace callbacks to clean up and clarify our code
  • Async/await builds on promises to make asynchronous code almost as simple (in many cases) as synchronous.
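The three features above can be seen side by side in a small sketch. The function names (`delayCallback`, `delay`, `runSteps`) are made up for this example and do not come from the tap's source.

```typescript
// Callback style: the caller passes a function to run when the work is done.
function delayCallback(ms: number, done: (message: string) => void): void {
  setTimeout(() => done(`waited ${ms}ms`), ms);
}

// Promise style: wrap the callback API so the result can be chained or awaited.
function delay(ms: number): Promise<string> {
  return new Promise((resolve) => delayCallback(ms, resolve));
}

// async/await: the same promises, written so the steps read top to bottom.
async function runSteps(): Promise<string[]> {
  const first = await delay(10);
  const second = await delay(20);
  return [first, second];
}

const steps = runSteps();
steps.then((messages) => console.log(messages));
```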

AWS Lambda Deployment

In addition to running as Singer taps, parsers can also be deployed as AWS Lambda functions. This lets you take the exact same parser that your tap uses and deploy it to parse files one at a time, via triggers that run as files are dropped into a bucket. This functionality is enabled out of the box: the deploy script creates the bucket, deploys the parser as a Lambda function, and adds a trigger to call it when files are created in the bucket.
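For orientation, the sketch below shows the general shape of a Lambda function reacting to an S3 "object created" trigger. It is not the handler shipped with this repo: the interfaces cover only a small subset of the S3 event fields, and `handler`, `listTriggeredObjects`, `decodeS3Key` and the bucket and key values are hypothetical. A real handler would be exported from its module, fetch each object, and pass its contents to the parser.

```typescript
// Minimal subset of the S3 event notification shape; the full event has more fields.
interface S3EventRecord {
  s3: {
    bucket: { name: string };
    object: { key: string };
  };
}

interface S3Event {
  Records: S3EventRecord[];
}

// S3 URL-encodes object keys in event notifications, with spaces sent as "+".
function decodeS3Key(key: string): string {
  return decodeURIComponent(key.replace(/\+/g, " "));
}

// Collect the bucket/key pairs that caused this invocation.
function listTriggeredObjects(event: S3Event): { bucket: string; key: string }[] {
  return event.Records.map((record) => ({
    bucket: record.s3.bucket.name,
    key: decodeS3Key(record.s3.object.key),
  }));
}

// Hypothetical handler: only logs what it would parse and returns the object count.
async function handler(event: S3Event): Promise<number> {
  const objects = listTriggeredObjects(event);
  for (const { bucket, key } of objects) {
    console.log(`would parse s3://${bucket}/${key}`);
  }
  return objects.length;
}

// Hypothetical sample event for local experimentation.
const sampleEvent: S3Event = {
  Records: [
    { s3: { bucket: { name: "example-bucket" }, object: { key: "inbox/my+message.eml" } } },
  ],
};

handler(sampleEvent).then((count) => console.log(`handled ${count} object(s)`));
```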

This code path is documented here.

Quick Start

  • Dependencies:
    • git
    • nodejs - At least v6.3 (6.9 for Windows) required for TypeScript debugging
    • npm (installs with Node)
    • typescript - installed as a development dependency
    • serverless - npm install -g serverless to install globally
  • Clone: git clone https://github.com/donpedro/tap-ts-starter.git
    • After cloning the repo, be sure to run npm install to install npm packages
  • Debug: with VS Code, use Open Folder to open the project folder, then hit F5 to debug. This runs without compiling to JavaScript, using ts-node
  • Test: npm test or npm t
  • Compile documentation: npm run build-docs-tap and npm run build-docs-aws
  • Compile to JavaScript: npm run build
  • Deploy to AWS using serverless: serverless deploy --aws-profile [profilename]
  • More options are included from TypeScript Library Starter and are documented here
  • Run using included test data (be sure to build first): node dist/tap-main.cjs.js --config tap-config.json

Testing

We use Jest for testing. Each file in testdata is run against parseItem, the current parser, which is declared in tap-main.ts. This happens in test/parse-testdata.test.ts, the Jest test case for the parser.

It works by first reading two files:

  • First is the test data file, which is in the testdata folder
  • Second is a .json file that contains an expected result from running parseItem on the corresponding test file

These two files are passed into a "matcher", a Jest function used to check that values meet a certain condition. Here the check is that the expected result file matches what we actually get when we run parseItem.
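In Jest, a comparison like this is commonly written with the `toEqual` matcher, e.g. `expect(actual).toEqual(expected)`. Which matcher test/parse-testdata.test.ts actually uses is not stated here, so the sketch below only reproduces the idea without Jest so it can run standalone. `deepEqual` and `parseItemStub` are hypothetical helpers; the stub stands in for the real parseItem.

```typescript
// Recursive structural comparison, roughly what a deep-equality matcher checks.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const left = a as Record<string, unknown>;
  const right = b as Record<string, unknown>;
  const leftKeys = Object.keys(left);
  if (leftKeys.length !== Object.keys(right).length) return false;
  return leftKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(right, key) &&
      deepEqual(left[key], right[key])
  );
}

// Hypothetical stand-in for parseItem: extracts only the Subject header.
function parseItemStub(raw: string): { subject: string } {
  const match = /^Subject:\s*(.*)$/m.exec(raw);
  return { subject: match ? match[1] : "" };
}

// Contents of a test data file and its expected-result .json file, inlined here.
const testDataText = "Subject: Hello\n\nBody";
const expectedJsonText = '{"subject":"Hello"}';

const actual = parseItemStub(testDataText);
const matches = deepEqual(actual, JSON.parse(expectedJsonText));
console.log(matches ? "match" : "mismatch");
```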

To add a test case:

  • Add a test file to the testdata/tests folder
    • Run the VS Code debugger with the configuration Debug parseItem using current opened test file while your new test file is open on the screen.
    • Copy the output from the debug console
    • Run the copied result through this JSON-validator to check that the JSON is valid and to format it in a more readable way
  • Add a .json file to the testdata/expectedResults folder
  • Take the newly formatted JSON and paste it into your new .json expected-result file
  • In "test-config.json" there is an array of JSON objects. Each object has two properties: testdata and expectedresult. Add a new object with your new file names.
  • The tester will now run through all tests including the newly added test case
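As a sketch of the last configuration step, the snippet below shows what an entry list with the two documented properties (`testdata` and `expectedresult`) might look like and how it could be validated. The file name `my-new-test.eml` and the `isTestCase` helper are hypothetical, and the real test-config.json may contain other entries or paths.

```typescript
// Property names come from the description above; everything else is illustrative.
interface TestCase {
  testdata: string;
  expectedresult: string;
}

// Type guard that rejects entries missing either property.
function isTestCase(value: unknown): value is TestCase {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.testdata === "string" &&
    typeof candidate.expectedresult === "string"
  );
}

// Inlined stand-in for the contents of test-config.json, with a new entry appended.
const configText = `[
  { "testdata": "test.eml", "expectedresult": "test.json" },
  { "testdata": "my-new-test.eml", "expectedresult": "my-new-test.json" }
]`;

const parsedConfig: unknown = JSON.parse(configText);
const testCases: TestCase[] = Array.isArray(parsedConfig)
  ? parsedConfig.filter(isTestCase)
  : [];

for (const testCase of testCases) {
  console.log(
    `Tested data input: ${testCase.testdata} with expected output: ${testCase.expectedresult}`
  );
}
```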

To run the tester: run the command npm test

As the tester runs it will print which files are being tested

Example: Tested data input: test.eml with expected output: test.json

If a test case fails, the files it failed with will be the last files printed.

Notes

Note: This document is written in Markdown. We like to use Typora and Markdown Preview Plus for our Markdown work.