Primary language: TypeScript. License: MIT.

gtf-nostream

Parse GTF data. This is a simplified version of @gmod/gtf with just basic parsing and no Node.js stream module usage.

Install

$ npm install --save gtf-nostream

Usage

const { parseStringSync } = require('gtf-nostream')
// or in ES6 (recommended), use this instead of the require above:
// import { parseStringSync } from 'gtf-nostream'

const fs = require('fs')

// parse a string of GTF synchronously
const stringOfGTF = fs.readFileSync('my_annotations.gtf', 'utf8')
const arrayOfThings = parseStringSync(stringOfGTF)

Object format

features

In GTF, features can have more than one location. Each feature is parsed as an array of all the lines that share that feature's ID. Values that are . in the GTF are null in the output.

A simple feature that's located in just one place:

[
  {
    "seq_id": "ctg123",
    "source": null,
    "type": "gene",
    "start": 1000,
    "end": 9000,
    "score": null,
    "strand": "+",
    "phase": null,
    "attributes": {
      "ID": ["gene00001"],
      "Name": ["EDEN"]
    },
    "child_features": [],
    "derived_features": []
  }
]

A CDS called cds00001 located in two places:

[
  {
    "seq_id": "ctg123",
    "source": null,
    "type": "CDS",
    "start": 1201,
    "end": 1500,
    "score": null,
    "strand": "+",
    "phase": "0",
    "attributes": {
      "ID": ["cds00001"],
      "Parent": ["mRNA00001"]
    },
    "child_features": [],
    "derived_features": []
  },
  {
    "seq_id": "ctg123",
    "source": null,
    "type": "CDS",
    "start": 3000,
    "end": 3902,
    "score": null,
    "strand": "+",
    "phase": "0",
    "attributes": {
      "ID": ["cds00001"],
      "Parent": ["mRNA00001"]
    },
    "child_features": [],
    "derived_features": []
  }
]
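
To make the shape above concrete, here is a minimal TypeScript sketch of a feature as an array of lines, with a helper that sums the length of every location. The `FeatureLine` and `Feature` type names and the `totalLength` and `makeLine` helpers are hypothetical and not part of gtf-nostream; the field types (for example `score` as a number, and child features as nested arrays) are assumptions inferred from the JSON examples rather than taken from the library's own typings.

```typescript
// Hypothetical types inferred from the example output; not exported by the library.
interface FeatureLine {
  seq_id: string | null
  source: string | null
  type: string | null
  start: number | null
  end: number | null
  score: number | null // assumed numeric when present
  strand: string | null
  phase: string | null
  attributes: Record<string, string[]> | null
  child_features: FeatureLine[][] // assumed: each child is itself an array of lines
  derived_features: FeatureLine[][]
}

// One feature is the array of all lines sharing the same ID.
type Feature = FeatureLine[]

// Sum the lengths of all locations of a feature.
// GTF coordinates are 1-based and inclusive, hence the +1.
function totalLength(feature: Feature): number {
  let total = 0
  for (const line of feature) {
    if (line.start !== null && line.end !== null) {
      total += line.end - line.start + 1
    }
  }
  return total
}

// Hypothetical helper that builds one CDS line like those in the example above.
function makeLine(start: number, end: number): FeatureLine {
  return {
    seq_id: 'ctg123',
    source: null,
    type: 'CDS',
    start,
    end,
    score: null,
    strand: '+',
    phase: '0',
    attributes: { ID: ['cds00001'], Parent: ['mRNA00001'] },
    child_features: [],
    derived_features: [],
  }
}

// The two-location CDS from the example: 300 + 903 bases.
const cds: Feature = [makeLine(1201, 1500), makeLine(3000, 3902)]
console.log(totalLength(cds)) // 1203
```

Treating every feature as an array, even when it has a single location, means calling code can loop over the lines without special-casing multi-location features.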

API

ParseOptions

Parser options

disableDerivesFromReferences

Whether to disable resolving references to Derives_from features

Type: boolean

encoding

Text encoding of the input GTF, default 'utf8'

Type: BufferEncoding

parseFeatures

Whether to parse features, default true

Type: boolean

parseDirectives

Whether to parse directives, default false

Type: boolean

parseComments

Whether to parse comments, default false

Type: boolean

parseSequences

Whether to parse sequences, default true

Type: boolean

parseAll

Parse all features, directives, comments, and sequences. Overrides other parsing options. Default false.

Type: boolean
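
The interaction between parseAll and the individual flags can be sketched as follows. This is an illustration of the documented defaults and override rule only, not the library's actual implementation; the `ParseFlags` type and the `resolveFlags` function are hypothetical names introduced for this example.

```typescript
// Hypothetical subset of the parser options that control what gets parsed.
interface ParseFlags {
  parseFeatures?: boolean
  parseDirectives?: boolean
  parseComments?: boolean
  parseSequences?: boolean
  parseAll?: boolean
}

type ResolvedFlags = Required<Omit<ParseFlags, 'parseAll'>>

// Apply the documented defaults, letting parseAll override everything else.
function resolveFlags(opts: ParseFlags = {}): ResolvedFlags {
  if (opts.parseAll) {
    // parseAll wins even if an individual flag was explicitly set to false.
    return {
      parseFeatures: true,
      parseDirectives: true,
      parseComments: true,
      parseSequences: true,
    }
  }
  return {
    parseFeatures: opts.parseFeatures ?? true, // default true
    parseDirectives: opts.parseDirectives ?? false, // default false
    parseComments: opts.parseComments ?? false, // default false
    parseSequences: opts.parseSequences ?? true, // default true
  }
}

console.log(resolveFlags({ parseComments: true }))
console.log(resolveFlags({ parseAll: true, parseComments: false }))
```

Under these assumptions, the second call reports every flag as true, because parseAll takes precedence over the explicit `parseComments: false`.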