JSONLParse
A high-performance, memory-safe TypeScript/JavaScript streaming parser for JSONL (JSON Lines) files with extensive configuration options inspired by csv-parse. Also included are a JSONL validator and converters to and from JSON and CSV.
Features
- 🚀 High Performance: Native Node.js streams with minimal overhead
- 🛡️ Memory Safe: Built-in protection against memory exhaustion
- 📝 TypeScript Support: Full type definitions and interfaces
- 🔧 Highly Configurable: Extensive options for data transformation and filtering
- 🌍 Cross-Platform: Handles both Unix (`\n`) and Windows (`\r\n`) line endings
- ⚡ Streaming: Process large files without loading everything into memory (see the basic example below)
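Basic Usage

A minimal sketch of typical streaming usage; the filename and event-handling style here are illustrative, and the full option set is covered in the sections below:

```ts
import { createReadStream } from 'node:fs'
import { JSONLParse } from 'jsonl-parse'

// Each parsed JSON line is emitted as a 'data' event,
// so large files never need to fit in memory.
const parser = new JSONLParse()

createReadStream('data.jsonl')
  .pipe(parser)
  .on('data', record => console.log(record))
  .on('error', err => console.error('Parse failed:', err))
```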
Metadata and Raw Output

```ts
const parser = new JSONLParse({
  info: true, // Include parsing metadata
  raw: true   // Include original line text
})

// Output: {
//   info: { lines: 1, records: 1, invalid_field_length: 0 },
//   raw: '{"name": "Alice"}',
//   record: { name: "Alice" }
// }
```
Whitespace Handling
```ts
const parser = new JSONLParse({
  trim: true,  // Trim both ends
  // or
  ltrim: true, // Left trim only
  rtrim: true, // Right trim only
})
```
Skip Empty Records
```ts
const parser = new JSONLParse({
  skip_records_with_empty_values: true // Skip records with all empty/null values
})
```
Nested Object Creation
```ts
const parser = new JSONLParse({
  objname: 'id' // Use 'id' field as object key
})

// Input:  {"id": "user1", "name": "Alice"}
// Output: { user1: { "id": "user1", "name": "Alice" } }
```
Memory-Safe Processing
```ts
const safeParser = new JSONLParse({
  maxLineLength: 1024 * 1024,   // 1MB per line maximum
  strict: false,                // Skip overly long lines instead of erroring
  skip_records_with_error: true // Continue on any parsing errors
})
```
Complex Data Pipeline
```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { Transform } from 'node:stream'
import { pipeline } from 'node:stream/promises'

const parser = new JSONLParse({
  columns: true,   // First line as headers
  cast: true,      // Auto-convert types
  cast_date: true, // Convert dates
  trim: true,      // Trim whitespace
  from: 2,         // Skip first data record
  skip_records_with_empty_values: true,
  on_record: (record) => {
    // Filter and transform
    if (record.status !== 'active') return null
    return { ...record, processed: true }
  },
  info: true       // Include metadata
})

const processor = new Transform({
  objectMode: true,
  transform(data, encoding, callback) {
    // Access both metadata and record
    const { info, record } = data
    const output = {
      ...record,
      metadata: info,
      processed_at: new Date().toISOString()
    }
    callback(null, `${JSON.stringify(output)}\n`)
  }
})

await pipeline(
  createReadStream('input.jsonl'),
  parser,
  processor,
  createWriteStream('output.jsonl')
)
```
Async Iterator Usage
```ts
import { createReadStream } from 'node:fs'
import { Readable } from 'node:stream'

const parser = new JSONLParse({
  cast: true,
  on_record: record => record.priority === 'high' ? record : null
})

const readable = Readable.from(createReadStream('data.jsonl').pipe(parser))

for await (const obj of readable) {
  console.log('High priority object:', obj)
  await processHighPriorityObject(obj)
}
```
JSONL Validator
JSONLParse includes a comprehensive validator for ensuring JSONL file integrity and schema compliance.
JSONLValidator - Validate JSONL Files
Validate JSONL files with comprehensive error reporting and optional schema validation.
```ts
interface ValidationResult {
  valid: boolean            // Overall validation result
  errors: ValidationError[] // List of validation errors
  totalLines: number        // Total lines processed
  validLines: number        // Number of valid lines
  invalidLines: number      // Number of invalid lines
}

interface ValidationError {
  line: number          // Line number (1-based)
  column?: number       // Column position for JSON errors
  message: string       // Error description
  value?: any           // Invalid value
  schema?: JSONLSchema  // Schema that failed
}
```
Validator Usage Examples
Basic Validation
```ts
import { validateJSONL } from 'jsonl-parse'

const jsonlData = `{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
invalid json line
{"id": 3, "name": "Charlie"}`

const result = validateJSONL(jsonlData, {
  strictMode: false,
  allowEmptyLines: true
})

console.log(`${result.validLines}/${result.totalLines} lines valid`)
// Output: 3/4 lines valid

result.errors.forEach((error) => {
  console.log(`Line ${error.line}: ${error.message}`)
})
// Output: Line 3: Invalid JSON: Unexpected token i in JSON at position 0
```
Schema Validation
```ts
const userSchema = {
  type: 'object',
  required: ['id', 'name', 'email'],
  properties: {
    id: { type: 'number', minimum: 1 },
    name: { type: 'string', minLength: 2, maxLength: 50 },
    email: { type: 'string', pattern: /^[^\s@]+@[^\s@][^\s.@]*\.[^\s@]+$/ },
    age: { type: 'number', minimum: 0, maximum: 150 },
    status: { type: 'string', enum: ['active', 'inactive', 'pending'] }
  }
}

const validator = new JSONLValidator({
  schema: userSchema,
  strictMode: true
})
// Will validate each line against the schema
```
Strict Mode Validation

```ts
const strictValidator = new JSONLValidator({
  strictMode: true,       // No whitespace, perfect formatting
  allowEmptyLines: false, // No empty lines allowed
  maxLineLength: 1000     // Reasonable line length limit
})

const result = validateJSONL(' {"valid": true} \n', { strictMode: true })
// Will report: "Line has leading or trailing whitespace"
```
Validate Then Process

```ts
import { createReadStream } from 'node:fs'
import { JSONLParse, JSONLValidator } from 'jsonl-parse'

// Validate then process valid records
const validator = new JSONLValidator({
  schema: {
    type: 'object',
    required: ['id', 'email'],
    properties: {
      id: { type: 'number' },
      email: { type: 'string', pattern: /^[^\s@]+@[^\s@][^\s.@]*\.[^\s@]+$/ }
    }
  }
})

const processor = new JSONLParse({
  strict: false,
  on_record: (record, context) => {
    // Only process records that passed validation
    return {
      ...record,
      processed_at: new Date().toISOString(),
      line_number: context.lines
    }
  }
})

// First validate, then process if valid
validator.on('data', (validationResult) => {
  const result = JSON.parse(validationResult.toString())
  if (result.valid) {
    console.log('✅ Validation passed, processing records...')
    createReadStream('input.jsonl').pipe(processor)
  } else {
    console.error('❌ Validation failed:')
    result.errors.forEach((error) => {
      console.error(`  Line ${error.line}: ${error.message}`)
    })
  }
})

createReadStream('input.jsonl').pipe(validator)
```
Format Converters
JSONLParse includes several built-in converters for transforming between different data formats:
JSONToJSONL - Convert JSON to JSONL
Convert JSON files (arrays or objects) to JSONL format.
```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { JSONToJSONL } from 'jsonl-parse'

const converter = new JSONToJSONL({
  arrayPath: 'data',         // Extract array from nested path
  flatten: true,             // Flatten nested objects
  maxObjectSize: 1024 * 1024 // 1MB per object limit
})

createReadStream('data.json')
  .pipe(converter)
  .pipe(createWriteStream('data.jsonl'))
```
JSONToJSONLOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `arrayPath` | `string` | `null` | Extract array from nested object path (e.g., `"data.items"`) |
| `replacer` | `function` | `null` | `JSON.stringify` replacer function |
| `encoding` | `BufferEncoding` | `'utf8'` | Text encoding |
| `maxObjectSize` | `number` | `Infinity` | Maximum size per JSON object |
| `flatten` | `boolean` | `false` | Flatten nested objects to dot notation |
| `rootKey` | `string` | `null` | Wrap first object in specified key |
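The `rootKey` and `replacer` options are not shown in the example above; as a sketch, assuming they behave exactly as the table describes:

```ts
import { JSONToJSONL } from 'jsonl-parse'

// Sketch: wrap output and strip a sensitive field.
// The replacer is a standard JSON.stringify replacer per the table above.
const converter = new JSONToJSONL({
  rootKey: 'payload', // Wrap the first object in a 'payload' key
  replacer: (key, value) => (key === 'password' ? undefined : value)
})
```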
JSONLToJSON - Convert JSONL to JSON
Convert JSONL files to JSON arrays or objects.
```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { JSONLToJSON } from 'jsonl-parse'

const converter = new JSONLToJSON({
  arrayWrapper: true,   // Wrap in array
  arrayName: 'results', // Use custom array name
  pretty: true,         // Pretty print output
  space: 2              // Indentation spaces
})

createReadStream('data.jsonl')
  .pipe(converter)
  .pipe(createWriteStream('data.json'))
```
JSONLToJSONOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `arrayWrapper` | `boolean` | `true` | Wrap objects in array |
| `arrayName` | `string` | `null` | Name for root array property |
| `pretty` | `boolean` | `false` | Pretty print JSON output |
| `space` | `string \| number` | `2` | Indentation for pretty printing |
| `encoding` | `BufferEncoding` | `'utf8'` | Text encoding |
| `maxObjects` | `number` | `Infinity` | Maximum objects to process |
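`maxObjects` is useful for sampling large files; a sketch, assuming the converter simply stops consuming once the limit is reached:

```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { JSONLToJSON } from 'jsonl-parse'

// Sketch: capture the first 100 records of a large JSONL file as a JSON array.
const sampler = new JSONLToJSON({
  maxObjects: 100
})

createReadStream('large.jsonl')
  .pipe(sampler)
  .pipe(createWriteStream('sample.json'))
```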
JSONLToCSV - Convert JSONL to CSV
Convert JSONL files to CSV format with full customization.
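A minimal usage sketch, assembled from the JSONLToCSV options that appear in the pipeline examples later in this README (`header`, `columns`, `maxObjectSize`); the column names are illustrative:

```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { JSONLToCSV } from 'jsonl-parse'

// Sketch: options are taken from the usage examples below.
const converter = new JSONLToCSV({
  header: true,                     // Emit a header row
  columns: ['id', 'name', 'email'], // Explicit column selection and order
  maxObjectSize: 1024 * 1024        // 1MB per object limit
})

createReadStream('data.jsonl')
  .pipe(converter)
  .pipe(createWriteStream('data.csv'))
```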
CSVToJSONL - Convert CSV to JSONL
Convert CSV files to JSONL format with robust parsing.
```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { CSVToJSONL } from 'jsonl-parse'

const converter = new CSVToJSONL({
  headers: true,             // Use first row as headers
  delimiter: ',',
  cast: true,                // Auto-convert types
  trim: true,                // Trim whitespace
  skipEmptyLines: true,
  flatten: true,             // Flatten objects to dot notation
  maxObjectSize: 1024 * 1024 // 1MB limit per object
})

createReadStream('data.csv')
  .pipe(converter)
  .pipe(createWriteStream('data.jsonl'))
```
CSVToJSONLOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `delimiter` | `string` | `','` | Field delimiter |
| `quote` | `string` | `'"'` | Quote character |
| `escape` | `string` | `'"'` | Escape character |
| `headers` | `boolean \| string[]` | `true` | Header handling |
| `skipEmptyLines` | `boolean` | `true` | Skip empty lines |
| `skipRecordsWithEmptyValues` | `boolean` | `false` | Skip records with empty values |
| `skipRecordsWithError` | `boolean` | `false` | Continue on parse errors |
| `replacer` | `function` | `null` | `JSON.stringify` replacer |
| `encoding` | `BufferEncoding` | `'utf8'` | Text encoding |
| `maxObjectSize` | `number` | `Infinity` | Maximum object size |
| `flatten` | `boolean` | `false` | Flatten nested objects |
| `rootKey` | `string` | `null` | Wrap objects in root key |
| `trim` | `boolean` | `true` | Trim field values |
| `cast` | `boolean \| function` | `false` | Type casting |
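Per the table, `headers` also accepts an explicit column list, which is handy for headerless files; a sketch:

```ts
import { CSVToJSONL } from 'jsonl-parse'

// Sketch: parse a headerless CSV by naming the columns up front.
const converter = new CSVToJSONL({
  headers: ['id', 'name', 'email'], // Column names instead of a header row
  skipRecordsWithError: true        // Keep going past malformed rows
})
```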
Converter Usage Examples
Batch Processing Pipeline
```ts
import { createReadStream, createWriteStream } from 'node:fs'
import { pipeline } from 'node:stream/promises'
import { CSVToJSONL, JSONLParse, JSONLToCSV, JSONLToJSON } from 'jsonl-parse'

// JSONL -> Process -> CSV
await pipeline(
  createReadStream('input.jsonl'),
  new JSONLParse({
    cast: true,
    on_record: record => ({
      ...record,
      processed: true,
      timestamp: new Date().toISOString()
    })
  }),
  new JSONLToCSV({ header: true }),
  createWriteStream('output.csv')
)

// CSV -> JSONL -> Process -> JSON
await pipeline(
  createReadStream('data.csv'),
  new CSVToJSONL({ cast: true }),
  new JSONLParse({ on_record: record => record.active ? record : null }),
  new JSONLToJSON({ pretty: true }),
  createWriteStream('filtered.json')
)
```
Data Transformation Examples
```ts
// Convert nested JSON to flat JSONL
const jsonToFlat = new JSONToJSONL({
  arrayPath: 'users',
  flatten: true
})

// Convert flat JSONL back to nested CSV
const flatToNested = new JSONLToCSV({
  unflatten: true,
  unflattenSeparator: '.',
  columns: ['id', 'profile.name', 'profile.email', 'settings.theme']
})

// Round-trip conversion with processing
await pipeline(
  createReadStream('nested.json'),
  jsonToFlat,
  new JSONLParse({
    on_record: (record) => {
      // Process flat structure
      record['profile.verified'] = true
      return record
    }
  }),
  flatToNested,
  createWriteStream('processed.csv')
)
```
Memory-Safe Large File Processing
```ts
// Process large files with memory constraints
const safeConverter = new JSONLToCSV({
  maxObjectSize: 512 * 1024, // 512KB per object
  cast: {
    // Compress large text fields
    object: obj => JSON.stringify(obj).slice(0, 1000)
  }
})

const safeParser = new JSONLParse({
  maxLineLength: 1024 * 1024, // 1MB per line
  strict: false,
  skip_records_with_error: true,
  on_skip: (error, line) => {
    console.warn(`Skipped problematic record: ${error.message}`)
  }
})

await pipeline(
  createReadStream('large-file.jsonl'),
  safeParser,
  safeConverter,
  createWriteStream('safe-output.csv')
)
```
Error Handling
Strict Mode Errors
```ts
const parser = new JSONLParse({ strict: true })

parser.on('error', (err) => {
  if (err.message.includes('Invalid JSON at line')) {
    console.error('JSON parsing failed:', err.message)
  } else if (err.message.includes('Line length')) {
    console.error('Line too long:', err.message)
  } else if (err.message.includes('Buffer size exceeded')) {
    console.error('Memory limit exceeded:', err.message)
  }
})
```