Primary language: Go. License: MIT.

jsonstreamer

JSONstream is a Go library for processing large JSON array files by streaming them instead of loading them whole. It provides memory-efficient batch processing, detailed error handling, context-aware operations, and flexible output mechanisms.

Features

  • Generic Implementation: Works with any JSON-decodable type using Go generics
  • Context-Aware: Supports graceful cancellation and timeout handling
  • Streaming JSON Processing: Process large JSON arrays without loading the entire file into memory
  • Flexible Output Handling:
    • Buffer channel for batch processing
    • Error channel for asynchronous error handling
    • Counter channel for progress tracking
    • Per-entry callback processing
    • Batch processing callbacks
  • Robust Error Handling:
    • Panic recovery with stack traces
    • Error thresholding to prevent infinite processing
    • Asynchronous error reporting
    • Detailed error context preservation
  • Configurable Processing:
    • Options-based configuration
    • Adjustable buffer size for batch processing
    • Ability to skip initial entries
    • Support for processing specific ranges of entries
    • Custom processor functions
  • Type Safety: Leverages Go generics for type-safe processing
  • Memory Efficient: Processes data in chunks to maintain low memory footprint
  • Recovery Mechanisms:
    • Tracks last valid entry offset
    • Supports skipping invalid entries
    • Maintains processing state

Install

go get github.com/thalesfsp/jsonstream

Error Handling

The library provides comprehensive error handling and reporting:

  • Panic Recovery:
    • Full stack trace capture
    • Safe error propagation
    • Resource cleanup
  • Error Context:
    • Original error message
    • File position information
    • Last successfully processed entry
    • Content of the problematic entry
    • Custom error fields and tags
  • Error Scenarios:
    • Context cancellation
    • Invalid JSON structure
    • Missing required fields
    • Type conversion errors
    • File access issues
    • Processing thresholds exceeded
    • Callback failures
  • Error Reporting:
    • Synchronous error returns
    • Asynchronous error channel
    • Error accumulation for batch reporting
    • Custom error types with rich context

Contributing

  1. Fork the repository
  2. Clone your fork
  3. Create a feature branch
  4. Make your changes
  5. Run tests
  6. Create a pull request