The first StAX-style XML parser/writer for JavaScript/TypeScript

StAX-XML

A high-performance, pull-based XML parser for JavaScript/TypeScript inspired by Java's StAX (Streaming API for XML). It offers both fully asynchronous, stream-based parsing for large files and synchronous parsing for smaller, in-memory XML documents. Unlike traditional XML-to-JSON mappers, StAX-XML allows you to map XML data to any custom structure you desire while efficiently handling XML files through streaming or direct string processing.

🚀 Features

  • Declarative Converter API: Zod-style schema API for type-safe XML parsing and writing
  • XPath Support: Use XPath expressions for flexible element selection
  • Bidirectional Transformation: Parse XML to objects and write objects back to XML
  • Fully Asynchronous (Stream-based): For memory-efficient processing of large XML files
  • Synchronous (String-based): For high-performance parsing of smaller, in-memory XML strings
  • Pull-based Parsing: Your code iterates over parse events on demand instead of receiving push-style callbacks
  • Custom Mapping: Map XML data to any structure you want, not just plain JSON objects
  • High Performance: Optimized for speed and low memory usage
  • Universal Compatibility: Works in Node.js, Bun, Deno, and web browsers using only Web Standard APIs
  • Namespace Support: Basic XML namespace handling
  • Entity Support: Built-in entity decoding with custom entity support
  • TypeScript Ready: Full TypeScript support with comprehensive type definitions

📦 Installation

# npm
npm install stax-xml
# yarn
yarn add stax-xml
# pnpm
pnpm add stax-xml
# bun
bun add stax-xml
# deno
deno add npm:stax-xml

🔧 Quick Start

Here are basic examples to get started. StAX-XML provides two parsing approaches:

  1. Event-based API: Low-level streaming parser for fine-grained control
  2. Converter API: Declarative, Zod-style schema API for type-safe XML parsing and writing

Declarative Parsing with Converter API (Recommended)

The converter module provides a Zod-style declarative API for parsing and writing XML:

import { x } from 'stax-xml/converter';

// Define schema with XPath
const bookSchema = x.object({
  title: x.string().xpath('/book/title'),
  author: x.string().xpath('/book/author'),
  price: x.number().xpath('/book/price'),
  tags: x.string().array().xpath('/book/tags/tag')
});

// Parse XML
const xml = `
  <book>
    <title>TypeScript Deep Dive</title>
    <author>John Smith</author>
    <price>29.99</price>
    <tags>
      <tag>programming</tag>
      <tag>typescript</tag>
    </tags>
  </book>
`;

const result = await bookSchema.parse(xml);
// Result: { title: 'TypeScript Deep Dive', author: 'John Smith', price: 29.99, tags: ['programming', 'typescript'] }

// Write XML back
const newXml = await bookSchema.write(result, { rootElement: 'book' });

Key features of the Converter API:

  • Type-safe parsing: Infer TypeScript types from schemas
  • XPath support: Use XPath expressions for element selection
  • Bidirectional: Parse XML → Object and Object → XML
  • Composable: Build complex schemas from simple primitives
  • Optional values: Handle missing elements gracefully with .optional()
  • Transformations: Apply custom transformations with .transform()
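The idea of inferring TypeScript types from a schema can be illustrated with a toy builder. The sketch below is not stax-xml's implementation or API; `Schema`, `str`, `num`, `obj`, and `Infer` are hypothetical names that only show how a Zod-style builder can carry a static type alongside a runtime parser.

```typescript
// A schema pairs a runtime parse function with a compile-time output type T.
// All names here are hypothetical and unrelated to stax-xml's real API.
interface Schema<T> {
  parse(raw: unknown): T;
}

// Extract the output type carried by a schema.
type Infer<S> = S extends Schema<infer T> ? T : never;

const str = (): Schema<string> => ({
  parse: (raw) => String(raw),
});

const num = (): Schema<number> => ({
  parse: (raw) => {
    const n = Number(raw);
    if (Number.isNaN(n)) throw new TypeError(`Not a number: ${String(raw)}`);
    return n;
  },
});

// Build an object schema whose output type is derived from its field schemas.
function obj<F extends Record<string, Schema<unknown>>>(
  fields: F,
): Schema<{ [K in keyof F]: Infer<F[K]> }> {
  return {
    parse(raw) {
      const source = raw as Record<string, unknown>;
      const out: Record<string, unknown> = {};
      for (const key of Object.keys(fields)) {
        out[key] = fields[key].parse(source[key]);
      }
      return out as { [K in keyof F]: Infer<F[K]> };
    },
  };
}

const toyBook = obj({ title: str(), price: num() });
type ToyBook = Infer<typeof toyBook>; // { title: string; price: number }

// The compiler knows `price` is a number, so arithmetic type-checks.
const book: ToyBook = toyBook.parse({ title: 'TypeScript Deep Dive', price: '29.99' });
console.log(book.title, book.price * 2);
```

The point of the pattern is that the schema is the single source of truth: the static type is derived from it rather than declared separately.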

Event-based Parsing (Low-level API)

Basic Asynchronous Parsing (StaxXmlParser)

import { StaxXmlParser, XmlEventType } from 'stax-xml';

const xmlContent = '<root><item>Hello</item></root>';
const stream = new ReadableStream({
  start(controller) {
    controller.enqueue(new TextEncoder().encode(xmlContent));
    controller.close();
  }
});

async function parseXml() {
  const parser = new StaxXmlParser(stream);
  for await (const event of parser) {
    console.log(event);
  }
}
parseXml();
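
Real streams rarely arrive as a single chunk the way the example above builds one. As a sketch using only Web Standard APIs, the helper below splits an XML string into fixed-size byte chunks and exposes them as a `ReadableStream`, which can be handy for exercising a streaming parser across chunk boundaries. `toByteChunks` and `chunkedStream` are hypothetical helpers, not part of stax-xml, and how the parser treats a chunk boundary that falls inside a multi-byte character is something to verify rather than assume.

```typescript
// Split a string into UTF-8 byte chunks of at most `chunkSize` bytes.
// Hypothetical helper for illustration; not part of stax-xml.
function toByteChunks(text: string, chunkSize: number): Uint8Array[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const bytes = new TextEncoder().encode(text);
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    chunks.push(bytes.slice(offset, offset + chunkSize));
  }
  return chunks;
}

// Wrap the chunks in a ReadableStream that emits one chunk per pull.
function chunkedStream(text: string, chunkSize: number): ReadableStream<Uint8Array> {
  const chunks = toByteChunks(text, chunkSize);
  let index = 0;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (index < chunks.length) {
        controller.enqueue(chunks[index++]);
      } else {
        controller.close();
      }
    },
  });
}

// The 31-byte sample document splits into four chunks of up to 8 bytes.
const demoChunks = toByteChunks('<root><item>Hello</item></root>', 8);
console.log(demoChunks.map((chunk) => chunk.length));
```

The resulting stream can stand in for the single-chunk stream above, for example `new StaxXmlParser(chunkedStream(xmlContent, 8))`.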

Basic Synchronous Parsing (StaxXmlParserSync)

import { StaxXmlParserSync, XmlEventType } from 'stax-xml';

const xmlContent = '<data><value>123</value></data>';
const parser = new StaxXmlParserSync(xmlContent);

for (const event of parser) {
  console.log(event);
}
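
To illustrate the custom-mapping idea behind the pull loop above, here is a minimal sketch that folds a sequence of events into an application-specific structure. The `PullEvent` shape (`type`, `name`, `value`) is a hypothetical stand-in defined locally so the snippet is self-contained; the event objects actually yielded by the parsers may differ, so check the library's type definitions before adapting it.

```typescript
// Hypothetical event shape, defined here only for illustration.
// It is NOT taken from stax-xml's type definitions.
type PullEvent =
  | { type: 'start'; name: string }
  | { type: 'text'; value: string }
  | { type: 'end'; name: string };

// Collect the text content of every <value> element into a number array.
function collectValues(events: Iterable<PullEvent>): number[] {
  const values: number[] = [];
  let insideValue = false;
  let buffer = '';

  for (const event of events) {
    if (event.type === 'start' && event.name === 'value') {
      insideValue = true;
      buffer = '';
    } else if (event.type === 'text' && insideValue) {
      // Text may arrive in several pieces, so accumulate it.
      buffer += event.value;
    } else if (event.type === 'end' && event.name === 'value') {
      values.push(Number(buffer.trim()));
      insideValue = false;
    }
  }
  return values;
}

// Hand-written events standing in for '<data><value>123</value></data>'.
const sampleEvents: PullEvent[] = [
  { type: 'start', name: 'data' },
  { type: 'start', name: 'value' },
  { type: 'text', value: '123' },
  { type: 'end', name: 'value' },
  { type: 'end', name: 'data' },
];

console.log(collectValues(sampleEvents));
```

Because the consumer drives the loop, only the state it cares about (here a flag and a text buffer) is kept in memory, and the output can be any structure rather than a generic JSON tree.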

For detailed API documentation, see the official documentation.

🌐 Platform Compatibility

StAX-XML uses only Web Standard APIs, making it compatible with:

  • Node.js (v18+)
  • Bun (any version)
  • Deno (any version)
  • Web Browsers (modern browsers)
  • Edge Runtime (Vercel, Cloudflare Workers, etc.)
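
Because compatibility hinges on Web Standard APIs being available as globals, a quick runtime check can be useful when targeting older or unusual environments. The sketch below looks for `ReadableStream`, `TextEncoder`, and `TextDecoder`. That list is an assumption: the first two appear in this README's examples and `TextDecoder` is a guess, so it is not an authoritative statement of what the library requires.

```typescript
// Globals assumed to matter. This list is a guess based on this README's
// examples, not a documented requirement of stax-xml.
const ASSUMED_GLOBALS = ['ReadableStream', 'TextEncoder', 'TextDecoder'] as const;

// Return the names of the assumed globals that the given scope lacks.
function missingWebApis(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): string[] {
  return ASSUMED_GLOBALS.filter((name) => typeof scope[name] === 'undefined');
}

const missing = missingWebApis();
console.log(
  missing.length === 0
    ? 'All assumed Web APIs are available'
    : `Missing: ${missing.join(', ')}`,
);
```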

🧪 Testing

bun test

Benchmark Results

Disclaimer: These benchmarks were performed on a specific system (cpu: 13th Gen Intel(R) Core(TM) i5-13600K, runtime: node 22.17.0 (x64-win32)) and may vary on different hardware and environments.

large.xml (97MB) parsing

| Benchmark | avg | p75 | Memory (avg) |
|---|---|---|---|
| stax-xml to object | 4.36 s/iter | 4.42 s | 2.66 MB |
| stax-xml consume | 3.61 s/iter | 3.65 s | 3.13 MB |
| xml2js | 6.00 s/iter | 6.00 s | 1.80 MB |
| fast-xml-parser | 4.25 s/iter | 4.26 s | 151.81 MB |
| txml | 1.05 s/iter | 1.06 s | 179.81 MB |
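
To put the large.xml averages above on a common footing, the following sketch converts each average time into approximate throughput (file size divided by time). The times are copied from the results above; the 97 MB size comes from the heading and is treated as exact here, which is an approximation.

```typescript
// File size from the "large.xml (97MB)" heading, treated as exact.
const LARGE_XML_MB = 97;

// Average seconds per iteration, copied from the large.xml results.
const avgSeconds: Record<string, number> = {
  'stax-xml to object': 4.36,
  'stax-xml consume': 3.61,
  'xml2js': 6.0,
  'fast-xml-parser': 4.25,
  'txml': 1.05,
};

// Throughput in MB/s, rounded to one decimal place.
function throughputMbPerSec(sizeMb: number, seconds: number): number {
  return Math.round((sizeMb / seconds) * 10) / 10;
}

for (const [name, seconds] of Object.entries(avgSeconds)) {
  console.log(`${name}: ~${throughputMbPerSec(LARGE_XML_MB, seconds)} MB/s`);
}
```

Throughput is only half the picture: in the same results txml is the fastest but averages about 180 MB of memory, while both stax-xml runs stay in the low single-digit megabytes.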

midsize.xml (13MB) parsing

| Benchmark | avg | p75 | Memory (avg) |
|---|---|---|---|
| stax-xml to object | 492.06 ms/iter | 493.28 ms | 326.28 KB |
| stax-xml consume | 469.66 ms/iter | 471.54 ms | 174.51 KB |
| xml2js | 163.26 µs/iter | 161.20 µs | 89.89 KB |
| fast-xml-parser | 529.99 ms/iter | 531.12 ms | 1.92 MB |
| txml | 112.81 ms/iter | 113.26 ms | 1.00 MB |

complex.xml (2KB) parsing

| Benchmark | avg | p75 | Memory (avg) |
|---|---|---|---|
| stax-xml to object | 85.79 µs/iter | 75.60 µs | 105.11 KB |
| stax-xml consume | 50.38 µs/iter | 49.43 µs | 271.12 B |
| xml2js | 147.45 µs/iter | 153.50 µs | 89.42 KB |
| fast-xml-parser | 101.11 µs/iter | 102.20 µs | 92.92 KB |
| txml | 9.40 µs/iter | 9.41 µs | 125.89 B |

books.xml (4KB) parsing

| Benchmark | avg | p75 | Memory (avg) |
|---|---|---|---|
| stax-xml to object | 166.73 µs/iter | 156.20 µs | 221.40 KB |
| stax-xml consume | 176.45 µs/iter | 151.70 µs | 202.08 KB |
| xml2js | 259.90 µs/iter | 254.50 µs | 161.25 KB |
| fast-xml-parser | 239.57 µs/iter | 203.30 µs | 226.17 KB |
| txml | 19.18 µs/iter | 19.26 µs | 303.13 B |

📁 Sample File Sources

Sources of sample XML files used in testing:

📄 License

MIT

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


📖 Documentation

For detailed usage, API reference, and tutorials, see the official documentation.
