
Primary language: MoonBit. License: Apache-2.0.

TOML Parser in MoonBit

This is a simple TOML parser written in MoonBit that handles a basic subset of TOML syntax. It was built iteratively, starting from a minimal parser and improving it step by step.

Project Structure

The project has the following structure:

toml_parser/
├── src/
│   ├── toml/
│   │   ├── types.mbt      # Defines TOML value types and document structure
│   │   ├── simple.mbt     # Simple parser implementation 
│   │   ├── parser.mbt     # Main parser entry point
│   │   └── toml_test.mbt  # Unit tests
│   └── main/
│       ├── main.mbt       # Demo application
│       └── moon.pkg.json  # Package configuration
└── moon.mod.json          # Module configuration

Features

The current implementation supports:

  1. Basic key-value pairs
  2. String values
  3. Integer values
  4. Floating-point values
  5. Boolean values
  6. Arrays
  7. Inline tables
  8. Comments
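
As a rough illustration, the input below exercises each feature in the list above. It is a made-up sample, not taken from the repository's tests, and the name sample is hypothetical. It is written as a MoonBit multi-line string so it can be passed straight to a parser entry point.

```moonbit
// Illustrative TOML input only; not taken from the repository's tests.
// Each `#|` line is one line of the multi-line string literal.
let sample : String =
  #|# a comment
  #|title = "demo"
  #|port = 8080
  #|ratio = 0.5
  #|debug = true
  #|tags = ["a", "b"]
  #|owner = { name = "sam", admin = false }

fn main {
  // Print the sample so the raw TOML text can be inspected.
  println(sample)
}
```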

TOML Types

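types.mbt defines a Value enum covering the TOML data types the parser currently supports (date and time values are not yet represented), plus a Document struct that wraps the root table. Note that Integer wraps MoonBit's 32-bit Int, whereas the TOML spec calls for 64-bit integers, so full-range support would need Int64: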

/// Represents a TOML value
pub enum Value {
  /// A string value
  String(String)
  /// An integer value
  Integer(Int)
  /// A floating-point value
  Float(Double)
  /// A boolean value
  Boolean(Bool)
  /// A table (key-value mapping)
  Table(Map[String, Value])
  /// An array of TOML values
  Array(Array[Value])
}

/// Represents a TOML document
pub struct Document {
  root : Map[String, Value]
}
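
A minimal sketch of how these types might be used: building a small root table by hand and inspecting a value with pattern matching. The enum is repeated so the example is self-contained. The helper type_name is hypothetical and not part of the repository.

```moonbit
// Restates the Value enum from above so the sketch is self-contained.
enum Value {
  String(String)
  Integer(Int)
  Float(Double)
  Boolean(Bool)
  Table(Map[String, Value])
  Array(Array[Value])
}

// Hypothetical helper (not in the repository): names the TOML type of a value.
fn type_name(v : Value) -> String {
  match v {
    String(_) => "string"
    Integer(_) => "integer"
    Float(_) => "float"
    Boolean(_) => "boolean"
    Table(_) => "table"
    Array(_) => "array"
  }
}

fn main {
  // Roughly what `port = 8080` and `tags = ["a", "b"]` would become after parsing.
  let root : Map[String, Value] = {
    "port": Value::Integer(8080),
    "tags": Value::Array([Value::String("a"), Value::String("b")])
  }
  // Map::get returns an Option, so a missing key is handled explicitly.
  match root.get("port") {
    Some(v) => println(type_name(v))
    None => println("missing")
  }
}
```

Because the match in type_name must cover every constructor, adding a new variant later (for example a date-time value) makes the compiler flag each place that needs updating.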

Parser Implementation

The parser is deliberately minimal: it handles only what our test cases exercise. A complete implementation would also need to support:

  • Nested tables
  • Arrays of tables
  • Multiline strings
  • Date and time values
  • Special floating-point values (inf, nan)
  • Escape sequences in strings
  • Unicode characters

Lessons Learned

During this implementation, we ran into several MoonBit-specific details:

  1. Character-level string handling means working with character codes explicitly, for example when converting a digit character to its numeric value
  2. MoonBit's substring function takes labeled arguments (start and end) rather than positional ones
  3. The toolchain we used did not accept a prefix ! operator for boolean negation; the built-in not(...) function serves that purpose
  4. Variables that are reassigned must be declared explicitly with let mut
  5. Pattern matching is a natural fit for handling the different value types
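
Several of these points show up together in even a tiny scanning routine. The sketch below is a hypothetical helper (parse_digits is not from the repository) that turns a run of ASCII digits into an Int. It uses character codes, let mut, and the built-in not(...) function. It assumes a toolchain where for .. in iterates over the characters of a String, and it does not guard against overflow.

```moonbit
// Hypothetical helper, not from the repository: parses a run of ASCII digits.
// Returns None for an empty string or any non-digit character.
fn parse_digits(s : String) -> Int? {
  let zero = '0'.to_int() // character code of '0'
  let mut acc = 0 // `mut` is required because acc is reassigned
  let mut seen = false
  for c in s {
    // `not(...)` is the function form of boolean negation
    if not(c >= '0' && c <= '9') {
      return None
    }
    // Convert the digit's character code to its numeric value.
    acc = acc * 10 + (c.to_int() - zero)
    seen = true
  }
  if seen {
    Some(acc)
  } else {
    None
  }
}

fn main {
  // Pattern matching on the Option result.
  match parse_digits("8080") {
    Some(n) => println(n)
    None => println("not a number")
  }
}
```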

Next Steps

To further improve this TOML parser, you could:

  1. Implement a proper lexer and parser for a more robust solution
  2. Support the full TOML spec including nested tables, dates, and more
  3. Add better error reporting with line and column numbers
  4. Optimize string operations to improve performance
  5. Implement serialization from MoonBit objects to TOML
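
As a starting point for the serialization item, here is a minimal sketch that renders a Value as inline TOML text. The function to_toml is hypothetical and not part of the repository. It does not escape special characters in strings, it leaves float formatting to to_string, and it writes tables in inline form only.

```moonbit
// Restates the Value enum from types.mbt so the sketch is self-contained.
enum Value {
  String(String)
  Integer(Int)
  Float(Double)
  Boolean(Bool)
  Table(Map[String, Value])
  Array(Array[Value])
}

// Hypothetical serializer: renders a value as inline TOML.
// Assumption: strings contain no characters that need escaping.
fn to_toml(v : Value) -> String {
  match v {
    String(s) => "\"" + s + "\""
    Integer(i) => i.to_string()
    Float(f) => f.to_string()
    Boolean(b) => if b { "true" } else { "false" }
    Array(items) => {
      let mut out = "["
      let mut first = true
      for item in items {
        // Separate elements with ", " after the first one.
        if not(first) {
          out = out + ", "
        }
        out = out + to_toml(item)
        first = false
      }
      out + "]"
    }
    Table(entries) => {
      let mut out = "{"
      let mut first = true
      for key, value in entries {
        if not(first) {
          out = out + ", "
        }
        out = out + key + " = " + to_toml(value)
        first = false
      }
      out + "}"
    }
  }
}

fn main {
  let v = Value::Array([Value::Integer(1), Value::Boolean(true)])
  println(to_toml(v))
}
```

Repeated string concatenation keeps the sketch short. A real implementation would more likely append to a single buffer, which ties in with the string-performance item above.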

This implementation provides a good foundation for understanding both TOML parsing and MoonBit programming patterns.