This repository provides a Rust library and a binary for efficient common and custom data encodings.
The library provides the following common encodings:
- HEXLOWER: lowercase hexadecimal
- HEXLOWER_PERMISSIVE: lowercase hexadecimal with case-insensitive decoding
- HEXUPPER: uppercase hexadecimal
- HEXUPPER_PERMISSIVE: uppercase hexadecimal with case-insensitive decoding
- BASE32: RFC4648 base32
- BASE32_NOPAD: RFC4648 base32 without padding
- BASE32_DNSSEC: RFC5155 base32
- BASE32_DNSCURVE: DNSCurve base32
- BASE32HEX: RFC4648 base32hex
- BASE32HEX_NOPAD: RFC4648 base32hex without padding
- BASE64: RFC4648 base64
- BASE64_NOPAD: RFC4648 base64 without padding
- BASE64_MIME: RFC2045-like base64
- BASE64URL: RFC4648 base64url
- BASE64URL_NOPAD: RFC4648 base64url without padding
Typical usage looks like:
    // allocating functions
    BASE64.encode(&input_to_encode)
    HEXLOWER.decode(&input_to_decode)
    // in-place functions
    BASE32.encode_mut(&input_to_encode, &mut encoded_output)
    BASE64URL.decode_mut(&input_to_decode, &mut decoded_output)
See the documentation or the changelog for more details.
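The in-place functions write into a buffer supplied by the caller, so the caller needs to know the encoded length in advance. The following is a minimal sketch of that length arithmetic for a base-conversion encoding with 1 to 6 bits per symbol. The helpers `gcd` and `encoded_len` are hypothetical names introduced for illustration and are not part of the library; refer to the library documentation for the functions it actually provides.

```rust
/// Greatest common divisor (Euclid's algorithm).
fn gcd(a: usize, b: usize) -> usize {
    if b == 0 { a } else { gcd(b, a % b) }
}

/// Number of symbols needed to encode `len` bytes with `bits` bits per
/// symbol (1 for base 2 up to 6 for base 64).
/// Hypothetical helper, not the library's API.
fn encoded_len(bits: usize, padded: bool, len: usize) -> usize {
    assert!((1..=6).contains(&bits));
    if padded {
        // A padded encoding works on whole blocks of lcm(8, bits) bits.
        let block_bits = 8 * bits / gcd(8, bits);
        let block_bytes = block_bits / 8;
        let block_symbols = block_bits / bits;
        // Round the input up to a whole number of blocks.
        (len + block_bytes - 1) / block_bytes * block_symbols
    } else {
        // Without padding, round the bit count up to whole symbols.
        (8 * len + bits - 1) / bits
    }
}
```

Under these assumptions, `encoded_len(6, true, 1)` is 4 (one byte of padded base64), `encoded_len(6, false, 1)` is 2, and `encoded_len(5, true, 1)` is 8 (one byte of padded base32).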
The library also lets you define custom little-endian ASCII base-conversion encodings for bases of size 2, 4, 8, 16, 32, and 64 (the common encodings above are all particular instances). It supports:
- padded and unpadded encodings
- canonical encodings (e.g. trailing bits are checked)
- in-place encoding and decoding functions
- partial decoding functions (e.g. for error recovery)
- character translation (e.g. for case-insensitivity)
- most and least significant bit-order
- ignoring characters when decoding (e.g. for skipping newlines)
- wrapping the output when encoding
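To make the bit-order option concrete, here is a minimal, standard-library-only sketch of an unpadded encoder for power-of-two bases. It is an illustration of the technique, not the library's implementation; the names `BitOrder` and `encode` are assumptions made for this example.

```rust
/// Order in which the bits of the input are consumed.
#[derive(Clone, Copy)]
enum BitOrder {
    MostSignificantFirst,
    LeastSignificantFirst,
}

/// Encodes `input` without padding, one symbol per group of
/// log2(symbols.len()) bits. `symbols` must hold 2, 4, 8, 16, 32, or 64
/// ASCII characters. Illustrative sketch, not the library's code.
fn encode(symbols: &str, order: BitOrder, input: &[u8]) -> String {
    let symbols = symbols.as_bytes();
    assert!(symbols.len().is_power_of_two() && (2..=64).contains(&symbols.len()));
    let bits = symbols.len().trailing_zeros(); // bits per symbol (1 to 6)
    let mask = (1u32 << bits) - 1;
    let mut output = String::new();
    let mut acc = 0u32; // pending bits not yet emitted
    let mut len = 0u32; // number of pending bits in `acc`
    for &byte in input {
        match order {
            BitOrder::MostSignificantFirst => {
                // New bytes enter at the bottom, symbols leave from the top.
                acc = (acc << 8) | u32::from(byte);
                len += 8;
                while len >= bits {
                    len -= bits;
                    output.push(symbols[((acc >> len) & mask) as usize] as char);
                }
                acc &= (1u32 << len) - 1; // keep only the leftover bits
            }
            BitOrder::LeastSignificantFirst => {
                // New bytes enter at the top, symbols leave from the bottom.
                acc |= u32::from(byte) << len;
                len += 8;
                while len >= bits {
                    output.push(symbols[(acc & mask) as usize] as char);
                    acc >>= bits;
                    len -= bits;
                }
            }
        }
    }
    if len > 0 {
        // Emit the leftover bits as a final symbol, zero-filled.
        let index = match order {
            BitOrder::MostSignificantFirst => (acc << (bits - len)) & mask,
            BitOrder::LeastSignificantFirst => acc & mask,
        };
        output.push(symbols[index as usize] as char);
    }
    output
}
```

With the symbols `0123456789abcdef`, the bytes `0xde 0xad` encode to `dead` in most-significant-first order and to `edda` in least-significant-first order. With the DNSCurve symbols `0123456789bcdfghjklmnpqrstuvwxyz` and least-significant-first order, the bytes `0x64 0x88` encode to `4321`.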
The typical definition of a custom encoding looks like:
    use data_encoding::{Encoding, Specification};
    use lazy_static::lazy_static;

    lazy_static! {
        static ref HEX: Encoding = {
            let mut spec = Specification::new();
            spec.symbols.push_str("0123456789abcdef");
            // Decode uppercase digits as their lowercase counterparts.
            spec.translate.from.push_str("ABCDEF");
            spec.translate.to.push_str("abcdef");
            spec.encoding().unwrap()
        };
        static ref BASE64: Encoding = {
            let mut spec = Specification::new();
            spec.symbols.push_str(
                "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/");
            spec.padding = Some('=');
            spec.encoding().unwrap()
        };
    }
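One way to picture what the symbols, translation, and ignored characters of a specification mean for decoding is a per-byte lookup table. The sketch below is standard-library-only and is not claimed to be how the library works internally; `Action`, `build_table`, and `decode_hex` are hypothetical names used for illustration.

```rust
/// What a decoder does with one input byte.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Action {
    Value(u8), // the byte is a symbol with this value
    Ignore,    // the byte is skipped
    Invalid,   // the byte is rejected
}

/// Builds a 256-entry table from ASCII `symbols`, a translation in which
/// `from[i]` decodes like `to[i]`, and a set of ignored characters.
fn build_table(symbols: &str, from: &str, to: &str, ignore: &str) -> [Action; 256] {
    let mut table = [Action::Invalid; 256];
    for (value, symbol) in symbols.bytes().enumerate() {
        table[symbol as usize] = Action::Value(value as u8);
    }
    for (f, t) in from.bytes().zip(to.bytes()) {
        table[f as usize] = table[t as usize];
    }
    for c in ignore.bytes() {
        table[c as usize] = Action::Ignore;
    }
    table
}

/// Decodes hexadecimal text with `table`. On failure, returns the position
/// of the offending byte, or the input length if a digit is missing.
fn decode_hex(table: &[Action; 256], input: &str) -> Result<Vec<u8>, usize> {
    let mut output = Vec::new();
    let mut high: Option<u8> = None; // pending high nibble, if any
    for (pos, byte) in input.bytes().enumerate() {
        match table[byte as usize] {
            Action::Ignore => continue,
            Action::Invalid => return Err(pos),
            Action::Value(v) => match high.take() {
                None => high = Some(v),
                Some(h) => output.push((h << 4) | v),
            },
        }
    }
    if high.is_some() {
        return Err(input.len()); // odd number of digits
    }
    Ok(output)
}
```

For a table built with `build_table("0123456789abcdef", "ABCDEF", "abcdef", "\n")`, decoding `"DEad\nbeEF"` yields the bytes `0xde 0xad 0xbe 0xef`, while decoding `"zz"` fails at position 0.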
You may also use the macro library to define a compile-time custom encoding:
    use data_encoding::Encoding;
    use data_encoding_macro::new_encoding;

    const HEX: Encoding = new_encoding!{
        symbols: "0123456789abcdef",
        translate_from: "ABCDEF",
        translate_to: "abcdef",
    };
    const BASE64: Encoding = new_encoding!{
        symbols: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",
        padding: '=',
    };
See the documentation or the changelog for more details.
The performance of the encoding and decoding functions (for both common and
custom encodings) is similar to existing implementations in C, Rust, and other
high-performance languages. You may run the benchmarks with `make bench`.
The binary is mostly a wrapper around the library. You can run `make install`
to install it from the repository. By default, it is installed as
`~/.cargo/bin/data-encoding`. You can also run
`cargo install data-encoding-bin` to install the latest version published on
crates.io. This second option does not require cloning the repository.
Once installed, you can run `data-encoding --help` (assuming `~/.cargo/bin` is
in your `PATH` environment variable) to see the usage:
    Usage: data-encoding --mode=<mode> --base=<base> [<options>]
    Usage: data-encoding --mode=<mode> --symbols=<symbols> [<options>]

    Options:
        -m, --mode <mode>   {encode|decode|describe}
        -b, --base <base>   {16|hex|32|32hex|64|64url}
        -i, --input <file>  read from <file> instead of standard input
        -o, --output <file> write to <file> instead of standard output
            --block <size>  read blocks of about <size> bytes
        -p, --padding <padding>
                            pad with <padding>
        -g, --ignore <ignore>
                            when decoding, ignore characters in <ignore>
        -w, --width <cols>  when encoding, wrap every <cols> characters
        -s, --separator <separator>
                            when encoding, wrap with <separator>
            --symbols <symbols>
                            define a custom base using <symbols>
            --translate <new><old>
                            when decoding, translate <new> as <old>
            --ignore_trailing_bits
                            when decoding, ignore non-zero trailing bits
            --least_significant_bit_first
                            use least significant bit first bit-order

    Examples:
        # Encode using the RFC4648 base64 encoding
        data-encoding -mencode -b64     # without padding
        data-encoding -mencode -b64 -p= # with padding

        # Encode using the MIME base64 encoding
        data-encoding -mencode -b64 -p= -w76 -s$'\r\n'

        # Show base information for the permissive hexadecimal encoding
        data-encoding --mode=describe --base=hex

        # Decode using the DNSCurve base32 encoding
        data-encoding -mdecode \
            --symbols=0123456789bcdfghjklmnpqrstuvwxyz \
            --translate=BCDFGHJKLMNPQRSTUVWXYZbcdfghjklmnpqrstuvwxyz \
            --least_significant_bit_first