Human-readable byte sizes

Primary language: Java. License: Apache License 2.0.

DataSize Utils

Minimal, dependency-free library dedicated to printing human-readable byte sizes.

Features

  • Blazingly fast
  • Small. Adds about 10KB to your project.
  • Supports both decimal calculation (1 kilobyte = 1000 bytes) and binary calculation (1 kibibyte = 1024 bytes).
  • Configurable:
    • Number of decimals (per unit type, for example for Megabytes print with 3 decimals)
    • Suffix (per unit type, for example for Megabytes it may be MB)
    • Decimal separator. Typically either . or ,.
  • Includes many preset suffixes: SI, ISO-80000, Customary, GNU, etc.
  • Can be used with Java 8 onwards.
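
To make the difference between decimal and binary calculation concrete, here is a minimal sketch in plain Java. It does not use this library; the class name UnitBaseDemo and the method thousandths are made up for illustration, and the multiplication is only safe for moderately sized inputs.

```java
public class UnitBaseDemo {
    // Decimal calculation: 1 megabyte = 1000 * 1000 bytes
    public static final long MEGABYTE = 1000L * 1000L;
    // Binary calculation: 1 mebibyte = 1024 * 1024 bytes
    public static final long MEBIBYTE = 1024L * 1024L;

    /**
     * Returns the size expressed in thousandths of the given unit,
     * truncated, using integer arithmetic only.
     * Illustration only: bytes * 1000 overflows for very large inputs.
     */
    public static long thousandths(long bytes, long unit) {
        return bytes * 1000L / unit;
    }

    public static void main(String[] args) {
        long bytes = 2_000_000L;
        System.out.println(thousandths(bytes, MEGABYTE)); // 2000, i.e. 2.000 MB
        System.out.println(thousandths(bytes, MEBIBYTE)); // 1907, i.e. 1.907 MiB
    }
}
```

The same byte count therefore reads as 2.000 with decimal units but only 1.907 with binary units.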

Documentation

For in-depth documentation see JavaDoc.

Usage

The library is available on Maven Central.

<dependency>
    <groupId>net.lbruun</groupId>
    <artifactId>datasize</artifactId>
    <version>  --LATEST--  </version>
</dependency>

Code examples

Using the convenience methods:

DataSize.asStringBinary (2_000_000L);  // produces "1.9 MiB"
DataSize.asStringDecimal(2_000_000L);  // produces "2.0 MB"

Full control:

// Increase decimals for the 'mega' unit
DataSizeUnitDecimals decimals = DataSizeUnitDecimals.builder()
                .withMegabyteDecimals(3)
                .build();

DataSize.asString(
    2_000_000L,
    true,                                 // use binary (true) or decimal (false)
    DataSizeUnitSuffixes.SUFFIXES_GNU,    // suffixes to use
    '.',                                  // decimal separator
    decimals                              // number of decimals, by unit
    );
// produces "1.907M"

Pre-defined suffix sets

A number of pre-defined suffix sets are included:

For use with calc. type | Suffixes set        | Description
BINARY                  | SUFFIXES_ISO80000   | ISO 80000 / International Electrotechnical Commission (IEC)
BINARY                  | SUFFIXES_CUSTOMARY  | Unit suffixes known as customary. These are used for example by the Microsoft Windows operating system.
BINARY                  | SUFFIXES_GNU        | Unit suffixes used by GNU/Linux ls -h command. This is a very dense format with no space between the digits and the suffix.
DECIMAL                 | SUFFIXES_SI         | International System of Units (SI)
DECIMAL                 | SUFFIXES_GNU_SI     | Unit suffixes used by GNU/Linux ls --si command. This is a very dense format with no space between the digits and the suffix.
                        | (builder)           | Roll your own

Performance

The core formatting routine is optimized for speed by:

  • Avoiding String.format, DecimalFormat and the like.
  • Avoiding costly Math functions. Only integer arithmetic is used.
  • Favoring integer comparison over integer multiplication.

Honestly, none of this matters if you are formatting a display of 50 files. But it matters if you are formatting a display of 50,000 files.

Also, for accuracy, the routine avoids floating point completely. We don't want those mysterious rounding errors!
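
As a rough illustration of the ideas above, here is a minimal sketch of an integer-only formatter in plain Java. It is not this library's actual implementation: the class IntegerOnlyFormat and its methods are hypothetical, it only handles binary units up to GiB, always prints one decimal, and truncates rather than rounds.

```java
public class IntegerOnlyFormat {
    private static final long KIB = 1024L;
    private static final long MIB = KIB * 1024L;
    private static final long GIB = MIB * 1024L;

    /** Formats a non-negative byte count with one decimal, binary units. */
    public static String format(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        // The unit is picked with plain integer comparisons; no logarithms.
        if (bytes < KIB) return bytes + " B";
        if (bytes < MIB) return scaled(bytes, KIB, " KiB");
        if (bytes < GIB) return scaled(bytes, MIB, " MiB");
        return scaled(bytes, GIB, " GiB");
    }

    // Integer division and remainder only; no String.format, no double.
    private static String scaled(long bytes, long unit, String suffix) {
        long whole = bytes / unit;
        // The remainder is smaller than the unit (at most 2^30),
        // so multiplying it by 10 cannot overflow a long.
        long tenths = (bytes % unit) * 10L / unit; // truncated, not rounded
        return new StringBuilder()
                .append(whole).append('.').append(tenths).append(suffix)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(format(2_000_000L)); // 1.9 MiB
        System.out.println(format(1536L));      // 1.5 KiB
    }
}
```

Because every intermediate value is a long, the result is exactly reproducible and free of binary floating-point artifacts, at the cost of having to decide explicitly how to round.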

Alternatives

  • org.springframework.util.unit.DataSize. From what I can tell it can only parse, not format.

  • Apache Commons IO - FileUtils class. Rounds too heavily, so everything becomes GBs. It is also not clear to me whether it uses binary or decimal calculation.