/Protobuf-PHP

PHP implementation of Google's Protocol Buffers with a protoc plugin compiler

Primary LanguagePHPMIT LicenseMIT

Protobuf for PHP

Protobuf for PHP is an implementation of Google's Protocol Buffers for the PHP language, supporting its binary data serialization and including a protoc plugin to generate PHP classes from .proto files.

Great effort has been put into generating PHP files that include all sort of type hints to aide IDE's with autocompletion. Therefore, it can not only be used to communicate with Protocol Buffers services but also as a generation tool for data objects no matter what the final serialization is.

For more information see the included man pages.

Requirements

  • PHP 5.3
  • Pear's Console_CommandLine (for the protoc wrapper tool)
  • Google's protoc compiler version 2.3 or above
  • GMP or BC Math extensions ¹

¹ Only needed for negative values in int32, int64 or fixed64 types. See the known issues section.

Features

Working

  • Standard types (numbers, string, enums, messages, etc)
  • Extensions, Unknown and Packed fields
  • Generate service interfaces
  • Protoc compiler plugin to generate the PHP classes
  • Template based code generation. Go crazy and customize the generated code :)
  • Include comments from .proto files in the generated files
  • Pluggable serialization backends (codecs)
    • Standard Binary
    • Standard TextFormat ¹
    • PhpArray
    • JSON
    • ProtoJson (TagMap and Indexed variants)
    • XML
  • Reflection capabilities
  • Dynamic messages with annotations support (no code generation step required)
  • Lazy decoding of messages to improve the performance in real world scenarios

¹ Only serialization is supported in this codec

Future

  • Speed optimized code generation mode

Example usage

$person = new Tutorial\Person();
$person->name = 'DrSlump';
$person->setId(12);

$book = new Tutorial\AddressBook();
$book->addPerson($person);

// Use default codec
$data = $book->serialize();

// Use custom codec
$codec = new \DrSlump\Protobuf\Codec\Json();
$data = $codec->encode($book);
// ... or ...
$data = $book->serialize($codec);

Installation

composer install

Known issues

Types

PHP is very weak when dealing with numbers processing. Several work arounds have been applied to the standard binary codec to reduce incompatibilities between Protobuf types and PHP ones.

  • Protobuf stores floating point values using the IEEE 754 standard with 64bit words for the double and 32bit for the float types. PHP supports IEEE 754 natively although the precission is platform dependant, however it typically supports 64bit doubles. It means that if your PHP was compiled with 64bit sized doubles (or greater) you shouldn't have any problem encoding and decoded float and double typed values with Protobuf.

  • Integer values are also platform dependant in PHP. The library has been developed and tested against PHP binaries compiled with 64bit integers. The encoding and decoding algorithm should in theory work no matter if PHP uses 32bit or 64bit integers internally, just take into account that with 32bit integers the numbers cannot exceed in any case the PHP_INT_MAX value (2147483647).

    While Protobuf supports unsigned integers PHP does not. In fact, numbers above the compiled PHP maximum integer (PHP_INT_MAX, 0x7FFFFFFFFFFFFFFF for 64bits) will be automatically casted to doubles, which typically will offer 53bits of decimal precission, allowing to safely work with numbers upto 0x20000000000000 (2^53), even if they are represented in PHP as floats instead of integers. Higher numbers will loose precission or might even return an infinity value, note that the library does not include any checking for these numbers and using them might provoke unexpected behaviour.

    Negative values when encoded as int32, int64 or fixed64 types require the big integer extensions GMP or BC Math (the later only for 64bit architectures) to be available in your PHP environment. The reason is that when encoding these negative numbers without using zigzag the binary representation uses the most significant bit for the sign, thus the numbers become above the maximum supported values in PHP. The library will check for these conditions and will automatically try to use GMP or BC to process the value.

Strings

The binary codec expects strings to be encoded using UTF-8. PHP does not natively support string encodings, PHP's string data type is basically a length delimited stream of bytes, so it's not trivial to include automatic encoding conversion into the library encoding and decoding routines. Instead of trying to guess or offer a configuration interface for the encoding, the binary codec will process the string type just as it would process byte one, delegating on your application the task of encoding or decoding in the desired character set.

Memory usage

Large messages might be troublesome since the way the library is modelled does not allow to parse or serialize messages as streams, instead the whole operation is performed in memory, which allows for faster processing but could consume too much RAM if messages are too large.

Unknown fields

Since wire types are different across different codec's formats, it's not possible to transcode unkwnon fields consumed in one codec to another. This means, for example, that when consuming a message using the binary codec, if it contains unknown fields they won't be included when serializing the message using the Json codec.

Generating PHP classes

The generation tool is designed to be run as a protoc plugin, thus it should work with any proto file supported by the official compiler.

protoc --plugin=protoc-gen-php --php_out=./build tutorial.proto

To make use of non-standard options in your proto files (like php.namespace) you'll have to import the php.proto file included with the library. It's location will depend on where you've installed this library.

protoc -I=./Protobuf-PHP/library/DrSlump/Protobuf/Compiler/protos \
       --plugin=protoc-gen-php --php_out=./build tutorial.proto

In order to make your life easier, the supplied protoc plugin offers an additional execution mode, where it acts as a wrapper for the protoc invocation. It will automatically include the php.proto path so you don't need to worry about it.

protoc-gen-php -o ./build tutorial.proto

Testing

mkdir -p test/generated ./protoc-gen-php.php -o test/generated -Dmultifile=true -i ./test/library/DrSlump/Protobuf/Test/protos/ ./test/library/DrSlump/Protobuf/Test/protos/*.proto ./vendor/bin/phpunit