Cannot instantiate enum
miklcct opened this issue · 6 comments
Bug Report
Cannot instantiate enum while loading documents from MongoDB database.
Environment
PHP 8.1.5 + ext/mongodb 1.13.0 + mongodb-org 5.0.7
Test Script
#!/usr/bin/php
<?php
declare(strict_types=1);
use MongoDB\BSON\Persistable;
use MongoDB\Client;
require_once __DIR__ . '/vendor/autoload.php';
enum Test : string implements Persistable {
    case FOO = 'F';
    case BAR = 'B';

    public function bsonSerialize() {
        return (array)$this;
    }

    public function bsonUnserialize(array $data) {
    }
}
$client = new Client(driverOptions: ['typeMap' => ['array' => 'array']]);
$database = $client->selectDatabase('test');
$collection = $database->selectCollection('test');
$collection->insertMany([Test::FOO, Test::BAR]);
var_dump($collection->find()->toArray());
Expected and Actual Behavior
I expect to be able to get the enums back, but there is no way for me to do that. Instead, the script fails with:
PHP Warning: Uncaught Error: Cannot instantiate enum Test in /home/michael/projects/national_rail_journey_planner/test.php:26
Stack trace:
#0 [internal function]: MongoDB\Driver\Cursor->rewind()
#1 /home/michael/projects/national_rail_journey_planner/test.php(26): MongoDB\Driver\Cursor->toArray()
#2 {main}
thrown in /home/michael/projects/national_rail_journey_planner/test.php on line 26
PHP Stack trace:
PHP 1. {main}() /home/michael/projects/national_rail_journey_planner/test.php:0
PHP 2. MongoDB\Driver\Cursor->toArray() /home/michael/projects/national_rail_journey_planner/test.php:26
PHP 3. MongoDB\Driver\Cursor->rewind() /home/michael/projects/national_rail_journey_planner/test.php:26
PHP Fatal error: Couldn't find implementation for function bsonUnserialize in Unknown on line 0
PHP Stack trace:
PHP 1. {main}() /home/michael/projects/national_rail_journey_planner/test.php:0
PHP 2. MongoDB\Driver\Cursor->toArray() /home/michael/projects/national_rail_journey_planner/test.php:26
PHP 3. MongoDB\Driver\Cursor->rewind() /home/michael/projects/national_rail_journey_planner/test.php:26
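The "Cannot instantiate enum" message is consistent with a restriction in PHP itself: enum cases are singletons and cannot be created like ordinary objects, which is presumably what a decoder attempts when it builds an object from a stored class name. A minimal sketch that reproduces the restriction without MongoDB (the `Example` enum is hypothetical, not part of the report):

```php
<?php
declare(strict_types=1);

// Hypothetical enum used only for this illustration.
enum Example: string
{
    case FOO = 'F';
}

$caught = null;
try {
    // Enums cannot be instantiated through reflection either; PHP throws an Error.
    (new ReflectionClass(Example::class))->newInstanceWithoutConstructor();
} catch (Error $e) {
    $caught = $e;
}

var_dump($caught instanceof Error);

// The supported way to obtain a case of a backed enum from its stored value.
$case = Example::from('F');
var_dump($case);
```

Any decoder that wants to return an enum therefore has to look up an existing case rather than create a new object.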
Opened PHPC-2083 to track this. Feel free to follow that issue for updates, but I'll also follow up here if I have additional questions or determine a solution.
@miklcct: Just to follow up here, I'm fairly confident that the approach in #1317 works for both pure and backed enums. This PR will not make it into the 1.14.0 beta release I'm about to tag later today, but I am hoping to get it merged and included in the final 1.14.0 release.
The first approach, which invoked BackedEnum::from(), originally did not work due to a PHP bug (since fixed in php/php-src@01d8454). While that would handle your original use case, it seemed less flexible than the approach I derived from PHP's own serialization functions, so I didn't see any benefit to limiting instantiation to BackedEnums if we can easily support both types.
@miklcct: I'm going to close this out as #1317 has been merged (and PHPC-2083 resolved). This will be included in the upcoming 1.15.0 release, but you're welcome to build the driver from source if you would like to try this out sooner.
Pure and backed enums can now be decoded from BSON by implementing either Unserializable or Persistable, and there is a PersistableEnum trait in the extension to save you the trouble of providing method implementations. The trait can be used with either interface, so it doesn't require use of Persistable if you're comfortable using a custom type map. You can take a look at the tests in that PR if you're interested in specific code examples.
@miklcct: Although PHPC-2083 has been merged, I have some reservations about releasing it, both due to the internal special handling in the BSON decoder and the permanent addition of a PersistableEnum trait into the extension's public API.
Going back to your original script, I overlooked that you were inserting an enum as a root document. I don't think this makes sense in practice, as decoding the BSON back into a PHP enum will leave us with no way to populate the _id property that uniquely identifies the document in the database.
The suggested approach would be to store the enum as a class property, and ensure it is serialized and deserialized accordingly in that class' BSON methods. There are a few benefits to this approach:
- The class has full control over how to reinstantiate the enum vs. any fixed logic we might implement in the BSON decoder. For instance, a BackedEnum can utilize either from() or tryFrom() if parsing a value, or use the case name like a pure enum.
- The enum class doesn't need to implement a BSON interface (e.g. Persistable), which decouples it from the MongoDB driver.
- The driver can produce a more concise BSON representation (i.e. just its case name or backed value). This seems preferable given that the purpose of an enum is to represent a single state in a finite set. Additionally, decoupling the BSON representation from PHP (where enums as an object are an implementation detail) will make it more portable with other drivers/tools.
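To make the first point concrete, here is a minimal sketch of the ways a document class could turn a stored value back into an enum case. The `Suit` and `Status` enums and the `enumFromName()` helper are hypothetical names chosen for illustration; none of them come from the driver.

```php
<?php
declare(strict_types=1);

// Hypothetical backed enum: stored by its value.
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

// Hypothetical pure enum: can only be stored by its case name.
enum Status
{
    case Active;
    case Archived;
}

/**
 * Hypothetical helper: look up an enum case by name.
 * Returns null when the enum has no case with that name.
 */
function enumFromName(string $enumClass, string $name): ?UnitEnum
{
    foreach ($enumClass::cases() as $case) {
        if ($case->name === $name) {
            return $case;
        }
    }

    return null;
}

$strict  = Suit::from('H');     // throws ValueError for an unknown value
$lenient = Suit::tryFrom('X');  // returns null for an unknown value
$byName  = enumFromName(Status::class, 'Archived');
$missing = enumFromName(Status::class, 'Deleted');

var_dump($strict, $lenient, $byName, $missing);
```

Iterating over cases() is one option for name lookups; it only ever matches enum cases, whereas a constant() lookup would also match other constants declared on the enum.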
Persistable classes were originally introduced as a convenience to allow users to forgo specifying a type map, but the overhead they entail with encoding the full class name in a __pclass property is undesirable (although unavoidable). Still, I see no reason to encourage that as general practice for enums, especially since they should only appear within document objects and those document classes will already have bsonSerialize() and bsonUnserialize() methods defined (whether or not they implement Persistable).
Consider the following:
<?php
declare(strict_types=1);
require 'vendor/autoload.php';
use MongoDB\BSON\ObjectId;
use MongoDB\BSON\Persistable;
use MongoDB\Client;
use function MongoDB\BSON\fromPHP;
use function MongoDB\BSON\toPHP;
use function MongoDB\BSON\toCanonicalExtendedJSON;
enum MyPureEnum {
    case A;
    case B;
    case C;
}

enum MyBackedEnum : string {
    case A = 'ant';
    case B = 'bee';
    case C = 'cat';
}

class MyDocument implements Persistable
{
    public ObjectId $id;
    public MyPureEnum $pureEnum;
    public MyBackedEnum $backedEnumByName;
    public MyBackedEnum $backedEnumByValue;

    public function bsonSerialize(): array
    {
        $data = [];

        // Typed properties may be uninitialized, so check each one before reading it
        if (isset($this->id)) {
            $data['_id'] = $this->id;
        }

        if (isset($this->pureEnum)) {
            $data['pureEnum'] = $this->pureEnum->name;
        }

        if (isset($this->backedEnumByName)) {
            $data['backedEnumByName'] = $this->backedEnumByName->name;
        }

        if (isset($this->backedEnumByValue)) {
            $data['backedEnumByValue'] = $this->backedEnumByValue->value;
        }

        return $data;
    }

    public function bsonUnserialize(array $data): void
    {
        if (isset($data['_id'])) {
            $this->id = $data['_id'];
        }

        if (isset($data['pureEnum'])) {
            // See: https://www.php.net/manual/en/language.enumerations.basics.php#127112
            $this->pureEnum = constant("MyPureEnum::{$data['pureEnum']}");
        }

        if (isset($data['backedEnumByName'])) {
            $this->backedEnumByName = constant("MyBackedEnum::{$data['backedEnumByName']}");
        }

        if (isset($data['backedEnumByValue'])) {
            $this->backedEnumByValue = MyBackedEnum::from($data['backedEnumByValue']);
        }
    }
}
$doc = new MyDocument;
$doc->pureEnum = MyPureEnum::A;
$doc->backedEnumByName = MyBackedEnum::B;
$doc->backedEnumByValue = MyBackedEnum::C;
echo "Original PHP document:\n\n";
// Original PHP object
var_dump($doc);
echo "\nRound-tripping PHP document through BSON:\n\n";
$bson = fromPHP($doc);
// BSON (as Extended JSON)
echo toCanonicalExtendedJSON($bson), "\n\n";
// Round-tripped PHP object
var_dump(toPHP($bson));
echo "\nRound-tripping PHP document through database:\n\n";
$client = new MongoDB\Client('mongodb://localhost:27060/?replicaSet=rs0');
$collection = $client->selectCollection('test', 'enum');
$collection->drop();
$collection->insertOne($doc);
var_dump($collection->findOne());
Executing this script produces the following output:
Original PHP document:
object(MyDocument)#3 (3) {
  ["id"]=>
  uninitialized(MongoDB\BSON\ObjectId)
  ["pureEnum"]=>
  enum(MyPureEnum::A)
  ["backedEnumByName"]=>
  enum(MyBackedEnum::B)
  ["backedEnumByValue"]=>
  enum(MyBackedEnum::C)
}
Round-tripping PHP document through BSON:
{ "__pclass" : { "$binary" : { "base64" : "TXlEb2N1bWVudA==", "subType" : "80" } }, "pureEnum" : "A", "backedEnumByName" : "B", "backedEnumByValue" : "cat" }
object(MyDocument)#7 (3) {
  ["id"]=>
  uninitialized(MongoDB\BSON\ObjectId)
  ["pureEnum"]=>
  enum(MyPureEnum::A)
  ["backedEnumByName"]=>
  enum(MyBackedEnum::B)
  ["backedEnumByValue"]=>
  enum(MyBackedEnum::C)
}
Round-tripping PHP document through database:
object(MyDocument)#19 (4) {
  ["id"]=>
  object(MongoDB\BSON\ObjectId)#22 (1) {
    ["oid"]=>
    string(24) "634cfd797e7a3f00c9014ea2"
  }
  ["pureEnum"]=>
  enum(MyPureEnum::A)
  ["backedEnumByName"]=>
  enum(MyBackedEnum::B)
  ["backedEnumByValue"]=>
  enum(MyBackedEnum::C)
}
In conclusion, I think the preferred course of action here would be to revert the unreleased commits and instead add a tutorial page to the library documentation that demonstrates how users should handle enums within their document model.
Talked this over with @alcaeus a bit more and we're going to explore supporting enums directly in the type map without requiring implementation of MongoDB\BSON\Unserializable. This would also potentially allow a BackedEnum to be instantiated directly from an integer or string value in BSON, and perhaps pure enums from a string value. We could still support instantiating both from a BSON document (only requiring the name property).
We were in agreement about removing the PersistableEnum trait from the extension API. Instead, we'll add a tutorial to the library documentation that demonstrates how users can integrate enums into their document model. This tutorial can include example code for Serializable and Persistable enum traits, but keeping those within documentation will make it more likely users can be educated about potential gotchas and overhead (vs. using a trait in the public API on a whim).
I'll leave this ticket open for now. You can expect a subsequent PR soon to revise what we currently have in 1.15-dev.
#1378 has since been merged. Some notable changes in that PR:
- Backed enums serialize as their case value
- The PersistableEnum trait was removed
- Enums are prohibited from implementing Unserializable and Persistable interfaces
I've created PHPLIB-1040 to add some documentation on how enums can be handled in an application (e.g. instantiating them via a document class' bsonUnserialize() method). We opted not to allow enums to be specified in type maps (although it was considered), since there are plans to support callables in type maps. That will allow more control over unserialization and we wanted to avoid introducing a stop-gap solution that'd further complicate the BSON API.
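As a rough illustration of handling an enum inside a document class' bsonUnserialize() method, the sketch below stores a backed enum by its value and restores it with tryFrom(), falling back to a default for unknown values. The `Priority` enum and `Ticket` class are hypothetical. In a real application the class would implement MongoDB\BSON\Persistable or MongoDB\BSON\Unserializable; the interface is left out here so the sketch runs without the extension, and the two methods are called by hand where the driver would normally call them.

```php
<?php
declare(strict_types=1);

// Hypothetical backed enum stored by its integer value.
enum Priority: int
{
    case Low = 1;
    case High = 2;
}

// Hypothetical document class; the BSON interface is omitted on purpose
// so that this sketch does not depend on the mongodb extension.
final class Ticket
{
    public string $title = '';
    public Priority $priority = Priority::Low;

    public function bsonSerialize(): array
    {
        // Store only the backed value, not a PHP-specific object representation.
        return ['title' => $this->title, 'priority' => $this->priority->value];
    }

    public function bsonUnserialize(array $data): void
    {
        $this->title = (string) ($data['title'] ?? '');
        // Assumption: unknown or missing values fall back to Low instead of throwing.
        $this->priority = Priority::tryFrom((int) ($data['priority'] ?? 0)) ?? Priority::Low;
    }
}

$ticket = new Ticket();
$ticket->title = 'Fix enum decoding';
$ticket->priority = Priority::High;

// Stand-in for the array the driver would encode to BSON.
$stored = $ticket->bsonSerialize();

// Stand-in for the call the driver would make when decoding.
$restored = new Ticket();
$restored->bsonUnserialize($stored);

var_dump($restored->priority);
```

Whether to fall back to a default or throw on an unknown value is an application decision; from() could be used instead of tryFrom() where invalid data should fail loudly.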