Alembic files are malformed, h5ls and abcecho show exception stack traces for H5Oget_info_by_name
Closed this issue · 5 comments
GoogleCodeExporter commented
see discussion thread:
http://groups.google.com/group/alembic-discussion/browse_thread/thread/a741b413749591e2/3618e96b726e55e4?show_docid=3618e96b726e55e4
HDF5-DIAG: Error detected in HDF5 (1.8.7) thread 139740866865248:
#000: H5O.c line 657 in H5Oget_info_by_name(): object not found
major: Symbol table
minor: Object not found
#001: H5Gloc.c line 747 in H5G_loc_info(): can't find object
major: Symbol table
minor: Object not found
#002: H5Gtraverse.c line 905 in H5G_traverse(): internal path traversal failed
major: Symbol table
minor: Object not found
#003: H5Gtraverse.c line 688 in H5G_traverse_real(): traversal operator failed
major: Symbol table
minor: Callback failed
#004: H5Gloc.c line 702 in H5G_loc_info_cb(): can't get object info
major: Symbol table
minor: Can't get value
#005: H5O.c line 2865 in H5O_get_info(): can't retrieve object's btree & heap info
major: Object header
minor: Can't get value
#006: H5Goh.c line 358 in H5O_group_bh_info(): can't read LINFO message
major: Symbol table
minor: Can't get value
#007: H5Omessage.c line 545 in H5O_msg_read_oh(): unable to decode message
major: Object header
minor: Unable to decode value
#008: H5Olinfo.c line 129 in H5O_linfo_decode(): bad version number for message
Original issue reported on code.google.com by ble...@gmail.com
on 18 Nov 2011 at 2:48
GoogleCodeExporter commented
valgrind of Kevin's scene didn't turn up any complaints about Alembic.
Original comment by miller.lucas
on 18 Nov 2011 at 2:57
GoogleCodeExporter commented
I've been hunting this bug where a bad cache is generated. I'm now 70%
convinced it's an HDF5 bug. On the read side we have an HDF5 group (our
compound property), and we are asking for the HDF5 group that stores the
after-the-first sample values of the property childBnds. In my test case this
is "LarmPvMidOriNUL/.prop/.xform/.childBnds.smpi". The HDF5 code fails to
resolve the hard link to its destination, so it can't open the smpi group.
I've been debugging this on the write side, trying to see how an HDF5 group's
hard links are created. At a certain point HDF5 has code that says "ok, you
have too many hard links in the parent group, I'll change the parent group to
use 'dense link storage'". If you're curious you can see this in
H5G_obj_insert():
/* If there's still a small enough number of links, use the 'link' message */
/* (If the encoded form of the link is too large to fit into an object
* header message, convert to using dense link storage instead of link messages)
*/
I don't think this actual routine has a bug... my hunch is that some other bit
of HDF5 has similar logic for adding other things into the parent group (maybe
HDF5 attributes or datatypes, I'm not certain), and this other location has a
bad interaction.
What I have done, though, is change my HDF5 build so that the switch over to
"dense link storage" happens sooner (there's a simple
link.nlinks < ginfo.max_compact check that I tweaked by adding 4). I realize
that is a total hack, but in effect I'm causing HDF5 to convert to dense link
storage sooner, and with this change my test case creates a valid cache. I
haven't changed anything in Alembic at all, so we aren't keeping any HDF5
objects open differently.
This leads me to strongly suspect that some bit of logic that relates to group
conversion operations inside of hdf5 has a bug.
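To make the threshold tweak described above concrete, here is a toy model of the compact-to-dense switch. It is not HDF5 code: the toy_group type and both function names are hypothetical, the value 8 for max_compact is only an illustrative choice, and it assumes the "adding 4" was applied to the link-count side of the comparison, which the comment does not spell out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a group's link storage state; not an HDF5 type. */
typedef struct {
    unsigned nlinks;      /* links currently stored in the group */
    unsigned max_compact; /* upper bound for staying in compact storage */
    bool     dense;       /* true once the group has been converted */
} toy_group;

/* Insert one link. The group stays compact only while
 * (nlinks + early) < max_compact; `early` models the hack (0 = unpatched). */
static void toy_insert_link(toy_group *g, unsigned early)
{
    if (!g->dense && !(g->nlinks + early < g->max_compact))
        g->dense = true; /* one-way conversion to dense link storage */
    g->nlinks++;
}

/* Count how many of `total` insertions happen while still compact. */
static unsigned toy_compact_inserts(unsigned max_compact, unsigned early,
                                    unsigned total)
{
    toy_group g = { 0, max_compact, false };
    unsigned compact = 0;
    for (unsigned i = 0; i < total; i++) {
        toy_insert_link(&g, early);
        if (!g.dense)
            compact++;
    }
    return compact;
}
```

With max_compact set to 8, toy_compact_inserts(8, 0, 20) counts 8 compact insertions while toy_compact_inserts(8, 4, 20) counts 4, so the patched check converts the group four links earlier. The model only shows when the conversion happens; it says nothing about why the later conversion produces a bad file.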
Original comment by cookingw...@gmail.com
on 21 Dec 2011 at 9:24
GoogleCodeExporter commented
That is very compelling. Luckily we don't need to hack HDF5; we could use:
http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetAttrPhaseChange
And if this isn't just a problem for attrs we might need to investigate:
http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLinkPhaseChange
http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetSharedMesgPhaseChange
Original comment by miller.lucas
on 22 Dec 2011 at 1:42
GoogleCodeExporter commented
This should be at least partially fixed in 1.0.4 with the use of
H5Pset_link_phase_change.
Original comment by miller.lucas
on 24 Jan 2012 at 1:36
- Changed state: PleaseVerify
GoogleCodeExporter commented
Original comment by miller.lucas
on 24 Jan 2012 at 1:41
- Changed state: Verified