Queries with Clojure Records
kul opened this issue · 6 comments
There seems to be a problem with queries when Clojure records are present in tuples:
user=> (use 'cascalog.api)
nil
user=> (defrecord MyRec [a b])
user.MyRec
user=> (??<- [?r] ([(MyRec. 1 2)] ?r))
UnsupportedOperationException user.MyRec (form-init9201996833299850058.clj:1)
user=> (??<- [?r] ([[(MyRec. 1 2)]] ?r))
UnsupportedOperationException user.MyRec (form-init4990014382799884351.clj:1)
Hi kul,
could you add a full stacktrace? You can get this by calling (pst) in your REPL directly after trying the query that fails. Thanks.
user=> (pst)
UnsupportedOperationException
user.MyRec (form-init4990014382799884351.clj:1)
com.esotericsoftware.kryo.serializers.MapSerializer.read (MapSerializer.java:137)
com.esotericsoftware.kryo.serializers.MapSerializer.read (MapSerializer.java:17)
com.esotericsoftware.kryo.Kryo.readObject (Kryo.java:612)
cascading.kryo.KryoDeserializer.deserialize (KryoDeserializer.java:37)
cascading.tuple.hadoop.TupleSerialization$SerializationElementReader.read (TupleSerialization.java:628)
cascading.tuple.hadoop.io.HadoopTupleInputStream.readType (HadoopTupleInputStream.java:105)
cascading.tuple.hadoop.io.HadoopTupleInputStream.getNextElement (HadoopTupleInputStream.java:52)
cascading.tuple.io.TupleInputStream.readTuple (TupleInputStream.java:78)
cascading.tuple.io.TupleInputStream.readTuple (TupleInputStream.java:67)
cascading.tuple.hadoop.io.TupleDeserializer.deserialize (TupleDeserializer.java:38)
cascading.tuple.hadoop.io.TupleDeserializer.deserialize (TupleDeserializer.java:28)
Great! I didn't know about pst.
Thanks.
I recall this now. Carbonite, a library which allows Clojure types to be serialized with Kryo, has a bug where records cannot be serialized. So this is actually a bug in Carbonite, not Cascalog itself.
I will look into fixing the carbonite bug.
That's great news (in the sense that Cascalog doesn't need to be patched)!
Thanks
@kul Kryo cannot serialize Clojure records in a generic manner, since records are concrete types on the JVM.
So, your options are:
- write Kryo serializers for your record types and register them with Hadoop/Cascalog
- preprocess your records into plain maps (e.g. with `(into {} my-rec)`), pass maps around inside Cascalog, and rebuild records at the edges with the generated `map->MyRec` constructor
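A minimal sketch of the map-based workaround, assuming the `MyRec` record from the repro above (records implement Clojure's map interfaces, so `into {}` flattens them to plain maps, which Carbonite/Kryo can serialize):

```clojure
(use 'cascalog.api)

(defrecord MyRec [a b])

;; Flatten records to plain maps before they enter a query.
;; (into {} (MyRec. 1 2)) => {:a 1, :b 2}
(def rows [(into {} (MyRec. 1 2))
           (into {} (MyRec. 3 4))])

;; Query over plain maps instead of records; this avoids the
;; UnsupportedOperationException seen above.
(??<- [?r] (rows ?r))

;; At the edges of the workflow, rebuild records from maps with
;; the constructor defrecord generates automatically:
(map->MyRec {:a 1, :b 2})
```

The tradeoff is that you lose the record's type and protocol implementations inside the query, so this only suits cases where the tuples are treated as plain data until they leave Cascalog.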
Closing this one.