ianmcook/implyr

Try connecting to Impala with thriftr package

There's a fairly new R package thriftr (GitHub, CRAN) from @marekjagielski that implements Apache Thrift in R. See if it would be possible (and if so, what additional work would be required) to use this to connect to Impala (instead of ODBC or JDBC).

I took a first pass at this. It failed because some parts of the Thrift protocol are not yet implemented in the thriftr package. It looks like thriftr is under active development, so I'll keep trying. However, thriftr (being a pure-R implementation of Thrift) is probably going to be very slow at some operations like loading the Thrift IDL file.

Hi,
I am happy that you are trying thriftr. So far, I have implemented only the binary protocol. If you find any problems with the package, please feel free to create an issue.
Regarding performance, it is true that the initial phase can take more time. I would say that loading a small IDL file shouldn't take more than 1 s.
Serialization and deserialization are also slower than in its counterpart, thriftpy2:
thriftr: https://github.com/systemincloud/thriftr/blob/master/tests/testthat/test.test_protocol_binary.R#L159
thriftpy2: https://github.com/Thriftpy/thriftpy2/blob/master/tests/test_protocol_binary.py#L161
I should definitely rewrite the low-level code responsible for that in C.
Any contributions welcome!
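
For readers unfamiliar with what the binary protocol's serialize/deserialize step involves, here is a minimal Python sketch of encoding and decoding a struct with one i32 field and one string field. It is not code from thriftr or thriftpy2; the function names `encode_struct` and `decode_struct` are hypothetical, and only two field types are handled. The type IDs and byte layout follow the Thrift binary protocol as I understand it.

```python
import struct

# Type IDs defined by the Thrift binary protocol.
T_STOP, T_I32, T_STRING = 0, 8, 11


def encode_struct(fields):
    """Encode (field_id, type_id, value) tuples as a Thrift binary struct."""
    out = bytearray()
    for field_id, type_id, value in fields:
        # Field header: 1-byte type, then a big-endian 16-bit field id.
        out += struct.pack("!bh", type_id, field_id)
        if type_id == T_I32:
            out += struct.pack("!i", value)
        elif type_id == T_STRING:
            data = value.encode("utf-8")
            # Strings are prefixed with a big-endian 32-bit byte length.
            out += struct.pack("!i", len(data)) + data
        else:
            raise ValueError(f"unsupported type id: {type_id}")
    out.append(T_STOP)  # A single stop byte terminates the struct.
    return bytes(out)


def decode_struct(buf):
    """Decode bytes produced by encode_struct back into field tuples."""
    fields, pos = [], 0
    while True:
        type_id = buf[pos]
        pos += 1
        if type_id == T_STOP:
            return fields
        (field_id,) = struct.unpack_from("!h", buf, pos)
        pos += 2
        if type_id == T_I32:
            (value,) = struct.unpack_from("!i", buf, pos)
            pos += 4
        elif type_id == T_STRING:
            (length,) = struct.unpack_from("!i", buf, pos)
            pos += 4
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unsupported type id: {type_id}")
        fields.append((field_id, type_id, value))


payload = encode_struct([(1, T_I32, 42), (2, T_STRING, "ping")])
roundtrip = decode_struct(payload)
```

Every value goes through a byte-level pack or unpack call like these, which is the kind of inner loop that tends to be slow in pure R and is a natural candidate for a C rewrite.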

Thank you Marek for your initiative and hard work creating thriftr! And thank you for your comment here. I will gather some specific information about the issues I am observing when attempting to use it with Impala or Hive, and I will create GitHub issues for these in the thriftr repo. Please forgive my lack of deep knowledge about Thrift; I might ask some stupid questions :)

Would you mind replying to two questions here about the context of the thriftr project?

  1. What is the primary use case you had in mind when you began creating thriftr?
  2. Do you intend for thriftr to provide full (or nearly full) support for all the capabilities of Thrift, or is there a subset of capabilities that you're targeting?

Your replies will help me understand how I might help and what the scope of the project is.

Thank you!

  1. I successfully used thriftpy in my project to execute Python code from an Eclipse plugin, and I was looking for something similar to execute R code. Finding nothing that was easy and portable, I decided to write a clone of thriftpy and, along the way, of ply. I haven't incorporated thriftr into my project yet.
  2. I have no ambition to provide full Thrift support. Once it fits my needs, I will not actively develop it further. Still, I think the present thriftr should be able to parse all of the base Thrift syntax.
    Moreover, I will not stop maintaining the library, and I will guide anyone who would like to contribute.

Great, thank you for your replies!