pacman82/odbc2parquet

Build release assets for linux as well

leo-schick opened this issue · 7 comments

Release assets are only created for osx and windows. I would like to see linux here as well.

Currently I install it on linux via cargo install odbc2parquet, but that is heavyweight when installing odbc2parquet inside a docker image.

Creating an artifact which works on any linux distribution is a bit tricky. One usual way to achieve this with Rust is to link everything statically and build against a Musl target, which also gets rid of the dynamic dependency on libc, whose version differs across linux distributions. However, odbc2parquet is required to dynamically link against libodbc.so, which can not be done with a musl target.

An executable built on a linux system with an old libc would likely do well on a lot of platforms, but I am not sure about the caveats this approach might have.

IMHO it is probably best you build the executable for your distribution (e.g. Ubuntu) and copy it into your container. You could even do so in an earlier build stage of your container. E.g.

# Dockerfile for the server app

# 1: Build the executable
FROM rust:1.62 as builder
RUN cargo install odbc2parquet

# 2: Copy the executable to an empty Docker image
FROM scratch
COPY --from=builder /usr/local/cargo/bin/odbc2parquet .

Take the sample with a grain of salt, I did not test it. The approach however should limit the size of the resulting docker image, since it does not contain the entire rust toolchain anymore.
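A variation on the sample above, again as an untested sketch. Since odbc2parquet links dynamically against libc and libodbc.so, an empty scratch image would contain neither, so this version copies the executable into a slim Debian image instead. The image tags (rust:1.62-bullseye, debian:bullseye-slim) and the unixodbc runtime package are assumptions and not taken from this thread. Both stages use the same Debian release so that the libc versions match.

```dockerfile
# Stage 1: build odbc2parquet from source.
# Assumption: the Debian-based Rust image; libodbc.so is needed at link time
# and ships with unixodbc-dev.
FROM rust:1.62-bullseye AS builder
RUN apt-get update \
    && apt-get install -y --no-install-recommends unixodbc-dev \
    && rm -rf /var/lib/apt/lists/*
RUN cargo install odbc2parquet

# Stage 2: runtime image without the Rust toolchain.
# Assumption: the same Debian release as the builder, so the dynamically
# linked libc is compatible; the unixodbc package provides the ODBC
# driver manager at runtime.
FROM debian:bullseye-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends unixodbc \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /usr/local/cargo/bin/odbc2parquet /usr/local/bin/odbc2parquet
ENTRYPOINT ["odbc2parquet"]
```

The ODBC driver for the actual data source is not part of this sketch and would still have to be installed in the second stage.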

I am also open to ideas and pull requests.

I am not so deep into the C++ build system, so I don't know what a musl target is. But isn't there a possibility to just build the app without the libodbc.so and just define the package unixodbc-dev as a dependency for odbc2parquet on linux?

The libodbc.so is part of the unixodbc-dev package, see:

I can't say much about the other distributions, but I guess when you support Ubuntu and Debian, you should already cover a high percentage of linux users. E.g. the rust docker image mentioned above is based on Debian, see here.

I am not so deep into the C++ build system, so I don't know what a musl target is. But isn't there a possibility to just build the app without the libodbc.so and just define the package unixodbc-dev as a dependency for odbc2parquet on linux?

Yes there is. This is what dynamic linking is about.

Sadly this kind of dynamic linking is incompatible with a Musl target, which I would require to statically link libc. If libc is not statically linked, it follows that it is linked dynamically (or not at all, but this is not the case for odbc2parquet). Linking against libc dynamically implies depending on at least the libc version of the system you build on.

I do not know if there are other caveats to using e.g. a binary built on Ubuntu on Arch Linux. I have a pretty vivid imagination and could think of some.

Since I currently maintain these pipelines and artifacts alone, I would at the moment rather choose not to do so. At the moment, I see most linux artifacts deployed by source, and this is probably what I would stick with.

However, since GitHub offers Ubuntu runners out of the box, I would add binaries for the latest Ubuntu version to the release process. Everything else I would consider out of scope, and I feel should better be solved on the consuming side.

If you want to set up a site with prebuilt executables or contribute to maintaining these pipelines, that is a different conversation. Yet on my own, the latest Ubuntu version is all I would offer.

PS: Building from source on your local platform using target-cpu=native might yield some performance benefits. Usually odbc2parquet is I/O-bound, though.

Sounds good! I think Ubuntu should be enough. I guess Debian will work then as well, since the two are quite similar. Latest Ubuntu version should be enough.

I don’t think it is a good idea for me to provide a PR on this, because I have never done any development in Rust or C++. Just a bit of C in my early days. I guess to check for a similar project in Rust on GitHub which creates a Linux artifact would be a great place to get inspiration from… that’s the place I would start when creating a PR for this

Hello @leo-schick ,

odbc2parquet 0.9.3 has been released, including a binary release for Ubuntu. I feel, however, that I've done you a little disservice by providing it to you.

I guess to check for a similar project in Rust on GitHub which creates a Linux artifact would be a great place to get inspiration from… that’s the place I would start when creating a PR for this

That line of reasoning would most likely lead me to release only by source (either packaged or not), but not to binary releases. For your Mara DB I would argue that you use odbc2parquet like a library. If you depend on a rust "library" you should think about how to include that in the build system, rather than rely on binary releases. I feel the one Ubuntu release is likely to make only a small fraction of your potential users happy.

Just food for thought.

Thanks for your feedback. Enjoy the release and cheers,

Markus

Ok, I’ll check it after my vacation. I guess when I can make it work with conda-forge it would fit the bill.

Yes, I think you are on the right track with conda-forge. This way you're likely to even get it running on an ARM architecture like the increasingly popular Apple M1, etc.