
Rust wrapper for the RDKit using CFFI



This is an 🚧 experimental 🚧 Rust wrapper for some of the functionality of the great RDKit cheminformatics library.

It uses RDKit's new C Foreign Function Interface (CFFI); see also this blog post.

Use it at your own risk; it is not yet recommended for production use :-)

Please note that RDKit exposes only limited functionality via CFFI, and not all of it is available through this wrapper yet.
Have a look at the examples below and at the test functions.

The crate depends on specific versions of Boost and RDKit (some headers and the shared library); see also the installation section.
The rdkitcffi.so shared library is downloaded from Azure during the build. This could be done in a better and more dynamic way.

Currently, only Linux is supported; however, support for macOS should also be viable.

Please note that there is also a cargo crate providing a low-level wrapper for RDKit, i.e. one that does not use the CFFI interface.


Basic usage:

use rdkitcffi::Molecule;

let smiles = "OCCC#CO";
let mol = Molecule::new(smiles, "").unwrap();

let natoms = mol.get_numatoms();

Additional arguments can be passed via JSON:

use rdkitcffi::Molecule;

let json_args = "{\"removeHs\":false,\"canonical\":false}";
let mol = Molecule::new("c1cc(O[H])ccc1", json_args).unwrap();
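
Hand-escaping the quotes in the JSON string is easy to get wrong. As a minimal sketch, the same argument string can be written as a Rust raw string literal or assembled by a small helper. Note that `json_flags` is a hypothetical helper written for this illustration, not part of rdkitcffi, and it only handles boolean options whose keys need no escaping.

```rust
/// Hypothetical helper (not part of rdkitcffi): builds a flat JSON object
/// from boolean options. Keys are assumed to need no JSON escaping.
fn json_flags(flags: &[(&str, bool)]) -> String {
    let body: Vec<String> = flags
        .iter()
        .map(|(key, value)| format!("\"{}\":{}", key, value))
        .collect();
    format!("{{{}}}", body.join(","))
}

fn main() {
    // A raw string literal avoids the backslash escapes entirely.
    let raw = r#"{"removeHs":false,"canonical":false}"#;

    // The helper produces the same text from key/value pairs.
    let built = json_flags(&[("removeHs", false), ("canonical", false)]);

    assert_eq!(raw, built);
    println!("{}", built);
}
```

Either form can then be passed as the second argument of `Molecule::new`, exactly like `json_args` above.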

Working with SD files and filtering invalid molecules:

use rdkitcffi::{Molecule,read_sdfile};

// each entry is None if the corresponding record could not be parsed
let mol_opt_list: Vec<Option<Molecule>> = read_sdfile("data/test.sdf");
// keep only the valid molecules
let mut mol_list: Vec<Molecule> = mol_opt_list.into_iter().flatten().collect();
mol_list.iter_mut().for_each(|m| m.remove_all_hs());
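
The filtering step relies only on standard iterator adapters: `flatten()` (or, equivalently, `filter_map(|m| m)`) drops the `None` entries from a list of `Option` values. The sketch below shows the pattern with a stand-in `parse_record` function in place of `read_sdfile`, so it runs without RDKit. The stand-in and its "valid means non-empty" rule are assumptions made purely for illustration.

```rust
/// Stand-in for a reader that may fail per record (hypothetical; it only
/// mimics the `Vec<Option<...>>` shape used in the example above).
/// A record counts as valid when it is non-empty after trimming.
fn parse_record(record: &str) -> Option<String> {
    let trimmed = record.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Keeps only the successfully parsed records and reports how many were dropped.
fn keep_valid(records: &[&str]) -> (Vec<String>, usize) {
    let parsed: Vec<Option<String>> = records.iter().map(|r| parse_record(r)).collect();
    let total = parsed.len();
    // `flatten()` on an iterator of Options yields only the `Some` values.
    let valid: Vec<String> = parsed.into_iter().flatten().collect();
    let dropped = total - valid.len();
    (valid, dropped)
}

fn main() {
    let (valid, dropped) = keep_valid(&["CCO", "  ", "c1ccccc1"]);
    println!("kept {} records, dropped {}", valid.len(), dropped);
}
```

Counting the dropped entries before discarding them is a cheap way to notice when an input file contains more invalid records than expected.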

Dealing with invalid molecules

use rdkitcffi::Molecule;

let result = Molecule::new("OCCO", "");
match result {
    Some(m) => println!("Result: {:?}", m),
    None => println!("Could not get molecule!"),
}

Getting a JSON representation (via serde_json):

use rdkitcffi::Molecule;

let mol = Molecule::new("OCCO", "").unwrap();
println!("json: {:?}", mol.get_json(""));

Neutralizing a zwitterion

use rdkitcffi::Molecule;

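let mut mol = Molecule::new("C(C(=O)[O-])[NH3+]", "").unwrap();
// neutralize the charges in place; this assumes the crate's `neutralize` method,
// which takes a JSON options string like the other calls
mol.neutralize("");
println!("{:?}", mol.get_smiles(""));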

Computing RDKit descriptors

use rdkitcffi::Molecule;

let mol = Molecule::new("CCCN", "").unwrap();
let desc = mol.get_descriptors_as_dict();
let nrot = desc.get("NumRotatableBonds");
let logp = desc.get("CrippenClogP");

Creating a polars dataframe:

use rdkitcffi::Molecule;
use polars::prelude::*;
use polars::df;

let mol_list: Vec<Molecule> = rdkitcffi::read_smifile_unwrap("data/test.smi");
let a: Vec<_> = mol_list.iter().map(|m| m.get_smiles("")).collect();
let df = df!("smiles" => a).unwrap();


Installation

Currently only Linux is supported.
In some cases you may also have to install some additional packages first:

sudo apt-get install build-essential
sudo apt-get install libclang-dev

Download the repo:

git clone https://github.com/chrissly31415/rdkitcffi.git  

If you have a Rust/cargo installation, just run:

cd rdkitcffi
cargo build  
cargo test --lib  

After installation you may want to update your LD_LIBRARY_PATH in order to run binaries without cargo, e.g.:

export LD_LIBRARY_PATH=/home/username/rdkitcffi/lib/rdkitcffi_linux/linux-64/:$LD_LIBRARY_PATH
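
If a binary fails to start because the shared library cannot be found, it can help to check that the directory is really on the search path. The following is a minimal sketch using only the Rust standard library. The helper name `path_list_contains` is made up for this example, and the directory is the placeholder path from the export line above.

```rust
use std::env;
use std::path::Path;

/// Hypothetical helper: returns true if `dir` is one of the entries in a
/// path list such as LD_LIBRARY_PATH (colon-separated on Linux).
/// Paths are compared by components, so a trailing slash does not matter.
fn path_list_contains(path_list: &str, dir: &str) -> bool {
    env::split_paths(path_list).any(|entry| entry.as_path() == Path::new(dir))
}

fn main() {
    // Placeholder directory taken from the export line; adjust to your checkout.
    let dir = "/home/username/rdkitcffi/lib/rdkitcffi_linux/linux-64/";
    // An unset variable is treated like an empty path list.
    let current = env::var("LD_LIBRARY_PATH").unwrap_or_default();
    if path_list_contains(&current, dir) {
        println!("{} is on LD_LIBRARY_PATH", dir);
    } else {
        println!("{} is missing from LD_LIBRARY_PATH", dir);
    }
}
```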

Using it in your project

Modify your Cargo.toml file:

[dependencies]
rdkitcffi = { path = "/pathtorepo/rdkitcffi" }