This is an 🚧 experimental 🚧 Rust wrapper for some of the functionality of the great RDKit cheminformatics library.
It uses RDKit's new C Foreign Function Interface (CFFI); see also this blog post.
Use it at your own risk; it is not yet recommended for production use :-)
Please note that RDKit exposes only limited functionality via the CFFI, and not all of it is available through this interface yet.
Have a look at the examples below and the test functions.
The crate depends on specific versions of Boost and RDKit (some headers and the shared library); see also the installation section.
The rdkitcffi.so shared library is downloaded from Azure during the build. This could be done in a better and more dynamic way.
Currently, only Linux is supported; however, macOS support should also be feasible.
Please note that there is also a cargo crate providing a low-level wrapper for RDKit, i.e. one that does not use the CFFI.
Basic usage:
use rdkitcffi::Molecule;
let smiles = "OCCC#CO";
let mol = Molecule::new(smiles, "").unwrap();
let natoms = mol.get_numatoms();
Additional arguments can be passed as a JSON string:
use rdkitcffi::Molecule;
let json_args = "{\"removeHs\":false,\"canonical\":false}";
let mol = Molecule::new("c1cc(O[H])ccc1", json_args).unwrap();
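The escaped JSON string above is easy to mistype. As a minimal sketch using only the standard library, the same arguments can be written as a raw string literal or assembled from flags. The helper `bool_flags_to_json` and the constant `JSON_ARGS_RAW` are hypothetical names introduced for illustration and are not part of rdkitcffi; the helper assumes keys are plain identifiers that need no escaping.

```rust
/// Hypothetical helper (not part of rdkitcffi): builds a flat JSON object
/// from boolean flags. Assumes the keys contain no characters that would
/// need JSON escaping.
fn bool_flags_to_json(flags: &[(&str, bool)]) -> String {
    let body: Vec<String> = flags
        .iter()
        .map(|(key, value)| format!("\"{}\":{}", key, value))
        .collect();
    // `{{` and `}}` are literal braces in a format string.
    format!("{{{}}}", body.join(","))
}

/// The same arguments as in the example above, written as a raw string
/// literal so that the inner quotes need no backslashes.
const JSON_ARGS_RAW: &str = r#"{"removeHs":false,"canonical":false}"#;
```

Either form yields a `String` or `&str` that can be passed wherever the examples in this README pass a JSON argument string.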
Working with SD files and filtering invalid molecules:
use rdkitcffi::{Molecule, read_sdfile};
// Each entry is None if the corresponding record could not be parsed.
let mol_opt_list: Vec<Option<Molecule>> = read_sdfile("data/test.sdf");
// Keep only the successfully parsed molecules.
let mut mol_list: Vec<Molecule> = mol_opt_list.into_iter().flatten().collect();
mol_list.iter_mut().for_each(|m| m.remove_all_hs());
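Filtering as above discards the positions of the records that failed to parse. If those positions matter, for example to report which SD records were invalid, a sketch along the following lines may help. The function `split_valid` is a hypothetical helper, not part of rdkitcffi; it is generic, so it should work on a `Vec<Option<Molecule>>` as well as on the plain values used for testing.

```rust
/// Hypothetical helper (not part of rdkitcffi): splits a list of optional
/// entries into the valid values and the indices of the entries that were None.
fn split_valid<T>(entries: Vec<Option<T>>) -> (Vec<T>, Vec<usize>) {
    let mut valid = Vec::new();
    let mut failed = Vec::new();
    for (index, entry) in entries.into_iter().enumerate() {
        match entry {
            Some(value) => valid.push(value),
            None => failed.push(index),
        }
    }
    (valid, failed)
}
```

The returned indices are zero-based positions in the input list, so they can be mapped back to the records of the input file.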
Dealing with invalid molecules:
use rdkitcffi::Molecule;
let result = Molecule::new("OCCO", "");
match result {
Some(m) => println!("Result: {:?}", m),
None => println!("Could not get molecule!"),
};
Getting a JSON representation (via serde_json):
use rdkitcffi::Molecule;
let mol = Molecule::new("OCCO", "").unwrap();
println!("json: {:?}", mol.get_json(""));
Neutralizing a zwitterion:
use rdkitcffi::Molecule;
let mut mol = Molecule::new("C(C(=O)[O-])[NH3+]", "").unwrap();
mol.neutralize("");
println!("{:?}", mol.get_smiles(""));
Computing RDKit descriptors:
use rdkitcffi::Molecule;
let mol = Molecule::new("CCCN", "").unwrap();
let desc = mol.get_descriptors_as_dict();
let nrot = desc.get("NumRotatableBonds");
let logp = desc.get("CrippenClogP");
Creating a polars dataframe:
use rdkitcffi::Molecule;
use polars::prelude::*;
use polars::df;
let mol_list: Vec<Molecule> = rdkitcffi::read_smifile_unwrap("data/test.smi");
let a: Vec<_> = mol_list.iter().map(|m| m.get_smiles("")).collect();
let df = df!( "smiles" => a).unwrap();
Currently, only Linux is supported.
In some cases you may also need to install some additional packages before building:
sudo apt-get install build-essential
sudo apt-get install libclang-dev
Download the repo:
git clone https://github.com/chrissly31415/rdkitcffi.git
If you have a Rust/cargo installation, just run:
cd rdkitcffi
cargo build
cargo test --lib
After installation, you may want to update your LD_LIBRARY_PATH so that binaries can find the shared library when run without cargo, e.g.:
export LD_LIBRARY_PATH=/home/username/rdkitcffi/lib/rdkitcffi_linux/linux-64/:$LD_LIBRARY_PATH
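If a binary still fails to find the shared library, it can help to check whether the directory is actually listed in LD_LIBRARY_PATH. The following is a minimal sketch using only the standard library; `path_list_contains` and `library_dir_is_on_ld_path` are hypothetical helper names, not part of rdkitcffi, and the sketch only inspects the environment variable, not other loader search locations.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical helper: returns true if `dir` is one of the entries of a
/// path-list value such as the content of LD_LIBRARY_PATH.
fn path_list_contains(path_list: &OsStr, dir: &Path) -> bool {
    // Path equality compares components, so a trailing slash is ignored.
    env::split_paths(path_list).any(|entry| entry.as_path() == dir)
}

/// Hypothetical helper: checks LD_LIBRARY_PATH of the current process.
/// Returns false if the variable is not set.
fn library_dir_is_on_ld_path(dir: &Path) -> bool {
    env::var_os("LD_LIBRARY_PATH")
        .map(|value| path_list_contains(&value, dir))
        .unwrap_or(false)
}
```

Calling `library_dir_is_on_ld_path` with the directory from the export line above should return true in a shell where that export has been applied.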
To use the crate in your own project, add it to your Cargo.toml file:
[dependencies]
rdkitcffi = {path="/pathtorepo/rdkitcffi"}