This repository is the official implementation of "Weighted Edit Distance optimized using Genetic Algorithm for SMILES-based Compound Similarity", published in Pattern Analysis and Applications (PAAA, SCIE).
Authors: In-Hyuk Choi and Il-Seok Oh
DOI: https://doi.org/10.1007/s10044-023-01141-3
Published: 18 February 2023
Edit distance (Levenshtein distance) has three operations: insert, delete, and substitute. We assign each operation its own weight, which gives the Weighted Edit Distance. Using a Genetic Algorithm (GA), we search for the optimal weight set of the weighted edit distance for each SMILES dataset.
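To illustrate the idea, here is a minimal sketch of a weighted edit distance computed with the standard dynamic-programming recurrence. It is not the code from this repository: the function name `weighted_edit_distance` and the parameters `w_ins`, `w_del`, and `w_sub` are hypothetical, and it assumes each character of a SMILES string is one symbol (so a two-letter atom such as `Cl` counts as two symbols).

```python
def weighted_edit_distance(a, b, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Minimum total cost of turning string a into string b.

    Hypothetical illustration: one weight per operation type,
    applied uniformly to every character.
    """
    n, m = len(a), len(b)
    # dp[i][j] = minimum cost of turning a[:i] into b[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * w_del  # delete every character of a[:i]
    for j in range(1, m + 1):
        dp[0][j] = j * w_ins  # insert every character of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Matching characters cost nothing; otherwise pay w_sub.
            sub_cost = 0.0 if a[i - 1] == b[j - 1] else w_sub
            dp[i][j] = min(
                dp[i - 1][j] + w_del,         # delete a[i-1]
                dp[i][j - 1] + w_ins,         # insert b[j-1]
                dp[i - 1][j - 1] + sub_cost,  # substitute or match
            )
    return dp[n][m]


if __name__ == "__main__":
    # Ethanol vs. ethylamine differ by one substituted character.
    print(weighted_edit_distance("CCO", "CCN", w_sub=0.5))
```

With all weights set to 1 this reduces to the ordinary Levenshtein distance. In a GA setting, a candidate solution could be a weight triple like `(w_ins, w_del, w_sub)` that is scored on a dataset, but the encoding and fitness function used in the paper are not reproduced here.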
conda create -n GA-WeightedEditSimilarity python=3.7 -y
conda activate GA-WeightedEditSimilarity
conda install numpy scipy scikit-learn matplotlib tqdm -y
python main.py -d [e, ic, gpcr, nr]
We use four datasets: Enzyme, Ion channel, GPCR, and Nuclear receptor.
@article{choi2023wes,
title = {{Weighted edit distance optimized using genetic algorithm for SMILES-based compound similarity}},
author = {In-Hyuk Choi and Il-Seok Oh},
doi={10.1007/s10044-023-01141-3},
journal={Pattern Analysis and Applications},
year={2023}
}