Module containing a script that builds a chemical multiverse: it integrates several molecular representations to generate multiple chemical spaces, providing a deeper analysis of structure-multiple activity relationships.

MAYA

MAYA (Multiple Activity Analyzer) automatically constructs a chemical multiverse: it generates multiple visualizations of chemical spaces described by structural descriptors such as MACCS keys (166 bits) and ECFP4 and ECFP6 fingerprints, by molecular descriptors of pharmaceutical relevance, and by biological descriptors. These representations are integrated with various visualization techniques for automated analysis, with a focus on structure-multiple activity/property relationships.

Process

MAYA is a user-friendly, open-source tool that automates the construction of chemical spaces. It integrates different representations to give a more comprehensive description of the structural, chemical, and functional characteristics of a set of molecules. Each molecule is described by its SMILES notation and an associated activity or property. Supported input file types are CSV, TSV, XLSX, JSON, and XML, and the user only needs to specify a few parameters related to the database in use and the desired representations. Options for customizing the visualizations are also included.

The generated visualizations are interactive, which makes the displayed data easier to explore. For each compound they show a 2D view of the structure, together with the obtained variability values and the SMILES notation. The size, shape, and transparency of the data points can be customized, and the color palette can be changed.

The script consists of a function that automatically performs:

  1. Data curation
  2. Descriptor calculation
  3. Tanimoto similarity calculation
  4. Dimensionality reduction
  5. 2D interactive visualization
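Step 3 can be illustrated with a minimal, dependency-free sketch. It assumes each fingerprint is available as a set of on-bit indices; the toy fingerprints and the helper names `tanimoto` and `similarity_matrix` are hypothetical and are not taken from the MAYA code, which computes its fingerprints with RDKit.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def similarity_matrix(fps):
    """Symmetric pairwise Tanimoto matrix as a list of lists."""
    n = len(fps)
    matrix = [[1.0] * n for _ in range(n)]  # self-similarity on the diagonal
    for i, j in combinations(range(n), 2):
        s = tanimoto(fps[i], fps[j])
        matrix[i][j] = matrix[j][i] = s
    return matrix


# Toy fingerprints (hypothetical on-bit indices, not real MACCS or ECFP output)
fingerprints = [
    frozenset({1, 4, 7, 9}),
    frozenset({1, 4, 8}),
    frozenset({2, 3}),
]
sim = similarity_matrix(fingerprints)
for row in sim:
    print(row)
```

The first two fingerprints share 2 of 5 distinct on bits, so their similarity is 0.4; the third shares none with either and scores 0.0. A matrix of this kind is the usual input to the dimensionality reduction step.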

How to use MAYA?

Important

Depending on their interests, users can select which descriptors and dimensionality reduction techniques to use. Setting the corresponding variables to True or False enables or disables their calculation.

Example

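# Example: MACCS and MD are disabled; ECFP, vPCA and t-SNE are enabled.
# Note: t_SNE is an assumed spelling of the t-SNE switch (a hyphen is not valid in a Python keyword); confirm the exact name in the notebook.
chemical_multiverse(file='/content/example.csv', smiles_column_name='SMILES', target_activities=['Target_1', 'Target_2', 'Target_3'], MACCS=False, ECFP=True, MD=False, vPCA=True, t_SNE=True)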

See this notebook for more detailed usage
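Because the switches are plain booleans, a non-boolean value such as the string 'False' would be truthy and could silently enable an option. The sketch below shows one way to guard against that before calling the function. `enabled_options` is a hypothetical helper written for illustration, not part of MAYA, and the keyword names simply mirror the example above (with `t_SNE` as an assumed spelling).

```python
def enabled_options(**flags: bool) -> list[str]:
    """Return the names of all switches set to True, in the order given."""
    for name, value in flags.items():
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be True or False, got {value!r}")
    return [name for name, value in flags.items() if value]


# Same switches as in the example call above
selected = enabled_options(MACCS=False, ECFP=True, MD=False, vPCA=True, t_SNE=True)
print(selected)  # ['ECFP', 'vPCA', 't_SNE']
```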

Why use MAYA?

To perform an automated analysis of your database annotated with any activity, property, or score by constructing a chemical multiverse focused on a deeper understanding of multiple structure-activity relationships.

You can customize the descriptors and techniques used depending on the required focus. You can select which descriptors you want to use, and you can also input a similarity matrix of any desired descriptor, allowing its integration into the generated visualizations.
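Before supplying your own similarity matrix, it can help to check that it is well formed. The following is a minimal sketch, assuming the matrix is a square list of lists with Tanimoto-like values between 0 and 1 and a diagonal of 1. `validate_similarity_matrix` is a hypothetical helper, not a MAYA function, and the input format MAYA actually expects should be taken from the notebook.

```python
import math


def validate_similarity_matrix(matrix, tol=1e-9):
    """Raise ValueError unless the matrix is square, symmetric, in [0, 1], with unit diagonal."""
    n = len(matrix)
    if n == 0:
        raise ValueError("matrix is empty")
    for i, row in enumerate(matrix):
        if len(row) != n:
            raise ValueError(f"row {i} has {len(row)} entries, expected {n}")
    for i in range(n):
        if not math.isclose(matrix[i][i], 1.0, abs_tol=tol):
            raise ValueError(f"diagonal entry ({i}, {i}) is not 1")
        for j in range(i + 1, n):
            upper, lower = matrix[i][j], matrix[j][i]
            if not math.isclose(upper, lower, abs_tol=tol):
                raise ValueError(f"entries ({i}, {j}) and ({j}, {i}) differ")
            if not (-tol <= upper <= 1.0 + tol):
                raise ValueError(f"entry ({i}, {j}) is outside [0, 1]")
    return True


# Hypothetical 3 x 3 similarity matrix for three compounds
custom = [
    [1.0, 0.4, 0.0],
    [0.4, 1.0, 0.25],
    [0.0, 0.25, 1.0],
]
print(validate_similarity_matrix(custom))
```

The [0, 1] range check fits Tanimoto-style similarities; it would need to be relaxed for measures with a different range.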

Access to well-documented code is provided, covering database curation processes, similarity calculations, and dimensionality reduction techniques.

Usage

  1. Google Colaboratory
    The easiest way to use the script is to open it in Google Colab. The only requirement is a Google account.
  2. Local installation
    You can also set up your own local environment if you do not want to run the script through a Google service.

Additional Information

MAYA currently supports Python 3.10

rdkit (2022.09.05)

matplotlib (3.7.1)

pandas (2.1.4)

seaborn (0.13.1)

sklearn (1.3.2)
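For a local installation, the versions listed above can be compared with what is installed in the current environment. This is a minimal sketch using only the standard library. The distribution names are assumptions (in particular, `sklearn` is assumed to be published as `scikit-learn`), and `check_versions` is a hypothetical helper, not part of MAYA.

```python
from importlib import metadata

# Versions as listed above; the distribution names are assumptions.
PINNED = {
    "rdkit": "2022.09.05",
    "matplotlib": "3.7.1",
    "pandas": "2.1.4",
    "seaborn": "0.13.1",
    "scikit-learn": "1.3.2",
}


def numeric_parts(version: str) -> tuple:
    """Compare versions by their numeric components only, so '2022.09.05' matches '2022.9.5'."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def check_versions(pinned, lookup=metadata.version):
    """Map each distribution to 'ok', 'missing', or 'found <installed version>'."""
    report = {}
    for dist, wanted in pinned.items():
        try:
            found = lookup(dist)
        except metadata.PackageNotFoundError:
            report[dist] = "missing"
            continue
        matches = numeric_parts(found) == numeric_parts(wanted)
        report[dist] = "ok" if matches else f"found {found}"
    return report


for dist, status in check_versions(PINNED).items():
    print(f"{dist:<14}{PINNED[dist]:<12}{status}")
```

Non-numeric suffixes such as `post1` are ignored by this comparison, so it is a quick sanity check and not a strict version match.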

Funding

Research contained in this package was supported by the Consejo Nacional de Humanidades, Ciencia y Tecnología (CONAHCYT) through scholarship No. CVU 1340927.