chembl/chembl_webresource_client

Issues when searching list of molecules by name

gcolmenarejo opened this issue · 0 comments

Hi,
I'm trying to look up a large set of molecules by name in order to get their chembl_id, using:

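    # Assuming nc is the client's new_client object
    from chembl_webresource_client.new_client import new_client as nc

    for i, name in enumerate(nams):
        try:
            # Check the result set itself: indexing [0] on an empty result
            # raises IndexError, which would be reported as an error below
            res = nc.molecule.search(name)
            if len(res) == 0:
                print("not found data for ", ids[i], name)
            else:
                print("OK found data for ", ids[i], name)
        except Exception:
            print("error for ", ids[i], name)
        print("Finished ", i)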

However, this takes a huge amount of time. It starts very fast but then slows down, and it retrieved only about 2K molecules in one day. Is it possible to run the search in batches of names? I tried something like

nc.molecule.search(nams[i:(i+100)])

but it did not work:
TypeError: quote_from_bytes() expected bytes
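
The message matches what Python's standard `urllib.parse.quote` raises when it is given something that is neither `str` nor bytes. A plausible reading (an assumption, not confirmed from the client's source) is that `search` URL-encodes its argument and therefore expects a single string. The sketch below reproduces the message with the standard library only. It also shows a hypothetical `chunked` helper for walking a name list in slices of 100 while still passing one name per call.

```python
from urllib.parse import quote


def quote_error_message(value):
    """Return the TypeError message raised when URL-quoting `value`, or None."""
    try:
        quote(value)
    except TypeError as exc:
        return str(exc)
    return None


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# A list is neither str nor bytes, so quoting it fails.
print(quote_error_message(["Hericenone B", "another name"]))
# A single name is quoted without error.
print(quote("Hericenone B"))

names = ["name%d" % n for n in range(250)]  # placeholder names, not real data
for batch in chunked(names, 100):
    # Each name in the batch would still be searched one at a time.
    print(len(batch))
```

Chunking this way does not reduce the number of requests. It only gives natural points at which to save progress or pause between batches.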

In addition, a very large percentage of the searches return an error like the one below. In this case the search was for "Hericenone B", but the error is not reproducible: sometimes it is returned and sometimes not.

Error for url https://www.ebi.ac.uk/chembl/api/data/molecule/search.json, server response: <!doctype html>
<title>Server error &lt; EMBL-EBI</title>

Something has gone wrong with our web server

Our web server says this is a 500 internal server error: the request cannot be carried out by the server.
This problem means that the service you are trying to access is currently unavailable. We're very sorry.

Please try again but if it keeps happening, you can contact us and we will try to help you.

How can I run a search for a long list of names?
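
Since the 500 responses are intermittent, one option is to retry each name a few times with an increasing pause, and to record which names were not found separately from those that kept failing. The following is only a sketch under stated assumptions: `search_fn` is a hypothetical callable that takes one name, returns a list of hits, and raises on a server error. `search_with_retry` and `lookup_all` are illustrative names and not part of the client.

```python
import time


def search_with_retry(search_fn, name, max_attempts=4, base_delay=1.0,
                      sleep=time.sleep):
    """Call search_fn(name), retrying with exponential backoff on exceptions.

    Returns (results, None) on success, or (None, last_exception) once
    every attempt has failed.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return list(search_fn(name)), None
        except Exception as exc:  # e.g. an intermittent HTTP 500
            last_error = exc
            if attempt < max_attempts - 1:
                # Wait base_delay, then 2x, 4x, ... before the next attempt.
                sleep(base_delay * 2 ** attempt)
    return None, last_error


def lookup_all(search_fn, names, **retry_kwargs):
    """Search each distinct name once; return (found, missing, failed)."""
    found, missing, failed = {}, [], []
    for name in dict.fromkeys(names):  # drop duplicates, keep order
        results, error = search_with_retry(search_fn, name, **retry_kwargs)
        if error is not None:
            failed.append(name)      # still erroring after all retries
        elif not results:
            missing.append(name)     # request succeeded, no hits
        else:
            found[name] = results[0]  # keep the first hit only
    return found, missing, failed
```

With the real client, `search_fn` would wrap `nc.molecule.search` and force evaluation inside the function, so that a server error surfaces within the retry loop. The names left in `failed` can be retried in a later run.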

Thanks

Gonzalo