SOAP Client for querying the Web of Science database
Web of Science (previously Web of Knowledge) is an online subscription-based scientific citation indexing service maintained by Thomson Reuters.
wos is a Python SOAP client (both an API and a command-line tool) for querying the WOS database and retrieving the results as XML data through WWS access.
The package is available on PyPI, so you can install it with pip:

    pip install wos
This README and the documentation for the classes and methods can be accessed on ReadTheDocs.
You can use the wos command to query the Web of Science API. To access data that is only available through the premium API, you also have to authenticate with your username and password.
    usage: wos [-h] [--close] [-l] [-u USER] [-p PASSWORD] [-s SID]
               {query,doi,connect} ...

    Query the Web of Science.

    positional arguments:
      {query,doi,connect}   sub-command help
        query               query the Web of Science.
        doi                 get the WOS ID from the DOI.
        connect             connect and get an SID.

    optional arguments:
      -h, --help            show this help message and exit
      --close               Close session.
      --proxy PROXY         HTTP proxy
      --timeout TIMEOUT     API timeout
      -l, --lite            Wos Lite
      -v, --verbose         Verbose

    authentication:
      API credentials for premium access.

      -u USER, --user USER
      -p PASSWORD, --password PASSWORD
      -s SID, --sid SID
You can use the WOS Lite API by passing the --lite parameter (for each query).
You can also authenticate using a session ID (SID). The command-line utility does not close sessions when it exits, so an SID obtained with connect can be reused in later commands. Example:
    $ wos --user JohnDoe --password 12345 connect
    Authenticated using SID: ABCDEFGHIJKLM

    $ wos --sid ABCDEFGHIJKLM query 'AU=Knuth Donald' -c1
    Authenticated using SID: ABCDEFGHIJKLM
    <?xml version="1.0" ?>
    <records>
        <REC r_id_disclaimer="ResearcherID data provided by Thomson Reuters">
            <UID>WOS:000287850200007</UID>
            <static_data>
                <summary>
                    <EWUID>
                        <WUID coll_id="WOS"/>
                        <edition value="WOS.SCI"/>
                    </EWUID>
                    <pub_info coverdate="MAR 2011" has_abstract="N" issue="1" pubmonth="MAR" pubtype="Journal" pubyear="2011" sortdate="2011-03-01" vol="33">
                        <page begin="33" end="45" page_count="13">33-45</page>
                    </pub_info>
                    <titles count="6">
                        <title type="source">MATHEMATICAL INTELLIGENCER</title>
                        ....

    $ wos --sid ABCDEFGHIJKLM doi '10.1007/s00283-010-9170-7'
    10.1007/s00283-010-9170-7
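Because sessions stay open between invocations, a script can call connect once and pass the resulting SID to later commands. The following is a minimal sketch of that idea, assuming the wos executable is on the PATH and that connect prints a line of the form "Authenticated using SID: ..." as in the example above. The helper names extract_sid and connect_and_get_sid are hypothetical and not part of the package. Which output stream carries the SID line is not stated here, so the sketch searches both.

```python
import re
import subprocess

# Pattern for the SID line printed in the example session above.
SID_PATTERN = re.compile(r"Authenticated using SID:\s*(\S+)")


def extract_sid(output):
    """Return the SID found in the CLI output, or None if there is none."""
    match = SID_PATTERN.search(output)
    return match.group(1) if match else None


def connect_and_get_sid(user, password):
    """Run `wos ... connect` and return the SID it reports (hypothetical helper)."""
    result = subprocess.run(
        ["wos", "--user", user, "--password", password, "connect"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: the SID line may be on either stream, so search both.
    return extract_sid(result.stdout + "\n" + result.stderr)
```

The returned value can then be passed to later commands through the --sid option, as in the session shown above.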
Check the user_query documentation to understand how to create query strings.
You can also use the Python client programmatically:
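    from wos import WosClient
    import wos.utils

    # Authenticate with a username and password; the client is used as a context manager.
    with WosClient('JohnDoe', '12345') as client:
        # Run the query and print the result.
        print(wos.utils.query(client, 'AU=Knuth Donald'))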
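The query result can be processed with the standard library to pull out individual fields. The sketch below assumes the result is an XML string with the shape of the excerpt shown in the command-line example (element names without a namespace prefix). SAMPLE_XML is an abridged, hand-closed stand-in for a real response, and summarize_records is a hypothetical helper, not part of the wos package.

```python
import xml.etree.ElementTree as ET

# Abridged stand-in for a real response, modeled on the excerpt above.
SAMPLE_XML = """<?xml version="1.0" ?>
<records>
  <REC r_id_disclaimer="ResearcherID data provided by Thomson Reuters">
    <UID>WOS:000287850200007</UID>
    <static_data>
      <summary>
        <pub_info coverdate="MAR 2011" pubtype="Journal" pubyear="2011" vol="33" issue="1">
          <page begin="33" end="45" page_count="13">33-45</page>
        </pub_info>
        <titles>
          <title type="source">MATHEMATICAL INTELLIGENCER</title>
        </titles>
      </summary>
    </static_data>
  </REC>
</records>
"""


def summarize_records(xml_text):
    """Return one dict per <REC> with its UID, source title and publication year."""
    root = ET.fromstring(xml_text)
    summaries = []
    for rec in root.iter("REC"):
        pub_info = rec.find(".//pub_info")
        summaries.append({
            "uid": rec.findtext("UID"),
            "source": rec.findtext(".//title[@type='source']"),
            "year": pub_info.get("pubyear") if pub_info is not None else None,
        })
    return summaries


for summary in summarize_records(SAMPLE_XML):
    print(summary["uid"], summary["source"], summary["year"])
```

Fields that are missing from a record come back as None rather than raising, which keeps the loop usable on partial records.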
In wos 0.1.11+, the WosClient class can access the following APIs.
I am not affiliated with Thomson Reuters. The library leverages the Web of Science WWS API (Web Services Premium or Lite), which is a paid service offered by Thomson Reuters. This means that your institution has to pay for Web of Science Core Collection access. Simply registering with Web of Knowledge / Web of Science does not entitle you to access the WWS API service.
So if you receive errors like "No matches returned for Username" or "No matches returned for IP", these errors are thrown directly by the WWS API server. This means that the library is communicating correctly with the server, but you do not have access to the Web Services API. I do understand that you can access the WOS website from your network, but website access and API access (used in this project) are two separate products, and website access does not imply API access, since Thomson Reuters bills them separately. This project does not scrape the website (which would violate the terms of use) but invokes the WWS APIs offered by Thomson Reuters. Thus there is nothing this project can do to help you.
If you think this is an error and you should be entitled to access the services, please contact Thomson Reuters support first and verify that you have WWS access. Please open an issue ONLY when you have (1) verified with Thomson Reuters support that you have WWS access; (2) verified that you are connected from the correct network.
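When scripting against the API, it can help to recognize the two entitlement errors mentioned above and report them differently from other failures. This is a minimal sketch, assuming the server message appears verbatim in the text of the raised exception. The names is_access_error and explain_failure are hypothetical helpers, and the exact exception type raised by the client is not specified here.

```python
# Messages quoted in this README as coming directly from the WWS API server.
ACCESS_ERROR_MARKERS = (
    "No matches returned for Username",
    "No matches returned for IP",
)


def is_access_error(message):
    """Return True if the message contains one of the WWS entitlement errors."""
    return any(marker in message for marker in ACCESS_ERROR_MARKERS)


def explain_failure(exc):
    """Turn an exception into a short message (hypothetical helper)."""
    text = str(exc)
    if is_access_error(text):
        # The library reached the server; the account or network lacks API access.
        return "No WWS API access for this username or network: " + text
    return "Unexpected error: " + text
```

A script could call explain_failure from an except block around its queries and direct the user to their WWS access provider when the first branch is taken.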