Show "citations" on the publications section
porduna commented
It would be useful to show "citations" for each publication, so that readers can see which publications cite it. This would be a simple link to Google Scholar pointing to that publication.
Doing so semi-automatically is not too difficult, but it requires changing the model to add a new field, such as "scholar_id", so that the visualization can provide the link.
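As a rough illustration of that model change, here is a minimal sketch. The `Publication` class, its `title` field, and the `citations_url` method are hypothetical stand-ins, since the project's real model is not shown in this issue; only the "scholar_id" field name and the `cites=` URL come from the text. It is written for Python 3.

```python
from dataclasses import dataclass
from typing import Optional

# URL prefix taken from this issue; the identifier is appended to it.
SCHOLAR_CITES_URL = "http://scholar.google.es/scholar?cites="


@dataclass
class Publication:
    """Hypothetical stand-in for the project's publication model."""
    title: str
    # Proposed new field: the Google Scholar identifier, if one was found.
    scholar_id: Optional[str] = None

    def citations_url(self) -> Optional[str]:
        """Return the "cited by" link, or None when no identifier is stored."""
        if not self.scholar_id:
            return None
        return SCHOLAR_CITES_URL + self.scholar_id
```

The visualization would then render the link only when `citations_url()` returns a value.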
One way to obtain the identifier is to query Google Scholar with the quoted title of the paper followed by the authors:
scholar.google.com/scholar?q=%22TITLE_OF_THE_PAPER%22author1%20author2
Then find the first match of the regexp "cache:([^:]+):" in the results page. It comes from a link like the following:
<a href="http://scholar.googleusercontent.com/scholar?q=cache:bPrZe0NyrG0J:scholar.google.com/+Learning+Analytics+on+federated+remote+laboratories:+tips+and+techniques&hl=es&as_sdt=0,5" class="gs_nvi">Versión en HTML</a>
The captured group (bPrZe0NyrG0J in this link) is the identifier in base64. You can then use:
http://scholar.google.es/scholar?cites=bPrZe0NyrG0J
to show the publications that cite it.
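The two URLs in the steps above could be built along these lines. This is only a sketch in Python 3: the function names `build_search_url` and `build_cites_url` are made up for illustration, and the `scholar.google.es` host is simply the one used elsewhere in this issue.

```python
from urllib.parse import quote_plus

SCHOLAR_BASE = "http://scholar.google.es/scholar"


def build_search_url(title, authors=()):
    """Search URL: the title as an exact (quoted) phrase, then author names."""
    terms = ['"%s"' % title] + list(authors)
    # quote_plus encodes the quotes as %22 and the spaces as +
    return SCHOLAR_BASE + "?q=" + quote_plus(" ".join(terms))


def build_cites_url(scholar_id):
    """URL listing the publications that cite the given identifier."""
    return SCHOLAR_BASE + "?cites=" + quote_plus(str(scholar_id))
```

For instance, `build_cites_url("bPrZe0NyrG0J")` gives the `cites=` link shown above.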
Example:
import re
import urllib2

# Search Google Scholar for the exact title, sending a browser-like User-Agent.
url = "http://scholar.google.es/scholar?q=%22Towards+federated+interoperable+bridges+for+sharing+educational+remote+laboratories%22"
req = urllib2.Request(url, headers={'User-Agent': 'Google Chrome'})
contents = urllib2.urlopen(req).read()

# The first "cache:<token>:" match holds the identifier in base64.
regex = re.compile(r"cache:([^:]+):")
token = regex.findall(contents)[0]
print token
# This prints: bPrZe0NyrG0J

# To convert it to the numeric form: decode the base64, reverse the bytes,
# drop the first byte of the reversed string and read the rest as hex.
hex_token = ""
for byte in token.decode('base64')[::-1][1:]:
    hex_token += hex(ord(byte)).split('0x')[1].zfill(2)
print int(hex_token, 16)
# Both identifiers (base64 and numeric) work in:
#
# "http://scholar.google.es/scholar?cites=" + identifier
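For reference, the numeric conversion from the example can be expressed more compactly in Python 3. This is a sketch that mirrors the same byte handling (drop the last decoded byte, read the rest as a little-endian integer); the function name `scholar_id_to_int` is made up here.

```python
import base64


def scholar_id_to_int(token):
    """Convert the base64 identifier to its numeric form."""
    raw = base64.b64decode(token)
    # Reversing the bytes and skipping the first reversed byte, as the
    # example does, is the same as reading all but the last byte as a
    # little-endian integer.
    return int.from_bytes(raw[:-1], byteorder="little")
```

With the token from the example, `scholar_id_to_int("bPrZe0NyrG0J")` should give the same number as the loop above.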
If the paper is not yet in Google Scholar, the citations link should not be added.
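One way to handle that case is to treat a missing "cache:" match as "not found" instead of indexing the result list directly. A minimal Python 3 sketch, where `find_scholar_id` and `citations_link` are hypothetical helper names and the results page is passed in as a string:

```python
import re

# Same regexp as in the example above.
CACHE_TOKEN = re.compile(r"cache:([^:]+):")


def find_scholar_id(results_html):
    """Return the base64 identifier, or None if the page has no match."""
    match = CACHE_TOKEN.search(results_html)
    return match.group(1) if match else None


def citations_link(results_html):
    """Return the "cites" URL, or None when the paper was not found."""
    scholar_id = find_scholar_id(results_html)
    if scholar_id is None:
        return None  # paper not in Google Scholar yet: add no link
    return "http://scholar.google.es/scholar?cites=" + scholar_id
```

A caller would store "scholar_id" only when `find_scholar_id` returns a value, and try again later otherwise.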