OscarPDR/labman_ud

Show "citations" on the publications section

It would be cool to show "citations" for each publication, so you can see which publications cite it. This would be a simple link to the Google Scholar page that lists the works citing this publication.

Doing so semi-automatically is not too difficult, but it requires changing the model to add a new field, such as "scholar_id", so that the visualization can build the link.

One way to do this: first, search Google Scholar for the quoted title followed by the author names:

scholar.google.com/scholar?q=%22TITLE_OF_THE_PAPER%22%20author1%20author2
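
A minimal sketch of building that search URL, written for Python 3. It assumes the title goes inside double quotes and the author names follow it, separated by spaces. The function name `build_scholar_query_url` and its parameters are hypothetical and not part of labman_ud.

```python
from urllib.parse import quote


def build_scholar_query_url(title, authors=(), host="scholar.google.com"):
    """Return a Google Scholar search URL for an exact title plus author names."""
    # Wrap the title in double quotes so the search is for the exact phrase.
    terms = ['"%s"' % title] + list(authors)
    # quote() percent-encodes the double quotes (%22) and the spaces (%20).
    query = quote(" ".join(terms), safe="")
    return "http://%s/scholar?q=%s" % (host, query)


print(build_scholar_query_url("TITLE_OF_THE_PAPER", ["author1", "author2"]))
```

Passing an empty author list gives a title-only query, which is what the longer example in this issue uses.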

Then, in the results page, find the first match of the regular expression "cache:([^:]+):". It comes from a link like the following:

<a href="http://scholar.googleusercontent.com/scholar?q=cache:bPrZe0NyrG0J:scholar.google.com/+Learning+Analytics+on+federated+remote+laboratories:+tips+and+techniques&amp;hl=es&amp;as_sdt=0,5" class="gs_nvi">Versión en HTML</a>
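
The extraction step can be tried offline against the sample link above. This is only a sketch in Python 3: the helper name `extract_scholar_token` is hypothetical, and returning `None` when nothing matches is an assumption about how a missing paper should be handled.

```python
import re

# Sample anchor copied from the results page shown above.
SAMPLE_HTML = (
    '<a href="http://scholar.googleusercontent.com/scholar?q=cache:bPrZe0NyrG0J:'
    'scholar.google.com/+Learning+Analytics+on+federated+remote+laboratories:'
    '+tips+and+techniques&amp;hl=es&amp;as_sdt=0,5" class="gs_nvi">'
    'Versión en HTML</a>'
)

# Capture everything between "cache:" and the next colon.
CACHE_TOKEN_RE = re.compile(r"cache:([^:]+):")


def extract_scholar_token(html):
    """Return the first cache token found in the page, or None if there is none."""
    match = CACHE_TOKEN_RE.search(html)
    return match.group(1) if match else None


print(extract_scholar_token(SAMPLE_HTML))  # bPrZe0NyrG0J
```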

The captured group (bPrZe0NyrG0J in this case) is the publication's identifier, encoded in base64. With it you can build:

http://scholar.google.es/scholar?cites=bPrZe0NyrG0J

to show the publications that cite it.

Example:

import re
import urllib2

req = urllib2.Request("http://scholar.google.es/scholar?q=%22Towards+federated+interoperable+bridges+for+sharing+educational+remote+laboratories%22", headers = { 'User-Agent' : 'Google Chrome' })

urlobj = urllib2.urlopen(req)
contents = urlobj.read()

regex = re.compile(r"cache:([^:]+):")
token = regex.findall(contents)[0]
print token

# This prints the base64 token, for example: bPrZe0NyrG0J
# To convert it to its numeric form, reverse the decoded bytes,
# drop the first one, and read the rest as a hexadecimal number:
hex_token = ""
for byte in token.decode('base64')[::-1][1:]:
    hex_token += hex(ord(byte)).split('0x')[1].zfill(2)

print int(hex_token, 16)

# Either identifier (the base64 token or the number) works in:
#
# "http://scholar.google.es/scholar?cites=" + identifier
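
The snippet above is Python 2 (`urllib2`, the `print` statement, `str.decode('base64')`). As a sketch only, the conversion step could look like this in Python 3. The function names `scholar_token_to_int` and `cites_url` are hypothetical, and the code assumes a 12-character token in the standard base64 alphabet, as in the example.

```python
import base64


def scholar_token_to_int(token):
    """Convert a base64 Scholar token to its numeric form."""
    raw = base64.b64decode(token)  # 9 bytes for a 12-character token
    # Equivalent to reversing the bytes and dropping the first one:
    # ignore the last decoded byte and read the rest as little-endian.
    return int.from_bytes(raw[:-1], "little")


def cites_url(identifier, host="scholar.google.es"):
    """Build the citations URL from either the token or the number."""
    return "http://%s/scholar?cites=%s" % (host, identifier)


token = "bPrZe0NyrG0J"
print(scholar_token_to_int(token))
print(cites_url(token))
```

Using `int.from_bytes` avoids building the hexadecimal string by hand, but the result should match what the loop above computes.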

If the paper is not yet in Google Scholar, no identifier should be stored and no citations link should be shown.
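
To illustrate that rule, here is a minimal Python 3 sketch with an optional "scholar_id". The `Publication` class below is a hypothetical stand-in, not the actual labman_ud model, and the `citations_url` property name is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Publication:
    """Hypothetical stand-in for the publication model."""
    title: str
    scholar_id: Optional[str] = None  # stays empty until the paper is in Scholar

    @property
    def citations_url(self):
        """Return the citations link, or None when there is no scholar_id."""
        if not self.scholar_id:
            return None
        return "http://scholar.google.es/scholar?cites=" + self.scholar_id


indexed = Publication(
    "Learning Analytics on federated remote laboratories: tips and techniques",
    "bPrZe0NyrG0J",
)
missing = Publication("A paper not yet indexed")
print(indexed.citations_url)  # http://scholar.google.es/scholar?cites=bPrZe0NyrG0J
print(missing.citations_url)  # None
```

The visualization would then render the link only when `citations_url` is not `None`.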

@OscarPDR could you please add a "scholar_id" field in the abstract publication you're working on?