automatically import nbviewer
bendichter opened this issue · 3 comments
It has become common for Python docs to reference Jupyter notebooks through external links to http://nbviewer.ipython.org/ (e.g. http://nbviewer.jupyter.org/github/cvxgrp/cvx_short_course/blob/master/applications/portfolio_optimization.ipynb). I have found that these notebooks are often the most useful part of the docs, but doc2dash does not capture them automatically. I propose a feature that automatically downloads these notebooks.
I've implemented this with the following Python script, which I've only tested on cvxpy.
import os
import re
import subprocess
from glob import glob

from requests import get
from tqdm import tqdm

html_dir = '.../cvxpy/doc/build/html'

for filename in glob(os.path.join(html_dir, '**/*.html'), recursive=True):
    with open(filename, 'r') as content_file:
        content = content_file.read()
    # offsets of every nbviewer link in this page (only the nbviewer.ipython.org host is matched)
    nbv_inds = [m.start() for m in re.finditer(r'http://nbviewer\.ipython\.org', content)]
    if not nbv_inds:
        continue
    content_out = content
    dest = os.path.dirname(filename)
    for nbv_ind in tqdm(nbv_inds, desc='downloading and converting notebooks from ' + filename):
        # the link runs up to the closing quote of the attribute that holds it
        nbv_address = content[nbv_ind:content.find('"', nbv_ind)]
        name = nbv_address[nbv_address.rfind('/') + 1:]
        nb_fname = name.replace('.ipynb', '.html')
        # download the raw notebook from GitHub
        dl_address = nbv_address.replace('nbviewer.ipython.org/github', 'raw.githubusercontent.com')
        dl_address = dl_address.replace('blob/', '')
        response = get(dl_address)
        response.raise_for_status()
        # write the .ipynb file next to the page that links to it
        nb_fullpath = os.path.join(dest, name)
        with open(nb_fullpath, 'wb') as file:
            file.write(response.content)
        # convert the notebook to HTML, then delete the downloaded .ipynb
        subprocess.run(['jupyter', 'nbconvert', '--to', 'html', '-y',
                        '--output-dir', dest, nb_fullpath], check=True)
        os.remove(nb_fullpath)
        content_out = content_out.replace(nbv_address, nb_fname)
    # write the page back with links pointing at the local copies
    with open(filename, 'w') as content_file:
        content_file.write(content_out)
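The URL rewriting step in the script could be isolated roughly as below. This is only a sketch under stated assumptions: `nbviewer_to_raw` and `local_html_name` are hypothetical helper names, not part of doc2dash, and the pattern assumes links of the form `nbviewer.<ipython|jupyter>.org/github/<owner>/<repo>/blob/<branch>/<path>`, like the example link in this issue. It accepts both nbviewer hosts and drops only the `blob/` segment that follows the repository name.

```python
import re

# Hypothetical helpers for illustration; not part of doc2dash.
# Assumed link shape: nbviewer.<host>.org/github/<owner>/<repo>/blob/<branch>/<path>
_NBVIEWER_RE = re.compile(
    r'^https?://nbviewer\.(?:ipython|jupyter)\.org/github/'
    r'(?P<owner>[^/]+)/(?P<repo>[^/]+)/blob/(?P<rest>.+)$'
)


def nbviewer_to_raw(url):
    """Map an nbviewer GitHub link to its raw.githubusercontent.com URL.

    Returns None when the URL does not have the assumed shape.
    """
    match = _NBVIEWER_RE.match(url)
    if match is None:
        return None
    return 'https://raw.githubusercontent.com/{owner}/{repo}/{rest}'.format(
        **match.groupdict()
    )


def local_html_name(url):
    """Name of the converted page: last path segment with .ipynb swapped for .html."""
    name = url.rsplit('/', 1)[-1]
    if name.endswith('.ipynb'):
        name = name[:-len('.ipynb')] + '.html'
    return name
```

Anchoring the pattern on the `/blob/` segment directly after the repository name avoids altering a notebook path that happens to contain `blob/` further down, which a plain string replace would also strip.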
I’m sorry I’ve left you hanging for (checks calendar) 7 years. I regularly checked the issue, didn’t quite understand what it means and swore to myself to check again soon. I still don’t quite understand what it’s about but I also don’t think you’re still interested in pursuing this. Sorry.
TBH I don't know what this is about anymore either 😂