grahamgower/demesdraw

Use in pyodide/ jupyterlite

Closed this issue · 15 comments

At the moment demesdraw uses four packages that require compiling. This makes it hard to use in jupyterlite:

# within a jupyterlite session
import micropip
await micropip.install('demesdraw', keep_going=True)
ValueError                                Traceback (most recent call last)
Cell In[10], line 17
     14 import tskit
     15 #import tsinfer
     16 #import tsdate
---> 17 await micropip.install('demesdraw', keep_going=True)

File /lib/python3.11/site-packages/micropip/_micropip.py:580, in install(requirements, keep_going, deps, credentials, pre)
    578 if transaction.failed:
    579     failed_requirements = ", ".join([f"'{req}'" for req in transaction.failed])
--> 580     raise ValueError(
    581         f"Can't find a pure Python 3 wheel for: {failed_requirements}\n"
    582         f"See: {FAQ_URLS['cant_find_wheel']}\n"
    583     )
    585 wheel_promises = []
    586 # Install built-in packages

ValueError: Can't find a pure Python 3 wheel for: 'ecos>=2', 'fastcache', 'cvxcanon>=0.0.22', 'scs>=1.1.3'
See: https://pyodide.org/en/stable/usage/faq.html#micropip-can-t-find-a-pure-python-wheel

Is there any way of making these packages optional?

Hey @hyanwong. These are all indirect dependencies, which I'm reasonably sure get pulled in by the direct dependency on cvxpy. I'm using cvxpy for constrained optimisation to choose the positions of demes so that they're closer to other demes to whom they're most closely related (defined as those with ancestor/descendent relationships or migration/pulse interactions), with the constraint corresponding to a fixed left-to-right ordering of demes (which I optimise separately to minimise line crossings). It's a bit complicated, but the code is here and here in case you're curious. It's probably underdocumented. I was using one of the scipy.optimize() methods to start with, but it just never gave sensible results. I also played with kiwisolver, which matplotlib already uses, but I couldn't figure out how to describe my constraints with their API. If you have suggestions for alternatives, I'd certainly be happy to avoid this dependency!

Thanks for the details. I wonder if it is possible to switch to using scipy.optimize() if cvxpy is not available? Or, as you say, work out how to specify it in kiwisolver, which is in pyodide.

Seems like the kiwisolver route might be worth investigating, but I'm not sure I would have the expertise to get this off the ground. I wonder if anyone else might be able to help?

If we do get it working in pyodide, it would make tutorial workbooks etc withe demesdraw much easier to put online.

I wonder if chatGPT would be able to convert the cvxpy API call to a kiwisolver alternative?!!

From memory, there were cases where scipy.optimize() would suggest positions that broke the constraints in weird ways. I forget the details. Kiwisolver would be my ideal choice, but my final conclusion was that it didn't offer a way to do arbitrary linear constraints, which is what I want/need. But maybe there is some other optimisation library that could work?

I wonder if chatGPT would be able to convert the cvxpy API call to a kiwisolver alternative?!!

Hmm... We could ask chatGPT to give the code that Open-Assistant would suggest. :)
https://github.com/LAION-AI/Open-Assistant

Here's what chatGPT says, FWIW (it's probably half-nonsense)

Convert the following cvxpy code to use kiwisolver:
...

from kiwisolver import Solver

def cvxpy_optimise(
    graph: demes.Graph,
    positions: Mapping[str, float],
    *,
    sep: float,
    unique_interactions: bool,
) -> Dict[str, float]:
    """
    Optimise the given positions into a tree-like layout.
    Convex optimisation is used to minimise the distances:
      - from each parent deme to the mean position of its children,
      - and between interacting demes (where interactions are either
        migrations or pulses).
    Subject to the constraints that:
      - demes are ordered like in the input ``positions``,
      - and demes have a minimum separation distance ``sep``.
    :param graph:
        Graph for which positions should be optimised.
    :param positions:
        A dictionary mapping deme names to positions.
    :param sep:
        The minimum separation distance between contemporary demes.
    :return:
        A dictionary mapping deme names to positions.
    """
    contemporaries = coexistence_indices(graph)
    if len(contemporaries) == 0:
        # There are no constraints, so stack demes on top of each other.
        return {deme.name: 0 for deme in graph.demes}
    successors = successors_indices(graph)
    interactions = interactions_indices(graph, unique=unique_interactions)

    vs = [Solver().Variable() for _ in graph.demes]
    # Place the root at position 0.
    vs[0].value = 0

    z = 0
    for parent, children in successors.items():
        a = vs[parent]
        b = sum(vs[child] for child in children) / len(children)
        z += (a - b) ** 2
    if len(interactions) > 0:
        z += Solver().sum([vs[j] - vs[k] for j, k in interactions]).squared()

    solver = Solver()
    objective = solver.Minimize(z)

    constraints = []
    x = np.array([positions[deme.name] for deme in graph.demes])
    for j, k in contemporaries:
        if x[j] < x[k]:
            j, k = k, j
        constraints.append(vs[j] - vs[k] >= sep)

    prob = solver.create_problem_with_constraints(objective, constraints)
    prob.solve()
    if prob.status != solver.OPTIMAL:
        raise RuntimeError(f"Failed to optimise: {prob}")

    return {graph.demes[j].name: float(v.value()) for j, v in enumerate(vs)}

Note that KiwiSolver is a solver for convex optimization problems, similar to CVXPY. The main differences are the syntax for variable and constraint creation, as well as the way the solver is called and problem status is checked. KiwiSolver uses the Solver() class to create variables and constraints, and the create_problem_with_constraints() method to define the problem with the objective and constraints. Finally, the solve() method is called on the problem to solve it, and the status is checked using the status attribute of the problem object.

lol. Did it just make up an API for kiwisolver?

Apparently so!

Having said which, this is actually the sort of thing that AI models should be able to do relatively easily: simply translate one set of API calls into an equivalent using a different library that has similar functionality. There should be enough code examples on the internet for it to get most of it right.

One thing that might actually help is to add a few more comment lines into the body of the cvxpy_optimise function, then it's easier to use copilot & friends to suggest ways to implement commands in Kiwisolver-speak. But it might be that I've misunderstood, and the specific optimisation you want isn't possible to specify in kiwisolver.

Hi @hyanwong. I've just committed a change that removes cvxpy and uses scipy again. The scipy docs are quite a lot better than when I last looked, so this proved reasonably easy. A visual inspection of my test cases (the examples page in the docs) shows negligible differences, and I added a bunch of new tests, so I'm hopeful that this doesn't break anything.

Could you check if things work for you now with pyodide? I'm not up to speed with that stuff, so I'm not sure what I should be looking at there. If it works for you, I'll make a new release in the next week or so.

Ah, that's so great, thanks. Does demesdraw make wheels available somewhere, which I think is what I need to install a dev/github version on pyodide?

Hmm, the micropip.install() thingo is a bit funny---it wont't just install a pure python package from a git url like the command line pip will. I'll do a beta release today to get some wheels up on test.pypi.org, which should be enough to test micropip.

It works! I get the expected output for the following:

import micropip
await micropip.install("https://test-files.pythonhosted.org/packages/b3/f3/b70933fdf5adf11a623c7cfa75fabe6be55078ed189b46565ada198f46f7/demesdraw-0.4.0b1-py3-none-any.whl", keep_going=True)
# Output svg. Why is fuzzy .png the notebook default?
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats("svg")

import demes
import demesdraw

demog = demes.from_ms("-I 2 0 0 0.1 -es 1 1 0.99 -ej 1 3 2 -ej 100 2 1", N0=1)
demesdraw.tubes(demog, log_time=True)

There are a couple of other things I want to do before release, but I should have some time to make this happen next week.

That's fantastic - thanks for your work on this. Hope it didn't take too much time.