graphistry/pygraphistry

[FEAT] - add igraph personalized pagerank

DataBoyTX opened this issue · 4 comments

Describe the bug

igraph pagerank code (using the personalized param) that worked in Graphistry 2.40.46 no longer works in v2.40.55 - appears that

Need to pin the version of igraph that the code is compatible with

To Reproduce
The following pagerank interface is no longer supported in igraph v0.10.4; there is a new API for igraph personalized_pagerank:

e.g.

`g2 = g2.compute_igraph('pagerank',params={'personalization': personalization})`

import pandas as pd

def personal_pagerank_on_goal(g, goal_col='goal'):
    # Rebind edges and nodes to get a new graph object
    g2 = g.edges(g._edges).nodes(g._nodes)
    # Seed the personalization on nodes flagged in goal_col (assumed to be a boolean column)
    personalization = pd.DataFrame({'vertex': g2._nodes[g2._nodes[goal_col]][g._node]}).assign(values=1.0)
    # This is the call that no longer works
    g2 = g2.compute_igraph('pagerank', params={'personalization': personalization})
    # Drop nodes below the median pagerank, then cluster what remains
    low_nodes = g2._nodes[g2._nodes.pagerank < g2._nodes.pagerank.median()]
    g3 = g2.drop_nodes(low_nodes[g._node])
    g3 = g3.compute_igraph('louvain', directed=False, out_col='journey_community')

    # Dropped nodes have no community, so mark them with -1
    merged_nodes = g2._nodes.merge(g3._nodes[[g._node, 'journey_community']], how='left', on=g._node)
    merged_nodes['journey_community'] = merged_nodes['journey_community'].fillna(-1)
    g4 = g2.nodes(merged_nodes)

    # Nodes without a score get pagerank 0.0
    g4 = g4.nodes(g4._nodes.assign(pagerank=g4._nodes.pagerank.fillna(0.0)))
    return g4

@DataBoyTX Can we do a version sniff? We don't get to control what version of igraph regular pygraphistry users are on

if algorithm == 'ppr':
  if igraph.__version__ < xyz:
    ...
  else:
    ...

I don't think this was a major igraph version bump, and it sounds recent-ish, so maintaining compatibility seems worth it.
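One caveat for a version sniff like this: comparing version strings directly is lexicographic, so `'0.9.11' < '0.10.4'` is False even though 0.9.11 is the older release. A minimal stdlib-only sketch of a numeric comparison follows; `version_tuple` and `is_older_than` are hypothetical helper names, and the version strings in the usage lines are illustrative, not the actual cutoff.

```python
import re


def version_tuple(version):
    """Parse the leading numeric parts of a version string, e.g. '0.10.4' -> (0, 10, 4)."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            # Stop at the first piece with no leading digits (e.g. 'dev')
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_older_than(installed, cutoff):
    """Compare two version strings numerically, part by part."""
    return version_tuple(installed) < version_tuple(cutoff)


# Illustrative values only; in real use, `installed` would be igraph.__version__
older = is_older_than('0.9.11', '0.10.4')
```

Tuples compare element by element, so `(0, 9, 11) < (0, 10, 4)` gives the ordering the string comparison gets wrong.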

Edit: ignore this in favor of the immediately following comment, #554 (comment)


After internal discussion:

  • this is a good time to add ppr and any other new igraph bindings
  • we can give a cleaner error message if they do not exist in the user's currently installed igraph version: catch and rethrow the exception
  • for legacy igraph users, whether on old igraph or using our old ppr form, we can reroute to the new ppr with a deprecation warning; on old igraph, stay in the old form but still note the deprecation
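The last two points could look roughly like the sketch below. It uses only the standard library; `resolve_algorithm` and `call_graph_method` are hypothetical names, not existing pygraphistry functions, and the `'personalization'` check simply mirrors the old call form from the report above. Translating the old parameters into the new ones is left out, since the mapping is not specified here.

```python
import warnings


def resolve_algorithm(alg, params):
    """Return the algorithm name to run, warning on the legacy ppr form."""
    if alg == 'pagerank' and 'personalization' in params:
        warnings.warn(
            "pagerank with 'personalization' is deprecated; "
            "use 'personalized_pagerank' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return 'personalized_pagerank'
    return alg


def call_graph_method(graph, alg, **kwargs):
    """Call graph.<alg>(**kwargs), rethrowing a clearer error if the method is missing."""
    try:
        method = getattr(graph, alg)
    except AttributeError as exn:
        # Keep the original exception chained for debugging
        raise NotImplementedError(
            f"Algorithm '{alg}' is not available in the installed graph library; "
            "upgrading it may help"
        ) from exn
    return method(**kwargs)
```

Here `graph` stands in for whatever object exposes the algorithm methods (for example, the result of `to_igraph()`); the lookup by name is what lets a missing binding surface as a readable error instead of a bare `AttributeError`.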

Reviewing a bit more, I think we just need to:

  1. expose personalized_pagerank as part of the compute_igraph options: https://github.com/graphistry/pygraphistry/blame/53448d4ef153fd262466087a951bc28a44c8fadf/graphistry/plugins/igraph.py#L267
  2. add to the examples for consistency w/ compute_cugraph examples
  3. update graphistry + gak repos to latest pyg

Research:

Note: meanwhile, as a workaround, users may be able to do:


import graphistry
import pandas as pd

# graph with nodes and edges
df = pd.DataFrame({
    's': ['a', 'b', 'c', 'd', 'd'],
    'd': ['b', 'c', 'd', 'a', 'e']
})
g1 = graphistry.edges(df, 's', 'd').materialize_nodes()


# new graph where nodes have added column 'ppr'
g2 = g1.nodes(
  g1._nodes.assign(
      ppr=g1.to_igraph().personalized_pagerank(reset_vertices=['b']))
)

# example: inspect the resulting node table
g2._nodes

#id	ppr
#0	a	0.096360
#1	b	0.313812
#2	c	0.266740
#3	d	0.226729
#4	e	0.096360
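For intuition about what the `ppr` column holds, here is a minimal pure-Python power iteration over the same directed edge list. This is a sketch, not igraph's implementation: the damping factor of 0.85 and the choice to send mass from nodes without out-edges back to the reset vertices are assumptions, picked because they are consistent with the numbers above. The function name is hypothetical.

```python
def personalized_pagerank(edges, reset_vertices, damping=0.85, iterations=200):
    """Power-iteration personalized PageRank on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Teleport distribution: uniform over the reset vertices, zero elsewhere
    reset = {n: (1.0 / len(reset_vertices) if n in reset_vertices else 0.0)
             for n in nodes}
    scores = dict(reset)

    for _ in range(iterations):
        # Assumption: mass on nodes without out-edges returns to the reset vertices
        dangling = sum(scores[n] for n in nodes if not out_links[n])
        new_scores = {n: (1.0 - damping + damping * dangling) * reset[n]
                      for n in nodes}
        # Each node splits its damped score evenly across its out-edges
        for src in nodes:
            for dst in out_links[src]:
                new_scores[dst] += damping * scores[src] / len(out_links[src])
        scores = new_scores
    return scores


edges = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a'), ('d', 'e')]
ppr = personalized_pagerank(edges, reset_vertices=['b'])
```

On this edge list the scores converge to roughly the values in the table above: `b` is highest because every restart lands there, and `a` and `e` tie because `d` splits its score evenly between them.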