[FEAT] - add igraph personalized pagerank
DataBoyTX opened this issue · 4 comments
Describe the bug
igraph pagerank (using the personalized param) code that worked in Graphistry 2.40.46 no longer works in v2.40.55. It appears that we need to pin the version of igraph that the code is compatible with.
To Reproduce
The following pagerank interface is no longer supported in igraph v0.10.4; there is a new igraph API, personalized_pagerank:
e.g.
`g2 = g2.compute_igraph('pagerank',params={'personalization': personalization})`
import pandas as pd

def personal_pagerank_on_goal(g, goal_col='goal'):
    g2 = g.edges(g._edges).nodes(g._nodes)
    # weight 1.0 for every node flagged in goal_col
    personalization = pd.DataFrame({'vertex': g2._nodes[g2._nodes[goal_col]][g._node]}).assign(values=1.0)
    # this is the call that no longer works
    g2 = g2.compute_igraph('pagerank',params={'personalization': personalization})
    # drop below-median nodes, then find communities among the rest
    low_nodes = g2._nodes[g2._nodes.pagerank < g2._nodes.pagerank.median()]
    g3 = g2.drop_nodes(low_nodes[g._node])
    g3 = g3.compute_igraph('louvain', directed=False, out_col='journey_community')
    # dropped nodes get community -1
    merged_nodes = g2._nodes.merge(g3._nodes[[g._node, 'journey_community']], how='left', on=g._node)
    merged_nodes['journey_community'] = merged_nodes['journey_community'].fillna(-1)
    g4 = g2.nodes(merged_nodes)
    g4 = g4.nodes(g4._nodes.assign(pagerank=g4._nodes.pagerank.fillna(0.0)))
    return g4
@DataBoyTX Can we do a version sniff? We don't get to control what version of igraph regular pygraphistry users are on
if algorithm == 'ppr':
    if igraph.__version__ < xyz:
        ...
    else:
        ...
I don't think this was a major igraph version bump, and sounds recentish, so maintaining compatibility seems worth it
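One caveat if a version sniff is ever used: comparing `igraph.__version__` as a raw string goes wrong once a component reaches two digits (`'0.10.4' < '0.9.0'` is true as strings). A minimal sketch of a numeric comparison follows. `parse_version` and `at_least` are hypothetical helper names, and the cutoff is left as a caller-supplied argument because the thread does not pin down which igraph release changed the API.

```python
import re


def parse_version(version):
    """Turn a version string such as '0.10.4' into the tuple (0, 10, 4)."""
    parts = []
    for piece in version.split('.'):
        # Keep the leading digits of each component; stop at the first
        # component with none (suffixes such as 'rc1' are ignored).
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(installed, cutoff):
    """Return True when `installed` is numerically >= `cutoff`."""
    return parse_version(installed) >= parse_version(cutoff)
```

With this, `at_least(igraph.__version__, cutoff)` could stand in for the string comparison above, whatever the cutoff turns out to be.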
Edit: ignore this in favor of the immediately following comment, #554 (comment)
After internal discussion:
- this is a good time to add ppr and any other new igraph bindings
- we can give a cleaner error message if they do not exist in the user's currently installed igraph version: catch and rethrow the exn
- for legacy igraph users, whether old igraph or our old ppr form, we can reroute to new ppr with a deprecation warning, and if old igraph, stay in old form but still note the deprecation
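A rough sketch of the last two points, for discussion only. `run_igraph_method` is a hypothetical helper, not the current pygraphistry code. It assumes the algorithm is looked up by name on the igraph graph object, and that the legacy `personalization` value can be forwarded as `reset` without conversion, which is a simplification.

```python
import warnings


def run_igraph_method(ig_graph, alg, params=None):
    """Call ig_graph.<alg>(**params) with a clearer error and a legacy reroute."""
    params = dict(params or {})

    # Legacy form: 'pagerank' plus 'personalization' is rerouted to the new name.
    if alg == 'pagerank' and 'personalization' in params:
        warnings.warn(
            "pagerank with 'personalization' is deprecated; "
            "use 'personalized_pagerank' with 'reset' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        alg = 'personalized_pagerank'
        # Assumption: the value is already usable as igraph's `reset` argument.
        params['reset'] = params.pop('personalization')

    try:
        method = getattr(ig_graph, alg)
    except AttributeError as exn:
        # Catch and rethrow with a message that points at the likely cause.
        raise ValueError(
            f"igraph method '{alg}' is not available in the installed igraph; "
            "upgrading igraph may help"
        ) from exn
    return method(**params)
```

The "old igraph, stay in old form" branch is left out here, since it depends on which igraph versions need it.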
Reviewing a bit more, I think we just need to:
- expose `personalized_pagerank` as part of the `compute_igraph` options: https://github.com/graphistry/pygraphistry/blame/53448d4ef153fd262466087a951bc28a44c8fadf/graphistry/plugins/igraph.py#L267
- add to the examples for consistency w/ compute_cugraph examples
- update graphistry + gak repos to latest pyg
Research:
- igraph never supported `pagerank(personalization=...)` where `personalization` is a vertex weights df
- igraph does support `personalized_pagerank(reset=...)`, which does what we want
  - sidenote: internally, igraph implements `pagerank(...)` as `personalized_pagerank(..., reset=None)`
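To make the difference between the two forms concrete: as far as I can tell, `reset` wants one weight per vertex, in vertex order, rather than a table of (vertex, value) rows. A minimal sketch of that conversion is below. `reset_weights` is a hypothetical helper, and the plain dict stands in for the vertex weights df.

```python
def reset_weights(vertex_names, personalization):
    """Build a per-vertex weight list ordered like `vertex_names`.

    Vertices missing from `personalization` get weight 0.0.
    """
    unknown = set(personalization) - set(vertex_names)
    if unknown:
        # Fail loudly rather than silently dropping weights for unknown vertices.
        raise KeyError(f"unknown vertices: {sorted(unknown)}")
    return [float(personalization.get(name, 0.0)) for name in vertex_names]


names = ['a', 'b', 'c', 'd', 'e']
weights = reset_weights(names, {'b': 1.0})
# weights == [0.0, 1.0, 0.0, 0.0, 0.0]
```

The resulting list is the kind of value that would be passed as `reset=...`; the vertex order has to match the order igraph holds the vertices in.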
Note: meanwhile, as a workaround, users may be able to do:
import pandas as pd
import graphistry

# graph with nodes and edges
df = pd.DataFrame({
    's': ['a', 'b', 'c', 'd', 'd'],
    'd': ['b', 'c', 'd', 'a', 'e']
})
g1 = graphistry.edges(df, 's', 'd').materialize_nodes()
# new graph where nodes have added column 'ppr'
g2 = g1.nodes(
    g1._nodes.assign(
        ppr=g1.to_igraph().personalized_pagerank(reset_vertices=['b'])
    )
)
# ex
g2._nodes
#  id       ppr
#0  a  0.096360
#1  b  0.313812
#2  c  0.266740
#3  d  0.226729
#4  e  0.096360
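For anyone wanting to sanity-check those numbers without igraph installed, here is a plain power-iteration sketch of personalized PageRank. It is not igraph's implementation. It assumes a damping factor of 0.85, directed edges, and that rank sitting on nodes with no out-edges (here `e`) is handed back to the reset vertices.

```python
def personalized_pagerank(edges, reset_vertices, damping=0.85, iterations=200):
    """Power iteration for personalized PageRank on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Restart distribution: uniform over the reset vertices, zero elsewhere.
    reset = {n: (1.0 / len(reset_vertices) if n in reset_vertices else 0.0)
             for n in nodes}
    rank = dict(reset)
    for _ in range(iterations):
        # Assumption: mass on dangling nodes restarts at the reset vertices.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        nxt = {n: (1.0 - damping + damping * dangling) * reset[n] for n in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    nxt[dst] += share
        rank = nxt
    return rank


edges = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a'), ('d', 'e')]
ppr = personalized_pagerank(edges, reset_vertices=['b'])
```

Under those assumptions, `ppr` agrees with the table above to about five decimal places, with `a` and `e` tied because each receives half of `d`'s rank.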