Using graph embeddings and TensorFlow to predict AML fraud
Right now, I have a Colab notebook with a fairly basic model built on an embedding of the transaction graph. It is small enough to run on a decent laptop. I haven't built a fraud-detection model before, so I may enhance the model as I work through the simulated data.
I am using data from the IBM AMLSim repo, which includes several pre-generated datasets and some nice papers.
As I familiarized myself with this data, I found their wiki helpful. They have published two papers related to their repo. I recommend looking at this one:
Scalable Graph Learning for Anti-Money Laundering: A First Look, Mark Weber et al., 2018.
They cite other work that could be interesting too:
- Vec2Struc: A Method Towards Explainable Structural-Based Node Embeddings
- Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics
There are two targets of interest from the data on Dropbox:
- IS_FRAUD: a binary label
- ALERT_ID: a categorical label that's more descriptive
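As a rough sketch of how these two targets could be pulled out of a transactions file, here is a small standard-library example. The sample rows, the other column names, the `True`/`False` encoding of IS_FRAUD, the use of `-1` for "no alert", and the helper name `load_targets` are all assumptions made for illustration; check them against the actual files before relying on them.

```python
import csv
import io

# Hypothetical sample in the shape of a transactions file. Only the IS_FRAUD
# and ALERT_ID column names come from the text; every other column name and
# all values (including -1 meaning "no alert") are made up for illustration.
SAMPLE_CSV = """TX_ID,SENDER_ACCOUNT_ID,RECEIVER_ACCOUNT_ID,TX_AMOUNT,IS_FRAUD,ALERT_ID
1,10,20,150.00,False,-1
2,20,30,9800.00,True,4
3,30,10,9750.00,True,4
4,40,10,20.50,False,-1
5,50,60,5000.00,True,7
"""


def load_targets(csv_text):
    """Return (binary labels, alert class indices, alert vocabulary)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Binary target: 1 for fraud, 0 otherwise (assumes "True"/"False" strings).
    is_fraud = [1 if row["IS_FRAUD"].strip().lower() == "true" else 0
                for row in rows]

    # Categorical target: map each distinct ALERT_ID to a contiguous class
    # index, ordered by the numeric value of the ID.
    distinct_alerts = sorted({row["ALERT_ID"] for row in rows}, key=int)
    alert_vocab = {alert: idx for idx, alert in enumerate(distinct_alerts)}
    alert_class = [alert_vocab[row["ALERT_ID"]] for row in rows]

    return is_fraud, alert_class, alert_vocab


is_fraud, alert_class, alert_vocab = load_targets(SAMPLE_CSV)
print(is_fraud)     # [0, 1, 1, 0, 1]
print(alert_class)  # [0, 1, 1, 0, 2]
print(alert_vocab)  # {'-1': 0, '4': 1, '7': 2}
```

Mapping ALERT_ID to contiguous integers keeps the categorical target in a form that a classifier expecting integer class indices can consume directly, while IS_FRAUD stays a plain 0/1 label for a binary model.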