/ai-ml-projects

A portfolio of my recent public-facing AI/ML projects

Hi, I'm Zach 👋

I'm passionate about AI, machine learning, and data science. I thrive on learning and sharing new technologies and am always seeking innovative ways to make an impact with them.

Below are some of my recent technical project highlights. For context, I have been a technical marketer at Neo4j since August 2021, specializing in GenAI/RAG integration, graph data science, and machine learning. My technical projects are therefore a mix of workflows geared toward GenAI/RAG app development (sample apps for Q&A, search, recommendations, etc.) and DS/ML (analytics, model building, etc.). I enjoy working at both ends of this spectrum and move comfortably between them!

Technical Project Highlights

GenAI & RAG Apps

  • Demo app for GraphRAG patterns (code, webinar recording): This is a sample app and presentation I developed to explore various GraphRAG patterns and their benefits. The multi-page app demonstrates RAG for Q&A and email generation use cases. Most patterns combine embeddings, a vector index, and graph operations. The app leverages OpenAI, Streamlit, LangChain, and Neo4j.

  • Short demo videos:

  • Hands-on GenAI workshop (notebook, deck): This is a long-form (~2.5 hour) workshop I developed for Neo4j to run globally at in-person events with prospects and customers. It takes participants through a series of steps to understand how to perform graph-powered RAG using an email generation bot as an example. It leverages LangChain for creating retrievers and chains.

  • Marketing Assistant (Productivity Tool) (code): As a marketer, I get a lot of ad hoc writing requests for scripts, demos, blogs, etc., so I built a GenAI app to help me with them and 3x my productivity! Here it is important for the LLM to be grounded in our internal messaging, positioning, and other intelligence for accuracy and alignment. I use our LLM-KG-Builder to create a knowledge graph from our internal documents, which this app then uses for RAG. This is a dockerized application that uses FastAPI, LangServe, and Streamlit (frontend). It provides a convenient Q&A interface with some conversation memory so I can go back and forth drafting, editing, and refining.
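
The GraphRAG patterns in the apps above generally pair a vector similarity search with a graph lookup. Here is a minimal sketch of that idea in plain Python, assuming toy in-memory data in place of OpenAI embeddings and a Neo4j vector index. Every name, vector, and relationship below is a hypothetical illustration, not code from the demo app.

```python
import math

# Hypothetical toy data: text chunks with made-up 3-dimensional embeddings.
CHUNKS = {
    "c1": {"text": "Graph embeddings power recommendations.", "vec": [0.9, 0.1, 0.0]},
    "c2": {"text": "Vector indexes speed up similarity search.", "vec": [0.1, 0.9, 0.0]},
    "c3": {"text": "Fraud rings show up as dense subgraphs.", "vec": [0.0, 0.2, 0.9]},
}

# Hypothetical graph relationships: chunk id -> entities it mentions.
MENTIONS = {
    "c1": ["Recommendations", "Embeddings"],
    "c2": ["Vector Index"],
    "c3": ["Fraud Detection"],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, k=1):
    """Step 1: rank chunks by similarity (stand-in for a vector index).
    Step 2: expand each hit with its graph neighbors (stand-in for a graph query)."""
    ranked = sorted(
        CHUNKS,
        key=lambda cid: cosine(query_vec, CHUNKS[cid]["vec"]),
        reverse=True,
    )
    return [
        {"id": cid, "text": CHUNKS[cid]["text"], "entities": MENTIONS.get(cid, [])}
        for cid in ranked[:k]
    ]


def build_context(query_vec, k=1):
    """Format retrieved chunks and their graph context for an LLM prompt."""
    lines = []
    for hit in retrieve(query_vec, k):
        lines.append(f"{hit['text']} (related: {', '.join(hit['entities'])})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_context([1.0, 0.0, 0.0], k=2))
```

In a real app, the similarity ranking would be delegated to the database's vector index and the expansion step to a graph query; the sketch only shows how the two steps compose.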

Machine Learning & Data Science Workflows

  • Graph Data Science for fraud detection (blog, code): This is an analytical/ML workflow I developed to demonstrate the benefits of graphs for entity resolution, risk scoring, and ML models in the context of fraud use cases. While it doesn't leverage embeddings, it is a multi-part series including in-depth blogs and coded examples that should give you a good idea of how I put together larger projects for technical audiences.

  • Graph Data Science for recommendation systems (blog, code): This workflow explores how to use graphs, and specifically graph embeddings, to power and enhance recommendations. Graph embeddings encode graph structures, such as user preferences expressed in interconnected transactions, making them very interesting and applicable to recommendation use cases.

  • Graph machine learning overview (blog): This does not include any code, but it should give you a good idea of how I communicate and explain complex ML topics in writing. The blog also touches on embeddings, which are core to graph ML.

  • Graph embeddings for improving model performance (notebook): This workflow demonstrates how to feed Neo4j graph embeddings into downstream classification models to improve accuracy. I use PyTorch to construct MLP neural nets for the downstream models. I train the models on different graph and text embeddings, evaluate, and compare accuracy.

  • Graph database sampling for training GNNs (notebook): This notebook demonstrates how to sample large graphs to make computationally intensive Graph Neural Networks (GNNs) more efficient and feasible to train. I use Neo4j's random walk with restarts (RWR) algorithm to export a sample graph, then demonstrate training a graph convolutional network (GCN) using PyTorch Geometric (PyG). Taking representative samples of graphs is not trivial because of their arbitrary, interconnected structure. It requires special algorithms, which is why I demonstrate Neo4j's RWR here.
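
To make the sampling idea in the last item concrete, here is a minimal sketch of random walk with restarts in plain Python. It is not Neo4j's implementation and not the notebook's code: the adjacency-dict format, the function names, and the parameters `restart_prob`, `max_nodes`, and `max_steps` are all assumptions chosen for illustration.

```python
import random


def rwr_sample(adjacency, start, restart_prob=0.15, max_nodes=4,
               max_steps=10_000, seed=42):
    """Collect nodes visited by a random walk that jumps back to `start`
    with probability `restart_prob` at each step.

    `adjacency` is a hypothetical dict mapping node -> list of neighbors.
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    sampled = {start}
    current = start
    for _ in range(max_steps):
        if len(sampled) >= max_nodes:
            break
        neighbors = adjacency.get(current, [])
        # Restart at a dead end, or at random with the restart probability.
        if not neighbors or rng.random() < restart_prob:
            current = start
            continue
        current = rng.choice(neighbors)
        sampled.add(current)
    return sampled


def induced_edges(adjacency, nodes):
    """Undirected edges of the original graph with both endpoints sampled."""
    return sorted(
        (u, v)
        for u in nodes
        for v in adjacency.get(u, [])
        if v in nodes and u < v
    )


if __name__ == "__main__":
    # Toy undirected graph, stored with both directions of every edge.
    graph = {
        "a": ["b", "c"],
        "b": ["a", "c", "d"],
        "c": ["a", "b", "e"],
        "d": ["b", "f"],
        "e": ["c"],
        "f": ["d"],
    }
    nodes = rwr_sample(graph, start="a")
    print(sorted(nodes), induced_edges(graph, nodes))
```

Because the walk keeps returning to the start node, the sample stays concentrated in one connected neighborhood, which preserves local structure better than picking nodes uniformly at random. The resulting subgraph could then be converted to an edge list for a GNN library.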

Other Examples

I own the Neo4j Product Examples GitHub Org which includes additional GenAI & DS/ML examples that may be of interest.