Large-scale language models like GPT-4 from OpenAI serve as foundational technologies that can be applied to virtually any business issue. However, the robust power and flexibility of this technology come with a significant challenge: it is extremely difficult to pinpoint the optimal opportunities for leveraging this technology within a company.
This project is designed to assist analytics leaders, product managers, and development teams in surmounting these obstacles by demonstrating the technology's application across a variety of common business problems. The project unfolds through a series of episodes, each accompanied by the following resources:
- Walkthrough videos available on Prolego's YouTube Channel.
- Tagged releases on the main branch of this repository.
- Conversations held within Prolego's Discord community.
We advise our clients to take a capabilities-based approach when building their AI. That is, create foundational solutions that allow you to solve many different business use cases. Unfortunately, too many teams begin solving specific business problems without building a generalizable foundation.
Most companies are developing the following capabilities as part of their AI strategy.
| Capability | Explanation | Examples |
|---|---|---|
| text classification | Assigning categories to documents or document sections. | Episode 2 |
| information extraction | Pulling out names, places, or specific sections from documents. | Episode 2 |
| semantic search | Finding information based on its meaning instead of keywords. | Episode 1 |
| information summarization | Condensing extensive documents into concise, essential highlights. | |
| information comparison | Identifying similar documents or sections of documents. | |
| document generation | Creating precisely written content consistent with style and needs. Often includes a review step. | Episode 2 |
| unified natural language query | Empowering anyone to get answers to questions about data and documents without SQL or tools. | Episode 3, Episode 4, Episode 5 |
| routine task automation | Automating analysis of information from various sources, reasoning across them, and making decisions. | Episode 4 |
Remember, this is demo code! Don't attempt to use it for production without first redesigning it.
First, install the neo-sophia code on your local machine before proceeding to the examples in the Episodes below.
- `git clone https://github.com/prolego-team/neo-sophia.git`
- `conda env create -f neo-sophia/env.yml`
- `conda activate neosophia`
- `pip install -e neo-sophia`
- `cd neo-sophia`
- `cp config_example.json config.json`
- `cp openai_api_key_example.txt openai_api_key.txt`
- Change the path locations in `config.json` or use the defaults.
- Add your OpenAI API key to `openai_api_key.txt`.
- Run the tests: `./test.sh`
If the tests pass you are ready to run the code in one of the Episodes.
Questions? Just ask in our Discord Community.
Video: Build a RAG demo in 1 hour with GPTs
Video: Discover AI Opportunities with generated data
Video: Intro to RAG
- Checkout Episode 17, Release v0.17.0: `git checkout tags/v0.17.0`
- Start the demo by running `python -m examples.fia.reg_search`
What are companies doing with Generative AI? Here's how one CEO put it into production.
Video: How Vericant released its Generative AI solution in 30 days
Video: Should You Use Open Source LLMs or GPT-4?
Running this code requires additional steps by an experienced developer. See the pull request.
Video: The Top 3 Enterprise AI Use Cases
In the second episode on Scratchpads we walk through the technical details and workflow.
Video: Cook a Scrumptious AI Solution with LLM Scratchpads
Same demo as Episode 12.
LLMs cannot handle data bigger than their context windows. To overcome this limitation, use temporary memory called an LLM Scratchpad. A minimal sketch of the pattern follows the run instructions below.
- Checkout Episode 12, Release v0.12.0: `git checkout tags/v0.12.0`
- Start the demo by running `python -m examples.agent_example`
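The sketch below illustrates the Scratchpad pattern in broad strokes: process the data in pieces, save only the relevant notes, and answer from the notes. The `complete` helper is a placeholder for whatever chat-completion call you use; it is an illustrative assumption, not the repo's implementation.

```python
# Illustrative sketch of the LLM Scratchpad pattern (not the repo's code).

def complete(prompt: str) -> str:
    """Placeholder for an LLM call, e.g. GPT-4 via the OpenAI API."""
    raise NotImplementedError

def answer_with_scratchpad(question: str, documents: list[str]) -> str:
    scratchpad: list[str] = []
    # Pass 1: read each document separately so no single prompt exceeds
    # the context window, keeping only notes relevant to the question.
    for doc in documents:
        note = complete(
            "Extract facts relevant to the question.\n"
            f"Question: {question}\nDocument:\n{doc}"
        )
        scratchpad.append(note)
    # Pass 2: answer from the compact scratchpad instead of the raw data.
    return complete(
        "Answer the question using only these notes.\n"
        f"Question: {question}\nNotes:\n" + "\n".join(scratchpad)
    )
```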
Generalized LLM frameworks like LangChain are not mature enough for most teams. The foundational models and best practices are changing too fast to generalize.
Video: LangChain and other frameworks are not ready
Dashboards and reports written with applications like Tableau are underutilized. Fortunately, you can turn them into rich conversations with LLMs.
Video: Hate Tableau? Replace it with LLMs
- Checkout Episode 10, Release v0.10.0: `git checkout tags/v0.10.0`
- Start the demo by running `python -m examples.sleuth`
Most companies spend months talking about AI before converging on the same plan as everyone else. Save yourself the time and start with the right plan.
Video: Write Your Company's Generative AI Strategy in ONE hour
Programming against LLM APIs requires a combination of an experimental mindset and systems engineering experience.
Video: The skills needed for LLM Development
Job description: AI Systems Engineer
GPT-3? GPT-4? Claude 2? Open source? You have so many options and don't know where to start. Fortunately, there is an easy answer for the majority of teams. Start with the most intelligent model, currently GPT-4.
Video: How to pick the Best LLM for Your Project
LLM hallucinations and inconsistencies are real challenges, but you can begin overcoming them with a good evaluation framework. A minimal sketch of such a framework follows the run instructions below.
Video: Conquer LLM Hallucinations with an Evaluation Framework
- Checkout Episode 6, Release v0.6.0: `git checkout tags/v0.6.0`
- Start the demo by running `python -m examples.bank_agent_eval`
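As a rough illustration of the idea, the sketch below runs a fixed set of questions with known answers against an agent several times and reports a pass rate. The `ask_agent` function and the test cases are hypothetical placeholders, not the repo's evaluation code.

```python
# Minimal evaluation-harness sketch for spotting hallucinations and
# inconsistent answers. Test cases and `ask_agent` are illustrative only.

TEST_CASES = [
    {"question": "How many customers joined in 2022?", "expected": "1204"},
    {"question": "Which branch has the most accounts?", "expected": "Downtown"},
]

def ask_agent(question: str) -> str:
    """Placeholder for your LLM agent, e.g. the bank agent in this repo."""
    raise NotImplementedError

def run_evals(n_trials: int = 3) -> float:
    """Ask each question several times; flaky answers reveal inconsistency."""
    passed, total = 0, 0
    for case in TEST_CASES:
        for _ in range(n_trials):
            total += 1
            answer = ask_agent(case["question"])
            if case["expected"] in answer:
                passed += 1
            else:
                print(f"FAIL: {case['question']!r} -> {answer!r}")
    return passed / total
```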
Every practical problem you encounter will require accessing large datasets, such as multiple databases, and in doing so you will run into the limits of the LLM's context window. In this example we explain the limitation and a simple approach for overcoming it; a small sketch follows the run instructions below.
Video: How to Overcome LLM Context Window Limitations
- Checkout Episode 5, Release v0.5.0: `git checkout tags/v0.5.0`
- Start the demo by running `python examples/bank_agent_two.py`
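To make the limitation concrete, the sketch below counts tokens with the `tiktoken` library and falls back to sending only the database schema (letting the model write a query) when the raw data would not fit. The 8,192-token limit and the fallback strategy are assumptions for illustration, not the repo's exact approach.

```python
# Sketch: check whether raw data fits the context window; if not, send only
# the schema and let the model write a query against the database instead.
import tiktoken

CONTEXT_LIMIT = 8_192  # e.g. the original GPT-4 context window

def fits_in_context(text: str, model: str = "gpt-4") -> bool:
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text)) < CONTEXT_LIMIT

def build_prompt(question: str, raw_data: str, schema: str) -> str:
    # Include raw rows only when they fit; otherwise include just the schema.
    context = raw_data if fits_in_context(raw_data) else schema
    return f"Context:\n{context}\n\nQuestion: {question}"
```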
In our second episode on Unified NLQ we introduce LLM Agents. Agents are necessary for the complex reasoning required to run natural language queries across multiple tables. A simple agent loop is sketched after the run instructions below.
Video: Supercharge Your Data Analytics with LLM Agents
- Checkout Episode 4, Release v0.4.0: `git checkout tags/v0.4.0`
- Start the demo by running `python -m examples.bank_agent`
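The sketch below shows the general shape of such an agent loop: the model either proposes a SQL query or gives a final answer, observes query results, and iterates. The prompt format, stopping rule, and `complete` placeholder are assumptions for illustration rather than the repo's implementation.

```python
# Simple agent-loop sketch: the LLM alternates between querying a SQLite
# database and answering, with each observation appended to the transcript.
import sqlite3

def complete(prompt: str) -> str:
    """Placeholder for a GPT-4 chat-completion call."""
    raise NotImplementedError

def run_sql(db_path: str, query: str) -> list[tuple]:
    with sqlite3.connect(db_path) as conn:
        return conn.execute(query).fetchall()

def agent(question: str, db_path: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = complete(
            "Reply with 'SQL: <query>' to query the database, or "
            "'ANSWER: <text>' once you know the answer.\n" + transcript
        )
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        query = step.removeprefix("SQL:").strip()
        transcript += f"{step}\nResult: {run_sql(db_path, query)}\n"
    return "No answer found within the step limit."
```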
What will be the AI "killer app" in the enterprise? Our bet is Unified Natural Language Query (NLQ). It gives executives and business leaders the ability to get insights from data by asking "natural" questions, similar to how you currently use ChatGPT. In this Episode we describe the business problem and show the extensibility of a simple example of SQL generation supplemented with the reasoning power of an LLM like GPT-4; a stripped-down version of the idea is sketched after the run instructions below.
Video
- Checkout Episode 3, Release v0.3.2: `git checkout tags/v0.3.2`
- Start the demo by running `python -m examples.sqlite_chat`
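The sketch below shows the core of the idea: hand the model a table schema and a question, let it write SQL, and run the query. It assumes the OpenAI Python SDK (v1+); the schema, prompt wording, and function names are illustrative, not the repo's code.

```python
# Stripped-down SQL-generation sketch: schema + question in, query result out.
import sqlite3
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SCHEMA = "CREATE TABLE customers (id INTEGER, name TEXT, balance REAL);"

def natural_language_query(question: str, db_path: str) -> list[tuple]:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                f"Schema:\n{SCHEMA}\n\n"
                f"Write one SQLite query that answers: {question}\n"
                "Return only the SQL, with no explanation or formatting."
            ),
        }],
    )
    sql = response.choices[0].message.content.strip()
    with sqlite3.connect(db_path) as conn:
        return conn.execute(sql).fetchall()
```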
Every company has business processes that require ingesting and processing a stream of text documents. Most of this processing requires tedious human effort to find, edit, review, summarize, score, etc. chunks of text from larger documents. In this Episode we demonstrate a generalized approach for solving many of these problems using LLMs. The example takes a set of SEC 10-Q company filings and replaces the "Basis of Presentation" section with different text based on an editable template; the workflow is sketched after the run instructions below.
Videos
- Your AI strategy “quick win” - automated document processing
- Automated document processing - technical walkthrough
- Checkout Episode 2, Release v0.2.1: `git checkout tags/v0.2.1`
- Start the demo by running `python -m examples.generate_10q_basis`
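At a high level the workflow is: locate the target section, ask the LLM to rewrite it against a template, and splice the result back into the document. The sketch below shows that flow with simple string markers and a placeholder `complete` call; the repo's actual 10-Q handling is more involved.

```python
# High-level sketch of template-driven section replacement in a document.

def complete(prompt: str) -> str:
    """Placeholder for a GPT-4 call."""
    raise NotImplementedError

def replace_section(filing: str, start_marker: str, end_marker: str,
                    template: str) -> str:
    # Locate the section between two known headings.
    start = filing.index(start_marker)
    end = filing.index(end_marker, start)
    original_section = filing[start:end]
    # Ask the LLM to rewrite the section so it follows the template.
    new_section = complete(
        "Rewrite this section so it follows the template.\n"
        f"Template:\n{template}\n\nOriginal section:\n{original_section}"
    )
    # Splice the rewritten section back into the filing.
    return filing[:start] + new_section + filing[end:]
```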
Most companies are struggling to pick the best AI use cases from many different options. By building a core competency in document embeddings you can begin developing a set of capabilities applicable to many enterprise use cases. In Episode 1 we provide a primer on embeddings for a business audience and demonstrate the use of embeddings in semantic search and document Q&A; a compact semantic-search sketch follows the run instructions below.
This episode uses data from the MSRB Regulatory Rulebook.
Videos
- Document embeddings are foundational capabilities for your AI strategy
- Document embeddings - technical walkthrough
- Checkout Episode 1, Release v0.1.1: `git checkout tags/v0.1.1`
- Start the demo by running `python -m examples.interface`
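The sketch below shows embedding-based semantic search in a few lines: embed the documents and the query, then rank by cosine similarity. It assumes the OpenAI Python SDK (v1+), the `text-embedding-3-small` model, and an in-memory index; these choices are illustrative, not the repo's implementation.

```python
# Compact semantic-search sketch: rank documents by cosine similarity
# between their embeddings and the query embedding.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

def semantic_search(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    doc_vectors = embed(documents)
    query_vector = embed([query])[0]
    # Cosine similarity = dot product of unit-normalized vectors.
    doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    query_vector /= np.linalg.norm(query_vector)
    scores = doc_vectors @ query_vector
    return [documents[i] for i in np.argsort(scores)[::-1][:top_k]]
```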
Prolego is an AI services company that started in 2017 and has helped some of the world’s biggest companies generate opportunities with AI. "Prolego" is the Greek word for "predict". We needed a name for this repo and decided to use the Greek words for "new" (neo) and "wisdom" (sophia). And we just thought that Neo Sophia sounded cool.
The team: