llm-usecases

Dozens of AI agents - General Knowledge, Career and Job, ...
RAG pipeline: internal APIs are called, and their responses are injected into a subsequent LLM prompt as additional context to ground the response.
- Routing: decides if the query is in scope or not, and which AI agent to forward it to. Examples of agents are: job assessment, company understanding, takeaways for posts, etc.
- Retrieval: recall-oriented step where the AI agent decides which services to call and how (e.g. LinkedIn People Search, Bing API).
- Generation: precision-oriented step that sieves through the noisy data retrieved, filters it and produces the final response.
Small models for routing/retrieval, bigger models for generation
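The three steps and the model-size split can be sketched as below. This is a minimal illustration, not the actual implementation: `LLM`, `Agent`, `route`, `retrieve`, `generate`, and `answer` are hypothetical names, the prompts are placeholders, and the "models" are plain callables that take a prompt and return text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class Agent:
    name: str
    # Hypothetical retrieval tools: tool name -> function(query) -> text.
    tools: Dict[str, Callable[[str], str]]


def route(query: str, agents: Dict[str, Agent], small_llm: LLM) -> Optional[Agent]:
    """Routing: a small model picks an agent; None means out of scope."""
    choice = small_llm(f"Pick one of {sorted(agents)} or 'none' for: {query}")
    return agents.get(choice.strip())


def retrieve(query: str, agent: Agent, small_llm: LLM) -> List[str]:
    """Retrieval (recall-oriented): a small model names the tools to call."""
    wanted = small_llm(f"Which of {sorted(agent.tools)} help answer: {query}")
    names = [name.strip() for name in wanted.split(",")]
    return [agent.tools[name](query) for name in names if name in agent.tools]


def generate(query: str, context: List[str], big_llm: LLM) -> str:
    """Generation (precision-oriented): a bigger model filters and answers."""
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
    return big_llm(prompt)


def answer(query: str, agents: Dict[str, Agent], small_llm: LLM, big_llm: LLM) -> str:
    agent = route(query, agents, small_llm)
    if agent is None:
        return "Sorry, that is out of scope."
    return generate(query, retrieve(query, agent, small_llm), big_llm)
```

The small model is called twice (routing and retrieval) and the bigger model once, so the more expensive call is reserved for the step that writes the final response.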
Techniques like Chain of Thought (CoT) are very effective at improving quality and reducing hallucinations, but they require tokens that the member never sees, which increases the latency the member perceives.
Latency metrics: Time To First Token (TTFT) and Time Between Tokens (TBT).
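As a rough sketch, both metrics can be derived from the arrival timestamps of a streamed response. The function name `latency_metrics` and its signature are assumptions made for illustration, and TBT is reported here as the mean gap between consecutive tokens.

```python
from typing import List, Tuple


def latency_metrics(request_time: float, token_times: List[float]) -> Tuple[float, float]:
    """Return (TTFT, mean TBT) in seconds from token arrival timestamps."""
    if not token_times:
        raise ValueError("no tokens were received")
    # TTFT: delay between sending the request and seeing the first token.
    ttft = token_times[0] - request_time
    # TBT: average gap between consecutive tokens (0.0 for a single token).
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    tbt = sum(gaps) / len(gaps) if gaps else 0.0
    return ttft, tbt
```

If `token_times` only records tokens that are shown to the member, any hidden reasoning tokens produced first show up as a larger TTFT.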
We picked YAML because it is less verbose than JSON and hence consumes fewer tokens.
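A small comparison illustrates the verbosity difference. This is only a sketch: `to_yaml` is a hand-rolled emitter that handles nested dicts and lists of plain scalars (no quoting or escaping), the `call` payload is made up, and character count is used as a crude proxy for token count, which in practice depends on the tokenizer.

```python
import json


def to_yaml(value, indent=0):
    """Minimal YAML emitter for nested dicts and lists of plain scalars."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.append(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {item}")
    elif isinstance(value, list):
        for item in value:
            lines.append(f"{pad}- {item}")
    else:
        lines.append(f"{pad}{value}")
    return "\n".join(lines)


# Hypothetical structured output an LLM might be asked to produce.
call = {"skill": "people_search", "parameters": {"keywords": "data engineer", "limit": 5}}

as_json = json.dumps(call)
as_yaml = to_yaml(call)

print(as_json)
print(as_yaml)
print(len(as_json), len(as_yaml))  # YAML drops the braces and most quotes
```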
Wrapping RPC APIs in an LLM-friendly schema.
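One way such a wrapper might look is sketched below. Everything here is an assumption for illustration: the `Skill` class, its fields, the YAML-like schema layout, and `people_search_rpc`, which stands in for a real internal RPC endpoint.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Skill:
    name: str
    description: str
    parameters: Dict[str, str]  # parameter name -> short "type, meaning" note
    call: Callable[..., Any]    # the underlying RPC client function

    def schema(self) -> str:
        """Render a compact description that can be placed in a prompt."""
        lines = [f"name: {self.name}", f"description: {self.description}", "parameters:"]
        lines += [f"  {param}: {doc}" for param, doc in self.parameters.items()]
        return "\n".join(lines)

    def invoke(self, args: Dict[str, Any]) -> Any:
        """Validate the LLM-chosen arguments, then call the wrapped RPC."""
        unknown = set(args) - set(self.parameters)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        return self.call(**args)


def people_search_rpc(keywords: str, limit: int = 10):
    """Hypothetical stand-in for an internal RPC endpoint."""
    return [f"{keywords} #{i}" for i in range(limit)]


people_search = Skill(
    name="people_search",
    description="Find member profiles matching keywords.",
    parameters={"keywords": "string, search terms", "limit": "integer, max results"},
    call=people_search_rpc,
)
```

The model only ever sees `people_search.schema()`; the arguments it produces are passed to `people_search.invoke(...)`, which rejects parameter names the schema does not declare before the RPC is called.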

mkmohangb/llm-usecases