{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Make sure Python and Jupyter are set up locally first (and if necessary relaunch VSCode). There is a `shell.nix` that does this for you if you have Nix installed, or you can use a virtual env created from the global Python.\n", "\n", "```bash\n", "$ python3 -m venv .venv\n", "$ . .venv/bin/activate\n", "$ pip install jupyter ipykernel notebook\n", "$ pip install -r requirements.txt\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some LLM code copied from [Qwak.com](https://www.qwak.com/post/utilizing-llms-with-embedding-stores#building-a-closed-qa-bot-with-falcon-7b-and-chromadb)..." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'instruction': 'When did Virgin Australia start operating?', 'context': \"Virgin Australia, the trading name of Virgin Australia Airlines Pty Ltd, is an Australian-based airline. It is the largest airline by fleet size to use the Virgin brand. It commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route. It suddenly found itself as a major airline in Australia's domestic market after the collapse of Ansett Australia in September 2001. 
The airline has since grown to directly serve 32 cities in Australia, from hubs in Brisbane, Melbourne and Sydney.\", 'response': 'Virgin Australia commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route.', 'category': 'closed_qa'}\n" ] } ], "source": [ "from datasets import load_dataset\n", "\n", "# Load only the training split of the dataset\n", "train_dataset = load_dataset(\"databricks/databricks-dolly-15k\", split='train')\n", "\n", "# Filter the dataset to only include entries with the 'closed_qa' category\n", "closed_qa_dataset = train_dataset.filter(lambda example: example['category'] == 'closed_qa')\n", "\n", "print(closed_qa_dataset[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make sure Ollama is running in the background and both models are available:\n", "\n", "```bash\n", "$ ollama serve\n", "$ ollama pull mistral\n", "$ ollama pull albertogg/multi-qa-minilm-l6-cos-v1\n", "```\n", "\n", "(You can tell it to pull a model from Python too.)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import chromadb\n", "\n", "# Create an in-memory ChromaDB client and a collection for the knowledge base\n", "chroma_client = chromadb.Client()\n", "collection = chroma_client.create_collection(name=\"knowledge-base\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import ollama\n", "\n", "# Populate the vector store with one embedding per dataset entry\n", "def populate_vectors(dataset):\n", "    for i, item in enumerate(dataset):\n", "        combined_text = f\"{item['instruction']} {item['context']}\"\n", "        embeddings = ollama.embeddings(model='albertogg/multi-qa-minilm-l6-cos-v1', prompt=combined_text)['embedding']\n", "        collection.add(embeddings=[embeddings], documents=[item['context']], ids=[f\"id_{i}\"])\n", "\n", "# Search the ChromaDB collection for relevant context based on a query\n", "def search_context(query, n_results=1):\n", " query_embeddings = 
ollama.embeddings(model='albertogg/multi-qa-minilm-l6-cos-v1', prompt=query)['embedding']\n", " return collection.query(query_embeddings=query_embeddings, n_results=n_results)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "populate_vectors(closed_qa_dataset)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'model': 'albertogg/multi-qa-minilm-l6-cos-v1',\n", " 'embeddings': [[-0.047326926,\n", " 0.04261045,\n", " -0.04876651,\n", " -0.06244526,\n", " 0.020464458,\n", " -0.048344012,\n", " -0.04408791,\n", " -0.002949139,\n", " 0.0025748874,\n", " 0.05482461,\n", " 0.11849957,\n", " 0.025493044,\n", " 0.014819109,\n", " 0.06831706,\n", " 0.009105153,\n", " -0.046532623,\n", " -0.16621193,\n", " 0.05230968,\n", " 0.010075348,\n", " 0.023884395,\n", " 0.015539375,\n", " 0.0028502217,\n", " 0.060684662,\n", " -0.062685415,\n", " -0.046413578,\n", " -0.048255168,\n", " 0.00030686293,\n", " -0.12090608,\n", " 0.018961545,\n", " 0.074324206,\n", " 0.11006716,\n", " 6.0693263e-05,\n", " 0.061112206,\n", " -0.048079386,\n", " -0.025828438,\n", " 0.04921832,\n", " -0.024565157,\n", " -0.03051368,\n", " -0.0028907557,\n", " 0.097533785,\n", " -0.06472982,\n", " 0.061170023,\n", " -0.0089735575,\n", " 0.020445723,\n", " -0.005381917,\n", " 0.0071884934,\n", " -0.0033629434,\n", " 0.057539146,\n", " -0.04797672,\n", " 0.013695918,\n", " -0.061673347,\n", " -0.056127004,\n", " -0.026096934,\n", " -0.03828968,\n", " -0.055517305,\n", " 0.03823934,\n", " -0.00991792,\n", " -0.012591937,\n", " -0.07290659,\n", " -0.07391272,\n", " 0.058799125,\n", " -0.07563445,\n", " 0.019186638,\n", " 0.005049216,\n", " -0.07882881,\n", " 0.038277213,\n", " -0.009100741,\n", " 0.01655431,\n", " 0.04669677,\n", " 0.01542749,\n", " 0.010468321,\n", " -0.06541152,\n", " 0.063369796,\n", " 0.0043949294,\n", " 0.017930193,\n", " -0.07531007,\n", " 0.005872479,\n", " -0.0055264668,\n", " 
-0.037674367,\n", " -0.043022335,\n", " 0.045320578,\n", " 0.02453328,\n", " -0.07177942,\n", " 0.008218003,\n", " -0.042356815,\n", " 0.0047335094,\n", " -0.05494888,\n", " 0.0041321,\n", " 0.07856775,\n", " -0.001975062,\n", " -0.02943519,\n", " -0.096557,\n", " -0.021854732,\n", " 0.02290473,\n", " -0.013733136,\n", " 0.067812845,\n", " -0.08494185,\n", " 0.050859287,\n", " 0.023364836,\n", " -0.009106958,\n", " 0.034797944,\n", " 0.059569016,\n", " -0.0202045,\n", " 0.04459757,\n", " 0.038256377,\n", " -0.035493616,\n", " -0.04549411,\n", " -0.043316167,\n", " -0.06066166,\n", " 0.020980902,\n", " -0.04659468,\n", " 0.0056130863,\n", " 0.02413518,\n", " -0.027821388,\n", " 0.04375567,\n", " 0.005690112,\n", " 0.004990727,\n", " -0.0007373358,\n", " 0.060861778,\n", " -0.08770446,\n", " -0.021952398,\n", " -0.02709858,\n", " -0.023554781,\n", " 0.0040802425,\n", " 0.07432263,\n", " 0.023400767,\n", " -0.0760793,\n", " -1.0742225e-31,\n", " 0.004910136,\n", " -0.021676823,\n", " 0.06059023,\n", " 0.0416192,\n", " -0.05726914,\n", " -0.00095885724,\n", " 0.029654905,\n", " -0.060690094,\n", " 0.04802541,\n", " 0.050239563,\n", " -0.051198337,\n", " 0.05962651,\n", " 0.053898286,\n", " 0.04554278,\n", " -0.041560393,\n", " 0.05900883,\n", " -0.015705032,\n", " 0.029923437,\n", " -0.014376889,\n", " 0.04262046,\n", " 0.038890563,\n", " 0.08034209,\n", " 0.028923307,\n", " -0.09804302,\n", " -0.0064277407,\n", " 0.10547153,\n", " 0.04993054,\n", " 0.02035595,\n", " 0.010557823,\n", " -0.05253371,\n", " 0.025632618,\n", " -0.02916404,\n", " -0.06700798,\n", " 0.04671706,\n", " 0.10173924,\n", " 0.027336081,\n", " 0.00026571372,\n", " -0.028487796,\n", " -0.02309734,\n", " 0.0041730814,\n", " -0.0043037725,\n", " 0.024915375,\n", " 0.047395922,\n", " -0.043984596,\n", " 0.010964619,\n", " 0.051979583,\n", " 0.029158855,\n", " -0.03807753,\n", " 0.019117119,\n", " -0.045216594,\n", " 0.03017429,\n", " 0.036291167,\n", " 0.1122992,\n", " -0.107111506,\n", " 
-0.0026101093,\n", " -0.009684378,\n", " -0.019219413,\n", " 0.09266587,\n", " -0.061512146,\n", " -0.04848443,\n", " 0.005890569,\n", " 0.022914898,\n", " 0.05983137,\n", " 0.023002269,\n", " 0.049840003,\n", " -0.007583193,\n", " -0.079507016,\n", " -0.046642106,\n", " 0.04948097,\n", " 0.039361574,\n", " 0.023636023,\n", " 0.06455103,\n", " -0.06301515,\n", " -0.060034383,\n", " 0.09716866,\n", " 0.009709598,\n", " -0.0020804934,\n", " -0.0849345,\n", " -0.008208633,\n", " -0.044828847,\n", " -0.07166878,\n", " -0.07144221,\n", " -0.052819643,\n", " -0.029836621,\n", " 0.025228724,\n", " 0.03152433,\n", " -0.07479202,\n", " 0.08937089,\n", " -0.011851438,\n", " -0.017864672,\n", " -0.0033845317,\n", " 0.034747325,\n", " -0.087153055,\n", " 0.0864516,\n", " -0.04853223,\n", " -6.08171e-33,\n", " -0.042537298,\n", " 0.0035220943,\n", " -0.0030195138,\n", " -0.03504341,\n", " -0.004265753,\n", " 0.0217429,\n", " -0.09722863,\n", " -0.0037464986,\n", " 0.032404616,\n", " -0.008141994,\n", " -0.03446925,\n", " 0.005365082,\n", " 0.028299851,\n", " 0.090020254,\n", " 0.0021240658,\n", " 0.023433167,\n", " 0.034527183,\n", " -0.02063375,\n", " -0.12912427,\n", " 0.021344976,\n", " -0.018480597,\n", " 0.13902964,\n", " -0.091888934,\n", " 0.0021802043,\n", " 0.0031639924,\n", " -0.02057608,\n", " 0.07748809,\n", " -0.020372646,\n", " 0.024597498,\n", " -0.037090827,\n", " 0.02698387,\n", " 0.01585303,\n", " -0.019023806,\n", " 0.0408317,\n", " 0.06356056,\n", " 0.035854116,\n", " -0.0065504485,\n", " 0.06595484,\n", " -0.026909878,\n", " -0.093034685,\n", " -0.07147366,\n", " 0.026927354,\n", " -0.009459878,\n", " -0.08205121,\n", " 0.03732551,\n", " -0.01628004,\n", " -0.038362805,\n", " -0.033685356,\n", " -0.068909496,\n", " 0.051083885,\n", " -0.15675288,\n", " -0.018456614,\n", " 0.08039072,\n", " -0.056529105,\n", " 0.10678994,\n", " -0.044059474,\n", " -0.06206633,\n", " -0.085817374,\n", " 0.007421948,\n", " 0.034810632,\n", " 0.014360923,\n", " 
0.0036407453,\n", " -0.05928217,\n", " 0.08439162,\n", " -0.008038554,\n", " 0.0037566014,\n", " -0.07900707,\n", " 0.04697754,\n", " -0.02753081,\n", " -0.042420704,\n", " 0.091448046,\n", " 0.10619971,\n", " -0.024978679,\n", " -0.04519029,\n", " -0.002192896,\n", " 0.043923937,\n", " -0.015708532,\n", " -0.026105527,\n", " -0.04063413,\n", " -0.009478847,\n", " 0.013998384,\n", " -0.13688947,\n", " 0.025269555,\n", " -0.005103828,\n", " -0.03260368,\n", " 0.0686648,\n", " -0.05240626,\n", " 0.010964839,\n", " 0.040379893,\n", " -0.052226026,\n", " 0.06515012,\n", " 0.047985274,\n", " 0.0045284205,\n", " 0.05459975,\n", " -0.019940866,\n", " -4.998592e-33,\n", " -0.025325025,\n", " 0.0019487329,\n", " 0.005584312,\n", " 0.03159372,\n", " -0.045154814,\n", " 0.032593627,\n", " -0.009964983,\n", " 0.017840559,\n", " -0.092259206,\n", " 0.06479814,\n", " -0.09905614,\n", " 0.07095165,\n", " 0.08806772,\n", " 0.07580785,\n", " 0.0024786636,\n", " -0.034674115,\n", " 0.03822023,\n", " 0.049485184,\n", " 0.04066313,\n", " 0.033541873,\n", " 0.12382988,\n", " -0.0746652,\n", " 0.01814293,\n", " -0.028297476,\n", " -0.006112553,\n", " -0.015192306,\n", " 0.016731648,\n", " -0.039424665,\n", " -0.035580322,\n", " -0.11895453,\n", " -0.022746451,\n", " 0.00818405,\n", " -0.03211015,\n", " 0.0052714436,\n", " 0.0059184586,\n", " -0.007196112,\n", " 0.097190894,\n", " 0.027681807,\n", " 0.048719853,\n", " -0.024112804,\n", " -0.014177468,\n", " -0.0013266716,\n", " 0.03676054,\n", " 0.024993153,\n", " -0.009454966,\n", " 0.06656371,\n", " -0.004334297,\n", " -0.013607873,\n", " 0.045264672,\n", " 0.04677959,\n", " -0.016298106,\n", " 0.09399973,\n", " -0.060032774,\n", " -0.09507076,\n", " 0.060542457,\n", " 0.015698548,\n", " -0.03549813,\n", " 0.0332258,\n", " -0.12323315,\n", " 0.020846719,\n", " 0.052872688,\n", " 0.0075417515,\n", " -0.03681177,\n", " -0.0005330843]],\n", " 'total_duration': 515624039,\n", " 'load_duration': 268014019,\n", " 'prompt_eval_count': 159}" ] 
}, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "instruction = \"What is the name of the major school of praxiology not developed by Ludwig von Mises\"\n", "context = \"In philosophy, praxeology or praxiology (/\\u02ccpr\\u00e6ksi\\u02c8\\u0252l\\u0259d\\u0292i/; from Ancient Greek \\u03c0\\u03c1\\u1fb6\\u03be\\u03b9\\u03c2 (praxis) 'deed, action', and -\\u03bb\\u03bf\\u03b3\\u03af\\u03b1 (-logia) 'study of') is the theory of human action, based on the notion that humans engage in purposeful behavior, contrary to reflexive behavior and other unintentional behavior.\\n\\nFrench social philosopher Alfred Espinas gave the term its modern meaning, and praxeology was developed independently by two principal groups: the Austrian school, led by Ludwig von Mises, and the Polish school, led by Tadeusz Kotarbi\\u0144ski.\"\n", "ollama.embed(model='albertogg/multi-qa-minilm-l6-cos-v1', input=f\"Content: {instruction} {context}\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dataset({\n", " features: ['instruction', 'context', 'response', 'category'],\n", " num_rows: 1773\n", "})" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "closed_qa_dataset" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Collection(id=UUID('ff8dce2d-b4f2-4eae-bcb7-579cc1391a6c'), name='knowledge-base', configuration_json={'hnsw_configuration': {'space': 'l2', 'ef_construction': 100, 'ef_search': 10, 'num_threads': 8, 'M': 16, 'resize_factor': 1.2, 'batch_size': 100, 'sync_threshold': 1000, '_type': 'HNSWConfigurationInternal'}, '_type': 'CollectionConfigurationInternal'}, metadata=None, dimension=None, tenant='default_tenant', database='default_database', version=0)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "collection.get_model()" ] }, { 
"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'ids': [[]],\n", " 'distances': [[]],\n", " 'metadatas': [[]],\n", " 'embeddings': None,\n", " 'documents': [[]],\n", " 'uris': None,\n", " 'data': None,\n", " 'included': ['metadatas', 'documents', 'distances']}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "collection.query(query_embeddings=[ollama.embeddings(model='albertogg/multi-qa-minilm-l6-cos-v1', prompt='When was Tomoaki Komorida born?')['embedding']], n_results=1)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Komorida was born in Kumamoto Prefecture on July 10, 1981. After graduating from high school, he joined the J1 League club Avispa Fukuoka in 2000. Although he debuted as a midfielder in 2001, he did not play much and the club was relegated to the J2 League at the end of the 2001 season. In 2002, he moved to the J2 club Oita Trinita. He became a regular player as a defensive midfielder and the club won the championship in 2002 and was promoted in 2003. He played many matches until 2005. In September 2005, he moved to the J2 club Montedio Yamagata. In 2006, he moved to the J2 club Vissel Kobe. Although he became a regular player as a defensive midfielder, his gradually was played less during the summer. In 2007, he moved to the Japan Football League club Rosso Kumamoto (later Roasso Kumamoto) based in his local region. He played as a regular player and the club was promoted to J2 in 2008. Although he did not play as much, he still played in many matches. In 2010, he moved to Indonesia and joined Persela Lamongan. In July 2010, he returned to Japan and joined the J2 club Giravanz Kitakyushu. 
He played often as a defensive midfielder and center back until 2012 when he retired.'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "user_question = \"When was Tomoaki Komorida born?\"\n", "context_response = search_context(user_question)\n", "context = \"\".join(context_response['documents'][0])\n", "context" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.\n" ] } ], "source": [ "prompt = f\"\"\"\n", "{user_question}\n", "\n", "Context information is below.\n", "---------------------\n", "{context}\n", "---------------------\n", "Given the context and provided history information and not prior knowledge,\n", "reply to the user comment. If the answer is not in the context, inform\n", "the user that you can't answer the question.\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Komorida was born on July 10, 1981.'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = ollama.chat(model=\"mistral\", messages=[{'role':'user', 'content': f\"{context}\\n\\n{user_question}\"}])\n", "response['message']['content'].strip()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n", "Tomoaki Komorida was born on July 10, 1981.\n" ] } ], "source": [ 
"for i in range(10):\n", "\tresponse = ollama.chat(model=\"mistral\", messages=[{'role':'user', 'content': f\"{context}\\n\\n{user_question}\"}])\n", "\tprint(response['message']['content'].strip())" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Tomoaki Komorida was born on July 10, 1981.'" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ollama.generate(model='mistral', prompt=f\"{context}\\n\\n{user_question}\")['response'].strip()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'model': 'mistral',\n", " 'embeddings': [[0.017545225,\n", " -0.008891564,\n", " 0.019453695,\n", " 0.0016723485,\n", " 0.0066151437,\n", " -0.011255239,\n", " 0.0073642163,\n", " 0.015485475,\n", " 0.0030973046,\n", " -0.002989305,\n", " -0.00017289614,\n", " -0.0051300055,\n", " -0.009368509,\n", " -0.0050006732,\n", " 0.019989194,\n", " -0.00912649,\n", " 0.031789336,\n", " -0.0003259816,\n", " 0.0028014206,\n", " -0.030878957,\n", " -0.010182084,\n", " -0.005244998,\n", " 0.0028699695,\n", " 0.018529905,\n", " 0.0035672926,\n", " -0.005899601,\n", " 0.008602345,\n", " -0.016584355,\n", " 0.004158217,\n", " -0.03023776,\n", " -0.003201344,\n", " -0.0077145896,\n", " -0.0022898177,\n", " -0.02077727,\n", " 0.016491277,\n", " -0.009456201,\n", " 0.007900923,\n", " -0.0021981897,\n", " -0.02405443,\n", " 3.098758e-05,\n", " 0.020625295,\n", " -0.011458679,\n", " -0.0051068566,\n", " -0.0085069295,\n", " 0.017706133,\n", " 0.009591857,\n", " 0.00090422056,\n", " -0.010826694,\n", " 0.0070333797,\n", " 0.0053707818,\n", " 0.0048018754,\n", " 0.0034794498,\n", " 0.011210644,\n", " -0.0903977,\n", " 0.0018821944,\n", " -0.0018045873,\n", " -0.030915277,\n", " 0.00034670293,\n", " 0.0020041289,\n", " -0.0026716313,\n", " -0.013138594,\n", " 0.010708301,\n", " -0.006320043,\n", " 0.0003430565,\n", " -0.0023863784,\n", " 
0.0041285586,\n", " 0.02436556,\n", " -0.0013494157,\n", " 0.0027933612,\n", " -0.013282195,\n", " -0.016526068,\n", " -0.0024382453,\n", " -0.00123786,\n", " -0.024043454,\n", " -0.014208728,\n", " -0.025910893,\n", " 0.01483559,\n", " 0.0014756524,\n", " -0.0062794304,\n", " 0.011652672,\n", " -0.010846084,\n", " -0.00055292505,\n", " -0.006917841,\n", " -0.016316416,\n", " 0.0019117374,\n", " -0.0024690018,\n", " 0.0065878574,\n", " -0.011674545,\n", " -0.027465316,\n", " -0.010525571,\n", " 0.0070896544,\n", " 0.0038828724,\n", " -1.9514691e-05,\n", " -0.03796895,\n", " -0.016891971,\n", " -0.020952495,\n", " -0.02707475,\n", " -0.0056210486,\n", " 0.0008978269,\n", " 0.0090206405,\n", " 0.014132879,\n", " -0.005039622,\n", " 0.007910115,\n", " -0.010297287,\n", " -0.018774403,\n", " -0.0037102283,\n", " -0.0081058275,\n", " 0.008451028,\n", " 0.009922386,\n", " 0.020084145,\n", " -0.016261198,\n", " -0.007216061,\n", " -0.00927829,\n", " -0.014977591,\n", " 0.0049820817,\n", " 0.0038789976,\n", " 0.010578156,\n", " 0.0034335677,\n", " 0.007220238,\n", " 0.0055910046,\n", " 0.010893634,\n", " -0.016074488,\n", " -0.01070372,\n", " -0.0017039044,\n", " 0.0013024615,\n", " -0.009858568,\n", " 0.0052182064,\n", " -0.012993387,\n", " 0.010220798,\n", " -0.013077802,\n", " -0.002633005,\n", " -0.002007256,\n", " 0.0035281247,\n", " -0.014246121,\n", " -0.00662812,\n", " 0.015674533,\n", " -0.0072830277,\n", " 0.018278548,\n", " -0.0039205262,\n", " 0.008028687,\n", " 0.009912567,\n", " -0.016735815,\n", " 0.044228956,\n", " -0.0040302947,\n", " 0.015140588,\n", " 0.012763981,\n", " 0.00940551,\n", " 0.008043622,\n", " -0.007542642,\n", " 0.008158422,\n", " -0.006419053,\n", " -0.0054414403,\n", " -0.01070021,\n", " -0.026129365,\n", " 0.017914664,\n", " 0.0074348566,\n", " -0.0026260135,\n", " 0.0050034993,\n", " -0.008685947,\n", " -0.01232482,\n", " -0.009472104,\n", " -0.00998203,\n", " -0.013084934,\n", " 0.016157482,\n", " 0.0035411296,\n", " -0.01580881,\n", " 
0.0056481594,\n", " 0.0044578677,\n", " 0.0192105,\n", " -0.007447126,\n", " 0.0018608619,\n", " -0.01642316,\n", " 0.013846359,\n", " 0.016723748,\n", " 0.017610094,\n", " 0.0025869973,\n", " 0.013281461,\n", " -0.0035878026,\n", " 0.013574551,\n", " 0.01191671,\n", " 0.002023514,\n", " -0.012203663,\n", " -0.0035232634,\n", " 0.004333291,\n", " -0.0022068326,\n", " -0.019803211,\n", " -0.014175296,\n", " 0.009632993,\n", " -0.009261824,\n", " -0.005644163,\n", " -0.019793313,\n", " -0.01999723,\n", " 0.010476594,\n", " 0.00023194576,\n", " 0.038221277,\n", " 0.002692452,\n", " -0.011473131,\n", " 0.011317104,\n", " -0.02533043,\n", " 0.025733313,\n", " 0.009908857,\n", " 0.006011558,\n", " -0.00117811,\n", " 0.0040635415,\n", " 0.04839185,\n", " -0.004475311,\n", " -0.020770011,\n", " 0.021278203,\n", " 0.0006682506,\n", " -0.01391899,\n", " -0.011185348,\n", " -0.0016687596,\n", " -0.020530779,\n", " -0.009443637,\n", " 0.00021791937,\n", " 0.0141618345,\n", " -0.00018125157,\n", " -0.0057364246,\n", " 0.026130533,\n", " -0.006763163,\n", " -0.012792275,\n", " 0.009654985,\n", " -0.0055593587,\n", " -0.004337306,\n", " -0.0011920833,\n", " 0.013856145,\n", " -0.009925314,\n", " -0.03665154,\n", " -0.02422632,\n", " -0.0032122692,\n", " 0.010654627,\n", " 0.002900551,\n", " -0.009436965,\n", " -0.009093561,\n", " -0.01300134,\n", " -0.011036559,\n", " -0.004318026,\n", " -0.007913343,\n", " -0.0026958662,\n", " 0.006920839,\n", " 0.03142679,\n", " 0.00720109,\n", " -0.025722636,\n", " 0.01609606,\n", " 0.018501537,\n", " -0.012519316,\n", " 0.018567495,\n", " -0.00080161256,\n", " 0.02868531,\n", " -0.022842437,\n", " 0.00041527807,\n", " 0.001499078,\n", " 0.0014302981,\n", " -0.006122281,\n", " -0.005866547,\n", " -0.00364596,\n", " -0.018620782,\n", " 0.010773724,\n", " 0.014241376,\n", " 0.006928573,\n", " 0.0025531498,\n", " 0.019868506,\n", " 0.004625751,\n", " 0.0013183479,\n", " -0.0035429432,\n", " 0.0034629963,\n", " 0.0014225965,\n", " 0.003674885,\n", 
" -0.022263356,\n", " -0.0057946867,\n", " 0.000847111,\n", " 0.016393552,\n", " -0.006979385,\n", " 0.010756057,\n", " 0.02133184,\n", " -0.01161145,\n", " -0.006900085,\n", " 0.020659793,\n", " 0.0060759606,\n", " 0.0064930245,\n", " 0.017779253,\n", " -0.015418825,\n", " -0.009948451,\n", " 0.0054611037,\n", " 0.0072064647,\n", " 0.009153094,\n", " -0.019114252,\n", " 0.0095359115,\n", " 0.0066137696,\n", " -0.0018614236,\n", " -0.0035935654,\n", " -0.0147451125,\n", " 0.0437512,\n", " 0.006653816,\n", " 0.017610248,\n", " 0.00023346231,\n", " -0.0031273973,\n", " -0.00795702,\n", " 0.009134104,\n", " 0.0049350346,\n", " 0.023338927,\n", " -0.011797726,\n", " 0.006721727,\n", " 0.034036856,\n", " 0.027116382,\n", " 0.008839323,\n", " 0.008879634,\n", " -0.014732894,\n", " -0.023258459,\n", " -0.010458117,\n", " 0.0025783682,\n", " 0.026273916,\n", " 0.015801044,\n", " -0.03471658,\n", " -0.019053059,\n", " 0.010396413,\n", " -0.004727972,\n", " 0.011225099,\n", " 0.005267588,\n", " 0.0009971889,\n", " 0.002181436,\n", " 0.0026353376,\n", " 0.013491443,\n", " 0.003732599,\n", " -0.015181297,\n", " 0.014031707,\n", " 0.006955213,\n", " -0.010255031,\n", " -0.0005977646,\n", " -0.0014751297,\n", " 0.011726749,\n", " -0.015701365,\n", " 0.009161775,\n", " -0.021872235,\n", " -0.012786676,\n", " 0.011369055,\n", " 0.010139113,\n", " -0.021847589,\n", " -0.003500169,\n", " 0.005626935,\n", " 0.014681923,\n", " -0.015393642,\n", " -0.0042989696,\n", " -0.01274783,\n", " -0.003247575,\n", " -0.009241044,\n", " -0.024767496,\n", " 0.01402007,\n", " -0.032956727,\n", " 0.007823585,\n", " 0.00086588814,\n", " 0.016799834,\n", " 0.009937656,\n", " 0.011562151,\n", " 0.018530047,\n", " -0.008733477,\n", " 0.015021173,\n", " 0.008170433,\n", " -0.02211776,\n", " -0.020664722,\n", " 0.017137757,\n", " 0.006076144,\n", " -0.0024033028,\n", " 0.008911197,\n", " -0.008098087,\n", " -0.01227845,\n", " 0.004365266,\n", " 0.011450032,\n", " -0.0058513233,\n", " -0.001962597,\n", " 
0.0018001919,\n", " -0.0118909255,\n", " -0.016564349,\n", " 0.0011006123,\n", " 0.0028502448,\n", " 0.0035765416,\n", " -0.025846446,\n", " 0.02323068,\n", " -0.010438478,\n", " 0.012867443,\n", " 0.016655866,\n", " -0.011014527,\n", " -0.00018478793,\n", " 0.0043127853,\n", " -0.00739433,\n", " 0.01380768,\n", " -0.01715595,\n", " -0.011008975,\n", " -0.023952184,\n", " -0.013332797,\n", " 0.012218126,\n", " 0.010988365,\n", " -0.012082493,\n", " 0.005911786,\n", " 0.03259747,\n", " 0.0028192478,\n", " -0.004791071,\n", " -0.0033470504,\n", " 0.027849363,\n", " -0.008910224,\n", " 0.0038032038,\n", " 0.00079069537,\n", " -0.0026898668,\n", " -0.038791772,\n", " -0.009055432,\n", " 0.005574182,\n", " -0.0046006306,\n", " 0.022275487,\n", " -0.026547536,\n", " 0.01985376,\n", " -0.0036260244,\n", " 0.0055667395,\n", " -0.007856447,\n", " 0.002101895,\n", " -0.0048638624,\n", " -0.0041430946,\n", " -0.003604123,\n", " 0.023410158,\n", " -0.00890236,\n", " 0.02113935,\n", " 0.012540842,\n", " 0.021628506,\n", " 0.011549729,\n", " -0.011692699,\n", " 0.0062029436,\n", " 0.004613249,\n", " -0.015186577,\n", " 0.020775136,\n", " 0.004292727,\n", " 0.014386384,\n", " 0.027752299,\n", " 0.007513367,\n", " 0.009246629,\n", " -0.0064950096,\n", " -0.0012228594,\n", " 0.0183201,\n", " 0.01666892,\n", " 0.014074987,\n", " -0.015826361,\n", " 0.00706225,\n", " -0.011420435,\n", " -0.000117821124,\n", " 0.00636599,\n", " -0.01168592,\n", " -0.008913979,\n", " -0.009315366,\n", " 0.004305923,\n", " -0.0070775324,\n", " -0.0013507333,\n", " -0.010641516,\n", " 0.025595605,\n", " 0.0021488322,\n", " -0.020299394,\n", " 0.016365835,\n", " 0.01090173,\n", " -0.014332757,\n", " 0.0021695998,\n", " -0.015918853,\n", " 0.005657473,\n", " 0.0035333806,\n", " -0.01639628,\n", " -0.012431445,\n", " 0.017524673,\n", " 0.0012468589,\n", " -0.020931745,\n", " 0.01772653,\n", " -0.0017076621,\n", " -0.005710361,\n", " 0.0036018444,\n", " 0.0032456913,\n", " 0.0060804514,\n", " 
0.0055429623,\n", " 0.00010344527,\n", " -0.0022787189,\n", " 0.0094032595,\n", " 0.026479244,\n", " -0.0133330235,\n", " 0.0012139258,\n", " 0.0012430488,\n", " 0.00532974,\n", " -0.031719025,\n", " 0.029677862,\n", " -0.011657092,\n", " -0.0015981547,\n", " 0.024310634,\n", " -0.0040961807,\n", " 0.014150848,\n", " 0.001236066,\n", " -0.0068385373,\n", " 0.0026352534,\n", " -0.0015445428,\n", " -0.02126823,\n", " 0.00035352813,\n", " -0.020471761,\n", " 0.023361415,\n", " -0.005785739,\n", " -0.0015232314,\n", " 0.005771716,\n", " -0.0034492896,\n", " -0.005690269,\n", " 0.0074402643,\n", " 0.010385756,\n", " -0.0052817506,\n", " -0.015418466,\n", " 0.0055891434,\n", " -0.005395423,\n", " -0.01826545,\n", " -0.016707467,\n", " -0.0038768558,\n", " -0.009845685,\n", " 0.013465657,\n", " -0.012950367,\n", " 0.0034455333,\n", " 0.0027995422,\n", " -0.018374551,\n", " 0.032909963,\n", " 0.011097423,\n", " -0.00523679,\n", " 0.011771575,\n", " -0.00030578976,\n", " 0.008449713,\n", " -0.0032486266,\n", " 0.01343338,\n", " 0.011201929,\n", " -0.0037805375,\n", " -0.0045786174,\n", " 0.018902773,\n", " -0.012733217,\n", " 0.009184702,\n", " -0.00145156,\n", " 0.0044501056,\n", " -0.019501481,\n", " 0.0011780142,\n", " -0.017345436,\n", " 0.009869974,\n", " 0.00035023314,\n", " 0.0008536144,\n", " 0.014787637,\n", " 0.013130153,\n", " 0.013293301,\n", " 0.008133363,\n", " -0.018138664,\n", " 0.017097523,\n", " 0.0016423509,\n", " -0.0012313571,\n", " -0.009307641,\n", " -0.008348854,\n", " 0.013959533,\n", " 0.00081260497,\n", " -0.022842584,\n", " 0.0021000463,\n", " 0.0071342024,\n", " -0.030543033,\n", " -0.00076658826,\n", " 0.0038838654,\n", " -0.020026527,\n", " -0.0009193741,\n", " -0.028725406,\n", " 0.008354255,\n", " 0.017175624,\n", " 0.0066980836,\n", " -0.008825315,\n", " -0.0032821153,\n", " -0.0120257195,\n", " 0.008753739,\n", " -0.010888297,\n", " 0.0018628367,\n", " 0.00666672,\n", " -0.0070096236,\n", " 0.0064608934,\n", " -0.018866133,\n", " 
-0.009216352,\n", " 0.0010065404,\n", " 0.007893361,\n", " 0.0022640722,\n", " -0.0048996797,\n", " 0.0012664055,\n", " 0.026383942,\n", " -0.023850972,\n", " -0.0038789338,\n", " -0.021360097,\n", " 0.011016509,\n", " 0.03296637,\n", " -0.013446338,\n", " -0.0062947953,\n", " 0.0035113234,\n", " -0.0009893583,\n", " 0.009339733,\n", " -0.011971877,\n", " -0.0043110526,\n", " -0.0008398244,\n", " -0.005604297,\n", " 0.0034807874,\n", " 0.010588174,\n", " 0.0043597063,\n", " 0.009744976,\n", " 0.0023046762,\n", " 0.006984085,\n", " -0.0036556544,\n", " 0.006231876,\n", " 0.0064902953,\n", " -0.0029447286,\n", " 0.0051266653,\n", " -0.01336394,\n", " -0.0113149965,\n", " 0.007199561,\n", " 0.018737625,\n", " 0.0020844636,\n", " -0.00026631117,\n", " -0.005549026,\n", " -0.012998596,\n", " -0.0026321998,\n", " -0.0069889147,\n", " 0.012633094,\n", " 0.006075017,\n", " -0.003161684,\n", " 0.0036537081,\n", " -0.023826687,\n", " -0.019064976,\n", " 0.0042509953,\n", " -0.008035015,\n", " 0.0017437332,\n", " 0.0001933908,\n", " -0.010598572,\n", " 0.0005602085,\n", " -0.0020730442,\n", " -0.0044246544,\n", " 0.0047421805,\n", " -0.007940179,\n", " 0.007924253,\n", " -0.03992038,\n", " -0.020474782,\n", " 0.002820776,\n", " -0.0037712108,\n", " 0.01640155,\n", " -0.0026677707,\n", " -0.0065477686,\n", " -0.00942237,\n", " -0.028361756,\n", " 0.017999578,\n", " -0.028847925,\n", " -0.035870824,\n", " -0.0077725938,\n", " 0.00025159324,\n", " 0.024979902,\n", " 0.0008509833,\n", " -0.020956736,\n", " 0.010088415,\n", " -0.014998588,\n", " 0.008106019,\n", " -0.029402161,\n", " 0.0013716082,\n", " 0.005331104,\n", " 0.019453391,\n", " -0.022312775,\n", " 0.005430823,\n", " -0.0006697997,\n", " -0.0020321724,\n", " 0.011565465,\n", " -0.0010338072,\n", " 0.0002029741,\n", " 0.024209853,\n", " -0.0026119952,\n", " 0.0024724454,\n", " -0.0033658869,\n", " 0.0035583116,\n", " 0.01953483,\n", " -0.0063113878,\n", " -0.0054068915,\n", " -0.013151072,\n", " -0.008620972,\n", " 
-0.0027908338,\n", " 0.0071389414,\n", " -0.011312278,\n", " -0.009514534,\n", " -0.014097776,\n", " 0.014665653,\n", " -0.010825416,\n", " -0.006738613,\n", " -0.008681804,\n", " 0.0074968743,\n", " 0.00925995,\n", " 0.030365922,\n", " 0.01581594,\n", " 0.027012127,\n", " 0.014637945,\n", " 0.018089956,\n", " 0.0063143787,\n", " 0.011933219,\n", " 0.0017047741,\n", " 0.005093401,\n", " -0.007823843,\n", " -0.031952586,\n", " 0.0057907584,\n", " -0.022763541,\n", " 0.009758164,\n", " 0.014764762,\n", " 0.0046889684,\n", " 0.00069254386,\n", " -0.00093928544,\n", " 0.017493282,\n", " -0.011338603,\n", " 0.010587571,\n", " -0.042125665,\n", " -0.01751254,\n", " 0.12934099,\n", " -0.010869519,\n", " 0.00583255,\n", " 0.007236653,\n", " -0.020137789,\n", " 0.0020204228,\n", " 0.018176237,\n", " -0.027152134,\n", " -0.0017807232,\n", " 0.018857878,\n", " 0.000920231,\n", " -0.012664301,\n", " -0.004148013,\n", " -0.0061075287,\n", " -0.01019638,\n", " 0.030963605,\n", " 0.0029604621,\n", " 0.020669036,\n", " 0.0057819453,\n", " 0.007060314,\n", " 2.8583525e-05,\n", " 0.008000199,\n", " -0.0041641127,\n", " -0.015752537,\n", " -0.014323279,\n", " -0.0063861427,\n", " 0.00097315706,\n", " -0.021074038,\n", " 0.0053948155,\n", " 0.008212429,\n", " 0.0034420004,\n", " -0.005460239,\n", " 0.0052815555,\n", " 0.023984928,\n", " 0.00024263989,\n", " -0.01184368,\n", " 0.01635719,\n", " -0.012418883,\n", " -0.027259506,\n", " -0.00791782,\n", " -0.022361718,\n", " -0.023534195,\n", " 0.007092053,\n", " 0.015088988,\n", " 0.0025764408,\n", " 0.008576472,\n", " 0.0024461527,\n", " 0.006974108,\n", " 0.0040977984,\n", " -0.009910279,\n", " 0.0014616131,\n", " -0.011150518,\n", " 0.016027663,\n", " 0.0010791811,\n", " -0.00047525248,\n", " -0.0122826295,\n", " -0.001779053,\n", " -0.004107656,\n", " 0.0004357073,\n", " -0.013289022,\n", " 5.2768923e-05,\n", " 0.027372459,\n", " -0.00012527185,\n", " 0.013568389,\n", " 0.0014534167,\n", " -0.015535423,\n", " 0.012551211,\n", " 
-0.009110119,\n", " 0.008564244,\n", " -0.001981865,\n", " -0.009355929,\n", " 0.028776072,\n", " -0.0037904843,\n", " 0.0006138599,\n", " 0.007737072,\n", " -0.008870305,\n", " 0.0050867843,\n", " -0.0046355803,\n", " -0.00780643,\n", " 0.006330763,\n", " -0.0034218375,\n", " 0.011135206,\n", " 0.010597446,\n", " -0.001705957,\n", " 0.00024502908,\n", " 0.0023455597,\n", " 0.019304464,\n", " 0.0020452046,\n", " -0.009107396,\n", " 0.0034956592,\n", " -0.002760975,\n", " 0.011731058,\n", " 0.0075381757,\n", " -0.0032501328,\n", " 0.007482203,\n", " 0.0112169,\n", " -0.0070274365,\n", " -0.019349564,\n", " 0.0079222545,\n", " 0.005137253,\n", " -0.0037911844,\n", " -0.023284845,\n", " 0.01017654,\n", " -0.01285582,\n", " 0.00657109,\n", " -0.0038569963,\n", " 0.004562567,\n", " 0.010494698,\n", " -0.013017893,\n", " -0.0001540083,\n", " 0.00027303037,\n", " 0.018464914,\n", " 0.008634839,\n", " 0.003688138,\n", " -0.024935802,\n", " -0.01563817,\n", " 0.0030880566,\n", " 0.0107814055,\n", " -0.005098247,\n", " 0.0051683136,\n", " 0.008334721,\n", " -0.012933082,\n", " -0.009752038,\n", " -0.009319242,\n", " 0.0026183068,\n", " -0.00839282,\n", " 0.028957602,\n", " 0.0015102412,\n", " -0.009705219,\n", " 0.007890572,\n", " -0.16137508,\n", " -0.00971317,\n", " 0.011886127,\n", " -0.0041362955,\n", " -0.051743045,\n", " 0.0029404622,\n", " 0.013015636,\n", " 0.0135714635,\n", " 0.023563059,\n", " 0.009681992,\n", " -0.0140659595,\n", " -0.0029743111,\n", " -0.0012119104,\n", " -0.0013912185,\n", " -0.018352704,\n", " 0.025660586,\n", " 0.006345499,\n", " 0.020528495,\n", " 0.026140284,\n", " -0.006008913,\n", " -0.024912497,\n", " 0.0051325797,\n", " -0.026183711,\n", " 0.00017664659,\n", " 0.011910398,\n", " -0.005905746,\n", " 0.037139714,\n", " -0.00074520166,\n", " -0.00975112,\n", " -0.017940624,\n", " -0.03163426,\n", " -0.0041947165,\n", " -0.0016518922,\n", " 0.0008597186,\n", " 0.009477727,\n", " -0.010944344,\n", " -0.031105489,\n", " -0.010577163,\n", " 
0.0012018185,\n", " 0.02502922,\n", " 0.002644734,\n", " 0.019586446,\n", " 0.0006251714,\n", " -0.009300319,\n", " 0.0048416024,\n", " -0.016230734,\n", " 0.021608155,\n", " 0.0057596206,\n", " 0.0054233163,\n", " 0.020398937,\n", " 0.0019896117,\n", " -0.016917758,\n", " 0.016629925,\n", " -0.021681983,\n", " 0.008405055,\n", " 0.005663764,\n", " 0.0074983193,\n", " -0.011529013,\n", " 0.023950666,\n", " -0.0055231564,\n", " -0.009429639,\n", " -0.008542265,\n", " -0.02154635,\n", " 0.008263985,\n", " 0.040017866,\n", " 0.009212233,\n", " 0.007296921,\n", " 0.0008881682,\n", " -0.02306129,\n", " -0.018765206,\n", " 0.034263294,\n", " 0.0052024196,\n", " -0.011974769,\n", " 0.004816568,\n", " -0.009321024,\n", " 0.003163061,\n", " -0.017364537,\n", " 0.015120265,\n", " 0.009003758,\n", " 0.025875762,\n", " 0.0012792622,\n", " -0.01565257,\n", " 0.0028838697,\n", " 0.0041043386,\n", " -0.009521023,\n", " -0.007009973,\n", " 0.010894969,\n", " 0.0017601077,\n", " 0.0053477627,\n", " 0.0049250037,\n", " -0.0006218348,\n", " 0.022057934,\n", " -0.011921848,\n", " 0.010007685,\n", " 0.0048408653,\n", " 0.01652344,\n", " -0.026266284,\n", " 0.0041309632,\n", " -0.000793974,\n", " -0.006041779,\n", " -0.004878622,\n", " -0.00848488,\n", " 0.0062587205,\n", " -0.011997021,\n", " -0.0014170379,\n", " 0.03317679,\n", " -0.01153826,\n", " 0.009222562,\n", " -0.00025648117,\n", " -0.006128342,\n", " -0.019824483,\n", " -0.020242546,\n", " 0.004578746,\n", " 0.0027718116,\n", " -0.010688295,\n", " 0.0072707273,\n", " -0.035115726,\n", " -0.0009034979,\n", " 0.0022914782,\n", " -0.011246273,\n", " -0.0029672915,\n", " 0.0002998253,\n", " -0.029806698,\n", " -0.016474688,\n", " -0.012461825,\n", " 0.007398199,\n", " -0.015541311,\n", " -0.0032258818,\n", " 0.008461472,\n", " 0.009301752,\n", " 0.005515957,\n", " -0.016816154,\n", " -0.012065389,\n", " 0.021188056,\n", " 0.001224154,\n", " 0.005623831,\n", " -0.005622118,\n", " -0.019246496,\n", " -0.005230665,\n", " 
0.0024922863,\n", " -0.013673767,\n", " 0.020042181,\n", " -0.024621949,\n", " -0.013509675,\n", " 0.012731169,\n", " 0.024457127,\n", " 0.004698952,\n", " -0.00069801125,\n", " 0.01918553,\n", " 0.009510992,\n", " -0.00423933,\n", " 0.0027476759,\n", " 0.019895652,\n", " 0.0012934102,\n", " -0.008207232,\n", " -0.012624893,\n", " -0.023316009,\n", " -0.01640529,\n", " 0.012508562,\n", " -0.0007352376,\n", " 0.010327011,\n", " 0.0057493057,\n", " -0.0109667685,\n", " -0.017274708,\n", " -0.006248113,\n", " ...]],\n", " 'total_duration': 5010556908,\n", " 'load_duration': 4283442370,\n", " 'prompt_eval_count': 3}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import ollama\n", "ollama.embed(model='mistral', input=\"Hello World\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", "To disable this warning, you can either:\n", "\t- Avoid using `tokenizers` before the fork if possible\n", "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting pipreqs\n", " Downloading pipreqs-0.5.0-py3-none-any.whl.metadata (7.9 kB)\n", "Collecting docopt==0.6.2 (from pipreqs)\n", " Downloading docopt-0.6.2.tar.gz (25 kB)\n", " Installing build dependencies ... \u001b[?25ldone\n", "\u001b[?25h Getting requirements to build wheel ... \u001b[?25ldone\n", "\u001b[?25h Preparing metadata (pyproject.toml) ... 
\u001b[?25ldone\n", "\u001b[?25hCollecting ipython==8.12.3 (from pipreqs)\n", " Downloading ipython-8.12.3-py3-none-any.whl.metadata (5.7 kB)\n", "Requirement already satisfied: nbconvert<8.0.0,>=7.11.0 in ./.venv/lib/python3.12/site-packages (from pipreqs) (7.16.4)\n", "Collecting yarg==0.1.9 (from pipreqs)\n", " Downloading yarg-0.1.9-py2.py3-none-any.whl.metadata (4.6 kB)\n", "Collecting backcall (from ipython==8.12.3->pipreqs)\n", " Downloading backcall-0.2.0-py2.py3-none-any.whl.metadata (2.0 kB)\n", "Requirement already satisfied: decorator in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (5.1.1)\n", "Requirement already satisfied: jedi>=0.16 in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (0.19.1)\n", "Requirement already satisfied: matplotlib-inline in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (0.1.7)\n", "Collecting pickleshare (from ipython==8.12.3->pipreqs)\n", " Downloading pickleshare-0.7.5-py2.py3-none-any.whl.metadata (1.5 kB)\n", "Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (3.0.47)\n", "Requirement already satisfied: pygments>=2.4.0 in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (2.18.0)\n", "Requirement already satisfied: stack-data in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (0.6.3)\n", "Requirement already satisfied: traitlets>=5 in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (5.14.3)\n", "Requirement already satisfied: pexpect>4.3 in ./.venv/lib/python3.12/site-packages (from ipython==8.12.3->pipreqs) (4.9.0)\n", "Requirement already satisfied: requests in ./.venv/lib/python3.12/site-packages (from yarg==0.1.9->pipreqs) (2.32.3)\n", "Requirement already satisfied: beautifulsoup4 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (4.12.3)\n", "Requirement already 
satisfied: bleach!=5.0.0 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (6.1.0)\n", "Requirement already satisfied: defusedxml in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (0.7.1)\n", "Requirement already satisfied: jinja2>=3.0 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (3.1.4)\n", "Requirement already satisfied: jupyter-core>=4.7 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (5.7.2)\n", "Requirement already satisfied: jupyterlab-pygments in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (0.3.0)\n", "Requirement already satisfied: markupsafe>=2.0 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (2.1.5)\n", "Requirement already satisfied: mistune<4,>=2.0.3 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (3.0.2)\n", "Requirement already satisfied: nbclient>=0.5.0 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (0.10.0)\n", "Requirement already satisfied: nbformat>=5.7 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (5.10.4)\n", "Requirement already satisfied: packaging in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (24.1)\n", "Requirement already satisfied: pandocfilters>=1.4.1 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (1.5.1)\n", "Requirement already satisfied: tinycss2 in ./.venv/lib/python3.12/site-packages (from nbconvert<8.0.0,>=7.11.0->pipreqs) (1.3.0)\n", "Requirement already satisfied: six>=1.9.0 in ./.venv/lib/python3.12/site-packages (from bleach!=5.0.0->nbconvert<8.0.0,>=7.11.0->pipreqs) (1.16.0)\n", "Requirement already satisfied: webencodings in ./.venv/lib/python3.12/site-packages (from bleach!=5.0.0->nbconvert<8.0.0,>=7.11.0->pipreqs) (0.5.1)\n", "Requirement already satisfied: 
parso<0.9.0,>=0.8.3 in ./.venv/lib/python3.12/site-packages (from jedi>=0.16->ipython==8.12.3->pipreqs) (0.8.4)\n", "Requirement already satisfied: platformdirs>=2.5 in ./.venv/lib/python3.12/site-packages (from jupyter-core>=4.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (4.2.2)\n", "Requirement already satisfied: jupyter-client>=6.1.12 in ./.venv/lib/python3.12/site-packages (from nbclient>=0.5.0->nbconvert<8.0.0,>=7.11.0->pipreqs) (8.6.2)\n", "Requirement already satisfied: fastjsonschema>=2.15 in ./.venv/lib/python3.12/site-packages (from nbformat>=5.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (2.20.0)\n", "Requirement already satisfied: jsonschema>=2.6 in ./.venv/lib/python3.12/site-packages (from nbformat>=5.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (4.23.0)\n", "Requirement already satisfied: ptyprocess>=0.5 in ./.venv/lib/python3.12/site-packages (from pexpect>4.3->ipython==8.12.3->pipreqs) (0.7.0)\n", "Requirement already satisfied: wcwidth in ./.venv/lib/python3.12/site-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython==8.12.3->pipreqs) (0.2.13)\n", "Requirement already satisfied: soupsieve>1.2 in ./.venv/lib/python3.12/site-packages (from beautifulsoup4->nbconvert<8.0.0,>=7.11.0->pipreqs) (2.5)\n", "Requirement already satisfied: charset-normalizer<4,>=2 in ./.venv/lib/python3.12/site-packages (from requests->yarg==0.1.9->pipreqs) (3.3.2)\n", "Requirement already satisfied: idna<4,>=2.5 in ./.venv/lib/python3.12/site-packages (from requests->yarg==0.1.9->pipreqs) (3.7)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.12/site-packages (from requests->yarg==0.1.9->pipreqs) (2.2.2)\n", "Requirement already satisfied: certifi>=2017.4.17 in ./.venv/lib/python3.12/site-packages (from requests->yarg==0.1.9->pipreqs) (2024.7.4)\n", "Requirement already satisfied: executing>=1.2.0 in ./.venv/lib/python3.12/site-packages (from stack-data->ipython==8.12.3->pipreqs) (2.0.1)\n", "Requirement already satisfied: asttokens>=2.1.0 in 
./.venv/lib/python3.12/site-packages (from stack-data->ipython==8.12.3->pipreqs) (2.4.1)\n", "Requirement already satisfied: pure-eval in ./.venv/lib/python3.12/site-packages (from stack-data->ipython==8.12.3->pipreqs) (0.2.3)\n", "Requirement already satisfied: attrs>=22.2.0 in ./.venv/lib/python3.12/site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (23.2.0)\n", "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in ./.venv/lib/python3.12/site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (2023.12.1)\n", "Requirement already satisfied: referencing>=0.28.4 in ./.venv/lib/python3.12/site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (0.35.1)\n", "Requirement already satisfied: rpds-py>=0.7.1 in ./.venv/lib/python3.12/site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert<8.0.0,>=7.11.0->pipreqs) (0.19.1)\n", "Requirement already satisfied: python-dateutil>=2.8.2 in ./.venv/lib/python3.12/site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert<8.0.0,>=7.11.0->pipreqs) (2.9.0.post0)\n", "Requirement already satisfied: pyzmq>=23.0 in ./.venv/lib/python3.12/site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert<8.0.0,>=7.11.0->pipreqs) (26.0.3)\n", "Requirement already satisfied: tornado>=6.2 in ./.venv/lib/python3.12/site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert<8.0.0,>=7.11.0->pipreqs) (6.4.1)\n", "Downloading pipreqs-0.5.0-py3-none-any.whl (33 kB)\n", "Downloading ipython-8.12.3-py3-none-any.whl (798 kB)\n", "\u001b[2K \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m798.3/798.3 kB\u001b[0m \u001b[31m11.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0mm eta \u001b[36m0:00:01\u001b[0m0:01\u001b[0m\n", "\u001b[?25hDownloading yarg-0.1.9-py2.py3-none-any.whl (19 kB)\n", "Downloading backcall-0.2.0-py2.py3-none-any.whl (11 kB)\n", "Downloading 
pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)\n", "Building wheels for collected packages: docopt\n", " Building wheel for docopt (pyproject.toml) ... \u001b[?25ldone\n", "\u001b[?25h Created wheel for docopt: filename=docopt-0.6.2-py2.py3-none-any.whl size=13705 sha256=0d029a8ff4638c40fa356fa0811eb9062726c077827fea0c117d5162219a1c1b\n", " Stored in directory: /home/dsyer/.cache/pip/wheels/1a/bf/a1/4cee4f7678c68c5875ca89eaccf460593539805c3906722228\n", "Successfully built docopt\n", "Installing collected packages: pickleshare, docopt, backcall, yarg, ipython, pipreqs\n", " Attempting uninstall: ipython\n", " Found existing installation: ipython 8.26.0\n", " Uninstalling ipython-8.26.0:\n", " Successfully uninstalled ipython-8.26.0\n", "Successfully installed backcall-0.2.0 docopt-0.6.2 ipython-8.12.3 pickleshare-0.7.5 pipreqs-0.5.0 yarg-0.1.9\n", "\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.2\u001b[0m\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%%bash\n", "pip install pipreqs\n", "jupyter nbconvert --to=python README.ipynb\n", "pipreqs --ignore \".venv\" ." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>instruction</th>\n", " <th>context</th>\n", " <th>response</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>How to make a cup of tea?</td>\n", " <td></td>\n", " <td>Boil water. Add tea bag. Pour water into cup. ...</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>What is the capital of France?</td>\n", " <td></td>\n", " <td>Paris</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>What is the capital of Germany?</td>\n", " <td></td>\n", " <td>Berlin</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>What is the capital of Italy?</td>\n", " <td></td>\n", " <td>Rome</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>What is the capital of Spain?</td>\n", " <td></td>\n", " <td>Madrid</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>What is the capital of Portugal?</td>\n", " <td></td>\n", " <td>Lisbon</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>What is the capital of Greece?</td>\n", " <td></td>\n", " <td>Athens</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>What is the capital of Turkey?</td>\n", " <td></td>\n", " <td>Ankara</td>\n", " </tr>\n", " <tr>\n", " <th>8</th>\n", " <td>What is the capital of Egypt?</td>\n", " <td></td>\n", " <td>Cairo</td>\n", " </tr>\n", " <tr>\n", " <th>9</th>\n", " <td>What is the capital of South Africa?</td>\n", " <td></td>\n", " <td>Pretoria</td>\n", " </tr>\n", " <tr>\n", " <th>10</th>\n", " <td>What is the capital of Nigeria?</td>\n", " <td></td>\n", " <td>Abuja</td>\n", " 
</tr>\n", " <tr>\n", " <th>11</th>\n", " <td>What is the capital of Kenya?</td>\n", " <td></td>\n", " <td>Nairobi</td>\n", " </tr>\n", " <tr>\n", " <th>12</th>\n", " <td>What is the capital of India?</td>\n", " <td></td>\n", " <td>New Delhi</td>\n", " </tr>\n", " <tr>\n", " <th>13</th>\n", " <td>What is the capital of China?</td>\n", " <td></td>\n", " <td>Beijing</td>\n", " </tr>\n", " <tr>\n", " <th>14</th>\n", " <td>What is the capital of Japan?</td>\n", " <td></td>\n", " <td>Tokyo</td>\n", " </tr>\n", " <tr>\n", " <th>15</th>\n", " <td>What is the capital of Australia?</td>\n", " <td></td>\n", " <td>Canberra</td>\n", " </tr>\n", " <tr>\n", " <th>16</th>\n", " <td>What is the capital of New Zealand?</td>\n", " <td></td>\n", " <td>Wellington</td>\n", " </tr>\n", " <tr>\n", " <th>17</th>\n", " <td>What is the capital of Canada?</td>\n", " <td></td>\n", " <td>Ottawa</td>\n", " </tr>\n", " <tr>\n", " <th>18</th>\n", " <td>What is the capital of the United States?</td>\n", " <td></td>\n", " <td>Washington D.C.</td>\n", " </tr>\n", " <tr>\n", " <th>19</th>\n", " <td>What is the capital of Brazil?</td>\n", " <td></td>\n", " <td>Brasilia</td>\n", " </tr>\n", " <tr>\n", " <th>20</th>\n", " <td>What is the capital of Argentina?</td>\n", " <td></td>\n", " <td>Buenos Aires</td>\n", " </tr>\n", " <tr>\n", " <th>21</th>\n", " <td>What is the capital of Chile?</td>\n", " <td></td>\n", " <td>Santiago</td>\n", " </tr>\n", " <tr>\n", " <th>22</th>\n", " <td>What is the capital of Peru?</td>\n", " <td></td>\n", " <td>Lima</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " instruction context \\\n", "0 How to make a cup of tea? \n", "1 What is the capital of France? \n", "2 What is the capital of Germany? \n", "3 What is the capital of Italy? \n", "4 What is the capital of Spain? \n", "5 What is the capital of Portugal? \n", "6 What is the capital of Greece? \n", "7 What is the capital of Turkey? \n", "8 What is the capital of Egypt? 
\n", "9 What is the capital of South Africa? \n", "10 What is the capital of Nigeria? \n", "11 What is the capital of Kenya? \n", "12 What is the capital of India? \n", "13 What is the capital of China? \n", "14 What is the capital of Japan? \n", "15 What is the capital of Australia? \n", "16 What is the capital of New Zealand? \n", "17 What is the capital of Canada? \n", "18 What is the capital of the United States? \n", "19 What is the capital of Brazil? \n", "20 What is the capital of Argentina? \n", "21 What is the capital of Chile? \n", "22 What is the capital of Peru? \n", "\n", " response \n", "0 Boil water. Add tea bag. Pour water into cup. ... \n", "1 Paris \n", "2 Berlin \n", "3 Rome \n", "4 Madrid \n", "5 Lisbon \n", "6 Athens \n", "7 Ankara \n", "8 Cairo \n", "9 Pretoria \n", "10 Abuja \n", "11 Nairobi \n", "12 New Delhi \n", "13 Beijing \n", "14 Tokyo \n", "15 Canberra \n", "16 Wellington \n", "17 Ottawa \n", "18 Washington D.C. \n", "19 Brasilia \n", "20 Buenos Aires \n", "21 Santiago \n", "22 Lima " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "pd.read_json(\"data/lines.jsonl\", lines=True)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "data = load_dataset(\"./data\", split=\"train\")" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datasets.arrow_dataset.Dataset" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.__class__\n" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 2 }