
markov-next-word

Next word prediction using Markov chains: no PyTorch, no neural networks, no LSTMs, no transformers (NumPy is used only for random choices).

  • Just 60 lines of raw Python and a little bit of math, enjoy ;)

Generated text using the model (trained on Shakespeare)

Input:  love
Output: love to my self to my friend and in thy heart

Input:  sweet
Output: sweet love that i have seen the world will bear thine 

Input:  i
Output: i am not the world s best

Usage (train on all of Shakespeare -> generate Shakespeare-like text)

Import and create a model.

from src.markov_next_word import MarkovNextWord
NextWordPrediction = MarkovNextWord()

Train the model with your text data.

# Replace data/shakespeare.txt with your data.
NextWordPrediction.train('data/shakespeare.txt')

Generate text.

# Generate text starting from the word 'love'.
input_text = 'love'

# Number of words to generate.
sequence_length = 10
NextWordPrediction.generate_text(input_text, sequence_length)

Generated output

love to my self to my friend and in thy heart
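
Under the hood, generation can be as simple as repeatedly sampling a successor from the list of words observed after the current word. Here is a minimal sketch of that loop, assuming a word-to-next-words mapping like the one described further below; the function is illustrative, not necessarily the repo's exact implementation.

import numpy as np

def generate(word_to_nextwords, start_word, length):
    # Illustrative sketch; the repo's generate_text may differ in details.
    words = [start_word]
    current = start_word
    for _ in range(length):
        candidates = word_to_nextwords.get(current)
        if not candidates:
            break  # dead end: this word was never followed by anything
        # Duplicates in the list make frequent successors more likely,
        # so a uniform random choice already reflects the probabilities.
        current = np.random.choice(candidates)
        words.append(current)
    return ' '.join(words)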

What is the Markov property?

A process has the Markov property if the next state depends only on the current state. A sequence of events that follows the Markov property is called a Markov chain.

Short example

Let's define 3 states:

1: 'sun shines' 2: 'cloudy' 3: 'raining'

Define the following rules:
  • Rule 1: Sunny → next day: 80% rain, 20% cloudy, 0% sun.
  • Rule 2: Cloudy → next day: 50% rain, 50% sun, 0% cloudy.
  • Rule 3: Rainy → next day: 90% cloudy, 10% sun, 0% rain.
The stochastic matrix would look like this:

        sun   cloudy  rain
sun    [[0,    0.2,    0.8],
cloudy  [0.5,  0,      0.5],
rain    [0.1,  0.9,    0]]
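
Each row sums to 1: the row for today's state gives the probability distribution over tomorrow's states.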

Weather simulation

The sun is shining today; let's predict the weather for the next 7 days.

import numpy as np
from numpy.linalg import matrix_power

p = np.array([[0, 0.2, 0.8],
              [0.5, 0, 0.5],
              [0.1, 0.9, 0]])

# For example, today is Sunday and the sun is shining.
v_start = np.array([1, 0, 0])

# Simulate 7 days.
# Day1 = Monday, ..., Day7 = Sunday
for day in range(1, 8):
    v = np.dot(v_start, matrix_power(p, day))
    prob_sun, prob_cloudy, prob_rain = round(100 * v[0], 2), round(100 * v[1], 2), round(100 * v[2], 2)
    print(f"Day: {day}, sun shines: {prob_sun}%, cloudy: {prob_cloudy}%, rain: {prob_rain}%")

Day: 1, sun shines: 0.0%, cloudy: 20.0%, rain: 80.0%
Day: 2, sun shines: 18.0%, cloudy: 72.0%, rain: 10.0%
Day: 3, sun shines: 37.0%, cloudy: 12.6%, rain: 50.4%
Day: 4, sun shines: 11.34%, cloudy: 52.76%, rain: 35.9%
Day: 5, sun shines: 29.97%, cloudy: 34.58%, rain: 35.45%
Day: 6, sun shines: 20.83%, cloudy: 37.9%, rain: 41.27%
Day: 7, sun shines: 23.08%, cloudy: 41.31%, rain: 35.62%
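
As a sanity check, the Day 2 probability of sun is 0.2 · 0.5 + 0.8 · 0.1 = 0.18: the weather either turns cloudy and then sunny, or rainy and then sunny.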

Next word prediction using the Markov property

Consider the following example data:

I like Math.
I like Physics.
I hate War.
I love Schnitzel.
I love Science.

Mapping words to next words.

i -> [like, like, hate, love, love]
like -> [math, physics]
hate -> [war]
love -> [schnitzel, science]
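
A mapping like this can be built in a few lines. The sketch below is illustrative (the sentence splitting is deliberately naive, and build_mapping is a hypothetical helper, not the repo's actual train code):

from collections import defaultdict

def build_mapping(text):
    # Map each word to the list of words observed directly after it,
    # collecting pairs within each sentence only.
    word_to_nextwords = defaultdict(list)
    for sentence in text.lower().split('.'):
        words = sentence.split()
        for word, next_word in zip(words, words[1:]):
            word_to_nextwords[word].append(next_word)
    return dict(word_to_nextwords)

text = "I like Math. I like Physics. I hate War. I love Schnitzel. I love Science."
print(build_mapping(text))
# {'i': ['like', 'like', 'hate', 'love', 'love'], 'like': ['math', 'physics'],
#  'hate': ['war'], 'love': ['schnitzel', 'science']}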

Graph representation

(image: transition graph of words to their next words)

Mapping each (word, next_word) pair to its probability.

(i, like) -> 0.4
(i, hate) -> 0.2
(i, love) -> 0.4
(like, math) -> 0.5
(like, physics) -> 0.5
(hate, war) -> 1.0
(love, schnitzel) -> 0.5
(love, science) -> 0.5
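
These probabilities are just relative frequencies within each word's successor list. A small sketch of that computation (again illustrative, with pair_probabilities as a hypothetical helper):

from collections import Counter

def pair_probabilities(word_to_nextwords):
    # Map each (word, next_word) pair to its relative frequency
    # among the successors observed for that word.
    probs = {}
    for word, nextwords in word_to_nextwords.items():
        for next_word, count in Counter(nextwords).items():
            probs[(word, next_word)] = count / len(nextwords)
    return probs

mapping = {'i': ['like', 'like', 'hate', 'love', 'love'],
           'like': ['math', 'physics'],
           'hate': ['war'],
           'love': ['schnitzel', 'science']}
print(pair_probabilities(mapping))
# {('i', 'like'): 0.4, ('i', 'hate'): 0.2, ('i', 'love'): 0.4, ('like', 'math'): 0.5, ...}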

Graph representation with probabilities

(image: transition graph with edge probabilities)

Example using the model

>>> from src.markov_next_word import MarkovNextWord
>>> mnw = MarkovNextWord()
>>> mnw.train('data/test.txt')
>>> mnw.word_to_nextwords
{'i': ['like', 'like', 'hate', 'love', 'love'], 'like': ['math', 'physics'], 'hate': ['war'], 'love': ['schnitzel', 'science']}
>>> mnw.word_to_next_word_prob
{('i', 'like'): 0.4, ('i', 'hate'): 0.2, ('i', 'love'): 0.4, ('like', 'math'): 0.5, ('like', 'physics'): 0.5, ('hate', 'war'): 1.0, ('love', 'schnitzel'): 0.5, ('love', 'science'): 0.5}