- First of all, I thought of drawing some conclusions from the given training dataset. One conclusion I drew was that, in a given word, guessing a vowel remains a good bet only until the vowels found so far make up about 45-48% of the word's length. All of these precomputations are detailed in the attached notebook.
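As a rough illustration of that precomputation (a minimal sketch; the filename and function name are mine, not from the notebook):

```python
VOWELS = set("aeiou")

def mean_vowel_ratio(dictionary_path):
    """Average fraction of a word's letters that are vowels."""
    ratios = []
    with open(dictionary_path) as f:
        for line in f:
            word = line.strip().lower()
            if word:
                ratios.append(sum(c in VOWELS for c in word) / len(word))
    return sum(ratios) / len(ratios)

# e.g. mean_vowel_ratio("words_250000_train.txt")
# per the observation above, this lands around 0.45-0.48
```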
- Apart from that, I decided to use an n-gram model with orders 1 (unigrams) through 5 (five-grams). I experimented with 6 (six-grams) as well, but it yielded only marginal improvement, so I settled on five as the cutoff. This aligns with intuition, since longer sequences tend to be concatenations of shorter ones.
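For concreteness, extracting every n-gram of order 1 through 5 from a word looks like this (a small sketch of the general technique, not my exact code):

```python
def ngrams(word, max_n=5):
    """All substrings of length 1..max_n (unigrams up to five-grams)."""
    return [word[i:i + n]
            for n in range(1, max_n + 1)
            for i in range(len(word) - n + 1)]

# ngrams("cat") -> ['c', 'a', 't', 'ca', 'at', 'cat']
```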
- The model is trained on a dictionary of approximately 250,000 words, from which the n-gram frequencies are determined. For example, the bigram frequencies are structured as:
  - word length (the n-gram frequencies depend on the length of the word)
  - first letter
  - second letter
  - frequency (how many letter1-letter2 sequences occur for that word length)
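A minimal sketch of that nested structure (the names here are mine):

```python
from collections import defaultdict

def build_bigram_table(words):
    """table[word_length][first_letter][second_letter] -> frequency."""
    table = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for word in words:
        for i in range(len(word) - 1):
            table[len(word)][word[i]][word[i + 1]] += 1
    return table

# build_bigram_table(["cat", "car"])[3]["c"]["a"] == 2
```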
I also used different weights for the different n-gram orders, as shown below:

```python
unigram_lambda = 0.10
bigram_lambda = 0.20
trigram_lambda = 0.30
fourgram_lambda = 0.40
fivegram_lambda = 0.45
```
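To show how these combine (a minimal sketch; the `ngram_probs` layout and the function name are my assumptions), each candidate letter's score is the weighted sum of the probabilities from each order:

```python
LAMBDAS = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.40, 5: 0.45}

def interpolated_score(letter, ngram_probs):
    """Weighted sum of per-order probabilities for one candidate letter.

    ngram_probs[n] maps letters to the probability estimated from the
    order-n counts; missing entries count as zero.
    """
    return sum(lam * ngram_probs.get(n, {}).get(letter, 0.0)
               for n, lam in LAMBDAS.items())
```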
My model also uses what I would call dynamic n-grams: for a 3-letter word, for instance, I calculate only the unigram and bigram probabilities. Some further optimizations I applied are (sketched in the code after this list):
- Removing inconsistent words at a benchmark of 3 tries, i.e. recalibrating the n-grams once 3 tries have been used; this improves results by decreasing the weight of the elements that were leading to incorrect guesses.
- Dynamically deciding the weight given to vowels while guessing letters. As the precomputed statistics showed, vowels deserve extra weight, but only up to a certain limit; incorporating this cap yields a good improvement.
- Utilising the EM algorithm to optimize the n-gram weights (unigram lambda, bigram lambda, etc.) for a further improvement.
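Sketches of the dynamic n-grams and the first two optimizations, under my own assumptions about the data layout (`pattern` is the masked word such as `"_a__e"`; the 0.46 cap and 1.2 boost are illustrative stand-ins for the tuned values):

```python
import re

VOWELS = set("aeiou")

def applicable_orders(word_length, max_n=5):
    """Dynamic n-grams: a 3-letter word uses only unigrams and bigrams."""
    return range(1, min(max_n, word_length - 1) + 1)

def consistent_words(words, pattern, wrong_letters):
    """Recalibration after 3 failed tries: keep only words that match the
    revealed pattern and avoid known-wrong letters, then rebuild the
    n-gram counts from this reduced set."""
    rx = re.compile("^" + pattern.replace("_", ".") + "$")
    return [w for w in words
            if len(w) == len(pattern)
            and rx.match(w)
            and not any(c in w for c in wrong_letters)]

def vowel_weight(pattern):
    """Extra weight for vowels, but only while the vowels already revealed
    stay under ~46% of the word length (per the precomputed statistic)."""
    found = sum(c in VOWELS for c in pattern)
    return 1.2 if found / len(pattern) < 0.46 else 1.0
```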
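And a sketch of the EM step for tuning the weights, in the classic deleted-interpolation style; the data layout is my assumption, and note that EM yields weights summing to 1, so the values listed earlier would be renormalized:

```python
def em_lambdas(heldout, init, iters=50):
    """EM for n-gram interpolation weights.

    heldout: list of dicts; each maps order n -> P_n(observed letter).
    init:    starting weights, e.g. {1: 0.10, ..., 5: 0.45}.
    Returns weights summing to 1 that increase held-out likelihood.
    """
    total_init = sum(init.values())
    lambdas = {n: w / total_init for n, w in init.items()}
    for _ in range(iters):
        expected = {n: 0.0 for n in lambdas}
        for probs in heldout:
            denom = sum(lambdas[n] * probs.get(n, 0.0) for n in lambdas)
            if denom == 0.0:
                continue
            for n in lambdas:
                # E-step: how much order n explains this observation
                expected[n] += lambdas[n] * probs.get(n, 0.0) / denom
        z = sum(expected.values())
        if z == 0.0:
            break
        # M-step: normalized expected counts become the new weights
        lambdas = {n: e / z for n, e in expected.items()}
    return lambdas
```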
By using the above model and applying these greedy, strategic optimizations, I achieved an overall success rate of 53.4% over 1,000 games played.