TheAlgorithms/Python

Feature Request: Add LSTM Algorithm to Neural Network Algorithms

LEVIII007 opened this issue · 2 comments

Feature description

I would like to propose adding an LSTM (Long Short-Term Memory) algorithm to the existing neural network algorithms in the repository. LSTMs are a type of recurrent neural network (RNN) that excel in handling sequential and time-series data, making them particularly valuable for tasks such as language modeling, text generation, and time-series forecasting.

Proposed Improvements:

  1. Implementation of LSTM: Develop a comprehensive LSTM class that includes essential functionalities such as the following (a forward-step sketch appears after this list):

    • Forward propagation through LSTM layers.
    • Backpropagation through time (BPTT) for training.
    • Methods for saving and loading the model.
    • Support for various activation functions (sigmoid, tanh, softmax).
  2. Example Usage: Include example usage code demonstrating how to train the LSTM on a dataset, such as predicting the next character in Shakespeare's text.

  3. Documentation: Provide detailed documentation on the LSTM algorithm's implementation, explaining its structure, hyperparameters, and training process.

  4. Unit Tests: Implement unit tests to ensure the correctness and robustness of the LSTM functionality (a minimal test sketch appears at the end of this thread).
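
For item 1, here is a minimal sketch of a single LSTM forward step (forward propagation with the sigmoid, tanh, and softmax activations mentioned above). It assumes NumPy, column-vector states, and illustrative parameter names — W_f, b_f, W_y, and so on are placeholders, not names taken from any existing PR:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum(axis=0)

def lstm_cell_forward(x_t, h_prev, c_prev, params):
    """One LSTM step for a single column-vector input x_t.

    params holds weight matrices W_* of shape (hidden, hidden + input),
    biases b_* of shape (hidden, 1), and W_y/b_y mapping h_t to the output.
    """
    z = np.vstack((h_prev, x_t))                        # concatenate hidden state and input
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])    # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])    # input gate
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])    # output gate
    g_t = np.tanh(params["W_c"] @ z + params["b_c"])    # candidate cell state
    c_t = f_t * c_prev + i_t * g_t                      # new cell state
    h_t = o_t * np.tanh(c_t)                            # new hidden state
    y_t = softmax(params["W_y"] @ h_t + params["b_y"])  # output distribution
    return y_t, h_t, c_t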

Rationale:

Adding LSTM capabilities will enhance the versatility of the neural network algorithms available in this repository, allowing users to tackle a wider range of problems involving sequential data. Given the growing importance of time-series analysis and natural language processing, this addition would significantly benefit the community.

@LEVIII007 Should I create a separate file (.py) for the LSTM? Also, if I resolve this, could you tag the PR with hacktoberfest-accepted (if that is possible)?

I have created a pull request addressing this issue. You can view it here: Link to PR #12082

text = "To be, or not to be, that is the question"
chars = list(set(text))
char_to_idx = {char: i for i, char in enumerate(chars)}
idx_to_char = {i: char for char, i in char_to_idx.items()}
vocab_size = len(chars)

hidden_size = 100
seq_length = 25
learning_rate = 0.001

# Model initialization
lstm = LSTM(input_size=vocab_size, hidden_size=hidden_size, output_size=vocab_size)

for epoch in range(1000):
    h_prev = np.zeros((hidden_size, 1))
    c_prev = np.zeros((hidden_size, 1))
    loss = 0

    for i in range(0, len(text) - seq_length, seq_length):
        inputs = [char_to_idx[ch] for ch in text[i:i + seq_length]]
        targets = [char_to_idx[ch] for ch in text[i + 1:i + seq_length + 1]]

        # One-hot encode the input sequence, one column per time step
        x_seq = np.zeros((vocab_size, seq_length))
        for t, idx in enumerate(inputs):
            x_seq[idx, t] = 1

        # Forward pass: keep each step's output and cache for BPTT
        ys, caches = [], []
        for t in range(seq_length):
            yt, h_prev, c_prev, cache = lstm.forward(x_seq[:, t:t + 1], h_prev, c_prev)
            loss += -np.log(yt[targets[t], 0])  # cross-entropy loss
            ys.append(yt)
            caches.append(cache)

        # Backward pass (BPTT), from the last time step to the first
        dh_next = np.zeros_like(h_prev)
        dc_next = np.zeros_like(c_prev)
        for t in reversed(range(seq_length)):
            dy = np.copy(ys[t])   # use the output of step t, not only the last step
            dy[targets[t]] -= 1   # gradient of softmax + cross-entropy
            grads = lstm.backward(dy, dh_next, dc_next, caches[t])
            # dh_next / dc_next should be taken from grads here, and the
            # accumulated gradients applied with learning_rate; the exact
            # names depend on what lstm.backward returns.

    print(f"Epoch {epoch}, Loss: {loss}")

This is the code relevant to that; you can try it.
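
For item 4, a minimal unit-test sketch. It assumes an LSTM class exposing the forward signature used in the snippet above, so the exact names may need adjusting against PR #12082:

import numpy as np

def test_forward_shapes_and_softmax():
    vocab_size, hidden_size = 5, 8
    lstm = LSTM(input_size=vocab_size, hidden_size=hidden_size, output_size=vocab_size)

    x = np.zeros((vocab_size, 1))
    x[0, 0] = 1  # one-hot input for a single character
    h0 = np.zeros((hidden_size, 1))
    c0 = np.zeros((hidden_size, 1))

    y, h, c, cache = lstm.forward(x, h0, c0)

    assert y.shape == (vocab_size, 1)       # output distribution over the vocabulary
    assert h.shape == (hidden_size, 1)      # hidden state keeps its shape
    assert c.shape == (hidden_size, 1)      # cell state keeps its shape
    assert np.isclose(y.sum(), 1.0)         # softmax output sums to 1
    assert np.all((y >= 0) & (y <= 1))      # valid probabilities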