Feature Request: Add LSTM Algorithm to Neural Network Algorithms
LEVIII007 opened this issue · 2 comments
Feature description
I would like to propose adding an LSTM (Long Short-Term Memory) algorithm to the existing neural network algorithms in the repository. LSTMs are a type of recurrent neural network (RNN) that excel in handling sequential and time-series data, making them particularly valuable for tasks such as language modeling, text generation, and time-series forecasting.
Proposed Improvements:
- Implementation of LSTM: Develop a comprehensive LSTM class that includes essential functionalities such as (a rough cell-level sketch is given after this list):
  - Forward propagation through LSTM layers.
  - Backpropagation through time (BPTT) for training.
  - Methods for saving and loading the model.
  - Support for various activation functions (sigmoid, tanh, softmax).
- Example Usage: Include example usage code demonstrating how to train the LSTM on a dataset, such as predicting the next character in Shakespeare's text.
- Documentation: Provide detailed documentation on the LSTM algorithm's implementation, explaining its structure, hyperparameters, and training process.
- Unit Tests: Implement unit tests to ensure the correctness and robustness of the LSTM functionality.
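For reference, here is a minimal sketch of what the forward step of a single LSTM cell could look like. The class name `LSTMCellSketch`, the weight names, and the method signature are illustrative assumptions only, not the final API of the PR:

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Logistic sigmoid, used by the forget/input/output gates."""
    return 1.0 / (1.0 + np.exp(-x))


class LSTMCellSketch:
    """Illustrative single LSTM cell (hypothetical names, not the PR's final API)."""

    def __init__(self, input_size: int, hidden_size: int) -> None:
        concat = input_size + hidden_size
        # Gate parameters acting on the concatenated [h_prev; x_t] vector.
        self.w_f = np.random.randn(hidden_size, concat) * 0.01  # forget gate
        self.w_i = np.random.randn(hidden_size, concat) * 0.01  # input gate
        self.w_c = np.random.randn(hidden_size, concat) * 0.01  # candidate cell state
        self.w_o = np.random.randn(hidden_size, concat) * 0.01  # output gate
        self.b_f = np.zeros((hidden_size, 1))
        self.b_i = np.zeros((hidden_size, 1))
        self.b_c = np.zeros((hidden_size, 1))
        self.b_o = np.zeros((hidden_size, 1))

    def forward(self, x_t: np.ndarray, h_prev: np.ndarray, c_prev: np.ndarray):
        """One forward step; returns the new hidden and cell states."""
        z = np.vstack((h_prev, x_t))              # concatenate previous hidden state and input
        f_t = sigmoid(self.w_f @ z + self.b_f)    # forget gate
        i_t = sigmoid(self.w_i @ z + self.b_i)    # input gate
        c_hat = np.tanh(self.w_c @ z + self.b_c)  # candidate cell state
        c_t = f_t * c_prev + i_t * c_hat          # new cell state
        o_t = sigmoid(self.w_o @ z + self.b_o)    # output gate
        h_t = o_t * np.tanh(c_t)                  # new hidden state
        return h_t, c_t
```

The full class described above would additionally need a softmax output layer over `h_t`, the BPTT backward pass, and the save/load helpers, which this sketch omits.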
Rationale:
Adding LSTM capabilities will enhance the versatility of the neural network algorithms available in this repository, allowing users to tackle a wider range of problems involving sequential data. Given the growing importance of time-series analysis and natural language processing, this addition would significantly benefit the community.
@LEVIII007 Should I create a separate file (.py) for the LSTM? Also, if I resolve this, could you update that PR with the hacktoberfest-accepted label (if that is possible)?
I have created a pull request addressing this issue. You can view it here: Link to PR #12082
text = "To be, or not to be, that is the question"
chars = list(set(text))
char_to_idx = {char: i for i, char in enumerate(chars)}
idx_to_char = {i: char for char, i in char_to_idx.items()}
vocab_size = len(chars)
hidden_size = 100
seq_length = 25
learning_rate = 0.001
Model Initialization
lstm = LSTM(input_size=vocab_size, hidden_size=hidden_size, output_size=vocab_size)
for epoch in range(1000):
h_prev = np.zeros((hidden_size, 1))
c_prev = np.zeros((hidden_size, 1))
loss = 0
for i in range(0, len(text) - seq_length, seq_length):
inputs = [char_to_idx[ch] for ch in text[i:i + seq_length]]
targets = [char_to_idx[ch] for ch in text[i + 1:i + seq_length + 1]]
x_seq = np.zeros((vocab_size, seq_length))
for t, idx in enumerate(inputs):
x_seq[idx, t] = 1
caches = []
for t in range(seq_length):
yt, h_prev, c_prev, cache = lstm.forward(x_seq[:, t:t + 1], h_prev, c_prev)
loss += -np.log(yt[targets[t], 0])
caches.append(cache)
dh_next = np.zeros_like(h_prev)
dc_next = np.zeros_like(c_prev)
for t in reversed(range(seq_length)):
dy = np.copy(yt)
dy[targets[t]] -= 1
grads = lstm.backward(dy, dh_next, dc_next, caches[t])
print(f"Epoch {epoch}, Loss: {loss}")
This is the code relevant to that PR; you can try it.
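As a possible follow-up usage example, and assuming `lstm.forward` keeps the same signature as in the training loop above, a small (hypothetical) sampling loop could then generate text from the trained model:

```python
# Hypothetical sampling sketch: greedily predict the next few characters,
# reusing the forward signature shown in the training loop above.
h = np.zeros((hidden_size, 1))
c = np.zeros((hidden_size, 1))
seed_char = "T"
idx = char_to_idx[seed_char]
generated = seed_char

for _ in range(20):
    x = np.zeros((vocab_size, 1))
    x[idx, 0] = 1
    y, h, c, _ = lstm.forward(x, h, c)  # y: probability distribution over characters
    idx = int(np.argmax(y))             # greedy choice; sampling from y also works
    generated += idx_to_char[idx]

print(generated)
```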