Predict Stock Prices with LSTM Networks
Stock price prediction is a challenging yet highly rewarding application of machine learning. Accurate forecasts can help investors make informed decisions, hedge risks, and optimize portfolios. In this tutorial, you'll learn how to build a Long Short-Term Memory (LSTM) model, a type of recurrent neural network (RNN), to predict stock prices using historical data from Yahoo Finance.
By the end of this guide, you'll be able to:
- Load and preprocess financial time-series data.
- Create sequences for LSTM training.
- Build, train, and evaluate an LSTM model using TensorFlow/Keras.
- Visualize predictions and interpret results.
Why Use LSTMs for Stock Price Prediction?
Traditional time-series models like ARIMA assume a linear relationship between past and future values and struggle with more complex patterns. LSTMs, by contrast, can capture long-term dependencies and non-linear trends in sequential data, which makes them a popular choice for financial forecasting.
We'll use Yahoo Finance stock data, which provides historical prices, volumes, and other features. This dataset suits the tutorial because:
- It’s publicly available and easy to access.
- It contains real-world volatility and trends.
- It includes multiple features (Open, High, Low, Close, Volume) for richer modeling.
Step 1: Install Required Libraries
Before we begin, ensure you have the following libraries installed:
pip install pandas numpy matplotlib yfinance tensorflow scikit-learn
- yfinance: To fetch stock data from Yahoo Finance.
- tensorflow: For building the LSTM model.
- matplotlib: For visualization.
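Before moving on, it can help to confirm that the packages are importable. The sketch below uses only the standard library; the helper name missing_packages is hypothetical, and note that scikit-learn is imported under the name sklearn.

```python
from importlib.util import find_spec

# Import names for the packages installed above.
# scikit-learn is imported as "sklearn", not "scikit-learn".
REQUIRED = ["pandas", "numpy", "matplotlib", "yfinance", "tensorflow", "sklearn"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

If the list is not empty, rerun the pip command above for the packages it names.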
Step 2: Load and Explore the Dataset
We'll fetch historical stock data for Apple Inc. (AAPL) using yfinance.
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
# Download AAPL stock data from 2010 to 2023
data = yf.download('AAPL', start='2010-01-01', end='2023-12-31')
# Display first 5 rows
print(data.head())
Output (exact values and column layout depend on your yfinance version and its price-adjustment settings):
                  Open        High         Low       Close   Adj Close     Volume
Date
2010-01-04  211.250000  214.379993  210.750000  214.010002  214.010002  123432400
2010-01-05  214.390000  214.899994  212.559998  213.419998  213.419998  150476200
...
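Depending on the installed yfinance version, the downloaded frame may carry two-level column labels (field, ticker) instead of the flat names shown above. The following is a minimal sketch, assuming that layout, of how the labels could be flattened with pandas. The helper flatten_columns is hypothetical and the demo frame is synthetic, not real market data.

```python
import pandas as pd

def flatten_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with single-level column names (Open, High, ...).

    If the columns are a MultiIndex such as (field, ticker), keep only the
    first level. Frames that already have flat columns come back unchanged.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        out.columns = out.columns.get_level_values(0)
    return out

# Synthetic stand-in for a single-ticker download (values are made up).
cols = pd.MultiIndex.from_product([["Close", "Volume"], ["AAPL"]])
demo = pd.DataFrame([[10.0, 100], [11.0, 120]], columns=cols)
flat = flatten_columns(demo)
print(list(flat.columns))
```

With a real download, calling data = flatten_columns(data) right after yf.download would leave data['Close'] as a plain one-dimensional column.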
Visualize the Closing Price
plt.figure(figsize=(12, 6))
plt.plot(data['Close'], label='AAPL Closing Price')
plt.title('AAPL Stock Price (2010-2023)')
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.legend()
plt.show()

Observation: The stock shows an upward trend with periods of volatility—ideal for testing our LSTM model.
Step 3: Preprocess the Data
Feature Selection
We'll use the Closing Price for prediction. Other features (Open, High, Low, Volume) can be added later for improved accuracy.
# Use only 'Close' price
dataset = data['Close'].values.reshape(-1, 1)
dataset = dataset.astype('float32')
Normalize the Data
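LSTMs train more reliably when inputs are on a small, consistent scale, so we scale prices relative to a 0-to-1 range. The scaler is fit on the first 80% of rows only, matching the training split in Step 5, so that test-period prices do not leak into the scaling; later prices can therefore scale above 1.
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
# Fit on the earliest 80% of rows, then apply the same scaling to all rows
n_fit = int(len(dataset) * 0.8)
scaler.fit(dataset[:n_fit])
dataset_scaled = scaler.transform(dataset)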
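To make the scaling concrete, here is a small NumPy sketch of the min-max formula, x_scaled = (x - min) / (max - min), together with its inverse. The function names are hypothetical and the prices are toy values; it illustrates the arithmetic rather than replacing MinMaxScaler.

```python
import numpy as np

def minmax_fit(values):
    """Return the (min, max) pair that defines the scaling range."""
    return float(np.min(values)), float(np.max(values))

def minmax_scale(values, lo, hi):
    """Map values onto [0, 1] relative to the fitted range [lo, hi]."""
    return (np.asarray(values, dtype="float64") - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    """Undo minmax_scale, returning values in the original units."""
    return np.asarray(scaled, dtype="float64") * (hi - lo) + lo

prices = np.array([100.0, 150.0, 200.0])   # toy prices, not real data
lo, hi = minmax_fit(prices)
scaled = minmax_scale(prices, lo, hi)      # 0.0, 0.5, 1.0
restored = minmax_inverse(scaled, lo, hi)  # back to 100, 150, 200
print(scaled, restored)
```

The inverse step is what lets the predictions in Step 8 be reported in dollars again.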
Step 4: Create Sequences for LSTM Training
LSTMs require input data in sequences (also called "windows"). We'll split the data into:
- X: Past 60 days of prices.
- y: Price on the 61st day.
import numpy as np
# Define sequence length
seq_length = 60
# Create X and y
X, y = [], []
for i in range(seq_length, len(dataset_scaled)):
    X.append(dataset_scaled[i-seq_length:i, 0])
    y.append(dataset_scaled[i, 0])
# Convert to numpy arrays
X = np.array(X)
y = np.array(y)
# Reshape X for LSTM input [samples, timesteps, features]
X = np.reshape(X, (X.shape[0], X.shape[1], 1))
Shape Check:
- X.shape: (samples, 60, 1)
- y.shape: (samples,)
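The windowing is easier to see on a tiny series. The sketch below applies the same idea with a window of 3 instead of 60; make_windows is a hypothetical helper and the input is a toy sequence.

```python
import numpy as np

def make_windows(series, seq_length):
    """Split a 1-D series into overlapping input windows and next-step targets."""
    series = np.asarray(series, dtype="float32")
    X = np.array([series[i - seq_length:i] for i in range(seq_length, len(series))])
    y = series[seq_length:]
    # Add a trailing feature axis: [samples, timesteps, features]
    return X.reshape(-1, seq_length, 1), y

toy = np.arange(6)                     # 0, 1, 2, 3, 4, 5
X_toy, y_toy = make_windows(toy, seq_length=3)
print(X_toy[:, :, 0])                  # rows: [0 1 2], [1 2 3], [2 3 4]
print(y_toy)                           # targets: 3, 4, 5
```

Each row of X_toy holds three consecutive values, and the matching entry of y_toy is the value that comes right after them.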
Step 5: Split into Training and Testing Sets
We'll use 80% of the data for training and 20% for testing.
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]
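The split above keeps the samples in time order, which matters for time series: shuffling would let the model train on windows that come after the ones it is tested on. One possible variation, sketched here with a hypothetical helper and toy arrays, also sets aside a validation slice between the training and test periods so the test set is only used once at the end. The 70/10/20 fractions are an assumption, not part of the tutorial.

```python
import numpy as np

def chronological_split(X, y, train_frac=0.7, val_frac=0.1):
    """Split arrays in time order into train, validation, and test parts."""
    n = len(X)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return ((X[:train_end], y[:train_end]),
            (X[train_end:val_end], y[train_end:val_end]),
            (X[val_end:], y[val_end:]))

X_demo = np.arange(40).reshape(20, 2)   # 20 toy samples in time order
y_demo = np.arange(20)
(train_X, train_y), (val_X, val_y), (test_X, test_y) = chronological_split(X_demo, y_demo)
print(len(train_X), len(val_X), len(test_X))
```

With this layout, the validation pair would be passed to validation_data during training and the test pair reserved for the final evaluation.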
Step 6: Build the LSTM Model
We'll use TensorFlow/Keras to create a stacked LSTM model.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(X_train.shape[1], 1)),
    Dropout(0.2),
    LSTM(50, return_sequences=False),
    Dropout(0.2),
    Dense(25),
    Dense(1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
Model Summary:
- 2 LSTM layers with 50 units each.
- Dropout layers (20%) to prevent overfitting.
- Dense layers for final prediction.
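To get a feel for the model's size, the parameter counts can be worked out by hand. This sketch assumes the standard LSTM formulation with four gates and bias terms, which is what the Keras LSTM layer uses by default; the helper names are hypothetical. Dropout layers add no trainable parameters.

```python
def lstm_params(units, input_dim):
    """Trainable parameters in a standard LSTM layer.

    Four gates, each with input weights, recurrent weights, and a bias.
    """
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """Trainable parameters in a fully connected layer (weights plus biases)."""
    return input_dim * units + units

layer_counts = {
    "lstm_1": lstm_params(50, 1),      # first LSTM sees 1 feature per timestep
    "lstm_2": lstm_params(50, 50),     # second LSTM sees 50 values per timestep
    "dense_1": dense_params(25, 50),
    "dense_2": dense_params(1, 25),
}
total = sum(layer_counts.values())
print(layer_counts, total)             # total is 31901
```

Comparing this total with the figure reported by model.summary() is a quick check that the layers are wired as intended.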
Step 7: Train the Model
history = model.fit(
    X_train, y_train,
    batch_size=32,
    epochs=20,
    validation_data=(X_test, y_test),
    verbose=1
)
Training Visualization
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

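Insight: A loss that falls over the epochs indicates the model is learning. If validation loss flattens or rises while training loss keeps falling, the model is starting to overfit; the dropout layers are there to limit this. Because the validation data here is the same test set used in Step 8, the final RMSE is not a fully independent estimate.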
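Besides reading the plot, the epoch with the lowest validation loss can be picked out directly from the recorded history. The helper below is a hypothetical sketch and the loss values are made up for illustration; with a real run you would pass history.history instead.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return (epoch_number, value) for the lowest value of `key`, 1-indexed."""
    values = history_dict[key]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]

# Made-up loss values, for illustration only.
toy_history = {
    "loss":     [0.050, 0.020, 0.012, 0.009, 0.007],
    "val_loss": [0.060, 0.030, 0.025, 0.028, 0.031],
}
epoch, value = best_epoch(toy_history)
print(f"Lowest validation loss {value} at epoch {epoch}")
```

In this toy history the training loss keeps falling while the validation loss bottoms out at epoch 3, which is the pattern that signals overfitting in later epochs.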
Step 8: Evaluate the Model
Make Predictions
predictions = model.predict(X_test)
predictions = scaler.inverse_transform(predictions) # Rescale to original prices
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))
Calculate RMSE (Root Mean Squared Error)
from sklearn.metrics import mean_squared_error
rmse = np.sqrt(mean_squared_error(y_test_actual, predictions))
print(f"RMSE: {rmse:.2f}")
Example Output:
RMSE: 12.45
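Interpretation: RMSE is in the same units as the price, so a value of about $12.45 means a typical prediction misses the actual price by roughly that much, with large misses weighted more heavily than small ones. Whether that is acceptable depends on the price level and on how a simple baseline performs on the same period.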
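An RMSE figure is easier to judge next to a naive reference. A common one for daily prices is the persistence forecast, which predicts each day's price as the previous day's price. The sketch below uses hypothetical helper names and made-up numbers to show the comparison.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally sized arrays."""
    actual = np.asarray(actual, dtype="float64").ravel()
    predicted = np.asarray(predicted, dtype="float64").ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def persistence_rmse(actual):
    """RMSE of predicting each price as the price one step earlier."""
    actual = np.asarray(actual, dtype="float64").ravel()
    return rmse(actual[1:], actual[:-1])

toy_actual = np.array([100.0, 102.0, 101.0, 105.0])   # made-up prices
toy_pred = np.array([99.0, 100.0, 103.0, 104.0])      # made-up predictions
print(rmse(toy_actual, toy_pred), persistence_rmse(toy_actual))
```

For the real model, one way to compare like with like is rmse(y_test_actual[1:], predictions[1:]) against persistence_rmse(y_test_actual), since the persistence forecast has no value for the first test day. A model that does not beat this baseline is adding little beyond yesterday's price.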
Step 9: Visualize Predictions vs. Actual Prices
plt.figure(figsize=(12, 6))
plt.plot(y_test_actual, label='Actual Price')
plt.plot(predictions, label='Predicted Price')
plt.title('AAPL Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Price ($)')
plt.legend()
plt.show()

Observation: The model captures the overall trend but struggles with sudden spikes—a common challenge in stock prediction.
Step 10: Improve the Model (Optional)
To enhance accuracy, consider:
- Adding more features (e.g., Volume, Moving Averages).
- Increasing sequence length (e.g., 90 or 120 days).
- Hyperparameter tuning (e.g., learning rate, batch size).
- Using Bidirectional LSTMs for better context understanding.
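As an example of the first suggestion, the sketch below adds a trailing moving average to a price frame and stacks several columns into one feature array. The helper name, the window of 3, and the toy values are all assumptions chosen to keep the example small.

```python
import numpy as np
import pandas as pd

def add_moving_average(frame, column="Close", window=20):
    """Return a copy with a trailing moving-average column, dropping warm-up rows."""
    out = frame.copy()
    out[f"{column}_ma{window}"] = out[column].rolling(window).mean()
    # The first (window - 1) rows have no full window, so they are removed.
    return out.dropna()

toy = pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0, 5.0],
                    "Volume": [10, 20, 30, 40, 50]})   # made-up values
features = add_moving_average(toy, window=3)
feature_array = features[["Close", "Volume", "Close_ma3"]].to_numpy(dtype="float32")
print(feature_array.shape)   # (3, 3): three rows left, three features each
```

With several features, the last dimension of the LSTM input becomes the number of columns instead of 1, and each column needs its own scaling because prices and volumes live on very different ranges.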
Conclusion
In this tutorial, you learned how to:
✅ Fetch and preprocess stock data from Yahoo Finance.
✅ Create sequences for LSTM training.
✅ Build and train an LSTM model using TensorFlow/Keras.
✅ Evaluate predictions using RMSE and visualizations.
While LSTMs are powerful, stock markets are influenced by unpredictable factors (news, politics, etc.), so no model can guarantee 100% accuracy. Use this as a foundation and experiment with advanced techniques like Transformer models or ensemble methods for better results.
Next Steps
- Try other stocks (e.g., Tesla, Google) and compare results.
- Incorporate sentiment analysis from news headlines.
- Deploy the model as a web app using Flask/Django.
Happy predicting! 🚀