Stock Price Predictor

A full-stack educational machine-learning project for predicting the next trading day's stock closing price. The project trains regression models on historical market data, exposes predictions through a Flask API, and provides a simple browser-based client for entering ticker symbols and viewing predicted price movement.

Disclaimer
This project is intended for learning, experimentation, and portfolio demonstration only. It is not financial advice and should not be used as the sole basis for investment decisions.

Overview

The system predicts the next-day closing price of a stock using historical OHLCV data, S&P 500 market movement, and technical indicators. The default training flow creates one pooled global model across selected tickers and saves it as GLOBAL.pkl. The web client then calls the API with global=true and uses that global model for predictions.

The project also supports training separate per-ticker models such as AAPL.pkl, MSFT.pkl, and TSLA.pkl.

Key Features

Next trading day stock closing-price prediction.
Historical market-data download with yfinance.
S&P 500 daily return as a market-context feature.
Technical indicators including moving averages, RSI, Bollinger Bands, volume ratios, spreads, and short-term returns.
Default Quantile Gradient Boosting model with prediction ranges.
Optional MLP neural-network regressor.
Global pooled model across many tickers, with compact ticker identity hash features.
Optional per-ticker model training.
Flask API with CORS support.
Lightweight static HTML/CSS/JavaScript client.
Model evaluation using MAE, RMSE, and Pinball Loss for quantile models.

Tech Stack

Layer	Technologies
Machine Learning	Python, scikit-learn, NumPy, pandas
Market Data	yfinance
Backend API	Flask, Flask-CORS
Frontend	HTML, CSS, JavaScript
Model Storage	Pickle files (`.pkl`)
Optional Tuning	Optuna

Project Structure

predictStockMachineLearning-main/
├── Client/
│   ├── CSS/
│   │   └── style.css
│   ├── JS/
│   │   └── script.js
│   └── index.html
├── ModelTraining/
│   ├── features.py
│   ├── model.py
│   ├── predict.py
│   └── train.py
├── Server/
│   ├── requirements.txt
│   └── server.py
├── requirements.txt
├── .gitignore
└── README.md

Main Components

Path	Purpose
`ModelTraining/features.py`	Builds the feature set used during both training and prediction.
`ModelTraining/model.py`	Contains model wrappers and metric functions.
`ModelTraining/train.py`	Trains global or per-ticker models and saves them as `.pkl` files.
`ModelTraining/predict.py`	Loads trained models and generates next-day predictions.
`Server/server.py`	Exposes the prediction API on `localhost:8080`.
`Client/index.html`	Browser UI for submitting ticker symbols.
`Client/JS/script.js`	Calls the Flask API and renders prediction results.

How It Works

1. Historical stock data is downloaded from Yahoo Finance through yfinance.
2. S&P 500 historical data is downloaded and converted into daily returns.
3. Technical indicators are calculated from each ticker's historical price and volume data.
4. The training script creates supervised examples where today's features are mapped to tomorrow's closing price or tomorrow's return.
5. A model is trained and saved under ModelTraining/models/.
6. The Flask server loads the trained model and exposes prediction endpoints.
7. The web client sends ticker requests to the API and displays current price, predicted price, expected change, and model metrics.
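
The step that maps today's features to tomorrow's target can be sketched as follows. This is a minimal illustration, not the code in `ModelTraining/train.py`: the function name `make_supervised` and the `Target` column are hypothetical, and the return is assumed to be a simple (not logarithmic) return.

```python
import pandas as pd

def make_supervised(df: pd.DataFrame, target: str = "price") -> pd.DataFrame:
    """Pair each day's row with the next trading day's close or return."""
    out = df.copy()
    next_close = out["Close"].shift(-1)  # tomorrow's close, aligned to today's row
    if target == "return":
        out["Target"] = next_close / out["Close"] - 1.0  # assumed: simple return
    else:
        out["Target"] = next_close
    # The most recent row has no "tomorrow" yet, so it cannot be a training example.
    return out.dropna(subset=["Target"])

prices = pd.DataFrame({"Close": [100.0, 102.0, 101.0, 104.0]})
print(make_supervised(prices, target="price"))
print(make_supervised(prices, target="return"))
```

The dropped final row is the one a prediction would be made from: it has features but no known target.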

Getting Started

Prerequisites

Python 3.10 or newer recommended.
Internet connection for downloading market data.
A modern browser for the frontend client.

1. Clone the Repository

git clone <your-repository-url>
cd predictStockMachineLearning-main

2. Create and Activate a Virtual Environment

On macOS/Linux:

python -m venv .venv
source .venv/bin/activate

On Windows PowerShell:

python -m venv .venv
.venv\Scripts\Activate.ps1

3. Install Dependencies

pip install -r requirements.txt

4. Train a Demo Global Model

This trains one global model on a small demo set: AAPL, MSFT, GOOGL, and TSLA. The --target return flag makes the model predict tomorrow's return instead of tomorrow's price.

python ModelTraining/train.py --demo --target return

The trained model is saved to:

ModelTraining/models/GLOBAL.pkl

5. Start the API Server

python Server/server.py

The API will run at:

http://localhost:8080

6. Open the Web Client

Open this file directly in your browser:

Client/index.html

Enter a ticker symbol such as AAPL, MSFT, GOOGL, or TSLA and click Predict.

Training Models

Train the Default Global Model

By default, if no --tickers or --demo flag is provided, the script attempts to train on the full S&P 500 list.

python ModelTraining/train.py

For a faster demo run:

python ModelTraining/train.py --demo

Train a Global Model on Specific Tickers

python ModelTraining/train.py --tickers AAPL MSFT NVDA AMZN --global-model

Train Separate Per-Ticker Models

python ModelTraining/train.py --tickers AAPL MSFT TSLA --per-ticker

This creates files such as:

ModelTraining/models/AAPL.pkl
ModelTraining/models/MSFT.pkl
ModelTraining/models/TSLA.pkl

Train on Tomorrow's Return Instead of Tomorrow's Price

python ModelTraining/train.py --demo --target return

Training on returns can be more stable than predicting absolute prices directly, because returns sit on a comparable scale across tickers and over time while price levels do not. During prediction, the predicted return is converted back into an estimated price.
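
The conversion from a predicted return back to a price is a one-line calculation. The sketch below assumes the target is a simple return; the helper name `return_to_price` is hypothetical and is not taken from `ModelTraining/predict.py`.

```python
def return_to_price(last_close: float, predicted_return: float) -> float:
    """Convert a predicted simple return into an estimated next-day close."""
    return last_close * (1.0 + predicted_return)

# A predicted return of +1% on a 200.00 close gives roughly 202.00.
print(round(return_to_price(200.0, 0.01), 2))
```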

Use the MLP Neural Network Model

python ModelTraining/train.py --demo --model mlp

Custom hidden layers can be passed as a comma-separated list:

python ModelTraining/train.py --demo --model mlp --mlp-hidden 64,32 --mlp-max-iter 1200

Use Walk-Forward Validation

python ModelTraining/train.py --tickers AAPL MSFT --per-ticker --walk-forward
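
Walk-forward validation evaluates a model only on data that comes after its training window, which avoids leaking future prices into training. The sketch below shows one common variant with an expanding training window. It is an assumption for illustration: this README does not say whether `train.py` uses an expanding or a rolling window, and `walk_forward_splits` is a hypothetical helper.

```python
from typing import Iterator

def walk_forward_splits(
    n_samples: int, n_folds: int = 3
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    fold = n_samples // (n_folds + 1)
    if fold < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(1, n_folds + 1):
        start = k * fold
        # The final fold absorbs any leftover rows at the end of the series.
        stop = n_samples if k == n_folds else start + fold
        yield list(range(start)), list(range(start, stop))

for train_idx, test_idx in walk_forward_splits(10, n_folds=4):
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Every test index is later than every training index in the same fold, so each fold mimics predicting days the model has not seen.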

Tune Gradient Boosting Hyperparameters with Optuna

python ModelTraining/train.py --demo --optuna-trials 25

Optuna is included in the root requirements.txt. Hyperparameter tuning is currently supported for the Quantile Gradient Boosting model.

Running the API Server

Start the server from the project root:

python Server/server.py

The server exposes:

GET http://localhost:8080/health
GET http://localhost:8080/stock?ticker=AAPL&global=true

The server loads models from:

ModelTraining/models/

Using the Web Client

The frontend is a static client located in Client/index.html. It sends requests to:

http://localhost:8080/stock?ticker=<TICKER>&global=true

Because the client uses global=true, make sure ModelTraining/models/GLOBAL.pkl exists before using the UI.

API Reference

Health Check

GET /health

Example response:

{
  "status": "ok"
}

Predict Stock Price

GET /stock?ticker=AAPL&global=true

Query parameters:

Parameter	Required	Description
`ticker`	Yes	Stock ticker symbol, for example `AAPL`.
`global`	No	Use the global model when set to `true`, `1`, `yes`, or `y`. If omitted, the server attempts to load a per-ticker model.
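
The accepted values for `global` suggest a small truthy-string check on the server side. The sketch below is one way to write it, not the code in `Server/server.py`: the function name is hypothetical, and treating the value as case-insensitive is an assumption.

```python
from typing import Optional

TRUTHY = {"true", "1", "yes", "y"}

def use_global_model(raw: Optional[str]) -> bool:
    """Return True when the `global` query value asks for the pooled model."""
    if raw is None:  # parameter omitted: fall back to a per-ticker model
        return False
    return raw.strip().lower() in TRUTHY  # assumed: case-insensitive match

for value in ("true", "1", "Y", "no", None):
    print(repr(value), use_global_model(value))
```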

Example response:

{
  "ticker": "AAPL",
  "last_close": 195.64,
  "last_date": "2026-06-08",
  "prediction": 197.21,
  "range_low": 192.10,
  "range_high": 201.45,
  "change": 1.57,
  "change_pct": 0.80,
  "mae": 3.42,
  "rmse": 4.91
}

Response fields:

Field	Description
`ticker`	Normalized ticker symbol.
`last_close`	Latest available closing price.
`last_date`	Date of the latest available market data.
`prediction`	Predicted next-day closing price.
`range_low`	Lower quantile prediction, when available.
`range_high`	Upper quantile prediction, when available.
`change`	Difference between prediction and latest close.
`change_pct`	Percentage change between prediction and latest close.
`mae`	Mean Absolute Error measured during validation.
`rmse`	Root Mean Squared Error measured during validation.
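
The `change` and `change_pct` fields follow directly from `last_close` and `prediction`. The sketch below reproduces the numbers in the example response; rounding to two decimals is an assumption based on that example, and `summarize_change` is a hypothetical helper.

```python
def summarize_change(last_close: float, prediction: float) -> tuple[float, float]:
    """Return (change, change_pct) of the prediction relative to the last close."""
    change = prediction - last_close
    change_pct = change / last_close * 100.0
    return round(change, 2), round(change_pct, 2)  # assumed: two-decimal rounding

# Values from the example response above.
print(summarize_change(195.64, 197.21))  # (1.57, 0.8)
```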

Modeling Details

Default Model

The default model is a Quantile Gradient Boosting regressor. It trains separate models for multiple quantiles, usually:

0.1, 0.5, 0.9

The median quantile (0.5) is used as the main prediction. The lower and upper quantiles provide an estimated prediction range.
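
A minimal sketch of this idea with scikit-learn fits one `GradientBoostingRegressor` per quantile, using `loss="quantile"` and `alpha` set to the quantile. The wrapper functions, hyperparameters, and synthetic data are assumptions for illustration; the project's own wrapper in `ModelTraining/model.py` may differ.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

QUANTILES = (0.1, 0.5, 0.9)

def fit_quantile_models(X, y, quantiles=QUANTILES, **params):
    """Fit one gradient-boosting model per quantile and return them by quantile."""
    models = {}
    for q in quantiles:
        model = GradientBoostingRegressor(
            loss="quantile", alpha=q, random_state=0, **params
        )
        models[q] = model.fit(X, y)
    return models

def predict_quantiles(models, X):
    """Return a dict mapping each quantile to its predictions for X."""
    return {q: m.predict(X) for q, m in models.items()}

# Synthetic data, used only to show the calls.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=200)

models = fit_quantile_models(X, y, n_estimators=50, max_depth=2)
preds = predict_quantiles(models, X[:5])
print("low   :", preds[0.1])
print("median:", preds[0.5])
print("high  :", preds[0.9])
```

Because each quantile is fitted independently, nothing forces the lower prediction to stay below the upper one for every row, so a consumer of the range may want to sort the three values.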

Optional Model

The project also includes a simple MLP regressor based on scikit-learn's MLPRegressor. It uses feature scaling and supports configurable hidden layers.
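
One way to combine feature scaling with `MLPRegressor` is a scikit-learn pipeline, as sketched below. The helper names are hypothetical, and the defaults mirror the `--mlp-hidden 64,32 --mlp-max-iter 1200` example from the training section rather than any confirmed project default.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def parse_hidden_layers(spec: str) -> tuple[int, ...]:
    """Turn a comma-separated string such as '64,32' into (64, 32)."""
    return tuple(int(part) for part in spec.split(",") if part.strip())

def build_mlp(hidden: str = "64,32", max_iter: int = 1200):
    """Build a scaler + MLP pipeline so inputs are standardized before training."""
    return make_pipeline(
        StandardScaler(),
        MLPRegressor(
            hidden_layer_sizes=parse_hidden_layers(hidden),
            max_iter=max_iter,
            random_state=0,
        ),
    )

model = build_mlp()
print(model)
```

Putting the scaler inside the pipeline means the same scaling learned during `fit` is applied automatically at prediction time.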

Features

The default feature set includes:

Close
SP500_Return
SMA_5, SMA_20, SMA_50
EMA_5, EMA_20, EMA_50
RSI_14
BB_Upper_20, BB_Lower_20
Volume, Volume_MA_20, Volume_Ratio
High_Low_Spread
Return_1d, Return_3d, Return_5d
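
Several of these columns can be computed with a few pandas rolling operations. The sketch below covers a subset and makes assumptions that may not match `ModelTraining/features.py`: RSI uses simple rolling averages of gains and losses (not Wilder's smoothing), Bollinger Bands use two standard deviations, and the function name `add_indicators` is hypothetical.

```python
import numpy as np
import pandas as pd

def add_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Add a subset of the listed indicators to a frame with Close and Volume."""
    out = df.copy()
    close = out["Close"]

    out["SMA_5"] = close.rolling(5).mean()
    out["EMA_5"] = close.ewm(span=5, adjust=False).mean()

    # RSI from simple rolling means of gains and losses (assumed variant).
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta).clip(lower=0).rolling(14).mean()
    out["RSI_14"] = 100 - 100 / (1 + gain / loss)

    # Bollinger Bands: 20-day mean plus/minus two standard deviations (assumed).
    sma20 = close.rolling(20).mean()
    std20 = close.rolling(20).std()
    out["BB_Upper_20"] = sma20 + 2 * std20
    out["BB_Lower_20"] = sma20 - 2 * std20

    out["Volume_MA_20"] = out["Volume"].rolling(20).mean()
    out["Volume_Ratio"] = out["Volume"] / out["Volume_MA_20"]
    out["Return_1d"] = close.pct_change(1)
    return out

# Synthetic prices and volumes, used only to show the calls.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Close": 100 + rng.normal(size=60).cumsum(),
    "Volume": rng.integers(1_000, 2_000, size=60).astype(float),
})
features = add_indicators(demo)
print(features.tail(3))
```

Rolling windows leave NaN values in the first rows, so those rows need to be dropped before training.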

For global models, additional ticker hash features are added by default so that one pooled model can learn ticker-specific patterns without creating one model file per stock.
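
A ticker hash feature can be as simple as a few stable numbers derived from the symbol. The sketch below is a hypothetical encoding, shown only to illustrate the idea: the function name, the use of SHA-256, and the choice of four features are all assumptions, and the project may encode tickers differently.

```python
import hashlib

def ticker_hash_features(ticker: str, n_features: int = 4) -> list[float]:
    """Map a ticker symbol to a short, deterministic vector of values in [0, 1]."""
    normalized = ticker.strip().upper()
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    # Use the first n_features bytes of the digest, scaled from 0..255 to 0..1.
    return [digest[i] / 255.0 for i in range(n_features)]

print(ticker_hash_features("AAPL"))
print(ticker_hash_features("MSFT"))
```

Unlike Python's built-in `hash()`, a cryptographic digest gives the same values across processes, which matters because training and prediction run separately.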

Metrics

The project reports:

Metric	Meaning
MAE	Average absolute prediction error in price units.
RMSE	Square root of the mean squared error; penalizes larger errors more heavily.
Pinball Loss	Quantile-regression loss used for evaluating quantile predictions.
Baseline MAE/RMSE	Naive baseline that predicts tomorrow's close as today's close.
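
The metrics in the table can be written in a few lines of NumPy. This is a sketch of the standard definitions, not the implementation in `ModelTraining/model.py`; the function names are hypothetical.

```python
import numpy as np

def mae(y_true, y_pred) -> float:
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred) -> float:
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pinball_loss(y_true, y_pred, q: float) -> float:
    """Quantile loss: under-prediction is weighted by q, over-prediction by 1 - q."""
    diff = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

def naive_baseline_mae(closes) -> float:
    """MAE of predicting tomorrow's close as today's close."""
    closes = np.asarray(closes, float)
    return mae(closes[1:], closes[:-1])

actual = [100.0, 102.0, 101.0]
predicted = [101.0, 101.0, 103.0]
print("MAE    :", mae(actual, predicted))
print("RMSE   :", rmse(actual, predicted))
print("Pinball:", pinball_loss(actual, predicted, q=0.5))
print("Naive  :", naive_baseline_mae([100.0, 102.0, 101.0, 104.0]))
```

At `q = 0.5` the pinball loss equals half the MAE, which is why the median quantile behaves like an MAE-minimizing prediction.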

Important Notes

Generated model files are intentionally excluded from Git by .gitignore.
If GLOBAL.pkl does not exist, the web client will not work with the default API request.
Training on the full S&P 500 can take significantly longer than the demo mode.
Predictions depend on external data from Yahoo Finance, so network issues or unavailable tickers may cause errors.
Pickle model files should only be loaded from trusted sources.
This is an educational project and not a production trading system.

Future Improvements

Add automated tests for feature engineering and API responses.
Add Docker support for easier deployment.
Add a configuration file for API URL, model type, and default prediction mode.
Add charts for historical prices and prediction ranges in the frontend.
Add model versioning and experiment tracking.
Add CI workflow for linting and test execution.
Add a proper LICENSE file before publishing the repository publicly.

License

No license file is currently included in the project. Before publishing or accepting contributions, add a license such as MIT, Apache-2.0, or another license that matches your intended use.
