Comparing Machine Learning Algorithms for stock prediction and stock index movement using trend deterministic data preparation techniques.
If youโre into presentations rather than readme files, weโd suggest you to check out the project presentation
๐ฉ Table of Contents
- Introduction
- What are we doing
- Base Research Papers
- Technology Stack
- Further Reading
- Screenshots
- Team
- Mentor
- Guidelines
- License
๐ก Introduction
In 2014, Xinjie Di, an SCPD student from Apple Inc. submitted a paper which focused on predicting stock price trend for a company in the near future. The feature space was derived from the time series of the stock itself and was concerned with potential movement of past price. Tree algorithm was applied to feature selection and it suggests a subset of stock technical indicators are critical for predicting the stock trend.
Experiment results suggested an accuracy of more than 70% on predicting 3-10 day average price trend with SVM algorithm.
Another paper presented in Expert Systems with Applications journal under Elsevier publishing company by Jigar Patel, Sahil Shah, Priyank Thakkar, K. Kotecha quoted as Patel , J., et al. Predicting stock and stock price index movement using Trend Deterministic Data Preparation and machine learning techniques. Expert Systems with Applications (2014) addressed the problem of predicting direction of movement of stock and stock price index for Indian Stock Markets.
The paper compares four prediction models, Aritificial Neural Network(ANN), Support Vector Machine(SVM), Random Forest and Naive Bayes with two approaches for input to these models.
The first approach for input data involves computation of ten technical parameters using stock trading data(open, high, low & close prices) while the second approach focuses on representing these technical parameters as trend deterministic data. Accuracy of each of the prediction models for each of the two input approaches was evaluated. Evaluation was carried out on 10 years of historical data from 2003 to 2012 of two stocks namely Reliance Industries and Infosys Ltd.
Experiment results suggested that for the first approach of input data Random Forest outperforms other three prediction models on overall performance. Experimental results also show that the performance of all the prediction models improve when these technical parameters are represented as trend deterministic data.
๐ What are we doing
In our initial research we found out that efficiency in predicting stock price movements and their trends maybe improved if the prediction model could โrememberโ the historical movement of the stocks.
General prediction models that are used in the previously described research papers like Naive Bayes, Support Vector Machines, Random Forest, etc., can only take in consideration the most recent previous input received to the model.
Our motive is to implement the Long Short Term Memory networks in the prediction on stock price movements and their trends, as LSTM networks have internal contextual state cells that act as long-term or short-term memory cells.
The output of the LSTM network is modulated by the state of these cells. This is a very important property when we need the prediction of the neural network to depend on the historical context of inputs, rather than only on the very last input.
๐ Base Research Papers
๐ป Technology Stack
-
Programming Language
Python3.6 is the programming language used in the experiment.
-
Editor
For code editing creating files we are using the following editors:
For development, training, deployment of the models, we are using Jupyter Notebook along with Anaconda Integrated Development Environment.
-
Libraries
- Numpy
- Pandas
- scikit-learn
- matplotlib
- keras
-
tensorflow
-
Dataset
All the dataset will be used from quandl.com. Quandl is a platform for financial, economic, and alternative data that serves investment professionals. Quandl sources data from over 500 publishers. All Quandlโs data are accessible via an API. API access is possible through packages for multiple programming languages including R, Python, Matlab.
We will be using stock price dataset of OHLC format of the following companies to train and test our prediction models:
- Google(GOOGL)
- Apple (AAPL)
- Amazon (AMZN)
๐ Further Reading
- Naive Bayes
- Support Vector Machine
- Random Forest
- Artificial Neural Network
- XGBoost
- Long Short Term Memory
๐พ Screenshots
๐ Results
๐ซ Team
- Samar Srivastava
- Mohak Kulshrestha
- Saurabh Arya
- Kashvi Agarwal
๐คฏ Mentor
Mr. Munish Khanna, Head Of Department, CSE, HCST, Farah
Guidelines
Please follow the following guidelines in order to commit to the repository.
๐ License
This software is licensed under the GNU General Public License v3.0