
Machine Learning for Stock Selection

Robert J. Yan
Charles X. Ling
University of Western Ontario, Canada
{jyan, cling}@csd.uwo.ca

Outline
 Introduction
 The stock selection task
 The Prototype Ranking method
 Experimental results
 Conclusions

Introduction
 Objective:
– Use machine learning to select a small number of “good” stocks to form a portfolio
 Research questions:
– Learning from noisy data
– Learning from imbalanced data
 Our solution: Prototype Ranking
– A specially designed machine learning method

Outline
 Introduction
 The stock selection task
 The Prototype Ranking method
 Experimental results
 Conclusions

Stock Selection Task
Given information prior to week t, predict the performance of stocks in week t
– Training set

Stock ID | Predictor 1: Return of week t-1 | Predictor 2: Return of week t-2 | Predictor 3: Volume ratio of weeks t-2/t-1 | Goal: Return of week t
Learn a ranking function to rank the testing data
– Select the n highest-ranked stocks to buy and the n lowest to short-sell (see the sketch below)
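A minimal sketch of the selection step, assuming we already have a model that outputs a predicted return (or rank score) per stock; the function name and the example numbers are illustrative, not the paper's implementation.

```python
import numpy as np

def form_portfolio(stock_ids, predicted_returns, n):
    """Rank stocks by predicted return; return the n highest (buy) and n lowest (short-sell)."""
    order = np.argsort(predicted_returns)               # ascending by predicted return
    short_ids = [stock_ids[i] for i in order[:n]]       # n lowest-ranked: short-sell
    long_ids = [stock_ids[i] for i in order[::-1][:n]]  # n highest-ranked: buy
    return long_ids, short_ids

# Illustrative usage with made-up predictions for six stocks
ids = ["A", "B", "C", "D", "E", "F"]
pred = np.array([0.012, -0.004, 0.031, 0.007, -0.019, 0.002])
print(form_portfolio(ids, pred, n=2))  # (['C', 'A'], ['E', 'B'])
```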

Outline
 Introduction
 The stock selection task
 The Prototype Ranking method
 Experimental results
 Conclusions

Prototype Ranking

 Prototype Ranking (PR): a machine learning method specially designed for noisy and imbalanced stock data
 The PR System
Step 1. Find good “prototypes” in the training data
Step 2. Use k-NN on the prototypes to rank the test data

Step 1: Finding Prototypes
Prototypes: representative points
– Goal: discover the underlying density/clusters of the training samples by distributing prototypes in the sample space
– Reduce data size
(Figure: prototypes distributed among the training samples; each prototype covers a neighborhood)
Finding prototypes using competitive learning

General competitive learning (a sketch follows below)
 Step 1: Randomly initialize a set of prototypes
 Step 2: For each training sample, find its nearest prototype
 Step 3: Adjust that prototype toward the sample
 Step 4: Output the prototypes
The hidden density of the training data is reflected in the prototypes
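A minimal sketch of the general competitive-learning loop above (not the authors' PR code): each training sample pulls its nearest prototype toward itself, so dense regions of the data end up with more prototypes nearby. The learning rate, epoch count, and initialization are illustrative choices.

```python
import numpy as np

def competitive_learning(samples, n_prototypes=20, lr=0.05, epochs=10, seed=0):
    """Plain competitive learning: move the winning prototype toward each sample."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly initialize prototypes from the training samples
    prototypes = samples[rng.choice(len(samples), n_prototypes, replace=False)].copy()
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            # Step 2: find the nearest prototype (the "winner")
            winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            # Step 3: adjust the winner toward the sample
            prototypes[winner] += lr * (x - prototypes[winner])
    # Step 4: output the prototypes; dense regions attract more prototypes
    return prototypes

# Illustrative usage on synthetic 3-predictor data
data = np.random.default_rng(1).normal(size=(500, 3))
protos = competitive_learning(data)
print(protos.shape)  # (20, 3)
```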

Modifications for Stock data

 In Step 1: organize the initial prototypes in a tree structure
– Fast nearest-prototype search
 In Step 2: search for the nearest prototypes in the predictor space
– Better learning effect for the prediction tasks
 In Step 3: adjust the prototypes in the goal attribute space
– Better learning effect on the imbalanced stock data
 In Step 4: prune the prototype tree (see the sketch below)
– Prune child prototypes if they are similar to the parent
– Combine the leaf prototypes to form the final prototypes
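A rough sketch of the Step 4 pruning idea, under assumptions that go beyond the slide: prototypes are stored as a tree of vectors, a child is dropped when it lies within a distance threshold of its parent, and the surviving leaves become the final prototype set. The node class, the Euclidean similarity test, and the threshold are all illustrative, not the paper's exact procedure.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class ProtoNode:
    vector: np.ndarray                            # prototype position
    children: list = field(default_factory=list)

def prune(node, threshold=0.1):
    """Recursively drop children that are too similar to their parent prototype."""
    kept = []
    for child in node.children:
        prune(child, threshold)
        if np.linalg.norm(child.vector - node.vector) > threshold:
            kept.append(child)                    # keep only sufficiently distinct children
    node.children = kept
    return node

def final_prototypes(node):
    """Combine the remaining leaf prototypes into the final prototype set."""
    if not node.children:
        return [node.vector]
    out = []
    for child in node.children:
        out.extend(final_prototypes(child))
    return out

# Illustrative usage: one child is nearly identical to the root and gets pruned
root = ProtoNode(np.zeros(3), [ProtoNode(np.array([0.05, 0.0, 0.0])),
                               ProtoNode(np.array([1.0, 0.5, -0.2]))])
print(len(final_prototypes(prune(root))))  # 1
```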

Step 2: Predicting Test Data
 Predict each test sample as the weighted average of its k nearest prototypes (see the sketch below)
 Update the model online as new data arrive
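A minimal sketch of the prediction step, assuming distance-weighted k-NN over the learned prototypes: a test stock's score is the weighted average of the goal values (next-week returns) stored with its k nearest prototypes. The inverse-distance weighting, the value of k, and the synthetic data are illustrative.

```python
import numpy as np

def knn_score(x, proto_predictors, proto_goals, k=5, eps=1e-8):
    """Score a test sample as the distance-weighted average goal value of its k nearest prototypes."""
    dists = np.linalg.norm(proto_predictors - x, axis=1)
    nearest = np.argsort(dists)[:k]            # indices of the k nearest prototypes
    weights = 1.0 / (dists[nearest] + eps)     # closer prototypes get larger weights
    return float(np.dot(weights, proto_goals[nearest]) / weights.sum())

# Illustrative usage: 20 prototypes with 3 predictors and one stored goal value each
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 3))                   # prototype positions in predictor space
g = rng.normal(scale=0.02, size=20)            # goal values (e.g., weekly returns)
print(knn_score(np.zeros(3), P, g, k=5))
```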

Outline
 Introduction
 The stock selection task
 The Prototype Ranking method
 Experimental results
 Conclusions

Data
CRSP daily stock database
– The 300 NYSE and AMEX stocks with the largest market cap
– From 1962 to 2004

Testing PR

 Experiment 1: Larger portfolio, lower average


return, lower risk – diversification
 Experiment 2: is PR better than Cooper’s
method?

Results of Experiment 1
(Chart: weekly average return (%), 1978-2004, vs. number of stocks in the portfolio, 0-110)
(Chart: weekly risk, std. (%), 1978-2004, vs. number of stocks in the portfolio, 0-110)
Experiment 2: Comparison to Cooper’s method
 Cooper’s method (CP): a traditional non-ML method for stock selection…
 Compare PR and CP in 10-stock portfolios

Results of Experiment 2
Measures:
 Average Return (Ret.)
 Sharpe Ratio (SR): a risk-adjusted return, SR = Ret. / Std.

(Bar chart: Ret. (%) and SR for the PR 10-stock portfolio vs. the CP 10-stock portfolio)
Outline
 Introduction
 The stock selection task
 The Prototype Ranking method
 Experimental results
 Conclusions

Conclusions
 PR: modified competitive learning and k-NN
for noisy and imbalanced stock data
 PR does well in stock selection
– Larger portfolio, lower return, lower risk
– PR outperforms the non-ML method CP
 Future work: use it to invest and make money!

