
CRYPTOCURRENCY PREDICTION USING DEEP LEARNING IN BITCOIN

ABSTRACT

Due to the difficulty in assessing the exact nature of a time series, it is often considerably challenging to generate appropriate forecasts. Over the years, various forecasting models have been developed in the literature, but they have produced only limited accuracy in forecasting the Bitcoin price. This study involves time series forecasting of Bitcoin prices with improved efficiency using Long Short-Term Memory (LSTM) networks and compares their predictability with the traditional ARIMA method. Parallelization of algorithms is not limited to GPU devices: Field Programmable Gate Arrays (FPGA) are an interesting alternative to GPUs in terms of parallelization, and machine learning models have been shown to perform better on an FPGA than on a GPU. The model has not yet been implemented in a practical or real-time setting for predicting into the future, as opposed to learning what has already happened. In addition, the ability to predict using streaming data should improve the model. Sliding window validation is an approach not implemented here, but it may be explored as proposed work.


Objectives

• Verify empirical results with the available experimental facilities.
• Perform a comparative analysis based on experiments with the proposed algorithm and benchmark algorithms.
• Experimentally validate the predictive performance on the near-future price of Bitcoin.

Challenges

• The confidence intervals are too large to be meaningful.
• The overall history of the Bitcoin price has been a strong uptrend, so most algorithms tend to simply project the Bitcoin price upward into the foreseeable future.


CHAPTER-1

INTRODUCTION:

Predicting the future has always been a goal of humanity; however, humans have always been terrible at it. Predicting the price movements of a cryptocurrency such as Bitcoin is broadly similar to predicting stocks or whether the price of USD will go up or down. However, unlike a company with physical buildings and people working in it, Bitcoin is purely digital, and it is therefore very difficult to understand when and why the price will change. With no physical entity and the way Bitcoin works, most of the effect on the price comes from how the world feels about it. For this reason, understanding people is one way to help predict Bitcoin's future. Sentiment analysis on different kinds of social media gives a better understanding of how people feel, because humans are terrible at predicting even with information at hand. For this reason, a machine learning algorithm is used: the algorithm tries to predict the future price based on the information given to it. Prediction of mature financial markets such as the stock market has been researched at length. Bitcoin presents an interesting parallel to this, as it is a time series prediction problem in a market still in its transient stage. Traditional time series prediction methods such as Holt-Winters exponential smoothing rely on linear assumptions and require data that can be broken down into trend, seasonal and noise components to be effective. This type of methodology is more suitable for a task such as forecasting sales where seasonal effects are present. Due to the lack of seasonality in the Bitcoin market and its high volatility, these methods are not very effective for this task. Given the complexity of the task, deep learning makes for an interesting technological solution based on its performance in similar areas. The recurrent neural network (RNN) and the long short-term memory (LSTM) network are favoured over the traditional multilayer perceptron (MLP) due to the temporal nature of Bitcoin data.

AIM & OBJECTIVES:

The aim of this project is to investigate with what accuracy the price of Bitcoin can be predicted

using machine learning and compare parallelisation methods executed on multi-core and GPU

environments.

• Verify empirical results with the available experimental facilities.
• Perform a comparative analysis based on experiments with the proposed algorithm and benchmark algorithms.
• Experimentally validate the predictive performance on the near-future price of Bitcoin.

CHALLENGES

• The confidence intervals are too large to be meaningful.
• The overall history of the Bitcoin price has been a strong uptrend, so most algorithms tend to simply project the Bitcoin price upward into the foreseeable future.


PROBLEM DESCRIPTION:

The goal is to be able to predict whether the Bitcoin price will increase or decrease and by how much; however, this goal can be accomplished in many ways. How often does the model need to predict? Whether it predicts every minute, every 10 minutes or hourly decides the time frame in which the data set will be collected. If predictions are made every 10 minutes, then all social media in a 10-minute time frame will be used to predict; if it is 1 hour, then 1 hour's worth of data will be collected, and so on. In short, how often to predict should be based on the amount of data gathered in a day: the more data gathered, the smaller the time frame. However, prior work concluded that the shorter the time frame, the more influence the sentiment value had, which is also something to take into consideration. News and social network messages take time to spread: a longer time frame would help, since messages and news have to be written, and when they reach people, those people also have to open and read them.
CHAPTER -2

SYSTEM ANALYSIS

EXISTING SYSTEM

There are many approaches people have taken to try to tackle intra-day and high-speed trading. Just as in stock, forex, commodity and options trading, these are coupled with highly efficient bots and systems that perform the trades automatically; humans are the limiting factor in this situation. Bitcoin presents an interesting parallel, as it is a time series prediction problem in a market still in its transient stage. Traditional time series prediction methods such as Holt-Winters exponential smoothing rely on linear assumptions and require data that can be broken down into trend, seasonal and noise components to be effective. This type of methodology is more suitable for a task such as forecasting sales where seasonal effects are present. Due to the lack of seasonality in the Bitcoin market and its high volatility, these methods are not very effective for this task.

DRAWBACK IN EXISTING SYSTEM

• If the offline wallet is lost, the coins are lost forever.
• It is still not accepted by many businesses as a means of making payment.
• Online coins can be hacked; once hacked, the coins are lost for good.

EXISTING ALGORITHM

• Bayesian classification algorithm
• Recurrent neural network (RNN)


PROPOSED SYSTEM

It focuses only on the closing price of Bitcoin to develop the predictive model. It does not take other economic factors, such as news about Bitcoin, government policies and market sentiment, into account; these could form the future scope of the project to predict the price with much more accuracy. The prediction is limited to past data; the ability to predict on streaming data would improve the performance and predictability of the model. The study involves only the comparison between ARIMA and LSTM, and comparing with more machine learning models would confirm the result. The model developed using LSTM has higher accuracy than the traditional models, which shows that the deep learning model, in our case LSTM (Long Short-Term Memory), is evidently a more effective learner on training data than ARIMA, with the LSTM more capable of recognizing longer-term dependencies. The study is done using the daily price fluctuations of Bitcoin, which motivates further investigation of the predictability of the model using hourly price fluctuations.

Advantages

• Total Bitcoin passing through
• Net Bitcoin flow (received minus sent)
• Number of transactions
• Closeness centrality
• The transactions are decentralized: unlike the centralized transactions associated with cash and card payments, Bitcoin transactions are carried out via decentralized private networks.
Proposed Algorithm

• ARIMA (Auto Regressive Integrated Moving Average)
• Root Mean Square Error (RMSE) as the evaluation metric (see the sketch below)
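The ARIMA baseline and the learned models are compared in terms of RMSE. A minimal sketch of fitting an ARIMA model with statsmodels and scoring it with RMSE is given below; the (5, 1, 0) order and the 30-day hold-out split are illustrative assumptions, not the values used in the study.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def arima_baseline(close: pd.Series, test_size: int = 30, order=(5, 1, 0)):
    """Fit ARIMA on the head of the daily closing-price series and
    forecast the held-out tail. `order` is an illustrative assumption."""
    train, test = close[:-test_size], close[-test_size:]
    model = ARIMA(train, order=order).fit()
    forecast = model.forecast(steps=test_size)
    return rmse(test, forecast)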

BITCOIN CRYPTOCURRENCY

The first decentralized cryptocurrency, Bitcoin, was created in 2009. In 2010 the price of one Bitcoin never rose above one dollar, but in May 2018 it was the cryptocurrency with the highest market capitalization, around 149 billion dollars, with a little more than 17 million Bitcoins in circulation. This makes one Bitcoin worth around 8,700 dollars (market cap / circulation). Bitcoin is a digital currency which can be used to buy or trade things electronically; what makes it different from normal currency is the decentralization of the system through blockchain technology. Being decentralized, it has no single entity controlling the network; instead it is maintained by a group of volunteers running different computers around the world. A physical currency such as USD can have an unlimited supply if the government decides to print more, which can change the value of the currency relative to others. The increase of the Bitcoin supply, however, is strictly governed by an algorithm, which allows only a few Bitcoins to be created every hour by miners until the total cap of 21 million Bitcoins is reached.

The decentralization of the system in theory allows for anonymity, since there is no bank or centralized entity to confirm and validate the transactions taking place. In practice, however, when a transaction is made, each user uses an identifier, known as an address, during the transaction. These addresses are not associated with a name, but because the transaction must be transparent owing to the way the decentralized system works, it is possible for everyone to see it. A disadvantage of the system is that, in the case of losing or forgetting the address, or wanting to reverse a transaction, nothing can be done about it because there is no authority involved.

Web sites – To a degree, any website with information related to Bitcoin could be listed here, but the most important types of site are major media outlets and the government sites of different countries, for any change in regulation.

Internet forums – Bitcointalk.org was created by one of the founders of Bitcoin and is one of the biggest forums on Bitcoin; however, many smaller ones exist as well, which dedicate themselves to news and information on Bitcoin. These sites are mostly guaranteed to have users who are interested in and understand Bitcoin, which means most posts will be made by users educated on the topic and will be related to Bitcoin.

Price collection – To obtain Bitcoin prices, exchanges exist where it is possible to buy or sell Bitcoins. Since every exchange has a different price for Bitcoin, it is not possible to guarantee that the biggest site is the one with the cheapest price. According to one article, the three most recommended are Coinbase, for being the biggest, Gemini Exchange, for low fees, and Changelly, since it carries lesser-known cryptocurrencies.

These social platforms and other data sources will be examined and analyzed to see which would fit best for gathering data to predict Bitcoin prices. Although each platform has its own demographic, many of them overlap; this could be a problem if multiple platforms were used, giving duplicate data rather than more data. Facebook has the potential to give the most data, but since its main demographic probably does not know about or use Bitcoin, it might mostly yield data that does not influence the Bitcoin price. Twitter and Reddit, on the other hand, might have less data to offer, but the more specialized knowledge of their users might relate to changes in the Bitcoin price. Telegram is the extreme of them all, with the fewest users but the highest expertise.


CHAPTER-3

SYSTEM STUDY

FEASIBILITY STUDY

The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is carried out, to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.

Three key considerations involved in the feasibility analysis are:

• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY

ECONOMICAL FEASIBILITY

This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited, and the expenditures must be justified. The developed system is well within the budget; this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.

TECHNICAL FEASIBILITY

This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, as only minimal or no changes are required for implementing this system.

SOCIAL FEASIBILITY

This aspect of the study checks the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but must accept it as a necessity. The level of acceptance by the users depends solely on the methods that are employed to educate the user about the system and to make the user familiar with it. The user's level of confidence must be raised so that the user is also able to make constructive criticism, which is welcomed, as the user is the final user of the system.

INTRODUCTION TO DEEP LEARNING:

Deep learning, a class of machine learning techniques used to extract features from data, and the CNN (Convolutional Neural Network), a type of artificial neural network that has been extended across space using shared weights, have been found suitable for computer vision tasks. At the beginning, researchers experimented with small datasets. With the lowered cost of processing hardware, increasing chip processing capabilities and the increasing amount of data available online, it became possible to apply deep neural networks to larger and real-life data sets as well. Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised. Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design and board game programs, where they have produced results comparable to, and in some cases superior to, human experts. Deep learning models are vaguely inspired by information processing and communication patterns in biological nervous systems, yet have various differences from the structural and functional properties of biological brains (especially the human brain), which make them incompatible with neuroscientific evidence.

Deep Learning Development Lifecycle

DEEP LEARNING METHODS

In the past, deep learning did not grow quickly due to hardware constraints. Nowadays, however, there is a lot of information to learn from, GPU parallel processing technology makes it possible to process large amounts of information in real time, and hardware performance is better than before, so deep learning can be applied to various fields such as computer vision and image recognition. The deep learning models that are mainly studied are the CNN (Convolutional Neural Network) and the RNN (Recurrent Neural Network). CNN is an excellent way to extract high-level abstract features from images or to process texture information, and proved to be excellent at object recognition in 2012. CNN has been applied to various fields such as video and speech recognition in an effort to represent and learn large amounts of data in a meaningful form. RNN is a kind of artificial neural network in which, when data are input, the value of the hidden layer is stored in the network and combined with the next input value, which is well suited to modeling time series information. Long Short-Term Memory (LSTM) is used as an alternative to traditional RNNs to address their problems with long-term retention and memorization of information; the LSTM RNN is therefore a deep learning model that solves the vanishing gradient problem of the existing RNN model. A support vector is an optimal hyperplane that can distinguish two types of data in a feature space. A CRF (Conditional Random Field) does not refer to a single state but to previous and subsequent states to determine the current state, unlike the HMM (Hidden Markov Model) and MEMM (Maximum Entropy Markov Model). The HMM is modeled under the strong independence assumption of the Markov assumption. This has the advantage that it is easy to model the real problem, but the current hidden state is affected only by the current observation state. The model that solves this is the maximum entropy model; however, the MEMM has a label bias problem.

Decision trees and Random Forest

Decision trees use branches to represent observations, and the leaves are the target values. Decision trees have been used in past studies on vulnerability testing. They are capable of modeling non-linear data, which is the case in our current study, and have also proven to work well with data that has outliers. Experts, however, state that decision trees are prone to overfitting. While there are techniques to prevent this from happening, a better tree-based algorithm can be used as a replacement. One of these is the random forest, which uses an ensemble technique by generating a series of decision trees and then averaging the predictive results. The final output is strengthened by combining the generated trees to make a better prediction or classification. Random forest has been proven to be a successful vulnerability prediction model in some studies.
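As an illustration of the ensemble idea described above, the following minimal sketch fits a random forest regressor to lagged closing prices with scikit-learn. The feature construction (five daily lags) and all parameter values are illustrative assumptions, not the configuration used in the study.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def make_lag_features(close: pd.Series, n_lags: int = 5) -> pd.DataFrame:
    """Build a supervised dataset where each row holds the previous
    `n_lags` closing prices and the target is the current day's close."""
    df = pd.DataFrame({f"lag_{i}": close.shift(i) for i in range(1, n_lags + 1)})
    df["target"] = close
    return df.dropna()

def fit_random_forest(close: pd.Series) -> RandomForestRegressor:
    data = make_lag_features(close)
    X, y = data.drop(columns="target").values, data["target"].values
    # 100 trees averaged together; ensemble averaging reduces overfitting
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X, y)
    return model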

Deep Learning Innovation

Deep learning is one of the recent innovations in artificial intelligence. It extends the power of neural models by using more hidden layers. Typical neural networks use shallow learning algorithms, which are less complicated. Deep learning, however, amplifies the neural network by optimizing the neural model: it adds multiple layers between the input and output. The deep feed-forward network, one of the deep learning techniques, is the chosen technique for our study. In a deep feed-forward network, the goal is to approximate a function that maps inputs to outputs; information flows from the input, through the intermediate computations that define the function, and finally to the output. The feed-forward network forms the basis of other deep learning networks such as convolutional networks, which are considered a particular type of feed-forward network. While deep learning is said to be data intensive, it can also be fine-tuned to work on smaller datasets, as we have done in our study. We acknowledge, however, that we still have not fully exploited the capabilities of deep learning. Due to the number of hyperparameters deep learning has, plus the presence of other deep learning techniques, the research area for deep learning is still fertile.
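A minimal sketch of such a deep feed-forward network in Keras is shown below; the layer sizes, activation functions and use of the Keras API are illustrative assumptions rather than the exact configuration used in the study.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_feedforward(n_features: int) -> Sequential:
    """Deep feed-forward (MLP) regressor: several stacked Dense layers
    between the input and the single-value price output."""
    model = Sequential([
        Dense(64, activation="relu", input_shape=(n_features,)),
        Dense(32, activation="relu"),
        Dense(16, activation="relu"),
        Dense(1),  # linear output for price regression
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Usage with placeholder data:
# model = build_feedforward(n_features=10)
# model.fit(X_train, y_train, epochs=10, batch_size=16, verbose=0)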


DATA WHICH COULD AFFECT BITCOIN PRICES

One of the most difficult tasks in this research is to understand what information is just noise (doing more harm to the prediction than if it were left out) and what information has a large impact on the Bitcoin price. With the research focused on sentiment data, the first part concerns the psychology of humanity: why a sentiment score might be valuable information rather than noise, and where to gather such data. This information can be gathered from different social network platforms, but which ones depends on the people the sentiment score is aimed at. Sentiment data can also be gathered from global news that does not necessarily affect the whole world but has a big enough impact for people to notice. Lastly there is the technical data, such as the price of Bitcoin or how many were sold, and so on.

In the world of economics, it is believed that psychology affects investors' decisions, also known as the Animal Spirit. This means that the negative and positive emotions of investors can influence the price. This is confirmed by the observation that happy investors are more likely to take risks, making the Bitcoin price increase because they are more likely to buy. Related, but not exactly the same, is media coverage of Bitcoin: with good coverage, investors can be more comfortable with their Bitcoin, and that will result in the price increasing; the opposite is also true, and with bad publicity they might sell their Bitcoins in fear of the price dropping, resulting in it dropping. Positive and negative media news and personal emotion are related but not the same, as the data are collected in different ways (see Section 3.4). Another way to look at it is that a person has their own emotions, which can be affected or changed by good or bad news. With the internet today, news of people's feelings can spread extremely quickly through social networks (see Section 3.4) such as Twitter, Reddit and Facebook. This means that not only does news on a national or global scale have a big effect on people, but friends or celebrities can have a big impact on their emotions as well.


CHAPTER -4

SYSTEM SPECIFICATION

HARDWARE REQUIREMENTS:

• System : Dual-core processor
• Hard Disk : 80 GB
• Monitor : 15" VGA color
• RAM : 2 GB

SOFTWARE REQUIREMENTS:

• Operating system : Windows 7 / 8
• Coding Language : MATLAB
• Platform : MathWorks simulator
• Tools : Simulink
SOFTWARE REQUIREMENT SPECIFICATION

MATLAB
MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and
fourth-generation programming language. A proprietary programming language developed by
MathWorks, MATLAB allows matrix manipulations, plotting of functions and data,
implementation of algorithms, creation of user interfaces, and interfacing with programs written
in other languages, including C, C++, C#, Java, Fortran and Python.
Although MATLAB is intended primarily for numerical computing, an optional toolbox
uses the MuPAD symbolic engine, allowing access to symbolic computing abilities. An
additional package, Simulink, adds graphical multi-domain simulation and model-based design
for dynamic and embedded systems.
In 2004, MATLAB had around one million users across industry and academia.
MATLAB users come from various backgrounds of engineering, science, and economics.

HISTORY
Cleve Moler, the chairman of the computer science department at the University of New
Mexico, started developing MATLAB in the late 1970s. He designed it to give his students
access to LINPACK and EISPACK without their having to learn Fortran. It soon spread to other
universities and found a strong audience within the applied mathematics community. Jack Little,
an engineer, was exposed to it during a visit Moler made to Stanford University in 1983.
Recognizing its commercial potential, he joined with Moler and Steve Bangert. They rewrote
MATLAB in C and founded MathWorks in 1984 to continue its development. These rewritten
libraries were known as JACKPAC. In 2000, MATLAB was rewritten to use a newer set of
libraries for matrix manipulation, LAPACK.

MATLAB was first adopted by researchers and practitioners in control engineering,


Little's specialty, but quickly spread to many other domains. It is now also used in education, in
particular the teaching of linear algebra, numerical analysis, and is popular amongst scientists
involved in image processing.
MATLAB FUNCTIONS
• A function is a group of statements that together perform a task. In MATLAB, functions are defined in separate files. The name of the file and of the function should be the same.
• Functions operate on variables within their own workspace, which is also called the local workspace, separate from the workspace you access at the MATLAB command prompt, which is called the base workspace.
• Functions can accept more than one input argument and may return more than one output argument.
• The first line of a function starts with the keyword function. It gives the name of the function and the order of its arguments. For example, a mymax function might have five input arguments and one output argument.

Anonymous Functions
• An anonymous function is like an inline function in traditional programming languages, defined within a single MATLAB statement. It consists of a single MATLAB expression and any number of input and output arguments.
• You can define an anonymous function right at the MATLAB command line or within a function or script.
• This way you can create simple functions without having to create a file for them.

Primary and Sub-Functions


• Any function other than an anonymous function must be defined within a file. Each function file contains a required primary function that appears first and any number of optional sub-functions that come after the primary function and are used by it.
• Primary functions can be called from outside of the file that defines them, either from the command line or from other functions, but sub-functions cannot be called from the command line or from other functions outside the function file.
• Sub-functions are visible only to the primary function and other sub-functions within the function file that defines them.
Nested Functions
You can define functions within the body of another function. These are called nested
functions. A nested function contains any or all of the components of any other function.
Nested functions are defined within the scope of another function and they share access to the
containing function's workspace.

Private Functions
• A private function is a primary function that is visible only to a limited group of other functions. If you do not want to expose the implementation of a function, you can create it as a private function.
• Private functions reside in subfolders with the special name private.
• They are visible only to functions in the parent folder.

Global Variables
• Global variables can be shared by more than one function. For this, you need to declare the variable as global in all the functions.
• If you want to access that variable from the base workspace, then declare the variable at the command line.
• The global declaration must occur before the variable is actually used in a function. It is good practice to use capital letters for the names of global variables to distinguish them from other variables.

Features of MATLAB
• It is a high-level language for numerical computation, visualization and application development.
• It also provides an interactive environment for iterative exploration, design and problem solving.
• It provides a vast library of mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, numerical integration and solving ordinary differential equations.
• It provides built-in graphics for visualizing data and tools for creating custom plots.
• MATLAB's programming interface provides development tools for improving code quality and maintainability and maximizing performance.
• It provides tools for building applications with custom graphical interfaces.
• It provides functions for integrating MATLAB-based algorithms with external applications and languages such as C, Java, .NET and Microsoft Excel.
Chapter 5

System Design

This chapter describes the overall and the detailed architectural design. It also describes each module that is to be implemented in the first phase.

4.1 SYSTEM ARCHITECTURE


Figure 2- Proposed Approach

Pseudocode: Forecast model

1. Start with the opening price T1.
2. Use the naïve forecasting method to predict the next value T2 by shifting T1 forward as the predicted value.
3. Loop the naïve forecasting method from T1 to T24 and log the data.
4. Transform the logged data from T1 to T24 (2 hours of collected data) into mean, min and max values.
5. Create a sliding window based on a T12 threshold value (a window size of 12 was used to convert 5-minute timestamps to a 1-hour timestamp).
6. Set the transformed data as TransData.
7. Train and test TransData in a random forest to forecast the predicted mean, min and max values.
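A minimal Python sketch of the pipeline above (naïve shift, aggregation of 5-minute prices into hourly mean/min/max windows, and a random forest forecaster) is given below. The function names, the pandas-based windowing and all parameter values are illustrative assumptions.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def naive_forecast(prices: pd.Series) -> pd.Series:
    """Naive method: the forecast for T(n+1) is simply the value at T(n)."""
    return prices.shift(1)

def to_hourly_windows(prices_5min: pd.Series, window: int = 12) -> pd.DataFrame:
    """Aggregate 5-minute prices into rolling 1-hour windows (12 x 5 min)
    summarised by mean, min and max."""
    roll = prices_5min.rolling(window)
    return pd.DataFrame({"mean": roll.mean(),
                         "min": roll.min(),
                         "max": roll.max()}).dropna()

def train_forecaster(trans_data: pd.DataFrame, target_col: str = "mean"):
    """Train a random forest to predict the next window's statistic
    from the current window's mean/min/max."""
    X = trans_data.iloc[:-1].values                        # features at time t
    y = trans_data[target_col].shift(-1).dropna().values   # target at time t+1
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, y)
    return model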

Algorithm Implementation

Fuzzy Rules: Trading Strategies

Rule 1: Start trading with small amounts (1% of total assets). Compare (predict_min, predict_max) with the previous actual value.

    if (predict_min < previous actual value) {
        skip and wait for the next price
    }
    if (predict_max > previous actual value) {
        sell more than 1%
    }

Rule 2: Buy low, sell high (referring to Figure 1).

    if (predict_min < previous actual value) {
        invest
    }
    if (predict_max > previous actual value) {
        trade
    }
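A minimal Python rendering of one possible interpretation of these rules is sketched below; the decision labels and the 1% sizing come from the rules above, while the function signature and return values are illustrative assumptions.

def trading_decision(predict_min: float, predict_max: float,
                     previous_actual: float) -> str:
    """Apply the rule-based trading strategy to one forecast window.
    Returns a coarse action label rather than executing any trade."""
    if predict_min < previous_actual and predict_max > previous_actual:
        # Rule 2: forecast range straddles the last price -> buy low, sell high
        return "trade"
    if predict_max > previous_actual:
        # Rule 1: upside expected -> sell more than 1% of holdings
        return "sell_more_than_1pct"
    if predict_min < previous_actual:
        # Rule 1: downside expected -> skip and wait for the next price
        return "skip"
    return "hold"

# Example: previous price 8700, forecast range (8650, 8900) -> "trade"
# print(trading_decision(8650.0, 8900.0, 8700.0))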
[Figure: users, encrypted data, gateway and switches, third party handling payment details (coin elements), with attack points marked]

Fig.No.4.1.1 System Architecture Diagram

4.2 UML DIAGRAM

UML is simply another graphical representation of a common semantic model. UML provides comprehensive notation for the full life cycle of object-oriented development.

ADVANTAGES

• To represent the complete system (instead of only the software portion) using object-oriented concepts.
• To establish an explicit coupling between concepts and executable code.
• To take into account the scaling factors that are inherent to complex and critical systems.
• To create a modelling language usable by both humans and machines.

UML DEFINES SEVERAL MODELS FOR REPRESENTING SYSTEMS

• The class model captures the static structure.
• The state model expresses the dynamic behaviour of objects.
• The use case model describes the requirements of the user.
• The interaction model represents the scenarios and message flow.
• The implementation model shows the work units.

4.2.1 USE CASE DIAGRAM

Use case diagrams give an overview of the usage requirements of a system. They are useful for presentations to management or project stakeholders, but for actual development you will find that use cases provide significantly more value because they describe the intent of the actual requirements. A use case describes a sequence of actions that provides something of measurable value to an actor and is drawn as a horizontal ellipse.


Fig.No.4.2.1 Use case diagram for user (actor: User; use cases: Registration, View Product, Payment, Coin Element, File Decrypt)


4.2.2 SEQUENCE DIAGRAM

A sequence diagram is an interaction diagram that shows how

processes operate with one another and in what order. It is a construct of a

Message Sequence Chart.

Fig.No.4.2.2 Sequence diagram for user

4.2.3 CLASS DIAGRAM

A class diagram is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods) and the relationships among objects. The classes in a class diagram represent both the main elements and interactions in the application and the classes to be programmed.


Fig.No.4.2.3 Class Diagram

4.2.4 ACTIVITY DIAGRAM

Activity diagrams are graphical representations of workflows of stepwise activities and actions, with support for choice, iteration and concurrency. The activity diagram can be used to describe the business and operational step-by-step workflows of components in a system. An activity diagram consists of an initial node, an activity final node and activities in between.

Fig.No.4.2.4 Activity Diagram


CHAPTER 6

SYSTEM IMPLEMENTATION

Modules Description

Dataset

Before we build the model, we need to obtain some data for it. There is a dataset on the UCI Machine Learning Repository that details minute-by-minute Bitcoin prices (plus some other factors) for the last few years. Over this timescale, noise could overwhelm the signal, so we opt for daily prices. The issue here is that we may not have sufficient data (we will have hundreds of rows rather than thousands or millions), and in deep learning no model can overcome a severe lack of data. We also do not want to rely on static files, as that would complicate the process of updating the model in the future with new data. Instead, we aim to pull data from websites and APIs.
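A minimal sketch of pulling daily Bitcoin prices over HTTP is given below. The CoinGecko market_chart endpoint, its parameters and the returned JSON layout are assumptions based on that public API's documentation; any other price API could be substituted.

import requests
import pandas as pd

def fetch_daily_btc_prices(days: int = 365) -> pd.Series:
    """Pull daily BTC/USD prices from a public API and return them as a
    pandas Series indexed by date."""
    url = "https://api.coingecko.com/api/v3/coins/bitcoin/market_chart"
    resp = requests.get(url, params={"vs_currency": "usd", "days": days,
                                     "interval": "daily"}, timeout=30)
    resp.raise_for_status()
    # Each entry is assumed to be [timestamp_in_ms, price]
    prices = resp.json()["prices"]
    series = pd.Series({pd.to_datetime(t, unit="ms").normalize(): p
                        for t, p in prices})
    return series.sort_index()

# close = fetch_daily_btc_prices(365)
# print(close.tail())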

Rule-based features

Human experts with years of experience have created many rules to detect whether a user is fraudulent or not. An example of such a rule is whether the user has previously been detected or reported in connection with coin prediction. Each rule can be regarded as a binary feature that indicates the likelihood of the coin price prediction.

Selective labeling

If the coin price score is above a certain threshold, the case enters a queue for further investigation by human experts. Once it is reviewed, the final result is labeled as a Boolean, i.e. coin or clean. Cases with higher scores have higher priority in the review queue. Cases whose price scores are below the threshold are determined to be clean by the system without any human judgment. Once a case is labeled as fraud by human experts, it is very likely that the seller is not trustworthy and may also be selling other coins; hence all the items submitted by the same seller are labeled in the same way. The Bitcoin seller, along with his or her cases, is removed from the website immediately once detected.

Machine Learning and Bitcoin

Before we get started, let us establish a couple of things. First, it is nearly impossible to accurately predict the value of a stock (or a cryptocurrency) with a simple computer algorithm, because there are so many factors affecting the price that we cannot account for. Consider that, for almost no apparent reason, the price of Bitcoin rises and surges; there is no single mathematical variable or equation that we can use to predict these rises and falls. Yes, there are some really advanced computer models for stocks which take many long-term factors into account, but nothing is going to give you 100% accuracy.

Deep Learning Models

Appropriate design of deep learning models in terms of network parameters is imperative to their success. The three main options available when choosing how to select parameters for deep learning models are random search, grid search and heuristic search methods such as genetic algorithms. Manual grid search and Bayesian optimization were utilized in this study. Grid search, implemented for the Elman RNN, is the process of selecting two hyperparameters, each with a minimum and a maximum, and then searching that space for the best-performing parameters. This approach was taken for parameters which were unsuitable for Bayesian optimization. This model was built in the Python programming language using Visual Studio. Similar to the RNN, Bayesian optimization was chosen for selecting LSTM parameters where possible. This is a heuristic search method which works by assuming the function was sampled from a Gaussian process and maintaining a posterior distribution over this function as the results of different hyperparameter selections are observed.
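As a concrete illustration of the grid-search procedure described above, the following sketch exhaustively evaluates two hyperparameters (hidden units and learning rate) against a validation error. The parameter grids and the build_and_score callback with its signature are illustrative assumptions.

import itertools

def grid_search(build_and_score, hidden_units_grid=(10, 20, 40),
                learning_rate_grid=(1e-2, 1e-3, 1e-4)):
    """Exhaustive search over two hyperparameters. `build_and_score` is
    assumed to train a model with the given settings and return a
    validation error (lower is better)."""
    best, best_err = None, float("inf")
    for units, lr in itertools.product(hidden_units_grid, learning_rate_grid):
        err = build_and_score(hidden_units=units, learning_rate=lr)
        if err < best_err:
            best, best_err = (units, lr), err
    return best, best_err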

LSTM

In terms of temporal length, the LSTM is considerably better at learning long-term dependencies. As a result, picking a long window was less detrimental for the LSTM. This process followed a similar procedure to the RNN, in which the autocorrelation lag was used as a guideline. The LSTM performed poorly on smaller window sizes. Its most effective length was found to be 100 days, and two hidden LSTM layers were chosen; for a time series task, two layers are enough to find nonlinear relationships in the data. Twenty hidden nodes were also chosen for both layers, as in the RNN model. The Hyperas library was used to implement the Bayesian optimisation of the network parameters. The optimiser searched for the optimal model in terms of how much dropout to apply per layer and which optimizer to use.
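A minimal Keras sketch of the architecture described above (a 100-day window, two stacked LSTM layers with 20 units each, per-layer dropout and a single-value price output) is shown below. The exact dropout rates and the choice of the Adam optimizer are illustrative assumptions, since those values were tuned by Bayesian optimisation in the study.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

def build_lstm(window: int = 100, n_features: int = 1) -> Sequential:
    """Two stacked LSTM layers (20 units each) with dropout, predicting
    the next closing price from a `window`-day input sequence."""
    model = Sequential([
        LSTM(20, return_sequences=True, input_shape=(window, n_features)),
        Dropout(0.2),                 # assumed rate; tuned in the study
        LSTM(20),
        Dropout(0.2),
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model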

System Structure

• Feature Engineering
• Feature Evaluation
• RNN
• LSTM

MODULE DESCRIPTION:

Feature Engineering
Feature engineering is the art of extracting useful patterns from data to make it easier for
machine learning models to per-form its prediction. It can be considered one of the most
important skills to achieve good results for prediction tasks . It investigated the behaviour of
consistent top performers in Kaggle data mining competitions. The findings were that feature
engineering is often the most import-ant part. It is quite a subjective process requiring domain
knowledge to be effective. It is also considered an art. Engineered features should represent what
one is trying to each the network.
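A minimal sketch of engineering features from the closing-price series is given below. The particular features (5- and 10-day simple moving averages, daily returns and a rolling-median de-noised close) are assumptions chosen to match the averages and de-noised close mentioned later in this chapter, not a definitive list of the features used.

import pandas as pd

def engineer_features(close: pd.Series) -> pd.DataFrame:
    """Derive candidate input features from the daily closing price."""
    feats = pd.DataFrame(index=close.index)
    feats["close"] = close
    feats["sma_5"] = close.rolling(5).mean()     # 5-day simple moving average
    feats["sma_10"] = close.rolling(10).mean()   # 10-day simple moving average
    feats["return_1d"] = close.pct_change()      # daily return
    feats["denoised_close"] = close.rolling(5, center=True).median()
    return feats.dropna()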

Feature Evaluation

Features must be evaluated once selected. The reason for this is that dealing with too large a feature set will considerably increase training time. In addition, machine learning algorithms can suffer from decreased accuracy if the number of variables is significantly higher than the optimal number. Several methods of feature evaluation exist, including filter-based selection and wrapper-based selection. Filter-based selectors filter features based on a particular statistical property of the feature, e.g. correlation. Wrapper-based methods perform a heuristic search over feature subsets using a classifier.

FILTER BASED SELECTION MODEL:
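As an illustration of filter-based selection, the following sketch keeps only those engineered features whose absolute Pearson correlation with the prediction target exceeds a threshold. The 0.1 threshold is an illustrative assumption.

import pandas as pd

def filter_select(features: pd.DataFrame, target: pd.Series,
                  min_abs_corr: float = 0.1) -> list:
    """Filter-based selection: rank features by |Pearson correlation|
    with the target and keep those above the threshold."""
    corr = features.corrwith(target).abs().sort_values(ascending=False)
    return corr[corr >= min_abs_corr].index.tolist()

# selected = filter_select(feature_frame, next_day_close)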


WRAPPER BASED SELECTION MODEL

The Boruta algorithm in R is one such wrapper-based method. This algorithm is a wrapper built around the random forest classification algorithm, an ensemble classification method in which classification is performed by the voting of multiple classifiers. The algorithm works on a similar principle to the random forest classifier: it adds randomness to the model and collects results from the ensemble of randomized samples to evaluate attributes. This extra randomness provides a clearer view of which attributes are important. All features were deemed important to the model based on the random forest, with the 5-day and 10-day averages having the highest importance among the tested averages. The de-noised closing price was also one of the most important variables.

The dimensionality reduction technique of principal component analysis (PCA) was also explored. The result was four principal groups to which all attributes belonged. The results of the PCA were not included in the final model, as computation was not an issue and the original data performed reasonably well.
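The study used the Boruta wrapper in R; a rough Python analogue, using the feature importances of a random forest fitted to the engineered features, is sketched below as an assumption-laden stand-in rather than a reimplementation of Boruta.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def rank_features_by_importance(features: pd.DataFrame,
                                target: pd.Series) -> pd.Series:
    """Fit a random forest and return features ranked by impurity-based
    importance, the same signal Boruta builds its shadow-feature test on."""
    data = features.join(target.rename("target"), how="inner").dropna()
    X = data.drop(columns="target")
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X.values, data["target"].values)
    return pd.Series(rf.feature_importances_,
                     index=X.columns).sort_values(ascending=False)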

RNN: (RECURRENT NEURAL NETWORKS)

The recurrent neural network (RNN) was first developed by Elman. The RNN is structured similarly to the MLP, with the exception that signals can flow both forward and backwards in an iterative manner. Appropriate design of deep learning models in terms of network parameters is imperative to their success. The three main options available when choosing how to select parameters for deep learning models are random search, grid search and heuristic search methods such as genetic algorithms. As mentioned in the related work section, manual grid search and Bayesian optimization are utilized in this study. Grid search, implemented for the Elman RNN, is the process of selecting two hyperparameters with a minimum and maximum for each; one then searches that feature space looking for the best-performing parameters. This approach was taken for parameters which were unsuitable for Bayesian optimization.

RECURRENT NEURAL NETWORK


In addition to passing input between layers, the output of each layer is fed to the context layer, to be fed into the next layer together with the next input. In this architecture the state is overwritten at each time step. This offers the benefit of allowing the network to assign particular weights to events that occur in a series, rather than the same weights to all inputs as with the MLP, resulting in a dynamic network. The length of the temporal window is, in a sense, the length of the network's memory. While this addresses the temporal issue faced in a time series task, the vanishing gradient can still be an issue. In addition, some research has found that while RNNs are capable of handling long-term dependencies, in practice they often fail to learn them due to the difficulties of combining gradient descent with long-term dependencies.
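A minimal Keras sketch of such an Elman-style recurrent network is given below; the window length, the 20 hidden units (matching the grid-searched value reported earlier) and the optimizer are assumptions.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

def build_elman_rnn(window: int = 20, n_features: int = 1) -> Sequential:
    """Simple (Elman-style) RNN: the hidden state acts as the context
    layer and is overwritten at every time step."""
    model = Sequential([
        SimpleRNN(20, input_shape=(window, n_features)),
        Dense(1),  # next-day closing price
    ])
    model.compile(optimizer="adam", loss="mse")
    return model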

LSTM(LONG SHORT TERM MEMORY):

Similar to the RNN, Bayesian optimization was chosen for selecting the parameters of this model where possible. This is a heuristic search method which works by assuming the function was sampled from a Gaussian process and maintaining a posterior distribution over this function as the results of different hyperparameter selections are observed. One can then optimise the expected improvement over the best result to pick hyperparameters for the next experiment. The performance of both the RNN and the LSTM network is evaluated on validation data with significant overfitting countermeasures in place. Dropout is implemented in both layers. In addition, an early stopper is programmed into the model to prevent overfitting: it stops training if the validation loss does not improve for 5 epochs. In terms of temporal length, the LSTM is considerably better at learning long-term dependencies. As a result, picking a long window for this parameter was less detrimental for the LSTM than for the RNN. This process followed a similar procedure to the RNN, in which the autocorrelation lag was used as a guideline. The LSTM performed poorly on smaller window sizes; its most effective length was found to be 100 days.

Long short-term memory (LSTM) units address both of these issues. Developed by Hochreiter et al., they allow the preservation of the weights that are forward- and back-propagated through the layers. This is in contrast to the Elman RNN, in which the state gets overwritten at each step. They also allow the network to continue to learn over many time steps by maintaining a more constant error, which allows the network to learn long-term dependencies. An LSTM cell contains forget and remember gates which allow the cell to decide what information to block or pass based on its strength and importance. As a result, weak signals can be blocked, which prevents the vanishing gradient.
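A minimal sketch of the overfitting countermeasures described above, reusing the build_lstm model defined earlier and stopping training when the validation loss has not improved for 5 epochs, is shown below. The batch size and epoch budget are illustrative assumptions.

from tensorflow.keras.callbacks import EarlyStopping

def train_with_early_stopping(model, X_train, y_train, X_val, y_val):
    """Fit a model whose architecture already contains dropout, with an
    early stopper monitoring validation loss (patience of 5 epochs)."""
    stopper = EarlyStopping(monitor="val_loss", patience=5,
                            restore_best_weights=True)
    history = model.fit(X_train, y_train,
                        validation_data=(X_val, y_val),
                        epochs=200, batch_size=32,
                        callbacks=[stopper], verbose=0)
    return history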

CHAPTER -7

SYSTEM TESTING

TESTING METHODOLOGIES

The following are the Testing Methodologies:

o Unit Testing.
o Integration Testing.
o User Acceptance Testing.
o Output Testing.
o Validation Testing.

Unit Testing

Unit testing focuses verification effort on the smallest unit of software design, the module. Unit testing exercises specific paths in a module's control structure to ensure complete coverage and maximum error detection. This test focuses on each module individually, ensuring that it functions properly as a unit; hence the name unit testing.

During this testing, each module is tested individually and the module interfaces are verified for consistency with the design specification. All important processing paths are tested for the expected results. All error handling paths are also tested.

Integration Testing

Integration testing addresses the issues associated with the dual problems of verification and program construction. After the software has been integrated, a set of high-order tests is conducted. The main objective of this testing process is to take unit-tested modules and build a program structure that has been dictated by the design.

The following are the types of Integration Testing:

1. Top Down Integration

This method is an incremental approach to the construction of the program structure. Modules are integrated by moving downward through the control hierarchy, beginning with the main program module. The modules subordinate to the main program module are incorporated into the structure in either a depth-first or breadth-first manner. In this method, the software is tested from the main module, and individual stubs are replaced as the test proceeds downwards.

2. Bottom-up Integration

This method begins the construction and testing with the modules at the lowest level in
the program structure. Since the modules are integrated from the bottom up, processing required
for modules subordinate to a given level is always available and the need for stubs is eliminated.
The bottom-up integration strategy may be implemented with the following steps:
• The low-level modules are combined into clusters that perform a specific software sub-function.
• A driver (i.e. the control program for testing) is written to coordinate test case input and output.
• The cluster is tested.
• Drivers are removed and clusters are combined, moving upward in the program structure.

The bottom-up approach tests each module individually; each module is then integrated with a main module and tested for functionality.

7.1.3 User Acceptance Testing

User Acceptance of a system is the key factor for the success of any system. The system
under consideration is tested for user acceptance by constantly keeping in touch with the
prospective system users at the time of developing and making changes wherever required. The
system developed provides a friendly user interface that can easily be understood even by a
person who is new to the system.

7.1.4 Output Testing

After performing validation testing, the next step is output testing of the proposed system, since no system can be useful if it does not produce the required output in the specified format. The outputs generated or displayed by the system under consideration are tested by asking the users about the format required by them. Hence the output format is considered in two ways: one is on screen and the other is the printed format.
7.1.5 Validation Checking

Validation checks are performed on the following fields.

Text Field:

The text field can contain only a number of characters less than or equal to its size. The text fields are alphanumeric in some tables and alphabetic in other tables. An incorrect entry always flashes an error message.

Numeric Field:

The numeric field can contain only numbers from 0 to 9. An entry of any other character flashes an error message. The individual modules are checked for accuracy and for what they have to perform. Each module is subjected to a test run along with sample data. The individually tested modules are integrated into a single system. Testing involves executing the program with real data, and the existence of any program defect is inferred from the output. The testing should be planned so that all the requirements are individually tested.

A successful test is one that brings out the defects for inappropriate data and produces an output revealing the errors in the system.

Preparation of Test Data


The above testing is done using various kinds of test data. Preparation of test data plays a vital role in system testing. After preparing the test data, the system under study is tested using that test data. While testing the system using test data, errors are again uncovered and corrected by following the above testing steps, and the corrections are also noted for future use.

Using Live Test Data:

Live test data are those that are actually extracted from organization files. After a system
is partially constructed, programmers or analysts often ask users to key in a set of data from their
normal activities. Then, the systems person uses this data as a way to partially test the system. In
other instances, programmers or analysts extract a set of live data from the files and have them
entered themselves.

It is difficult to obtain live data in sufficient amounts to conduct extensive testing. And,
although it is realistic data that will show how the system will perform for the typical processing
requirement, assuming that the live data entered are in fact typical, such data generally will not
test all combinations or formats that can enter the system. This bias toward typical values then
does not provide a true systems test and in fact ignores the cases most likely to cause system
failure.

Using Artificial Test Data:


Artificial test data are created solely for test purposes, since they can be generated to test all combinations of formats and values. In other words, the artificial data, which can quickly be prepared by a data-generating utility program in the information systems department, make possible the testing of all logic and control paths through the program.

The most effective test programs use artificial test data generated by persons other than
those who wrote the programs. Often, an independent team of testers formulates a testing plan,
using the systems specifications.

The developed package has satisfied all the requirements specified in the software requirement specification and was accepted.

7.2 USER TRAINING

Whenever a new system is developed, user training is required to educate them about the
working of the system so that it can be put to efficient use by those for whom the system has
been primarily designed. For this purpose the normal working of the project was demonstrated to
the prospective users. Its working is easily understandable and since the expected users are
people who have good knowledge of computers, the use of this system is very easy.

7.3 MAINTENANCE

This covers a wide range of activities, including correcting code and design errors. To reduce the need for maintenance in the long run, we have more accurately defined the user's requirements during the process of system development. Depending on the requirements, this system has been developed to satisfy the needs to the largest possible extent. With developments in technology, it may be possible to add many more features based on future requirements. The coding and design are simple and easy to understand, which will make maintenance easier.

TESTING STRATEGY :

A strategy for system testing integrates system test cases and design techniques into a well-planned series of steps that results in the successful construction of software. The testing strategy must incorporate test planning, test case design, test execution, and the resulting data collection and evaluation. A strategy for software testing must accommodate low-level tests that are necessary to verify that a small source code segment has been correctly implemented, as well as high-level tests that validate major system functions against user requirements.

Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design and coding. Testing represents an interesting anomaly for software. Thus, a series of tests is performed on the proposed system before the system is ready for user acceptance testing.

SYSTEM TESTING:

Once validated, software must be combined with other system elements (e.g. hardware, people, database). System testing verifies that all the elements mesh properly and that overall system function and performance are achieved. It also tests for discrepancies between the system and its original objective, current specifications and system documentation.

UNIT TESTING:

In unit testing, different modules are tested against the specifications produced during the design of the modules. Unit testing is essential for verification of the code produced during the coding phase; hence the goal is to test the internal logic of the modules. Using the detailed design description as a guide, important control paths are tested to uncover errors within the boundary of the modules. This testing is carried out during the programming stage itself. In this type of testing step, each module was found to be working satisfactorily with regard to the expected output from the module.

In due course, the latest technology advancements will be taken into consideration. As part of the technical build-up, many components of the system will be generic in nature so that future projects can either use or interact with them. The future holds a lot to offer to the development and refinement of this project.

CHAPTER-8

Sample Coding and Screens Shots

8. SAMPLE CODE

using System;

using System.Collections.Generic;

using System.Data;

using System.Data.SqlClient;

using System.IO;

using System.Linq;

using System.Net;

using System.Net.Mail;

using System.Security.Cryptography;

using System.Text;

using System.Web;

using System.Web.UI;
using System.Web.UI.WebControls;

public partial class Default9 : System.Web.UI.Page
{
    // Connection string read from web.config
    String strConnString =
        System.Configuration.ConfigurationManager.ConnectionStrings["con"].ConnectionString;

    string pwd;

    protected void Page_Load(object sender, EventArgs e)
    {
        // Show the logged-in account number and load that customer's requests
        Label1.Text = Session["ac"].ToString();
        DataTable dt = new DataTable();
        string strQuery = "Select * from req Where cno='" + Label1.Text + "' order by rid";
        SqlCommand cmd = new SqlCommand(strQuery);
        SqlConnection con = new SqlConnection(strConnString);
        SqlDataAdapter sda = new SqlDataAdapter();
        cmd.CommandType = CommandType.Text;
        cmd.Connection = con;
        try
        {
            con.Open();
            sda.SelectCommand = cmd;
            sda.Fill(dt);
            GridView1.DataSource = dt;
            GridView1.DataBind();
            //GridView2.DataSource = dt;
            //GridView2.DataBind();
        }
        catch (Exception ex)
        {
            Response.Write(ex.Message);
        }
        finally
        {
            con.Close();
            sda.Dispose();
            con.Dispose();
            dt.Dispose();
        }
    }

    protected void GridView1_SelectedIndexChanged(object sender, EventArgs e)
    {
        // Look up the selected account and e-mail a random 4-digit "coin element" (OTP)
        string id = GridView1.SelectedRow.Cells[1].Text;
        SqlConnection con = new SqlConnection(strConnString);
        con.Open();
        SqlCommand cmd = new SqlCommand("select * from Account where Acc_no='" + id + "'", con);
        //cmd.Parameters.AddWithValue("@appid", Label7.Text);
        SqlDataReader dr = cmd.ExecuteReader();
        dr.Read();
        if (dr.HasRows)
        {
            string mail = dr["Email"].ToString();

            // Build a 4-digit random code from the allowed characters
            string allowedChars = "";
            allowedChars += "1,2,3,4,5,6,7,8,9,0";
            char[] sep = { ',' };
            string[] arr = allowedChars.Split(sep);
            //string passwordString = "";
            string temp = "";
            Random rand = new Random();
            for (int i = 0; i < 4; i++)
            {
                temp = arr[rand.Next(0, arr.Length)];
                pwd += temp;
            }
            Session["otp"] = pwd;

            // Send the code to the customer's e-mail address
            using (MailMessage mm = new MailMessage("dotnetjava.projects@gmail.com", mail))
            {
                mm.Subject = "Your Coin Element";
                mm.Body = "Hai Customer, Your Coin Element " + pwd;
                mm.IsBodyHtml = false;
                SmtpClient smtp = new SmtpClient();
                smtp.Host = "smtp.gmail.com";
                smtp.EnableSsl = true;
                NetworkCredential NetworkCred =
                    new NetworkCredential("dotnetjava.projects@gmail.com", "ve_dotnetjava2016");
                smtp.UseDefaultCredentials = false;
                smtp.Credentials = NetworkCred;
                smtp.Port = 587;
                smtp.Send(mm);
                //ClientScript.RegisterStartupScript(GetType(), "alert", "alert('Email sent.');", true);
            }
        }
        con.Close();
    }

    protected void OnPageIndexChanging(object sender, GridViewPageEventArgs e)
    {
        GridView1.PageIndex = e.NewPageIndex;
        GridView1.DataBind();
    }

    private string Decrypt(string cipherText)
    {
        // AES decryption using a password-derived key (mirrors the encryption used when the data was stored)
        string EncryptionKey = "123456";
        byte[] cipherBytes = Convert.FromBase64String(cipherText);
        using (Aes encryptor = Aes.Create())
        {
            Rfc2898DeriveBytes pdb = new Rfc2898DeriveBytes(EncryptionKey,
                new byte[] { 0x49, 0x76, 0x61, 0x6e, 0x20, 0x4d, 0x65, 0x64, 0x76, 0x65, 0x64, 0x65, 0x76 });
            encryptor.Key = pdb.GetBytes(32);
            encryptor.IV = pdb.GetBytes(16);
            using (MemoryStream ms = new MemoryStream())
            {
                using (CryptoStream cs = new CryptoStream(ms, encryptor.CreateDecryptor(), CryptoStreamMode.Write))
                {
                    cs.Write(cipherBytes, 0, cipherBytes.Length);
                    cs.Close();
                }
                cipherText = Encoding.Unicode.GetString(ms.ToArray());
            }
        }
        return cipherText;
    }

    protected void OnRowDataBound(object sender, GridViewRowEventArgs e)
    {
        // Decrypt the two encrypted columns before they are rendered in the grid
        if (e.Row.RowType == DataControlRowType.DataRow)
        {
            e.Row.Cells[5].Text = Decrypt(e.Row.Cells[5].Text);
            e.Row.Cells[6].Text = Decrypt(e.Row.Cells[6].Text);
        }
    }
}

public partial class Default11 : System.Web.UI.Page
{
    // Connection string read from web.config
    String strConnString =
        System.Configuration.ConfigurationManager.ConnectionStrings["con"].ConnectionString;

    string pwd;

    protected void Page_Load(object sender, EventArgs e)
    {
        // Show the logged-in account number and load that customer's requests
        Label1.Text = Session["ac"].ToString();
        DataTable dt = new DataTable();
        string strQuery = "Select * from req Where cno='" + Label1.Text + "' order by rid";
        SqlCommand cmd = new SqlCommand(strQuery);
        SqlConnection con = new SqlConnection(strConnString);
        SqlDataAdapter sda = new SqlDataAdapter();
        cmd.CommandType = CommandType.Text;
        cmd.Connection = con;
        try
        {
            con.Open();
            sda.SelectCommand = cmd;
            sda.Fill(dt);
            GridView1.DataSource = dt;
            GridView1.DataBind();
            //GridView2.DataSource = dt;
            //GridView2.DataBind();
        }
        catch (Exception ex)
        {
            Response.Write(ex.Message);
        }
        finally
        {
            con.Close();
            sda.Dispose();
            con.Dispose();
            dt.Dispose();
        }
    }

    protected void GridView1_SelectedIndexChanged(object sender, EventArgs e)
    {
    }

    protected void OnPageIndexChanging(object sender, GridViewPageEventArgs e)
    {
        GridView1.PageIndex = e.NewPageIndex;
        GridView1.DataBind();
    }

    private string Decrypt(string cipherText)
    {
        // AES decryption, identical to the Default9 implementation
        string EncryptionKey = "123456";
        byte[] cipherBytes = Convert.FromBase64String(cipherText);
        using (Aes encryptor = Aes.Create())
        {
            Rfc2898DeriveBytes pdb = new Rfc2898DeriveBytes(EncryptionKey,
                new byte[] { 0x49, 0x76, 0x61, 0x6e, 0x20, 0x4d, 0x65, 0x64, 0x76, 0x65, 0x64, 0x65, 0x76 });
            encryptor.Key = pdb.GetBytes(32);
            encryptor.IV = pdb.GetBytes(16);
            using (MemoryStream ms = new MemoryStream())
            {
                using (CryptoStream cs = new CryptoStream(ms, encryptor.CreateDecryptor(), CryptoStreamMode.Write))
                {
                    cs.Write(cipherBytes, 0, cipherBytes.Length);
                    cs.Close();
                }
                cipherText = Encoding.Unicode.GetString(ms.ToArray());
            }
        }
        return cipherText;
    }

    protected void OnRowDataBound(object sender, GridViewRowEventArgs e)
    {
    }
}
CONCLUSION :

Deep learning models such as the RNN and LSTM are evidently effective for Bitcoin prediction, with the LSTM more capable of recognising longer-term dependencies. However, the high variance of a task of this nature makes it difficult to translate this into impressive validation results, and as a result it remains a difficult task. There is a fine line between overfitting a model and preventing it from learning sufficiently, and dropout is a valuable feature to assist in improving this. However, despite using Bayesian optimisation to optimize the selection of dropout, good validation results still could not be guaranteed. Despite the metrics of sensitivity, specificity and precision indicating good performance, the actual error of the ARIMA forecast was significantly worse than that of the neural network models. The LSTM outperformed the RNN marginally, but not significantly; however, the LSTM takes considerably longer to train. The performance benefits gained from the parallelization of machine learning algorithms on a GPU are evident, with a 70.7% performance improvement for training the LSTM model.

FUTURE WORK:

Looking at the task from a purely classification perspective, it may be possible to achieve better results. One limitation is that the model has not been implemented in a practical or real-time setting for predicting into the future, as opposed to learning what has already happened. In addition, the ability to predict using streaming data should improve the model. Sliding window validation is an approach not implemented here, but it may be explored as future work.

REFERENCES:

[1] S. Nakamoto, "Bitcoin: A peer-to-peer electronic cash system," 2008.

[2] M. Brière, K. Oosterlinck, and A. Szafarz, "Virtual currency, tangible return: Portfolio diversification with bitcoins," Tangible Return: Portfolio Diversification with Bitcoins (September 12, 2013), 2013.

[3] I. Kaastra and M. Boyd, "Designing a neural network for forecasting financial and economic time series," Neurocomputing, vol. 10, no. 3, pp. 215–236, 1996.

[4] H. White, "Economic prediction using neural networks: The case of IBM daily stock returns," in Neural Networks, 1988, IEEE International Conference on. IEEE, 1988, pp. 451–458.

[5] C. Chatfield and M. Yar, "Holt-Winters forecasting: some practical issues," The Statistician, pp. 129–140, 1988.

[6] B. Scott, "Bitcoin academic paper database," suitpossum blog, 2016.

[7] M. D. Rechenthin, "Machine-learning classification techniques for the analysis and prediction of high-frequency stock direction," 2014.

[8] D. Shah and K. Zhang, "Bayesian regression and bitcoin," in Communication, Control, and Computing (Allerton), 2014 52nd Annual Allerton Conference on. IEEE, 2014, pp. 409–414.

[9] G. H. Chen, S. Nikolov, and D. Shah, "A latent source model for nonparametric time series classification," in Advances in Neural Information Processing Systems, 2013, pp. 1088–1096.

[10] I. Georgoula, D. Pournarakis, C. Bilanakos, D. N. Sotiropoulos, and G. M. Giaglis, "Using time-series and sentiment analysis to detect the determinants of bitcoin prices," Available at SSRN 2607167, 2015.

[11] M. Matta, I. Lunesu, and M. Marchesi, "Bitcoin spread prediction using social and web search media," Proceedings of DeCAT, 2015.

[12] M. Matta, I. Lunesu, and M. Marchesi, "The predictor impact of web search media on bitcoin trading volumes."

[13] B. Gu, P. Konana, A. Liu, B. Rajagopalan, and J. Ghosh, "Identifying information in stock message boards and its implications for stock market efficiency," in Workshop on Information Systems and Economics, Los Angeles, CA, 2006.

[14] A. Greaves and B. Au, "Using the bitcoin transaction graph to predict the price of bitcoin," 2015.

[15] I. Madan, S. Saluja, and A. Zhao, "Automated bitcoin trading via machine learning algorithms," 2015.
