
Data mining uses in mining

Tad S. Golosinski
University of Missouri-Rolla, Rolla, MO, USA

ABSTRACT: The paper discusses potential use of data mining techniques in mining. It reviews the basic
techniques and methods of data mining and proceeds to identify possible mining applications of this
methodology. In particular the paper proposes use of data mining to develop predictive capacity related to
condition and performance of mining equipment. Other possible uses of data mining include optimization of
mine performance as well as equipment operator training.
1 INTRODUCTION

Modern mine control systems and mine equipment are highly computerized. As a result, large
volumes of data are collected that define mine performance and equipment condition. Some of
this data is processed in real time to provide information that allows mine performance to be
optimized. Examples are the fleet dispatch systems that develop equipment assignments best
matched to the stated objectives of the mining operation, based on real-time processing of
data that defines equipment status and location.

Most of the collected data, however, is used only for reporting and post-mortem analysis of
mine performance, for equipment failure analysis and for prevention of catastrophic equipment
failures. An example is the vital signs monitoring systems installed on larger pieces of
mining equipment. These systems collect data generated by a variety of sensors and store it
to facilitate failure diagnostics. In addition, these systems can warn the operator of
impending failure or conduct an orderly equipment shut-down if an emergency occurs.

The availability of huge databases and spreading computerization have led to large strides in
data processing capabilities and techniques. A variety of powerful data processing methods
have been developed in recent years that facilitate rapid processing of voluminous data to
extract user-friendly information. One such method is data mining. Originally developed by
the intelligence community to look for information in huge communication databases, data
mining has since found a range of commercial and scientific applications. Nowadays it is
widely used by the retail industry to analyze sales and to direct promotion and marketing
efforts, by cellular telephone companies to assure client retention, by scientists to search
for information in the large databases created by the Hubble Space Telescope, and in many
other applications.

This paper briefly reviews data mining and the related techniques, and proposes their use for
discovery of knowledge in data acquired by the variety of data acquisition systems used in
today’s mines. In particular, the paper suggests that data mining can be used to develop
predictive capacity related to equipment condition and performance. Data mining offers a
potential for further, significant improvement of mine performance.

2 DATA MINING

Data mining is an iterative process that involves setting the objectives of the search,
selecting and cleaning input data, transforming it, running a mining function and
interpreting the results. The schematic in Figure 1, adapted from IBM (International Business
Machines, 2000), presents these tasks graphically.

Figure 1. The data mining process

The selection of data to be analyzed may involve integration of data from various sources and
often requires reformatting it to suit the data mining software. In a mining situation where
the objective may be optimization of Komatsu truck performance, data on load carried, on
cycle times, and on truck component performance may be needed, acquired in different formats
from an engine monitoring system (say a Cummins engine monitoring system), from a truck
dispatch system (say Modular Mining’s Dispatch), and from an on-board weigh
measuring system provided by a third party. A major problem may be making the data formats
compatible with each other and with that of the data mining software to be used.

The next step, transforming or pre-processing the data, may involve filtration,
discretization, data joining and similar actions. It organizes the data so that it can be
mined efficiently. In the case of the Komatsu truck mentioned above, data joining would be a
major task, as would discretization and filtration.

The data is then mined using one or more of the data mining techniques briefly discussed
below. It should be noted that data mining did not originally relate to mining. It is a
general-purpose data processing method that permits discovery of information that may exist
in various databases.

Interpreting the results is the last and a very important step of data mining. Various
visualization tools are usually used in this step; they allow easy viewing and identification
of the information discovered during the data mining process.

DATA MINING TECHNIQUES

A number of techniques are used in data mining, each with its own interesting applications.
Several textbooks summarize and describe these techniques (Berson and Smith, 1997; Westphal
and Blaxton, 1998; Weiss and Indurkhya, 1998; others). As an example, Berson and Smith (1997)
classify data mining techniques as follows.

2.1 Decision trees

Decision trees are predictive models that can be viewed as a tree, with the branches
representing classification questions and the leaves representing partitions of the data set
with their classification. The prediction is made on the basis of a series of sequential
decisions. Thus, in the case of mining trucks, a decision tree could be used to identify
which trucks are most likely to fail, and when, based on such questions as: what is the truck
make, how old is it, how long has it operated, what is its past repair history, who was its
operator, and the like. A decision tree model can be confirmed or modified by hand, and it
can be directed based on the expertise of the person constructing it.

Decision tree models are best used for exploration of the data set and of the problem at
hand. This is done by looking at the predictors and values that are chosen for each split of
the tree. They can also be used for data pre-processing for other prediction algorithms. An
example of such an application is shown in the companion paper (Golosinski et al, 2001).

2.2 Neural networks

Neural networks are computer implementations of sophisticated pattern detection and machine
learning algorithms used to build predictive models from large historical databases. They
allow construction of highly accurate predictive models that serve to solve a large number of
different problems. The main problem with neural models is their lack of clarity, the price
often paid for their complexity and high accuracy. To overcome this problem, various
visualization techniques are used in conjunction with neural models to help explain and
control the model.

The primary application of neural models in data mining is clustering, the technique used to
segment a database into clusters, or sub-sets, based on a set of predetermined attributes.
The ability of neural models to perform accurate numerical predictions has led to a variety
of applications, including predictions of stock market behavior. As related to a mining
truck, neural clustering may be used to define and quantify the relations between the various
data streams collected on this truck, followed by clustering of these streams into mutually
dependent groups. Thus, for example, the factors that have an impact on the cycle time of the
truck can be defined and quantified.

2.3 Nearest neighbor and clustering

Both these techniques are very intuitive and were among the first used for data mining.
Nearest neighbor prediction algorithms are convenient and simple predictive tools that allow
a clear explanation of why a prediction was made. The predictions are based on the behavior
or properties of the “neighbor” data, with the highest weight assigned to the data that is
closest. Clustering is the grouping, or “clustering”, together of data that has the same or
similar attributes.

Both clustering and nearest neighbor techniques are among the easiest to use and have a
variety of applications. Both are primarily used for prediction of new data rather than
extraction of rules from extensive databases. Using the mine truck example, these techniques
appear to be most suited for prediction of when and how this truck will fail, a key piece of
information for a mine operator.

2.4 Genetic algorithms

Genetic algorithms refer to simulated evolutionary systems that dictate how populations
should be formed, evaluated and modified. One of a variety of algorithms known as
optimization techniques, genetic algorithms are in their infancy and more experience with
them is required before a mine-related use can be proposed.
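
The distance weighting described for nearest neighbor prediction in section 2.3 can be
illustrated with a short Python sketch. It is not taken from the paper: the function name
weighted_knn_predict, the inverse-distance weighting, the choice of features and all data
values are assumptions made for illustration only. In practice the features would also need
to be scaled to comparable ranges before distances are computed.

```python
import math


def weighted_knn_predict(history, query, k=3):
    """Predict a value for `query` from the k closest historical records.

    history: list of (features, value) pairs; features are equal-length tuples.
    Closer neighbors receive a higher weight (the inverse of their distance).
    """
    # Rank historical records by Euclidean distance to the query.
    ranked = sorted(
        (math.dist(features, query), value) for features, value in history
    )
    nearest = ranked[:k]

    # An exact match decides the prediction outright (avoids division by zero).
    for distance, value in nearest:
        if distance == 0.0:
            return value

    # Inverse-distance weighted average of the neighbor values.
    weights = [1.0 / distance for distance, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical records: (payload fraction, haul road grade in %) -> cycle time in minutes.
history = [
    ((0.6, 2.0), 18.0),
    ((0.7, 4.0), 21.0),
    ((0.9, 8.0), 29.0),
    ((0.8, 6.0), 25.0),
]

# Predicted cycle time for an unseen operating point, using the two closest records.
print(weighted_knn_predict(history, (0.75, 5.0), k=2))
```

Because the prediction is a weighted average of a few named records, it is easy to show which
past observations produced it, which is the explanatory advantage noted in section 2.3.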
2.5 Rule induction

Rule induction is one of the most common forms of knowledge discovery in unsupervised
learning systems. This technique is often used to “mine” databases, to discover information
that is not obvious or readily available. The technique retrieves all potentially interesting
data patterns in the database, with the rules found being generally simple and easy to
understand.

Rule induction can be used to make predictions, but its main use is for unsupervised learning
to find rules that are not already known. In reference to the mining truck, rule induction
may be used to define relations between the various data streams collected on this truck. As
an example, a rule can be discovered that states: “if this truck is operated by operator x
and it is Monday, the performance of the truck will be dismal”. Likewise, a rule can be
defined that states: “if the truck engine overheats and strut pressures are within a certain
range, the truck is overloaded”.

This technique offers great promise if applied to mining equipment operator training.

2.6 Statistical methods

Use of statistics is by far the most common approach to data analysis, and various
statistical theories and calculations can be used to discover hidden patterns in databases.
These include, but are not limited to, regression, curve fitting, principal component
analysis, factor analysis and others.

As statistics is a well established science and a huge volume of information on its
application to pattern discovery is available, this data mining technique is not discussed
further in this paper.

3 MINING USES OF DATA MINING

The focus of data mining is to discover and define hidden patterns and trends. Once a pattern
is defined it can be used in many ways, such as a training input into a neural network or
encoded as a rule into an expert system. Traditional applications of data mining include
those for monitoring medical bill fraud, marketing with coupons, monitoring credit card
transactions, and the like (Westphal and Blaxton, 1998).

Data mining is estimated to be a $20 billion industry today. In spite of this, to the best
knowledge of the author, no attempt has been made so far to use data mining techniques to
address mining related problems.

Huge volumes of various data are collected on today’s mining equipment. As an example, each
large off-highway truck manufactured by Caterpillar is equipped with the so-called VIMS
(Vital Information Management System), which has the capacity to collect, store and transmit
information from over 150 sensors installed throughout the truck. With a sensor sampling rate
of one reading per second and the truck operating 7,000 hrs per year, over 3,780 MB of data
can be collected for each truck during one year of its operation (150 sensors x 3,600 s x
7,000 hrs is about 3.78 billion readings, which implies roughly one byte per reading).

While some of this data is used to generate information describing truck performance and
condition, most of the collected data remains unused and is not analyzed. Very little of it,
if any at all, is used to forecast truck condition or performance into the future. Instead,
the whole data analysis effort is directed at assessment of past performance. Use of data
mining techniques for information discovery in this huge database appears to be one of the
promising ways to improve performance of many mines.

Review of current industrial applications of data mining indicates that there are numerous
opportunities for its use in mines. The three most obvious applications are (1) mining
equipment condition monitoring and failure prediction, (2) quantification and prognostication
of mining equipment performance, and (3) training of equipment operators.

3.1 Equipment condition

This application offers the highest potential for successful application of data mining in
mining. The approach judged most promising is to (1) find, define and quantify the relations
between various indicators of equipment condition based on data mining of the data collected
by relevant sensors, and (2) use the discovered relations to build predictive models that
would permit prognosticating future equipment performance.

The data mining techniques of clustering and association appear to be the most promising in
defining the relations and associations that may be of interest. On the other hand, rule
induction and polynomial regression, the latter not discussed here, may be the best
techniques to develop the predictive capability.

3.2 Equipment performance

In addition to equipment condition related data, a variety of performance related data is
available for each piece of mining equipment. This data is collected through the fleet
dispatch systems now used by a majority of surface mines and some underground mines.
Alternatively, this data can be collected by on-board monitoring systems, an example being
the Caterpillar VIMS discussed above. If installed on a mining truck, the VIMS collects data
on truck load size, truck speeds, and the like. It also calculates cycle times and other
truck performance related data, and stores all of it for downloading or transmittal to mine
databases.

Similar to the equipment condition monitoring discussed above, the database that contains
equipment performance data can be mined for pattern discovery. Discovery of the patterns
which undoubtedly exist in this database may then permit construction of a model able to
prognosticate performance of the mining equipment under a variety of scenarios. While this
concept is somewhat similar to the fleet simulation models that may be a part of the dispatch
system, it offers a number of added benefits. These include, but are not limited to, the
ability to set performance standards for future enforcement and to define the optimum
operating parameters for various pieces of equipment.

3.3 Operator training

As an extension of data mining use for mine performance improvement, hidden pattern and trend
discovery can be used to design and implement more effective operator training programs.
Based on quantified patterns and trends, the optimum operator responses to various operating
conditions can be defined and communicated to the operator. This may include definition of
the optimum speed on a specific segment of a haulroad, definition of the optimum load,
definition of the optimum accelerating and braking patterns, and the like.

4 CONCLUSIONS

Modern mines generate huge quantities of data that describe and quantify the condition and
performance of mine equipment and of the mines themselves. Availability of this data creates
a unique opportunity to improve the performance of both.

Data mining, a set of techniques used to discover hidden relations and trends in large
databases, is the likely tool that will permit realizing this opportunity.

The most obvious mining applications of data mining are prognostication of the condition of
mining equipment, prognostication of its performance, and training of equipment operators.

REFERENCES

Berson, A. and Smith, S.J. 1997. Data warehousing, data mining and OLAP. McGraw-Hill.
Golosinski, T.S., Hu, Hui and Elias, R. 2001. Data mining VIMS for information on truck
condition. APCOM 2001, Beijing, China.
International Business Machines Corp. 1999. Using the Intelligent Miner for data. Company
publication.
Weiss, S.M. and Indurkhya, N. 1998. Predictive data mining. Morgan Kaufmann Publishers, Inc.
Westphal, C. and Blaxton, T. 1998. Data mining solutions. John Wiley & Sons, Inc.
