Data mining uses in mining

Tad S. Golosinski
University of Missouri-Rolla, Rolla, MO, USA

ABSTRACT: The paper discusses potential use of data mining techniques in mining. It reviews the basic techniques and methods of data mining and proceeds to identify possible mining applications of this methodology. In particular, the paper proposes use of data mining to develop predictive capacity related to the condition and performance of mining equipment. Other possible uses of data mining include optimization of mine performance as well as equipment operator training.

are highly computerized. One result of this situation is that large volumes of data are collected that define mine performance and equipment condition. Some of this data is processed in real time to provide information that allows for optimization of mine performance. Examples are the fleet dispatch systems that develop equipment assignments best matched to the stated objectives of the mining operation, based on real-time processing of data that defines equipment status and location. Most of the collected data, however, is used only for reporting and post-mortem analysis of mine performance, for equipment failure analysis and for prevention of catastrophic failures. An example is the vital signs monitoring systems installed on larger pieces of mining equipment. These systems collect data generated by a variety of sensors and store it to facilitate easy failure diagnostics. In addition, these systems are able to warn the operator of impending failure or to conduct an orderly equipment shut-down if an emergency situation occurs.

The availability of huge databases and spreading computerization have led to large strides in data processing capabilities and techniques. A variety of powerful data processing methods have been developed over recent years that facilitate rapid processing of voluminous data for extraction of user-friendly information. One such method is data mining. Originally developed by the intelligence community to look for information in huge communication databases, data mining has since found a range of commercial and scientific applications. Nowadays it is widely used by the retail industry to analyze sales, direct promotion and marketing efforts, by cellular telephone companies to assure client retention, by scientists to search for information in large databases created by the Hubble space telescope, and in many other applications.

This paper briefly reviews data mining and the related techniques, and proposes their use for discovery of knowledge in data acquired by the variety of data acquisition systems used in today's mines. In particular, the paper suggests that data mining can be used to develop predictive capacity related to equipment condition and performance. Data mining offers a potential for further, significant improvement of mine performance.

2 DATA MINING

Data mining is an iterative process that involves setting the objectives of the search, selecting and cleaning input data, transforming it, running a mining function and interpreting the results. The schematic in Fig. 1, adopted from IBM (International Business Machines, 2000), presents these tasks graphically.

The selection of data to be analyzed may involve integration of data from various sources and often requires reformatting it to fit the format accepted by the data mining software. In a mining situation where the objective may be optimization of Komatsu truck performance, data on load carried, on cycle times, and on truck component performance may be needed, acquired in different formats from an engine monitoring system (say the Cummins engine monitoring system), from a truck dispatch system (say Modular Mining's Dispatch), and from an on-board weigh
measuring system provided by a third party. A major problem may be faced in making the data formats compatible with each other and with that of the data mining software to be used.

Figure 1. The data mining process

The next step, transforming the data, or its pre-processing, may involve filtration, discretization, data joining and similar actions. It allows organization of the data so that it may be mined efficiently. In the case of the Komatsu truck mentioned above, data joining would be a major task, as would discretization and filtration.

Mining the data is done using one or more of the data mining techniques briefly discussed below. It should be noted that data mining did not originally relate to mining. It is a general-purpose data processing method that permits discovery of information that may exist in various databases.

Interpreting the results is the last and a very important step of data mining. Usually various visualization tools are used in the process, which allow for easy viewing and identification of the information discovered during the data mining process.

DATA MINING TECHNIQUES

A number of techniques are used in data mining, each with its own interesting applications. Several textbooks summarize and describe these techniques (Berson and Smith, 1997, Westphal and Blaxton, 1998, Weiss and Indurkhya, 1998, others). As an example, Berson and Smith (1997) classify data mining techniques as follows.

2.1 Decision trees

Decision trees are predictive models that can be viewed as a tree, with tree branches representing a classification question and the leaves representing partitions of the data set with their classification. The prediction is made on the basis of a series of sequential decisions. Thus, in the case of mining trucks, a decision tree could be used to identify which trucks are most likely to fail, and when, based on such questions as: what is the truck make, how old is it, how long has it operated, what is its past repair history, who was its operator, and the like.
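As an illustration of the branching question described above, the following minimal sketch finds the single question (one tree split) that best separates failed from healthy trucks. The truck records, the attribute names and the use of Gini impurity as the split criterion are assumptions made for illustration only; no particular tree-building algorithm is prescribed here, and a full tree would apply the same step recursively to each branch.

```python
from collections import Counter

# Hypothetical truck records: (age_years, operating_hours, failed).
# The values are invented for illustration; they are not field data.
RECORDS = [
    (2, 9000, False),
    (3, 14000, False),
    (4, 21000, False),
    (5, 20000, False),
    (5, 33000, True),
    (6, 30000, True),
    (7, 36000, True),
    (8, 41000, True),
]
FEATURES = ("age_years", "operating_hours")


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure partition)."""
    if not labels:
        return 0.0
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def fit_stump(records):
    """Find the one-question tree (feature <= threshold) with the lowest impurity."""
    best = None
    for index, name in enumerate(FEATURES):
        values = sorted({r[index] for r in records})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r[-1] for r in records if r[index] <= threshold]
            right = [r[-1] for r in records if r[index] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
            if best is None or impurity < best["impurity"]:
                best = {
                    "feature": name,
                    "index": index,
                    "threshold": threshold,
                    "impurity": impurity,
                    # Each leaf predicts the majority class of its partition.
                    "left_label": Counter(left).most_common(1)[0][0],
                    "right_label": Counter(right).most_common(1)[0][0],
                }
    return best


def predict(stump, record):
    """Answer the stump's question for one record and return the leaf's class."""
    if record[stump["index"]] <= stump["threshold"]:
        return stump["left_label"]
    return stump["right_label"]
```

On the hypothetical records above, fit_stump(RECORDS) selects the question "operating_hours <= 25500.0", which separates the two classes completely. With real data the partitions would rarely be pure, which is why further questions are added below each branch.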
A decision tree model can be confirmed or modified by hand, and it can be directed based on the expertise of the person constructing it. Decision tree models are best used for exploration of the data set and of the problem at hand. This is done by looking at the predictors and values that are chosen for each split of the tree. They can also be used for data pre-processing for other prediction algorithms. An example of such an application is shown in the companion paper (Golosinski et al, 2001).

2.2 Neural networks

Neural networks are computer implementations of sophisticated pattern detection and machine learning algorithms used to build predictive models from large historical databases. They allow for construction of highly accurate predictive models that serve to solve a large number of different problems. The main problem with neural modeling is lack of clarity, the price often paid for complexity and high accuracy. To overcome this problem, various visualization techniques are used in conjunction with neural models to help explain and control the model.

The primary application of neural models in data mining is clustering, the technique that is used to segment a database into clusters, or sub-sets, based on a set of predetermined attributes. The ability of neural models to perform accurate numerical predictions has led to a variety of applications, including prediction of stock market behavior.

As related to a mining truck, neural clustering may be used to define and quantify the relations between various data streams collected on this truck, followed by
clustering of these streams into mutually dependent groups. Thus, for example, the factors that have an impact on the cycle time of the truck can be defined and quantified.

2.3 Nearest neighbor and clustering

Both these techniques are very intuitive and among the first used for data mining. Nearest neighbor prediction algorithms are convenient and simple predictive tools that allow for clear explanation of why a prediction was made. The predictions are based on the behavior or properties of the "neighbor" data, with the highest weight assigned to the data that is closest. Clustering is grouping, or "clustering", together the data that has the same or similar attributes.

Both clustering and nearest neighbor techniques are among the easiest to use and have a variety of applications. Both are primarily used for prediction of new data rather than extraction of rules from extensive databases. Using the mine truck example, these techniques appear to be most suited for prediction of when and how this truck will fail, a key piece of information for a mine operator.

2.4 Genetic algorithms

Genetic algorithms refer to simulated evolutionary systems that dictate how populations should be formed, evaluated and modified. One of a variety of algorithms known as optimization techniques, genetic algorithms are in their infancy, and more experience with them is required before a mine-related use can be proposed.

2.5 Rule induction

Rule induction is one of the most common forms of knowledge discovery in unsupervised learning systems. This technique is often used to "mine" databases, to discover information that is not obvious or readily available. The technique retrieves all potentially interesting data patterns in the database, with the found rules being generally simple and easy to understand. Rule induction can be used to make predictions, but its main use is for unsupervised learning to find rules that are not already known.
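A minimal sketch of the idea follows. It enumerates simple IF-THEN rules over a small table of records and keeps those that are both frequent (support) and reliable (confidence). The shift records, the attribute names and the two thresholds are hypothetical, and fixing one attribute as the rule outcome is a simplification made for brevity; practical rule induction tools search over all attributes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-shift truck records; names and values are illustrative only.
SHIFTS = [
    {"engine": "overheated", "strut_pressure": "high", "payload": "overloaded"},
    {"engine": "overheated", "strut_pressure": "high", "payload": "overloaded"},
    {"engine": "overheated", "strut_pressure": "normal", "payload": "normal"},
    {"engine": "normal", "strut_pressure": "high", "payload": "normal"},
    {"engine": "normal", "strut_pressure": "normal", "payload": "normal"},
    {"engine": "normal", "strut_pressure": "normal", "payload": "normal"},
]


def induce_rules(records, target, min_support=0.3, min_confidence=0.9,
                 max_conditions=2):
    """Return rules (antecedent, outcome, support, confidence) passing both thresholds.

    support    = share of all records matching the conditions and the outcome
    confidence = share of records matching the conditions that show the outcome
    """
    total = len(records)
    attributes = sorted(key for key in records[0] if key != target)
    rules = []
    for size in range(1, max_conditions + 1):
        for attrs in combinations(attributes, size):
            antecedent_counts = Counter()
            joint_counts = Counter()
            for record in records:
                antecedent = tuple((a, record[a]) for a in attrs)
                antecedent_counts[antecedent] += 1
                joint_counts[(antecedent, record[target])] += 1
            for (antecedent, outcome), hits in joint_counts.items():
                support = hits / total
                confidence = hits / antecedent_counts[antecedent]
                if support >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, outcome, support, confidence))
    return rules


def describe(rule, target):
    """Render one rule as a readable IF-THEN sentence."""
    antecedent, outcome, support, confidence = rule
    conditions = " and ".join(f"{name} is {value}" for name, value in antecedent)
    return (f"IF {conditions} THEN {target} is {outcome} "
            f"(support {support:.2f}, confidence {confidence:.2f})")
```

Calling induce_rules(SHIFTS, target="payload") on these records keeps the two-condition rule linking an overheated engine and high strut pressure to an overloaded payload, while the weaker single-condition rule "engine is overheated" is rejected because its confidence falls below the threshold.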
In reference to the mining truck, rule induction may be used to define relations between the various data streams collected on this truck. As an example, a rule can be discovered that states: "if this truck is operated by operator X and it is Monday, the performance of the truck will be dismal". Likewise, a rule can be defined that states: "if the truck engine overheats and strut pressures are within certain range, the truck is overloaded". This technique offers great promise if applied to mining equipment operator training.

2.6 Statistical methods

Use of statistics is by far the most common approach to data analysis, and various statistical theories and calculations can be used to discover hidden patterns in databases. These include, but are not limited to, regression, curve fitting, principal component analysis, factor analysis and others. As statistics is one of the well established sciences and a huge volume of information on its application to pattern discovery is available, this data mining technique is not discussed further in this paper.

3 MINING USES OF DATA MINING

The focus of data mining is to discover and define hidden patterns and trends. Once a pattern is defined it can be used in many ways, such as a training input into a neural network or encoded as a rule in an expert system. Traditional applications of data mining include those for monitoring medical bill fraud, marketing with coupons, monitoring credit card transactions, and the like (Westphal and Blaxton, 1998). Data mining is estimated to be a $20 billion industry today. In spite of this, to the best knowledge of the author, no attempt has been made so far to use data mining techniques to address mining related problems.

Huge volumes of various data are collected on today's mining equipment.
As an example, each large off-highway truck manufactured by Caterpillar is equipped with the so-called VIMS (Vital Information Management System), which has the capacity to collect, store and transmit information from over 150 sensors installed throughout the truck. With a sensor sampling rate of one reading per second and the truck operating 7,000 hrs per year, over 3,780 MB of data can be collected for each truck during one year of its operation (150 sensors × 3,600 readings per hour × 7,000 hours is about 3.78 billion readings, which matches this figure if each reading takes about one byte). While some of this data is used to generate information describing truck performance and condition, most of the collected data remains unused and is not analyzed. Very little of it, if any at all, is used to forecast truck condition or performance into the future. Instead, the whole data analysis effort is directed at assessment of past performance. Use of data mining techniques for information discovery in this huge database appears to be one of the promising ways to improve the performance of many mines.

Review of current industrial applications of data mining indicates that there are numerous opportunities for its use in mines. The three most obvious applications are (1) mining equipment condition monitoring and failure prediction, (2) quantification and prognostication of mining equipment performance, and (3) training of equipment operators.

3.1 Equipment condition

This application offers the highest potential for successful application of data mining in mining. The approach judged most promising is to (1) find, define and quantify the relations between various indicators of equipment condition based on data mining of the data collected by relevant sensors, and (2) use the discovered relations to build predictive models that would permit prognosticating future equipment performance. The data mining techniques of clustering and association appear to be the most promising in defining the relations and associations that may be of interest.
On the other hand, rule induction and polynomial regression, the latter not discussed here, may be the best techniques to develop the predictive capability.

3.2 Equipment performance

In addition to equipment condition related data, a variety of performance related data is available for each piece of mining equipment. This data is collected through fleet dispatch systems now used by a majority of surface mines and some underground mines. Alternatively, this data can be collected by on-board monitoring systems, an example being the Caterpillar VIMS system discussed above. If installed on a mining truck, the VIMS collects data on truck load size, truck speeds, and the like. It also calculates cycle times and other truck performance related data, and stores all of it for downloading or transmittal to mine databases.

Similar to equipment condition monitoring, discussed above, the database that contains equipment performance data can be mined for pattern discovery. Discovery of the patterns which undoubtedly exist in this database may then permit construction of a model able to prognosticate performance of the mining equipment under a variety of scenarios. While this concept is somewhat similar to the fleet simulation models that may be a part of the dispatch system, it offers a number of added benefits. These include, but are not limited to, the ability to set performance standards for future enforcement and to define the optimum operating parameters for various pieces of equipment.

3.3 Operator training

As an extension of data mining use for mine performance improvement, hidden pattern and trend discovery can be used to design and implement a more effective operator training program. Based on quantified patterns and trends, the optimum operator responses to various operating conditions can be defined and communicated to the operator. This may include definition of the optimum speed on a specific segment of a haulroad, definition of the optimum load, definition of the optimum accelerating and braking patterns, and the like.
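One way such an optimum could be derived from recorded data is sketched below: haul records are grouped by road segment and speed band, and the band with the lowest mean fuel use per tonne-kilometre is reported for each segment. The records, the segment names, the 5 km/h band width, the minimum sample count and the choice of fuel use as the criterion are all assumptions made for illustration; a mine would select the criterion that matches its own objectives.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical haul records: (segment, speed_kmh, fuel_l_per_tonne_km).
# Segment names and all figures are invented for illustration.
HAULS = [
    ("ramp_A", 11, 0.120), ("ramp_A", 13, 0.118),
    ("ramp_A", 16, 0.105), ("ramp_A", 18, 0.107),
    ("ramp_A", 21, 0.125), ("ramp_A", 23, 0.129),
    ("flat_B", 27, 0.060), ("flat_B", 29, 0.058),
    ("flat_B", 32, 0.052), ("flat_B", 34, 0.050),
    ("flat_B", 36, 0.055),
]


def optimum_speed_bands(records, band_width=5, min_samples=2):
    """Return {segment: (band_start, band_end)} with the lowest mean fuel use.

    Speed bands with fewer than min_samples observations are ignored,
    since a single reading is too little evidence to recommend a speed.
    """
    costs_by_band = defaultdict(list)
    for segment, speed, cost in records:
        band_start = int(speed // band_width) * band_width
        costs_by_band[(segment, band_start)].append(cost)

    best = {}
    for (segment, band_start), costs in costs_by_band.items():
        if len(costs) < min_samples:
            continue
        average = mean(costs)
        if segment not in best or average < best[segment][1]:
            best[segment] = ((band_start, band_start + band_width), average)
    return {segment: band for segment, (band, _) in best.items()}
```

For the hypothetical records above, the function recommends the 15 to 20 km/h band on ramp_A and the 30 to 35 km/h band on flat_B. The single reading at 36 km/h on flat_B is excluded by the minimum sample count.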
4 CONCLUSIONS

Modern mines generate huge quantities of data that describe and quantify the condition and performance of mine equipment and of the mines themselves. Availability of this data creates a unique opportunity to improve the performance of both. Data mining, a set of techniques used to discover hidden relations and trends in large databases, is the likely tool that will make it possible to realize this opportunity. The most obvious mining applications of data mining are prognosticating the condition of mining equipment, prognosticating its performance, and training equipment operators.

REFERENCES

Berson, A. and Smith, S.J. 1997. Data warehousing, data mining and OLAP. McGraw-Hill.
Golosinski, T.S., Hu, Hui and Elias, R. 2001. Data mining VIMS for information on truck condition. APCOM 2001, Beijing, China.
International Business Machines Corp. 1999. Using the Intelligent Miner for data. Company publication.
Weiss, S.M. and Indurkhya, N. 1998. Predictive data mining. Morgan Kaufmann Publishers, Inc.
Westphal, C. and Blaxton, T. 1998. Data mining solutions. John Wiley & Sons, Inc.