
Vehicle identification from inductive loops. Application: travel time estimation for a mixed population of cars and trucks

Cédric Le Bastard, David Guilbert, Antoine Delepoulle, Abderrahmane Boubezoul, Sio-Song Ieng and Yide Wang

Abstract: This paper addresses the use of the existing, widespread Inductive Loop Detector (ILD) network to estimate individual travel times for a mixed population of cars and trucks. The aim is to provide traffic information to both users and traffic managers. Vehicles are identified by comparing the inductive signature features measured at the destination with those measured at the origin, using an identification method. We propose three identification methods: a Bayesian-based learning approach, a fuzzy logic method and the SVM method. These methods are evaluated on a real site. To increase the identification rate, several proposals are tested and discussed.

I. INTRODUCTION

For some decades, the flow of people and goods has been a central preoccupation of French governments because of problems related to congestion, dangerous driving and, more recently, pollution. Today, it is accepted that environment-friendly solutions depend on good traffic management on the existing network. To manage road traffic effectively in real time, data collection and information analysis are essential. Research in the traffic domain is therefore interested in various powerful detection systems: video, Lidar [1], magnetometers [2], etc. Despite the advanced research carried out on these systems, Inductive Loop Technology (ILT) remains the most widely installed sensor in traffic networks in many countries, such as France and the USA. This system is more discreet and robust than cameras and works in all weather conditions. Moreover, it preserves users' privacy. In the past, ILT allowed the estimation of volume, speed, occupancy and presence, and vehicle classification by length. In the nineties, new efficient electronic devices began to provide signals, called electromagnetic signatures, that are sufficiently complex to yield useful vehicle characteristics. More comprehensive information is thus available in real time. This information is used to increase the robustness of vehicle classification [3]. More recently, vehicle signatures have made it possible, firstly, to estimate individual travel times [4], [5], [6] and, secondly, to estimate origin-destination matrices [7]. Some authors have focused on identification methods based
Part of this work was carried out during the internship of A. Delepoulle. A. Delepoulle is a student at INP Grenoble. C. Le Bastard and D. Guilbert are researchers at CETE de l'Ouest, France, cedric.lebastard@developpement-durable.gouv.fr. A. Boubezoul and S.-S. Ieng are researchers at IFSTTAR, France. Y. Wang is a professor in the IREENA Laboratory, Polytech'Nantes, France.

on neural networks [6], [9], conventional distance measures and spatio-temporal distance measures computed from signature data [6]. Unlike individual reidentification, [8] proposes to use the sequences of vehicle lengths derived from loop detectors to match platoons or groups of vehicles. In this paper, we focus on recognizing the electromagnetic signatures of individual vehicles to provide real-time information, using three other methods: the Bayesian-based learning approach, the fuzzy logic method and the SVM method. The first two methods have already been used in [4], [5] to estimate the travel time of cars only. Here, we generalize that study by taking into account the entire population of vehicles. We also suggest using one of the most powerful supervised machine learning algorithms, the SVM algorithm. We then analyse the behavior of the identification methods for three types of population: cars, trucks and a mixed population (cars and trucks). The principle of vehicle recognition by loops is based on comparing the signature features collected on two successive loops of a road section. These recognition methods, called reidentification methods in this paper, combined with electromagnetic loops, allow us to identify and reidentify vehicles and thus to track vehicles anonymously. The travel time can then be estimated.

II. EXPERIMENTAL SITE

This study was carried out on two databases whose measurements were taken in Angers, France (Fig. 1). Before the experiments, the clocks of the two systems used (cameras and electromagnetic loop stations) were synchronized with the clock of the acquisition computer. The information stored for the different measurement areas consisted of the electromagnetic signatures and video. Data processing was then performed to associate the electromagnetic signatures with the video information.
The characteristics of this site are shown in Table I. It is worth noting that the distance between the two areas is short, but the difficulty of identifying individual vehicles lies in the number of vehicles that do not pass through both instrumented areas. Fig. 2 shows the histogram of speed in km/h for both populations, cars and trucks. Trucks have a lower average speed than cars, and their speed distribution is less spread out. These speed distributions will allow us to set the parameters of the time windows which will

TABLE I
SITE CHARACTERISTICS

Site                                    Origin    Destination
Number of lanes                         2         2
Number of cars                          1844      2881
Number of trucks                        210       237
Distance between the 2 areas            560 m
Speed control                           70 km/h
Non-instrumented inputs                 1
Non-instrumented outputs                1
Vehicles passing through the 2 areas
  Number of cars                        1538
  Number of trucks                      206
Observations                            Short distance, uncongested

Fig. 1. Site: urban freeway.

Fig. 2. Distribution with regards to speed.

Fig. 3. Example of car and truck signatures.

be used with the identification methods presented in Section III.

III. IDENTIFICATION METHODS

A. Data

In these experiments, license plate recognition is necessary and is carried out manually. An operator views the video and identifies all the vehicles that passed over the instrumented areas. Several fields are collected, including the license plate, the vehicle type (car, truck, van, ...), the timestamp and the name of the file containing the signature associated with the vehicle. This analysis yields a database of vehicles that passed through one or both instrumented areas. Various characteristics are also measured by the Inductive Loop Detector (ILD), such as the sensor number, the signature and the timestamp. The ILD measures the electromagnetic signature, that is, the variation of the deformation of the magnetic field while a vehicle passes over an electromagnetic loop. For this study, the sampling frequency of the ILD is 500 Hz. Fig. 3 presents the electromagnetic signatures of a car (signature with one maximum) and a truck (signature with several maxima). At first sight, there is a large difference between these two signatures. In the studies presented in [6], [9], [10], in order to compare the two signals of the same vehicle passing over two different loop areas, various

preprocessing steps were applied. In this study, we carried out two preprocessing steps. Firstly, the speed of a vehicle can differ between the two loop areas. For example, the first area may carry a lightly congested flow whereas the second carries an uncongested flow. To bypass this problem, the signal is normalized to 96 points on the abscissa, whatever the speed. Secondly, the signal amplitude depends on the vehicle type. In particular, it depends on the height of the vehicle's metal mass above the sensor, and also on the vehicle's position on the loop. Since the aim is to compare two signals coming from the same vehicle but measured by two sensors far away from each other, the trajectories will differ and, therefore, so will the amplitudes. To bypass this problem, each signal is normalized in amplitude: we chose to normalize the signature maximum to the value 5000. Finally, in order to eliminate the electronic noise of the signal, only values above a certain threshold are recorded. In this paper, each signature is resampled to 96 points, whatever the vehicle type (car or truck). This value was chosen to conserve as much information as possible (especially for trucks) while keeping the computation fast.

We think that extracted features should be easier to use than the raw signature because they give a shorter summary. The choice of reliable features is not straightforward. [11] has already investigated some properties of four features extracted from both raw and normalized signatures. As in [4], [5], 206 features are calculated from the signatures: 96 values of the sampled signature, 47 frequency variables coming from the Fourier transform and 63 so-called global variables obtained by calculation.
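The two preprocessing steps described above (resampling to 96 points and scaling the maximum to 5000) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say how the resampling is done, so linear interpolation is assumed here, and the noise threshold is modelled as simply dropping low samples. The function names and the `noise_threshold` parameter are hypothetical.

```python
def resample(signal, n_points=96):
    """Linearly interpolate `signal` onto `n_points` evenly spaced abscissas.

    Linear interpolation is an assumption; the paper only states that
    each signature is brought to 96 points.
    """
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    step = (len(signal) - 1) / (n_points - 1)
    out = []
    for j in range(n_points):
        pos = j * step
        lo = min(int(pos), len(signal) - 2)  # left neighbour index
        frac = pos - lo                      # position between neighbours
        out.append(signal[lo] * (1 - frac) + signal[lo + 1] * frac)
    return out


def normalize_amplitude(signal, peak=5000.0):
    """Scale the signature so that its maximum equals `peak`."""
    top = max(signal)
    if top <= 0:
        raise ValueError("signature has no positive peak")
    return [s * peak / top for s in signal]


def preprocess(raw, noise_threshold=0.0):
    """Drop samples at or below the (hypothetical) noise threshold,
    then normalize the abscissa (96 points) and the amplitude (peak 5000)."""
    kept = [s for s in raw if s > noise_threshold]
    return normalize_amplitude(resample(kept))
```

After this step, two signatures of the same vehicle recorded at different speeds and lateral positions share the same length and peak value, which is what makes a point-by-point comparison meaningful.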
Among the features of interest are, for example, the so-called shape parameter (SP) [11], the kurtosis, the mean, the median, the standard deviation and features of the Fourier transform (FT) of the normalized signature. Each signature is thus characterized by 206 features, which represents a large amount of data. There are certainly redundancies, and this also leads to a heavy computational burden. Our goal is to select a few features among the 206 collected. As in [5], we use classical PCA to analyse the correlation between features and to select independent features that account for most of the signature information. The goal is to select the most representative and least correlated variables.

Unlike in [4], [5], the database in this paper is composed of cars and other vehicles (trucks, buses, ...). This analysis is used to select the most relevant variables for each type of population. Three types of population have been analysed: cars (Class 1 in the French standard [12]), other vehicles (mainly trucks, hereafter called trucks; Classes 2 to 10 in the French standard [12]) and the car-truck population (Classes 1 to 10). In this analysis, every vehicle feature matters, so the features have been normalized (mean = 0 and standard deviation = 1). In order to keep the most significant information in the original data, we selected 11 features for cars, 9 features for trucks and 12 features for the mixed population (cars and trucks). In the latter case, the database is composed of 90% cars and 10% trucks.

For cars, three of these features are derived from the sampled signal (the 42nd, 56th and 90th signature components), five are derived from the Fourier transform (the real part of the 9th component, the imaginary parts of the 4th, 5th and 6th components and the modulus of the 3rd component) and three are global temporal features: the interval inside which all values are greater than half the maximum value, the standard deviation and the skewness.

For trucks, five features are obtained from the Fourier transform (the real parts of the 1st and 2nd components, the imaginary parts of the 2nd and 3rd components and the modulus of the 3rd component). In addition, four global features are selected (the secondary maximum, the root mean square, the left root mean square and the skewness).

For the mixed population, twelve features have been retained.
Three features come from the sampled signal (the 42nd, 43rd and 44th components of the signature) and eight features are derived from the Fourier transform (the real parts of the 2nd, 3rd and 7th components, the imaginary parts of the 2nd, 3rd and 4th components, the modulus of the 2nd component and the quadratic sum of the first eight moduli of the Fourier transform components). There is also a time feature, which is the sum of the right part of the signature divided by the overall sum.

In this subsection, PCA has allowed us to reduce the number of features. We have established the p most relevant features, which are used afterwards. Each vehicle is thus characterized by a vector x of p features, $x = (x_1, x_2, \ldots, x_p)^T$. In the ILD-based vehicle reidentification problem, a downstream signature feature vector $x_d$ is compared to a set of upstream vehicle feature vectors $x_{u,i}$, with $i = 1, \ldots, n$, to find the feature vector pair that comes from one single vehicle. An additive noise $n(t)$ is assumed to represent the measurement uncertainties. The upstream and downstream signals can then be written as $x_{u,i} = x_i + n_u$ and $x_d = x + n_d$. An error between the two feature vectors that is close to zero indicates that the two signatures likely correspond to the same vehicle. The aim is to find the feature vector pair that minimizes the error. In this paper, we propose three methods to solve the problem: the Bayesian-based learning approach, a fuzzy logic method and the SVM method. Note that in practice, vehicle signatures vary from one detection station to another. Indeed, the vehicle speed, the vehicle

position, the measured maximum amplitude and the signature itself can differ between two ILDs.

B. Bayesian-based learning approach

In this approach, we assume that n is a noise vector whose elements are all uncorrelated and Gaussian with zero mean. The covariance matrix of the noise vector is denoted $\Sigma$. The purpose is to find, among a set of upstream vectors, the best possible match for a downstream vector [5]. This problem can be seen as maximizing the probability of $x_{u,i}$ given $x_d$. According to Bayes' theorem, we have

$$P(x_{u,i} \mid x_d) = \frac{P(x_d \mid x_{u,i})\,P(x_{u,i})}{P(x_d)}.$$

Only the term $P(x_d \mid x_{u,i})\,P(x_{u,i})$ needs to be maximized, because $P(x_d)$ is a normalization constant that is not involved in the decision. To solve this problem, [5] defines the so-called discriminant function as:

$$g_{x_d}(x_{u,i}) = -\frac{1}{2}(x_d - x_{u,i})^T \Sigma^{-1} (x_d - x_{u,i}) + \ln\big(P(x_{u,i})\big) - \frac{1}{2}\big(\ln(|\Sigma|) + \ln(2\pi)\big) \tag{1}$$

The term $\frac{1}{2}\big(\ln(|\Sigma|) + \ln(2\pi)\big)$ is identical for all $x_{u,i}$, and we assume in this paper that the upstream candidate vehicles are equiprobable, so equation (1) becomes:

$$g_{x_d}(x_{u,i}) = -\frac{1}{2}(x_d - x_{u,i})^T \Sigma^{-1} (x_d - x_{u,i}) \tag{2}$$

The goal is thus to find the maximum of the cost function (eq. 2). A learning phase is carried out on a set of signature pairs; it consists in estimating the covariance matrix $\Sigma$.

C. Fuzzy logic approach

Vehicles do not all pass over the loops at the same location, so the magnitude and the signature can differ substantially. To take the uncertainties and inaccuracies of the data into account, we propose to use a fuzzy logic method to re-identify the vehicles. Unlike the Bayesian-based learning approach, this method makes no assumption about the noise. However, we need to introduce fuzzy sets.
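Before moving on to the fuzzy sets, the Bayesian rule of eq. (2) can be illustrated with a minimal sketch. It assumes a diagonal covariance matrix, which follows from the uncorrelated-noise hypothesis, and it assumes that the variances are estimated from the feature differences of training pairs known to belong to the same vehicle. The paper does not detail the estimator, so this is one plausible reading; all function names are hypothetical.

```python
def estimate_variances(training_pairs):
    """Per-feature variance of (downstream - upstream) differences.

    `training_pairs` is a list of (x_d, x_u) feature vectors of the same
    vehicle. Zero-mean noise is assumed, so no mean is subtracted.
    """
    p = len(training_pairs[0][0])
    sums = [0.0] * p
    for x_d, x_u in training_pairs:
        for k in range(p):
            sums[k] += (x_d[k] - x_u[k]) ** 2
    return [s / len(training_pairs) for s in sums]


def discriminant(x_d, x_u, variances):
    """Eq. (2) with a diagonal covariance: -1/2 * sum(d_k^2 / var_k)."""
    return -0.5 * sum((a - b) ** 2 / v for a, b, v in zip(x_d, x_u, variances))


def best_match(x_d, candidates, variances):
    """Index of the upstream candidate that maximizes the discriminant."""
    scores = [discriminant(x_d, x_u, variances) for x_u in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)
```

Because the discriminant is a negated weighted squared distance, maximizing it amounts to picking the candidate closest to the downstream vector, with noisy features weighted less.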
As in [5], we use a trapezoidal function: for the kth component $x_{u,i,k}$ of the ith upstream feature vector, a fuzzy set $E_{i,k}$ is defined by the membership function

$$f_{E,i,k}(x_k) = \begin{cases} 0 & \text{if } x_k \notin [a_k - \delta_k,\; b_k + \delta_k], \\ 1 & \text{if } x_k \in [a_k,\; b_k], \\ 1 + \dfrac{x_k - a_k}{\delta_k} & \text{if } x_k \in\, ]a_k - \delta_k,\; a_k[, \\ 1 + \dfrac{b_k - x_k}{\delta_k} & \text{if } x_k \in\, ]b_k,\; b_k + \delta_k[. \end{cases} \tag{3}$$

with $x_k = x_{d,k} - x_{u,i,k}$, where $x_{d,k}$ and $x_{u,i,k}$ represent the kth elements of the vectors $x_d$ and $x_{u,i}$ respectively, $k \in [1, p]$; $a_k = m_k - p_1 \sigma_k$, $b_k = m_k + p_1 \sigma_k$ and $\delta_k = p_2 \sigma_k$. The parameters $m_k$ and $\sigma_k$, respectively the mean and the standard deviation, are obtained in a supervised learning phase. The parameters $p_1$ and $p_2$ are tuned to obtain the best results. According to [5], the best result is obtained when the trapezoid becomes a triangle, with $a_k = b_k$ ($p_1 = 0$). The

discriminant function is defined by $F_i(x_d) = \sum_{k=1}^{p} f_{E,i,k}(x_k)$. When the origin signature is identical to the destination signature, the discriminant function tends to the number of input variables p (for example, p = 11 for cars). A value close to p therefore reflects a great similarity between the upstream and downstream signatures.

D. Support Vector Machine

The SVM algorithm was introduced by V. Vapnik [13] and is considered one of the most powerful supervised machine learning algorithms. It often achieves superior classification performance compared to other learning algorithms in many domains and is fairly insensitive to high dimensionality. The SVM algorithm is based on the structural risk minimization principle. In this section, we propose to use this method to re-identify the vehicles. The problem can be seen as a binary classification problem: a candidate pair is either a re-identification or not. To formalize the SVM, let us define the vector $v_i$ of size $(p, 1)$ as $v_i = x_d - x_{u,i}$, where i indexes the candidates and p is the number of features. Let $A = \{(v_1, u_1), \ldots, (v_k, u_k)\}$, composed of k pairs with $v_i \in \mathbb{R}^p$, be the training set. In this notation, the label $u_i$ indicates whether or not the vehicle is identified. In the case of linearly separable data, the basic idea of SVM is to find a hyperplane $f(v) = \langle w, v \rangle + b$ which separates the positive ($u_k = +1$) and negative ($u_k = -1$) training examples and maximizes the margin ($2 / \|w\|$) between the samples and the hyperplane. The hyperplane can thus be found by minimizing $\|w\|$ subject to the following constraints: $u_k(\langle w, v_k \rangle + b) \geq 1 \;\; \forall k$. The predicted class of a vector v is:
$f(v) = \operatorname{sign}\left(\sum_k \alpha_k u_k \langle v_k, v \rangle + b\right)$, where $\alpha_k$ and $b$ are the

TABLE II
PERFORMANCE (IR) WITH THE CROSS-VALIDATION METHOD

Methods   Cars      Trucks    Mixed
Bayes     70 %      89.3 %    69.6 %
Fuzzy     68 %      81 %      51.6 %
SVM       64.1 %    87.3 %    62.2 %

algorithms are compared on an ideal database. It contains only signature pairs, so every destination has an origin and vice versa. In this case, the disruptive signatures have been removed thanks to the videos. Disruptive signatures are signatures coming from vehicles that did not pass through both instrumented areas. Secondly, the methods are assessed on a database composed of signature pairs and also disruptive signatures. Two criteria are used to evaluate the performance, the identification rate (IR) and the representation rate (RR), defined as follows:

$$\mathrm{IR} = \frac{\text{Number of correct pairs proposed}}{\text{Total number of pairs proposed}} \tag{4}$$

$$\mathrm{RR} = \frac{\text{Number of correct pairs proposed}}{\text{Total number of pairs identified by video}} \tag{5}$$
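The two criteria of eqs. (4) and (5) can be computed as in the short sketch below. It assumes that a pair is represented as an (origin id, destination id) tuple and that the video ground truth is given in the same form; both the representation and the function names are illustrative choices, not taken from the paper.

```python
def identification_rate(proposed, truth):
    """IR, eq. (4): share of the proposed pairs that are correct."""
    if not proposed:
        return 0.0
    correct = len(set(proposed) & set(truth))
    return correct / len(set(proposed))


def representation_rate(proposed, truth):
    """RR, eq. (5): share of the video-identified pairs correctly proposed."""
    if not truth:
        return 0.0
    correct = len(set(proposed) & set(truth))
    return correct / len(set(truth))
```

On an ideal database, where every destination has an origin and a pair is proposed for each, the two denominators coincide, which is why the IR and RR then measure the same thing.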

To assess the methods, the entire database, composed of z vehicles, is split into a training-validation set (T-V) and a test set (T). In this paper, the first set (T-V) includes z/3 vehicles.

A. Study of the ideal database

To analyse the methods, the IR has been chosen (in this case, the IR and the RR represent the same quantity).

1) Performance with the cross-validation method: Firstly, we use cross-validation to determine which method provides the best results. Cross-validation is a statistical method for evaluating and comparing learning algorithms [15]. The basic form used in this paper is k-fold cross-validation, applied to the training-validation set (T-V), with k = 5. Because the number of trucks is very low, we applied cross-validation to the whole ideal database divided into k = 5 segments. For the SVM classifier with an RBF kernel, the two hyper-parameters C and $\gamma$ have to be set. These parameters were estimated by cross-validation for the three population types (cars, trucks and mixed). Table II shows that, for this ideal database, the Bayesian approach obtains the best results, generally followed by SVM and the fuzzy approach. Moreover, whatever the method, performance differs depending on the population: the IR is highest for trucks, followed by cars and then the mixed population. These differences can be explained, on the one hand, by the data size in each population and, on the other hand, by the features used for each population type.

2) Performance on the test set: In this case, the first set (T-V), which includes z/3 vehicles, is used for training and the test set (T) is used to assess the methods. In this assessment,

solutions of the dual problem. When the data are not linearly separable, Vapnik proposes to map the training data into a higher-dimensional space H by a function $\Phi(\cdot)$, which appears only through dot products $\langle \cdot, \cdot \rangle$, i.e. through functions of the form $K(v_1, v_2) = \langle \Phi(v_1), \Phi(v_2) \rangle$. The predicted class of a vector v is then:

$$f(v) = \operatorname{sign}\left(\sum_k \alpha_k u_k K(v_k, v) + b\right).$$
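The kernel decision rule above can be evaluated directly once the dual solution is known. The sketch below assumes an RBF kernel of the form exp(-gamma * ||x - y||^2) and takes the support vectors, labels, coefficients and bias as given inputs; in practice they would come from a trained model (for instance from LIBSVM), and the values used to call these functions are purely hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def decision_value(v, support_vectors, labels, alphas, b, gamma):
    """Signed score sum_k alpha_k * u_k * K(v_k, v) + b."""
    return sum(
        alpha * u * rbf_kernel(sv, v, gamma)
        for sv, u, alpha in zip(support_vectors, labels, alphas)
    ) + b


def predict(v, support_vectors, labels, alphas, b, gamma):
    """Class +1 (re-identified) or -1, from the sign of the decision value."""
    return 1 if decision_value(v, support_vectors, labels, alphas, b, gamma) >= 0 else -1
```

The decision value, and not only its sign, is worth keeping: among several candidates classified as re-identified, the one with the largest value lies furthest from the separating surface.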

In this study, we used the C-SVM of the LIBSVM library [14]. Tests were carried out with the linear, RBF (Radial Basis Function) and polynomial kernels; the best results were obtained with the RBF kernel, so only this kernel is presented hereafter. Several hyper-parameters therefore have to be estimated (C and $\gamma$) for C-SVM with the RBF kernel. The generic form of the RBF kernel is $K(x, y) = \exp(-\gamma \|x - y\|^2)$. In order to use the SVM method in the re-identification problem, we take into account the distance between the sample and the hyperplane. The re-identified pair is the pair whose distance to the hyperplane is maximal within the class labelled as re-identified.

IV. DISCUSSION

In this section, we test the different identification algorithms presented in the previous section. Firstly, the

TABLE III
PERFORMANCE (IR) WITHOUT AND WITH THE TIME WINDOW

            Without time window            With time window
Methods   Cars      Trucks    Mixed     Cars      Trucks    Mixed
Real      100 %     100 %     100 %     98.6 %    99.6 %    99.5 %
Bayes     38.8 %    71.7 %    41.4 %    93.5 %    98.6 %    82.4 %
Fuzzy     36.7 %    69.6 %    17 %      92.1 %    97.8 %    81.2 %
SVM       27.9 %    79.7 %    33.6 %    88.2 %    95.7 %    83 %

TABLE IV
IR (AND RR IN BRACKETS) FOR TRUCKS

Methods   Ref                OwMuD              Threshold
Bayes     91.3 % (98.6 %)    97.8 % (98.4 %)    98.4 % (89.1 %)
Fuzzy     90.6 % (97.8 %)    97.8 % (97.8 %)    99.2 % (89.9 %)
SVM       98.5 % (95.7 %)    99.2 % (95.7 %)    99.1 % (84.1 %)
una       98.5 % (94.9 %)    99.2 % (94.9 %)    99.1 % (80.4 %)

TABLE V
IR (AND RR IN BRACKETS) FOR CARS

Methods   Ref                OwMuD              Threshold
Bayes     47.6 % (92.3 %)    82.6 % (89.2 %)    69 % (85.9 %)
Fuzzy     46.8 % (90.7 %)    79.9 % (87 %)      66.3 % (85.6 %)
SVM       66 % (87.3 %)      84.1 % (83.2 %)    70.5 % (85.8 %)
una       73.6 % (84.3 %)    90 % (78.3 %)      78.2 % (80.1 %)

Fig. 4. Histogram of candidate number for cars and trucks after time window.

B. Study of the disturbed database

To reduce the error rate, we propose to compare three post-processing steps. This study aims to evaluate the performance of the methods in order to select the most effective approach.

Firstly, the algorithms match one unique origin to each destination. During the signature matching process, two (or more) destination vehicles may be matched to the same origin vehicle. To alleviate this problem, we first search for the origin vehicles that appear several times in the matching results. Then, for each such origin, only the pair whose cost function is maximum is retained; all other pairs with the same origin vehicle are eliminated. This method is called OwMuD in this paper.

Secondly, we propose to use several methods in parallel in order to take into account the pairs proposed by each method. Unlike a majority vote, the unanimous vote does not need an odd number of methods. However, this type of framework is only useful if the methods behave differently. In this paper, the method using the unanimous vote is called una.

Finally, we also apply a threshold to the cost function of each method in order to reject as many false pairs as possible. Each method calculates a cost for each pair; the higher the cost, the higher the probability that the pair is valid. The decision threshold is calculated on the training database. Introducing these thresholds implies a compromise between the IR and the RR.

As mentioned in Section IV-A.2, we use the same time window. Tables IV and V show the IR (and the RR in brackets) for trucks and cars in four cases: reference, OwMuD, threshold and una. The reference case uses only the time window, so its performance is the baseline against which the proposed post-processing steps are evaluated. The threshold is chosen such that 90 % of the RR on the training database is retained. For the reference case, the IR for trucks is especially

we use the same framework as presented in [5], [7]. We use a time window, which consists of searching for an origin signature within a time interval. The position of the time window is calculated from the traffic parameters shown in Table I. We use a rectangular window centered on the mean value of 60 km/h, with a half-width of 20 km/h. The rectangular window parameters are limiting factors because some origin signatures do not necessarily fall inside the time window, as shown in Table III: the line Real in Table III gives the theoretical maximum performance with the time window used. This window reduces, on the one hand, the computational burden and, on the other hand, the number of candidates. Table III shows the IR for the three methods with and without the time window. Because the time window reduces the number of candidates, the IR increases strongly. The average number of candidates is about 8.6 for cars and about 1.7 for trucks, as shown in Fig. 4. For trucks, 47.8 % of the identifications are made by the time window alone. When the time window is used, the performance obtained for the mixed population is lower than that obtained for the two other populations. Thus, in the remainder of the study, we do not use the mixed population but only the car and truck populations, separately. In practice, truck-car classification is already operational on French ILDs, which allows us to process the data separately. Moreover, Table III shows that, for this database, the Bayesian approach performs best, followed by the fuzzy approach and then SVM. Table III also shows that the rates obtained for trucks are higher than those obtained for cars. One explanation is that the number of candidates is higher for cars than for trucks.
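The time window described above can be illustrated with a small sketch. It assumes that the speed window (60 km/h with a half-width of 20 km/h) is converted into a travel-time interval by dividing the 560 m section length from Table I by the two bounding speeds; the paper does not spell out this conversion, so it is an assumption, and the function names are hypothetical.

```python
def time_window(distance_m, mean_speed_kmh=60.0, half_width_kmh=20.0):
    """Return the (shortest, longest) admissible travel time in seconds."""
    v_max = (mean_speed_kmh + half_width_kmh) / 3.6  # km/h -> m/s
    v_min = (mean_speed_kmh - half_width_kmh) / 3.6
    return distance_m / v_max, distance_m / v_min


def candidates_in_window(t_dest, origin_times, distance_m=560.0):
    """Indices of origin signatures whose timestamp (in seconds) is
    compatible with a destination detection at time `t_dest`."""
    t_short, t_long = time_window(distance_m)
    return [
        i for i, t_orig in enumerate(origin_times)
        if t_short <= t_dest - t_orig <= t_long
    ]
```

Under these assumptions, a vehicle detected at the destination is only compared with origin signatures recorded roughly 25 s to 50 s earlier, which is how the candidate list shrinks to a handful of vehicles.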

V. CONCLUSIONS

This paper deals with the use of the existing, widespread ILD network to estimate individual travel times for a mixed population of cars and trucks. Vehicles are identified by comparing the inductive signature features measured at the destination with those measured at the origin, using an identification method. Firstly, we select the most significant features for three population types: cars, trucks and the mixed population. Then, we propose three identification methods: the Bayesian-based learning approach, the fuzzy logic method and the SVM method. To avoid mismatching vehicles, several proposals are made, such as the unanimous vote, a threshold on the cost function (or on the sample-hyperplane distance) and a filter that removes the estimated pairs in which one origin is matched to multiple destinations. To estimate the individual travel time, we have chosen the OwMuD solution with a unanimous vote. In this case, the average travel time estimation error is less than 1 second for cars and trucks.

REFERENCES
[1] H. H. Cheng, B. D. Shaw, J. Palen, B. Lin, B. Chen and Z. Wang, Development and Field Test of a Laser-Based Nonintrusive Detection System for Identification of Vehicles on the Highway, IEEE Trans. on Int. Trans. Syst., vol. 6, no. 2, pp. 147-155, 2005.
[2] K. Kwong, R. Kavaler, R. Rajagopal and P. Varaiya, Arterial travel time estimation based on vehicle re-identification using wireless sensors, Transportation Research Part C, 17(6), pp. 586-606, 2009.
[3] C. Sun, An Investigation in the use of Inductive Loop Signature for Vehicle Classification, California PATH Research Report UCB-ITS-PRR-2000-4.
[4] S.S. Ieng, J. Bertrand, A. Bacelar, J. Nouvier, Travel Time by Using Widespread Inductive Loops Network, Transport Research Arena 2008, Ljubljana.
[5] S.S. Ieng, C. Grellier, J. Rivault, J. Bertrand, M. Pithon, On the Inductive Loop Based Vehicle Signature Features Analysis and the Anonymous Vehicle Re-Identification for Travel Times Estimation, Transportation Research Board 86th Annual Meeting, January 2007, Washington.
[6] B. Abdulhai, S.M. Tabib, Spatio-temporal inductance-pattern recognition for vehicle re-identification, Transportation Research Part C, vol. 11, pp. 223-239, 2003.
[7] D. Guilbert, C. Le Bastard, A. Bacelar, A real time dynamic origin-destination matrices estimation from a widespread Inductive Loop Detector (ILD) network, Transport Research Arena 2010.
[8] B. Coifman, Vehicle re-identification and travel time measurement in real-time on freeways using the existing loop detector infrastructure, Transportation Research Board 77th Annual Meeting, Washington, DC, 1998.
[9] C. Oh and S. G. Ritchie, Anonymous Vehicle Tracking for Real-Time Traffic Surveillance, UCI-ITS-TS-WP-02-15, August 2002.
[10] S. T. Jeng and S. G. Ritchie, A New Inductive Signature Data Compression and Transformation Method for On-Line Vehicle Reidentification, The 85th Annual TRB Meeting, Jan 22-26, 2006.
[11] S. G. Ritchie, S. Park, C. Oh and C. Sun, Field Investigation of Advanced Vehicle Reidentification Techniques and Detector Technologies, Phase 2, California PATH Research Report UCB-ITS-PRR-2005-8, University of California, Berkeley, U.S., March 2005.
[12] Norme NF P 99-300, Données routières : Élaboration, stockage, diffusion. Unités de mesure et de traitement.
[13] V. N. Vapnik, The nature of statistical learning theory, 1995.
[14] C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, 2011, pp. 27:1-27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[15] G. Dreyfus, J.-M. Martinez, M. Samuelides, M.B. Gordon, F. Badran, S. Thiria, Apprentissage statistique, Éditions Eyrolles, September 2008.

Fig. 5. Estimation of individual travel time for cars and trucks.

high for both SVM and the unanimous vote. Moreover, the RR shows that almost all pairs are predicted for trucks. The car RR is slightly lower than the truck RR. Comparing the methods with one another, SVM provides the best performance. For the reference case, the good rates for trucks are partly due to the small number of disruptive signatures in this population. Compared with the reference case, OwMuD yields a strong improvement in IR and a slight decrease in RR for cars. For trucks, the performance is substantially the same in the reference and OwMuD configurations; the only notable change is an increase in IR, most visible for the Bayesian and fuzzy methods (Table IV). The filter removes a large number of pairs containing disruptive signatures. However, some true pairs are also rejected. The threshold solution gives a better IR than the reference case. However, a compromise must be found between the IR and the RR, which is difficult to make (especially when choosing the threshold values). Tables IV and V show that the best results are obtained by the OwMuD solution with a unanimous vote.
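The OwMuD filter and the unanimous vote retained here can be sketched in a few lines. The sketch assumes that each method outputs a list of (origin, destination, cost) triples in which a higher cost means a more plausible pair, as stated in the description of the post-processing; the data layout and function names are illustrative, not taken from the paper.

```python
def owmud(matches):
    """For each origin matched to several destinations, keep only the
    pair with the maximum cost and drop the others."""
    best = {}
    for origin, dest, cost in matches:
        if origin not in best or cost > best[origin][2]:
            best[origin] = (origin, dest, cost)
    return sorted(best.values())


def unanimous_vote(*pair_lists):
    """Keep only the (origin, destination) pairs proposed by every method."""
    common = set(pair_lists[0])
    for pairs in pair_lists[1:]:
        common &= set(pairs)
    return sorted(common)
```

A typical use would run `owmud` on the output of each method, strip the costs, and pass the resulting pair lists to `unanimous_vote`. The vote can only remove pairs, which matches the pattern in Tables IV and V where the una row trades some RR for a higher IR.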

C. Estimation of travel time

This section presents an estimation of the individual travel time of cars and trucks. The solution used is OwMuD with a unanimous vote, whose performance is presented in the previous section. Figure 5 shows the observed and estimated travel times for cars and trucks over an elapsed time of 15 minutes. For trucks, the observed and estimated travel times match perfectly over these 15 minutes; note that the IR obtained is about 99 % on the test database. For cars, the IR is about 90 %, so some estimated travel times do not match the observed travel times and some observed travel times are not estimated. The mean estimation error is less than 1 second over the 54 minutes of experiments for both population types.
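Once pairs have been retained, the travel time of each vehicle is simply the difference between its two timestamps. The sketch below also computes a mean absolute error against the video-based pairs; the paper does not define its error measure precisely, so averaging the absolute differences over the destinations present in both sets is an assumption, as are the function names and the dictionary-based timestamps.

```python
def travel_times(pairs, origin_time, dest_time):
    """Travel time in seconds of each destination vehicle, computed from
    (origin, destination) pairs and per-vehicle timestamps."""
    return {d: dest_time[d] - origin_time[o] for o, d in pairs}


def mean_abs_error(estimated, observed):
    """Mean absolute difference over destinations present in both sets."""
    common = sorted(set(estimated) & set(observed))
    if not common:
        raise ValueError("no destination in common")
    return sum(abs(estimated[d] - observed[d]) for d in common) / len(common)
```

With this definition, a correctly re-identified vehicle contributes zero error, and a mismatch contributes the time gap between the true and the proposed origin detections.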
