
45th Annual Simulation Symposium

Privacy-Preserving License Plate Image Processing


Ikechukwu Azogu and Hong Liu
Department of Electrical and Computer Engineering University of Massachusetts Dartmouth 285 Old Westport Road, North Dartmouth, MA 02747, USA {U_IAzogu, HLiu@UMassD.edu}
Abstract: Advances in license plate detection and recognition software severely threaten privacy. Originally intended for video surveillance such as law enforcement at automatic toll booths, license plate recognition software has become so powerful that it can identify license plates from low-resolution and blurred images that are illegible to the human eye. However, the technology can be misused to track individuals regardless of suspicion, violating their privacy through ever-present webcams, for instance in Traveler Information Systems. This paper introduces a novel engineering solution for privacy-preserving license plate recognition. A trivial way to protect the privacy of individuals in video surveillance data is to black out each license plate; this effectively thwarts license plate recognition but has no practical use because all information is obscured. The new system presented in this paper preserves privacy in images containing license plates as well as car features, while allowing identification of particular license plates under legitimate authority similar to a specific search warrant. This dual purpose is fulfilled by an orchestration of information technologies. Pseudo-anonymity, a disguised identity held by many individuals, preserves privacy by ensuring the failure of license plate recognition. Steganography, i.e., security through obscurity or information hiding, reveals the license plate of a suspect through a protocol of synchronized pseudo-random number generation. The synchronization protocol is superior to both symmetric encryption, which is problematic in key distribution, and asymmetric schemes, which involve expensive computation. Experiments are conducted on a real-world Traveler Information System with favorable results.

Internet for the travelers to view. However, there are concerns that making this information available to anyone can lead to violations of the travelers' privacy. A privacy violation can be defined as the unintended disclosure of confidential information about an individual. Privacy has two major attributes: anonymity and identity management. Anonymity concerns the degree to which an individual can be identified within a group of individuals [1] and [2]. The more similar the individuals in the group are, the more anonymous they are, and thus the more difficult it is to identify any one of them. Pseudo-anonymity further de-identifies an individual with a disguised, hence pseudo, identity held by a group of individuals. Identity management concerns how the technologies we use today, such as cell phones, websites, GPS, Bluetooth, surveillance cameras, and RFID tags, collect different kinds of information about us, and how privacy attackers can combine this information to reveal more about us and even steal our identity [3] and [4]. There are incidents where people's partial private information, scattered over the web according to the different roles they play at different sites, was combined to facilitate identity theft [4]. The common approach to protecting privacy in Traveler Information Systems is to disclose only very low-resolution images, such as 320x240 pixels, at a low frame rate, such as one frame every 15 seconds, and to keep no archive of the images. In the case of masstraveler.com, recorded images span only the last 4 minutes and are then discarded. Most often, the license plates of the vehicles in the images are small, skewed, and blurry. This is due to the distance and angle of the cameras, the speed of the vehicles in motion, the quality of the device, and the limitation of transmission bandwidth.
However, these mechanisms have been shown to be insufficient to protect privacy, because reconstruction algorithms exist that can recover license plates from images under the conditions described above. These restoration mechanisms were motivated by the need of Intelligent Traffic Management Systems and law enforcers to recover license plates under limited conditions [5] and [6]. The same mechanisms are also available to the public. In addition, simply using low-resolution images and low frame rates does not protect distinctive vehicles. For example, a vehicle with unique features such as a fancy paint job or a noticeable dent may be identifiable without its license plate. In other words, reducing resolution does not protect outliers.

Keywords: video surveillance, license plate detection, privacy-preserving image recognition, pseudo-anonymity, steganography, protocol of synchronized pseudo random number generation

I. INTRODUCTION

Traveler Information Systems, such as masstraveler.com and mass511.com, make available on the web various aspects of traffic information to facilitate better planning for the travelers in order to avoid congested roads and harsh weather conditions when traveling. One critical type of information that these Traveler Information Systems reveal is real-time snapshots of the roads on the web. This is accomplished by installing cameras on the roadways to video record the conditions of the roads. A video is a sequence of images at a certain frame rate. These videos are made available over the

This research was sponsored in part by UMass President Science and Technology Initiative 2010 fund.

978-1-4673-0040-7/11/$26.00 2011 IEEE


In recent years, there has been an increasing demand for information to be shared in order to facilitate more accurate decision making. Surveillance domains such as traffic monitoring, electronic toll collection, and bioterrorism detection are becoming more aware of the need for privacy-preserving data mining to protect the confidential information of individuals. This protection is necessary because of the costs that lawsuits or damaged credibility impose on the parties that supply the information. The presence of data collection devices such as cameras or transponders also causes unease among the day-to-day users of the transportation infrastructure. In this paper, after a brief literature review on both license plate reconstruction and privacy preservation for image processing, we demonstrate that low-resolution and blurred images from a low-rate video clip are insufficient to protect vehicles' identities. We then propose a privacy-preserving license plate image restoration system, called Disguise, which uses pseudo-anonymity and steganography to protect privacy and, when legitimate, to recover license plate images from traffic surveillance video data. We evaluate Disguise on a real-world Traveler Information System, masstraveler.com, to show that society can share the video data while preserving privacy for individuals, through blacked-out license plates and distorted vehicle appearances in public, and protecting the safety of the whole, through a synchronization protocol that lets a law enforcement agency reveal a suspect's license plate. Finally, we conclude the work and outline directions for future research.

II. BACKGROUND

techniques such as nearest-neighbor (closest) or cubic B-spline interpolation [6]. A typical MRF algorithm involves two processes: 1) training and 2) restoration. During training, the algorithm learns the relation between low- and high-resolution images from examples of high-resolution images and their corresponding low-resolution images, generated by reducing resolution and introducing blur. Each image is partitioned into overlapping blocks, and the relation is registered in a statistical database as image fragments. During restoration, a low-resolution image is transformed into a high-resolution image using these fragments through pattern matching. The fragments in the database are represented as a Markov Random Field (MRF) model, which gives a statistical description of the image fragments [5]. Before a license plate is recognized, a detection algorithm is applied to the image to extract the license plate image data for processing. In [7], Sam presents a license plate detection algorithm that finds the location in an image where the license plate of a vehicle resides. A modest Adaptive Boosting (AdaBoost) machine learning algorithm, which has been used for facial detection, is applied to license plate detection. The major strength of AdaBoost algorithms is their resistance to over-fitting. The algorithm reveals potential license plate locations. A template matching algorithm is then used to narrow down the real position of the license plate to some confidence level. This algorithm is accurate and efficient. Multiple vehicles in one frame in an unconstrained environment pose challenges for license plate detection. Lalonde proposes a fast license plate detection algorithm for such scenarios [8]. The proposed approach is shape-based. An enhanced edge detection algorithm runs across the image. Possible license plates are matched in the form of a generally horizontal rectangle. This yields a few candidate license plates in the image.
The actual license plates are then narrowed down using the Hausdorff distance and template matching to recognize the shape of the plates. The Hausdorff distance is a metric for evaluating the dissimilarity between two sets of points (here, sets of pixels). The symmetry of a license plate is used to further narrow down the candidates in the image.

B. Privacy Preservation for Image Processing

As license plate recognition software gains power and popularity, preserving privacy becomes an urgent public concern. To the best of our knowledge, no work has been conducted in the context of privacy-preserving license plate image processing. Fortunately, many research results on multimedia information privacy in general [9] or on de-identifying face images in particular [10] can be applied in this context. Raising the anonymity level by blending individuals' features within their group is a proven, effective way to preserve privacy. The K-Same algorithm presented in [10] de-identifies face images captured by a video surveillance camera by replacing the pixels of each of K images with the averaged pixel values of the K images, so that the K original individual images become K identical images before the image data are published. K-Same thus reaches K-anonymity. The algorithm is as effective as blackout in thwarting face identification/recognition, de-identifying individuals to protect their privacy. K-Same is superior to blackout

This section reviews the literature related to our research. We first summarize advances in license plate image restoration software that privacy attackers could misuse, and then categorize various privacy protection mechanisms.

A. License Plate Image Restoration

In the domain of Intelligent Traffic Management Systems, travelers' license plate numbers need to be recognized for automatic congestion pricing to ease traffic conditions, access control on toll booth usage, tracing of stolen cars, and identification of dangerous drivers for behavior-adaptive auto insurance. Videos have a low frame rate due to the economics of device quality and the limitation of transmission bandwidth, and the images are low-resolution or even skewed due to constraints on where and at what angles cameras can be placed. The pictures can also be blurry due to the movement of vehicles at high speed. The poor image quality challenges text partition and recognition, significantly reducing the accuracy of license plate recognition. One common method is to enlarge the image with interpolation before passing it to the modules for letter partition and letter recognition. This method, however, blurs edges and loses high-contrast information. A more effective method is to restore the image with super-resolution. Super-resolution algorithms for license plate recognition fall into two categories: a) recovered super-resolution and b) learned super-resolution. It has been demonstrated that learned super-resolution based on a Markov random field (MRF) outperforms the interpolation


because K-Same retains the facial features while blackout loses all the information. Facial identification involves mapping an identity to an individual's face, while facial recognition involves matching an individual's face without necessarily knowing the identity of the individual. To validate the level of anonymity, the entropy is calculated across the protected region of interest to evaluate how much information is present [11]. Lower entropy means a higher anonymity level, i.e., better protection. While individual privacy must be protected, law enforcement agencies also need to identify suspects in order to stop crimes for society's safety. Though privacy covers a different domain from security, some security technologies are applicable to privacy. Security involves protecting a system from violation, while privacy involves controlling disclosure. Security affects privacy when the confidentiality requirement is not satisfied. Cryptographic approaches, which transform a message so that it is illegible to unintended readers, can protect regions of images such as individual faces. Carrillo presents a privacy encoding algorithm that co-exists with compression and coding schemes [12] and allows law enforcers to recover the original faces if the need arises. The regions of interest are encrypted with a key. The algorithm is robust enough to restore the original image even after compression and in the presence of noise or loss of information in the image. The drawback of this approach is that the distorted facial images are unappealing to viewers of the surveillance footage and are open to image restoration and de-blurring techniques. In addition, since the information about the individual is still present in the image through the relation between the key and the encryption algorithm used, there is a potential risk of cryptanalysis attacks.
Steganographic approaches hide the message so that, because it appears not to exist, it does not attract the attention of unintended readers. They are more popular in image/video security because the massive size of images/videos makes hiding easy but hinders the computation associated with cryptography. Socek demonstrates a digital video steganography that disguises a given video with another video [13]. Both cryptography and steganography involve keys to encrypt/hide and decrypt/uncover the message. In general, there are two categories of key handling [14] and [15]. The first is symmetric encryption, in which a sender and a receiver must share a common key that is secret to the two parties only, with a trusted third party distributing the key to both over a secure channel. The second is asymmetric encryption, which uses a pair of keys for a transaction: the sender encrypts the message with the receiver's public key, and the receiver decrypts it with its private key. No key distribution is necessary; however, the private key must never be revealed to anyone. Each of the two approaches has weaknesses. Symmetric encryption involves risky key distribution, while asymmetric encryption requires high computational complexity not suitable for fast image processing. A third approach, proposed by David L. Pepyne et al., trades off the two approaches mentioned above [16]. It deploys a simple protocol for secure point-to-point communication, namely SPRiNG. SPRiNG uses synchronized pseudo-random number generation to generate authenticator variables and fresh encryption keys on a per
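As a rough illustration of the entropy check mentioned in this paragraph, the following sketch computes the Shannon entropy of the pixel values in a protected region. The function name `region_entropy` and the flat-list representation of the region are assumptions made for illustration; they are not taken from [11].

```python
import math
from collections import Counter

def region_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a flat list of pixel values."""
    if not pixels:
        return 0.0
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical regions: a fully blacked-out plate and a two-valued texture.
blacked_out = [0] * 16        # a single value everywhere, so no information remains
textured = [0, 255] * 8       # two equally likely values
```

Under this measure the blacked-out region carries zero bits per pixel, while the textured region carries one bit per pixel, which matches the reading that lower entropy means stronger de-identification.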

frame basis. It significantly reduces the amount of key exchange by synchronizing pseudo-random number generation so that a random number is used as a one-time session key for communication; only occasional key distribution is needed to pass the seed between the two parties. Zhou performs a thorough analysis of SPRiNG, demonstrating its superiority in terms of security level and computational complexity compared with other security protocols [17].

III. DEMONSTRATION OF A PRIVACY ATTACK

This section demonstrates that an attacker is able to restore a license plate from a single image with illegible license plate data obtained from a real-world Traveler Information System such as masstraveler.com, as shown in Figure 1. With license plate recognition software such as MRF-based learned super-resolution [4] in hand, an attacker obtains a license plate by first gathering the training data, as indicated along the left-side path of Figure 1. The training data include two sets of images: high-resolution images and their corresponding low-resolution images. The low-resolution images are derived by down-sampling and blurring. The images in each pair are further subdivided into patches of 3x3 pixels. The training set is represented as a Markov Random Field (MRF). An MRF is simply a set of random variables with a Markov property that are organized in a network. In this network, the low-resolution patches are connected to their corresponding high-resolution patches, while the high-resolution patches are connected to their nearest neighbors in the high-resolution space. To build a model, two functions are computed, known as observation and compatibility. The observation function models the relationship between a low-resolution patch and a high-resolution patch, while the compatibility function models the relationship between the high-resolution patches. These two functions are required for estimating the maximum a posteriori probabilities of the relationship between a low-resolution patch and several high-resolution patches. When a model is tested, each test low-resolution patch is matched to its most likely corresponding high-resolution patches at a ratio of 1:10.
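The patch-matching part of this attack can be sketched as follows. This is a simplified illustration, not the MRF algorithm of [5] or [6]: it uses non-overlapping patches, ranks training pairs by sum of squared differences as a stand-in for the observation function, and omits the compatibility function and the maximum a posteriori inference entirely. The helper names and the default of ten candidates per patch (one possible reading of the 1:10 ratio) are assumptions.

```python
def extract_patches(image, size=3):
    """Split a 2-D list of gray values into non-overlapping size x size patches."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            # Flatten each patch row by row into a tuple.
            patches.append(tuple(image[r + dr][c + dc]
                                 for dr in range(size) for dc in range(size)))
    return patches

def ssd(a, b):
    """Sum of squared differences between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def candidate_matches(low_patch, training_pairs, k=10):
    """Return the high-res patches of the k training pairs closest to low_patch.

    training_pairs is a list of (low_res_patch, high_res_patch) tuples.
    """
    ranked = sorted(training_pairs, key=lambda pair: ssd(low_patch, pair[0]))
    return [high for _, high in ranked[:k]]
```

A full attack would then choose among the candidates for each patch so that neighboring high-resolution patches agree with each other, which is the role of the compatibility function described above.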

Figure 1. Privacy Attack

Following the right-side path of Figure 1, the attacker takes a low-resolution, blurred image from a Traveler Information System. The attacker then runs a license plate detection algorithm to extract the license plate region. There are


several license plate detection algorithms in the field of object recognition [7], [8], and [18]. One approach is to look at the vertical edges of the image because of the text-like characters in the plates [18]. However, in our application the vehicles are far away and blurred by their speed, so the edge densities of the characters fail as a cue. Also, the images contain characters in regions that are not license plates. We propose a new method based on color and dimension. Known license plate pixels in the RGB color space are collected, as shown in Figure 2. An RGB pixel comprises three intensity values: red, green, and blue. Each intensity value lies in the range [0, 255]. A color is determined by the combination of these intensity values.

Figure 3. License Plate Pre-Detection, panels (a) to (d)

The captured license plate region is then divided into patches of 3x3 pixels. Each patch is compared with the training set to find its closest matches. From these, the attacker can transform the low-resolution image into a high-resolution image by using the high-resolution patches, thereby reconstructing the image so as to recognize the license plate. Our objective in this paper is to defeat license plate detection/recognition software by protecting the identity-revealing features of the image. We are looking for a mechanism that is fast, is irreversible, and conserves the visual perception of the image to facilitate easy viewing of images on the website. We propose a steganographic approach to protecting the license plate in the next section.

IV. DISGUISE: RESTORE LICENSE PLATE WITH PRIVACY

Figure 2. Color Space of License Plate Pixels

The minimum and maximum thresholds of the red, green, and blue values of a pixel are derived from known regions that contain license plates. A region is a collection of pixels. These thresholds are used to determine how likely a pixel is to belong to a license plate. The color image is converted to a binary image based on these criteria. An eight-neighborhood connected-component operation is performed on this binary image to segment it into regions. These regions comprise real license plate locations as well as false candidates. Further image filtering is necessary to narrow down the license plate locations. Table 1 lists the criteria on the parameter settings used in our experiment.

Table 1. Parameter Settings
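The color-threshold and connected-component steps described above can be sketched as follows. This is a minimal illustration in plain Python: the function names are hypothetical, and any threshold values passed in are placeholders, since the actual settings of Table 1 are not reproduced here.

```python
from collections import deque

def threshold_mask(image, lo, hi):
    """Binary mask: 1 where each RGB channel lies within its [lo, hi] bounds."""
    return [[int(all(lo[i] <= px[i] <= hi[i] for i in range(3))) for px in row]
            for row in image]

def connected_regions(mask):
    """Group 1-pixels into 8-connected regions; return lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, region = deque([(r, c)]), []
                while queue:  # breadth-first flood fill over the 8 neighbors
                    y, x = queue.popleft()
                    region.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                regions.append(region)
    return regions
```

Each returned region could then be filtered by its dimensions, for instance its bounding-box width and height, to discard candidates that cannot be a license plate.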

This section presents the architecture of Disguise, a privacy-preserving license plate image processing system that preserves privacy in images of license plates as well as car features while allowing identification of particular license plates under legitimate authority similar to a specific search warrant. The design issues of each component are also discussed.

A. Disguise Architecture

Figure 4 depicts the architecture of Disguise. A video camera captures a sequence of low-resolution images at a low rate, called "Plain Image Frames," containing cars with blurred license plates, for example one black sedan with an illegible license plate. From each frame, Disguise detects the license plate using the algorithm proposed in Section III above. Disguise then applies a Steganographic function to convert each Plain Image Frame into its disguised image, called a "Privacy Protected Image," with cars distorted and license plates blacked out. The sequence of Privacy Protected Images is transported via the Internet for public viewing in real time. A law enforcement agency must obtain the synchronized random number before it can apply the Inverse Steganographic function to uncover the original Plain Image Frame, which is then processed by one of the License Plate Recognition software packages presented in Section II.

Figure 3 shows the pre-detection results from a sequence of image frames in which the corresponding license plate regions are detected. It contains a few false positives, because these objects have pixel values similar to those of license plate pixels. The problem can be resolved by using subsequent frames to detect which regions are in motion, since the cars are constantly moving. Over time, we can further eliminate the false positives with this criterion. Due to the page limit, we leave the demonstration of effective license plate detection using correlations within a video sequence to our sequel paper.


approaches are more favorable in image privacy than cryptographic methods, which are hindered by the massive size of the video data [9]. The choice of a synchronization protocol considers the tradeoff between privacy level and computational complexity. The random seed for synchronized random number generation at both the sender, a webcam, and the receiver, a law enforcement agency, effectively represents the super-key used in symmetric encryption. Similar, but less frequent, key distribution applies to such a synchronization protocol, effectively balancing privacy level and computational complexity.
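The idea of a shared seed standing in for frequent key exchange can be sketched as below. This is only an illustration of synchronized pseudo-random generation, not the SPRiNG protocol of [16]: the class name `SyncedKeyStream` and the 4-byte key length are hypothetical, and Python's `random` module is used purely as a stand-in generator that is not suitable for real security.

```python
import random

class SyncedKeyStream:
    """Per-frame keys derived from a shared seed (illustrative only)."""

    def __init__(self, seed):
        # Both parties build their own generator from the same seed.
        self._rng = random.Random(seed)

    def next_frame_key(self, n_bytes=4):
        # Each call advances the generator, yielding a fresh one-time key.
        return bytes(self._rng.randrange(256) for _ in range(n_bytes))

camera = SyncedKeyStream(seed=2011)     # sender, e.g. the webcam
agency = SyncedKeyStream(seed=2011)     # receiver holding the same seed
outsider = SyncedKeyStream(seed=1234)   # party without the right seed
```

As long as the camera and the agency draw keys in the same order, their per-frame keys agree without any further exchange; only the seed has to be distributed, and only occasionally.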

Figure 4. Disguise Architecture

A simple function to distort a car is to change the vehicle's color by a circular shift left, which can be inverted by a circular shift right of the same number of positions, as specified by the random number. More sophisticated distortion involves altering a vehicle's shape within the same category, such as sedans or 8-axle trucks. A simple Steganographic function to hide a license plate could deploy a pair of image frames, called the "Dark Frame" and the "Light Frame," that black out and wash out the license plate with two uniform colors. Each uniform color is specified in the higher-order half of the corresponding pixel value in its frame, while each lower-order half holds half of the original pixel value, disguised by the random number. Note that the differing lower-order values only affect the hue of each uniform color. The Inverse Steganographic function extracts the halves containing the information, uncovers the data with the synchronized random number, and reassembles the original pixel value. This is shown in Figures 5 and 6.
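As an illustration of the reversible color distortion described above, the sketch below rotates a 24-bit RGB value left by a number of bit positions and undoes it with a right rotation. Treating the color as one 24-bit block, shifting by bits, and the names `rotl24` and `rotr24` are all assumptions; the text does not specify the bit width or the unit of the shift.

```python
def rotl24(value, k):
    """Rotate a 24-bit value left by k bit positions."""
    k %= 24
    return ((value << k) | (value >> (24 - k))) & 0xFFFFFF

def rotr24(value, k):
    """Rotate a 24-bit value right by k bit positions (inverse of rotl24)."""
    k %= 24
    return ((value >> k) | (value << (24 - k))) & 0xFFFFFF

color = 0x123456      # (R, G, B) = (0x12, 0x34, 0x56)
shift = 8             # hypothetical value taken from the shared random sequence
disguised = rotl24(color, shift)
restored = rotr24(disguised, shift)
```

With a shift of 8 bits the three channels simply trade places, which shows why a color change alone does little for pseudo-anonymity: the vehicle keeps its shape and only its hue moves.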

V. EXPERIMENT RESULTS

Figure 5. Steganography

Figure 6. Inverse

In the experiment, we run our detection algorithm to locate license plate regions in the images. Figure 7 shows an example license plate region. We preserve privacy in the plate region as follows. The red, green, and blue components of each pixel are concatenated into one block. Two image frames are created: a light frame and a dark frame. An RGB pixel is white when (R, G, B) = (255, 255, 255) and black when (R, G, B) = (0, 0, 0). We apply a steganographic approach in which we hide the high half of the block in the low half of the dark image frame and the low half of the block in the low half of the light image frame. Hiding the information in the image frames is not sufficient by itself, because the operation will be publicly known, and anyone who knows it can easily reverse it. To resolve this issue, a synchronized random seed is used to generate a unique sequence of random numbers, shared by both sender and receiver, which are exclusive-ORed with the pixel data to hide each pixel in the frames. The resulting light and dark frames are distorted, as seen in Figures 8 and 9, respectively. Note that if we had used only one random number for every exclusive-OR, the result would only be a shift of color and the license plate could still be seen. Figure 10 demonstrates an attempted recovery of the image with a different seed. Privacy is preserved because the attacker cannot recover the plates without the right seed. Figures 11 and 12 present the processes of hiding and revealing license plates, respectively. The SetSeed function ensures that the random number generator is reset with a specified seed. The period sign (.) denotes the concatenation operation.
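The hiding and revealing steps can be sketched as follows. This is a simplified version under stated assumptions: it works on single 8-bit values split into two 4-bit halves rather than on the concatenated RGB block, seeding `random.Random` plays the role of SetSeed, and Python's `random` module is only a stand-in that is not suitable for real security. The function names `hide` and `reveal` are hypothetical and do not reproduce Figures 11 and 12.

```python
import random

def hide(values, seed):
    """Split each 8-bit value across a dark and a light frame, XOR-masked."""
    rng = random.Random(seed)          # reset the generator with the shared seed
    dark, light = [], []
    for v in values:
        r = rng.randrange(256)         # a fresh random byte for every pixel
        # Dark frame: high nibble 0x0, low nibble carries the masked high half.
        dark.append((v >> 4) ^ (r >> 4))
        # Light frame: high nibble 0xF, low nibble carries the masked low half.
        light.append(0xF0 | ((v & 0x0F) ^ (r & 0x0F)))
    return dark, light

def reveal(dark, light, seed):
    """Recover the original values from both frames with the same seed."""
    rng = random.Random(seed)
    out = []
    for d, l in zip(dark, light):
        r = rng.randrange(256)
        high = (d & 0x0F) ^ (r >> 4)
        low = (l & 0x0F) ^ (r & 0x0F)
        out.append((high << 4) | low)
    return out
```

Every dark-frame value stays below 16 and every light-frame value stays at 240 or above, so the two frames look nearly black and nearly white, while a receiver who resets the generator with the same seed can reassemble the original values.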

B. Component Design

The key to effectively thwarting illegitimate License Plate Recognition is for the Steganographic function to generate Privacy Protected Images with a sufficient degree of pseudo-anonymity [19] and [20]. Simply changing a vehicle's color does not make it blend with its group and thus offers no pseudo-anonymity. Altering a vehicle's shape to conform to its category, however, would significantly increase its pseudo-anonymity. While blackout or washout is proven to be the most effective de-identification mechanism due to its total loss of information [8], one has to consider its applicability in video surveillance, where legitimate License Plate Recognition requires that information be provided in a secured way. Steganographic

Figure 7. License Plate Region

Figure 8. Light Frame

Figure 9. Dark Frame

Figure 10. Attacker Proof



Figure 11. Hiding



Figure 12. Revealing

VI. CONCLUSION

This work is the first attempt to preserve privacy in license plate image processing. The current misconception that low-resolution, low-rate video data with blurred images do not raise privacy concerns is challenged by advances in License Plate Restoration software. We have demonstrated the ability of License Plate Detection and Recognition software to identify license plates from images illegible to the human eye. A recent study also shows that ad hoc de-identification methods, such as pixelation, which reduces the number of distinct pixel values by substituting a block of pixel values with their average, a tactic often seen on TV, cannot thwart modern image recognition software. We propose a system for privacy-preserving license plate image processing, called Disguise, that achieves the dual purposes of protecting individual privacy with pseudo-anonymity and ensuring society's safety by allowing only law enforcement agencies to re-identify suspects with steganography. Our solution is validated using a real-world Traveler Information System. Future work includes prototyping the solution. Systematic testing and theoretical proofs are under way to measure the effectiveness of the solution. We will also study the component designs and their sensitivity to system integration.

ACKNOWLEDGMENT

The authors thank the anonymous reviewers for their valuable comments to improve the paper, John Collura and Michael Plotnik for insightful discussion, James Schleicher for guiding the tour of the Regional Traveler Information Center (RTIC) at the University of Massachusetts, Daiheng Ni for introducing us to masstraveler.com, and Honggang Wang for the course in multimedia communications.


REFERENCES

[1] Pfitzmann, A. and M. Hansen, "Anonymity, unobservability, pseudonymity, and identity management: a proposal for terminology," dud.inf.tu-dresden.de/literatur/Anon_Terminology_v0.34.pdf. Accessed February 2011.
[2] Tsohou, A., C. Lambrinoudakis, S. Kokolakis, and S. Gritzalis, "The importance of context-dependent privacy requirements and perceptions to the design of privacy-aware systems," UPGRADE: The European Journal for the Informatics Professional, Vol. 11, pp. 32-37, 2010.
[3] Sweeney L., "Privacy-preserving bio-terrorism surveillance," Proceedings of AAAI Spring Symposium, AI Technologies for Homeland Security, 2005.
[4] Sweeney L., "AI technologies to defeat identity theft vulnerabilities," Proceedings of AAAI Spring Symposium, AI Technologies for Homeland Security, 2005.
[5] Xue, "Single image super resolution for license plate," Proceedings of the Sixth International Conference of Natural Computation, 2010.
[6] Wu, W. et al., "Low-resolution license plate images restoration based on MRF," Journal of Application Research of Computers (in Chinese), Vol. 27, No. 3, March 2010.
[7] Sam, "Rapid license plate detection using modest AdaBoost and template matching," International Society of Optical Engineering, 2010.
[8] Lalonde, "Unconstrained license plate detection using the Hausdorff distance," International Society of Optical Engineering, 2010.
[9] Yang, M., M. Trifas, G. Francia, and L. Chen, "Cryptographic and steganographic approaches to ensure multimedia information security and privacy," International Journal of Information Security and Privacy, Vol. 3, Iss. 3, 2009.
[10] Newton, "Preserving privacy by de-identifying face images," IEEE Transactions of Knowledge and Data Engineering, 2005.
[11] Chirayath, V., Using Entropy as a Measure of Privacy Loss in Statistical Databases, Master's Thesis, University of Texas at El Paso, Computer Science Department, 2004.
[12] Carrillo, "Compression independent reversible encryption for privacy in video surveillance," EURASIP Journal on Information Security, 2009.
[13] Socek, D., H. Kalva, S.S. Magliveras, O. Marques, D. Culibrk, and B. Furht, "New approaches to encryption and steganography for digital videos," Journal of Multimedia Systems, 2007.
[14] Kurose, J.F. and K.W. Ross, Computer Networking: A Top-Down Approach (5th Edition), Section 8.2, Addison Wesley, 2009.
[15] Stallings, W., Cryptography and Network Security: Principles and Practice (4th Edition), Prentice Hall, 2006.
[16] Pepyne, D.L., Y-C Ho, and Q. Zheng, "SPRiNG: synchronized random numbers for wireless security," IEEE Wireless Communications and Networking Conference Record (Cat. no.03TH8659), 2003.
[17] Zhou, T., Q. Yu, and H. Liu, "Comparison of wireless security protocols," Proceedings of the International Conference on Computer, Communication and Control Technologies, Orlando, Florida, pp. 94-99, July 31, August 1-2, 2003.
[18] Abolghasemi, V. and A. Ahmadyfard, "An edge-based color-aided method for license plate detection," Image and Vision Computing, Vol. 27, pp. 1134-1142, 2009.
[19] Mahmoud, H. et al., "Novel technique for steganography in fingerprints images: design and implementation," Journal of Image Processing, 2008.
[20] Torres, J.M., "Critical Success Factors and Indicators to Improve Information Systems Security Management Actions," Handbook of Research on Information Security and Assurance, pp. 467-482, 2009.
