Paper Focus:
- PCI DSS compliance
- Use cases that can benefit from tokenization
- How tokenization eases PCI DSS compliance and increases security
- How tokenization relies upon but differs from encryption
- Assessing tokenization deployment options and tradeoffs, and why tokenization may not be cheap to build
- A checklist of questions for prospective tokenization buyers to ask before they commit to a solution
Abstract
This guide is designed for PCI DSS compliance officers, PCI Qualified Security Assessors, and other security professionals who are evaluating tokenization as a technology to reduce risk and minimize the cost of PCI DSS compliance. The guide describes how tokenization works in practice and how retailers and other businesses can deploy it to reduce both their risk of a cardholder data compromise and their cost of PCI compliance. It also addresses how tokens are constructed, how they differ from but at the same time rely upon strong encryption, and what the tokenization deployment options are. We conclude with a list of questions your organization should address before committing to any particular tokenization approach.
Author: Walter Conway, Payment Card Industry Qualified Security Assessor (QSA) at 403 Labs LLC.
Executive Summary
This Buyer's Guide describes how tokenization can help enterprises achieve compliance with the Payment Card Industry Data Security Standard (PCI DSS). At its most basic level, tokenization is the process by which we replace a valuable piece of information with a meaningless number, or token. In this Buyer's Guide the valuable piece of information is the primary account number (PAN), but in a different context it could just as easily be a Social Security Number, driver's license number, birth date, medical information, or any other piece of personally identifiable information (PII).

Tokenization has two attractive benefits. It can reduce an enterprise's PCI scope and, therefore, the cost of PCI compliance. Tokenization also can reduce the enterprise's risk of a costly and brand-damaging payment card data breach. Unfortunately, while there is a lot of information in the marketplace and a broad array of tokenization providers, there is as yet no agreed industry standard for what constitutes effective tokenization.

The purpose of this Buyer's Guide is to describe tokenization, explain how it works specifically in the context of protecting payment card data, and explore implementation options. We conclude with a list of questions that business and IT professionals need to address before selecting the tokenization solution that best meets their enterprise's needs. This paper is aimed at business and IT professionals who are responsible for risk management and PCI compliance, and who are interested in approaches that can simplify the process and reduce the cost of maintaining PCI compliance.
Table 1: PCI Data Security Standard - High-Level Overview

Build and Maintain a Secure Network
1. Install and maintain a firewall configuration to protect cardholder data
2. Do not use vendor-supplied defaults for system passwords and other security parameters

Protect Cardholder Data
3. Protect stored cardholder data
4. Encrypt transmission of cardholder data across open, public networks

Maintain a Vulnerability Management Program
5. Use and regularly update anti-virus software or programs
6. Develop and maintain secure systems and applications

Implement Strong Access Control Measures
7. Restrict access to cardholder data by business need to know
8. Assign a unique ID to each person with computer access
9. Restrict physical access to cardholder data

Regularly Monitor and Test Networks
10. Track and monitor all access to network resources and cardholder data
11. Regularly test security systems and processes

Maintain an Information Security Policy
12. Maintain a policy that addresses information security for all personnel
Tokenization can reduce an enterprise's PCI scope, i.e., the number of people, processes, and systems that are subject to PCI, by replacing cardholder data with meaningless tokens. Reducing scope simplifies the effort and reduces the cost of becoming and remaining PCI compliant. The key principle is that tokens have no intrinsic relation to a payment card PAN, and therefore tokens should be out of PCI scope: they are not PANs. The degree of benefit (i.e., reduced PCI scope and increased security) will depend on the particulars of each enterprise's own card-processing environment and practices, and how much an enterprise's scope can be reduced depends on many factors.

While most of this Buyer's Guide focuses on implementation, readers should also review guidance from the PCI Security Standards Council (PCI SSC, or PCI Council), which released its Information Supplement: PCI DSS Tokenization Guidelines in August 2011. In particular, the guidelines draw a distinction between tokens that are used in back office applications and those that can be used to initiate a transaction, referring to the latter type as high-value tokens. All of the information in this Buyer's Guide is appropriate for any tokens, including these high-value tokens. Every enterprise should consult with its acquirer or card processor to determine whether it has such high-value tokens and, if it does, whether its tokenization implementation is sufficiently secure to take these high-value tokens out of PCI scope.

Beyond reducing PCI scope, tokenization increases security because it can reduce an enterprise's overall risk of a payment card data breach. Computer hackers focus on locating and stealing payment card data (i.e., the PAN, cardholder name, and card expiration date), which they can use themselves or sell to others in a global criminal secondary market. Tokenization concentrates the storage of cleartext PAN data in a secure token vault, thereby reducing the risk of an expensive and reputation-damaging cardholder data breach. While it may seem that putting all your PCI eggs in one basket (i.e., the token vault) creates an attractive target for hackers, the reality is that you now have a single, secure location that you can monitor closely and manage securely.
When evaluating the benefits of tokenization for your enterprise, you need to consider the particulars of your own environment. As they say with new cars: your mileage may vary.
[Figure 1: A simplified payment card environment in which the applications that access and store PAN data are in PCI scope.]
Figure 2 represents that same payment card transaction, but here we have replaced the PANs in the database with surrogate values, or tokens. In this case, the token is generated internally (hence the term internal tokenization), and the PAN data are securely stored in a token vault. This implementation removes the post-purchase applications (and the token database) from PCI scope. These out-of-scope areas are shown in green. The tokenization engine that generates the tokens is in scope for PCI, as is the secure token vault that stores the tokens and associates them with the original PANs. Figure 1 is a simplified description of a real environment, which may have tens or even thousands of servers and end points that access and store PAN data.
Tokenization can effectively remove all these servers and end points from the enterprise's PCI scope. The payoff is a smaller PCI scope and, therefore, reduced PCI compliance expenses, including the cost of an outside assessment.
[Figure 2: Internal tokenization. PANs in the database are replaced with tokens; post-purchase applications move out of PCI scope, while the tokenization engine and token vault remain in scope.]
Outsourcing tokenization to a third-party service provider can reduce the enterprise's PCI scope further, as described in Figure 3, although there will be additional costs (e.g., fees charged by the tokenization provider) and business issues (e.g., loss of control, third-party business risk) to consider. As we will explore later, service provider tokenization has its own particular business considerations (e.g., difficulty changing service providers) and risks (e.g., service provider business stability).
[Figure 3: Service provider tokenization. The third-party token engine and token vault hold the PANs and tokens; those systems sit in the third party's environment, outside the enterprise's PCI scope.]
In this case the enterprise's PCI scope (again, the boxes in blue) is reduced to the POS and the actual payment application that processes and transmits the payment card transaction for authorization. In practice, the third party (in orange) would be either a specialized tokenization vendor or even the enterprise's own payment processor or acquirer. The difference between this and Use Case 1 (internal tokenization) is that the tokenization engine and secure token vault are the responsibility of the service provider, and these systems and processes are in the provider's PCI scope, not the enterprise's.

Figure 4 is a variation on this second use case. Here the enterprise sends the PAN data from the POS directly to the tokenization service provider. The service provider first authorizes the transaction and then returns only the token to the enterprise. As above, the tokenization process and the token vault are in the service provider's PCI scope, but now the service provider is processing the authorization, too. This implementation has the greatest potential impact on reducing PCI scope (in blue), although similar cost and business risk issues would apply here as in Use Case 1.
[Figure 4: Service provider tokenization with authorization. The provider authorizes the transaction and returns only the token to the enterprise; the token engine and vault remain in the provider's PCI scope.]
Token Construction
Tokenization is a data security technology in which valuable information is replaced by strings of random characters, or tokens. The tokens themselves have no value outside of the particular context for which they are designed. All of us have used tokens: familiar examples include subway tokens, casino gambling chips, discount coupons, and store gift certificates or gift cards. In each case the token has value in a particular setting (the subway) or location (one casino or one store), but everywhere else the token has no value. The same principle applies to using tokens to replace payment card data. It is critical that any token used to replace a PAN have no relationship, mathematical or otherwise, to that original PAN. Using the analogy above, there should be no way to convert a casino chip back to legal tender without going to the casino cashier. This point illustrates the central difference between tokenization and encryption. Encryption is a two-way process: what can be encrypted can be decrypted. This is why encrypted PAN data are in PCI scope while properly constructed tokens representing PANs are not. This principle is illustrated in Table 2, below.
Table 2: PANs, Hashes, Encryption, and Tokens

Item | Example | PCI Scope
Primary Account Number (PAN) | 4123 4567 8901 2345 | In PCI scope
Truncated PAN (different from masked) | 4123 45XX XXXX 2345 | Out of scope
Hashed PAN (one-way; renders PAN unreadable) | 2fd4e1c6 7a2d28fc | Out of scope
Encrypted PAN (more characters than the PAN and structurally different) | 9Ojr73h3d^&hh#&HFH&##ED*HD# | In PCI scope
Token (like a PAN in length and character type, but randomly derived) | 9483 7266 3928 9819 | Out of scope
We begin with the actual PAN. That is always in PCI scope, and if you store it (electronically or on paper) you must protect it as described in the PCI DSS. Truncating the PAN is one way to remove the data from scope. With truncation you retain only the first six and/or last four digits of the PAN; PCI does not consider a truncated PAN to be in scope. It is important not to confuse truncation, which means deleting all but the first six/last four digits, with masking, which means storing the full PAN but displaying only the first six/last four digits. Another way to remove PAN data from scope is with a secure one-way hash. Encrypted PANs, however, are still in PCI scope. The reason is that encryption is a reversible function: what can be encrypted can be decrypted. The one exception to this rule is if the enterprise has no ability to decrypt the data. In that one, special case (which we describe below) the encrypted data may be considered out of PCI scope.
Lastly we have a random token. The token may be the same length and share many characteristics of a PAN, but there can be no mathematical relation between the token and the associated PAN. That is, for tokenization to be effective there should be no way mathematically to reverse engineer a token and get back to the original PAN. Each of these approaches has advantages and disadvantages. Truncation clearly removes the data from scope, but truncated data may not be useful for many back office applications like fraud prevention or loyalty programs. Both hashed and encrypted values can contain alphabetic and other non-numeric characters, which make either approach difficult to use with existing applications; most legacy applications require a value that more closely resembles a PAN. All of which brings us to the inherent advantage and attractiveness of a token solution. A token can be made to resemble a payment card number, including sharing some of the digits from the original PAN (e.g., preserving the last four). That means the enterprise does not necessarily have to rewrite back office or post-purchase applications. Tokens can, therefore, potentially be used for repeat purchases, recurring payments, and even chargebacks and refunds, in addition to the post-processing applications described above.
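The distinctions between truncation, masking, and hashing can be made concrete in a few lines. This is an illustrative sketch using the made-up PAN from Table 2, not a real card number:

```python
import hashlib

pan = "4123456789012345"  # made-up example value, not a real card number

# Truncation: only the first six and last four digits are ever stored.
truncated = pan[:6] + pan[-4:]

# Masking: the full PAN is stored, but displayed with the middle hidden.
masked = pan[:6] + "X" * 6 + pan[-4:]

# One-way hash: renders the PAN unreadable, but the hex output is not
# PAN-shaped, so legacy applications usually cannot consume it.
hashed = hashlib.sha256(pan.encode()).hexdigest()

print(truncated)  # 4123452345
print(masked)     # 412345XXXXXX2345
```

Note how only the masked value remains PAN-shaped, which is why masking alone does not remove the stored data from scope.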
Building a token
The single most important requirement in constructing a token is that it not be reversible. That is, knowing only the token, it should be computationally impossible to reverse engineer the token and get back to the original PAN. This requirement guarantees the token has value only in the enterprise's own environment; once outside, the token is a meaningless string of digits. The best tokens are generated randomly. They may be generated using a hardware-based random number generator or a pseudo-random number generator based on the SHA family of hash algorithms. As noted above, you have some flexibility in generating tokens. You may, for example, preserve the last four digits of the original PAN and generate the first 12 digits randomly. Once generated, the token has no meaningful value outside the enterprise's environment. You can create tokens with other methods, even using a sequence number, possibly combined with the last four digits of the PAN. Regardless of the approach, the one constant is that recovery of the original PAN must not be computationally feasible given knowledge only of the token.
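The preserve-the-last-four approach can be sketched as follows. The function name is illustrative, and a production system would also check each result for uniqueness against the token vault before issuing it:

```python
import secrets  # cryptographically secure random source

def tokenize_pan(pan: str) -> str:
    """Replace a 16-digit PAN with a token that preserves the last four
    digits; the first 12 digits are random and carry no information."""
    if len(pan) != 16 or not pan.isdigit():
        raise ValueError("expected a 16-digit PAN")
    random_part = "".join(secrets.choice("0123456789") for _ in range(12))
    return random_part + pan[-4:]
```

Because the first 12 digits come from a secure random source rather than any transformation of the PAN, knowing the token tells an attacker nothing about the original number.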
Format-preserving tokens

As their name implies, format-preserving tokens look like a PAN. They have 16 digits, and they may even be constructed to pass a Luhn check. The Luhn algorithm (also known as a Modulus-10 or Mod-10 check, after the mathematical process behind it) is a formula for validating the authenticity of a PAN by calculating the last, or check, digit based on the previous 15 digits. Format-preserving tokens are all but required in a payment application context, since so many existing systems were built to accommodate a value that looks like a payment card number. The risk with format-preserving tokens is a collision with a valid PAN. One way to avoid such collisions (which must be avoided according to the guidance from Visa referenced later in this Buyer's Guide) is an anti-Luhn process that ensures the last digit of the token is not a valid check digit. Another approach is to restrict the range of the first digit of the token to avoid the ISO/IEC 7812 ranges set aside for banking (ATM), travel, and payment cards; this would mean no token would begin with a 3, 4, 5, or 6. As noted below, token collision is not exclusive to collisions with live PANs.

Token collisions

While 12 digits (assuming you want to preserve the last four) or even the full 16 allow for a large number of tokens, that number is not infinite. If we further restrict the feasible space by preserving additional digits (e.g., the first six or a subset of them), requiring the token to pass (or fail) a Luhn check, and avoiding collisions with valid PANs (there goes any token starting with a 3, 4, 5, or 6), we begin to confront the very real possibility of running out of tokens. Your tokenization system also needs to distinguish between a token and a genuine PAN, both to keep you from spreading tokens to systems expecting a real PAN and, importantly, vice versa. For this reason a tokenization system might intentionally avoid producing tokens that either pass a Luhn check or start with the digits 3, 4, 5, or 6. Most enterprises will want to retire and possibly re-use tokens at some point. As the number of new tokens grows and legacy transaction data are tokenized, the process of generating new tokens and checking them against the list of tokens currently in use may become increasingly involved. This could lead to unacceptable processing delays at the POS.
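The two safeguards discussed here, the Luhn check and the restricted first digit, can be sketched as a simple discriminator. The function names are illustrative, not taken from any particular product:

```python
def luhn_passes(number: str) -> bool:
    """True if the number's last digit is a valid Luhn (Mod-10) check digit."""
    total, parity = 0, len(number) % 2
    for i, ch in enumerate(number):
        d = int(ch)
        if i % 2 == parity:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def could_be_live_pan(candidate: str) -> bool:
    """Treat a value as a possible live PAN only if it starts with 3, 4,
    5, or 6 (the ISO/IEC 7812 card ranges) and passes the Luhn check."""
    return candidate[0] in "3456" and luhn_passes(candidate)
```

A token generator that rejects any candidate for which could_be_live_pan() returns True guarantees its tokens can never collide with, or be mistaken for, a genuine PAN.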
Point-to-point encryption
Point-to-point encryption (sometimes wrongly called end-to-end encryption) is the process by which a third party encrypts cardholder data from one point (e.g., the merchant) to another point (usually the card processor or acquirer). Only the third party can decrypt the data. The effect is to remove the data between those two points from the retailer's or merchant's PCI scope, since they have no ability to access the cleartext cardholder data. Some people describe point-to-point encryption as a substitute for or alternative to tokenization. It is actually something quite different. While it can accomplish the same goal, namely reducing the enterprise's PCI scope, point-to-point encryption neither allows the merchant to access the PAN data nor provides a token as a replacement. Point-to-point encryption can, however, be used in conjunction with and supplement a tokenization system. For example, a third party could provide an encrypting card reader to encrypt the PAN data at the point of sale, authorize the transaction, and then create a token and return it to the merchant's POS system. Point-to-point encryption's role here would be to support a tokenization solution, not replace it.
PCI Security Standards Council FAQ #10359; note that all the PCI requirements for strong encryption still apply to the outside entity managing the encryption.
PCI Security Standards Council FAQ #5384.
Token collision is a potential risk. A large processor will have thousands of clients for whom they are generating tokens, and they may re-use a particular token with different customers. Re-using a token is fine so long as the processor segregates each customer's records securely to preserve token data integrity. Any enterprise considering a processor-based solution will want to investigate this topic with their processor(s). In addition to fees, enterprises exploring processor solutions should look at transaction speed (will there be a delay while the token is generated and substituted for the PAN?), data retention (how long are tokens and PANs retained?), and actual token generation (is it a random process?). They should also ask themselves what they will do if or when they change processors: e.g., how will they painlessly migrate their old transaction data and tokens to the new processor, and how will they ensure the new and old tokens are compatible? These are two more topics to investigate well in advance of signing any contract.
Generating tokens
Tokens should be random. The minimum requirement is that if someone knows only the token, it must not be computationally feasible to get back to the PAN. It may be possible to use a hashing or cryptographic function to create a token that meets this test; however, most enterprises will find that only a randomly generated, format-preserving token both meets this test and works in their existing applications.
The token vault must be protected from compromised insiders as well as external hackers by utilizing strong access and authentication controls. This principle is illustrated in Figure 5, which describes how attackers, internal or external, could exploit security vulnerabilities to gain access to the tokenization process or the token vault. In practice, this means the software controlling access to the vault should be resilient against attacks and provide strong authentication and authorization controls. In most cases the enterprise will want to ensure that authentication and authorization integrate with the identity management infrastructure already in place.
The enterprise is responsible for the token vault whether it is hosted internally or at a PCI-compliant service provider. The maxim "you can outsource a process, but you cannot outsource responsibility" still applies. Therefore, whether you outsource tokenization or implement an internal solution, you will ultimately be responsible for its security and its PCI compliance. Figure 5 also illustrates the need to spend some time determining how you will protect the token vault. It is worth repeating that these considerations apply whether you host the vault internally or use a third party. We can expect attackers to focus their efforts on breaching the token vault and its associated systems, since that is where PAN data are centralized. Attackers will attempt malicious application requests or launch social engineering attacks to gain access. Therefore you need to assess how the enterprise will ensure that only trusted applications and trusted users can access cleartext PAN data. One implementation option might be to have the token vault return only encrypted PANs. In this case only an approved application with the proper cryptographic key would be able to access cleartext PAN data. If the vault does return cleartext PANs, then additional throttling controls should be in place to prevent any user (or attacker) from accessing an inordinately large number of PANs, or even every PAN in the vault, at one time. Hackers are very sophisticated (and motivated!), and allowing access to all PANs in the vault without some type of runtime control is risky.

[Figure 5: The token vault and its access path, showing in-scope and out-of-scope components.]
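A runtime throttle of the kind described above might be sketched as a simple sliding-window rate limiter. The class name and limits are illustrative, not drawn from any particular product:

```python
import time
from collections import defaultdict, deque
from typing import Optional

class DetokenizationThrottle:
    """Deny callers who request cleartext PANs faster than an allowed
    rate: a runtime control against bulk extraction from the vault."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = defaultdict(deque)  # caller id -> request timestamps

    def allow(self, caller_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.history[caller_id]
        while q and now - q[0] > self.window:
            q.popleft()  # discard requests that fell outside the window
        if len(q) >= self.max_requests:
            return False  # over the limit: deny the request and alert
        q.append(now)
        return True
```

In a real deployment a denial would also raise an alert, since a caller hitting the ceiling is exactly the anomaly this control exists to surface.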
Due to misinterpretation of Visa dispute processing rules, some acquirers require their merchants to unnecessarily store full Primary Account Numbers (PANs) for exception processing to resolve disputes. To clarify, Visa does not require merchants to store PANs, but does recommend that merchants rely on their acquirer/processor to manage this information on the merchant's behalf. Visa also recommends that acquirers/processors evolve their systems to provide merchants with a substitute transaction identifier to reference transaction details (in lieu of using PANs).5 Before spending a lot of time and money exploring tokenization, make sure that you have a business need. It may be that an alternative approach is preferable, including not storing PAN data in the first place.
Integrating tokens
In addition to generating tokens and securing the token vault, we need to address how to integrate tokens into the enterprise's existing applications and databases. If the tokens cannot be integrated, then there will be no PCI scope reduction. Enterprises need to plan for how the tokenization engine will support existing data center protocols and how it will interface with database systems and related post-purchase applications.
Visa Best Practices for Primary Account Number Storage and Truncation, July 14, 2010
We should note that PCI all but requires this automated approach when it states: "at least annually and prior to the annual assessment, the assessed entity should confirm the accuracy of their PCI DSS scope by identifying all locations and flows of cardholder data and ensuring they are included in the PCI DSS scope."6 The guidance goes on to require the QSA to verify that no cardholder data exist outside of the currently defined cardholder data environment. The only reliable way to locate all your PAN data is with an automated and thorough search. Anyone considering tokenization should use this procedure (or the results of your QSA's assessment) to be certain you have found all the data that you need to tokenize, as well as all the applications that use the data.
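An automated sweep of this kind can be sketched as a regular-expression scan filtered by a Luhn check to reduce false positives. This is only a sketch; a production discovery tool must also handle file formats, encodings, databases, and archives:

```python
import re

def luhn_valid(number: str) -> bool:
    """Standard Luhn (Mod-10) test, used here to weed out false positives."""
    total, parity = 0, len(number) % 2
    for i, ch in enumerate(number):
        d = int(ch)
        if i % 2 == parity:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# 13-16 digits, allowing single spaces or dashes between digit groups.
PAN_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def find_candidate_pans(text: str) -> list:
    """Return digit strings in the text that look like PANs and pass Luhn."""
    hits = []
    for match in PAN_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            hits.append(digits)
    return hits
```

Running such a scan across file shares, databases, and log directories is what turns "we think we know where our PANs are" into a defensible scoping statement.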
The requirement that you always be able to distinguish between a token and a real PAN makes this question particularly important. Many software application vendors are working on tokenization options or at least token compatibility. You will want to investigate token compatibility with your application vendor(s) before committing to any particular tokenization program.
Does the vendor have the capacity to meet my business needs? Be sure they can provide tokens fast enough that your payment process is not affected. Also investigate token life, token re-use, and the safeguards in place to prevent token collisions.

What if I want to change vendors? Can I get my data back? This situation arises either when you change processors (a fairly common occurrence) or when you simply want to change vendors. You will want to include contract terms that support a smooth transition to your new vendor. There is no substitute for a strong service level agreement (SLA).
Tokenization Checklist
The checklist below may be helpful when reviewing internal and service provider tokenization options and when comparing vendor solutions; record each provider's response against each item.

Table 3: Tokenization Checklist

Preparation
- Have you found all your PAN data? Did you use an automated tool to be sure?
- Have you documented user requirements for tokens, as well as token and PAN access?
- Have you documented all POS and back office systems that will be impacted?
- Have you documented your current PCI scope?

Token Generation
- Does the solution generate tokens using a hardware-based random number generator or similar process? If not, what process is used (e.g., sequence numbers, hashing), and will it meet the PCI non-reversibility test for tokens?
- Does the solution support single-use tokens? Does it support multiple-use tokens?
- What is the token lifetime? How is it determined? Is it configurable?
- Can the solution tokenize digitized legacy data (i.e., your current PAN databases)?
- What file types are compatible? That is, can the solution tokenize legacy data on paper, in MS Word or PDF documents, in your data warehouse, or in databases of electronically scanned documents?
- Can the solution provide tokens without introducing latency at the point of sale?

Token Process
- Where in the payment process does tokenization occur? What options are available that will minimize PCI scope?
- Is the solution compatible with existing POS equipment?
- Is the solution compatible with existing payment applications?
- Is the solution compatible with existing downstream or back office applications and business processes?
- Is the solution compatible with your detokenization business requirements (e.g., back office applications)?
- How much will the solution actually reduce your PCI scope? Does your QSA agree with this conclusion?

Other Issues
- Is security expertise a core strength of the solution provider (or of the enterprise, if the solution is internal)?
- Is the provider financially sound?
- Have you updated your risk assessment (PCI Requirement 12.1.2) to reflect tokenization and the potential risk if the tokenization vendor or application is compromised?
- Have you updated security awareness training (PCI Requirement 12.6) to reflect tokenization?
- Will the vendor acknowledge in writing their responsibility for protecting your PAN data (PCI Requirement 12.8)?
- Have you updated your incident response plan (PCI Requirement 12.9) to reflect tokenization (e.g., tokenization failure; vendor failure; physical or logical breach)?
Paper Sponsorship

The Application Security and Identity Protection (ASIP) group sponsored the production of this Buyer's Guide in order to facilitate broader understanding of the use of tokenization technology for reducing PCI DSS compliance scope. Intel has brought to market Intel Expressway Tokenization Broker, a solution that lowers costs and dramatically simplifies PCI DSS compliance and administration for organizations across all industry types that accept, capture, store, transmit, or process credit card or debit card data. For more information on Intel Expressway Tokenization Broker, visit: www.intel.com/go/identity
More information
Resource Site: www.intel.com/go/identity
Americas: 878-948-2585
E-mail: asipcustomercare@intel.com
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked reserved or undefined. Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web site at www.intel.com. Copyright 2011 Intel Corporation. All rights reserved. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S.
and other countries. *Other names and brands may be claimed as the property of others. Printed in USA Please Recycle 325985-001