
Systematic Control and Management of Data Integrity

Ji-Won Byun
Computer Science, Purdue University, 656 Oval Drive, West Lafayette, Indiana, USA
byunj@cs.purdue.edu

Yonglak Sohn
Computer Engineering, Seokyeong University, 16-1 Jeongneung-dong, Seoul, Korea
syl@skuniv.ac.kr

Elisa Bertino
Computer Science, Purdue University, 656 Oval Drive, West Lafayette, Indiana, USA
bertino@cerias.purdue.edu

ABSTRACT

Integrity has long been considered a fundamental requirement for secure computerized systems, and today's demand for data integrity is stronger than ever as many organizations increase their reliance on data and information systems. A number of recently enacted data privacy regulations also require high integrity for personal data. In this paper, we discuss various issues concerning the systematic control and management of data integrity, with a primary focus on access control. We first examine some previously proposed integrity models and define a set of integrity requirements. We then present an architecture for comprehensive integrity control systems, which is based on data validation and metadata management. We also provide an integrity control policy language that we believe is flexible and intuitive.

Categories and Subject Descriptors

D.4.6 [Operating Systems]: Security and Protection - Access Controls; K.6.5 [Management of Computing and Information Systems]: Security and Protection - Unauthorized access

General Terms
Management, Design, Security

Keywords
Integrity, Access Control, Metadata Management, Validation, Policy Languages

1. INTRODUCTION
It has long been recognized that integrity, together with confidentiality and availability, is a fundamental requirement for secure computerized systems. Although the integrity of a system explicitly requires integrity in every component of the system (e.g., users, programs, and data), in this paper we focus particularly on the integrity of data.¹

¹ Hereafter, we refer to the integrity of data as data integrity or, simply, integrity, interchangeably.

* This material is based upon work supported by the National Science Foundation under Grant No. 0430274 and the sponsors of CERIAS.


Indeed, today's demand for data integrity is stronger than ever. As many organizations increase their reliance on data (and DBMSs) for daily business and critical decision making, data integrity is arguably the most critical issue today. Without data integrity, the usefulness of data is diminished, as any information extracted from the data cannot be trusted with sufficient confidence. The prevalent collection and use of personal information also calls for adequate integrity solutions. As we literally live in a society where individuals are evaluated heavily based on their records, any failure to ensure the integrity of personal information often leads to serious privacy incidents [13]. This concern has also been recognized and addressed by a number of legislations [11, 22, 23]. It is also important to observe that data integrity can be undermined not only by errors introduced by users and applications, but also by malicious subjects who may inject inaccurate data into a database with the goal of deceiving other subjects.

Despite the significance of the problem, however, the theoretical and technical solutions available today for integrity are still very limited. A key difficulty comes from the fact that, unlike confidentiality and availability, the concept of integrity is hard to capture in a precise definition. In fact, integrity often means different things to different people [30]. The most widely accepted definition of integrity is perhaps the prevention of unauthorized and improper data modification [3, 31]. This definition also seems to coincide with the primary goal of Clark and Wilson's approach: preventing fraud and error in the commercial environment [8]. Another well-known interpretation of integrity concerns the quality or trustworthiness of data [6, 28]; according to our understanding, this is the definition on which Biba's integrity model is based [5]. Inspection of the mechanisms provided by database management systems (DBMSs) suggests yet another view of integrity. Many commercial DBMSs today enable system administrators to express a variety of conditions, often referred to as integrity constraints, that data must satisfy [25]. Such constraints are used mainly for data consistency and correctness.

This multi-faceted concept of integrity makes it challenging to address integrity adequately, as different definitions require different approaches. For instance, Clark and Wilson addressed the issue of improper data modification by enforcing well-formed transactions and separation of duty [8], whereas Biba's integrity model prevents possible data corruption by limiting information flow among data objects [5]. Meanwhile, many current DBMSs ensure data consistency by enforcing various constraints, such as key, referential, domain, and entity constraints [25].
In order to provide a comprehensive approach to the problem of integrity, we thus need a multi-faceted solution. Such a solution must mirror the generally accepted security approach according to which we need to provide tools and mechanisms for preventing and controlling security breaches; for monitoring and validating systems to detect possible security incidents; and for recovering from security incidents, as no security mechanism, or combination of mechanisms, can offer complete protection. We believe that this approach needs to be specialized to integrity. Also, viable solutions to integrity must take into account the fact that integrity requirements may vary depending on the organization and on a large number of other factors. Therefore, we do not need integrity systems with built-in policies; we need flexible systems supporting the specification and enforcement of application-dependent integrity policies. A comprehensive solution to integrity must thus support:

- The specification and enforcement of data acceptance policies, stating which data can be entered in the database, by which subjects (users or applications), and under which circumstances. Acceptance policies represent an important form of prevention of integrity violations and attacks. Current access control mechanisms provide some support for enforcing such policies; however, they need to be provided with an extensive set of metadata concerning both subjects and data objects. Relevant examples of such metadata include: under which role and for which purpose a subject has inserted some data; the origin of the entered data; and which application program has manipulated the data.

- The specification and enforcement of validation policies, stating how often data have to be controlled once they have been entered in the database. Although acceptance policies may do a good job of preventing the introduction of low-integrity data, one still has to deal with the possibility that integrity is degraded or compromised later on. Validation policies can be considered a form of auditing, according to which data are periodically controlled with respect to integrity. A validation policy language should allow one to specify which data have to be validated; how often the data have to be validated, and by whom (possibly also by multiple subjects); and which application-dependent events may trigger data validation. Note that, in general, validating all data may be very expensive and may not be required by the application.

- The development of mechanisms for recovering from integrity violations and attacks. Such a mechanism should enable the system to react, possibly in real time, to integrity violations. For instance, it may stop the user or application program introducing the erroneous data, assess and repair the damage, and, perhaps most importantly, prevent the spread of errors.

Our main goals in this work are to 1) better understand the concept of integrity, 2) identify meaningful integrity requirements, and 3) design a comprehensive framework in which various integrity definitions and requirements can be specified and enforced. We note that although this paper focuses primarily on access control, integrity cannot be assured by access control alone. Many other mechanisms, such as a transaction manager and a user authentication system, are also required. Moreover, a solution for integrity management must be supplemented with a data validation process, as data integrity depends on various external factors such as time or changes to external data. The management of integrity thus requires continuous control and monitoring of data throughout their whole life cycle, from the moment they are introduced into the system to the moment they are deleted from it. As such, a design for integrity management systems requires identifying and combining the necessary components so that they can together provide a comprehensive solution to integrity control and management.

Some key contributions of our work are summarized as follows:

- We identify and examine various requirements for data integrity.
- We present a comprehensive architecture for integrity management systems, which is aimed at supporting both integrity-related access control and data validation.
- We introduce the notion of a metadata template, with which various types of metadata suitable for integrity requirements can be specified.
- We provide a flexible integrity control policy specification language that is able to support various integrity requirements.

The remainder of this paper is organized as follows. In Section 2, we examine various approaches to integrity and draw some meaningful requirements for data integrity. We provide an overview of our integrity management system architecture in Section 3 and present our integrity control policy specification language in Section 4. In Section 5, we provide a usage scenario illustrating how integrity requirements are expressed in our policy specification language. We then survey related work in Section 6 and conclude in Section 7 with some suggestions for future work.

2. INTEGRITY REQUIREMENTS

The first and most essential task in designing an integrity management system is to precisely identify what requirements integrity entails. This seemingly simple task is indeed challenging, as there is no steadfast consensus on what is meant by integrity. Although integrity is generally understood as the prevention of unauthorized and improper modification of data [3], this definition is open to many interpretations, as the term "improper" could mean many things [30]. In order to address this problem, we have examined various integrity models proposed in the literature. The requirements for an integrity control system that we believe are essential are summarized as follows:

1. Control of information-flow
2. Data verification
3. Prevention of fraud and error
4. Autonomous data validation

Control of information-flow prevents higher integrity data from being contaminated (or influenced) by lower integrity data, and data verification ensures that only verified data are provided to certain transactions. Prevention of fraud and error is necessary to ensure that only legitimate data are introduced into information systems, and autonomous data validation tries to maintain and/or enhance the integrity of (or confidence in) data, independently of data access. Although the selection of these requirements is subjective to some degree, we are convinced that they accommodate the most meaningful integrity requirements for current commercial information systems. We also note that enforcing any one of these requirements alone is not sufficient to preserve data integrity; to preserve integrity to a satisfactory extent, a systematic method for orchestrating all of these requirements is absolutely needed. In the following sections, we elaborate on these requirements individually by examining related integrity models (where they exist). We also point out some shortcomings of such existing models where appropriate.


2.1 Control of Information-Flow


The notion of information-flow is based on two intrinsic operations: retrieval and modification. An information-flow represents an information transfer path, which is a sequence of objects o_1, ..., o_(n+1) and a corresponding sequence of subjects s_1, ..., s_n such that s_i retrieves o_i and modifies o_(i+1), and then s_(i+1) retrieves o_(i+1) and modifies o_(i+2), and so on for all i, 1 ≤ i ≤ n [6]. Integrity models concerned with the notion of information-flow include the Strict Integrity model and the Low Water-Mark model [5]. These models abstract the system as a set of subjects S, a set of objects O, and a set of integrity levels I. The set of integrity levels is hierarchically ordered and is interpreted as degrees of trustworthiness of (or confidence in) subjects and objects. The relational operators < and ≤ represent the dominance relationships between integrity levels, which are returned by the function iL : S ∪ O → I. For instance, iL(e_1) ≤ iL(e_2) means that the integrity level of e_2 is higher than or equal to that of e_1. Among subjects and objects of different integrity levels, the models ensure that information never flows from low to high integrity levels along the information transfer path.

The strict integrity model [5] is the simplest information-flow integrity model and has the following rules:

- No-read-down rule: s ∈ S is allowed to read o ∈ O if and only if iL(s) ≤ iL(o).
- No-write-up rule: s ∈ S is allowed to write to o ∈ O if and only if iL(o) ≤ iL(s).

According to the no-read-down rule, subjects retrieve only data classified at equal or higher integrity levels, and hence information is never allowed to flow from an object to a higher integrity subject. The no-write-up rule, conversely, allows subjects to modify objects if and only if the objects are classified at integrity levels equal to or lower than the subjects' own. As the information transfer path is composed of a sequence of retrieve-first and modify-next pairs, information never flows from low to high integrity levels. The distinct feature of the strict integrity model is that it gives subjects cleared at high integrity levels abundant capability for modification but takes a very strict standpoint on their retrievals; subjects at low integrity levels have the symmetrically opposite capability.

As an extension of the strict integrity model, the low water-mark model [5, 12] enforces the following rules:

- No-write-up rule: s ∈ S can modify o ∈ O if and only if iL(o) ≤ iL(s).
- Integrity-revision rule: whenever s ∈ S reads o ∈ O, iL(s) becomes Minimum(iL(s), iL(o)).

Unlike the strict integrity model, the low water-mark model does not enforce the no-read-down rule; it simply lets subjects retrieve data regardless of their integrity levels. The integrity-revision rule, however, operates in place of the no-read-down rule and prevents any upward information flow: it degrades a subject's integrity level in accordance with its retrievals, and the no-write-up rule then confines the subject's modifications to objects whose integrity levels are lower than or equal to its revised level. The low water-mark model thus removes the strict integrity model's heavy restriction on retrieval while still keeping tight control over data modifications.

One of the main drawbacks of these models is that data modification is not supervised at all as long as the subjects obey the specified rules. That is, once a subject is assigned a particular integrity level, the subject is absolutely trusted at that level. For instance, the fact that a subject is given the highest integrity level means that he is completely trusted and expected to make no erroneous or fraudulent attempts at all times. Therefore, unless it can be guaranteed that every subject can be trusted and that integrity levels are never wrongly assigned to subjects or objects, these integrity models become somewhat meaningless.
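To make the two models concrete, the following sketch (in Python, which the paper itself does not use; the class layout, integer levels, and entity names are our illustrative assumptions) implements the no-read-down and no-write-up checks of the strict integrity model and the level-degrading read of the low water-mark model.

class Entity:
    # A subject or object with an integrity level; higher means more trusted.
    def __init__(self, name, level):
        self.name = name
        self.level = level

def can_read_strict(s, o):
    # Strict integrity, no-read-down: iL(s) <= iL(o)
    return s.level <= o.level

def can_write(s, o):
    # No-write-up (shared by both models): iL(o) <= iL(s)
    return o.level <= s.level

def read_low_watermark(s, o):
    # Low water-mark integrity-revision: the read always succeeds,
    # but iL(s) := Minimum(iL(s), iL(o))
    s.level = min(s.level, o.level)

alice = Entity("alice", 3)    # highly trusted subject
rumor = Entity("rumor", 1)    # low-integrity object
ledger = Entity("ledger", 3)  # high-integrity object

assert not can_read_strict(Entity("bob", 2), rumor)  # strict model denies
assert can_write(alice, ledger)        # allowed before any low read
read_low_watermark(alice, rumor)       # alice is degraded to level 1
assert not can_write(alice, ledger)    # upward flow is now blocked

The final assertion shows the point of the integrity-revision rule: after reading low-integrity data, a subject can no longer contaminate high-integrity objects, which is exactly the upward flow both models forbid.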

2.2 Data Verification

In [8], Clark and Wilson propose an integrity model which is radically different from Biba's integrity models. One of the essential goals of their model is to make sure that only verified data are supplied to well-formed transactions. The model classifies data as Constrained Data Items (CDIs) and Unconstrained Data Items (UDIs), according to the application of integrity controls. It also defines two sets of procedures: Integrity Verification Procedures (IVPs) and Transformation Procedures (TPs). TPs transform the state of data. Provided that TPs are well-formed transactions and receive only valid data, transformed data are guaranteed to preserve their valid states. In order to provide valid data to TPs, IVPs check whether CDIs conform to the integrity constraints before the CDIs are supplied to TPs. A certifier of IVPs associates CDIs with corresponding IVPs. Some TPs may also take UDIs as inputs; in such cases, the TPs either reject the UDIs or transform them into CDIs.

Although the Clark-Wilson model provides a realistic framework for integrity control systems, it also has several drawbacks. It classifies data integrity only into valid and invalid. Such a binary classification may not be sufficient for realistic information systems that must deal with various types of data; e.g., useful information can often be extracted from a group of untrustworthy data. Another shortcoming arises from the simple assumption that data preserve their valid states as long as they are initially verified by IVPs and only accessed by TPs. Although this assumption may be reasonable for certain types of data, it cannot be applied to data whose integrity depends on dynamic factors such as time or real-world facts; the integrity of such data may change regardless of access or initial verification. Moreover, the Clark-Wilson model does not address how to handle CDIs that IVPs fail to verify. That is, it is not clear how non-verifiable CDIs should be handled. As only verified CDIs are provided to TPs, one may think that non-verifiable CDIs should simply be deleted. However, it is too simplistic to assume that any non-verifiable CDI can be deleted, as other data, or perhaps the system itself, may depend on it. Thus, although the notion of IVP is necessary to address data integrity, it does not provide a complete solution.
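The following sketch illustrates, under our own simplified assumptions (Python, a single integrity constraint, invented names), the CDI/IVP/TP interplay described above: a TP is run only on CDIs that an IVP has certified as valid, and a UDI is either rejected or upgraded into a CDI.

cdis = {"account_balance": 100}   # constrained data items (CDIs)
udis = {"raw_form_input": "-40"}  # unconstrained data items (UDIs)

def ivp_balance_valid(state):
    # An IVP: certifies that the CDI satisfies its integrity constraint.
    return state["account_balance"] >= 0

def tp_withdraw(state, amount):
    # A TP: a well-formed transaction moving CDIs between valid states.
    if state["account_balance"] - amount >= 0:
        state["account_balance"] -= amount

def tp_ingest(state, udi_value):
    # A TP that accepts a UDI: it must reject it or upgrade it to a CDI.
    amount = int(udi_value)
    if amount > 0:
        state["account_balance"] += amount  # upgraded into the CDI

if ivp_balance_valid(cdis):   # verify before supplying data to a TP
    tp_withdraw(cdis, 30)
tp_ingest(cdis, udis["raw_form_input"])  # this UDI is rejected (negative)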

2.3 Prevention of Fraud and Error

Another essential goal of the Clark-Wilson model is to prevent fraud and error by enforcing separation of duty, a foundational principle in computer security. The model requires that a user (i.e., a system user) be associated with a TP and a set of CDIs so that the TP is allowed to access the CDIs only on behalf of the associated user. In addition, it requires that only the certifier of a TP (i.e., a security administrator) may change the list of users and CDIs associated with the TP. In other words, the model enforces the principle of separation of duty between system users and system administrators. The enforcement of separation of duty has also been extensively investigated in the role-based access control model, and various forms of separation of duty have been proposed [10, 21, 27, 29]. Simon and Zurko [32] categorize them into strong exclusion and weak exclusion. Strong exclusion [21] requires that no user be allowed to perform two or more roles; the roles are thus strongly exclusive. Although this requirement is trivial to enforce, it is too rigid to be used in any realistic information system.


Weak exclusion, unlike strong exclusion, admits a user to different roles. Such a flexible acceptance strategy leads to further variations, as follows. Operational separation of duty [10] allows roles to have common users; however, it does not allow such users to perform a task whenever the union of all the actions permitted to them covers all the actions necessary to complete the task. Dynamic separation of duty [10] allows users to assume more than one role but disallows requests to perform the roles at the same time. Object-based separation of duty [10] allows users to perform more than one role but disallows them from acting on an object on which they have already acted. History-based separation of duty [27] focuses on order-dependent actions that users perform on data objects. We believe that the work done by the access control community, especially on the role-based access control model, provides comprehensive solutions for integrity control as well. Therefore, in order to fully address the issue of data integrity, close collaboration between conventional access control and the integrity control system is required.
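As a small illustration of one of these variants, the sketch below (in Python; the history structure and all names are our assumptions) enforces object-based separation of duty: a user who has already acted on an object may not act on it again, even through a different role.

acted_on = set()  # history of (user, object) pairs

def may_act(user, obj):
    # Object-based separation of duty: deny if the user has already
    # acted on this object.
    return (user, obj) not in acted_on

def record_action(user, obj):
    acted_on.add((user, obj))

# A clerk who prepared a check may not also approve the same check.
assert may_act("bob", "check-17")
record_action("bob", "check-17")       # bob prepares the check
assert not may_act("bob", "check-17")  # bob cannot also approve it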

[Figure 1: Architecture for integrity management system. Subjects exchange access requests and access control results with the access controller, inside which the conventional access controller is coupled with the integrity controller (integrity validator, integrity policy repository, and integrity metadata repository); the integrity policy supplier feeds the repositories.]

2.4 Autonomous Data Integrity Validation


Most existing integrity models focus mainly on the control of data access. Although access control is undeniably crucial for integrity management, we argue that access control alone is not enough to guarantee data integrity. Data objects in a system need to be (often continuously) monitored and validated independently of access control; we call this process autonomous data integrity validation. Notice that this notion is different from data verification, which is triggered only by data access. Another difference is that while data verification only determines whether or not data maintain the required level of integrity (or satisfy predefined specifications), data validation not only verifies the integrity of data but also tries to enhance it where necessary.

For instance, consider an information system that stores the credit rates of some business companies. Such data need to be reevaluated in accordance with any change to the data resources (e.g., real-world facts or data in external information systems) which have been used for the credit evaluation. The challenge is that it is hard to anticipate and be informed of changes to such data resources; hence, synchronous updates of the corresponding internal data objects are difficult to perform. We believe that this issue must be adequately addressed by an integrity management system through data validation. That is, the system must make its best effort to maintain consistency between the internal data and the external world, and any success or failure must be reflected in the usage of the data (perhaps in terms of confidence in the data).

This requirement can be combined with the other requirements to make them more meaningful. For instance, with the notion of information-flow that employs multiple levels of integrity, the data integrity validation process can run and reassign data objects to the most up-to-date integrity levels. Also, together with the notion of verification, data integrity validation can significantly lighten the workload of IVPs and enhance their features whenever appropriate.
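A minimal sketch of such autonomous validation, assuming a Python runtime, a periodic timer, and an invented revalidate check against external sources, might look as follows; the credit-rate data and the confidence adjustments are purely illustrative.

import threading

credit_rates = {"AcmeCorp": {"rate": "AA", "confidence": 3}}

def revalidate(item):
    # Stand-in for a real check against external data resources.
    return True

def validation_pass():
    # Runs independently of any data access.
    for item in credit_rates.values():
        if revalidate(item):
            item["confidence"] = min(item["confidence"] + 1, 5)  # enhance
        else:
            item["confidence"] = 0  # degrade until re-evaluated
    # Schedule the next autonomous pass, e.g., one hour later.
    timer = threading.Timer(3600.0, validation_pass)
    timer.daemon = True
    timer.start()

validation_pass()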

3. SYSTEM ARCHITECTURE

In this section, we present a system architecture designed to support the integrity requirements described in the previous section. We first provide a system overview introducing the necessary components, and then elaborate on the main components.

3.1 System Overview

The system architecture presented in this section is not confined to specific operations or a specific policy; it accommodates not only policies based on the previously discussed models but also policies based on newly devised models. As shown in Figure 1, the conventional access controller is supplemented by the integrity controller. Subjects located outside the access controller send access requests and receive access control results. Although the system is extended with the integrity controller, subjects do not need to adjust their access plans substantially to the extended functionality; they only need to be prepared to receive results that have been extended with confirmations reflecting the notion of integrity.

The integrity controller is composed of the integrity validator, which carries out integrity validation for access requests and for data objects in the database, and of the integrity policy repository and the integrity metadata repository, which maintain and manage various information (about policies, users, and data) for the integrity validator. The information related to integrity policies is supplied by the integrity policy supplier. The integrity validator carries out its functions according to the integrity policies that have been provided in advance. These functions are invoked either by the arrival of access requests from the conventional access controller or by data integrity validation procedures which run autonomously to ensure the integrity of the data in the database. In the case of an access request, the integrity validator sends a confirmation message (i.e., an integrity control result) to the conventional access controller, which passes this message to the subjects outside the access controller as the final access control decision. The following subsections provide detailed descriptions of the core components of the integrity controller depicted in Figure 1.

3.2 Integrity Policy Supplier

As illustrated in Figure 2, the task of the integrity policy supplier is to manage and supply the information required for integrity control, such as integrity policies and the metadata values of controlled targets, to the integrity policy repository and the integrity metadata repository. The policy specification language (PSL) interpreter transforms integrity policies into structured data for easy manipulation and stores the data in the integrity policy repository and the integrity metadata repository. Details about the metadata structure and the integrity policy language are provided in Section 4.

[Figure 2: Integrity Policy Supplier. Integrity policies written in the policy specification language are transformed by the PSL interpreter and stored in the integrity policy repository and the integrity metadata repository.]

3.3 Integrity Validator

[Figure 3: Integrity Validator. The integrity validator comprises the access integrity validator, the data integrity validator, and the integrity validation procedures library, and it interacts with the integrity policy repository and the integrity metadata repository.]

The integrity validator is the core component of the integrity controller. It is composed of the access integrity validator, the data integrity validator, and the integrity validation procedures library. The access integrity validator runs when it receives an access request from the conventional access controller. It then identifies the target of integrity control based on the access request and requests the related integrity metadata values from the integrity metadata repository.


Based on the metadata values received, it then asks the integrity policy repository for the associated integrity policies. With the metadata values and policies, the access integrity validator performs integrity validation for the access request. After the validation, it returns a validation result (i.e., an integrity control result) to the conventional access controller. The result is either allowed or denied, and according to the result, some metadata may be updated.² The conventional access controller receives this result and sends it to the subject that submitted the access request.

² Note that such updates may be rolled back by the transaction manager if necessary (e.g., for an aborted transaction).

Unlike the access integrity validator, the data integrity validator plays roles such as verifying data before they are accessed, confirming modifications to data, and autonomously improving data integrity. Whenever the data integrity validator validates a data item, it retrieves the metadata associated with the data item and its corresponding policies, and it invokes validation procedures that may access necessary data items in the database. Note that we assume those procedures to be completely trusted computing base components; hence, the access requests produced by such validation procedures do not need to be controlled recursively by the integrity validator. The integrity validation procedures library maintains the various validation procedures that are invoked by the data integrity validator.
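The control flow just described can be summarized as follows (a Python sketch; every interface shown is our assumption rather than a concrete API of the system): the conventional check for authorization runs first, and the integrity validator's confirmation then becomes the final access control decision.

def conventional_access_control(request):
    # Assume the subject holds the required privilege.
    return True

def integrity_validate(request, metadata_repo, policy_repo):
    md = metadata_repo.get(request["target"], {})
    policy = policy_repo.get(request["target"])
    if policy is None:
        return "allowed"  # no integrity policy applies to this target
    return "allowed" if policy(request, md) else "denied"

def access_controller(request, metadata_repo, policy_repo):
    if not conventional_access_control(request):
        return "denied"  # unauthorized access is rejected outright
    # Improper (low-integrity) access is rejected by the integrity controller.
    return integrity_validate(request, metadata_repo, policy_repo)

policies = {"cod-17": lambda req, md: md.get("verified", False)}
metadata = {"cod-17": {"verified": True}}
print(access_controller({"target": "cod-17", "op": "Read"},
                        metadata, policies))  # allowed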

3.4 Integrity Policy Repository

[Figure 4: Integrity Policy Repository. The integrity policy manager mediates between the integrity policy supplier, the integrity validator, and the stored integrity policy descriptions.]

The integrity policy repository is composed of the integrity policy manager and the integrity policy descriptions. The integrity policy manager registers, updates, and deletes the integrity policies that are supplied by the integrity policy supplier. The policy descriptions consist of descriptive information such as enforcement rules, constraints, and integrity validation processes. For instance, a policy description may be an enforcement rule such as the no-read-down or the no-write-up rule in Biba's strict integrity model; we provide details about such integrity policies in Section 4. The integrity policy manager also searches for and returns appropriate policy descriptions when it receives a policy reference request from the integrity validator.

3.5 Integrity Metadata Repository

[Figure 5: Integrity Metadata Repository. The integrity metadata manager mediates between the integrity policy supplier, the integrity validator, and the stored integrity metadata properties.]

Basically, integrity metadata values are the information necessary to enforce the specified integrity policies. For instance, in order to implement the notion of verification in the Clark-Wilson integrity model, every data item may have a corresponding integrity metadata value which indicates whether it is a CDI or a UDI. On the other hand, if information flow needs to be controlled as in Biba's integrity model, the integrity metadata values of every data item must include its trust (or confidence) level. The integrity metadata repository manipulates metadata values through the integrity metadata manager, which registers and manages the integrity metadata values of data and processes the update requests issued by the integrity policy supplier or the integrity validator. Another major function of the integrity metadata manager is to find and return appropriate integrity metadata values upon reference requests from the integrity validator, which uses the returned values, together with the integrity policy descriptions, to either make an access control decision or perform the appropriate integrity validation.

4. INTEGRITY CONTROL POLICIES

In this section, we first describe how integrity metadata are specified and managed in our framework. We then present our integrity control policy language, which supports both data validation and integrity-related access control.

4.1 Metadata Specification

Data integrity can be determined by various factors; for instance, one can evaluate the integrity of a particular data item based on the subject who created the data, the source from which the data were obtained, or the values of some other data items that are related to the data.³ Such factors can vary depending on the type of data and/or the system requirements, and precisely specifying such factors for each data item is crucial for integrity control. Thus, we introduce the notion of metadata template, which is the basis of the specification and enforcement of our integrity control policy.

³ Data integrity can also be determined solely based on the content (value) of the data; e.g., the data value must be in some specific range. Such constraints are already supported by current DBMSs, and in this paper we focus on auxiliary information about the data, other than the data content itself.

In our framework, before integrity control policies can be defined for certain data, a metadata template must be defined for each target data type. Metadata templates are essentially pre-defined, specific descriptions (i.e., attributes) of data which are relevant to the integrity of the data. Metadata templates are also defined for subjects (i.e., users), based on their roles, to describe the various attributes of the subjects that are necessary for making integrity-related access control decisions. In addition to defining a set of attributes for a data type or a role, metadata templates also specify how each defined attribute should be initialized and managed. More specifically, each attribute in a metadata template is associated with a specific method which determines the value of the attribute; that is, an attribute is registered with a default value, a designated function, or a system variable such as $USER or $TIME. Except for the attributes that are registered with default values, attribute values must be updated only through the registered procedure or system variable. We note that such controlled management of the metadata attributes is necessary to guarantee the integrity of the metadata values themselves.

Definition 1. (Metadata template) Let OT be the set of data types and R be the set of roles existing in the system. A metadata template for a particular data type ot_i ∈ OT or a particular role r_j ∈ R is specified as follows:

MD-TEMPLATE template-ID FOR target {
    attr_1: attribute-description_1;
    ...
    attr_n: attribute-description_n;
}

where target is either ot_i or r_j and represents the entity that is associated with the specified metadata template; attr_i, i = 1, ..., n, is the identifier of the i-th attribute; and attribute-description_i, i = 1, ..., n, is the registered method for the i-th attribute, which may be a specific value, a function, or a system variable. □

When a new data item is introduced to the system, an instance of the metadata template (i.e., a metadata object) is created for the data item, according to the metadata template specified for the type of the data item. This metadata object is then associated with the data item throughout its life cycle; that is, whenever an access to the data item is requested, the metadata object is retrieved and possibly updated according to the related integrity control policies. Similarly, when a subject activates a role, a metadata object is instantiated from the metadata template specified for the role. This metadata object is associated with the subject and used by the integrity controller until the subject deactivates the role.
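To illustrate Definition 1, the following sketch (in Python; the resolver table, template encoding, and attribute names are our illustrative assumptions, not part of the framework's implementation) instantiates a metadata object from a template whose attributes are registered with a default value, a designated function, or a system variable.

import time

# Hypothetical resolvers for the system variables mentioned above.
SYSTEM_VARS = {"$USER": lambda: "alice", "$TIME": time.time}

# A template maps each attribute to its registered method: a default
# value, a designated function, or a system-variable name.
cod_template = {
    "confidenceLevel": 0,      # default value
    "createdBy": "$USER",      # system variable
    "createdAt": "$TIME",      # system variable
}

def instantiate(template):
    # Create a metadata object, resolving each attribute through its
    # registered method, as required for controlled metadata management.
    obj = {}
    for attr, method in template.items():
        if callable(method):
            obj[attr] = method()               # designated function
        elif isinstance(method, str) and method in SYSTEM_VARS:
            obj[attr] = SYSTEM_VARS[method]()  # system variable
        else:
            obj[attr] = method                 # default value
    return obj

metadata_object = instantiate(cod_template)  # attached to the new data item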

4.2 Integrity Policy Specification

Like metadata templates, an integrity control policy is specified for a particular data type and enforced on all its instances.⁴ Our integrity control policy consists of two types of policies: Access Control Policy (ACP) and Data Validation Policy (DVP). ACP is essential for integrity control, as modifications to data may have a direct impact on the integrity of the data. ACP is also necessary for addressing the issue of undesirable information flow (e.g., [5]) through a series of retrievals and modifications. We note that, as discussed in Section 3, our integrity control system is supplementary to the conventional access control mechanism. Therefore, the purpose of the ACP discussed here is to prevent improper accesses, not unauthorized accesses. That is, ACP does not deal with whether or not users have proper privileges to access data, but only with whether or not data are properly accessed by authorized users.

⁴ For example, in the relational data model, an integrity control policy is specified for a table (or a set of columns in a table), and the specified policy is enforced for every tuple in the table.

The other key component of our integrity control policy is DVP, which governs the continuous process of monitoring and/or enhancing the integrity of data. Indeed, our notion of DVP can be considered similar to the Integrity Verification Procedures (IVPs) introduced in [8]. However, as we discussed in Section 2, the notion of IVP alone is not sufficient to fully address the issue of data integrity. First, IVP ensures the integrity of data only at the initial stage. That is, once a certified IVP confirms that a particular data item is in a valid state (supposedly when the data is first introduced to the system), the integrity of the data is ensured by allowing accesses only to certified Transformation Procedures (TPs). Although this assumption may be applicable to some data, the integrity of data often depends on dynamic factors such as time or real-world events. Thus, comprehensive integrity control systems require continuous monitoring and checking of the integrity of data. Another key issue overlooked in [8] is that the notion of IVP does not address the case in which the verification of data fails. In other words, it is not clear what actions need to be taken when the verification procedure fails to confirm that a data item is in a valid state. Although deleting such a data item can be a solution, this may critically degrade system usability. Thus, integrity control systems need to support user-specified strategies for recovering from such verification failures.

We now present our integrity control policy language, which we believe is generic and intuitive and sufficiently addresses the issues discussed above.

Definition 2. (Access Control Policy (ACP)) Let OT be the set of data types existing in the system and ot_i ∈ OT be a data type. Let R be the set of roles existing in the system and r_j ∈ R be a role. Let ot_i.attrs be the set of attributes specified in the metadata template of ot_i. Similarly, let r_j.attrs be the set of attributes specified in the metadata template of r_j. Then an ACP which governs access to the instances of ot_i by subjects with role r_j is specified as follows:

AC-POLICY ACP-ID FOR (ot_i, r_j) {
    WHEN AC-Event_1, ..., AC-Event_ℓ;
    IF Condition;
    THEN Decision_1: Action_1, ..., Action_m;
    ELSE Decision_2: Action_1, ..., Action_n;
}

AC-Event_k, k = 1, ..., ℓ, represents an access request ∈ {Read, Insert, Update, Delete}. Condition is a set of boolean-expression primitives which may be conjoined, disjoined, or negated with the boolean operators ∧, ∨, and ¬, respectively. A boolean-expression primitive is of the form (x ⊕ y), where x (or y) is attr_p ∈ ot_i.attrs, attr_q ∈ r_j.attrs, a constant, or a function that returns true or false, and ⊕ ∈ {<, ≤, >, ≥, =, ≠}. Decision_k, k = 1, 2, is an access control decision which is one of {Allow, Deny}. Action_k, k = 1, ..., m (or n), represents an action to be taken as a consequence of the corresponding access control decision. An action is either a procedure invocation or a metadata update. □

Definition 3. (Data Validation Policy (DVP)) Let OT be the set of data types existing in the system and ot_i ∈ OT be a data type. Let ot_i.attrs be the set of attributes specified in the metadata template of ot_i. A DVP for data type ot_i is specified as follows:

DV-POLICY DVP-ID FOR ot_i {
    WHEN Event_1, ..., Event_ℓ;
    IF Validation-procedure;
    THEN Action_1, ..., Action_m;
    ELSE Action_1, ..., Action_n;
}

Event_k, k = 1, ..., ℓ, represents either an access request ∈ {Read, Insert, Update, Delete} or a user-defined event, such as a specific time or a particular situation, that triggers the specified validation policy. Validation-procedure is a designated function which validates the data instances of ot_i; it returns true if the validation succeeds and false otherwise. Action_k, k = 1, ..., m (or n), represents an action to be taken as a consequence of the data validation. An action is either a procedure invocation or a metadata update. □

The semantics of our integrity control policy is as follows. The WHEN clause specifies the particular event(s) that trigger the specified policy. ACPs are triggered only by access requests, while DVPs may be triggered either by access requests or by user-defined events (e.g., specific times or particular changes in the system). When an access request triggers both a DVP and an ACP, the DVP is enforced before the ACP. It is worth noting that although it is possible to validate a data item only upon an access request, this could create an unnecessary and/or undesirable workload, especially for systems with real-time constraints or frequent data accesses. In such cases, a DVP should be defined independently of access requests and enforced only on special events.

The IF clause in an ACP contains a condition that checks various metadata attributes in order to determine the integrity of the data. After the condition is evaluated, either the THEN clause or the ELSE clause is executed. Each THEN clause and ELSE clause contains an access control decision, which may be either Allow or Deny, together with a set of actions that should be taken subsequently. Possible actions include updating metadata attributes or invoking necessary procedures. The IF clause in a DVP contains a data validation procedure which returns the result of the data validation. As in an ACP, the THEN clause and ELSE clause in a DVP each specify a set of actions that should be taken according to the result of the validation procedure.
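To illustrate how the integrity validator could evaluate such policies, the sketch below (in Python; the dictionary encoding of a policy and all names are our assumptions, not the paper's implementation) evaluates a policy shaped like ACP-R3 from the usage scenario of Section 5.

def evaluate_acp(policy, event, data_md, role_md):
    # Return (decision, actions) for an access request, or None if the
    # policy is not triggered by this event.
    if event not in policy["when"]:
        return None
    # The condition is a predicate over the data and role metadata objects.
    if policy["condition"](data_md, role_md):
        return policy["then"]
    return policy["else"]

# Encoding of ACP-R3: an SA may read a CoD item only if its confidence
# level is at least c or it has been verified.
c = 5
acp_r3 = {
    "when": {"Read"},
    "condition": lambda cod, sa: cod["confidenceLevel"] >= c or cod["verified"],
    "then": ("Allow", []),
    "else": ("Deny", []),
}

cod_md = {"confidenceLevel": 3, "verified": True}
sa_md = {"trustLevel": 7}
print(evaluate_acp(acp_r3, "Read", cod_md, sa_md))  # ('Allow', [])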

5. USAGE SCENARIO

In this section, we provide a fictional usage scenario illustrating how metadata templates and integrity control policies are specified and enforced to address various integrity requirements.

5.1 Integrity Requirements

IntegrityEqualsMoney (IEM) is a financial company whose goal is to provide its customers with accurate assessments of the future stock values of the world's leading companies. In order to accomplish this goal, IEM collects relevant data from many sources, analyzes them, and produces its assessments. More specifically, Data Collectors (DC) produce Collected Data (CoD), and Stock Analysts (SA) analyze CoD and produce Analytical Data (AnD). Both CoD and AnD are used by SA to produce the final Assessment Data (AsD), which are referenced by legitimate Clients (CL). Due to the nature of its business, IEM considers the integrity of data a top priority at all times. The integrity requirements of IEM are summarized as follows.

1. (Information-flow) Every DC and SA is assigned a trust level based on his records of performance and analytical accuracy. As the trust levels may be changed dynamically by the management, the trust levels should be retrieved using a designated function, getTrustLevel($USERID), whenever needed.

2. (Information-flow) A DC can create or modify CoD items unless its trust level equals 0. When a DC creates or modifies a CoD item, the trust level of the DC must be reflected in the confidence level of the CoD item.

3. (Data verification) CoD can be decisive factors in the stock value assessment. Thus, if the confidence level of a CoD item is less than a specific level, c, the item must be verified by a predefined verification procedure, verifyCoD(this), before it is referenced by SA.

4. (Information-flow) SA may also create CoD items if it is necessary for their analysis, and such a CoD item's confidence level is determined by the trust level of the SA who creates it. However, in order to create CoD, an SA must have a trust level higher than a specific level, t.

5. (Information-flow) The confidence level of an AnD item is determined by the trust level of the SA who creates or modifies the AnD item.

6. (Information-flow) Some SAs (whose trust levels are less than or equal to a specific level, ℓ) are in training; they can create AnD items, but should not modify any AnD item that has a confidence level greater than ℓ.

7. (Information-flow) The confidence level of an AsD item is determined by the trust level of the SA who creates or modifies it.

8. (Prevention of fraud and error) Legitimate but malicious or erroneous AsD could result if an AsD item could be produced by a single SA alone. Hence, each AsD item must be confirmed by two or more designated SAs (using a predefined procedure, getConfirmationAsD(this)) before it can be referenced by CL.

9. (Autonomous data validation) Changes to certain external or internal data may influence the integrity of AsD items. As SA cannot be expected to monitor such data at all times, the data are monitored autonomously and any change is signaled as a special event, ChangeOnData. Upon such an event, AsD should be revalidated by a predefined function, revalidateAsD(this). If an AsD item cannot be revalidated in a timely manner, the AsD item should not be read by any CL.

10. (Information-flow) An AsD item can be updated only by an SA whose trust level is greater than or equal to the confidence level of the AsD item. When an AsD item is updated, its confidence level is set to the trust level of the SA who modified it.

11. (Information-flow) CLs sometimes want to access all AsD regardless of the confidence levels, and sometimes they want to access only good-quality AsD. Based on this preference, each CL's preferred level is determined (using a predefined function, getPreferredLevel($USERID)), and a CL should not be able to access any AsD item with a confidence level less than the preferred level.


12. (Prevention of fraud and error) When any access to AsD is denied, the information about the denied access request must be recorded in the system log (using a predefined procedure, writeToLog()).

13. (Miscellaneous) Data can be deleted only by system administrators with a special privilege.

5.2 Metadata Template and Integrity Control Policy Specification

We now describe how the metadata templates and integrity control policies are specified for the roles (DC, SA, and CL) and the required data types (CoD, AnD, and AsD) to meet the integrity requirements listed in the previous section. Note that the policies presented below concern only improper accesses, as unauthorized accesses are already denied by the conventional access control system.

Roles: DC, SA, CL

MD-TEMPLATE template-DC FOR DC {
    trustLevel: getTrustLevel($USERID);
}

MD-TEMPLATE template-SA FOR SA {
    trustLevel: getTrustLevel($USERID);
}

MD-TEMPLATE template-CL FOR CL {
    preferredLevel: getPreferredLevel($USERID);
}

Whenever a subject activates either the DC or the SA role, a trust level is assigned to the subject using the specified function (Requirement 1). Similarly, when the CL role is activated, a preferred level is assigned to the subject using the specified function (Requirement 11).

Data: CoD

MD-TEMPLATE template-CoD FOR CoD {
    confidenceLevel: 0;  // a default value
    verified: false;     // a default value
}

DV-POLICY DVP-R3 FOR CoD {
    WHEN Read;
    IF verifyCoD(this);
    THEN (CoD.verified ← true);
    ELSE (CoD.verified ← false);
}

Whenever a CoD item is about to be read, the CoD item is first verified by the specified function (Requirement 3).

AC-POLICY ACP-R2 FOR (CoD, DC) {
    WHEN Insert, Update;
    IF ¬(DC.trustLevel = 0);
    THEN Allow: (CoD.confidenceLevel ← DC.trustLevel);
    ELSE Deny: ;
}

Only a DC whose trust level is non-zero can create or modify CoD items, and the confidence level of a CoD item is determined by the trust level of the DC who creates or modifies it (Requirement 2).

AC-POLICY ACP-R3 FOR (CoD, SA) {
    WHEN Read;
    IF (CoD.confidenceLevel ≥ c) ∨ (CoD.verified = true);
    THEN Allow: ;
    ELSE Deny: ;
}

A CoD item can be read by an SA only if its confidence level is greater than or equal to c or it has been verified successfully (Requirement 3).

AC-POLICY ACP-R4 FOR (CoD, SA) {
    WHEN Insert;
    IF (SA.trustLevel > t);
    THEN Allow: (CoD.confidenceLevel ← SA.trustLevel);
    ELSE Deny: ;
}

An SA can create a CoD item only if its trust level is greater than t; the confidence level of the CoD item is determined by the trust level of the SA who creates it (Requirement 4).

Data: AnD

MD-TEMPLATE template-AnD FOR AnD {
    confidenceLevel: 0;  // a default value
}

AC-POLICY ACP-R5 FOR (AnD, SA) {
    WHEN Insert;
    IF (SA.trustLevel ≥ 0);
    THEN Allow: (AnD.confidenceLevel ← SA.trustLevel);
    ELSE Deny: ;
}

Any SA can create an AnD item; the confidence level of such an item is determined by the trust level of the SA who creates it (Requirement 5).

AC-POLICY ACP-R5.6 FOR (AnD, SA) {
    WHEN Update;
    IF (SA.trustLevel > ℓ) ∨ (AnD.confidenceLevel ≤ ℓ);
    THEN Allow: (AnD.confidenceLevel ← SA.trustLevel);
    ELSE Deny: ;
}

An SA can modify any AnD item if its trust level is greater than ℓ. However, if its trust level is less than or equal to ℓ, then it can modify only AnD items whose confidence levels are less than or equal to ℓ (Requirements 5 and 6).

Data: AsD

MD-TEMPLATE template-AsD FOR AsD {
    confidenceLevel: 0;  // a default value
    confirmed: false;    // a default value
    validated: true;     // a default value
}

DV-POLICY DVP-R9 FOR AsD {
    WHEN ChangeOnData;
    IF revalidateAsD(this);
    THEN (AsD.validated ← true);
    ELSE (AsD.validated ← false);
}

When the ChangeOnData event occurs, AsD is revalidated by the specified function. Whether or not the revalidation succeeds is reflected on the AsD (Requirement 9).


DV-POLICY DVP-R8 FOR AsD {
    WHEN Read;
    IF getConfirmationAsD(this);
    THEN (AsD.confirmed ← true);
    ELSE (AsD.confirmed ← false);
}

When an AsD item is about to be read, the AsD item gets confirmations from other SAs, using the specified function. Whether or not the AsD is successfully confirmed is reflected on the AsD (Requirement 8).
AC-POLICY ACP-R7.12 FOR (AsD, SA) {
    WHEN Insert;
    IF (SA.trustLevel ≥ 0);
    THEN Allow: (AsD.confidenceLevel ← SA.trustLevel);
    ELSE Deny: writeToLog();
}

When an SA creates an AsD item, the confidence level of the AsD item is determined by the trust level of the SA (Requirement 7). If the access is denied, the information about the denied access request is recorded in the system log (Requirement 12).

AC-POLICY ACP-R10.12 FOR (AsD, SA) {
    WHEN Update;
    IF (SA.trustLevel ≥ AsD.confidenceLevel);
    THEN Allow: (AsD.confidenceLevel ← SA.trustLevel);
    ELSE Deny: writeToLog();
}

An SA can update an AsD item only if its trust level is greater than or equal to the confidence level of the AsD item (Requirement 10). If the access is denied, the information about the denied access request is recorded in the system log (Requirement 12).

AC-POLICY ACP-R8.9.11.12 FOR (AsD, CL) {
    WHEN Read;
    IF (AsD.confirmed = true) ∧ (AsD.validated = true) ∧ (AsD.confidenceLevel ≥ CL.preferredLevel);
    THEN Allow: ;
    ELSE Deny: writeToLog();
}

A CL can read an AsD item only if the AsD item is confirmed (Requirement 8) and validated (Requirement 9), and its confidence level is greater than or equal to the CL's preferred level (Requirement 11). If the access is denied, the information about the denied access request is recorded in the system log (Requirement 12).

6. RELATED WORK

Integrity Models. To the best of our knowledge, Biba [5] was the first to address the issue of integrity in information systems. His approach is based on a hierarchical lattice of integrity levels, and integrity is defined as a relative measure that is evaluated at the subsystem level. A subsystem is some set of subjects and objects, and an information system is defined to be composed of any number of subsystems. Biba regards as an integrity threat a subsystem attempting to improperly change the behavior of another by supplying false data. A drawback of [5] is that it is not clear how to assign appropriate integrity levels and that there are no criteria for determining them.

In [8], Clark and Wilson make a clear distinction between military security and commercial security. They then argue that security policies related to integrity, rather than disclosure, are of the highest priority in commercial information systems and that separate mechanisms are required for the enforcement of these policies. A model for achieving data integrity in commercial environments is also proposed. The Clark and Wilson model has two key notions: well-formed transactions and separation of duty. A well-formed transaction is structured such that a subject cannot manipulate data arbitrarily, but only in constrained ways that preserve the internal consistency of data. Separation of duty attempts to ensure the external consistency of data objects: the correspondence among the data objects of different subparts of a task. This correspondence is ensured by separating all operations into several subparts and requiring that each subpart be executed by a different subject.

Policy Specification. A lot of work has been devoted to security policies, and a number of policy specification languages have been proposed. [20] proposes to categorize them according to three levels: high-level policies, specification-level policies, and low-level policies. High-level policies can be applied to business goals, service level agreements, trust relationships, or natural language statements; they are not enforceable and must be refined into the lower levels. Specification-level policies relate to specific services or objects, and their interpretation can be automated; they are specified by a human administrator to provide abstractions of low-level policies in a precise format. Low-level policies, sometimes referred to as configurations, are device configurations, security mechanism configurations, directory schema entries, and so on. The policies applicable to our work are those categorized as specification-level policies. From the user perspective, a policy specification language [26, 15, 16] provides a better way to specify policies, as it provides considerable flexibility compared to other approaches; however, it carries the intrinsic burden of compromising the ability to analyze policy specifications. Rule-based policies [17, 19, 33] are specified as sequences of rules of the form if condition then action else action. Logic-based approaches [18, 24, 7, 2] are driven by the need to analyze the policy specification and hence are not easily interpreted by humans.

Policy Management Architectures. Some work on security policy management architectures has also been reported. The approach by Hayton et al. [14] addresses the issue of enforcing access control policies based on roles, where the access rights of a user are grouped into roles to which a user can be assigned using credentials. In [4], Beznosov and Deng suggest a way to implement RBAC models on the CORBA security service. Greenwald et al. [1] define an architecture that compiles various high-level policy specifications into a common low-level policy interoperability layer; the common policy layer is used to implement the high-level policies on a variety of mechanisms. The architectures described above, however, largely ignore issues related to the storage of policies in repositories, although such storage is important in any policy management architecture.

7. CONCLUSION

In this paper, we have discussed the issue of data integrity. We analyzed various integrity models proposed in the literature and identified a set of meaningful requirements for data integrity. We also presented an architecture for comprehensive integrity management systems, which is based on metadata templates and integrity control policies. Our integrity control policy supports two important types of integrity management: integrity-related access control and data validation. We presented a flexible policy specification language able to support various integrity-related access control and data validation requirements.


We believe the discussion in this paper provides a concrete framework for integrity management systems, but much work remains to be done. Our future work includes implementing our integrity management system on top of a DBMS. This will enable us to evaluate our system in more practical terms, such as performance and scalability, and we expect to discover many more interesting issues. We also plan to investigate the issue of data validation in depth. This will require us to extend the scope of our work to areas such as data cleansing and sanitization. We believe such an extension is crucial in order to fully address the issue of data integrity.

8. REFERENCES
[1] M. Greenwald, A. Keromytis, S. Ioannidis, and J. Smith. Scalable security policy mechanisms for the Internet. Technical Report MS-CIS-01-05, University of Pennsylvania, 2001.
[2] G. Ahn and R. Sandhu. The RSL99 language for role-based separation of duty constraints. In the Fourth ACM Role-Based Access Control Workshop, 1999.
[3] E. Bertino and R. Sandhu. Database security: concepts, approaches, and challenges. IEEE Transactions on Dependable and Secure Computing, 2005.
[4] K. Beznosov and Y. Deng. A framework for implementing role-based access control using CORBA security service. In the Fourth ACM Role-Based Access Control Workshop, 1999.
[5] K. J. Biba. Integrity considerations for secure computer systems. Technical Report TR-3153, Mitre, 1977.
[6] M. Bishop. Computer Security: Art and Science. Addison-Wesley, 2003.
[7] F. Chen and R. Sandhu. Constraints for role-based access control. In the First ACM/NIST Role-Based Access Control Workshop, 1995.
[8] D. Clark and D. Wilson. A comparison of commercial and military computer security policies. In IEEE Symposium on Security and Privacy, 1987.
[9] N. Damianou. A Policy Framework for Management of Distributed Systems. PhD thesis, Imperial College of Science, London, 2002.
[10] D. Ferraiolo, J. Cugini, and R. Kuhn. Role-based access control (RBAC): features and motivations. In Computer Security Applications Conference, 1995.
[11] Organization for Economic Co-operation and Development (OECD). OECD guidelines on the protection of privacy and transborder flows of personal data, 1980. Available at www1.oecd.org/publications/e-book/9302011E.PDF.
[12] T. Fraser. LOMAC: low water-mark integrity protection for COTS environments. In IEEE Symposium on Security and Privacy, 2000.
[13] S. Garfinkel. Database Nation: The Death of Privacy in the 21st Century. O'Reilly, 2000.
[14] R. Hayton, J. Bacon, and K. Moody. Access control in an open distributed environment. In IEEE Symposium on Security and Privacy, 1998.
[15] M. Hitchens and V. Varadharajan. Tower: a language for role based access control. In the Policy Workshop, 2001.
[16] J. Hoagland, R. Pandey, and K. Levitt. Security policy specification using a graphical approach. Technical Report CSE-98-3, University of California, Davis, 1998.
[17] R. Bhatia, J. Lobo, and S. Naqvi. A policy description language. In the Sixteenth National Conference on Artificial Intelligence, 1999.
[18] S. Jajodia, P. Samarati, and V. Subrahmanian. A logical language for expressing authorizations. In IEEE Symposium on Security and Privacy, 1997.
[19] M. Kohli and J. Lobo. Policy based management of telecommunication networks. In the Policy Workshop, 1999.
[20] J. Moffet and M. Sloman. Policy hierarchies for distributed systems management. IEEE JSAC, Special Issue on Network Management, 11(9):1404-1414, December 1993.
[21] M. Nash and K. Poland. Some conundrums concerning separation of duty. In IEEE Symposium on Security and Privacy, 1990.
[22] United States Department of Justice. The Federal Privacy Act, 1974. Available at www.usdoj.gov/foia/privstat.htm.
[23] United States Office of Management and Budget. Guidelines for ensuring and maximizing the quality, objectivity, utility, and integrity of information disseminated by federal agencies, 2002. Available at http://www.whitehouse.gov/omb/fedreg/reproducible.html.
[24] R. Ortalo. A flexible method for information system security policy specification. In the 5th European Symposium on Research in Computer Security (ESORICS), 1998.
[25] R. Ramakrishnan and J. Gehrke. Database Management Systems. McGraw-Hill, 2000.
[26] A. Ribeiro, A. Zuquete, and P. Ferreira. SPL: an access control language for security policies with complex constraints. In Network and Distributed System Security Symposium (NDSS), 2001.
[27] R. Sandhu. Transaction control expressions for separation of duties. In the 4th Aerospace Computer Security Conference, 1988.
[28] R. Sandhu. Terminology, criteria and system architectures for data integrity. In the NIST Invitational Workshop on Data Integrity, 1989.
[29] R. Sandhu. Separation of duties in computerized information systems. In the IFIP WG11.3 Workshop on Database Security, 1990.
[30] R. Sandhu. On five definitions of data integrity. In the IFIP WG11.3 Workshop on Database Security, 1993.
[31] R. Sandhu and S. Jajodia. Integrity mechanisms in database management systems. In NIST-NCSC National Computer Security Conference, 1990.
[32] R. Simon and M. Zurko. Separation of duty in role-based environments. In the 10th Computer Security Foundations Workshop, 1997.
[33] Y. Snir, Y. Ramberg, J. Strassner, R. Cohen, and B. Moore. Policy QoS information model, 2003. Available at ftp://ftp.rfc-editor.org/in-notes/rfc3644.txt.

