This document is the property of:
Pegasystems Inc.
101 Main Street
Cambridge, MA 02142-1590
(617) 374-9600, fax: (617) 374-9620
www.pega.com

PegaRULES Process Commander
Document: Purging and Archiving Work Objects
Software Version: 5.1
Posting Date: October 2006
Contents
Introduction
Mapping Classes to Database Tables
  About Classes and Tables
  Verifying Class-to-Table Maps
  Listing All Classes Mapped to one Table
  Computing Row Counts by Class
The Work Object Data Model
  Tables for Work Objects
  pc_work Table
  pc_history_work Table
  pc_assign_workbasket, pc_assign_worklist and pr_assign Tables
  pc_link_attachment Table
  pr_link Table
  pc_link_folder Table
  pc_data_workattach Table
  pc_index_workparty Table
  pc_data_uniqueid Table
  pr_data Table
Case 1: Purging All Work Objects
  Lookup List Cache
Case 2: Purging Selected Work Objects
  Selection
  Purging
Case 3: Using Date-Range Partitions
Case 4: Archiving
Summary and Redundant Data
Introduction
In a production Process Commander application, the volume of data for work objects typically grows to require hundreds of megabytes or terabytes of database storage. If information from older work objects is no longer needed, or is needed only infrequently, it makes sense to remove selected older data from the live production database, possibly storing it in a separate archive system. Removing selected work objects can provide these benefits:
- Improved performance, because database operations operate on tables with fewer rows.
- Reduced space requirements for the database.
- Reports and analyses run against an archive or shadow system do not affect production operations.
You can implement such purging and archiving with standard relational database operations. However, because information about a work object is stored in several tables, and not all Process Commander properties are available as individual columns, planning is required. Care is needed to ensure the internal consistency and integrity of the database after purging or archiving operations are completed.
This article describes the PegaRULES database tables that hold work object details and the relationships among them, and provides sample SQL statements for specific situations. (Database administrators will need to adjust these sample SQL statements to meet specific architectural and business needs.)
References to tables and properties correspond to the database schema as initially installed in Version 5.1SP1, though the approach is generally applicable to Version 4.2SP4 and later. Your applications may introduce additional tables or additional relationships among the tables. In that case, you must adapt the specific SQL and techniques presented here to your needs.
Caution: Before using any of the techniques described in this article, back up the entire PegaRULES database.
In this article, purging means deleting data permanently, with no means for restoring it in the future. Archiving means removing data from the primary PegaRULES database, allowing for a possible future restore operation, or allowing it to be accessed (for searching or reporting purposes, not updates) in a separate database and separate Process Commander system.
One database table may contain rows corresponding to instances of various Process Commander classes. For each row, the column named pxObjClass identifies the Process Commander class.
Instances of the Data-Admin-DB-Table class define a many-to-one mapping between classes and database tables. For example, a Data-Admin-DB-Table record maps the class Data-Admin-OperatorID to the table pr_operators. When an object is saved (committed to the database), a search algorithm uses the pattern of dashes and letters in the class name to find a Data-Admin-DB-Table record that applies, and chooses the default table pr_other if none is found.
The unique key to most tables is a single column named pzInsKey. However, the internal structure of this key varies from table to table.
Most tables contain a column named pzPvStream, known as the Storage Stream, typically declared as a BLOB (Binary Large Object) data type. This contains all non-key values (properties) of a Process Commander object, compressed into a single value. The internal format of pzPvStream varies from Process Commander release to release. Data values inside the pzPvStream column can be accessed only through Process Commander facilities.
Columns other than the pzPvStream column and the pzInsKey column are known as exposed columns, and hold individual string, date, and number values. These can be scanned, selected, and searched through ordinary SQL statements. Each column name corresponds to a rule within Process Commander known as a property. For example, in the pr_operators table, the column pyUserName holds the user's full name as a text string.
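Because pxObjClass is an exposed column, ordinary SQL can show which classes actually have rows in a given table. The following is a minimal sketch rather than a statement from the original article. It assumes the pc_work table and standard SQL aggregate syntax, and the result column names ClassName and InstanceCount are illustrative only. Substitute any table that carries the pxObjClass column.

```sql
/* count the rows of each Process Commander class stored in one table
   (pc_work is used here as an example; the aliases are illustrative) */
select pxObjClass as ClassName,
       count(*)   as InstanceCount
from pc_work
group by pxObjClass
order by InstanceCount desc
```

Classes that are mapped to the table but have no saved instances do not appear in the result, because the query counts only rows that exist.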
Process Commander uses an internal locking mechanism separate from the database locking facility. For maximum safety, purging or archiving operations should take place only at a time when no users (or background processes) are accessing the database.
Every table contains a column that holds the date and time that the row was first saved (committed). In most tables, the column is named pxCreateDateTime. DateTime values are in UTC format to the nearest millisecond, following the pattern YYYYMMDDTHHMMSS.ZZZ GMT, for example 20060616T151041.093 GMT for June 16, 2006 at 15:10:41.093 GMT. (In some cases, Process Commander rounds the values as they are saved into the database.)
Process Commander employs an internal indexing mechanism (controlled by Rule-Declare-Index rules) that is similar in purpose to, but independent of, database indexes.
Because of these principles, if you store data from Process Commander in a similarly structured database (but without Process Commander software), you can report only on exposed properties. Data values for properties that are available only in the pzPvStream column are inaccessible.
Further, you cannot simply move rows from one PegaRULES database table to another, even if both tables have the same columns and structure, without careful coordination with changes to Data-Admin-DB-Table maps.
Additionally, the Process Commander mapping requires that all instances of a class be in one table. You cannot store current work objects in one table and older work objects (of the same class) in a second table.
1. Expand the Class Explorer tree control to locate the class name. (Adjust your Class Explorer preferences if some classes are not presented.)
2. Click the icon at the left of the name to view the Class rule form in the workspace.
3. Click the Basic tab if necessary to bring it to the front.
4. Click the Test Connectivity button. Process Commander identifies the table in a pop-up window.
4. The Step 3 display lists all the classes mapped to that table (whether or not any rows of that class are present). In this example, the pc_work table may contain rows from any of 30+ classes. Currently, this table contains 79 rows.
Purging and Archiving Work Objects The Work Object Data Model
identifier for each party (participant), such as an account number, Social Security Number, e-mail address, or telephone number.
pc_data_uniqueid. Supports unique work object numbering.
pr_data. Holds unsent correspondence for work objects.
In addition to the tables listed above, the PegaRULES database includes a few views and numerous indexes that involve the above tables. If you modify the contents of the tables, the database software maintains correct contents of views and indexes. The Entity-Relationship chart shows the relevant relationships among these tables.
- All the columns listed are exposed properties holding a text value. (The tables contain many additional columns not shown.)
- PK identifies the unique key for each table, the pzInsKey column. As noted above, the structure and content of values for this column differ greatly from table to table.
- FK identifies a column and property that Process Commander uses as a foreign key to access a row of another table. (These are interpreted by software; they are not identified as foreign keys in the database schema.)
- Numbers on the arrows indicate the multiplicity of the relationship. For example, one work object (pc_work table) may have many different work object history records (pc_history_work table).
- All tables but one contain the pxCreateDateTime column, recording the date and time (in UTC, previously known as Greenwich Mean Time) that the row was first saved. (In the pc_history_work table, this column is named pxTimeCreated.)
pc_work Table
Work objects are typically stored in the pc_work table, or another table with a similar structure. The unique pzInsKey value for a work object consists of a work pool name (sometimes called a class group) and a work object ID. The pc_work table is related to all the other database tables presented in the diagram.
pzInsKey. PK. Primary key. Example: PEGASAMPLE ! W-15
pxObjClass. Work type (class). Example: PEGASAMPLE-TASK
Work object ID, held in two columns. Example: W-15
pyStatusWork. Work object status. The values that start with Resolved indicate that a work object is resolved and no further updates are expected. Example: Resolved-Completed
pyResolvedTimeStamp. Date and time (in GMT format) that the work object was resolved. Example: 20041109T204536.146 GMT
pxCoverInsKey. FK. For a work object that is a member of a cover, identifies the pzInsKey value of the covering work object. This is a many-to-one relationship. Example: PEGASAMPLE- ! C-145
pxCoveredCount. For a work object that is a cover, an integer that records the number of member work objects that are open (unresolved).
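The cover relationship matters when choosing what to purge, because a resolved member may still belong to a cover that is open. The following sketch is not part of the original samples. It assumes that pxCoverInsKey is null for work objects that are not members of a cover and that covers are stored in pc_work alongside their members. The aliases MemberTable and CoverTable are illustrative.

```sql
/* list resolved work objects whose covering work object is not yet resolved;
   assumes covers and their members are both stored in pc_work */
select MemberTable.pzInsKey      as WorkItemKey,
       MemberTable.pxCoverInsKey as CoverKey
from pc_work MemberTable
where MemberTable.pyStatusWork like 'Resolved%'
  and MemberTable.pxCoverInsKey is not null
  and exists (
      select CoverTable.pzInsKey
      from pc_work CoverTable
      where CoverTable.pzInsKey = MemberTable.pxCoverInsKey
        and CoverTable.pyStatusWork not like 'Resolved%')
```

Rows returned by a query of this kind are candidates to retain until the cover itself is resolved.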
pc_history_work Table
This table contains detailed history of a work object, noting who updated the work object and on what date and time. The key structure links the date and time that the history instance was created to the work object. The two parts of the key are also available as separate columns, pxHistoryForReference and pxTimeCreated. Every work object has at least one history record in this table, recording the date and time the work object was created. Over time, the history of a work object may grow to contain dozens or hundreds of records. The pc_history_work table has a many-to-one relationship with the pc_work table, linked by the pxHistoryForReference value. For example, the pxHistoryForReference value PEGASAMPLE ! W-15 identifies a row as one of the history rows for the work object W-15.
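To see how much history each work object carries, the many-to-one link can be followed with an ordinary join. This is a minimal sketch under the assumption that the work objects of interest are stored in pc_work and belong to the PegaSample work pool used elsewhere in this article. The result column names are illustrative.

```sql
/* count history rows for each work object of a given work pool */
select WorkTable.pzInsKey as WorkItemKey,
       count(*)           as HistoryRows
from pc_work WorkTable, pc_history_work HistoryTable
where HistoryTable.pxHistoryForReference = WorkTable.pzInsKey
  and WorkTable.pxObjClass like 'PegaSample%'
group by WorkTable.pzInsKey
order by HistoryRows desc
```

Work objects with the largest counts contribute the most rows to pc_history_work, which can help in estimating the space a purge would recover.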
pc_link_attachment Table
Entries in the pc_link_attachment table associate a work object with an attachment. Each row corresponds to an attachment, identified by the date and time that the attachment was saved. However, this table records only associations; the body of an attachment is in a separate table named pc_data_workattach. This table includes two foreign key references. Column pxLinkedRefFrom identifies the work object to which this attachment belongs. Column pxLinkedRefTo holds a date and time value that matches a key of the pc_data_workattach table.
One work object may have none, one, or many attachments. Each attachment is associated, through this table, with one work object.
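The two foreign key references can be followed in a single query to list each attachment next to its work object. This sketch uses the column names pxLinkedRefFrom and pxLinkedRefTo as they appear in the sample SQL of this article, and joins to pc_data_workattach on its pzInsKey column. The result column names are illustrative.

```sql
/* list each attachment with the work object it belongs to */
select AttachLinkTable.pxLinkedRefFrom as WorkItemReference,
       AttachTable.pxObjClass          as AttachmentClass,
       AttachTable.pzInsKey            as WorkAttachKey
from pc_link_attachment AttachLinkTable, pc_data_workattach AttachTable
where AttachLinkTable.pxLinkedRefTo = AttachTable.pzInsKey
order by AttachLinkTable.pxLinkedRefFrom
```

Link rows whose target is stored elsewhere (for example, correspondence in pr_data) are not returned by this join.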
pr_link Table
No standard Process Commander classes are mapped to the pr_link table. However, if your application uses the Link-Object method to create instances of a custom class derived from the Link- class, they are stored in this table by default. The schema and columns of this table are similar to those of the pc_link_attachment table. If your application contains custom links, the linked-to objects may be stored in a custom table, identified by a pzInsKey that is the date and time value the link was committed. (Procedures below do not include any custom tables in your application.)
pc_link_folder Table
Rows exist in this table only when the application uses folders, a many-to-many relationship among work objects. In contrast to the cover relationship, folders can be members of folders, and a work object can belong to multiple folders at once. Folders provide quick access to the member work objects. Process Commander enforces no additional constraints or relationships between the folder work object and its member work objects, although your applications may introduce them. This table includes the following columns that hold exposed properties:
pxObjClass. Class of this object, LINK-FOLDER. Initial portion of the pzInsKey value.
pxLinkedRefFrom. FK. Identifies the work object that belongs to a folder. Formed from the class group name (work pool) and the work object ID. Middle portion of the pzInsKey value.
pxLinkedRefTo. FK. Identifies the folder work object. Final portion of the pzInsKey value. Many different work objects can be linked to the same folder work object.
pxLinkedClassFrom. Work type (class), such as PegaSample-Task, of the work object that belongs to the folder.
The key of a Link-Folder instance consists of the pxObjClass value and two work object references, pxLinkedRefFrom and pxLinkedRefTo, separated by a single exclamation point. For example, the value:
LINK-FOLDER PEGASAMPLE ! W-15 PEGASAMPLE ! F-208
indicates that work object W-15 belongs to the folder work object F-208. Both are in the PEGASAMPLE work pool. If your application uses folders, purging work objects while maintaining a folder structure for any work objects not purged may depend on application-specific details that are not addressed in this article.
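Before purging, it can be useful to know which candidate work objects are members of folders. The following sketch is an assumption-laden example, not one of the original samples. It assumes that the pxLinkedRefFrom value equals the pzInsKey of the member work object in pc_work, and it reuses the work pool and date range of the other samples in this article.

```sql
/* list folder memberships of resolved work objects in a given date range */
select FolderLinkTable.pxLinkedRefFrom as WorkItemReference,
       FolderLinkTable.pxLinkedRefTo   as FolderReference
from pc_link_folder FolderLinkTable
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pzInsKey = FolderLinkTable.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14')
```

An empty result suggests that the folder structure is not affected by purging the selected work objects.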
pc_data_workattach Table
This table holds the contents of attachments, each associated with a work object. One work object may be linked to none, one, or many attachments. This table is related to the pc_link_attachment table, a one-to-one relationship based on the date and time value in the pzInsKey value. Note that no column in the pc_data_workattach table identifies the work object ID.
Five classes can hold attachments:
Data-WorkAttach-File. Holds an uploaded file in Base64 encoding. The file before encoding may be as large as 25 megabytes.
Data-WorkAttach-Note. Holds a brief text note of 1024 characters or less.
Data-WorkAttach-Snapshot. Holds a Base64-encoded TIFF image, a screen capture.
Data-WorkAttach-URL. Holds a URL link.
Data-WorkAttach-ScanDocument. Holds a Base64-encoded TIFF image produced by a document scanner.
File attachments contain an uploaded file (converted to characters using Base64 encoding and stored in the pzPvStream column value) that may be up to 25 megabytes in original size. (Through a setting in your Process Commander prconfig.xml file, you can change this limit size to a higher or lower value.)
pc_index_workparty Table
A work party in a work object is a person, account, business or other organization that is in some way involved in the business process, and who may receive correspondence. The pc_index_workparty table supports rapid search for work objects that contain a specified party. (A standard Declare Index rule adds and deletes entries in this table.) The pxInsIndexedKey column provides a foreign key reference to the pc_work table. This is a many-to-one relationship: one work object may have one or more work parties, and so may have one or more rows in this table.
pc_data_uniqueid Table
Entries in this table assure unique work object numbers, by recording the highest value already assigned. The key to this table consists of a work object prefix. For instance, a row with key C- records the number of the last work object that was created on this system and used a C- prefix. This prefix may be used only by one application or one work type, or may be used by many applications and many work types. (In earlier Process Commander releases, the organization name was also part of the key. Your system may reflect this earlier approach.) In most cases, you should not alter the contents of this table when purging work objects, unless you are deleting all work objects, or all work objects with a specific prefix, and wish to restart numbering at 1.
pr_data Table
The pr_data table contains rows corresponding to instances of various classes. Four values of the pxObjClass column indicate rows that hold correspondence, outgoing messages sent to a party.
Correspondence records in this table may be sent (transmitted or printed) or unsent. To avoid possible later confusion, it is a good practice to retain (not purge) work objects when correspondence remains unsent. (Typically, correspondence is printed or sent out promptly after it is generated, so it would be unusual to find unsent correspondence for work objects resolved some time ago.) While correspondence remains unsent, an instance of the Assign-Corr class (a row of the pr_assign table mentioned above) is present and linked to the row in the pr_data table. Once the correspondence has been sent, the Assign-Corr instance is deleted. One work object may have none, one, or many correspondence items.
Purging and Archiving Work Objects Case 1: Purging All Work Objects
3. Click the DeleteLookupList button.
4. Repeat the steps above for each server node in the cluster.
Either technique can be used while the system is available and in use.
Case 2: Purging Selected Work Objects
Selection
This sample SQL selects, but does not delete, the rows of tables involved with work objects resolved between two dates.

/* select attachments from pr_data (probably correspondence) that are
   linked to work objects of a given work pool that were resolved in a
   given date range */
select pr_data.pzInsKey as DataKey
from pr_data
where exists (
    select AttachLinkTable.pzInsKey
    from pc_link_attachment AttachLinkTable, pc_work WorkTable
    where WorkTable.pzInsKey = AttachLinkTable.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and AttachLinkTable.pxLinkedRefTo = pr_data.pzInsKey)

/* select attachments from pc_data_workattach that are linked to work
   objects of a given work pool that were resolved in a given date range */
select pc_data_workattach.pzInsKey as WorkAttachKey
from pc_data_workattach
where exists (
    select AttachLinkTable.pzInsKey
    from pc_link_attachment AttachLinkTable, pc_work WorkTable
    where WorkTable.pzInsKey = AttachLinkTable.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and AttachLinkTable.pxLinkedRefTo = pc_data_workattach.pzInsKey)

/* select attachment links that are linked to work objects of a given
   work pool that were resolved in a given date range */
select pc_link_attachment.pzInsKey as LinkAttachKey,
       pc_link_attachment.pxLinkedRefFrom as WorkItemReference
from pc_link_attachment
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pzInsKey = pc_link_attachment.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14')

/* select history for work objects of a given work pool that were
   resolved in a given date range */
select pc_history_work.pzInsKey as HistoryKey,
       pc_history_work.pxHistoryForReference as WorkItemReference
from pc_history_work
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and pc_history_work.pxHistoryForReference = WorkTable.pzInsKey)

/* select workparty indices for work objects of a given work pool that
   were resolved in a given date range */
select pc_index_workparty.pzInsKey as PartyIndexKey,
       pc_index_workparty.pxInsIndexedKey as WorkItemReference
from pc_index_workparty
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and pc_index_workparty.pxInsIndexedKey = WorkTable.pzInsKey)

/* select resolved work objects from a given date range that do not have
   assignments; each assignment is matched through pxRefObjectKey, which
   is assumed here to be exposed in all three assignment tables */
select pc_work.pzInsKey as WorkItemKey,
       pc_work.pxObjClass as WorkItemClass
from pc_work
where pc_work.pxObjClass like 'PegaSample%'
  and pc_work.pyStatusWork like 'Resolved%'
  and pc_work.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
  and not exists (
      select AssignWorklist.pzInsKey
      from pc_assign_worklist AssignWorklist
      where AssignWorklist.pxRefObjectKey = pc_work.pzInsKey)
  and not exists (
      select AssignWorkbasket.pzInsKey
      from pc_assign_workbasket AssignWorkbasket
      where AssignWorkbasket.pxRefObjectKey = pc_work.pzInsKey)
  and not exists (
      select AssignGeneric.pzInsKey
      from pr_assign AssignGeneric
      where AssignGeneric.pxRefObjectKey = pc_work.pzInsKey)
You can report on the selected rows, to confirm that the SQL selected the expected ones.
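One simple way to confirm the selection is to count the rows each statement would affect and compare the totals with your expectations. The following sketch shows the idea for the history table only, reusing the work pool and date range of the samples in this article. The result column name is illustrative, and the same pattern applies to the other tables.

```sql
/* count, rather than list, the history rows that match the selection */
select count(*) as HistoryRowsSelected
from pc_history_work
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and pc_history_work.pxHistoryForReference = WorkTable.pzInsKey)
```

Recording these counts before purging gives a figure to compare against the number of rows that the corresponding DELETE later reports.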
You may also use vendor-supplied archiving tools, such as Data Transformation Services from Microsoft or Import-Export Services from Oracle, to save the selected rows in an archival format.
Purging
After you are satisfied with the selected rows, you can repeat the operations above but using DELETE rather than SELECT. (The need for a final COMMIT depends on the database software and your use of transactions.) Order is important when doing deletions. This sample SQL deletes rows in a recommended order:
1. Attachments and any other linked objects
2. Links
3. History
4. Indexes
5. Assignments (not included in this example; typically none exist)
6. Work objects that are not covers or folders
7. Covers (not addressed in this example)
8. Folders (not addressed in this example)
/* delete attachments from pr_data (probably correspondence) that are
   linked to work objects of a given work pool that were resolved in a
   given date range */
delete from pr_data
where exists (
    select AttachLinkTable.pzInsKey
    from pc_link_attachment AttachLinkTable, pc_work WorkTable
    where WorkTable.pzInsKey = AttachLinkTable.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and AttachLinkTable.pxLinkedRefTo = pr_data.pzInsKey)

/* delete attachments from pc_data_workattach that are linked to work
   objects of a given work pool that were resolved in a given date range */
delete from pc_data_workattach
where exists (
    select AttachLinkTable.pzInsKey
    from pc_link_attachment AttachLinkTable, pc_work WorkTable
    where WorkTable.pzInsKey = AttachLinkTable.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and AttachLinkTable.pxLinkedRefTo = pc_data_workattach.pzInsKey)

/* delete attachment links that are linked to work objects of a given
   work pool that were resolved in a given date range */
delete from pc_link_attachment
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pzInsKey = pc_link_attachment.pxLinkedRefFrom
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14')

/* delete history for work objects of a given work pool that were
   resolved in a given date range */
delete from pc_history_work
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and pc_history_work.pxHistoryForReference = WorkTable.pzInsKey)

/* delete workparty indices for work objects of a given work pool that
   were resolved in a given date range */
delete from pc_index_workparty
where exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pyStatusWork like 'Resolved%'
      and WorkTable.pxObjClass like 'PegaSample%'
      and WorkTable.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
      and pc_index_workparty.pxInsIndexedKey = WorkTable.pzInsKey)

/* delete resolved work objects from a given date range that do not have
   assignments; each assignment is matched through pxRefObjectKey, which
   is assumed here to be exposed in all three assignment tables */
delete from pc_work
where pc_work.pxObjClass like 'PegaSample%'
  and pc_work.pyStatusWork like 'Resolved%'
  and pc_work.pyResolvedTimeStamp between '2006-10-13' and '2006-10-14'
  and not exists (
      select AssignWorklist.pzInsKey
      from pc_assign_worklist AssignWorklist
      where AssignWorklist.pxRefObjectKey = pc_work.pzInsKey)
  and not exists (
      select AssignWorkbasket.pzInsKey
      from pc_assign_workbasket AssignWorkbasket
      where AssignWorkbasket.pxRefObjectKey = pc_work.pzInsKey)
  and not exists (
      select AssignGeneric.pzInsKey
      from pr_assign AssignGeneric
      where AssignGeneric.pxRefObjectKey = pc_work.pzInsKey)
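After a purge, a consistency check can confirm that no dependent rows were left behind. The following is a minimal sketch, assuming that every work object with history is stored in pc_work. If your application maps some work types to other tables, their history rows would be reported here even though nothing is wrong. The result column name is illustrative.

```sql
/* count history rows that refer to a work object no longer in pc_work */
select count(*) as OrphanHistoryRows
from pc_history_work
where not exists (
    select WorkTable.pzInsKey
    from pc_work WorkTable
    where WorkTable.pzInsKey = pc_history_work.pxHistoryForReference)
```

A count of zero is the expected result under the stated assumption. The same pattern can be applied to pc_link_attachment (through pxLinkedRefFrom) and pc_index_workparty (through pxInsIndexedKey).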
Case 4: Archiving
You can hold resolved work objects, including history and attachments, in a shadow database on another Process Commander system. This allows occasional access or the possibility of restoring some or all of the data later. To create an archive database consisting only of the archived work objects:
1. Create a copy of the entire database; this becomes an archive.
2. On the production system, purge the work objects no longer needed, including attachments, history, and correspondence.
3. Install a Process Commander server that provides access to the archive. Note that the archive database tables have the same schema as the production database.
Note: You cannot archive work objects by moving the rows for some work objects to another table in the production Process Commander system. Process Commander requires that all rows with a single pxObjClass value (identifying the Process Commander class name) be in a common table. (The pxObjClass value also appears in various other columns and properties within the tables discussed in this note.)