Bill Minor - IBM Toronto Lab
bminor@ca.ibm.com
TLU-1243A
Data Servers - DB2 for Linux, UNIX, Windows
Highlights
The cost of disk storage represents a significant portion of the overall expense associated with large database systems. Once purchased, managing that storage can significantly add to the total cost of ownership. Effective management and utilization of disk space is instrumental in keeping your database Real Estate costs in check. The goals of this presentation are to:
- Provide in-depth details on the reorg utility
- Provide an overview of Data Management in DB2
- Highlight customer usage scenarios including best practices, monitoring, tuning, autonomics and troubleshooting
- Illustrate the role of reorganization in new Viper features such as Table Partitioning, Data Row Compression, and Large RIDs
Agenda
- DB2 Real Estate
- Overview of Reorganization
- Table Compression
- Page and Extent Size Selection
- DMS Tablespace Architecture
- Registry Variables
- High Water Mark
- Large Record Identifiers (RIDs)
- Log Space Consumption
A Confession!
I am not a Realtor, Financial Analyst, Investment Advisor, Stock Trader, Card Counter, or Poker Tour Champ.
One does not have to be an expert to realize that investing in real estate is a significant proposition.
By analogy, your DB2 storage is a critical and valuable investment. Just as there are many facets, intricacies, and strategies when dealing with Real Estate, so too with your management of DB2 Storage.
DB2 Reorganization
Many changes to table data (INSERTs/UPDATEs/DELETEs) can affect the physical organization of table and index data to the point where performance is adversely affected
Goals of REORG:
- Defragment or compact data onto fewer data pages
- Physically recluster data into the same logical sequence as an index
- Eliminate pointer-overflow records
- DB2 9: build a (new) compression dictionary and compress the rows in the table using that dictionary
- Conversion to Large RIDs
- Schema changes
The result: Access to a reorganized object can be done with minimal I/O and bufferpool misses as well as with maximum prefetcher effectiveness i.e. maintain or improve query performance
OFFLINE (the default): Table available for read-only access during reorg, up to the copy phase
Examples:
db2 reorg table staff index inx1_staff inplace allow write access
db2 reorg table emp inplace pause on dbpartitionnum(10 to 100)
db2 reorg table emp_resume longlobdata
db2 reorg table department resetdictionary
db2 reorg table payroll index pr1 use tempspace1
Processing Modes:
- Reclustering via table scan sort (default) or index scan (via INDEXSCAN clause)
- Space reclamation (compaction) via table scan
reorged, XML data is not "reorged", only empty pages are removed
A reorg may be required because clustering index isn't well clustered so a table scan sort will give better I/O characteristics (may be slower for sparse tables where index itself is somewhat small)
[Figure: classic reorg of table T1 across USERSPACE1 and TEMPSPACE1, with labels SORT, TDASPILL, TDAMERGE, SHADOW, "3x TEMP" and "2x TEMP"]
Attributes:
- Minimal extra storage requirement
- Incremental: benefit of effects seen immediately
- No iterative log processing phase
- Table quiesce for object 'switch over' at end can be avoided
- Think of it as a Trickle Reorg
Reclustering:
db2 reorg table t1 index i1 inplace
vs.
Space Reclamation:
db2 reorg table t1 inplace
VACATE PAGE RANGE: MOVE & CLEAN to free space
FILL PAGE RANGE: MOVE & CLEAN to fill space
Backward scan starts at end, fills holes earlier in table identified by simultaneous forward scan
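The two-scan idea described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the `compact` function, the list-of-lists page model and the fixed `capacity` are hypothetical illustrations, not how DB2 implements inplace reorg internally.

```python
def compact(pages, capacity):
    """Toy space reclamation: rows from the end of the table fill holes
    found earlier in the table.

    pages    -- list of pages, each page a list of rows (hypothetical model)
    capacity -- maximum rows per page (assumed fixed for simplicity)
    """
    fwd, back = 0, len(pages) - 1
    while fwd < back:
        if len(pages[fwd]) >= capacity:
            fwd += 1                      # forward scan: no hole on this page
        elif not pages[back]:
            back -= 1                     # backward scan: page fully vacated
        else:
            # Move one row from the tail page into the hole near the front.
            pages[fwd].append(pages[back].pop())
    # Trailing empty pages can now be truncated.
    while pages and not pages[-1]:
        pages.pop()
    return pages


table = [[1, 2], [3], [], [4, 5], [6]]
print(compact(table, capacity=2))
```

Note that this toy only reclaims space; it makes no attempt to preserve or improve row ordering, which is what the reclustering form (with an index) is for.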
F1: 100 * OVERFLOW / CARD < 5
The total number of overflow records in the table should be less than 5%
F2: 100 * (Effective Space Utilization of Data Pages) > 70
There should be less than 30% free space in the table
F3: 100 * (Required Pages / Total Pages) > 80
The number of pages that contain no rows at all should be less than 20% of the total number of pages in the table
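As a rough illustration, the three table formulas can be evaluated directly from their inputs. The function name and parameter names below are hypothetical; the thresholds are the ones stated in F1 to F3, and the effective space utilization is assumed to be supplied as a fraction between 0 and 1.

```python
def table_reorg_flags(overflow, card, effective_space_utilization,
                      required_pages, total_pages):
    """Return the set of table formulas (F1-F3) that are NOT satisfied.

    A non-empty result suggests the table may be a reorg candidate.
    Empty tables (card or total_pages of 0) are treated as satisfying
    the formula; that is an assumption of this sketch.
    """
    f1 = (100 * overflow / card < 5) if card else True
    f2 = 100 * effective_space_utilization > 70
    f3 = (100 * required_pages / total_pages > 80) if total_pages else True
    checks = (("F1", f1), ("F2", f2), ("F3", f3))
    return {name for name, ok in checks if not ok}


# 8% overflow rows and 65% space utilization: F1 and F2 are violated.
print(table_reorg_flags(overflow=80, card=1000,
                        effective_space_utilization=0.65,
                        required_pages=90, total_pages=100))
```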
Availability: By default, the table is available for read access until phase 3 (can select no access)
optionally reorganized
Inplace Reorg
No truncation
reorganized
By default, the table is available for R/W access. If/when truncate is done, the table is available for read access.
- IX Tablespace Lock
- U Table Lock
- Upgrade to Z Table Lock for Copy Phase

- IX Tablespace Lock
- IS Table Lock
- X Alter Table Lock
- S Row Lock on rows moved/cleaned
- Upgrade to S Table Lock to prepare for Truncation
- Special Z Table Lock for drain/wait on Truncate
Classic Reorg
Fully supported (can be invoked on all or specified DB partitions) Fully supported (can be invoked on all or specified DB partitions)
Fully supported
Inplace Reorg
Not supported
db2pd tool
db2pd -db SAMPLE -reorgs file=reorg_pd.out
(db2pd -db SAMPLE -tcbstats)
(db2pd -db SAMPLE -mempools)
LIST HISTORY
db2 list history reorg all for SAMPLE
Op Obj Timestamp+Sequence Type Dev Earliest Log Current Log Backup ID
-- --- ------------------ ---- --- ------------ ------------ --------------
G T 20070313103729 N S0000023.LOG
----------------------------------------------------------------------------
Table: "BMINOR "."STAFF2"
----------------------------------------------------------------------------
Comment: REORG
Start Time: 20070313103729
End Time: 20070313103729
----------------------------------------------------------------------------

Annotations:
- Op/Obj "G T" = Table REORG ("T"=Table, "I"=Index)
- S0000023.LOG: log file being written to when REORG completed
- NOTE: "Comment" field reports REORG Done Status only for the 'online' case. For 'offline' it specifies reclustering index and temp space ids.
db2_all "db2 connect to sample; db2 select dbpartitionnum,objecttype,sqlcode,start_time from table'(('sysproc.admin_list_hist'())' as listhistory where operation=\\'G\\'"

Database Connection Information

Database server        = DB2/AIX64 9.1.2
SQL authorization ID   = BMINOR
Local database alias   = SAMPLE

DBPARTITIONNUM OBJECTTYPE SQLCODE     START_TIME
-------------- ---------- ----------- --------------
             0 T                    - 20070303152449

  1 record(s) selected.

myhost: db2 connect to sample completed ok

DBPARTITIONNUM OBJECTTYPE SQLCODE     START_TIME
-------------- ---------- ----------- --------------
           100 T                 -964 20070303152118

  1 record(s) selected.

myhost: db2 connect to sample completed ok
CONS:
- Large space requirement: shadow copy approach, so need approximately twice as much space as the original table
- Limited access: read-only until Replace/Copy phase
- All-or-nothing process
- Can only be stopped by the app or user who understands how to stop the process
Recommendation: Choose this method if you can reorganize tables during a maintenance window
CONS:
- Slower than Classic method (~10-20x)
- Only allowed for tables with type-2 indexes
- Cannot reorganize LONG/LOBs
- Indexes are maintained, not rebuilt, so index reorganization may subsequently be required
- Requires more log space
Recommendation: Choose this method for 24x7 operations with minimal maintenance windows
If the table contains mixed row format because the table value compression has been activated or deactivated, an offline table reorganization can convert all the existing rows into the target row format
If the table is partitioned onto several database partitions, and the table reorganization fails on any of the affected database partitions, only the failing database partitions will have the table reorganization rolled back
The granularity of table reorg is at the Database Partition level not the Table Range Partition level
Table Ranges are reorganized sequentially, one after the other, and global indexes are rebuilt once all ranges have been reorganized
only by the load and table reorg. Range is from 0 to 99% with default value of 0
(For MDC tables, clustering is maintained on the columns that you specify as arguments to the ORGANIZE BY DIMENSIONS clause of the CREATE TABLE statement. However, REORGCHK might recommend reorganization of an MDC table if it considers that there are too many unused blocks or that blocks should be compacted)
APPEND mode tables
If the index key values of new rows are always new high key values, for example, then the clustering attribute of the table will try to place them at the end of the table. Having free space in other pages will do little to preserve clustering. Hence, placing the table in append mode may be a better choice than a clustering index
Automatic Dictionary Creation on Table Growth (TLU-1242A) Dictionary created as table is populated and reaches a certain threshold in size (Viper II)
Online Index Reorg : Table Partitioning and MDC Notes and Limitations
MDC
- REORG with ALLOW WRITE not supported
- Note: ALLOW READ is supported
Table Partitioning
- Supports ability to reorg individual indexes (as opposed to ALL indexes of a table)
- Supported in all availability modes (ALLOW NONE, ALLOW READ, ALLOW WRITE)
- Natural thing to do, since with table partitioning, each index for the table is in its own storage object (and OLIR operates on a storage object basis)
- Also supports REORG INDEXES ALL in ALLOW NONE
Ensure the tablespace is large enough for the shadow/ghost object/index
- Remember for Reorg, the shadow object will contain all indexes, so will require (very approximately) the same amount of space as the current index object on the table
- For Create, the ghost index will simply require the space for the newly created index
- Use LARGE tablespaces
Ensure you commit as soon as possible after index creations
- Minimizes time table S lock held
- ALLOW NO ACCESS: Z lock on table
- ALLOW READ ACCESS: S lock on table
- ALLOW WRITE ACCESS: IN lock on table
- S drain lock for each index (all writers must be aware)
- S lock at end to perform final catch-up
- Quiesce concurrent writers: Z lock to perform index switch
F5:
100 * (Space used on leaf pages / Space available on non-empty leaf pages) > MIN(50, (100 - PCTFREE))
Less than 50% of the space reserved for index entries should be empty
F6:
(100 - PCTFREE) * (Amount of space available in an index with one less level / Amount of space required for all keys) < 100
Determine if recreating the index would result in a tree having fewer levels
F8: 100 * (Number of pseudo-empty leaf pages / Total number of leaf pages) < 20
The number of pseudo-empty leaf pages should be less than 20 percent of the total number of leaf pages
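The index formulas F5, F6 and F8 can be checked the same way as simple arithmetic. This is a sketch only: the function and parameter names are hypothetical, and the inputs are assumed to be already available as numbers in consistent units.

```python
def index_reorg_flags(leaf_space_used, leaf_space_available, pctfree,
                      space_one_less_level, space_all_keys,
                      pseudo_empty_leaf_pages, total_leaf_pages):
    """Return the set of index formulas (F5, F6, F8) that are NOT satisfied."""
    # F5: leaf pages should be reasonably full.
    f5 = 100 * leaf_space_used / leaf_space_available > min(50, 100 - pctfree)
    # F6: would the keys fit in a tree with one less level?
    f6 = (100 - pctfree) * space_one_less_level / space_all_keys < 100
    # F8: pseudo-empty leaf pages should stay under 20% of all leaf pages.
    f8 = 100 * pseudo_empty_leaf_pages / total_leaf_pages < 20
    checks = (("F5", f5), ("F6", f6), ("F8", f8))
    return {name for name, ok in checks if not ok}


# Leaf pages only 40% used and 30% pseudo-empty: F5 and F8 are violated.
print(index_reorg_flags(leaf_space_used=400, leaf_space_available=1000,
                        pctfree=10,
                        space_one_less_level=500, space_all_keys=1000,
                        pseudo_empty_leaf_pages=30, total_leaf_pages=100))
```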
TABSCHEMA TABNAME TABTYPE DBPARTITIONNUM AVAILABLE DATA_OBJECT_P_SIZE DATA_OBJECT_L_SIZE INDEX_OBJECT_P_SIZE INDEX_OBJECT_L_SIZE LONG_OBJECT_P_SIZE LONG_OBJECT_L_SIZE LOB_OBJECT_P_SIZE LOB_OBJECT_L_SIZE
XML_OBJECT_P_SIZE XML_OBJECT_L_SIZE INDEX_TYPE REORG_PENDING INPLACE_REORG_STATUS LOAD_STATUS READ_ACCESS_ONLY NO_LOAD_RESTART NUM_REORG_REC_ALTERS INDEXES_REQUIRE_REBUILD LARGE_RIDS LARGE_SLOTS DICTIONARY_SIZE
ADMIN_GET_TAB_COMPRESS_INFO ( )
Column                    Data Type
TABSCHEMA                 VARCHAR(128)
TABNAME                   VARCHAR(128)
DBPARTITIONNUM            SMALLINT
DATA_PARTITION_ID         INTEGER
COMPRESS_ATTR             CHAR(1)
DICT_BUILDER              VARCHAR(30)
DICT_BUILD_TIMESTAMP      TIMESTAMP
COMPRESS_DICT_SIZE        BIGINT
EXPAND_DICT_SIZE          BIGINT
ROWS_SAMPLED              INTEGER
PAGES_SAVED_PERCENT       SMALLINT
BYTES_SAVED_PERCENT       SMALLINT
AVG_COMPRESS_REC_LENGTH   SMALLINT
New to Viper II
page size * extent size == space per block for MDC tables
Very, very important to prevent sparse blocks/cells
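To make the arithmetic concrete, here is a small sketch. The helper names and the sample row counts are hypothetical, and page overhead is ignored, so the fill ratio is only a rough indication of how sparse a cell would be.

```python
def mdc_block_bytes(page_size_bytes, extent_size_pages):
    """Space per MDC block: page size * extent size."""
    return page_size_bytes * extent_size_pages


def cell_fill_ratio(rows_in_cell, avg_row_bytes,
                    page_size_bytes, extent_size_pages):
    """Rough fraction of allocated block space that a cell's rows occupy.

    Ignores page and row overhead (an assumption of this sketch).
    """
    block = mdc_block_bytes(page_size_bytes, extent_size_pages)
    data = rows_in_cell * avg_row_bytes
    blocks_needed = max(1, -(-data // block))   # ceiling division
    return data / (blocks_needed * block)


# 32 KB pages with 32-page extents give 1 MB blocks.
print(mdc_block_bytes(32 * 1024, 32))
# A cell holding 100 rows of 100 bytes is very sparse in a 1 MB block...
print(cell_fill_ratio(100, 100, 32 * 1024, 32))
# ...and much less so with 4 KB pages and 4-page extents (16 KB blocks).
print(cell_fill_ratio(100, 100, 4 * 1024, 4))
```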
[Figure: DMS tablespace extent layout after "create table t1" and "insert into t1". Labels: Tablespace Header, First Extent of SMPs, Object Table Extent, Extent Map for T1 (EMP T1), First Extent of Data Pages for T1 (DAT T1), with extents for a second table T2 (EMP T2, DAT T2) interleaved]
REORG of a table within the DMS tablespace that the table resides in
HWM affects:
- Redirected Restore: redefinition of containers allowing tablespace to shrink in size; cannot be shrunk lower than HWM
- Dropping or reducing the size of a container via ALTER TABLESPACE only affects extents above the HWM
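A toy model may help show why free space below the HWM cannot be given back. The extent map below is a hypothetical list in which each entry is the owning object or `None` for a free extent; the HWM is counted in extents here for simplicity, and none of this reflects DB2's internal structures.

```python
def high_water_mark(extents):
    """Number of extents up to and including the highest allocated one."""
    for i in range(len(extents) - 1, -1, -1):
        if extents[i] is not None:
            return i + 1
    return 0


def hwm_report(extents):
    """Summarize what can and cannot be reclaimed in this toy model."""
    hwm = high_water_mark(extents)
    return {
        "hwm": hwm,
        "holder": extents[hwm - 1] if hwm else None,   # object holding up the HWM
        "free_below_hwm": sum(1 for e in extents[:hwm] if e is None),
        "free_above_hwm": len(extents) - hwm,          # only these can be dropped
    }


extent_map = ["T1", "T1", None, None, "T2", None, "T1", None, None]
print(hwm_report(extent_map))
```

In this example three extents are free below the HWM but only the two above it could be removed by shrinking a container; lowering the HWM would first require moving T1's last extent.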
[Figure: table T1 in a DMS PERM TABLESPACE, with the HWM marked]
db2dart /DHWM
- Displays detailed tablespace information including which extents are free, which are in use and what object is using them, as well as information about the object holding up the HWM
db2dart /LHWM
- Provides guidance as to how the HWM might potentially be lowered
If a DMS table data object is holding up the HWM, then an 'offline' REORG of the table within the DMS tablespace that the table resides in can be used to lower the HWM, provided enough free extents exist below the HWM to contain the shadow copy
If DMS index object holding up HWM, index reorg may be able to reduce HWM
Viper II: ALTER TABLESPACE REDUCE and Online Backup will remove these
Before DB2 9
- RID is 4 bytes: 3 byte page number and 1 byte slot number
- Default table space data type was REGULAR
- Tables (data part) could not be placed in LARGE table spaces
DB2 9
- New 6 byte RID: 4 byte page number and 2 byte slot number
- Infrastructure (runtime, sections, sort, log records, locks) all large RID
- Default table space data type for DMS table spaces is now LARGE
- Tables can now be placed in LARGE table spaces
- Indexes contain regular or large RIDs only, based on the table space type where the table data is stored; it has nothing to do with the type of table space where the index resides
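The effect of the wider RID can be seen from the raw bit-width arithmetic. The sketch below only computes upper bounds implied by the byte counts above; the function name is hypothetical, and actual product limits may well be lower than these raw bounds.

```python
def rid_bounds(page_number_bytes, slot_number_bytes):
    """Raw addressing bounds implied by the RID layout (not product limits)."""
    return {
        "rid_bytes": page_number_bytes + slot_number_bytes,
        "max_pages": 256 ** page_number_bytes,
        "max_slots": 256 ** slot_number_bytes,
    }


regular = rid_bounds(3, 1)   # pre-DB2 9 layout
large = rid_bounds(4, 2)     # DB2 9 large RID layout

print(regular)
print(large)

# With 4 KB pages, a 3 byte page number addresses at most 64 GB of data pages.
print(regular["max_pages"] * 4 * 1024 // 2**30, "GB")
```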
For tables in LARGE table spaces (DMS only). Also all SYSTEM and USER temporary table spaces For tables in all tablespace types: regular, temporary, DMS, SMS
Maximum rows per page by page size

Page Size   REG TBSP         REG TBSP      LARGE TBSP       LARGE TBSP
            Min Rec Length   Max Records   Min Rec Length   Max Records
4 KB        14               251           12               287
8 KB        30               253           12               580
16 KB       62               254           12               1165
32 KB       127              253           12               2335
New tables will fully support large RIDs, both page and slot numbers.
Previously existing tables continue to be restricted to ~255 rows/page and to 3 byte page numbers until a reorganization of the table or its indexes occurs.
SQL1236N Table "<table-name>" cannot allocate a new page because the index with identifier "<index-id>" does not yet support large RIDs
BEST PRACTICE:
- Perform the ALTER TABLESPACE during upgrade/migration
- Be proactive in rebuilding indexes on tables (or reorganizing tables) afterwards
The table will not support >255 rows (slots) per page until the table itself has been reorganized with the classic/offline REORG TABLE
SELECT TABNAME, TABSCHEMA, DBPARTITIONNUM FROM TABLE (ADMIN_GET_TAB_INFO( '', '' )) AS T WHERE LARGE_SLOTS = 'P'
Can my table benefit from large slots (more rows per page)?
SELECT TABSCHEMA, TABNAME, AVGROWSIZE FROM SYSCAT.TABLES
If the (average row size - 2) for a table is smaller than the minimum record length for the page size used, then there could be storage benefits when converting the table space to large and reorganizing the table to enable large slots
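The rule of thumb above is easy to apply to the output of the SYSCAT.TABLES query. The sketch below is illustrative only: the function name is hypothetical, the minimum record lengths are the REG TBSP values from the rows-per-page table in this deck, and treating a negative AVGROWSIZE as "no statistics collected" is an assumption.

```python
# REG TBSP minimum record length, keyed by page size in KB
# (values from the rows-per-page table in this deck).
REG_MIN_REC_LENGTH = {4: 14, 8: 30, 16: 62, 32: 127}


def may_benefit_from_large_slots(avg_row_size, page_size_kb):
    """True if (average row size - 2) is below the regular minimum record length."""
    if avg_row_size < 0:
        # Assumption: a negative value means statistics are not available.
        return False
    return (avg_row_size - 2) < REG_MIN_REC_LENGTH[page_size_kb]


print(may_benefit_from_large_slots(20, 8))     # short rows on 8 KB pages
print(may_benefit_from_large_slots(200, 32))   # long rows on 32 KB pages
```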
Before: Fred 500 10000 Plano TX 24355
After: John 500 10000 Plano TX 24355
Illustrative bit patterns: 11011010100101001100100101001010101011010010101010110101010101010 01 110110101001010011
1. Full XOR logging (length change) with changed column at/near beginning of row
Before: Fred 500 10000 Plano TX 24355
After: Frank 500 10000 Plano TX 24355
Illustrative bit patterns: 11011010100101001100100101001010101011010010101010110101010101010 01
2. Full XOR logging (length change) with changed column at/near end of row
Before: 500 10000 Plano TX 24355 Fred
After: 500 10000 Plano TX 24355 Frank
Illustrative bit patterns: 110110101001010011 01
Before: Fred 500 10000 Plano TX 24355
After: John 500 12345 Plano TX 24355
Illustrative bit pattern: 11011010100000000000000001001001
2. Partial XOR logging (no length change) with no gap between columns being changed
Before: 500 Fred 10000 Plano TX 24355
After: 500 John 12345 Plano TX 24355
Illustrative bit pattern: 1101101010010100110101
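One way to picture why column position matters in the update examples above is to XOR the old and new row images and measure the span of bytes that differ. This is a toy model under loud assumptions: rows are treated as space-separated ASCII text, the shorter image is zero-padded, and `xor_diff_span` is a hypothetical helper. DB2's real row and log record formats are different.

```python
def xor_diff_span(old: bytes, new: bytes):
    """Return (start, end) of the smallest byte range where old XOR new != 0.

    Returns None when the two images are identical.
    """
    n = max(len(old), len(new))
    a, b = old.ljust(n, b"\0"), new.ljust(n, b"\0")
    xored = bytes(p ^ q for p, q in zip(a, b))
    changed = [i for i, v in enumerate(xored) if v]
    if not changed:
        return None
    return changed[0], changed[-1] + 1


def span_length(old: bytes, new: bytes) -> int:
    span = xor_diff_span(old, new)
    return 0 if span is None else span[1] - span[0]


# Length change near the beginning shifts everything after it.
print(span_length(b"Fred 500 10000 Plano TX 24355",
                  b"Frank 500 10000 Plano TX 24355"))
# The same change at the end of the row touches only a few bytes.
print(span_length(b"500 10000 Plano TX 24355 Fred",
                  b"500 10000 Plano TX 24355 Frank"))
# No length change: a gap between changed columns widens the span.
print(span_length(b"Fred 500 10000 Plano TX 24355",
                  b"John 500 12345 Plano TX 24355"))
print(span_length(b"500 Fred 10000 Plano TX 24355",
                  b"500 John 12345 Plano TX 24355"))
```

In this model, placing frequently updated or variable-length columns near the end of the row, and next to each other, keeps the differing span small.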
Summary
- Real Estate is a BIG investment
- Knowing details about your DB2 Real Estate will allow you to better leverage that investment
- With DB2 9 (Viper) and Viper II (DB2 9.5), significant new functionality has been developed to help with the management of storage
- Going forward, one can expect the trend to continue
THANK YOU!!!
Your feedback is greatly appreciated.