Professional Documents
Culture Documents
Version 1.0
John Zheng
Center of Excellence, 2001
Interwoven, Inc.
1195 Fremont Avenue
Sunnyvale, CA 94087
Tel: 408-774-2000
Fax: 408-774-2002
www.interwoven.com
© 2001 Interwoven, Inc. All rights reserved. Interwoven, TeamSite, OpenDeploy, SmartContext and the logo are
registered trademarks of Interwoven, Inc., which may be registered in certain jurisdictions. DataDeploy, the tagline and
service mark are trademarks of Interwoven, Inc. All other trademarks are owned by their respective owners.
CONSULTING SERVICES
Table of Contents
1 INTRODUCTION
2 BACKGROUND
1 INTRODUCTION
This document describes the concepts and issues behind an effective backup system
for TeamSite. We will look at traditional backup methods and their limitations, and
examine some specialized solutions required for larger implementations.
2 BACKGROUND
Most TeamSite customers can use conventional backup methods and back up their
TeamSite system within a limited time window. However, more and more customers
have very large backing stores, or have high availability requirements that make
conventional backup techniques ineffective. While Interwoven is not in the business
of providing a backup solution to its customers, this document provides some
guidance on what customers should look for when considering an enterprise-wide
solution.
1. Freeze the backing store with iwfreeze +<seconds>
2. Backup <iw-store>
3. Unfreeze the backing store with iwfreeze --
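The sequence above is easy to get wrong when a backup fails midway and leaves the store frozen. The following sketch (the iwfreeze path and backup command are assumptions, not part of this paper) wraps an arbitrary backup command so the store is always unfrozen afterwards:

```python
import subprocess

def backup_with_freeze(iwfreeze, backup_cmd, timeout_secs=7200):
    """Freeze the backing store, run a backup command, always unfreeze.

    iwfreeze     -- path to <iw-home>/bin/iwfreeze (adjust for your install)
    backup_cmd   -- the backup command as an argv list (any backup tool works)
    timeout_secs -- iwfreeze +<seconds> auto-thaws as a safety net if we crash
    """
    # Freeze: the server goes read-only until iwfreeze -- (or the timeout)
    subprocess.run([iwfreeze, "+%d" % timeout_secs], check=True)
    try:
        return subprocess.run(backup_cmd).returncode
    finally:
        # Unfreeze: users can edit files in TeamSite again
        subprocess.run([iwfreeze, "--"], check=True)
```

A typical invocation might be backup_with_freeze("/usr/iw-home/bin/iwfreeze", ["tar", "cf", "/backup/iw-store.tar", "/local/iw-store"]); the try/finally guarantees the unfreeze runs even if the backup command dies.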
3.1.2 ISSUES
There can be various issues with this procedure depending on a system's
requirements:
a. This is unavoidable if the system was fairly busy right before the
backup starts. The system may have dirty cache points that need to be
flushed out to disk before the backing store can be frozen. The status
of the cache points can be determined with <iw-home>/bin/iwstat -c,
for example:
[c:\]e:\iw-home\bin\iwstat -c
Status: Server running
shows a system with 10,000 dirty points and 5000 to flush, quite a
busy system indeed! Systems with a large cachesize setting in iw.cfg
can take a long time to flush all the cache points when the system is
extremely busy.
a. Some sites with large numbers of files or a rich version history can
take days to back up all files in iw-store.
a. When the backing store is frozen, the server operates in read-only
mode, and users are unable to edit files in TeamSite.
4. The backing store constantly grows, and backups can take longer and longer
each week.
a. For many backing stores, it would take much longer than 6-8 hours to
back up all files in <iw-store>. For some customers, there is concern
that the allowable time window will quickly be surpassed as the backing
store grows each month.
6. Conventional backup methods are useful only for backing stores less than
10-100 GB.
1. The allocation unit size, inode/fragment size, or cluster size of the file system.
On NT, be sure to format NTFS using allocation units of either 512 or 1024 bytes. This
is achieved from the command line with format /A:512 d:, or format /A:1024 d:. You
can also use Windows Explorer or the Disk Management module in the Computer
Management MMC, and tell them to format using an Allocation unit size of either 512 or
1024 bytes.
On Solaris, use either 512 or 1024 byte inode and fragment sizes when you create a
UFS partition using newfs. For example, newfs -i 512 -f 512 <device>, or newfs -i 1024
-f 1024 <device>. The inode and fragment size pose two different problems on UFS
partitions. In a UFS partition, each unique file takes up an inode, so the backing store
partition should be configured to have a large number of inodes by telling newfs to use
a small number of bytes per inode with the -i command line option. The fragment size
is the smallest amount of disk space a file will occupy. Having a fragment size that's
too big means a large number of small files will waste a lot of disk space.
With Veritas, make sure the file system uses 1K clusters instead of the default 8K
clusters.
Throughout the rest of this paper, the terms file allocation unit size, fragment size, and
cluster size are synonymous.
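The effect of the fragment/cluster size on disk usage can be illustrated with a small calculation (a sketch; the file sizes are illustrative, not measured values):

```python
def disk_usage(file_size, frag):
    """Bytes a file actually occupies on disk: its size rounded up to a
    whole number of fragments / allocation units / clusters."""
    return ((file_size + frag - 1) // frag) * frag

# A 100-byte metadata file occupies a full unit either way, but the
# waste differs by nearly an order of magnitude:
print(disk_usage(100, 1024))  # 1K fragments: 924 bytes wasted
print(disk_usage(100, 8192))  # 8K clusters:  8092 bytes wasted
```

This is why the paper recommends 512 or 1024 byte units for a backing store that is dominated by many small files.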
Growth = n x roundup(nSize, f) + m x roundup(mSize, f)

where Growth is how much the backing store will grow, n is the number of files added, f
is the fragment size of the file system (roundup(x, f) rounds x up to the nearest multiple
of f), nSize is the average size of new files, m is the number of files modified, and
mSize is the average size of modified files. For example, a backing store with a 1K
fragment size that has 30,000 new files, and 10,000 modified files that are 64K on
average would grow by:

30,000 x 64K + 10,000 x 64K = 2,560,000K, or roughly 2.5GB

in a month. Please note that these calculations are applicable only up to TeamSite
5.0X. Future versions of TeamSite will have a different backing store structure that will
require a different formula.
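The monthly growth estimate can be checked with a few lines of code (a sketch of this paper's TeamSite 5.0X-era approximation, using the example's numbers):

```python
def roundup(x, f):
    """Round x up to the next multiple of the fragment size f."""
    return ((x + f - 1) // f) * f

def monthly_growth_kb(n, n_size_kb, m, m_size_kb, frag_kb=1):
    """Estimated backing-store growth in KB: new files plus modified
    files, each rounded up to whole fragments."""
    return (n * roundup(n_size_kb, frag_kb)
            + m * roundup(m_size_kb, frag_kb))

# The example from the text: 30,000 new + 10,000 modified files,
# 64K average, on a 1K-fragment file system
print(monthly_growth_kb(30000, 64, 10000, 64))  # 2560000 KB, ~2.5GB
```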
Increase = n x roundup(nSize, f) + n x (roundup(dSize, f) + f)

where Increase is the increase in size of the backing store, n is the number of files
submitted and published, nSize is the average size of those files, f is the fragment size
(the second term approximates the per-file directory metadata and bookkeeping
overhead), d is the number of directories touched, and dSize is the average size of
directory metadata files. For example, submitting and publishing 1000 new files that
are 64K on average, and assuming these files span 10 directories where dSize is 2K
on average, your backing store (with 1K fragment size) would increase by:

1000 x 64K + 1000 x (2K + 1K) = 67,000K, or roughly 67MB
The data files account for 64MB of the growth, so the metadata files account for only
3MB of growth in this example. From the example, we can see that publishing a lot of
editions in a branch will result in a moderate growth of the backing store. A related
factor is that a directory with a large number of items will have a bigger directory
metadata size, and each time that directory is touched, the backing store will grow a bit
more than in this example. Furthermore, both of these factors, publishing a large
number of editions and having a large number of items in a directory, can cause
severe performance problems over time. As a result, Interwoven recommends keeping
less than 1000 editions per branch and less than 500 files per directory, but keep the
edition count under 500 if there are 1000 files per directory. Directories over 1000
items should be avoided.
Again, the above calculations apply only to TeamSite 5.0X and earlier; future
versions of TeamSite will have a different backing store format that will require a
different formula.
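The branch-sizing guidance above can be captured as a small helper (a sketch; the thresholds are this paper's recommendations, and treating 500-1000 files per directory as requiring the tighter edition cap is an interpretation):

```python
def branch_within_limits(editions, files_per_dir):
    """Apply the paper's rules of thumb: avoid directories over 1000
    items; keep editions under 500 when a directory holds around 1000
    files; otherwise keep editions per branch under 1000."""
    if files_per_dir > 1000:
        return False                # directories over 1000 items: avoid
    if files_per_dir > 500:
        return editions < 500       # large directories: tighter cap
    return editions < 1000
```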
On Windows NT and Windows 2000, the numbers you want to keep track of are as
follows:
1. Run chkdsk on the backing store drive to find out how many allocation units
are being used, versus how many are available on the disk. It will also tell
you how big the allocation unit is on the drive.
2. Right click on the folder that contains the backing store, and click Properties.
After the tally completes, you should get statistics about the Size versus the
Size on disk. The difference between these two numbers is the disk space
being wasted due to files smaller than an allocation unit taking up a whole
allocation unit.
1. Run df -F ufs -o i to find out the inodes used versus inodes free. This will let
you know if you are in danger of running out of inodes on your UFS partition.
2. Run df -k on the backing store partition to get statistics about disk space
used.
3. Run df -g on the backing store partition to figure out the fragment size.
After several weeks, you should be able to get a pretty good idea whether allocation
units are running out on Windows, or inodes are running out on Solaris, or the backing
store is simply growing due to a lot of new or modified files in the backing store.
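Tracking these numbers week over week is easiest with a small script. The sketch below parses Solaris-style df -F ufs -o i output to watch inode consumption; the sample column layout is an assumption, so verify it against your Solaris release:

```python
def parse_df_inodes(df_output):
    """Return per-filesystem inode usage from `df -F ufs -o i` style
    output. The column layout here is an assumption -- check yours."""
    rows = []
    for line in df_output.strip().splitlines()[1:]:  # skip header line
        fs, iused, ifree, pct, mount = line.split()
        rows.append({"fs": fs, "iused": int(iused),
                     "ifree": int(ifree), "mount": mount})
    return rows

sample = """Filesystem             iused    ifree  %iused  Mounted on
/dev/dsk/c0t0d0s7    1200000   300000     80%   /local/iw-store
"""
for r in parse_df_inodes(sample):
    pct_used = 100 * r["iused"] // (r["iused"] + r["ifree"])
    print(r["mount"], pct_used)  # alert as this creeps toward 100
```

Logging these values weekly makes the trend (inodes running out versus plain data growth) obvious long before the partition fills.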
2. Allow the ability to freeze the backing store prior to a backup and to unfreeze
the backing store after a backup is finished
You can freeze the backing store by running a batch file using the at command or a
Windows Scheduled Task just prior to a scheduled backup, and unfreeze it by running
another batch file just after the scheduled backup. The following is a list of commercial
software to look into that should satisfy the above requirements:
- Veritas NetBackup
- Legato
- CA Arcserve
Most of these commercial backup tools should work fairly well with TeamSite. Windows
2000’s built-in backup tool should work but may not be as feature rich as other
commercial backup tools.
Commercial software that should work with Solaris includes Legato and Veritas. Also,
the built-in tools ufsdump and volcopy can be used to backup or make a copy of the
backing store. Beware that tools like tar, which do a file system copy, can take
especially long to read the backing store. Sector-by-sector tools like ufsdump and
volcopy don't need to read files through the filesystem, and can provide a fixed,
predictable amount of time to backup a file system regardless of the number of files in
the file system.
1. Freeze the backing store with iwfreeze +<seconds>
2. Break off the third mirror, keeping the other two mirrors for fault tolerance
3. Unfreeze the backing store with iwfreeze -- (users can start using TeamSite in
r/w mode again)
4. Backup the broken-off third mirror
5. Reconnect the third mirror when the backup has completed to resync with the
active mirrors.
The capability to have a third mirror with Veritas can be found on their web site at:
- http://www.veritas.com/us/products/volumemanagernt/prodinfo.html (NT)
Please consult with Veritas to confirm whether they support 3-way mirrors with Volume
Manager for Solaris.
On Solaris, Veritas Volume Manager supports data snapshots. The procedure for
backing up a snapshot follows:
1. Freeze the backing store with iwfreeze +<seconds>
2. Create a snapshot of the backing store volume
3. Unfreeze the backing store with iwfreeze -- (users can start using TeamSite in
r/w mode again)
4. Backup the snapshot
Veritas describes the snapshot capability of Volume Manager for Solaris and the
Veritas file system for Solaris at:
-http://www.veritas.com/us/products/volumemanager/prodinfo.html
-http://www.veritas.com/us/products/filesystem/prodinfo.html
There is a potential bug with the Veritas file system’s snapshot capability which causes
the backing store to grow abnormally fast. This bug is supposed to be addressed with
VxFS 3.4.
The procedure to backup a NetApp filer hosting TeamSite's backing store follows:
1. Freeze the backing store with iwfreeze +<seconds>
2. Create a snapshot on the filer
3. Unfreeze the backing store with iwfreeze -- (users can resume editing files in
TeamSite)
4. Backup the snapshot
10 WRAP UP
We have seen the proper procedure to backup TeamSite, and the effectiveness of
conventional backup methods for smaller sites. Sites that need near 24x7 operation, or
have a backing store that is too large to back up in the allowable time window, need to
use specialized solutions. These include options such as Veritas 3-way mirroring,
Veritas data snapshots, and NetApp snapshots. O'Reilly has a good book on Unix
Backup and Recovery, which can be found at their website:
http://www.oreilly.com/catalog/unixbr/