
Saturday, November 19, 2011

DAOS Best Practices

When it comes to configuring the Domino Attachment and Object Service (DAOS), you may be asking yourself - and us in turn - what's the right way to set it up? For example, is there an optimum “Minimum Size” setting? Should the repository go under the data directory or on its own drive? What's the best “Deferred Deletion Interval” in relation to my backup and restore schedule? This guide, and the documents it references, attempts to answer these questions of individual, site-specific configuration in general terms with guidelines for adapting and modifying them based on measurements made against your particular environment.


Where To Locate Your DAOS Base Path Repository



By default the DAOS repository resides under the server's data directory and defines a single container as indicated by the “DAOS Base Path” setting in the DAOS tab of the server document. So, on Windows for example, if you use the default, “DAOS”, and your data directory is C:\Lotus\Domino\Data, the full path to the repository would be C:\Lotus\Domino\Data\DAOS. For Domino 8.5, only this one container may be specified.

However, this default location, chosen simply for being well-known, may not be the most efficacious. Some things to consider:
    1. What is the total capacity of all file attachments?
       With only one container in Domino 8.5, flexibility may be important when choosing the best DAOS base path. You'll want to be sure you have significant storage capacity or the ability to reconfigure a logical drive as space needs increase. Use the Domino Attachment and Object Service Estimator to plan for your storage requirements.
    2. What I/O costs do I expect to incur?
       DAOS base path I/O is significantly less than that of Domino's Data directory. In benchmark tests, DAOS repository I/O was 94% less than that of the server's Data directory. Lower performing storage (a NAS device, for example) can be used here.
    3. Can I use lower cost or external storage devices?
       In many cases, you might find attachments are infrequently accessed -- for example, when they're part of old email messages collecting the proverbial dust in one's inbox. In these environments, locating DAOS on lower cost storage (tier 3) devices may be indicated. On the other hand, if full text indexing, agents, or other applications make heavy use of the consolidated attachments, “lower cost” storage may cost you in performance.
       Note: Externalizing the DAOS repository in this manner does not mean you can map multiple Domino servers to the same container. This is an unsupported configuration as of this publication and could very well lead to data loss due to encryption with the server's key. NLO files cannot be shared across Domino servers.
       Note: Modifying the location of the DAOS repository at a later time is allowed. First change the “DAOS Base Path” field on the DAOS tab in the server document, then stop the Domino server and relocate the existing subdirectory structure with its NLO files to the new location. On server restart, the modification will take place seamlessly.
    4. Why is it recommended not to locate the DAOS Base Path under Domino's Data directory?
       Many of Domino's tasks, including fixup, compact and the admin client, scan Domino's Data directory and will also scan the NLO files if they are located under it. Scanning NLO files for tasks other than DAOSMGR adds unneeded overhead to the processing done by those tasks.
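As a quick sanity check for point 4, the sketch below tests whether a proposed DAOS base path would end up inside the data directory. It is only an illustration: the function name is_inside and the E:\DAOS path are made up for this example, and the one behavior it relies on from the text above is that a relative base path such as “DAOS” resolves beneath the data directory.

```python
from pathlib import PureWindowsPath


def is_inside(data_dir: str, daos_base_path: str) -> bool:
    """Return True if the DAOS base path is the data directory or sits beneath it."""
    data = PureWindowsPath(data_dir)
    daos = PureWindowsPath(daos_base_path)
    # A relative base path such as "DAOS" is resolved against the data directory.
    if not daos.is_absolute():
        daos = data / daos
    return daos == data or data in daos.parents


# Default setting: the repository lands under the data directory.
print(is_inside(r"C:\Lotus\Domino\Data", "DAOS"))      # True
# Hypothetical separate drive: the repository is outside the data directory.
print(is_inside(r"C:\Lotus\Domino\Data", r"E:\DAOS"))  # False
```

PureWindowsPath is used so the check runs on any platform and compares Windows paths case-insensitively.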

Optimum Minimum Size For Participation



By default, the minimum size setting for an attachment to make use of DAOS is 4096 bytes. While we recommend using 64000 as the lowest value you should use here (1048576 on iSeries), there are a number of things to consider when determining the best DAOS minimum size setting for your system.
    1. Do not set the minimum size lower than the default setting.
       Due to attachment file overhead, setting the minimum size to anything lower than the default size would actually be less efficient than storing the attachment in the NSF file.
    2. Set a minimum size that is a multiple of your file system's disk block size.
       By choosing a minimum size that is a multiple of the disk block size, you optimize disk usage. To ascertain the disk block size for your file system on Windows NTFS, use “fsutil fsinfo ntfsinfo ” and take note of the “Bytes Per Cluster”. This is the disk block size. On Solaris, you could use df -g and take note of the block size. On AIX you need to be super user to determine the block size; use lsfs -q and look for block size. On Linux you also need to be super user: use df -k to determine the device name of your filesystem, then use dumpe2fs | grep 'Block size' to determine the block size.
    3. Take note of possible limitations on number of files.
       The smaller you make the setting, the more attachments will qualify for DAOS consolidation. The larger you make the setting, the fewer will qualify. In Domino 8.5, the DAOS repository allows for one container with up to 1,000 subcontainers, each with a maximum of 40,000 NLO files. Thus the storage capacity of DAOS is limited to 40 million distinct objects. This is a significant number of files, so if you expect to come anywhere close to approaching it, you should check the limits on your backup and restore solution, as some applications and file systems have limitations on maximum number of files. Refer to your operating system and/or backup application guidelines.
    4. The current recommendation for the lowest value you should use on IBM iSeries is 1048576 (1M) to avoid overwhelming the filesystem and backup utilities.
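The arithmetic behind points 2 and 3 can be sketched as follows. This is a minimal illustration, not a Domino tool: the function names are invented here, and the 4096-byte block size in the usage lines is an assumed example value that you would replace with the block size reported for your own file system.

```python
MAX_SUBCONTAINERS = 1_000          # Domino 8.5: subcontainers per container
MAX_NLO_PER_SUBCONTAINER = 40_000  # Domino 8.5: NLO files per subcontainer


def round_up_to_block(min_size: int, block_size: int) -> int:
    """Round a candidate minimum size up to the next multiple of the block size."""
    if min_size <= 0 or block_size <= 0:
        raise ValueError("sizes must be positive")
    blocks = -(-min_size // block_size)  # ceiling division
    return blocks * block_size


def max_nlo_files() -> int:
    """Upper bound on distinct objects in a single DAOS container."""
    return MAX_SUBCONTAINERS * MAX_NLO_PER_SUBCONTAINER


# Assumed 4096-byte blocks: the recommended floor of 64000 rounds up to 65536.
print(round_up_to_block(64000, 4096))  # 65536
print(max_nlo_files())                 # 40000000
```

Rounding up rather than down keeps the result at or above the recommended floor.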

To get an idea of how many files various settings would generate, you can run them through the Domino Attachment and Object Service Estimator.

The ultimate goal with this setting is to minimize the number of files in your DAOS repository and maximize the amount of disk space saved.

Deferred Deletion Interval



DAOS automatically deletes NLO files that are no longer being referenced by any databases. This deletion of NLO files is known as “pruning” and occurs at the specified “Deferred Deletion Interval.”

Establishing a useful “Deferred Deletion Interval” for your server involves a few considerations, primary among them your backup and restore schedule. You want to ensure that NLO files which are no longer needed remain in existence at least as long as your backup cycle. In this way, they will not be deleted before the next backup.

A secondary consideration is the size of attachments typically stored in the repository for your server. If they are usually quite large, you may want to have them cleaned up as quickly as possible after there are no longer any references to them.

If your deferred deletion interval is set too high, NLO files which are no longer needed will continue to take up valuable space in the repository. If your deferred deletion interval is set too low, you could be deleting NLO files that have not yet been backed up, thus making it difficult, and in some cases not even feasible, to restore them. It is important to find a balance that satisfies both your backup schedule and the system integrity of a neatly pruned environment.

Pruning



Pruning can also be manually triggered to override the automatic deferred deletion interval. The administrator can issue the console command “tell daosmgr prune x” to forcibly delete unreferenced NLOs that are x days old. This will recover the disk space still being used by unreferenced NLO files immediately rather than waiting for the automatic deferred deletion interval to do so. When performing this action, you must consider your backup cycles. As with setting the deferred deletion interval too low, pruning too soon could delete NLO files that have not yet been backed up.
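One way to keep the backup cycle in view is to refuse to build the prune command when the requested age is shorter than the time since the last backup. The helper below is a sketch under that assumption; prune_command and its parameters are hypothetical names, it only builds the console string quoted above, and it does not talk to a server.

```python
def prune_command(age_days: int, days_since_last_backup: int) -> str:
    """Build a 'tell daosmgr prune x' console command, guarding against early pruning."""
    if age_days < 0 or days_since_last_backup < 0:
        raise ValueError("day counts must not be negative")
    # Assumption: an NLO that has been unreferenced for at least as long as the
    # time since the last backup was still on disk when that backup ran.
    if age_days < days_since_last_backup:
        raise ValueError(
            f"prune age of {age_days} day(s) is shorter than the "
            f"{days_since_last_backup} day(s) since the last backup"
        )
    return f"tell daosmgr prune {age_days}"


print(prune_command(7, 1))  # tell daosmgr prune 7
```

The same comparison applies when choosing the automatic deferred deletion interval: it should not be shorter than your backup cycle.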

When A Notes Database Should Use DAOS



There are several good reasons to select an NSF file for participation in DAOS consolidation:
  • It contains or is likely to have multiple copies of the same attachments.
    Even a single NSF can benefit from DAOS consolidation.
  • It resides on a server where the same attachments appear across multiple NSF files.
    If others are also referencing attachments present in your database, why not share?
  • It contains very large attachments.
    In this case, it may not matter how many other NSF files hold the attachments in question. If they're large enough, the simple step of storing them outside the NSF can make common operations against that database much faster.

While DAOS can always benefit your data, DAOS has less benefit under the following conditions:
  • Databases have lots of small attachments.
    Attachment consolidation is less efficient in this scenario due to disk blocking. You can, however, eliminate this issue by adjusting the minimum size setting upward.
  • There is little or no attachment duplication across databases.
    Backup would still benefit due to extracting static data, but you would have little disk space reduction.
  • Databases contain few or no attachments.
    In 8.5, DAOS stores only file attachment data.
  • Databases need to be quickly portable.
    Because DAOS object files (NLOs) cannot be shared across Domino servers, it is more difficult to move DAOS-enabled Notes databases from server to server.

It's not necessary for all databases on a server to leverage DAOS, but for those that do, the savings in both space and time (for example, in much faster compact operations) are significant.

Mailbox



Although DAOS will work in any configuration, it operates most efficiently when it is enabled on both the mail.box files and individual mail files. Enabling transaction logging and DAOS on mail.box will enable the Router to optimize the delivery of DAOS based attachments. This can result in significant I/O savings for the case where the same attachment is sent to multiple recipients on the same Domino server.

I/O Activity for mail delivery of email with an attachment

DAOS on MAIL.BOX | DAOS on mail files | Document Writes | Attachment Read(s) | Attachment Write(s) | Comments
No               | No                 | 1 + N           | 1 + N              | 1 + N               |
Yes              | No                 | 1 + N           | 1 + N              | 1 + N               |
No               | Yes                | 1 + N           | 1 + N              | 1 + N               |
Yes              | Yes                | 1 + N           | 1                  | 1                   | Maximum reduction for both I/O & CPU

* N is the number of recipients

Note:
If you have multiple mail.box files, you must enable transaction logging and DAOS on all of them to leverage DAOS object copy optimization, which streamlines delivery of attachments.

Since Domino creates new mailboxes as needed, you should also set these properties on the mail.box template. If you choose not to enable DAOS or transaction logging on mail.box, DAOS will still be used by any DAOS-enabled mail files. Using DAOS on mail.box(es) only affects the optimized routing (delivery) of attachments.

When an incoming document is received at mail.box, it is stored there until it is delivered to the individual mail file(s) of the recipient(s). Several results are possible:
  • If DAOS is not enabled anywhere, the document will be stored in mail.box, and the attachment will be stored inline. As the document is delivered to the recipient mail file(s), the document and the contents of the attachment are read from mail.box, and the document is written to the mail file with the attachment inline. Total I/O cost to deliver to N recipients: 1 + N doc writes, 1 + N attachment reads, 1 + N attachment writes.
  • If DAOS is enabled on both mail.box and destination mail file(s), any attachments in that document will be extracted and converted to NLO files as it is being written to mail.box. The document and DAOS ticket are written to the destination mail file(s). IMPORTANT: In the case where both mail.box and the mail file are DAOS-enabled, the contents of the attachment will not be written again as the document is delivered; only a reference to the existing NLO file will be copied. Total I/O cost to deliver to N recipients: 1 + N doc writes, 1 attachment read, 1 attachment write.
  • If DAOS is enabled on mail.box but not on the destination mail file(s), any attachments in that document will be extracted and converted to NLO files as it is being written to mail.box. Since the destination mail files are not using DAOS, the attachment must be stored inline, and the contents of the attachment will be read out of mail.box (which has it stored in DAOS) in order to do that. Total I/O cost to deliver to N recipients: 1 + N doc writes, 1 + N attachment reads, 1 + N attachment writes.
  • If DAOS is not enabled on mail.box, but is for the destination mail file(s), the attachment will be stored inline in mail.box. As the document is delivered to the recipient mail file(s), the contents of the attachment are read out of the mail.box document, and a temporary NLO file is created for each destination mail file so that a checksum can be calculated. If an NLO file with the same checksum already exists, the temporary file is deleted. In the case of N recipients, this process will be repeated N times, even though only one NLO file will remain at the end of the process. Total I/O cost to deliver to N recipients: 1 + N doc writes, 1 + N attachment reads, 1 + N attachment writes. IMPORTANT: In this case, although the end result (a single NLO file per unique attachment) is the same, the I/O cost is significantly increased over the case where mail.box is enabled for DAOS.
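The four cases above reduce to a small cost model, sketched below. The names DeliveryIO and delivery_io are invented for this illustration; the counts are the ones given in the list above, not measurements.

```python
from typing import NamedTuple


class DeliveryIO(NamedTuple):
    document_writes: int
    attachment_reads: int
    attachment_writes: int


def delivery_io(recipients: int, mailbox_daos: bool, mail_files_daos: bool) -> DeliveryIO:
    """I/O operations needed to deliver one message with an attachment to N recipients."""
    if recipients < 1:
        raise ValueError("at least one recipient is required")
    # The document is always written once to mail.box and once per mail file.
    document_writes = 1 + recipients
    if mailbox_daos and mail_files_daos:
        # Only a reference to the existing NLO file is copied on delivery.
        return DeliveryIO(document_writes, 1, 1)
    # In every other combination the attachment is read and written per recipient.
    return DeliveryIO(document_writes, 1 + recipients, 1 + recipients)


print(delivery_io(10, mailbox_daos=True, mail_files_daos=True))
print(delivery_io(10, mailbox_daos=False, mail_files_daos=True))
```

For ten recipients, the fully enabled case needs one attachment read and one attachment write, while every other combination needs eleven of each.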

Mail Journaling



For 8.5, it is recommended that you not enable DAOS on the mail journal (mailjrn.nsf).

Encryption



By default, DAOS employs encryption to safeguard its repository. This setting is separate from encryption settings that apply to an NSF or document. The encryption is done with the server key, so the resulting NLO files can be read only on a server that uses that same key. This may be a consideration for backup or redundant server setup.

The performance hit for DAOS encryption is negligible; testing showed a 5% CPU increase with no change in I/O versus unencrypted data. However, if your organization has reason to disable it, we've provided the server notes.ini setting DAOS_ENCRYPT_NLO, which can be set to zero to effect that change. To determine the current status of encryption, use the “sh stat daos” command from the server console.

Some storage area network devices can deduplicate files. Unless DAOS_ENCRYPT_NLO=0 is specified, deduplication will not occur across Domino servers: even duplicate files will not qualify, because they will be encrypted with each server's key.

Compression



While DAOS is compatible with compression, there are a couple of points to remember:
    1. It's possible for an attachment to disqualify itself from DAOS consolidation by compressing to a size smaller than the Minimum Size setting.
    2. The same attachment, undergoing different compression types (LZ1 versus Huffman versus no compression), will be seen by DAOS as different objects and will, therefore, be shared only with others of like type.

Resynchronization of the Catalog



In order to ensure that NLO files are not physically deleted when there are still ticket holders referencing them, if there is any reason to question the accuracy of a reference count, DAOS puts itself in a safe mode whereby no deletes are allowed to proceed. This state is signaled to the administrator via the Domino Domain Monitoring systems and is reported as a “NEEDS RESYNC” state from the “tell daosmgr status catalog” server console command.

To perform this catalog resynchronization manually, type “tell daosmgr resync” from the server console.

Note: The duration of a resync can be significant and depends on the number of DAOS-enabled databases, the number of NLO files in your environment, and your system configuration. It can take several hours to complete, which may cause it to run into normal business hours. Although Domino and DAOS are functional while a resync is in progress, there may be a degradation in performance while it is running.

Prior to 8.5.2, if you find it necessary to halt the resync operation, you can interrupt it by issuing “tell daosmgr quit” from the server's console. However, for continued operation of DAOS, immediately restart the DAOSMGR task using the console command “load daosmgr.” When it is convenient to continue the resync operation, issue “tell daosmgr resync” again from the server's console and resync processing will continue where it left off.

8.5.2 introduces many improvements to the performance of the resync operation. Resync completes sooner in this release, and improvements have been made to keep the catalog in sync. Resync can now be scheduled to run within a time window by setting DAOS_RESYNC_START_TIME and DAOS_RESYNC_STOP_TIME. Setting a time window eliminates the need to stop and start daosmgr when resyncing the catalog.

Typically, the only downside to being in "Needs Resync" state is that DAOS suspends pruning operations. Unless you are experiencing errors, it is usually better to wait to do a resync until the next available maintenance window.

Antivirus



Consider DAOS interaction with antivirus scans. It's critical the DAOS base path and .NLO file extension have the same anti-virus policy as the Domino data directory and .NSF file extension. If the two file types have different policies, consolidating attachments from an existing NSF using “compact -daos on -c” can result in NLO files being quarantined as they're extracted from the NSF. A user who then opens a quarantined attachment would see a “Missing NLO” error message.

A Domino add-in antivirus program is recommended so that the attachments are scanned as they pass through the server. You should not scan the NLO files directly. (Similarly, the transaction log files should be excluded from scans.)

The screenshot below shows how you'd configure Symantec Antivirus to exclude the .NLO file extension. From the Configure folder on the left, select File System Auto-Protect. Click the Exclusions button, then the Extensions button. Type “NLO” without the quotes and click Add.



Recommended versions of Notes Client and Domino Server



For the Notes Client, use a version earlier than 8.5.1, or 8.5.1 FP3 or later. For the Domino Server, the minimum version should be 8.5.1 FP3. For more information, see Technote 1446397, "Attachment corruption related to DAOS and Domino 8.5.1".

Ideally the Domino Server should be at 8.5.2 to gain both operational and performance improvements made to DAOS catalog synchronization as well as other DAOS improvements.

Worst practices



1. Too small a minimum participation size - getting too "greedy"
Limit the participation size to a value that yields a reasonable disk space saving with the smallest number of NLO files. Let's review the following output from the "Domino Attachment and Object Service Estimator":

DAOS Minimum Size versus number of NLO's and Disk Space:

0.0 KB will result in 2226347 .nlo files using 185.5 GB
64.0 KB will result in 1092894 .nlo files using 175.7 GB
128.0 KB will result in 708403 .nlo files using 163.6 GB
256.0 KB will result in 422087 .nlo files using 145.9 GB
512.0 KB will result in 219833 .nlo files using 120.2 GB
1.0 MB will result in 93628 .nlo files using 87.8 GB
2.0 MB will result in 36576 .nlo files using 56.6 GB
3.0 MB will result in 17499 .nlo files using 38.0 GB
4.0 MB will result in 9717 .nlo files using 26.3 GB
8.0 MB will result in 1576 .nlo files using 6.5 GB

While the theoretical maximum (first line) would generate approximately 2.2M files using 185 GB, this would not be the ideal participation size for two important reasons.

The first reason is that the benefit of DAOS can be realized without having to maintain 2.2M NLO files. A more reasonable participation size between 128 KB and 256 KB would result in a number of NLO files in the range of 422,087 to 708,403. Taking an average of the two participation sizes, a value of 192 KB would result in approximately 500,000 NLO files with a disk space size of about 150 GB. The net result would be an 80% total size yield with a little less than 1/4 of the theoretical maximum NLO files to maintain in your backup/restore procedures.

The second reason is that reducing the minimum participation size is more easily done than increasing it.
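To compare settings more systematically, the Estimator lines can be turned into shares of the theoretical maximum. The script below is a rough sketch: the function names are invented, and the regular expression assumes the output always has the exact shape of the ten lines shown above.

```python
import re

ESTIMATOR_OUTPUT = """\
0.0 KB will result in 2226347 .nlo files using 185.5 GB
64.0 KB will result in 1092894 .nlo files using 175.7 GB
128.0 KB will result in 708403 .nlo files using 163.6 GB
256.0 KB will result in 422087 .nlo files using 145.9 GB
512.0 KB will result in 219833 .nlo files using 120.2 GB
1.0 MB will result in 93628 .nlo files using 87.8 GB
2.0 MB will result in 36576 .nlo files using 56.6 GB
3.0 MB will result in 17499 .nlo files using 38.0 GB
4.0 MB will result in 9717 .nlo files using 26.3 GB
8.0 MB will result in 1576 .nlo files using 6.5 GB
"""

# Assumed line shape: "<size> KB|MB will result in <count> .nlo files using <GB> GB"
LINE = re.compile(r"^([\d.]+) (KB|MB) will result in (\d+) \.nlo files using ([\d.]+) GB$")


def parse_estimator(text):
    """Return (minimum size in KB, NLO file count, disk space in GB) per line."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            size_kb = float(match.group(1)) * (1024 if match.group(2) == "MB" else 1)
            rows.append((size_kb, int(match.group(3)), float(match.group(4))))
    return rows


def relative_to_maximum(rows):
    """Express each row as a share of the first row (the theoretical maximum)."""
    _, max_files, max_gb = rows[0]
    return [(kb, files / max_files, gb / max_gb) for kb, files, gb in rows]


for kb, file_share, space_share in relative_to_maximum(parse_estimator(ESTIMATOR_OUTPUT)):
    print(f"{kb:>7.0f} KB: {file_share:6.1%} of files, {space_share:6.1%} of space")
```

On these figures, a 256 KB minimum keeps roughly 79% of the consolidated space with roughly 19% of the files.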

2. Deleting the daoscat.nsf and/or daos.cfg to "fix" problems.
The daoscat.nsf and the daos.cfg are vital files for DAOS. daoscat.nsf keeps data about the location of NLO files with reference counts. daos.cfg maintains data about the DAOS configuration. If one or both of these files are deleted, DAOS will rebuild them, a process that will take hours to complete. While the files are being recreated, the location of individual NLO files and the reference counts will be in transition, which would result in intermittent access to attachments. Removal of a corrupted daoscat.nsf should only be done when all other options for getting the file back have failed.

3. Long deferred deletion intervals set on a hub server.
When mail routing or replication hub servers are DAOS enabled, NLO files will be created and marked for deletion. Until Prune is run as scheduled by the "Deferred Deletion Interval", the files will exist in the DAOS Repository.

4. Failing to monitor the DAOS catalog state.
DDM events are generated for DAOS state changes, including the state of the DAOS catalog. It is important to keep the DAOS catalog in a synchronized state to make sure prune can run and to keep access to the DAOS manager functioning correctly. Monitor the state of the DAOS catalog either via DDM events or directly via the server console with "tell daosmgr status catalog", and make sure the catalogState reports "SYNCHRONIZED".

5. No Backup/Restore procedures.
It is worth repeating the phrase "timing is everything" regarding backup and restore. It is important to time the backup of NSF and NLO files to be more frequent than the "Deferred Deletion Interval" because prune will remove the NLO files. Another important point is to not only have a well planned Backup/Restore procedure implemented, but to validate the restore procedure regularly to ensure that data can be restored before there is an emergency. For more information refer to the DAOS Backup and Restore.

Transaction logging and DAOS


System Database File | Enable DAOS? | Enable archive transaction logging? | Enable circular transaction logging? | Comments
NAMES.NSF    | No  | Yes | Yes |
LOG.NSF      | No  | No  | Yes | If this is not transaction logged, consider deleting it after a crash, as it could impact startup time. It may require a fixup, and this could take an extremely long time if log.nsf is very large.
ADMIN4.NSF   | No  | Yes | Yes |
MAIL.BOX     | Yes | Yes | Yes | If you want to use DAOS and not transaction log mailbox bodies, you can use the following in 8.5.1: NSF_DONT_LOG_MAILBOX_BODY=1 RM_NO_LOG_LARGE_OBJECTS=1 RM_NO_LOG_OBJECTS_IN_MAILBOX=1
DBDIRMAN.NSF | No  | No  | No  |
CLUBUSY.NSF  | No  | No  | Yes |
DDM.NSF      | No  | No  | Yes |
STATREP.NSF  | No  | No  | Yes |
STATMAIL.NSF | No  | No  | Yes |
CLDBDIR.NSF  | No  | No  | Yes |
WEBADMIN.NSF | No  | No  | Yes |
BUSTIME.NSF  | No  | No  | Yes |
EVENTS4.NSF  | No  | No  | Yes |
STATLOG.NSF  | No  | No  | Yes |
REPORTS.NSF  | No  | No  | Yes |
MTSTORE.NSF  | No  | No  | Yes |
ACTIVITY.NSF | No  | No  | Yes |
