dt Test Recommendations

Test Considerations

People often ask which dt command lines are best to use. The answer depends on what you are trying to test, but here are a few guidelines:

General Test Options

The following are options to consider for any type of device under test:

Choose a non-repeating data pattern or a prefix to create uniqueness:

pattern=iot - Encodes block number and seeds block with that LBA.
prefix="string" - Prefix each block with specified string.

Vary the I/O sizes, testing up to the maximum size supported:

min=value - Minimum record size.
max=value - Maximum record size.
incr=variable - Vary the I/O size between min/max values.

For sequential I/O, consider testing both forward and reverse directions:

iodir={forward|reverse} - Selects the I/O direction.

For random access devices, test both sequential and random I/O:

iotype={random|sequential} - Selects the I/O type.

To generate high I/O loads, consider using POSIX AIO or multiple processes:

aios=value - Queue this many POSIX Asynchronous I/O requests.
procs=value - Create multiple processes (more on this later).
slices=value - Create multiple slices (each process has its own region).

For extended test runs, select either multiple passes or specify a runtime:

passes=value - Execute multiple passes (a write pass and a read/verify pass).
runtime=time - Continue running until the specified runtime value is reached.

Note: Prefer AIO over multiple processes to reduce system resource consumption. AIO is very similar to multiple threads for generating heavy I/O load, except that the POSIX AIO subsystem is used to queue multiple reads and writes.
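
As a rough illustration, the general options above can be combined into a single command line. The sketch below only prints the command. The device path /dev/sdX and the values aios=8 and passes=3 are placeholders chosen for this example, not recommendations from dt itself.

```shell
#!/bin/sh
# Sketch: combine the general options above into one dt command line.
# DSF defaults to a placeholder device path; aios=8 and passes=3 are
# arbitrary example values.
DSF=${DSF:-/dev/sdX}

build_general_cmd() {   # $1 = device or file under test
    # pattern=iot             non-repeating data pattern
    # min=d max=256k incr=var vary the record size, as in the
    #                         recommended command lines in this document
    # aios=8                  queue eight POSIX AIO requests
    # passes=3                run three passes
    printf 'dt of=%s pattern=iot min=d max=256k incr=var aios=8 passes=3\n' "$1"
}

CMD=$(build_general_cmd "$DSF")
echo "$CMD"
# To run it for real (dt must be in PATH and the target is overwritten):
#   eval "$CMD"
```

Printing the command first makes it easy to review before pointing it at a real device.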

File System Testing

Here are a few options to consider for file system testing:

Select multiple processes to create multiple files.

procs=value - Each file has process ID (pid) appended for uniqueness.
Each process starts with a unique pattern, unless the pattern= option is used.
The parent dt monitors all its children (subprocesses).
oncerr={abort|continue} - Controls the behaviour on child error.

Select I/O sizes to test byte boundaries and cross block boundaries:

min=1 max=256k incr=var - Varies I/O between these min/max sizes.

To defeat the effect of the file system buffer cache, use:

flags=direct - Enable direct I/O (DIO) (bypasses the buffer cache).
flags={fsync|rsync|sync} - To force write/read/both I/O sync to disk.
oflags=dsync - Sync data to disk during write operations.
oflags=trunc - Truncate an existing file on each file open.

To force the buffer cache to invalidate its cached data for this file system:

Use: of=file disable=verify dispose=keep to write files.
Dismount (Unix umount) the file system then remount it.
Use: if=file to read the previously written file (the file must then be removed manually).
The above sequence forces reads to come from physical media.
Writing files larger than the buffer cache is an alternative.

Reducing the buffer cache size will help decrease its buffering effect.
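
The write, remount, and read sequence above might be scripted roughly as follows. This is a dry-run sketch: by default it only prints each command. MNT, DEV, the file name, and the bs= and limit= values are hypothetical placeholders, and the mount command may need additional arguments (or an fstab entry) on a given system.

```shell
#!/bin/sh
# Sketch of the write / remount / read sequence described above.
# All paths and sizes are placeholders for illustration.
MNT=${MNT:-/mnt/test}
DEV=${DEV:-/dev/sdX1}
FILE=$MNT/dt.data
DRYRUN=${DRYRUN:-1}          # 1 = only print the commands

run() {
    echo "+ $*"
    [ "$DRYRUN" = 1 ] || "$@"
}

remount_test() {
    # Write the file without the read/verify pass and keep it on disk.
    run dt of="$FILE" bs=256k limit=10240k disable=verify dispose=keep &&
    # Remounting discards cached data for this file system.
    run umount "$MNT" &&
    run mount "$DEV" "$MNT" &&
    # The read now has to come from physical media.
    run dt if="$FILE" bs=256k limit=10240k &&
    # The kept file must be removed manually.
    run rm -f "$FILE"
}

remount_test
```

Setting DRYRUN=0 executes the commands instead of printing them, which usually requires root privileges for the umount and mount steps.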

Direct Disk Testing

Direct disk access is commonly referred to as raw (character) device testing on Unix-based systems. Here are options to consider for raw testing:

Select multiple slices for heavier load and faster testing:

slices=value - Divide the disk capacity into value slices.
A separate process is created to operate on each region (slice).

Enable POSIX Asynchronous I/O (AIO) to generate higher I/O loads:

enable=aio - Enables AIO with a default of 8 requests.
aios=value - Enables AIO and uses the specified value.
POSIX AIO can be used in conjunction with multiple slices.

Note: Linux only provides a block device special file (DSF), so DIO or raw(8) is recommended.
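
A minimal sketch of a raw disk command line that combines slices with AIO, and adds flags=direct on Linux as suggested by the note above. The device path and the values slices=4 and aios=8 are placeholders. Combining flags=direct with aios= on a raw device is an assumption of this example, so check it against your dt version.

```shell
#!/bin/sh
# Sketch: raw disk command line with slices and AIO. On Linux, where only
# a block device is available, flags=direct is added to request direct I/O.
DSF=${DSF:-/dev/sdX}         # placeholder device path

raw_cmd() {   # $1 = device, $2 = operating system name (as from uname -s)
    extra=''
    if [ "$2" = Linux ]; then
        extra=' flags=direct'
    fi
    printf 'dt of=%s slices=4 aios=8 pattern=iot%s\n' "$1" "$extra"
}

raw_cmd "$DSF" "$(uname -s)"
```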

Shared (raw) Disk Testing

When performing shared concurrent disk testing from multiple hosts, consider:

Divide the disk capacity into multiple slices, then choose a slice for each host:

slices=value - Divide the disk capacity into multiple slices (regions).
slice=value - Tells dt which slice this host operates on.

Ensure each block has a unique prefix to track the writing host:

prefix="%d@%h" - Defines the device name and host name as the prefix.
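
One way to hand each host its own slice is to derive the slice number from the host's position in a shared host list. The sketch below prints the resulting command line. The host names and device path are hypothetical, and numbering slices from 1 is an assumption of this example.

```shell
#!/bin/sh
# Sketch: give each host in a list its own slice of a shared disk.
# Slice numbers are assumed to start at 1.
slice_cmd() {   # $1 = this host, $2 = device, remaining args = all hosts
    me=$1
    dev=$2
    shift 2
    total=$#                 # one slice per host
    n=0
    for h in "$@"; do
        n=$((n + 1))
        if [ "$h" = "$me" ]; then
            printf 'dt of=%s slices=%d slice=%d prefix="%%d@%%h"\n' \
                "$dev" "$total" "$n"
            return 0
        fi
    done
    echo "host $me is not in the host list" >&2
    return 1
}

slice_cmd hostB /dev/sdX hostA hostB hostC
```

Every host runs the same script with the same host list, so the slices never overlap as long as the list is identical everywhere.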

Encode the logical block address (LBA) in each block to aid in diagnosing corruptions:

pattern=iot - Encodes the LBA in little-endian format (32 bits).
enable=lbdata - Encodes the LBA in big-endian format (32 bits).
Note: 32-bit LBAs wrap at 2TB with a 512-byte physical block size.
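
The 2TB figure follows from 2^32 blocks times 512 bytes per block. The short sketch below works through that arithmetic. The 4096-byte case is added purely for comparison and is not taken from the dt documentation. It assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Sketch: capacity at which a 32-bit LBA wraps for a given block size.
# Assumes the shell performs 64-bit integer arithmetic.
lba_wrap_bytes() {   # $1 = block size in bytes
    echo $(( 4294967296 * $1 ))                          # 2^32 blocks
}

lba_wrap_tib() {     # $1 = block size in bytes
    echo $(( $(lba_wrap_bytes "$1") / 1099511627776 ))   # 2^40 bytes per TiB
}

echo "512-byte blocks:  $(lba_wrap_bytes 512) bytes ($(lba_wrap_tib 512) TiB)"
echo "4096-byte blocks: $(lba_wrap_bytes 4096) bytes ($(lba_wrap_tib 4096) TiB)"
```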

Performance Testing

Best performance will be realized with large block sizes:

bs=256k - Specifies a large fixed block size.

Queuing multiple I/O requests will provide better performance:

aios=value - Depending on system AIO limits, use 32-64 AIOs (or more).
Note: For file system testing, increasing the cache size may help.

Best performance will be realized with data verification disabled:

disable=compare - Disables write buffer filling and read compares.
disable=verify - Disables the read/verify pass (writes data only).
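
Since the best AIO depth depends on system limits, one approach is to run the same large-block command at several queue depths and compare the reported statistics. The sketch below only prints the candidate command lines. The device path and the list of depths are example values.

```shell
#!/bin/sh
# Sketch: generate performance command lines at several AIO queue depths.
DSF=${DSF:-/dev/sdX}         # placeholder device path

perf_cmds() {   # $1 = device
    for aios in 8 16 32 64; do
        # Large fixed block size, queued I/O, no buffer fill or compare.
        printf 'dt of=%s bs=256k aios=%d disable=compare\n' "$1" "$aios"
    done
}

perf_cmds "$DSF"
```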

Monitoring I/O Progress

A feature was added for detecting and reporting slow or no I/O progress. This feature is very useful for storage failover testing, since it lets you know how long it takes for I/O to resume. It can also be helpful for reporting hung I/O, assuming the required signal (SIGALRM) can be delivered to dt.

The following options control what's called the "no-progress I/O" feature:

alarm=time - Controls how often to check no-progress time.
noprogt=time - Reports a no-progress message when this time is exceeded.
noprogtt=time - Controls the no-progress trigger script execution.
Note: The alarm is delivered by a signal (non-interruptible I/O blocks signals).
The native Windows version uses a monitoring thread to avoid this issue.
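
A small sketch of how the alarm and no-progress options might be paired. It assumes both values are given in seconds, and it applies a rule of thumb (not stated by dt) that the check interval should not be longer than the no-progress time, since otherwise a stall would only be noticed at the next alarm.

```shell
#!/bin/sh
# Sketch: build the no-progress options, assuming times are in seconds.
DSF=${DSF:-/dev/sdX}         # placeholder device path

noprog_opts() {   # $1 = alarm interval, $2 = no-progress time
    if [ "$1" -gt "$2" ]; then
        echo "alarm ($1) should not exceed noprogt ($2)" >&2
        return 1
    fi
    printf 'alarm=%d noprogt=%d\n' "$1" "$2"
}

echo "dt of=$DSF bs=256k $(noprog_opts 3 15)"
```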

What operations are monitored?

open(), read(), write(), fsync(), close(), and AIO's.
Note: fsync() is used with file systems to flush written data.

Trouble-Shooting Aids

Several options can help with troubleshooting problems:

To determine when a block was written, blocks can be timestamped:

enable=timestamp - Writes a timestamp into the first 4 bytes of each block.
With IOT or lbdata, the timestamp replaces the normal 32-bit LBA.
When a data corruption occurs, the time of the write is reported.

For file system testing, keeping a corrupted file for analysis is helpful:

dispose=keep will prevent deleting the output file.
dispose=keeponerror will keep the file only on errors.

To report prior I/O history and data when a data corruption occurs:

history=value - Record I/O history in a circular buffer.
hdsize=value - Specify how much history data to record.
enable=htiming - Time each I/O request.
enable=hdump - Dump the history buffer at end of test.
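
When re-running a failing test, the history options can simply be appended to the original command line. The helper below is a hypothetical convenience for doing that. The values 10 and 64 are arbitrary examples, and the units of hdsize are not specified here, so treat them as placeholders.

```shell
#!/bin/sh
# Sketch: append I/O history options to an existing dt command line.
DSF=${DSF:-/dev/sdX}         # placeholder device path

with_history() {   # $1 = base command, $2 = history value, $3 = hdsize value
    printf '%s history=%d hdsize=%d enable=htiming\n' "$1" "$2" "$3"
}

with_history "dt of=$DSF bs=256k pattern=iot" 10 64
```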

Executing an external script during errors can be helpful:

trigger=cmd:script - Specifies a script to execute with arguments.
The trigger script can be used to trigger an analyzer or panic the host.
The exit status from the trigger script controls dt's next action.
Beware: The same script can be executed for no-progress monitoring.
Note: Other trigger actions are available but require Scu in your PATH.

Enabling program debug information can be helpful:

enable=debug - Enables debug of open()/reopen/close/EOF operations.
enable=edebug - Enables end-of-file (EOF) handling debugging.
enable=rdebug - Enables random I/O (seek) operation debugging.
enable=Debug - Enable all of the above plus read()/write() requests.

For long running tests, you may wish to emit a keepalive message:

alarm=time - Specifies how often to emit the keepalive message.
keepalive=string - Specifies the keepalive string to format/emit.
pkeepalive=string - Specifies the per pass keepalive message.
tkeepalive=string - Specifies the total stats keepalive message.
Note: pkeepalive/tkeepalive override the standard statistics.

Recommended Tests

Generally, there's no single best command line for testing. Instead, using multiple command lines is necessary to create different I/O footprints and to utilize different APIs (sync versus async, for example).

As mentioned above, variety is better than running the same fixed tests over and over. This includes varying the I/O sizes, the direction, and the I/O type, as well as using different data patterns. When doing file system testing with multiple passes (passes= or runtime= options), you may wish to truncate files between passes via oflags=trunc to avoid overwriting blocks with the same data (although this is a non-issue when using the IOT pattern, since the pass count is factored into the block seeding).

For direct (raw) disk I/O, here are several command lines to select from at random:

dt of=${DSF} aios=8 min=d max=256k incr=var pattern=iot prefix="%d@%h" iodir={forward|reverse} iotype={random|sequential}
dt of=${DSF} aios=8 min=d max=256k incr=d enable=lbdata prefix="%d@%h" iodir={forward|reverse} iotype={random|sequential}
dt of=${DSF} aios=8 bs=256k prefix="%d@%h" iodir={forward|reverse}
For sequential I/O, randomly choose: iodir={reverse|forward}
The AIO value and max transfer size should be user-settable.
The max transfer size varies for each host operating system.
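
The random selection described above could be sketched as follows: expand the three raw disk command lines into every concrete variant, then pick one at random. This is one possible interpretation, in which iodir is only varied for sequential I/O. AIOS, MAX, and the device path are user-settable placeholders, and pick_one is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: expand the raw disk command lines above and pick one at random.
DSF=${DSF:-/dev/sdX}         # placeholder device path
AIOS=${AIOS:-8}              # user-settable AIO value
MAX=${MAX:-256k}             # user-settable max transfer size

raw_cmds() {   # $1 = device; prints one command line per variant
    base="dt of=$1 aios=$AIOS prefix=\"%d@%h\""
    for size in "min=d max=$MAX incr=var pattern=iot" \
                "min=d max=$MAX incr=d enable=lbdata"; do
        printf '%s\n' "$base $size iotype=random"
        for dir in forward reverse; do
            printf '%s\n' "$base $size iotype=sequential iodir=$dir"
        done
    done
    for dir in forward reverse; do
        printf '%s\n' "$base bs=$MAX iodir=$dir"
    done
}

pick_one() {   # $1 = random seed; reads lines on stdin, prints one of them
    awk -v seed="$1" 'BEGIN { srand(seed) }
        { line[NR] = $0 }
        END { if (NR) print line[int(rand() * NR) + 1] }'
}

raw_cmds "$DSF" | pick_one "$(date +%s)"
```

Seeding from the clock gives a different choice on each run. Logging the seed alongside the chosen command makes a failing run easier to repeat.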

For file system testing, here are several command lines to select from at random:

dt of=${MountPoint} min=1 max=256k incr=var limit={FileSize} iodir={forward|reverse} iotype={random|sequential}

dt of=${MountPoint} min=d max=256k incr=var limit={FileSize} pattern=iot iodir={forward|reverse} iotype={random|sequential}

dt of=${MountPoint} min=5 max=256k incr=var limit={FileSize} enable=lbdata iodir={forward|reverse} iotype={random|sequential}

dt of=${MountPoint} bs=256k limit={FileSize} prefix="%d@%h"

For sequential I/O, randomly choose: iodir={reverse|forward}

For clustered file systems, adding prefix="%d@%h" is useful.

Note: min=5 is necessary to allow space for encoding the lba.
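
A comparable sketch for the file system command lines: the four variants above are numbered 1 to 4 and one is chosen at random. The file path and limit value are placeholders, the iodir and iotype choices are left out for brevity, and fs_cmd and random_template are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: choose one of the four file system command lines at random.
FILE=${FILE:-/mnt/test/dt.data}   # placeholder file under the mount point
LIMIT=${LIMIT:-10240k}            # placeholder file size

fs_cmd() {   # $1 = template number (1-4), $2 = file, $3 = limit
    case $1 in
        1) opts='min=1 max=256k incr=var' ;;
        2) opts='min=d max=256k incr=var pattern=iot' ;;
        3) opts='min=5 max=256k incr=var enable=lbdata' ;;
        4) opts='bs=256k prefix="%d@%h"' ;;
        *) echo "template must be 1-4" >&2; return 1 ;;
    esac
    printf 'dt of=%s %s limit=%s\n' "$2" "$opts" "$3"
}

random_template() {   # $1 = random seed; prints a number from 1 to 4
    awk -v seed="$1" 'BEGIN { srand(seed); print int(rand() * 4) + 1 }'
}

fs_cmd "$(random_template "$(date +%s)")" "$FILE" "$LIMIT"
```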

Send mail to admin of this page: Robin.Miller@netapp.com

Last Modified: September 26th, 2008