metainit

NAME

metainit - configure metadevices

SYNOPSIS

/sbin/metainit -h

/sbin/metainit [ generic options ] concat/stripe numstripes
     width component... [ -i interlace ]
     [ width component... [ -i interlace ] ] [ -h hot_spare_pool ]

/sbin/metainit [ generic options ] mirror -m submirror
     [ read_options ] [ write_options ] [ pass_num ]

/sbin/metainit [ generic options ] RAID -r component...
     [ -i interlace ] [ -h hot_spare_pool ] [ -k ]
     [ -o original_column_count ]

/sbin/metainit [ generic options ] trans -t master [ log ]

/sbin/metainit [ generic options ] hot_spare_pool [ hotspare... ]

/sbin/metainit [ generic options ] metadevice-name

/sbin/metainit [ generic options ] -a

/sbin/metainit [ generic options ] softpart -p [ -e ] component size

/sbin/metainit -r

DESCRIPTION

The metainit command configures metadevices and hot spares
according to the information specified on the command line.
Alternatively, metainit can use configuration entries that you
specify in the /etc/lvm/md.tab file (see md.tab(4)).

All metadevices must be set up by the metainit command before
they can be used.

If you edit the /etc/lvm/md.tab file to configure metadevices,
specify one complete configuration entry per line. Then run the
metainit command with either the -a option, to activate all
metadevices you entered in the /etc/lvm/md.tab file, or with the
metadevice name corresponding to a specific configuration entry.

Note: DiskSuite 4.2.1 never updates the /etc/lvm/md.tab file.
Complete configuration information is stored in the metadevice
state database, not md.tab. Information appears in md.tab only
when you edit the file by hand.

GENERIC OPTIONS

Root privileges are required for all of the following options
except -h.

-f
     Forces the metainit command to continue even if one of the
     slices contains a mounted file system or is being used as
     swap. This option is useful when configuring mirrors on root
     (/), swap, and /usr.

-h
     Displays a usage message.

-n
     Checks the syntax of your command line or md.tab entry
     without actually setting up the metadevice. If used with -a,
     all devices are checked but not initialized.

-r
     Used only in a shell script at boot time. Sets up all
     metadevices that were configured before the system crashed
     or was shut down. The information about previously
     configured metadevices is stored in the metadevice state
     database (see metadb(1M)).

-s setname
     Specifies the name of the diskset on which metainit will
     work. Without the -s option, the metainit command operates
     on your local metadevices and/or hot spares.

CONCAT/STRIPE OPTIONS

concat/stripe
     Specifies the metadevice name of the concatenation, stripe,
     or concatenation of stripes being defined.

numstripes
     Specifies the number of individual stripes in the
     metadevice. For a simple stripe, numstripes is always 1. For
     a concatenation, numstripes is equal to the number of
     slices. For a concatenation of stripes, numstripes varies
     according to the number of stripes.

width
     Specifies the number of slices that make up a stripe. When
     width is greater than 1, the slices are striped.

component
     The logical name for the physical slice (partition) on a
     disk drive, such as /dev/dsk/c0t0d0s2. For RAID5
     metadevices, a minimum of three slices is necessary to
     enable striping of the parity information across slices.

-i interlace
     Specifies the interlace size. This value tells DiskSuite
     4.2.1 how much data to place on a slice of a striped or
     RAID5 metadevice before moving on to the next slice.
     interlace is a number followed by `k' for kilobytes, `m' for
     megabytes, or `b' for blocks. The characters can be either
     uppercase or lowercase. The interlace specified cannot be
     less than 16 blocks or greater than 100 megabytes. If
     interlace is not specified, it defaults to 16 kilobytes.

-h hot_spare_pool
     Specifies the hot_spare_pool to be associated with the
     metadevice. If you use the command line, the hot spare pool
     must already have been created by the metainit command
     before it can be associated with a metadevice. The
     hot_spare_pool must be of the form hspnnn, where nnn is a
     number in the range 000-999. Use -h hspnnn when the
     concat/stripe being created is to be used as a submirror.
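
The interlace rules above (a number followed by k, m, or b in
either case, at least 16 blocks, at most 100 megabytes, default 16
kilobytes) can be checked before metainit is invoked. The
following Python sketch is illustrative only and is not part of
DiskSuite. The name parse_interlace is hypothetical, and the
sketch assumes 512-byte blocks and 1024-based kilobytes and
megabytes, neither of which this page states.

```python
import re

# Assumptions (not stated in this man page): 512-byte blocks and
# binary (1024-based) kilobytes and megabytes.
BLOCK_SIZE = 512
KBYTE = 1024
MBYTE = 1024 * 1024

MIN_INTERLACE = 16 * BLOCK_SIZE   # "cannot be less than 16 blocks"
MAX_INTERLACE = 100 * MBYTE       # "or greater than 100 megabytes"
DEFAULT_INTERLACE = 16 * KBYTE    # used when -i is omitted

_UNITS = {"k": KBYTE, "m": MBYTE, "b": BLOCK_SIZE}


def parse_interlace(text=None):
    """Return the interlace in bytes; raise ValueError if it is invalid."""
    if text is None:
        return DEFAULT_INTERLACE
    match = re.fullmatch(r"([0-9]+)([kmb])", text.strip().lower())
    if match is None:
        raise ValueError(f"bad interlace {text!r}: expected a number "
                         "followed by k, m, or b")
    size = int(match.group(1)) * _UNITS[match.group(2)]
    if not MIN_INTERLACE <= size <= MAX_INTERLACE:
        raise ValueError(f"interlace {text!r} is outside the range of "
                         "16 blocks to 100 megabytes")
    return size
```

Under these assumptions, parse_interlace("32k") yields the value
used in the stripe example on this page, and a value such as "8b"
is rejected because it is below 16 blocks.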

MIRROR OPTIONS

mirror -m submirror
     Specifies the metadevice name of the mirror. The -m
     indicates that the configuration is a mirror. submirror is a
     metadevice (stripe or concatenation) that makes up the
     initial one-way mirror.

     DiskSuite 4.2.1 supports a maximum of four-way mirroring.
     When defining mirrors, first create the mirror with the
     metainit command as a one-way mirror. Then attach subsequent
     submirrors using the metattach command. This method ensures
     that DiskSuite 4.2.1 properly syncs the mirrors. (The second
     and any subsequent submirrors are first created using the
     metainit command.)

read_options
     The following read options for mirrors are available:

     -g
          Enables the geometric read option, which results in
          faster performance on sequential reads.

     -r
          Directs all reads to the first submirror. Use this
          option only when the devices comprising the first
          submirror are substantially faster than those of the
          second submirror. This flag cannot be used with the -g
          flag.

     If neither the -g nor the -r flag is specified, reads are
     made in round-robin order from all submirrors in the mirror.
     This enables load balancing across the submirrors.

write_options
     The following write options for mirrors are available:

     -S
          Performs serial writes to mirrors. The first submirror
          write completes before the second is started. This may
          be useful if hardware is susceptible to partial sector
          failures.

     If -S is not specified, writes are replicated and dispatched
     to all mirrors simultaneously.

pass_num
     A number in the range 0-9 at the end of an entry defining a
     mirror that determines the order in which that mirror is
     resynced during a reboot. The default is 1. Smaller pass
     numbers are resynced first. Mirrors with equal pass numbers
     are resynced concurrently. If 0 is used, the resync is
     skipped. Use 0 only for mirrors mounted as read-only, or as
     swap.
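
The pass_num rules can be pictured as a grouping step: mirrors
with pass number 0 are left out, the rest are grouped by pass
number, and the groups are processed in ascending order. The
Python sketch below only models that ordering as described on
this page; it does not reflect how DiskSuite implements resync,
and the name resync_batches is hypothetical.

```python
from collections import defaultdict

DEFAULT_PASS_NUM = 1  # "The default is 1."


def resync_batches(mirrors):
    """Group mirror names into resync batches ordered by pass number.

    `mirrors` maps a mirror name to its pass_num, or to None when the
    entry did not give one. Mirrors in the same batch would be resynced
    concurrently; mirrors with pass number 0 are skipped.
    """
    batches = defaultdict(list)
    for name, pass_num in mirrors.items():
        if pass_num is None:
            pass_num = DEFAULT_PASS_NUM
        if not 0 <= pass_num <= 9:
            raise ValueError(f"{name}: pass_num {pass_num} is outside 0-9")
        if pass_num == 0:
            continue  # resync is skipped for this mirror
        batches[pass_num].append(name)
    # Smaller pass numbers come first; names are sorted only to make
    # the output deterministic.
    return [sorted(batches[p]) for p in sorted(batches)]
```

For instance, a mirror with no pass number lands in the same
batch as a mirror explicitly given pass number 1.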

RAID5 OPTIONS

RAID -r
     Specifies the name of the RAID5 metadevice. The -r specifies
     that the configuration is RAID5.

-k
     For RAID5 metadevices, informs the driver that it is not to
     initialize (zero the disk blocks) due to existing data. Use
     this option only to recreate a previously created RAID5
     device.

-o original_column_count
     For RAID5 metadevices, used with the -k option to define the
     number of original slices in the event the originally
     defined metadevice was grown. This is necessary because the
     parity segments are not striped across concatenated devices.

Warning for -k and -o: Use extreme caution when using the -k and
-o options. When used, these options set the disk blocks to the
OK state. If any errors exist on disk blocks within the
metadevice, DiskSuite 4.2.1 might begin fabricating data. Instead
of using these options, you might want to initialize the device
and restore data from tape.

TRANS OPTIONS

trans -t master [ log ]
     trans specifies the name of the trans metadevice, which
     consists of master and log devices, or just a master device.
     The -t specifies that the configuration is a trans
     metadevice.

     If log is not specified when you create the trans
     metadevice, no logging can take place until a logging device
     is provided by using the metattach command.

     master and log can be simple, mirror, or RAID5 metadevices.
     They cannot be trans metadevices. master should be a UFS
     file system.

     You can configure an existing file system for logging by
     creating a trans metadevice as follows: make the existing
     file system into the master trans device, then create the
     log device on a separate, unused slice.

     The minimum log size is 1 Mbyte of disk space. Under heavy
     sustained loads, small logs detract from performance because
     old data must be copied from the log to the file system
     before new data can be logged. The maximum log size is 1
     Gbyte. Large logs might increase performance. However, logs
     larger than 64 Mbytes may provide negligible additional
     performance benefit.

SOFT PARTITION OPTIONS

softpart -p [ -e ] component size
     The softpart argument specifies the name of the soft
     partition. The -p specifies that the configuration is a soft
     partition.

     The -e specifies that the entire disk specified by component
     as c*t*d* should be repartitioned and reserved for soft
     partitions. The specified component is repartitioned such
     that slice 7 reserves space for system (state database
     replica) usage and slice 0 contains all remaining space on
     the disk. Slice 7 is a minimum of 2 Mbytes, but could be
     larger, depending on the disk geometry. The newly created
     soft partition is placed on slice 0 of the device.

     The component argument specifies the disk (c*t*d*), slice
     (c*t*d*s*), or metadevice (d*) from which to create the soft
     partition.

     The size argument determines the space to use for the soft
     partition and can be specified in K or k for kilobytes, M or
     m for megabytes, G or g for gigabytes, T or t for terabytes
     (one terabyte is the maximum size), and B or b for blocks
     (sectors).

HOT SPARE POOL OPTIONS

hot_spare_pool [ hotspare... ]
     When used as arguments to the metainit command,
     hot_spare_pool defines the name for a hot spare pool, and
     hotspare... is the logical name for the physical slice(s)
     made available in that pool. hot_spare_pool is a name of the
     form hspnnn, where nnn is a number in the range 000-999.
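
The hspnnn naming rule can be checked with a short pattern match.
The Python sketch below is illustrative only; the function name
is hypothetical, and it assumes that exactly three digits and a
lowercase "hsp" prefix are required, which is how the form hspnnn
with nnn in 000-999 is read here.

```python
import re

# Assumption: "hspnnn" means a lowercase "hsp" followed by exactly
# three decimal digits (000-999).
_HSP_NAME = re.compile(r"hsp[0-9]{3}")


def is_hot_spare_pool_name(name):
    """Return True if `name` has the form hspnnn described above."""
    return _HSP_NAME.fullmatch(name) is not None
```

With this reading, hsp001 is accepted, while hsp1 and hsp1000 are
not.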

md.tab FILE OPTIONS

metadevice-name
     When the metainit command is run with a metadevice-name as
     its only argument, it searches the /etc/lvm/md.tab file to
     find that name and its corresponding entry. The order in
     which entries appear in the md.tab file is unimportant. For
     example, consider the following md.tab entry:

          d0 2 1 c1t0d0s0 1 c2t1d0s0

     When you run the command metainit d0, it configures
     metadevice d0 based on the configuration information found
     in the md.tab file.

-a
     Activates all metadevices defined in the md.tab file.
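
To make the layout of a concat/stripe entry such as the d0 line
concrete, the following Python sketch splits one entry into its
name, stripes, and optional hot spare pool, following the
concat/stripe syntax in the SYNOPSIS. It is a rough illustration,
not the parser metainit uses: parse_stripe_entry is a
hypothetical name, only single-line concat/stripe entries are
handled, and any other md.tab syntax (see md.tab(4)) is out of
scope.

```python
def parse_stripe_entry(line):
    """Split a one-line concat/stripe entry into its parts.

    Expected shape, per the SYNOPSIS:
        name numstripes width component... [-i interlace] ... [-h hsp]
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected: name numstripes width component...")
    name, numstripes = tokens[0], int(tokens[1])
    pos = 2
    stripes = []
    for _ in range(numstripes):
        if pos >= len(tokens):
            raise ValueError("fewer stripes than numstripes")
        width = int(tokens[pos])
        components = tokens[pos + 1:pos + 1 + width]
        if len(components) != width:
            raise ValueError("fewer components than width")
        pos += 1 + width
        interlace = None
        if tokens[pos:pos + 1] == ["-i"]:
            if pos + 1 >= len(tokens):
                raise ValueError("-i requires a value")
            interlace = tokens[pos + 1]
            pos += 2
        stripes.append({"components": components, "interlace": interlace})
    hot_spare_pool = None
    if tokens[pos:pos + 1] == ["-h"]:
        if pos + 1 >= len(tokens):
            raise ValueError("-h requires a value")
        hot_spare_pool = tokens[pos + 1]
        pos += 2
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return {"name": name, "stripes": stripes,
            "hot_spare_pool": hot_spare_pool}
```

Applied to the d0 entry above, the sketch reports two stripes of
one slice each, which is the definition of a two-slice
concatenation.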

EXAMPLES

Example 1: Concatenation
All drives in the following examples have the same size of 525
Mbytes.

This example shows a metadevice, /dev/md/dsk/d7, consisting of a
concatenation of four slices.

     # metainit d7 4 1 c0t1d0s0 1 c0t2d0s0 1 c0t3d0s0 1 /dev/dsk/c0t4d0s0

The number 4 indicates there are four individual stripes in the
concatenation. Each stripe is made of one slice, hence the number
1 appears in front of each slice.

Note: The first disk sector in all of the above devices contains
a disk label. To preserve the labels on devices
/dev/dsk/c0t2d0s0, /dev/dsk/c0t3d0s0, and /dev/dsk/c0t4d0s0, the
metadisk driver must skip at least the first sector of those
disks when mapping accesses across the concatenation boundaries.
Because skipping only the first sector would create an irregular
disk geometry, the entire first cylinder of these disks is
skipped. This allows higher level file system software to
optimize block allocations correctly.
Example 2: Stripe
This example shows a metadevice, /dev/md/dsk/d15, consisting of
two slices.

     # metainit d15 1 2 c0t1d0s2 c0t2d0s2 -i 32k

The number 1 indicates that one stripe is being created. Because
the stripe is made of two slices, the number 2 follows next. The
optional -i followed by 32k specifies that the interlace size is
32 Kbytes. If the interlace size were not specified, the stripe
would use the default value of 16 Kbytes.
Example 3: Concatenation of Stripes
This example shows a metadevice, /dev/md/dsk/d75, consisting of a
concatenation of two stripes of three disks.

     # metainit d75 2 3 c0t1d0s2 c0t2d0s2 c0t3d0s2 -i 16k \
          3 c1t1d0s2 c1t2d0s2 c1t3d0s2 -i 32k

On the first line, the -i followed by 16k specifies that the
stripe interlace size is 16 Kbytes. The second set specifies that
its stripe interlace size is 32 Kbytes. If the second set did not
specify 32 Kbytes, it would use the default interlace value of 16
Kbytes. The blocks of each set of three disks are interlaced
across those three disks.
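
The numstripes and width counts in Examples 1 through 3 follow
mechanically from the list of stripes, so a script can derive
them instead of having them typed by hand. The Python sketch
below only assembles the argument list; it does not run metainit.
The function name build_stripe_args is hypothetical.

```python
def build_stripe_args(name, stripes, hot_spare_pool=None):
    """Build a metainit argument list for a concat/stripe metadevice.

    `stripes` is a list of (components, interlace) pairs, where
    `components` is a list of slice names and `interlace` is a string
    such as "32k" or None to use the default.
    """
    args = ["metainit", name, str(len(stripes))]   # numstripes
    for components, interlace in stripes:
        args.append(str(len(components)))          # width
        args.extend(components)
        if interlace is not None:
            args.extend(["-i", interlace])
    if hot_spare_pool is not None:
        args.extend(["-h", hot_spare_pool])
    return args


# Rebuilds the command line of the concatenation-of-stripes example.
d75 = build_stripe_args("d75", [
    (["c0t1d0s2", "c0t2d0s2", "c0t3d0s2"], "16k"),
    (["c1t1d0s2", "c1t2d0s2", "c1t3d0s2"], "32k"),
])
```

Returning a list rather than a single string keeps each slice
name as a separate argument, which avoids quoting problems if the
list is later handed to a process-spawning call.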
Example 4: Mirroring
This example shows a two-way mirror, /dev/md/dsk/d50, consisting
of two submirrors. This mirror does not contain any existing
data.

     # metainit d51 1 1 c0t1d0s2
     # metainit d52 1 1 c0t2d0s2
     # metainit d50 -m d51
     # metattach d50 d52

In this example, two submirrors, d51 and d52, are created with
the metainit command. These two submirrors are simple
concatenations. Next, a one-way mirror, d50, is created using the
-m option with d51. The second submirror is attached later using
the metattach command. When creating a mirror, any combination of
stripes and concatenations can be used. The default read and
write options in this example are a round-robin read algorithm
and parallel writes to all submirrors.
Example 5: Logging (trans)
This example shows a trans metadevice, /dev/md/dsk/d1, with
mirrors for the master and logging devices. This trans metadevice
does not contain any existing data.

     # metainit d11 1 1 c0t1d0s2
     # metainit d12 1 1 c0t2d0s2
     # metainit d21 1 1 c1t1d0s3
     # metainit d22 1 1 c1t2d0s3
     # metainit d10 -m d11
     # metattach d10 d12
     # metainit d20 -m d21
     # metattach d20 d22
     # metainit d1 -t d10 d20

This example begins by defining four concatenations, d11, d12,
d21, and d22. Next, mirror d10 is defined, followed by mirror
d20. The mirrors are initially defined as one-way mirrors, then
the second submirrors are attached with the metattach command.
Finally, the trans metadevice d1 is defined using the -t option,
with d10 as the master device and d20 as the logging device.
Example 6: RAID5
This example shows a RAID5 device, d80, consisting of three
slices:

     # metainit d80 -r c1t0d0s2 c1t1d0s2 c1t3d0s2 -i 20k

In this example, a RAID5 metadevice is defined using the -r
option with an interlace size of 20 Kbytes. The data and parity
segments are striped across the slices c1t0d0s2, c1t1d0s2, and
c1t3d0s2.
Example 7: Soft Partition
The following example shows a soft partition device, d1, built on
metadevice d100 and 100 Mbytes (indicated by 100M) in size:

     # metainit d1 -p d100 100M

The preceding command creates a 100-Mbyte soft partition on the
d100 metadevice. This metadevice could be a RAID5, stripe,
concatenation, or mirror.
Example 8: Soft Partition on Full Disk
The following example shows a soft partition device, d1, built on
disk c3t4d0:

     # metainit d1 -p -e c3t4d0 9Gb

In this example, the disk is repartitioned and a soft partition
is defined to occupy all 9 Gbytes of slice c3t4d0s0.
Example 9: Hot Spare
This example shows a two-way mirror, /dev/md/dsk/d40, and a hot
spare pool with three hot spare components. The mirror does not
contain any existing data.

     # metainit hsp001 c2t2d0s2 c3t2d0s2 c1t2d0s2
     # metainit d41 1 1 c1t0d0s2 -h hsp001
     # metainit d42 1 1 c3t0d0s2 -h hsp001
     # metainit d40 -m d41
     # metattach d40 d42

In the above example, a hot spare pool, hsp001, is created with
three slices used as hot spares. Next, two submirrors are
created, d41 and d42. These are simple concatenations. The
metainit command uses the -h option to associate the hot spare
pool hsp001 with each submirror. A one-way mirror is then defined
using the -m option. The second submirror is attached using the
metattach command.

FILES

/etc/lvm/md.tab
     Contains a list of metadevice and hot spare configurations
     for batch-like creation.

WARNINGS

Multi-Way Mirror

Do not use the metainit command to create a multi-way mirror.
Rather, create a one-way mirror with metainit, then attach
additional submirrors with metattach. When the metattach command
is not used, no resync operations occur and data could become
corrupted.

If you use metainit to create a mirror with multiple submirrors,
the following message is displayed:

     WARNING: This form of metainit is not recommended.
     The submirrors may not have the same data.
     Please see ERRORS in metainit(1M) for additional information.

Write-On-Write Problem

When mirroring data in Solstice DiskSuite 4.2.1, transfers from
memory to the disks do not all occur at exactly the same time for
all sides of the mirror. If the contents of buffers are changed
while the data is in flight to the disk (called write-on-write),
then different data can end up being stored on each side of a
mirror.

This problem can be addressed by making a private copy of the
data for mirror writes; however, making this copy is expensive.
Another approach is to detect when memory has been modified
across a write by looking at the dirty bit associated with the
memory page. DiskSuite 4.2.1 uses this dirty-bit technique when
it can. Unfortunately, this technique does not work for raw I/O
or direct I/O.

By default, DiskSuite 4.2.1 is tuned for performance, with the
liability that mirrored data might be out of sync if an
application does a write-on-write to buffers associated with raw
I/O or direct I/O. Without mirroring, you are not guaranteed what
data actually ends up on media, but multiple reads return the
same data. With mirroring, multiple reads may return different
data. The following line can be added to /etc/system to cause a
stable copy of the buffers to be used for all raw I/O and direct
I/O write operations:

     set md_mirror:md_mirror_wow_flg=0x20

Setting this flag degrades performance.

EXIT STATUS

The following exit values are returned:

0
     Successful completion.

>0
     An error occurred.

ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

 ___________________________________________________________
| ATTRIBUTE TYPE              | ATTRIBUTE VALUE             |
|_____________________________|_____________________________|
| Availability                | SUNWmdr                     |
|_____________________________|_____________________________|

SEE ALSO

metaclear(1M), metadb(1M), metadetach(1M), metahs(1M),
metaoffline(1M), metaonline(1M), metaparam(1M), metarecover(1M),
metareplace(1M), metaroot(1M), metaset(1M), metastat(1M),
metasync(1M), metattach(1M), md.cf(4), md.tab(4), mddb.cf(4),
attributes(5)

Solstice DiskSuite 4.2.1 Reference Manual

LIMITATIONS

Recursive mirroring is not allowed; that is, a mirror cannot
appear in the definition of another mirror.

Recursive logging is not allowed; that is, a trans metadevice
cannot appear in the definition of another metadevice.

Stripes, concatenations, and RAID5 metadevices must consist of
slices only.

Mirroring of RAID5 metadevices is not allowed.

Soft partitions can be built on raw devices, or on stripes,
RAID5, or mirrors.

RAID5 or stripe metadevices can be built directly on soft
partitions.

SunOS 5.8                Last change: 11 May 2001