EMC Celerra

How to Collect Celerra Support Materials

[nasadmin@seadcnsx1cs0 nar]$ /nas/tools/collect_support_materials
collect_support_materials[29843]: The collection script revision 2.8.4 has started.

Collecting /nas/log/*, /nas/log/webui/*, /nas/ConnectHome/*
and /nas/jserver/logs
Collecting output from server_log
Collecting output from internal commands
Collecting event log configuration files
Collecting files from .etc dir of each DM
Collecting Mirrorview DR logs
Collecting /var logs
Collecting upgrade logs
Collecting /etc files
Collecting /http/logs and /tomcat/logs
Collecting Celerra Manager tasks
Collecting cron files
Collecting Control Station process information and versions
Collecting /nas/jserver/debug_of_core* files
Now running material collection script for longer running commands.
Collecting complete nas dir listing
Collecting output from nas commands
Collecting RDF information
Collecting DHSM information
Collecting output from other CS commands
Collecting other files from /nas, /nas/site, /nas/sys,
/nas/rdf, and /nas/dos
Material Collection File:
/nas/var/log/support_materials_APM00070802955.091216_0908.zip has been generated.

**********************************************************************
Please include file /nas/var/log/support_materials_APM00070802955.091216_0908.zip
with materials submitted to EMC for problem investigation.
**********************************************************************

collect_support_materials[29843]: The collection script has finished successfully.
[nasadmin@seadcnsx1cs0 nar]$
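
Once the script completes, the generated zip can be copied off the Control Station so it can be attached to the service request. A minimal sketch; the destination host and path below are placeholders:

scp /nas/var/log/support_materials_APM00070802955.091216_0908.zip user@jumphost:/tmp/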

How to Enable DHSM on a Celerra File System for Archiving Solutions

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -modify HTRProd -state enabled
HTRProd:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = passthrough
read policy override = none
log file             = on
max log size         = 10MB

Done

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -c data -i 0
data:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = offline
read policy override = none
log file             = on
max log size         = 10MB
cid                 = 0
type                 = HTTP
secondary            = http://ccd.ad.wellcare.com/fmroot
state                = enabled
read policy override = none
write policy         = full
user                 = dhsm_user
options              = httpPort=8000 cgi=n

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -c HTRProd -create -type http -secondary 'http://ccd.ad.wellcare.com/fmroot' -httpPort 8000 -cgi n -user dhsm_user -password nasserviceuser
HTRProd:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = passthrough
read policy override = none
log file             = on
max log size         = 10MB
cid                 = 0
type                 = HTTP
secondary            = http://ccd.ad.wellcare.com/fmroot
state                = enabled
read policy override = none
write policy         = full
user                 = dhsm_user
options              = httpPort=8000 cgi=n

Done

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -l
id      name
38      apps
39      data
40      dev
41      projects
42      reports
43      users
44      wc2sys
45      wc2vol1
46      wc4vol3
47      winlog
48      dept
3955    HTRProd
2868    fmatest
[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -c HTRProd -create -type http -secondary 'http://ccd.ad.wellcare.com/fmroot' -httpPort 8000 -cgi n -user dhsm_user -password nasserviceuser
[rgovind1@celera1-cs0 root_vdm_2]$ nas_fs -l | grep -i itpos
[rgovind1@celera1-cs0 root_vdm_2]$ nas_fs -l | grep -i itops
4699      y    1   0     2276      ITOPS               v2
[rgovind1@celera1-cs0 root_vdm_2]$ nas_fs -l | grep -i moveit
588       y    1   0     312       MoveIt              v2
[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -modify itops -state enabled
Error 3105: invalid filesystem specified
[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -modify ITOPS -state enabled
ITOPS:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = passthrough
read policy override = none
log file             = on
max log size         = 10MB

Done

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -modify MoveIt -state enabled
MoveIt:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = passthrough
read policy override = none
log file             = on
max log size         = 10MB

Done

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -c MoveIt -create -type http -secondary 'http://ccd.ad.wellcare.com/fmroot' -httpPort 8000 -cgi n -user dhsm_user -password nasserviceuser
MoveIt:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = passthrough
read policy override = none
log file             = on
max log size         = 10MB
cid                 = 0
type                 = HTTP
secondary            = http://ccd.ad.govind.com/fmroot
state                = enabled
read policy override = none
write policy         = full
user                 = dhsm_user
options              = httpPort=8000 cgi=n

Done

[rgovind1@celera1-cs0 root_vdm_2]$ fs_dhsm -c ITOPS -create -type http -secondary 'http://ccd.ad.wellcare.com/fmroot' -httpPort 8000 -cgi n -user dhsm_user -password nasserviceuser
ITOPS:
state                = enabled
offline attr         = on
popup timeout        = 0
backup               = passthrough
read policy override = none
log file             = on
max log size         = 10MB
cid                 = 0
type                 = HTTP
secondary            = http://ccd.ad.govind.com/fmroot
state                = enabled
read policy override = none
write policy         = full
user                 = dhsm_user
options              = httpPort=8000 cgi=n

Done
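
Summarizing the transcript above, the general pattern for enabling DHSM on a file system and pointing it at an HTTP archive target looks like the sketch below. The file system name, secondary URL, and credentials are placeholders and must be replaced with real values; the flags are the same ones used in the session above.

# Enable DHSM on the file system
fs_dhsm -modify <fs_name> -state enabled

# Create the HTTP connection to the archive target
fs_dhsm -c <fs_name> -create -type http -secondary 'http://<archive_host>/fmroot' -httpPort 8000 -cgi n -user dhsm_user -password <password>

# Verify: list DHSM-enabled file systems, then show connection 0
fs_dhsm -l
fs_dhsm -c <fs_name> -i 0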

Most performance concerns can be summarized by 4 questions:

  1. What am I getting?
  2. Is that what I should be getting?
  3. If not, why not?
  4. What, if anything, can I do about it?

Characterize workload in terms of

  • IOPS
  • Size (KB/IO)
  • Direction (read/write)
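
A quick way to get the size figure is to divide throughput by IOPS over the same server_stats sample. As a hypothetical worked example: 40,000 KiB/s of NFS traffic at 5,000 ops/s works out to roughly 8 KiB per I/O.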

Performance triage domains

  • Host
  • IP network
  • Data mover
  • Fibre channel
  • Storage processors
  • More fibre channel
  • Disk drives

Celerra Volume Stack

Filesystem

Meta volume

Slice volume

Stripe volume

Basic volume (dvols)

Identify the protocol(s) in use

server_stats server_5 -summary cifs,nfs -interval 10 -count 6

server_stats server_5 -summary nfs -interval 10 -count 6

server_stats server_5 -table nfs -interval 10 -count 6

  • Operations will break down between v3Write, v3Create, etc.

server_stats server_5 -table fsvol -interval 10 -count 6

  • Correlates the filesystem with the meta-volumes
  • The percentage contribution of write requests for each meta-volume is shown (“FS Write Reqs %”)

server_stats server_5 -table dvol -interval 10 -count 6

  • Shows the write distribution across all volumes
  • AVM will work hard to prevent disk overlap for a filesystem
  • Slice your stripes, don’t stripe your slices (basically create the stripe across all volumes first, then slice those up as needed)
  • root_ldisk – the log disk; high activity on this disk means lots of log activity in the server_log, for example the “ufslog hit high threshold” message. But is it a problem?
    • Data mover memory includes inodes and data blocks
    • Data mover cache is write-through, meaning that data needs to be destaged from cache before the write is acknowledged to the host. This is because the cache is not protected from power loss.
    • When writes are coming in, data blocks are updated, and inodes need to be updated.
    • Inode updates are written to the ufslog staging buffer.
    • The staging buffer contains uxfs log transactions and then destages to disk.
    • “Ufslog hit high threshold” means that in-memory copies of uxfs log transactions which have already been written to disk could not be retired, because the dirty metadata to which they point has not yet been flushed to the filesystem metavolume.
    • This message indicates contention at the filesystem metavolume, not the ufslog volume.
    • If the ufslog itself is the issue, the error message will be “staging buffer full, using next one”. Seeing one of these periodically is not a problem; it actually means the buffer is being used. It only becomes a concern when you see many of them per second (see the check sketched after this list).
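
To gauge how often these messages are actually being logged, the Data Mover log can be searched directly. This is a hedged sketch assuming the messages appear in the server_log and that server_5 is the Data Mover in question; the exact message text may differ between DART versions:

server_log server_5 | grep -i -c "hit high threshold"
server_log server_5 | grep -i -c "staging buffer full"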

nas_disk -l | grep root_ldisk

APM000123456789-0001

navicli -h spa getlun 1 -rwr -brw -wch (read write rate, blocks read/written, write cache hit)

For IOPS, 8 threads of I/O yield the greatest increase; going from 8 to 64 threads yields only nominal improvement.

nas_fs -I fs1

nas_disk -I d38

  • Look at stor_dev (hex) and convert to decimal
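
For example, a stor_dev of 0x001B (a hypothetical value) converts to decimal 27, the LUN number used in the getlun command below. Any hex-to-decimal method works, e.g. the shell's printf:

printf "%d\n" 0x001B
27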

navicli -h spa getlun 27 -rwr -brw -wch

  • Blocks written / write requests = blocks per write (multiply by 512 bytes per block to get the average write size)
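
As a hypothetical worked example: 3,200,000 blocks written ÷ 200,000 write requests = 16 blocks per write, and 16 × 512 bytes = 8 KB per write.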

navicli -h spa getlun 27 -rwr -brw -wch -disk

  • Shows the disks associated with the lun

navicli -h spa getdisk 2_0_2 -rds -wrts -bytrd -bytwrt (read requests, write requests, KB read, KB written)

  • KB written / write requests = KB per write
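
Hypothetical example: 1,600,000 KB written ÷ 50,000 write requests = 32 KB per write at the disk.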

Putting it all together, we saw:

Nfs write size 8KB

Dvol write size 8KB

Lun write size 8KB

Disk write size 32KB

The write size is 8 KB at every layer except the disks, where it shows up as 32 KB; this suggests sequential 8 KB writes being coalesced by the storage processor write cache, so the limit is being imposed upstream. Go to the host and check the filesystem:

df

grep fs1 /etc/mtab

Check the rsize=8192,wsize=8192 mount options. These are the NFS buffer size limits: even if the application wants to write 32 KB, each request is capped at 8 KB. Update those settings to 32768. You will need to unmount and remount with the new options (see the sketch below), and this needs to be coordinated since it is disruptive.
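
A minimal remount sketch, assuming a Linux NFS client with the export mounted at /mnt/fs1 (both the server address and the mount point are placeholders):

umount /mnt/fs1
mount -t nfs -o rsize=32768,wsize=32768 <datamover_ip>:/fs1 /mnt/fs1
grep fs1 /etc/mtab        # confirm the new rsize/wsize are in effect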

server_ifconfig server_5 -all

server_stats server_5 -table net -interval 10 -count 6

  • Divide Network In (KiB/s) by Network In (Pkts/s) to figure out the average packet size
  • Do this for both In and Out to see what MTU size is actually in use
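
For example (hypothetical figures): 45,000 KiB/s in ÷ 30,000 packets/s in ≈ 1.5 KiB per packet, consistent with a standard 1500-byte MTU rather than jumbo frames.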

server_netstat server_2 -s -p tcp

  • Look for transmission errors (retransmissions)
  • A node is aware only of its own retransmissions so be sure to check both ends of the connection
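
A quick way to pull the retransmission counters from both ends (a sketch assuming a Linux client; the grep pattern is deliberately loose):

server_netstat server_2 -s -p tcp | grep -i retrans
netstat -s -t | grep -i retrans        # on the Linux client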

The Navisphere Analyzer command line interface is a good way to examine the archived performance data.

You can extract specifically what you want as a CSV file.

naviseccli analyzer -archivedump -data spa.nar -stime "…" -ftime "…" -object l -format pt,on,rio,rs,wio,ws | grep _d38
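
To keep the extracted data instead of just filtering it on screen, the same command can be redirected to a file (the start and finish times still need to be supplied, and the output file name below is arbitrary):

naviseccli analyzer -archivedump -data spa.nar -stime "…" -ftime "…" -object l -format pt,on,rio,rs,wio,ws > spa_luns.csv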

Posted September 4, 2011 by g6237118
