QUALIFICATION: The BCSD tool must be used to size new configurations. It is recommended that all SRDF configurations, including SRDF/A, be qualified by your EMC support representative via the SVC group.
R2 FRAME: The R2 Frame should be AT LEAST as fast as the R1.
This includes the following: the same number, size, and type of drives and the same protection schemes should be used in both the R1 and R2 for the standard volumes. If additional volumes such as BCVs are configured on the R2 side, additional drives and cache should be used. For example, if using RAID 1/0 on the source frame with 15k drives, RAID 1/0 with 15k drives should be used on the target frame. Consideration should be given to segregating standard and BCV volumes onto separate drives.
The default device write pending limit (amount of cache slots per volume) should be the same or higher in the R2 as in the R1. This may require more physical cache in the R2 than in the R1.
- When defining CLONE on the R2, keep the clone devices on segregated drives and use the pre-copy option.
- QOS with an initial value of 2 can be used to help reduce the copy impact.
- SNAP is NOT ALLOWED on the R2 volumes.
BANDWIDTH: Sufficient bandwidth needs to be provided to run SRDF/A. It is imperative to understand the workload prior to configuring SRDF/A. SRDF/A can sometimes reduce the overall bandwidth requirement by 20% compared with Synchronous SRDF, but this is highly dependent on the workload. It is best to keep this bandwidth reduction in reserve until the actual solution is implemented and the data can be analyzed.
Symmetrix DMX GigE adapters and some Fibre Channel switches offer compression, but the actual compression values realized are highly dependent on the data and can fluctuate drastically over the business day. For example, certain batch workloads can get better compression than online workloads. The actual compression values can be pulled off of the GigE adapters via an Inline or out of the Fibre Channel switches. Compression can help reduce the overall bandwidth required for SRDF/A, but be extremely careful when counting on compression as cycle times can drastically elongate if the compression is not being realized.
Your support representative can use the SYMMMERGE and BCSD tools to model data and correctly size a proper configuration. Bandwidth needs to be at least equal to the average number of writes entering the sub-system. This will not guarantee minimum cycle times.
If you are targeting minimum cycle times, then sufficient bandwidth needs to be configured to handle the peak number of writes entering the system. Keep in mind that we are typically using 10 or 15 MINUTE data to model 30 SECOND cycle times. A sufficient amount of cache MUST be configured to keep SRDF/A active for the period over which that data was collected. In other words, if you model on 15 minute data, you must configure enough cache and bandwidth to keep SRDF/A active for 15 minutes at a minimum. Cycle times may elongate past the minimum during this period. Never guarantee minimum cycle times. Required bandwidth must be dedicated to SRDF/A. Do not share bandwidth with network traffic, tape, etc.
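As a rough illustration of the average-versus-peak distinction above, the following Python sketch compares a candidate link against sampled write rates. It is not a substitute for the BCSD or SYMMMERGE tools. The function name, the sample values, and the simple backlog model are assumptions made for illustration, and the model ignores bursts inside each collection interval as well as any bandwidth reduction from SRDF/A or compression.

```python
from statistics import mean


def size_link(write_mb_per_s, interval_s, link_mb_per_s):
    """Compare a candidate link against sampled host write rates.

    write_mb_per_s: average write rate (MB/s) for each collection interval
    interval_s:     length of each interval in seconds (900 for 15 minute data)
    link_mb_per_s:  dedicated SRDF/A bandwidth being evaluated (hypothetical)
    """
    avg_rate = mean(write_mb_per_s)
    peak_rate = max(write_mb_per_s)

    # Track how much unsent write data would pile up in cache when the
    # write rate exceeds the link rate, and drain it when the link has room.
    backlog_mb = 0.0
    worst_backlog_mb = 0.0
    for rate in write_mb_per_s:
        backlog_mb = max(0.0, backlog_mb + (rate - link_mb_per_s) * interval_s)
        worst_backlog_mb = max(worst_backlog_mb, backlog_mb)

    return {
        "average_mb_per_s": avg_rate,
        "peak_mb_per_s": peak_rate,
        "meets_average": link_mb_per_s >= avg_rate,  # bare minimum
        "meets_peak": link_mb_per_s >= peak_rate,    # needed for minimum cycle times
        "worst_backlog_mb": worst_backlog_mb,        # rough cache needed to stay active
    }


# Four 15 minute samples (made-up numbers) against a 50 MB/s link.
result = size_link([20, 35, 80, 30], interval_s=900, link_mb_per_s=50)
print(result)
```

With these made-up samples the link covers the average but not the peak, so the sketch reports a backlog that cache would have to absorb for SRDF/A to stay active through the busy interval.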
RA COUNT: The correct number of RAs needs to be configured. There should be at least N+1 RAs, where N is the number of RAs required, so that a service action can be performed to replace an RA if necessary.
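The N+1 rule can be written down as a small calculation. This is only a sketch: the per-RA throughput figure is a hypothetical input that would come from your sizing exercise, not a published adapter rating.

```python
import math


def ras_to_configure(required_mb_per_s, per_ra_mb_per_s):
    """Return N+1, where N is the number of RAs needed to carry the load.

    per_ra_mb_per_s is an assumed sustainable throughput per adapter.
    """
    n_required = max(1, math.ceil(required_mb_per_s / per_ra_mb_per_s))
    return n_required + 1  # one spare so an RA can be replaced in service


print(ras_to_configure(80, 30))  # N = 3, so configure 4
```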
Synchronous groups and SRDF/A groups should be segregated onto their own physical adapters. Do not mix Synchronous and SRDF/A on the same adapters. Directors supporting SRDF/A should not be shared with any other SRDF solution.
Caution! When moving from a Synchronous solution to SRDF/A, in many cases we have seen the bandwidth and adapter utilization INCREASE as a result of the overall response time to the system decreasing.
MONITORING: SRDF/A should be monitored during the initial roll-out to ensure that all components were properly sized and configured. Data needs to be collected via STP or WLA and then run through the tools again to verify the initial projections were correct. STP at 5x71 microcode includes SRDF/A statistics, which can be very beneficial.
Do not forget that Mainframe MSC customers can monitor for issues through the SCF1562I and SCF1563I messages. These indicate whether transmit or restore issues are occurring, and they also identify which box has the issue.
The SYMSTAT commands were specifically created for monitoring open systems SRDF/A, but when issued from the Service Processor on the DMX they can be quite informative regardless of whether the environment is mainframe or open systems.
There are three options:
- Cycle
- Requests
- Cache
Using different combinations of the three options can help determine what caused the CACA and you can even prevent a drop by monitoring the cache utilization closely. SRDF/A should be monitored on a regular basis to look for workload changes and to predict increases in CACHE or BANDWIDTH due to growth.
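To illustrate watching cache utilization closely, here is a minimal Python sketch that flags samples approaching a drop limit. It does not parse SYMSTAT output. The sample format, the function name, and the warning margin are all assumptions, and the limit is an input you would take from your own configuration.

```python
def cache_alerts(samples, drop_limit_pct, warn_margin_pct=10.0):
    """Return the samples that are within warn_margin_pct of the drop limit.

    samples: list of (timestamp, cache_used_pct) tuples collected elsewhere
    drop_limit_pct: utilization at which SRDF/A would drop (site-specific input)
    """
    warn_at = drop_limit_pct - warn_margin_pct
    return [(ts, pct) for ts, pct in samples if pct >= warn_at]


# Made-up samples and a made-up limit, for illustration only.
samples = [("09:00", 40.0), ("09:05", 68.0), ("09:10", 72.5)]
for ts, pct in cache_alerts(samples, drop_limit_pct=75.0):
    print(f"{ts}: cache at {pct}% is close to the drop limit")
```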
VERIFICATION: The network should always be verified to ensure that the projected amount of bandwidth is configured. STP or WLA should be collected during the initial Adaptive Copy Synchronization to ensure that the required bandwidth is configured and that the network runs error free. Compression ratios should also be checked either at the switches or on the GigE adapters to verify that the correct numbers were used.
Upgrade or Reconfiguration: Always re-evaluate the SRDF/A solution prior to doing any upgrades or reconfigurations. This includes drive upgrades, adding volumes to the SRDF/A links, or changing the front-end connectivity, for example changing ESCON to FICON.
Starting SRDF/A: SRDF/A activation is considerate of cache utilization. SRDF/A will capture a delta set of writes and send them in cycles across the link. In addition to the new writes, SRDF/A will include up to 30,000 invalid tracks per cycle. This is a design feature and the 30,000 track value was chosen to prevent cache from being flooded by the invalid tracks. Therefore, EMC generally recommends as a best practice to synchronize the boxes in Adaptive Copy Disk mode to below 30,000 invalid tracks before activating SRDF/A. This will ensure that SRDF/A will become secondary consistent within a few cycles.
SRDF/A will activate with many more than 30,000 invalid tracks and in fact, some customers choose to activate SRDF/A when they have thousands or millions of invalid tracks. This is allowed, but only a maximum of 30,000 invalid tracks will be sent with each SRDF/A cycle. As a result, it will take many cycles before the frames are secondary consistent.
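The effect of the 30,000 track limit is easy to estimate. The Python sketch below gives a best-case lower bound on the number of cycles needed to clear a backlog of invalid tracks. The 30 second cycle time is an assumed default; real cycles may elongate, and new writes share each cycle, so the actual time will be longer.

```python
import math

TRACKS_PER_CYCLE = 30_000  # per-cycle invalid track limit described above


def cycles_to_consistency(invalid_tracks, cycle_time_s=30):
    """Best-case cycles and seconds needed to send all invalid tracks."""
    cycles = math.ceil(invalid_tracks / TRACKS_PER_CYCLE)
    return cycles, cycles * cycle_time_s


print(cycles_to_consistency(25_000))     # below the limit: a single cycle
print(cycles_to_consistency(2_000_000))  # activating early: many cycles
```

For example, activating with 2,000,000 invalid tracks needs at least 67 cycles, which is more than half an hour even if every cycle completes in 30 seconds.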
Fiber RDF Directors: Enable RF flow control. See emc152051 for a description of this feature.
Page Data Sets: Your EMC CE needs to set Enable Page Data Set Mode to YES in the IMPL.bin file to ensure synchronous replication of all page data sets. Refer to emc100913.
Configuring Delta Set Extension (DSE): See emc204521 for best practices for configuring DSE.
Determining the EMC Symmetrix Remote Data Facility Pair State
The resource status message reflects the role and state of the RDF pair. For example, the resource status and status message Faulted Split is reported when the RDF pair is in a Split state.
The RDF pair state is mapped to the associated resource status as described in the following table.
Table 2–2 Mapping From the RDF Pair State to the Resource Status
Condition | Resource Status | Status Message |
---|---|---|
The RDF pair state is Invalid and the pair state is not Incorrect Role. | Faulted | Invalid state |
The RDF pair state is Partitioned and the pair state is not Incorrect Role, or Invalid. | Faulted | Partitioned |
The RDF pair state is Suspended and the pair state is not Incorrect Role, Invalid, or Partitioned. | Faulted | Suspended |
The RDF pair state is SyncInProg and the pair state is not Incorrect Role, Invalid, Partitioned, or Suspended. | Degraded | SyncInProg |
The RDF pair state is R1 UpdInProg and the pair state is not Incorrect Role, Invalid, Partitioned, Suspended, or SyncInProg. | Faulted | R1 UpdInProg |
The RDF pair state is Split and the pair state is not Incorrect Role, Invalid, Partitioned, Suspended, SyncInProg, or R1 UpdInProg. | Faulted | Split |
The RDF pair state is Failed over and the pair state is not Incorrect Role, Invalid, Partitioned, Suspended, SyncInProg, R1 UpdInProg, or Split. | Faulted | Failed over |
The RDF pair state is R1 Updated and the pair state is not Incorrect Role, Invalid, Partitioned, Suspended, SyncInProg, R1 UpdInProg, Split, or Failed over. | Faulted | Replicating with role change |
The RDF pair state is Synchronized. | Online | Replicating |
The state of the RDF pair determines the availability of consistent data in the partnership. When the state of the RDF resource on the primary or secondary cluster is Degraded or Faulted, the data volumes might not be synchronized even if the application can still write data from the primary volume to the secondary volume. The RDF pair will be in a Partitioned state and the invalid entries will be logged as the data is written to the primary volume. Manual recovery operations are required to resolve the error and resynchronize the data.
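The "and the pair state is not ..." clauses in Table 2–2 amount to a precedence order. The Python sketch below encodes that order. It assumes the input is the set of pair states seen across a device group, which is one reading of the table and not something the text states outright. The function name is hypothetical, and because the excerpt gives no row for Incorrect Role, the sketch returns None for it and does not guess a status.

```python
# Rows of Table 2-2, highest precedence first: (pair state, status, message).
PRECEDENCE = [
    ("Invalid", "Faulted", "Invalid state"),
    ("Partitioned", "Faulted", "Partitioned"),
    ("Suspended", "Faulted", "Suspended"),
    ("SyncInProg", "Degraded", "SyncInProg"),
    ("R1 UpdInProg", "Faulted", "R1 UpdInProg"),
    ("Split", "Faulted", "Split"),
    ("Failed over", "Faulted", "Failed over"),
    ("R1 Updated", "Faulted", "Replicating with role change"),
    ("Synchronized", "Online", "Replicating"),
]


def resource_status(pair_states):
    """Map observed RDF pair states to (resource status, status message)."""
    states = set(pair_states)
    if "Incorrect Role" in states:
        # Outranks every row, but the table excerpt does not list its status.
        return None
    for state, status, message in PRECEDENCE:
        if state in states:
            return status, message
    raise ValueError(f"no known RDF pair state in {sorted(states)}")


print(resource_status(["Synchronized", "Split", "SyncInProg"]))
print(resource_status(["Synchronized"]))
```

The first call reports Degraded with the message SyncInProg, because SyncInProg ranks above Split in the table.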
Hi Govindagouda,
I am reading your piece because we are running into an issue with our SRDF/A environment. Randomly, we are saturating the cache on the R2 side. The cache on the R2 side is smaller (designed by EMC) than the R1 side. This is a virtualized environment hosting only Exchange 2010. Sometimes the SRDF suspends are 3-6 weeks apart, other times only 3 days apart.
EMC Support has been working on this and is not seeing issues on the SAN side. However, your recommendation above of a larger cache on the R2 side seems sensible, even though that is not how our environment is configured.
I’m wondering what issues you’ve seen, if any, with a smaller cache on the R2 side.
Usually what happens is that the write pending (WP) count increases until it reaches more than 75 percent of cache. At that point, by design, SRDF is suspended.
You can look into enabling DSE if it is not already enabled.
This is a topic that’s close to my heart… Many thanks! Exactly where are your contact details though?
Thanks
Hello there! Do you use Twitter? I’d like to follow you if that would be okay. I’m undoubtedly enjoying your
blog and look forward to new posts.
Hi Stephan,
I don't use Twitter.
Great Website Made Here! Very Educational Subject For A Website Keep Up The Amazing Work!
Very Nice … Blog
Thank you for the good writeup. It in fact was a amusement account it.
Look advanced to more added agreeable from you!
However, how can we communicate?
You ought to take part in a contest for one of the most
useful sites online. I will highly recommend
this website!
Thank you Bell.
Very useful site. Thanks a million.
Thank you Very Much for Visiting the Blog
very nicely explained and great blog thanks lot
Thank you Very Much for Visiting the Blog
Thank you Amar
very nice submit, i definitely love this website, keep on it
Thank you Very Much for Visiting the Blog.
Hi,
Please I am trying to move a new device pair into a SRDF/A session and I am getting the message
The Cache Partition setup is invalid
here is the command I am typing
$ symrdf movepair -sid 4315 -rdfg 2 -new_rdfg 10 -cons_exempt -f "D:\EMC_Management\Dev_SRDF\4315\rdf_create_pair_4315.txt" -nop
An RDF 'Move Pair' operation execution is in progress for device
file 'D:\EMC_Management\Dev_SRDF\4315\rdf_create_pair_4315.txt'. Please wait...
The Cache Partition setup is invalid