Wednesday, October 28, 2009

Designing VMware on NetApp for Dedupe, SMVI, and SnapMirror

I have done a fair amount of design work around VMware solutions on NetApp storage. As many already know, I love the NetApp product line and the added value around virtualization. The downside is getting all of the products to interoperate with each other from a design standpoint. The most common solution I see lately is new vSphere installations where customers are requesting NetApp deduplication for better space utilization, SnapManager for Virtual Infrastructure (SMVI) to protect the virtual machines, and SnapMirror for replication to another NetApp storage system at a remote site. Typically my installs have been NFS, but all of this could easily be done on block-based storage as well.

I'm not giving away any trade secrets here; all of this can be found in the NetApp TRs for the products if you dig deep enough.

Based on what I stated above, here is some advice for designing the solution:
  • Do not create a volume larger than the dedupe limit for the system (1TB for the 2020, 2TB for the 2050, etc.). NetApp TR-3505 has this information. If you set autogrow for the volume, make sure it does not grow past this limit!
  • I recommend against setting autogrow unless you understand two things: 1. don't autogrow past the volume dedupe limit listed above and 2. autogrow may break SnapMirror. If you autogrow the volume at the primary site, the destination volume must also be grown. See NetApp TR-3446 for more information. My preference is to disable autogrow on SnapMirror volumes and grow both source and destination volumes manually as needed.
  • As stated in many of the TRs, you will want to put the virtual machine vSwap files on a volume that isn't replicated with SnapMirror or protected with SMVI. This is set per vSphere host.
  • Give careful consideration to the placement of the virtual machine page file.  You have many choices with this one but be VERY careful if you plan to implement VMware SRM.  Read NetApp TR-3671 for more details.   For SRM you will need to SnapMirror the page file or jump through some painful hoops to create the page file at the remote location without replicating it.
  • If you don't want to replicate the page file, create a small independent persistent virtual disk and put it on a separate volume that doesn't get SnapShots or SnapMirror. See NetApp TR-3737 for more information.
  • Consider the order for deduplication, SMVI, and SnapMirror. You want to deduplicate first because dedupe runs on the Active File System, NOT SnapShot blocks. To maximize dedupe savings, run dedupe first, then SMVI after dedupe has completed, and lastly use SMVI to kick off the SnapMirror replication.
  • If you are doing cross site replication (both of your NetApp systems are both source and destination for SnapMirror), be careful of "deswizzling". Deswizzling is a SnapMirror post process that runs on the destination volume after a SnapMirror transfer. It moves the blocks around on the disk to optimize the layout. This process can be intensive on some systems. I have seen a 2020 with SATA disks brought to a halt because of this. The only reference I have seen to deswizzling was in NetApp TR-3561 but it appears to have been pulled from their site.
  • If you are using block-based storage, you will need to thin provision the volume and the LUN. I recommend using Operations Manager to monitor the free space and send alerts as needed.
  • The combination of SMVI and SnapMirror only supports one datastore per volume, and only one SnapMirror destination is supported.
  • SMVI will update SnapMirror relationships but it will not create them.
  • SMVI will not back up or restore RDM LUNs or iSCSI software connected LUNs in Microsoft virtual machines.
  • SMVI can't take VMware snapshots of virtual machines that have iSCSI software LUNs with the Microsoft iSCSI initiator or NPIV RDM LUNs.
  • Use Common Sense: Don't overwork the systems right out of the gate, start small, monitor, and grow
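
To make the sizing and ordering advice above a bit more concrete, here is a minimal Python sketch of a design-time sanity check. It is not a NetApp tool and it does not talk to a filer. The limits table only contains the two figures quoted above (check TR-3505 for your own model and Data ONTAP release), and the names VolumePlan, check_volume, and dedupe_done_before_smvi are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Dedupe volume limits in GB, using only the two figures quoted in this post.
# Other models and Data ONTAP releases differ; TR-3505 is the reference.
DEDUPE_LIMIT_GB = {"2020": 1024, "2050": 2048}


@dataclass
class VolumePlan:
    name: str                              # hypothetical volume name
    model: str                             # controller model, e.g. "2020"
    size_gb: int                           # planned volume size in GB
    autogrow_max_gb: Optional[int] = None  # None means autogrow is disabled
    snapmirrored: bool = False             # True if the volume is a SnapMirror source


def check_volume(plan: VolumePlan) -> List[str]:
    """Return a list of design warnings for one planned volume."""
    warnings = []
    limit = DEDUPE_LIMIT_GB.get(plan.model)
    if limit is None:
        warnings.append(f"{plan.name}: no dedupe limit on file for model {plan.model}")
    else:
        if plan.size_gb > limit:
            warnings.append(f"{plan.name}: size {plan.size_gb}GB exceeds dedupe limit {limit}GB")
        if plan.autogrow_max_gb is not None and plan.autogrow_max_gb > limit:
            warnings.append(f"{plan.name}: autogrow maximum {plan.autogrow_max_gb}GB exceeds dedupe limit {limit}GB")
    if plan.snapmirrored and plan.autogrow_max_gb is not None:
        warnings.append(f"{plan.name}: autogrow on a SnapMirror source; the destination must be grown too")
    return warnings


def dedupe_done_before_smvi(dedupe_start: int, dedupe_minutes: int, smvi_start: int) -> bool:
    """True if dedupe is expected to finish before the SMVI job starts.

    All times are minutes after midnight on the same night, and
    dedupe_minutes is your own estimate of the dedupe run time.
    """
    return dedupe_start + dedupe_minutes <= smvi_start


if __name__ == "__main__":
    plan = VolumePlan("vm_ds1", "2020", 900, autogrow_max_gb=1200, snapmirrored=True)
    for warning in check_volume(plan):
        print(warning)
```

SnapMirror has no schedule of its own in this sketch because, as described above, SMVI kicks off the replication once its backup completes. The only timing you need to line up by hand is dedupe finishing before SMVI starts.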

10 comments:

David Strebel said...

Aaron
Great post!
I was wondering though what's the reason behind "The combination of SMVI and SnapMirror only supports one datastore per volume and only one SnapMirror destination is supported." Is this a best practice or a technical limitation?

Thanks

Aaron Delp said...

The caveat of only one SnapMirror destination is a limitation of SMVI 1.0.

The caveat of one datastore per volume is a recommended best practice because SMVI only supports volume level SnapMirror.

Thanks!

Unknown said...

Hi Aaron,

this is a great blog post for people using NetApp and VMware together, thank you!
I have just one thing to add: when you set the pagefile disk to independent-persistent mode, you can't do SVMotion (at least in 3.5; I don't know about vSphere yet).
But there's a workaround for this. You have to temporarily remove the independent mode to migrate the VM (see kb.vmware.com/kb/1004094).

Aaron Delp said...

Ronny - Thank you for the comment! I didn't know that! Great information!

I will try to test it on vSphere soon and let you know!

Anonymous said...

Hi Aaron,

Very good info here! Allow me to add some more info.

Some environments will require an onsite incremental backup, snapvault, and then a mirror of snapvault to a DR site.

source/smvi > snapvault > snapmirror

As per your blog, it's always best to run dedupe on the source to avoid pushing redundant blocks across the environment.

Once finished, SMVI 2 will also automatically start a SnapVault update, and I believe it'll start dedupe on the destination as well. I'm not sure about that, or whether it's needed, to be honest. Once the source has been deduped, that should be enough.

I guess if your primary controller is challenged on CPU then let the snapvault destination controller handle that bit.

As always with SnapMirror and DR, make sure all your controllers are on the same version of ONTAP.

Aaron Delp said...

Great Information! Thank you! I honestly haven't worked with Snap Vault very much but I hope to in the future.

I'll be loading up SMVI 2.0 in our lab to play with it a bunch more in the future. I have done some installs but I have a bunch of lingering design questions that I want to take care of.

Unknown said...

It's correct that dedupe automatically kicks off on the destination after a SnapVault transfer. The dedupe process on the destination cannot be scheduled individually; it depends on your SnapVault schedules. Don't forget the maximum volume size limits (different on each controller) if you use dedupe! You can find more info in TR-3505.
http://www.netapp.com/us/library/technical-reports/tr-3505.html

Jim said...

Aaron,

I have a question regarding the requirement to thin provision both the vol and LUN if using a block storage scenario. Is this due to the way dedupe handles the capacity freed up in a vol/LUN config or is there another reason Thin provisioning is needed?

Aaron Delp said...

Hey Jim - You're taking me back a ways as I don't design NetApp on a day to day basis any more.

I want to say the reason for this was that if you didn't thin provision everywhere, the space freed up by dedupe wasn't seen in vCenter. It was seen by the filer, but the free space wasn't reported at the LUN level back to vCenter. Again, since it has been so long, I want to say that is the reason but I'm not 100% sure anymore.