One Year Later, My EMC World Flashy Perspective

Hi, only 1 year has passed since i wrote this post ( https://itzikr.wordpress.com/2013/05/09/reflections-on-emc-world-2013-my-flashy-prespective/ )

and it seems like a lifetime. So much has changed for me, both in my personal life and in my working life. While I don’t think my readers are interested in the personal stuff, I wanted to talk about what has changed for us in the EMC XtremIO team.


This blog post doesn’t summarize all the activities we had at EMC World 2014. It’s just my own personal view of the world.


OK, so the baby was born! Back in November 2013 we went GA. We had been in directed availability (DA) mode since April 2013, but the product became officially GA only in November, and in such a small amount of time it became the leading AFA product (by capacity and units sold). By the way, there is a reason I’m using a parents photo as the analogy: maybe it’s only me, but I find so many commonalities between raising a kid and raising a product. So what did it mean to us?

Anyway, we were thankful to be a part of the EMC machine. We needed the large number of sales reps, local support teams, etc. to stand behind the product. It’s like sending your kid to first grade and finding out on the 2nd day that he has been identified as a gifted kid. I’m not saying this to patronize, I’m saying it to try and explain what a driver feels when he accelerates from 0 to 100 in 3 seconds. It also signaled to everyone that the product is here to stay: it’s gaining huge momentum in the market and customers are absolutely loving it. The fact that we have returning customers proves it!

Speaking about customers, Brian Dougherty from CMA speaks about his impressions of XtremIO, now that it has been running in their environment for some time.

Later on, Wikibon wrote a great summary of the interview, which you can read here:


http://siliconangle.com/blog/2014/05/06/xtremio-instrumental-in-facilitating-emcs-competitive-advantage-emcworld/

Secondly, we also learned that FLASH is different. It’s not just the architecture, it’s the messaging that we need to bring to the market. Look, there are so many storage vendors out there trying to brand their AFAs as the “best in the market” that it is not always easy for customers to truly understand the differences. As such, in this EMC World, we became a bit more aggressive on these points:


At his keynote, David Goulden highlighted the fact that inline data services matter. But what did he mean? Well, some AFAs out there preach their data services (for example, dedupe) as inline, and in many cases they ARE. What they don’t tell you is that because of their legacy architecture (for example, an active/passive one), they put so much load on a single controller that they need to throttle back their data services to the point where they don’t work. The feature isn’t off, but it’s not working. This slide was actually taken as a proof point from a customer that was running a 50-50 R/W workload; when the array was under heavy load, the “other” AFA couldn’t cope. And here’s the thing: the load wasn’t synthetic. You are NOT buying an AFA to sit idle. If that’s what you’re doing, you shouldn’t buy one in the first place; a VNX2, for example, will do a much better and cheaper job for you as a customer. And guess what, even a VNX2 has an active/active architecture if you are going with the VNX-F model!! Crazy but true.


As part of this, we started the $1M guarantee. We are basically promising our customers that we will never shut down or throttle data services to a point where they do not work. We think that true always-on inline data services should be a part of the CORE architecture and not a feature that is semi post-process. Post-process in this context is a step backward to, again, the legacy arrays that already exist in the market.

Ehud Rokach, our GM (and a truly great person) actually speaks about it here:

 

and Josh Goldstein (VP of PM) speaks about it here:

 


For customers who already made the mistake of purchasing the wrong product (hey, it’s not easy to test an AFA and there was no baseline), we are offering a Trade-In program. Speak to your EMC sales rep about this one.

Speaking about how to test an AFA: as I mentioned above, it’s not easy. IDC has published a white paper about what to test, but it doesn’t show you how to do it. You can read more on the IDC WP here: https://itzikr.wordpress.com/2013/06/30/testing-an-all-flash-array/

So what can you do? Well, we took the IDC WP a step further and announced a FLASH testing kit appliance. It basically uses vdbench, which is the industry standard for testing the performance of your AFA. It’s very easy to work with, it will expose what happens to garbage collection while you are using your AFA and, in a nutshell, it will show you during the POC how your AFA will behave 6 months down the road. I can’t stress this enough: AFAs are so fundamentally different that you owe it to yourself to do due diligence while testing them. For more info on the AFA testing kit, watch the video below. One of its creators is Miroslav Klivanski, who is a super cool / super geeky type of person!

As for the sessions, we had a total of 5 different XtremIO sessions. I had the pleasure of presenting with Carl Norwich, who is a corporate SE in my team, on the VDI use case and the special sauce XtremIO brings to it.

WP_20140505_11_23_59_Pro

WP_20140505_11_57_53_Pro

 

On the stage we had both a partner (Cygate from Sweden) and a customer (ITSAM, also from Sweden) sharing their joint journey of migrating from a legacy array to XtremIO. It was truly a great story because it involved all the critical elements: an early partner that was trained on XtremIO and saw its value, that knew how to identify a customer need (they couldn’t scale above their initial VDI deployment), and the resolution to the issue, which was the migration to XtremIO.

 


The VDI session was absolutely packed, which again shows the value and, more than this, the awareness of the product in the minds of so many of our customers!

But it wasn’t just the VDI sessions that were packed. Every XtremIO session was fully packed, standing room only. Below you can see Josh Goldstein delivering an XtremIO architecture session. The customer interest is booming!


http://www.emcworld.com/emctv.htm?id=3547506466001


C.J Desai (our division president) also gave his first keynote, which you can watch at the link above. He went through the progress the product has made since its introduction to the market. One of the interesting (but maybe not that sexy) points he made is that the product resiliency is very, very high (touch wood..), and it’s true: we are so paranoid about stability, and that was one of the main reasons for the DA phase. He also revealed the VSPEX RAs that will be introduced later on (H2 2014).

Bm-WqSDCMAEOl8m

I was fortunate to give a private briefing to the EMC Elect team in which, together with Tamir Segal, we tried to explain some of the things that we have tried to do with XtremIO and what we are trying to do next. The EMC Elect team are a bunch of very smart people from both within and outside the EMC walls, and getting feedback from them was very insightful.

 

The Booth

WP_20140507_16_25_04_Pro

WP_20140507_16_32_38_Pro

WP_20140507_16_29_45_Pro

Like last year, the booth was always full of customers coming over to ask questions and share their experience with the product. It’s unbelievable to see the impact that a single year has made, both internally for us at EMC and externally for our customers.

 

the interesting part?

It’s only the beginning. From the start and up until the GA in 2013, it was about building a good foundation: a true scale-out, active/active architecture with inline, always-on data services that we can later layer amazing features on top of. Guess what, this year we are coming with them, so stay tuned. We have only just begun.


In the bigger scheme of things, the industry is going through a tectonic shift. It didn’t surprise me one bit that in every meeting I had with customers behind closed doors, there was a question that kept coming up:

“What’s going on with you guys”??

My answer to this was: we are paranoid. The storage industry has never been through such a change and frankly, it’s FUN to be at this stage as opposed to talking about and selling the same old story. You (the customer) and us (the vendor) are part of it, and thank you for pushing us to change!

as the Imagine Dragons sang:

Welcome To The New Age!

 

The Case For SPARSE_SE

Hi,

A topic I got involved with lately is the ability to run in-guest space reclamation automatically. Why is that so important, you may ask?

well, let’s assume, you are after a storage array, any storage array, you pay good money for it, right? now, you are probably assuming that by deleting files from WITHIN the guest OS, the capacity will return back to the array right? you are probably assuming that if you delete files OUTSIDE the VM, say, deleting an actual VM from the datastore, the physical capacity will also return back to the array, right?

WRONG!

It won’t. In order to support it, the guest OS needs to pass the UNMAP command to the parent hypervisor or OS, and the parent hypervisor needs to pass this information on to the underlying storage array.

OK Itzik, so you are telling me that I may have GBs or TBs of space that I can potentially use but I can’t??

YEP!

so what can you do, after all, you just spent good money that you want to put to a use, well, like any answer, it depends…

Microsoft Server 2012 / Hyper-V

IF you are using Windows Server 2012 / 2012 R2 as a physical OS or with the Hyper-V role installed on it, you do not need to do anything! UNMAP is built in to the OS for both physical and “virtual” (Hyper-V is just a role enabled on the parent physical OS).

image

File deletion can generate UNMAP operations

Background operations?

As a scheduled task through “Optimize Drive”

Volume initialization (format) can generate UNMAP

If you are using Windows Server 2012, you want to watch for HotFix 444333, which resolves serialization of UNMAP in NTFS volumes.

2012 R2? It just works.
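If you want to double-check that the OS will actually send UNMAP on file deletion, Windows exposes the setting through `fsutil behavior query DisableDeleteNotify`. Here is a minimal Python sketch that interprets that output. The sample string is an assumption about the output shape (newer builds add a file system prefix and a trailing note), so treat it as illustrative only.

```python
import re


def unmap_enabled(fsutil_output: str) -> bool:
    """Return True when delete notifications (TRIM/UNMAP) are enabled.

    Windows reports DisableDeleteNotify = 0 when they are enabled
    and DisableDeleteNotify = 1 when they are turned off.
    """
    match = re.search(r"DisableDeleteNotify\s*=\s*(\d)", fsutil_output)
    if match is None:
        raise ValueError("unrecognised fsutil output")
    return match.group(1) == "0"


# Assumed sample of what `fsutil behavior query DisableDeleteNotify` prints.
sample = "DisableDeleteNotify = 0"
print(unmap_enabled(sample))
```

On a real host you would feed the function the captured output of the command instead of the sample string.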

VMware vSphere

The plot gets more complex. Back in the vSphere 5.0 era, VMware DID support an automated UNMAP command, but it turned out that in rare circumstances it actually caused data corruption, so you now need to do it manually at both the in-guest level and the datastore level.

In-Guest

You can use a free MS utility called sdelete, which you need to run on every VM.

phase 1:

the sdelete command started to run

image

phase 2:

The sdelete command is toward the end of its run. Note the red arrow: our physical capacity just got bigger!

image

But it’s not over, right? Remember I told you that the datastore also needs to be aware of the space reclamation, so:

vSphere 5.1

Run the vmkfstools -y command. It will create a balloon file that then gets deleted, releasing the capacity back to the array.

image

Before:

image

After:

image

vSphere 5.5

more or less the same, the syntax is different, you now should run the unmap command

image

and if you want to properly run the command against an XtremIO array, you want to run it with the “-n 20000” parameters , so

esxcli storage vmfs unmap -l datastorename -n 20000
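To get a feel for why a larger `-n` value matters: the parameter sets how many VMFS blocks are unmapped per iteration. The rough sketch below estimates how many iterations a reclaim run needs. It assumes a VMFS-5 datastore with 1 MB blocks, and the 2 TB of free space and the 200-block comparison value are made-up figures for illustration.

```python
import math

VMFS_BLOCK_MB = 1  # assumption: VMFS-5 datastore with 1 MB blocks


def unmap_passes(free_space_gb: float, blocks_per_pass: int) -> int:
    """Estimate how many unmap iterations are needed to cover the free space."""
    free_mb = free_space_gb * 1024
    per_pass_mb = blocks_per_pass * VMFS_BLOCK_MB
    return math.ceil(free_mb / per_pass_mb)


# Hypothetical datastore with 2 TB of free space.
print(unmap_passes(2048, 200))    # small reclaim unit, for comparison
print(unmap_passes(2048, 20000))  # the value used above for XtremIO
```

Fewer, larger iterations mean fewer round trips between the host and the array for the same amount of reclaimed space.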

01/11/2014 — Update ===

you can now reclaim space at the datastore level using our VSI plugin, see a new post i wrote here

https://itzikr.wordpress.com/2014/10/08/vsi-6-3-is-here/

seriously man, do I now need to run sdelete MANUALLY on hundreds or thousands of VMs????

Well, there is some hope. You can use a third-party tool, which isn’t free like sdelete but will automate, report and reclaim the capacity for you. The software I was using is from a company called RAXCO and the specific product is “PerfectStorage” ( http://www.raxco.com/business/products/perfectstorage )

Let me show you one screenshot from my VDI lab that is worth a thousand words. The lab has been running 2,500 VDI VMs, persistent desktops. No real users are connected to it, but I DO use LoginVSI to generate load on it, so temporary Windows files DO EXIST.

image

Yes, you are seeing it right. The tool just gave me a full report (which is part of its centralized reporting engine) showing that I can claim back 5.14TB of space!!

“OK, but deploying this tool is probably a nightmare and it takes ages.” Nope. I used its ability to push the MSI package, and it took me 2 minutes to configure the policy and around 2 hours to push it to 2,500 VMs.

The scanning capability is also very important because it lets YOU decide if you want to reclaim the capacity or leave it as is and wait for the next scan report. The actual reclaim process is very sophisticated, as it takes into account both the guest OS and ESX CPU utilization, so it knows how to “behave” itself in a virtualized environment. Here’s the setting for it; they call it virtualization awareness. It takes into account not just the kernel CPU and the user-mode CPU but also the disk I/O. Pretty cool in my (humble) opinion!

image

By the way, PerfectStorage isn’t perfect either. You still need to run the space reclaim command per datastore, but this can be scripted and it’s easier to do than manually running “sdelete” on thousands of VMs.
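As a rough idea of what “scripted” could look like, here is a small Python sketch that generates the vSphere 5.5 unmap command line for a list of datastores. The datastore names are hypothetical, and the sketch only builds the command strings; how you run them on the ESXi host (SSH, a scheduled job, etc.) is up to you.

```python
import shlex


def unmap_commands(datastores, blocks_per_pass=20000):
    """Build one esxcli unmap command line per datastore label."""
    return [
        "esxcli storage vmfs unmap -l {} -n {}".format(
            shlex.quote(name), blocks_per_pass
        )
        for name in datastores
    ]


# Hypothetical datastore labels, for illustration only.
for command in unmap_commands(["vdi-ds-01", "vdi-ds-02"]):
    print(command)
```

Quoting the label keeps datastore names with spaces from breaking the command line.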

“hmm, sounds very good but isn’t it up to VMware to fix this?”

Yes, it is, but currently they do it only for VMware View when using linked clones. They basically enable a new disk format called “sParse_SE” (or “Flex-SE” if you are using the vCenter web interface).

image

You basically set a “blackout” window during which they will go and reclaim the capacity inside these VMs.

I want to show this from a different angle. Here at XtremIO, we have a tool called “dedupe estimator”. It can scan volumes (physical or virtual) and let you know about the data savings that you can get by moving these volumes to XtremIO.

Here’s how it looks before, scanning two PRODUCTION datastores from a real customer. These datastores have been in use for a couple of years by now.

BEFORE:

image

as you can see, the GLOBAL dedupe and data reduction savings (XtremIO dedupe is global, not per volume..) are around 2:1, not bad but not great either.

AFTER:

image

After running either sdelete or RAXCO and then running the datastore space reclaim commands on these two datastores, the data reduction has gone up to 4.6:1!!! That is really good: it means you are buying an EMC XtremIO array but you are getting 4.6x what you pay for.
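To put the two ratios in perspective, here is a tiny worked example in Python. The 10 TB physical capacity is a made-up figure; only the 2:1 and 4.6:1 ratios come from the scans above.

```python
def effective_capacity_tb(physical_tb: float, reduction_ratio: float) -> float:
    """Logical capacity you can store for a given physical capacity and ratio."""
    return physical_tb * reduction_ratio


physical_tb = 10  # hypothetical usable physical capacity

before = effective_capacity_tb(physical_tb, 2.0)  # ratio before reclaiming space
after = effective_capacity_tb(physical_tb, 4.6)   # ratio after reclaiming space

print("before:", before, "TB logical")
print("after:", after, "TB logical")
print("gain: x", round(after / before, 2))
```

In other words, on the same physical capacity, going from 2:1 to 4.6:1 more than doubles what you can store.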

I hope I was able to demonstrate why cleaning after yourself is a good habit, for ALL storage arrays but in particular for AFA’s where the media is more expensive.

I truly hope that one day VMware will support SPARSE-SE as the default vdisk format but until then your best option is to use RAXCO PerfectStorage.

VPLEX / VE Is Here

Funny, one of the most anticipated products (IMHO) just popped out, and you can bet you will hear more and more about it in the months to come. VPLEX / VE was just one part of one of the biggest (if not THE biggest) DPDA launches ever.

the theme was Data Protection for a Software-Defined World.

image

There are 3 specific sub-themes that support the overall launch. These sub-themes are key requirements for delivering effective data protection in a software defined world.

First – Delivering “data protection as a service” supports the shifts that our customers are making toward offering “IT as a service” – Effectively, becoming a service provider to the business. This also extends the opportunity to EMC Business Partners to offer “data protection as a service” to their end user customers.

Second – Empowering data owners – and by this we mean application, virtual and storage admins – with visibility and control of their own data protection needs, from familiar interfaces, removes the need for data protection silos and attacks the problem of “accidental architectures”.

And finally – Seamlessly spanning “The Continuum” of data protection, to assure appropriate service levels, based on the value of the data within the applications, from continuous availability to replication to backup and archive

Most industry analysts agree that there’s a need for a tiered recovery plan, in order to meet multiple services levels – from zero downtime and no data loss, to various point in time copies, to secure long term retention with archiving.

image

VPLEX 5.4 and VPLEX Virtual Edition are important releases for several reasons:

• VPLEX/VE is a new entry point for application data availability and mobility. Reduced TCO derived from deployment on standard ESXi server hardware and low cost iSCSI infrastructure. VPLEX/VE is a VMware-centric solution with deep VMware integration, managed and monitored through vCenter.

• VPLEX 5.4 adds support for MetroPoint Topology – Providing multi-source replication to a 3rd site – extending the VPLEX and RecoverPoint continuous availability story to include protection against events that could impact two sites, and operational recovery from data loss or corruption with any point in time RecoverPoint functionality.

• In the same timeframe as these VPLEX releases, ESA 2.3 for VPLEX delivers new predictive analytics capabilities, for VPLEX and EMC storage attached to VPLEX.

image

No competitive solution offers the continuous availability and protection that MetroPoint delivers.

Users get what they want: a single DR copy of their data, multi-site protection, and no impact even through failures of two sites.

Conversation opportunities –

•New incremental revenue from installed VPLEX customers as they improve their data protection to both sides of a VPLEX Metro topology

•For new customers, you have the most comprehensive continuous availability and data protection story to tell

image

VPLEX/VE is a software based, simple, affordable, virtual storage availability platform that provides continuous availability and non-disruptive mobility targeted at mission and business critical applications in the midrange and enterprise organizations with VMware and Mid-tier iSCSI storage infrastructures

image

VPLEX/VE takes the same great technology and the associated use cases the industry has come to know and trust and delivers them as a set of virtual appliances in a fast and easy to install VMware vApp package.

This is unique in the industry as it is the first and only continuous availability platform that is designed to run natively on the customer’s VMware infrastructure.

This simple, affordable and virtual storage availability platform is specifically designed for midmarket customers who are looking to achieve the highest levels of availability for applications running in their VMware environments on iSCSI storage infrastructure.

Now the storage availability conversation can extend beyond the storage admin to include the vAdmin who will directly benefit from improved availability at the VM level.

image

VPLEX/VE is integrated into vSphere

•Manage through a single tool

•Leveraging existing VMware expertise

•Provision as part of VM creation

•Use with your automated work flows

•One stop shop for all event info

•Eliminate multi-GUI for monitoring and troubleshooting

Benefits of VPLEX/VE and vSphere integration

•Deliver on infrastructure demands faster

•Faster and easier provisioning as part of VM creation process

•Shortened learning curve for admins who know vSphere

•Improve incident resolution

•Faster problem resolution with fewer, more powerful troubleshooting tools

Provision and Manage Storage With VM Workflows

•Provision highly available storage resources as part of the VM creation process

image


VPLEX/VE provides benefits to a different set of customers than the traditional appliance based VPLEX. VPLEX/VE will be particularly valuable to the following customers.

Customers with VMware environments and iSCSI networks. VPLEX/VE supports iSCSI networks but not FC networks. iSCSI networks are common at SMB customers and branch offices of enterprises because of the lower implementation and maintenance costs for iSCSI. VPLEX/VE is tightly integrated with VMware and is managed through the vCenter GUI. All management activities such as provisioning, moving volumes and monitoring storage are done through the VPLEX/VE plug-in for vCenter. Customers no longer need to develop new skills in separate management tools.

VPLEX/VE benefits customers who want to instantly move VMs between sites. Unlike Storage vMotion, VPLEX/VE continuously mirrors application data (and VM data) between sites so that when a vMotion is directed by an administrator, the VM can move to the new site without waiting for VM data to be copied there. And because this is an active/active mirror of data, VMs can be moved back and forth dynamically. Storage vMotion requires VM data to be copied to the new location first, before the VM can be moved, often taking hours to complete the operation. And each Storage vMotion is a new activity, so if a VM needs to be moved back, the VM data needs to be copied back.

Similarly, customers who want to load balance between sites can do so instantly and dynamically because the application data and VM data is mirrored between sites.

All of this is enabled by VPLEX/VE’s ability to mirror application data and VM data between arrays at a single site or between two sites. In addition to the added mobility benefits, customers who want to maintain mirrored data between arrays at one or two sites for improved availability can use VPLEX/VE to accomplish this.

image

VPLEX/VE leverages many of the unique capabilities of VMware, such as VMware High Availability (HA), to help create a continuously available application environment.

VMware HA provides the ability to automatically restart VMs when a physical ESX host hosting those VM’s suffers a failure. Under these circumstances, VMware directs a new ESX host to reboot the failed VM automatically using the same VMDK data stored on the single array.

The challenge is that VMware HA requires a single source of VMDK data. So in the loss of an array or in the loss of a full site, VMware HA is unable to restart servers.

image

VPLEX/VE bridges the gap in VMware HA by mirroring VMDK data across two arrays in two sites but presenting that data as a single volume to ESX hosts.

In the event of an outage from something like a site failure, VPLEX continues to make that VM data available at the remaining site.

And the ESX cluster is able to AUTOMATICALLY reboot the VMs on the remaining site without any human intervention.
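The difference between the two pictures can be sketched as a toy model: a restart is only possible if some surviving site has both an ESX host and a copy of the VMDK data. This is a deliberately simplified Python illustration, not how VMware HA or VPLEX/VE decide anything internally, and the site names are hypothetical.

```python
def can_restart(data_sites, host_sites, failed_sites):
    """True if a surviving site has both ESX hosts and a copy of the VM data."""
    surviving_hosts = set(host_sites) - set(failed_sites)
    surviving_data = set(data_sites) - set(failed_sites)
    return bool(surviving_hosts & surviving_data)


hosts = {"site-a", "site-b"}

# Single array at site A: losing site A leaves no copy of the VMDK data.
print(can_restart({"site-a"}, hosts, failed_sites={"site-a"}))

# Mirrored across both sites: site B still has hosts and data after losing site A.
print(can_restart({"site-a", "site-b"}, hosts, failed_sites={"site-a"}))
```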

image

While continuous availability is the key use case for VPLEX/VE and VMware, VPLEX/VE also enhances data mobility for VMware environments beyond what can be achieved by Storage vMotion and Storage DRS.

Storage vMotion gives vAdmins a way to non-disruptively move their VMs from one array to another. This can be done in a single site or across sites.

But Storage vMotion will not complete the move of a VM until the VM data move is actually completed. As a result, VM moves can be delayed. This is particularly challenging when multiple VMs are being moved simultaneously.

In addition, when a Storage vMotion move is requested, the ESX host resources can quickly be overloaded by the sudden burst of CPU and resource activity as the VM data is replicated by the ESX host from the old array to the new array.

image

VPLEX/VE operates by continuously mirroring data across arrays so that the exact same data is read and write accessible from two different arrays simultaneously. When a vMotion request is made that requires the VM data to be located on a new array, such as when VMs are vMotioned across distance, the vMotion move can be completed instantly because the mirror of the VM data can begin to be used as soon as the VM is moved to the new location.
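A back-of-the-envelope calculation shows why the copy step dominates a Storage vMotion. The VM size and the sustained throughput below are hypothetical numbers picked for illustration; with an already-mirrored volume that copy step simply does not exist.

```python
def copy_hours(vm_data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy the VM data at a sustained throughput."""
    seconds = vm_data_gb * 1024 / throughput_mb_s
    return seconds / 3600


# Hypothetical: 2 TB of VM data copied at a sustained 200 MB/s.
print(round(copy_hours(2000, 200), 2), "hours before the move can complete")
```

Moving the VM back later means paying the same copy time again, which is the point made about each Storage vMotion being a new activity.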

image

Distributed Resource Scheduling can load balance VMs across your ESX cluster for best use of your compute assets. When DRS requires VM data to be moved to a new location to complete a DRS load balancing move, it uses Storage DRS, a process very similar to Storage vMotion.

And just like Storage vMotion, Storage DRS does not actually move the VM until the VM data is replicated to the new array.

Also like Storage vMotion, the process of moving data with Storage DRS can consume a lot of ESX host resources in a burst fashion to conduct the load balancing VM data moves.

image

Again, because VPLEX/VE mirrors data and makes it read and write accessible from multiple locations simultaneously, VM moves for load balancing with DRS happen instantly and do not spike the ESX host resource usage in the process.

 

VPLEX Virtual Edition Cluster Witness Demo

EMC World 2014 Is Coming ! Here’s the XtremIO Sessions

image

Hi,

EMC World 2014 is coming soon, and you can click on the banner above to register for it. This year is going to be VERY interesting, like every other year, but from our perspective (the emerging technology unit) it will be DIFFERENT. No more “area 51” talk about this “future” AFA that WILL change the world: this year, the XtremIO array has already been GA’d and it’s already CHANGING the market. Many production customers are already USING the product and are amazed by the value it brings them every day, and as such we will have a lot to share. Attached below are the sessions we have planned, and you are all welcome to register for them. Yours truly will have the VDI session running twice (or so I have been told) and of course, we would love to interact with you, our existing and future customers!

image

image

image

image

oh, and EMC is notorious for product announcements..


XtremIO 2.2 Service Pack 3 Is Out!

Hi,

If you thought 2013 was big for EMC and XtremIO, you have seen nothing yet,

2014 is going to be much, much bigger. A big part of the launch theme was around “getting the architecture right!”. This was well needed because, as we said, features are easy to add, changing your core architecture isn’t!

As part of “let’s add some features”, we have just released the 2.2 Service Pack 3 upgrade for our XtremIO customers.

This is what I call “a minor release”. Hint, hint, the next one is kinda big..

Here’s what’s new:

Hardware Changes:

clip_image001

20TB Brick

Enables XtremIO to provide more capacity in a single cluster for non-dedup use cases like databases. Can be used for VDI as well, when physical capacity is more important than logical capacity.

Based on 800GB Hitachi Sunset Cove “Type B” (Encryption ready)

Supports 1, 2 and 4 X-Bricks per cluster.

Security

Active Directory Integration

clip_image003

Supports AD integration over LDAP and LDAPS protocols

• Allows mapping of AD groups to XMS roles

• Mapping is done via groups, where AD groups are mapped to roles within the XMS

• Supports multiple AD servers

• Use local store for user authentication if AD service is not available
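The group-to-role mapping and the local fallback described in the bullets above can be pictured with a short Python sketch. This is not the XMS implementation: the group names, role names and function are all hypothetical, and the actual AD lookup is left out.

```python
# Hypothetical AD group names and XMS role names, for illustration only.
GROUP_TO_ROLE = {
    "XtremIO-Admins": "admin",
    "XtremIO-Viewers": "read_only",
}


def resolve_role(username, ad_groups, local_users):
    """Return the role for a user, or None if access should be denied.

    ad_groups is the list of AD groups returned for the user,
    or None when no AD server could be reached.
    """
    if ad_groups is None:
        # AD unavailable: fall back to the local user store.
        return local_users.get(username)
    for group in ad_groups:
        if group in GROUP_TO_ROLE:
            return GROUP_TO_ROLE[group]
    return None


local_users = {"admin": "admin"}
print(resolve_role("dana", ["Domain Users", "XtremIO-Viewers"], local_users))
print(resolve_role("admin", None, local_users))
```

The design point is that roles hang off groups rather than individual users, so day-to-day access changes happen in AD and not on the array.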

Client to XMS security

Allowing only SSH and HTTPS secure protocols.

HTTPS capabilities

– Secures GUI download from XMS

– Encryption of GUI to Management Server communication

– Supports installing 3rd party server certificate

– Allows encryption of the RESTful API

Remote Syslog Support

clip_image005

• Support sending events to a remote syslog server or multiple syslog servers

• Events handler enables the user to set rules on which event should be sent
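Conceptually, the events handler is a filter sitting in front of one or more syslog targets. The Python sketch below shows that idea only; the severity names, event fields and server names are assumptions and not the XMS event schema, and nothing is actually sent over the network.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["information", "minor", "major", "critical"]


@dataclass
class Event:
    severity: str
    category: str
    message: str


def should_forward(event, min_severity, categories=None):
    """Apply a simple rule: minimum severity plus an optional category filter."""
    if SEVERITY_ORDER.index(event.severity) < SEVERITY_ORDER.index(min_severity):
        return False
    return categories is None or event.category in categories


def plan_forwarding(events, servers, min_severity):
    """List (server, message) pairs for every event that passes the rule."""
    return [
        (server, event.message)
        for event in events
        if should_forward(event, min_severity)
        for server in servers
    ]


events = [
    Event("information", "audit", "user logged in"),
    Event("major", "hardware", "SSD failed"),
]
for target in plan_forwarding(events, ["syslog-1", "syslog-2"], "major"):
    print(target)
```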

Management Improvements

clip_image007

clip_image009

clip_image011

Monitor – display 30 minutes of history in line widget

Object Granular Monitoring – add latency reporting

– Average latency per volume

– Average latency per Initiator / Target, per block size

Support VAAI Thin Primitives

clip_image013

clip_image015

1. TP STUN

Return TPSTUN if a write is received by the XtremIO array and it cannot be stored due to lack of physical or logical capacity.

TPSTUN is enabled by default on all volumes

2. TP Soft limit

A warning is raised and surfaced in VMware vCenter™ via VAAI if a thin-provisioned datastore reaches a specific threshold.

The following CLI commands were changed/added:

modify-cluster-thresholds  vaai-tp-limit=[1-100/NO-LIMIT] 

modify-volume vol-id=vol-id vaai_tp_alerts=[enabled/disabled] 

add-volume <list-of-parameters> added an optional parameter  vaai_tp_alerts=[enabled/disabled]
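To make the two primitives a bit more concrete, here is a toy Python model of what happens on a write: an out-of-space condition maps to TP STUN, while crossing the configured threshold maps to the soft-limit warning. The function, its return strings and the numbers are hypothetical; this is only a sketch of the behavior described above, not array code.

```python
def handle_write(used_gb, capacity_gb, write_gb, soft_limit_pct=None):
    """Classify a write against a thin-provisioned capacity pool.

    soft_limit_pct mirrors the idea of vaai-tp-limit: a percentage from
    1 to 100, or None for no limit.
    """
    if used_gb + write_gb > capacity_gb:
        # No room to store the write: the host would see a TP STUN.
        return "TPSTUN"
    used_pct = 100 * (used_gb + write_gb) / capacity_gb
    if soft_limit_pct is not None and used_pct >= soft_limit_pct:
        # Write succeeds, but a threshold warning would surface in vCenter.
        return "OK_WITH_WARNING"
    return "OK"


print(handle_write(10, 100, 10, soft_limit_pct=80))
print(handle_write(70, 100, 10, soft_limit_pct=80))
print(handle_write(95, 100, 10, soft_limit_pct=80))
```

With TP STUN, the affected VM is paused instead of crashing on a failed write, which gives the admin time to free up or add capacity.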

Proud To Be Selected As EMC ELECT 2014

Hi,

I’m so proud and humbled to be selected as an EMC ELECT for 2014. I always thought that sharing information is fun and useful for everyone, and I’m happy to get recognition for it.

Elect2014-web

Coincidentally, I also received the SE MVP Award for the XtremIO BU. Following the same rule as above, sharing and helping other teams is not a one-off task, and while it’s not always easy, it’s very rewarding to know that people can count on you. Typical Virgo behavior, I suppose.

WP_20140113_22_27_16_Pro

2014 is going to be an amazing year for EMC and the Emerging Technologies Unit!