A guest post by Tamir Segal
Very soon after we released Dell-EMC XtremIO’s copy technology, we were surprised by an unexpected finding. We learned that in some cases customers were deploying XtremIO to hold redundant data, or “copies”, of the production workload. We were intrigued: why would someone use a premium array to hold non-production data? And why would they dare to consolidate it with production workloads?
Because of our observations, we commissioned IDC to perform an independent study and drill into the specifics of the copy data problem. The study included responses from 513 IT leaders in large enterprises (10,000 employees or more) in North America and included 10 in-depth interviews across a spectrum of industries and perspectives.
The “copy data problem” is rapidly gaining attention among senior IT managers and CIOs as they begin to understand it and the enormous impact that it has on their organization in terms of cost, business workflow and efficiency. IDC defines copy data as any replica or copy of original data. Typical means of creating copy data include snapshots, clones, replication (local or remote), and backup/recovery (B/R) operations. IDC estimates that the copy data problem will cost IT organizations $50.63 billion by 2018 (worldwide).
One could think that copies are bad for organizations and just lead to data sprawl and waste, so the obvious question is: why don’t we just eliminate all those copies? The answer is straightforward: copies are important and needed for many critical processes in any organization. For example, how can you develop the next generation of your product without a copy of your production environment as the baseline for the next version? In fact, there are significant benefits to using even more copy data in some use cases. However, legacy and inefficient copy management practices have resulted in substantial waste and a financial burden on IT (did I mention $50.63B?).
IOUG surveyed 300 database managers and professionals about which activities take up most of their time each week. The results are somewhat surprising: Figure 1 shows that 30% spend a significant amount of their time creating copies. And it does not end there: test and QA tasks are also done on non-production copies, and patches are first tested on non-production environments.
Figure 1 – Database Management activities taking up most time each week (source: Unisphere research, efficiency isn’t enough: data centers lead the drive to innovation. 2014 IOUG survey)
What are those copies? What use cases do they support, and what problems exist today? They can be categorized under four main areas:
- Innovation (testing or development)
- Analytics and decision making (run ETL from a copy rather than production)
- IT operations (such as pre-production simulation and patch management)
- Data Protection.
Before seeing the research results, I assumed that data protection would be the leading use case for copies. I was wrong: based on the research, no single use case dominates capacity consumption.
Figure 2 – Raw Capacity Deployed by Workload (Source IDC Survey)
Another interesting data point is the technology used to create copies: per the research results, 53% of respondents used custom-written scripts.
Figure 3 – What tools are used to create secondary copies (source IDC Survey)
The copy data management challenges directly impact critical business processes and therefore have a direct impact on the cost, revenue, agility and competitiveness of any organization. But the big question is: by how much? The IDC research set out to quantify the size of the problem. Some highlights of the research:
- 78% of organizations manage 200+ instances of Oracle and SQL Server databases. The mean response for the survey was 346.43 database instances.
- 82% of organizations make more than 10 copies of each instance. In fact, the mean was 14.88 copies of each instance.
- 71% of the organizations surveyed responded that it takes half a day or more to create or refresh a copy.
- 32% of the organizations refresh environments every few days, whereas 42% of the organizations refresh environments every week.
Based on the research results, on average a staggering 20,619 hours are spent every week by the various teams waiting for instance refreshes. Even a conservative estimate that counts only 25% of instances yields more than 5,000 hours, or 208 days, of operational waiting or waste.
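The source does not say how the 20,619-hour figure was derived, but the survey means quoted above happen to reproduce it under two assumptions that are mine, not IDC's: every copy is refreshed once a week, and each refresh costs four hours of waiting (half of an eight-hour working day). A quick sketch of that arithmetic:

```python
# Survey means quoted in the highlights above.
MEAN_INSTANCES = 346.43           # database instances per organization
MEAN_COPIES_PER_INSTANCE = 14.88  # copies of each instance

# Assumptions (not stated in the source): one refresh per copy per week,
# and four hours of waiting per refresh ("half a day" of an 8-hour day).
HOURS_PER_REFRESH = 4
REFRESHES_PER_WEEK = 1


def weekly_wait_hours(instances, copies_per_instance,
                      hours_per_refresh=HOURS_PER_REFRESH,
                      refreshes_per_week=REFRESHES_PER_WEEK):
    """Total hours per week spent waiting for copy refreshes."""
    return (instances * copies_per_instance
            * hours_per_refresh * refreshes_per_week)


total_hours = weekly_wait_hours(MEAN_INSTANCES, MEAN_COPIES_PER_INSTANCE)
conservative_hours = 0.25 * total_hours  # count only 25% of instances
conservative_days = conservative_hours / 24

print(f"all instances:    {total_hours:,.0f} hours/week")
print(f"25% of instances: {conservative_hours:,.0f} hours/week "
      f"(about {conservative_days:.0f} days)")
```

Under these assumptions the total lands within an hour of the 20,619 quoted, and the 25% case comes to a little over 5,150 hours. The 208 days in the text matches the rounded 5,000 hours divided by 24.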
The research is available for everyone, and you can view it here
These results are very clear: there is a very large ROI (more than 50%) that can be realized, and probably by many organizations, since 78% of the companies manage 200+ database instances and, as the research shows, the process today is wasteful and inefficient.
The Copy Data Management Challenges
It is important to understand why legacy infrastructure and improper copy data management processes have fostered the need for copy data management solutions. The need for efficient infrastructure is driven by extensive storage silos, sprawl, expensive storage and inefficient copy creation technologies. The need for efficient copy data management processes is driven by increased wait times for copies to be provisioned, low productivity and demands for self-service.
Legacy storage systems were not designed to support true mixed workload consolidation and require significant performance tuning to guarantee application SLAs. Thus, storage administrators have been conditioned to overprovision storage capacity and create dedicated storage silos for each use case and/or workload.
Furthermore, DBAs often use their own copy technologies. It is very common for DBAs to ask storage administrators to provision capacity and then use their native tools to create a database copy. One common practice is to use RMAN in Oracle to restore a copy from a backup.
Copy technologies such as legacy snapshots do not provide a solution. Snapshots are space efficient compared to full copies; however, in many cases copies created using snapshot technology under-perform, impact production SLAs, take too long to create or refresh, have limited scale, lack modern, efficient data reduction and are complex to manage and schedule.
Because of performance and SLA requirements, storage admins are forced to use full copies and clones, but these approaches result in an increase in storage sprawl as capacity is allocated upfront and each copy consumes the full size of its source. To save on capacity costs, these types of copies are created on a lower tiered storage system or lower performing media.
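To make the sprawl concrete, here is a rough sizing sketch. The 2 TB source size and the 5% per-copy change rate are hypothetical values chosen for illustration; only the copy count is loosely based on the survey mean of 14.88 copies per instance.

```python
def full_copy_capacity_tb(source_tb, copies):
    """Source plus full clones: every copy consumes the full source size."""
    return source_tb * (1 + copies)


def space_efficient_capacity_tb(source_tb, copies, changed_fraction):
    """Source plus space-efficient copies that store only changed data."""
    return source_tb * (1 + copies * changed_fraction)


SOURCE_TB = 2.0          # hypothetical database size
COPIES = 15              # survey mean of 14.88, rounded up
CHANGED_FRACTION = 0.05  # hypothetical share of data each copy modifies

full = full_copy_capacity_tb(SOURCE_TB, COPIES)
efficient = space_efficient_capacity_tb(SOURCE_TB, COPIES, CHANGED_FRACTION)

print(f"full copies:            {full:.1f} TB")
print(f"space-efficient copies: {efficient:.1f} TB")
```

The gap between the two numbers is what pushes administrators toward cheaper tiers for full copies, at the price of performance.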
External appliances for copy data management lead to incremental cost and they still require a storage system to store copies. They may offer some remedy; however, they introduce more complexity in terms of additional management overhead and require substantial capacity and performance from the underlying storage system.
Due to the decentralized nature of application self-service and the multitude of applications distributed throughout organizations within a single business, a need for copy data management has developed to provide oversight into copy data processes across the data center and ensure compliance with business or regulatory objectives.
The Dell-EMC XtremIO’s integrated Copy Data Management approach
As IT leaders, how can we deliver the services needed to support efficiency, cost savings and agility in our organizations? How can copy data management be addressed in a better way? This is how Dell-EMC can help you resolve the copy data problem at its source.
Dell-EMC XtremIO pioneered the concept of integrated Copy Data Management (iCDM). The concept behind iCDM is to provide nearly unlimited virtual copies of data sets, particularly databases, on a scale-out All-Flash array, with a self-service option that lets DBAs and application owners consume copies on demand. iCDM is built on XtremIO’s scale-out architecture, XtremIO’s unique virtual copy technology, and the application integration and orchestration layer provided by Dell-EMC AppSync.
Figure 4 – XtremIO’s integrated Copy Data Management stack
An XtremIO Virtual Copy (XVC) used with iCDM is not a physical copy but rather a logical view of the data at a specific point in time (like a snapshot). Unlike a snapshot, XVC is efficient in both metadata and physical capacity (through dedup and compression) and does not impact production SLAs. Like physical copies, XVC provides performance equal to production, but unlike physical copies, which may take a long time to create, an XVC can be created immediately. Moreover, data refreshes can be triggered as often as desired, in any direction or hierarchy, enabling flexible and powerful data movement between data sets.
The ability to provide consistent and predictable performance on a platform that can scale out is a mandatory requirement. Once you have efficient copy services with unlimited copies, you will want to consolidate more workloads. As you consolidate more workloads into a single array, more performance may be needed, and you need to be able to add more performance to your array.
We live in a world where copies have consumers; in our case they are the DBAs and application owners. As you modernize your business, you want to empower them to create and consume copies when they need them. This is where Dell-EMC AppSync can provide the orchestration and automation for creating application copies.
iCDM is a game changer and its impact on the IT organization is tremendous: XtremIO iCDM enables significant cost savings, provides more copies when needed and supports future growth. Copies can be refreshed on demand; they are efficient and high performing, and carry no SLA risk. As a result, iCDM enables DBAs and application owners to accelerate their development time and trim up to 60% off the testing process, have more test beds and improve product quality. Similarly, analytical databases can be updated frequently so that analysis is always performed on current data rather than stale data.
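The general idea of a virtual copy as a logical view over shared, deduplicated blocks can be illustrated with a toy model. This is only a conceptual sketch and not XtremIO's implementation: the `BlockStore` and `Volume` classes and their methods are hypothetical names invented for this example, and real arrays do not duplicate the whole mapping table the way this dictionary copy does.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block data

    def put(self, data: bytes) -> str:
        fingerprint = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(fingerprint, data)
        return fingerprint

    def get(self, fingerprint: str) -> bytes:
        return self.blocks[fingerprint]


class Volume:
    """A volume is only metadata: logical address -> block fingerprint."""

    def __init__(self, store, block_map=None):
        self.store = store
        self.block_map = dict(block_map or {})

    def write(self, lba: int, data: bytes) -> None:
        self.block_map[lba] = self.store.put(data)

    def read(self, lba: int) -> bytes:
        return self.store.get(self.block_map[lba])

    def virtual_copy(self) -> "Volume":
        # Copies the mapping only; no data blocks are duplicated.
        return Volume(self.store, self.block_map)

    def refresh_from(self, other: "Volume") -> None:
        # Works in any direction: prod to copy, copy to copy, copy to prod.
        self.block_map = dict(other.block_map)


store = BlockStore()
prod = Volume(store)
for lba in range(4):
    prod.write(lba, b"row-%d" % lba)

dev = prod.virtual_copy()        # metadata only
blocks_after_copy = len(store.blocks)

dev.write(0, b"changed in dev")  # diverges without touching production
blocks_after_write = len(store.blocks)

dev.refresh_from(prod)           # dev sees current production data again
```

In this model creating a copy adds no data blocks, a write to the copy adds exactly one new block, and production is never modified by activity on the copy. The toy never reclaims unreferenced blocks; a real system would.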
Figure 5 – Accelerate database development projects with XtremIO iCDM
More information on XtremIO’s iCDM can be found here.
As a bonus, I included a short checklist to help you choose your All-Flash array and copy data management solution:
|Is your CDM based on an All-Flash array?||
|Can you have copies and production on the same array without SLA risks?||
|Is your CDM solution future proof? Can you scale out and add more performance and capacity when needed? Can you get a scalable number of copies?||
|Can your CDM immediately refresh copies from production or any other source? Can it refresh in any direction (prod to copy, copy to copy or copy to prod)?||
|Can your copies have the same performance characteristics as production?||
|Do your copies get data services like production, including compression and deduplication?||
|Can you get application integration and automation?||
|Can your DBAs and application owners get self-service options for application copy creation?||
XtremIO iCDM is the most effective copy data management option available today: it enables better workflows, reduces risks, eliminates costs and ensures SLA compliance. The benefits extend to all stakeholders, who can now perform their work more efficiently with better results. Those results show up as reduced waste and cost alongside better services, improved business workflows and greater productivity.