Top 8 Trends in Data Storage
The new calendar year has gotten off to a fast start, and without fail, our customers continue to work on improving their data strategies. In a world of cyber threats, customers are heightening their focus on protecting their data, data sovereignty, data locality, data tiering and density. Explosive data growth and data placement remain challenges for decision-makers in an increasingly digital world. How does an organization accommodate data spread across a private data center, a colocation facility and a public cloud?
The last two years have introduced change at a pace the industry has never seen. Some customers simply survived the pandemic, while others thrived. Most organizations expanded their IT footprint with new investments in areas supporting a virtual workforce like video collaboration and virtual desktop deployments. Each of these investments generates substantially more data.
In this article, we will be diving into data storage industry trends, leveraging insights we've uncovered from working in and around global service providers, global financials, healthcare and enterprise accounts; testing these technologies in our Advanced Technology Center (ATC); and working with our OEM partners, ranging from industry stalwarts to stealth startups.
Data conversations typically start with the use case and applications that will leverage the storage subsystem. Primary use cases for shared storage still lean toward databases, high-performance applications such as those for data science, enterprise resource planning (ERP) solutions and large-scale hypervisors like VMware. These continue to make up the most significant amounts of primary storage consumption inside the four walls of a traditional data center.
Companies must plan for storage needs in various locations with unpredictable timelines. For example, applications that reside on-premises today may or may not move to the public cloud. Some organizations also made storage decisions based on short-term supply chain constraints; those choices may be suboptimal for the application and require remediation as product availability improves.
As things normalize in 2022 and cloud strategies come into focus, storage groups must work again to customize storage to the application, ensuring optimal performance, uptime, resiliency and, ultimately, a great end-user experience.
The new IT world is hybrid and complex, with new disciplines like CloudOps and FinOps emerging to optimize data storage across clouds. Public cloud providers like AWS, Azure and Google continue to expand, and we see independent software vendors (ISVs) investing heavily to provide data services in the public cloud.
As customers begin their journey to the cloud, we see traditional storage teams expanding their knowledge and contributions to applications in the public cloud. Organizations will consider extending traditional ISV data management solutions to the public cloud rather than relying on cloud-native services. How a customer uses storage solutions on-premises can and should be part of the decision-making process on how to plan for a shift to ISV or cloud-native deployments for storage in a public cloud.
In addition to technical considerations, organizations must also understand how various IT and cloud groups interact. We saw dedicated cloud teams formed early in many organizations to move quickly to public cloud; now those organizations must work to ensure these teams align closely with traditional IT organizations to build out an integrated and comprehensive data strategy. Cloud strategy should not be an "either/or" conversation; it must be an "and" conversation.
2016 was the year most storage manufacturers began delivering solutions with 100% solid-state flash drives. This was the "year of all flash," and the array manufacturers were racing to hit 100,000 input/output operations per second (IOPS) at sub-millisecond (ms) response times in what's called a "three-tier stack," meaning separate compute resources and switching fabric (usually Fibre Channel at this stage of the game), coupled with an array-based storage platform.
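As a rough illustration of how a throughput target and a response-time target relate, Little's Law (a general queueing result, not something specific to any array) says the average number of I/Os in flight equals throughput multiplied by latency. The sketch below applies it in Python; the function names and the sample queue depth are assumptions chosen for illustration.

```python
def outstanding_ios(iops: float, latency_ms: float) -> float:
    """Average I/Os in flight needed to sustain `iops` at `latency_ms` (Little's Law)."""
    return iops * latency_ms / 1000.0


def achievable_iops(queue_depth: float, latency_ms: float) -> float:
    """Upper bound on IOPS when at most `queue_depth` I/Os are in flight."""
    return queue_depth * 1000.0 / latency_ms


if __name__ == "__main__":
    # 100,000 IOPS at a 1 ms response time needs about 100 I/Os in flight.
    print(outstanding_ios(100_000, 1.0))
    # A hypothetical host limited to 32 in-flight I/Os at 0.5 ms per I/O.
    print(achievable_iops(32, 0.5))
```

The point of the arithmetic is that a headline IOPS figure only materializes when hosts keep enough I/O in flight; a lightly loaded application may see the low latency without ever approaching the rated throughput.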
The introduction of NAND flash into mainstream storage array products has since been followed by further advances in storage media: "storage class memory" (SCM), which improves response times, and quad-level cell (QLC) flash, which helps drive down costs relative to the triple-level cell (TLC) products still in heavy use today.
Additionally, until the last few years, all these very modern storage types have ridden atop a Small Computer System Interface (SCSI) stack, most recently via Serial Attached SCSI (SAS), that was designed for spinning media starting in the 1980s. This fabric must evolve as well so that the compute stack integrates more efficiently with storage. Non-Volatile Memory Express (NVMe) and NVMe over Fabrics (NVMe-oF) have been introduced in recent years to help solve that problem. Now that server operating systems are being updated with native NVMe-oF capabilities for transports such as FC, TCP and RoCEv2, we anticipate growing adoption of new storage fabric technologies as customers look to get more out of their IT investments.
Some customers are looking for a cloudlike experience but have no interest in putting their data into a public cloud provider, while others have a cloud-first/cloud-native mentality. Whether your organization's crown jewels live in an Epic solution for healthcare, a database for an enterprise resource planning (ERP) system or the data sets behind machine learning (ML) initiatives, we are seeing a shift from traditional capital expenditure (CAPEX) spending on IT infrastructure to a "pay by the drip" operational expense (OPEX) model. In the data storage space, this is referred to as storage as a service (STaaS).
These models vary greatly from OEM to OEM, and it's important to understand the nuances of each; they include but are not limited to:
- Dell APEX
- NetApp Keystone
- Pure as a Service
- HPE GreenLake
- Hitachi EverFlex
- IBM Storage as a Service
Automation continues to be top of mind for our customers. The most successful organizations are selective and targeted with automation initiatives, prioritizing areas that accelerate storage adoption, increase reliability and reduce operational costs.
Many dive directly into picking tools and engineering a solution, but it's important to first understand the drivers behind automation and map out a comprehensive approach inclusive of people, process and technology to maximize the impact.
There are common patterns we see within organizations that have successful automation programs:
- Culture of collaboration
- Aligning teams around a "platform" concept
- Culture of learning
Collaboration ensures mutual understanding among the teams, both within the storage discipline and beyond, leading to a superior outcome for data consumers. This collaboration typically leads to an evolution within IT around a platform concept. Moving to a platform concept forces teams to think differently. Instead of focusing on silos, platform teams gravitate toward learning more about and understanding consumers of data and work to create solutions that help those teams operate with more autonomy.
While technology remains important, understanding desired outcomes and actively connecting people, process and technology to those outcomes is paramount.
Throughout the last 24 months, many organizations had to move very quickly to meet the new demands of remote work at the edge. As if keeping a remote workforce up and running weren't challenging enough, IT organizations also had to devise new ways of protecting the data generated at the edge while keeping that data secure. The downstream data impacts to an IT organization come full circle when designing a solution to maintain workforce productivity. In addition to a user's persistent data generated at the edge, the mere existence of a virtual desktop to perform job functions increases the need for performant primary storage solutions that can accommodate flexible working hours and parallel workstreams for many users accessing the storage at once.
In the world we live in today, the threat of a cyberattack is very real. Ransomware attacks are at an all-time high, and organizations must proactively protect themselves from bad actors. Storage providers often present their solution's immutability features as a one-stop shop for solving the ransomware problem, but in reality, immutability is just one piece of the puzzle.
There is no silver bullet. Enterprises must pursue a comprehensive strategy spanning the business, application, enterprise architecture, security and IT teams to mitigate risk and improve overall data security. Most organizations will pursue a turnkey cyber resilience program in the long term but will start with a cyber recovery initiative focused on a data vault or cyber vault concept.
With the above in mind, there are still things that IT and the storage teams can do to protect data. Understanding and tiering data according to application priority is a good place to start. Taking advantage of encryption and immutability features with critical production data contributes to mitigating exposure as well. And adding backups, data cleansing and local/remote storage replication can contribute to a more robust data protection posture.
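One way to make tiering by application priority concrete is to write the mapping from tier to protection controls down as data. The Python sketch below is only an illustration: the tier names, the `ProtectionPolicy` fields and every value in `POLICIES` are hypothetical, and a real policy would come out of the cross-team strategy described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionPolicy:
    encrypt: bool
    immutable_snapshots: bool
    remote_replication: bool
    backups_per_day: int


# Hypothetical tiers and values, for illustration only.
POLICIES = {
    "critical": ProtectionPolicy(True, True, True, 24),
    "important": ProtectionPolicy(True, True, False, 4),
    "standard": ProtectionPolicy(True, False, False, 1),
}


def policy_for(app_tier: str) -> ProtectionPolicy:
    """Look up the policy for a tier; unknown tiers fail loudly."""
    try:
        return POLICIES[app_tier]
    except KeyError:
        raise ValueError(f"unknown application tier: {app_tier!r}") from None
```

Failing on an unknown tier, rather than falling back to a default, keeps an unclassified application from silently receiving the weakest protection.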
Last, but not least, most organizations are developing cloud-native architectures based on containers but are struggling to provide persistent storage to these ephemeral container environments.
In recent years, development groups were just dipping their toes in the water with Kubernetes (K8s), but adoption has since increased substantially as organizations move toward a DevOps mindset. This accelerated adoption has created the need for containers to communicate with storage subsystems for block and file-level persistence. In the case of Kubernetes, the Container Storage Interface (CSI) lets storage vendors supply drivers that provide persistent volumes to containers without the need to touch the core Kubernetes code.
This is important because applications like databases are not ephemeral: if organizations run them in a container, the data must survive a container restart or rescheduling, or it will be lost or corrupted. CSI implementations range from basic drivers and APIs to full orchestration suites. Most storage manufacturers have written a driver and support CSI now.
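To show what requesting persistent storage looks like from the application side, the sketch below builds a Kubernetes PersistentVolumeClaim manifest in Python and prints it as JSON. The manifest fields are standard Kubernetes ones; the claim name, size and storage class name are hypothetical, and the helper `build_pvc` is an assumption made for illustration.

```python
import json


def build_pvc(name: str, storage_class: str, size: str,
              access_mode: str = "ReadWriteOnce") -> dict:
    """Return a Kubernetes PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            # Hypothetical class name; in a real cluster this would point at
            # a StorageClass backed by the storage vendor's CSI driver.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    pvc = build_pvc("erp-db-data", "vendor-csi-block", "100Gi")
    print(json.dumps(pvc, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML. A pod that mounts the claim keeps its data across container restarts because the volume's lifecycle is tied to the claim, not to the container.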
We covered a lot of ground above, and there are also other areas we're tracking, such as next-generation artificial intelligence, machine learning and associated storage solutions. To help you keep up with the changing storage landscape, WWT can give you insight into key performance indicators for your infrastructure and storage stack, and can help you protect your most valuable IT asset: your data.
We also recommend exploring our Advanced Technology Center (ATC) to gain hands-on experience with the latest technologies and cut your proof-of-concept time from months to weeks. WWT's deep-rooted relationships with major OEMs and our rigorous evaluation of recent technology providers can help streamline decision-making, testing and troubleshooting.
For more information on primary storage, data protection, cyber resilience or any of the topics mentioned within the article, connect with one of our storage industry experts today.