Controlling SAP Hana Data Sprawl

Enterprises running large Hana instances in the cloud are facing a new challenge as their databases continue to grow. Because Hana has a simpler data layout and structure than a more complex legacy database, it was assumed that it would produce less data sprawl and duplication. But does the data stay small?

A key principle of cloud architecture is that services should be both atomic and scalable, allowing capacity to grow and shrink enormously with demand. That capacity is generally delivered by scaling out across many mid-sized compute instances. Hana, however, generally works on the principle of scale-up, moving to ever-larger compute instances to deal with the growth of the data stored within it. Hana’s scale-out capability goes only so far in solving this problem. As such, large Hana instances are something of an anti-pattern on hyperscale cloud.

Hyperscalers have met this challenge by delivering colossal instances of up to 24 TB, a size that today only the very largest enterprises normally need. Of course, when those enterprises grow beyond that limit, they ask for even larger instances. This cannot go on indefinitely. Larger instances are much harder to provide at scale in the cloud, and SAP customers who push the limits face greater availability risks.

While the availability risk of Hana is a cause for concern, the infrastructure bills for large Hana instances are enough on their own to compel customers to get their estate under control. A single 24-terabyte system in the cloud costs around 800,000 USD a year. With most enterprises running multiple production-sized instances, it’s easy to see how cloud spending can get out of control with the unfettered growth of Hana. Some organizations running large Hana instances in the cloud can already see material costs on the horizon, and they are struggling to justify them.
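To make the arithmetic concrete, here is a minimal sketch that multiplies the per-system figure quoted above by a number of production-sized systems. The function name and the count of four systems are assumptions chosen for illustration, not figures from any real landscape.

```python
# Figure quoted in the article: roughly 800,000 USD per year for one 24 TB system.
COST_PER_24TB_SYSTEM_USD = 800_000


def annual_landscape_cost(production_sized_systems: int,
                          cost_per_system_usd: int = COST_PER_24TB_SYSTEM_USD) -> int:
    """Return the yearly infrastructure cost for a number of same-sized systems."""
    if production_sized_systems < 0:
        raise ValueError("system count cannot be negative")
    return production_sized_systems * cost_per_system_usd


# Hypothetical landscape: production, a high-availability replica, a
# disaster-recovery copy and one production-sized test system.
# The count of 4 is an assumption, not a number from the article.
print(annual_landscape_cost(4))  # 3200000
```

Real bills also depend on region, reservation terms and storage, so the result is only an order-of-magnitude guide.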

So, what options are available to SAP customers who want to avoid the inevitable Hana sprawl? There must be a better answer than “just keep growing larger instances.”

Deploy a data management strategy for large Hana

To deal with Hana data sprawl and the infrastructure costs that come with running large instances in the cloud, enterprises should have a clearly defined data management strategy. The earlier organizations put this strategy in place, the better.

The four-point data management strategy outlined below can help organizations effectively manage the growth of their Hana data and successfully keep data sprawl at bay.

Put things in order

The first order of business to halt the massive growth of Hana data is to put your data in order. Not all the data your organization generates needs to exist in SAP for the long term. For example, your active data within the current period needs to be available for audit purposes. But once the data and reporting periods have been closed, the question you and other stakeholders should ask is “why does this need to stay in SAP?” Going through this evaluation process will help you put proper rules in place to check the growth of your Hana database.
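One way to picture such a rule is as a small predicate that answers the "why does this need to stay in SAP?" question for each record. The sketch below is illustrative only: the Document fields, the must_stay_in_sap function and the two-year window are all assumptions, and real retention rules would come from audit and legal requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    fiscal_year: int
    period_closed: bool


def must_stay_in_sap(doc: Document, current_fiscal_year: int,
                     keep_closed_years: int = 2) -> bool:
    """Hypothetical rule: keep open periods, plus recently closed ones."""
    if not doc.period_closed:
        return True  # active data stays available
    # Closed data stays only while it is younger than the retention window.
    return current_fiscal_year - doc.fiscal_year < keep_closed_years


docs = [
    Document("A", 2024, period_closed=False),
    Document("B", 2023, period_closed=True),
    Document("C", 2019, period_closed=True),
]

# Records that fail the rule become candidates for archiving.
archive_candidates = [d.doc_id for d in docs if not must_stay_in_sap(d, 2024)]
print(archive_candidates)  # ['C']
```

Writing the rule down in this form, even informally, gives stakeholders something specific to agree or disagree with.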

Trim the fat by archiving data

After figuring out which data needs to go, the next order of business is data archiving. Data archiving is the process of moving data that is no longer actively used out of production systems and into a separate storage system for long-term retention. By archiving your data, you can trim the fat in your Hana database and ensure your overall cloud environment is lean.

It is worth noting that while data archiving is a technical exercise that has been around for a while, organizations have always struggled to get it going. For that reason, key stakeholders need to fully support the data archiving initiative for it to yield the desired results.

Make Hana smarter with SAP NSE

One of Hana’s strengths is that it keeps everything in memory. On its own, however, that behavior is not conducive to an effective data management strategy. To effectively manage the growth of Hana, organizations need to make it smarter with Hana Native Storage Extension (NSE).

Hana NSE is a general-purpose, built-in, warm data-tiered store in Hana that lets organizations manage less frequently accessed data without fully loading it into memory. The solution integrates disk-based database technology with Hana in-memory to intelligently put into memory only what it thinks you’re going to use.

Hana NSE configuration is based on understanding usage patterns, the age of the data, and the relevance of the data being evaluated. When properly configured, NSE will keep the growth of Hana in check and improve its price-performance ratio.
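The three inputs named above (usage patterns, age and relevance) can be sketched as a simple filter over table statistics. Everything in the example is an assumption for illustration: the TableStats fields, the thresholds, and the schema and table names are hypothetical. The generated statement follows the ALTER TABLE ... PAGE LOADABLE form used for NSE, but it should be checked against the SAP HANA SQL reference for your revision before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    name: str
    size_gb: float
    days_since_last_read: int
    business_critical: bool


def is_warm_candidate(t: TableStats, min_size_gb: float = 50.0,
                      min_idle_days: int = 90) -> bool:
    """Hypothetical rule: large, rarely read, non-critical tables are 'warm'."""
    return (not t.business_critical
            and t.size_gb >= min_size_gb
            and t.days_since_last_read >= min_idle_days)


def page_loadable_statement(schema: str, table: str) -> str:
    # Builds the SQL text only; nothing is executed against a database.
    return f'ALTER TABLE "{schema}"."{table}" PAGE LOADABLE CASCADE'


tables = [
    TableStats("SALES_HISTORY", 400.0, 200, business_critical=False),
    TableStats("OPEN_ORDERS", 120.0, 0, business_critical=True),
    TableStats("CHANGE_LOG", 10.0, 365, business_critical=False),
]

statements = [page_loadable_statement("APP", t.name)
              for t in tables if is_warm_candidate(t)]
print(statements)
```

Only SALES_HISTORY passes all three checks here. OPEN_ORDERS is in active use, and CHANGE_LOG is too small for the move to matter.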

Hana data partitioning

With rules set for which data stays in SAP, data archiving to trim the size of your data, and Hana NSE to keep the growth of your Hana database in check, the final piece of an effective Hana data management strategy is data partitioning. Data partitioning is the process of dividing large tables into smaller pieces so the data can be accessed and managed more easily.

When it comes to managing Hana, data partitioning is all about looking at the biggest tables within the database and deciding which of them are worth splitting. By partitioning large tables in Hana, you can reduce how much of each table has to be loaded into memory and ease the demand on memory. This translates to smaller, more manageable, and more cost-effective Hana systems.
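The memory effect can be illustrated with a toy model of range partitioning by year. The sketch assumes that queries filter on the partitioning column, so that partitions they never touch do not need to be loaded. The function names and the sizes are invented for the example.

```python
from collections import defaultdict


def partition_by_year(rows):
    """Group (fiscal_year, size_mb) rows into one partition per year."""
    partitions = defaultdict(float)
    for year, size_mb in rows:
        partitions[year] += size_mb
    return dict(partitions)


def memory_needed_mb(partitions, years_queried):
    """Sum the sizes of only those partitions that the queries touch."""
    return sum(size for year, size in partitions.items()
               if year in years_queried)


# Hypothetical table contents: (fiscal year, size in MB) per chunk of rows.
rows = [(2021, 300.0), (2022, 350.0), (2023, 400.0),
        (2024, 450.0), (2024, 50.0)]

parts = partition_by_year(rows)
unpartitioned_mb = sum(parts.values())             # whole table in memory
partitioned_mb = memory_needed_mb(parts, {2023, 2024})

print(unpartitioned_mb, partitioned_mb)  # 1550.0 900.0
```

In this made-up case, queries that only touch the two most recent years need 900 MB in memory rather than 1,550 MB.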

Bringing it all together

As SAP data growth accelerates, data sprawl is becoming more of a challenge for enterprises running Hana in the cloud. The four-point strategy above can help prevent and deal with Hana data sprawl and its risks. However, a critical component in making this strategy work is an SAP managed services provider with the skills and expertise to effectively incorporate archiving, Hana NSE, and partitioning into your overall Hana on cloud data management strategy.

About the author

Eamonn O'Neill, Lemongrass