Recently SAP added the feature „Native Storage Extensions“ to the Hana database. This allows to store data on disk. It is using memory to cache the database blocks based on the least-recently-used basis. That is exactly what classic relational database have been doing for the past 30 years! What does that mean for the in-memory story and when should this feature be used?
In order to answer these questions, I would like to dive into a couple of Hana details and bust some myths about Hana.
Rows for OLTP, Columns for OLAP
A common and unchallenged statement is that row storage is more efficient at reading single rows and column storage than at reading aggregating columns. Even Hasso Plattner hinted in that direction in his Sapphire keynote (Minute 22 in the keynote). The argumentation is straight forward: In a row storage all data of a row is next to each other, hence reading one row is fast. Sounds obvious, but it ignores how RAM works.
A spread sheet might be a good analogy. Reading row number five with all ten cells is obviously the fastest for humans. This is how we read, and this is how hard drives work as well. A hard drive moves the disk head to cylinder five, then waits for a full disk revolution and all data on that cylinder is read on the fly.
However, memory is different. An address is sent to the memory module and it responds with the data stored at that address. There is no advantage requesting nearby addresses compared to data on opposite sides. (In fact, larger servers have multiple memory controllers and hence can access different RAM modules independently from each other.) The spread sheet analogy would be asking “What is in cell A5? And in cell B5?” In that scenario, there is no benefit reading nearby cells.
The same argument is true when considering SSDs instead of hard drives. Data is requested and a large 4k block of data is returned. All these devices are called block devices for a reason. In contrast, the response of a RAM-read is 16 bytes small.
Therefore, the statement that row-reading is fast when data is nearby – meaning inside the same 4k block – applies to devices providing large 4k blocks. For memory, it (almost) does not matter; to read 4k of data, the RAM must be accessed 256 times anyhow.
With that, the myth of column orientation not being suited for full rows access is busted – under the condition of in-memory access. However, it has no advantage according to this argumentation.
Data compression is another feature every database vendor provides. But while it is an integral part of the Hana database, it is rarely enabled in classic databases. The reason is simple: Within a column, the data can be compressed much better because of the similarity of the values. A good example is the material master containing diverse fields like material number, text, color, type, size, price. It is much easier to compress all color values compared to all the diverse values within a single row.
By using a bitmap index approach, Hana is optimizing this functionality even further. For each distinct value it creates a string of bits where a 1 means that the row at this exact position has this exact value. For example, the bitmap string for the size = M might look like 0100‘0000‘0000‘1000‘… and tells that row #2 and #13 have a ‘M’ in the size column. This can be compressed and processed extremely efficiently, of course. Even trivial and therefore fast compression algorithms provide a good compression factor. By supporting different compression algorithms, the database can pick the best suited for each column. One for sparse data, another for data with just a few distinct values and another for unique keys.
This compression plays an important role when it comes to search.
Accessing data happens either via the primary key or via a scan. “Show sales order with the ID #1234” is a primary key access, all databases use a B*Tree index for that. A scan like “Show all sales items for the order #1234” is a different story. In a classic relational database, this requires an explicitly created index. Such index is a tree, and under the node for #1234 all row positions are listed. Without such index the only option for the database is to read every single record, compare its value and throw away the non-matching rows.
In Hana, finding the row positions is trivial thanks to the way the compression works. No additional indexes are needed. The database will pick the bitmap string for the value #1234, find that all values are zero except a small cluster and it reads the rows at those offsets.
It’s the combination!
Hana is not the first database to use column orientation, in-memory, bitmap indexes and compression. But even today it is the only database combining the power of these technologies in a way their individual advantages work together and the disadvantages of one feature is compensated by the other. Memory is fast but expensive – compression to the rescue! Compression is expensive for diverse data – use columnar orientation and the data is not diverse. Columnar access of rows is expensive on block devices – but not in memory.
OLAP plus OLTP
With the above technologies in combination, Hana is well-suited for transactional processing, reading, writing and updates. For analytical cases, columnar storage and bitmap indexes are perfect, no one doubts that. Therefore, Hana enables hybrid scenarios with mixed workloads and as a result brings back operational reporting to where it belongs: the operational system.
But other than SAP tends to imply, OLAP still comes at a cost. If 20 ERP users require 5 CPUs for their work and one OLAP user another 5 CPUs for a single complex query, the server better has 10 CPUs to spare. Therefore, and because the ERP system is about current data and does not contain the full history of all master and transaction data, it is still a good idea to offload the Business Intelligence queries to a data warehouse database. Operational reporting works fine on the ERP database, but Business Intelligence with its expensive adhoc queries does not.
The above comparison assumed that both the classic database and Hana have more RAM than needed. Storing all data in RAM, even if not used, would be a waste of memory, however. SAP added features to Hana to optimize that. It started with LOB, CLOB and NCLOB datatypes to remain on disk and Hana knows the file names only. Nice idea, except the S/4 Hana and ERP systems in general are not using such data types.
The next step was to load data in-memory only once it had been requested. This makes sense, else the startup of a 4TB database would take hours. Much better to start the database right way and load the table partitions as needed. An example might be a finance table. It has one partition per year and the first users will request only the current year’s data. The other 9 years are not required to be loaded in memory – until a first query compares multiple years. Hence a great feature to shorten the database startup time but after a while, all data will still be in memory.
The retention period setting is the first feature that allows to reduce the memory footprint of a running Hana database. It is meant for above cases where data was loaded into memory for one query and will not be touched again for a long time. By setting a retention period, these unused partitions are evicted from memory after a while. Note: This setting is off by default!
All these features make sense but do not solve the core problem: Partitions tend to be large. Just because a user requested a single record, the entire partition with e.g. 1GB in size is loaded into memory. This is not very efficient, as the loading will take a few seconds. Also, if the list of potentially used partitions is 1.1TB overall and the system has 1TB of memory only, it does not help, either.
To automate the administration of such occasionally used data, the recently added Hana Native Storage Extension comes to the rescue. The data is on disk and an upper limit of memory is defined. This memory pool is used to load pages of data in RAM instead of entire partitions. Multiple advantages here: pages instead of entire partitions; a memory limit is defined directly instead of using a retention time.
As a result, instead of specifying all tables that are rarely used and hoping they get evicted fast enough to stay below the physical memory limit, the administrator can mark all table partitions that might be used occasionally. The database handles the memory limits.
And that is the point. The hot data, the data that is used constantly, must stay in the Hana in-memory engine to get all the in-memory benefits. The Native Storage Extension feature is a comfortable way to access warm(!) data and allows a broader definition of which data is considered warm instead of hot. To use the prior example, of 10 years of finance data, only two years would be considered hot data.
The more data is considered warm and accessed more often, the more the database works like a classic relational database, but this time the downsides of the columnar storage and disk are felt. To extend the database with warm data, it is the perfect solution.
One older feature for warm data storage in Hana is Dynamic Tiering. It does utilize a Sybase IQ database as warm storage under the covers. And a more generic option is to use Smart Data Access – Hana’s Data Federation functionality – which can integrate pretty much every remote database as warm storage, even Hadoop, into Hana.
Which of the three to pick is not that easy, unfortunately. Dynamic Tiering and Smart Data Access are using external databases and SQL commands for access. Single row reads commonly used in transactional systems are okay but complex analytical queries not so much. Then there are the barely hidden differences. Does Sybase IQ or the remote database support the same datatypes with the same precisions as Hana? The same SQL statements and functions? Unlikely. The standard functionality is standardized for all SQL databases, that means that users will have no problems here.
Hana Native Storage Extensions is reusing Hana’s own logic for how to persist data on disk and therefore has none of these limitations – a good starting point.