Speculations, Misunderstandings And Myths About Hana Must Be Put to Rest. [Shutterstock: 237716635, Tashatuvango]

Misunderstandings in the Context of SAP’s Hana

Since SAP put Hana on general release in 2011, much speculation has surrounded its technology, architecture and usability. Because development moves ahead so quickly, statements, documentation and other information are rapidly superseded. In this article, the author would like to confront E-3 readers with some of these myths and ask: would you have known?

‘Start-up, until the SAP system is available, can take 30 to 60 minutes, because all data has to be loaded into main memory!’

Yes, loading all data into Hana’s main memory does take some time, but this is no different with AnyDB, which also needs time to fill its buffers.

There, this usually happens when the data is first accessed, and the data then stays in the buffer until the LRU (‘least recently used’) algorithm kicks in and displaces it.

At each start, Hana loads the complete row store into RAM. After that, the system is available straight away.

The test system is a BW on Hana on IBM Power. The database size is 40 GB, the row store is 6 GB, the start process takes around 60 seconds and the stop process around 75 seconds. In a second run, a 5 GB column table is added, together with an SQL statement marking it for preload.
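
As an illustration, such a preload flag can be set per column table with a statement along these lines (the schema and table names here are placeholders, not taken from the test system):

-- Mark a column table so that it is preloaded into memory after a restart
ALTER TABLE "SAPBWP"."ZBIG_SALES" PRELOAD ALL;
-- The flag can be removed again
ALTER TABLE "SAPBWP"."ZBIG_SALES" PRELOAD NONE;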

Again, the start process took around 60 seconds and the stop process about 75 seconds.

‘Why was the start process not significantly longer although there is more data to be loaded?’

Since SPS7, preloading, together with the reloading of tables, takes place asynchronously, immediately after the start process of the Hana database has completed. The system is therefore available again straight away, without having to wait for the column-oriented tables to be loaded.

If you want to test how long it takes for all tables to be loaded into RAM, this can be done with the script loadAllTables.py.
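
Whether, and to what extent, individual column tables are currently held in memory can also be checked via the monitoring view M_CS_TABLES; a small sketch (the schema name is a placeholder):

-- Show which column tables of a schema are loaded and how much memory they use
SELECT table_name, loaded, memory_size_in_total
FROM M_CS_TABLES
WHERE schema_name = 'SAPBWP'
ORDER BY memory_size_in_total DESC;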

‘With Hana, optimiser statistics are no longer needed; there is no longer any need to schedule collective statistics runs!’

This is partly correct. For the column-oriented tables the statement is accurate: no special collective runs are needed, because the optimiser is very quickly in the picture through the dictionary and the value distribution. For the row store, statistics are generated automatically as soon as they are needed (on the fly).

There is therefore no need to schedule these through collective runs either. It is currently not officially documented how these statistics can be influenced (e.g. sample size, manual statistics runs, etc.).

‘A restore always needs logs for a consistent recovery!’

Wrong: Hana back-ups are based on snapshot technology. A back-up is therefore a completely frozen database state, determined from the log position at the time the back-up is carried out.

The back-up is therefore consistent in itself, without any logs at all. The logs are, of course, needed for a roll-forward, e.g. a point-in-time recovery or recovering to the last possible state before an outage.
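
Purely for illustration, a point-in-time recovery is requested with a statement of this form (the timestamp is a placeholder; in practice the recovery is started via Hana Studio or the recovery tools while the database is shut down):

-- Roll forward to a specific point in time using data back-up plus logs
RECOVER DATABASE UNTIL TIMESTAMP '2014-06-30 12:00:00';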

Regarding the back-up catalog: catalog information is stored much as it is in Oracle (the *.anf file) and is absolutely required for a recovery. The back-up catalog is backed up with every data back-up and log back-up!

This is not a normally readable file. Even without this original file from the back-up, a recovery can still take place (see SAP Note 1812057, reconstruction of the back-up catalog with hdbbackupdiag).

The catalog file is to be found in the back-up location (for backup-to-disk) or in the back-up set of a third-party back-up tool.

The catalog contains all the information required for a recovery, such as which logs are needed for which point in time, or which files belong to which back-up set. If back-ups are physically deleted at hard-disk, VTL or tape level, the back-up catalog nevertheless keeps these now invalid entries.
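
The catalog entries can be inspected directly via the monitoring view M_BACKUP_CATALOG, for example with a query like this (a sketch based on the standard columns of that view):

-- List the most recent back-up catalog entries with their state
SELECT backup_id, entry_type_name, sys_start_time, state_name
FROM M_BACKUP_CATALOG
ORDER BY sys_start_time DESC;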

There is currently no automatic procedure available that cleans this up.
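
A manual clean-up is, however, possible with a statement along these lines (the back-up ID is a placeholder; COMPLETE also removes the corresponding back-up files from disk):

-- Remove all catalog entries older than the given back-up
BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID 1403712000000;
-- Optionally also delete the associated back-up files
BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID 1403712000000 COMPLETE;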

‘What size is the catalog file in the system?’

You can check this for yourself: in Hana Studio, open the back-up editor and have all back-ups (including the log back-ups) displayed.

If this file is larger than 20 MB, you need to keep an eye on housekeeping; as mentioned, the catalog is also backed up with every data and log back-up. That can easily mean more than 200 times per day: 200 x 20 MB x 3 (for a three-system landscape) already amounts to 12,000 MB per day.
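
The size of the backed-up catalog can also be read from the view M_BACKUP_CATALOG_FILES; a rough sketch, assuming the standard columns of that view:

-- Show the size of the ten most recent catalog back-ups
SELECT TOP 10 backup_id, backup_size
FROM M_BACKUP_CATALOG_FILES
WHERE source_type_name = 'catalog'
ORDER BY backup_id DESC;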

‘The result of the sizing report must be doubled!’

The new sizing results of the SAP reports are final and do not need to be doubled again, contrary to what older documentation may still lead you to believe. A BW scale-up solution can serve as an example.

This means that the master and slave nodes are located on one server. SAP’s recommendation for a scale-out approach in the BW environment is a master node, bearing the transactional load, and at least two slave nodes responsible for the reporting.

SAP’s main-memory sizing consists of a static part and a dynamic part. The static part comprises indices as well as column and row data, corresponding to the sum of the data actually used.

The dynamic part comprises temporary data for reporting (OLAP BW queries), the delta merge, and sorting and grouping; in total this corresponds to temporary memory that is released again once the action completes.

An example: the row store, at 53 GB, times 2 corresponds to 106 GB; the master’s column store, at 11 GB, times 2 corresponds to around 21 GB (rounded), plus 67 GB times 2 for the slaves, corresponding to around 135 GB (rounded); the column store thus totals around 156 GB. In addition, around 50 GB for caches and services are required per server.

This ultimately results in a total of roughly 312 GB (106 GB + 156 GB + 50 GB).
