GDPR: A Practical Example For Data Discovery In SAP

Metadata discovery software vendor Silwood Technology has expanded its series of GDPR Starter Packs for major application packages, including SAP, JD Edwards and Siebel. The recent launch of Oracle E-Business Suite and Microsoft Dynamics AX 2012 Starter Packs will enable more users to quickly and simply identify the precise location of Personal Data. Silwood Technology has also provided E-3 Magazine with a working example of data discovery in SAP that can be found below.

The GDPR Starter Packs are underpinned by Safyr, a dedicated metadata discovery tool which enables users to access, understand, share and utilise the underlying data structures of all major application packages. Silwood Technology is urging application package users and its channel partners to ensure that they have, at minimum, started the discovery and documentation of personal data before 25 May 2018.

Nick Porter, founder and technical director at Silwood Technology, has confirmed that interest in the existing Starter Packs is strong and the reasons are twofold: “Firstly, it is highly unlikely that all organisations will have documented Personal Data in leading packages by the 25 May 2018. Therefore, those that are not fully compliant will need to undertake this work later as part of their data protection programme. As a result, there is an urgency now to find tools to find data quickly and accurately. Secondly, GDPR compliance is not a one-time event. When maintaining compliance, we believe that Safyr will be of value in keeping data catalogues, inventories and glossaries up-to-date with the locations of Personal Data across major packaged applications.”

Recent research conducted by Silwood Technology illustrated the scale of the challenge when locating personal data for GDPR compliance. For instance, within a typical Oracle E-Business Suite system, when searching for personal data, users will be faced with approximately 22,000 tables and 570,000 fields. Similarly, Microsoft Dynamics AX 2012 contains 7,000 tables and 100,000 fields.

Data Discovery in SAP

Working out which tables store personal data that needs to be reviewed for GDPR is a challenge in any environment, but particularly so where the system is one or more of the major application packages. The following example describes how to use Safyr to ‘scope’ the potential tables that store ‘relevant’ personal data in an SAP system. In this case we are looking for tables which store ‘Date of Birth’ information. However, the process would work for any data which comes under the general definition of Personal Data. (Article 4, Definitions, paragraph 1 of the Regulation defines the scope of data covered.)

Many SAP systems have been customised, so a generic reference model is of limited use. Safyr is more effective because it extracts metadata from the application as implemented, including customisations.

The screenshot below shows a list of tables from a typical SAP system which has been extracted into Safyr, in this case just short of 100,000 tables, which is around the number in most such systems.

Figure 1: Close to 100,000 tables can be found in this SAP system.

You can now search across all these tables to find any that contain a field whose ‘business name’ includes the string ‘Date of Birth’.

Figure 2: Filtering for ‘Date of Birth’ in Safyr.

This reduces the list of nearly 100,000 tables to just 84. In other words, 84 tables have at least one field whose description contains ‘date of birth’.
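The search described above can be illustrated outside the tool with a minimal Python sketch. It is not Safyr's implementation or API. It assumes the field metadata has already been extracted into simple records, and the `Field` class, the sample rows and the `Z...` table names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Field:
    table: str          # technical table name
    name: str           # technical field name
    business_name: str  # human-readable description of the field


# Illustrative metadata only (an assumption); a real extract would be far larger.
FIELDS = [
    Field("PA0002", "GBDAT", "Date of Birth"),
    Field("PA0002", "NACHN", "Last Name"),
    Field("ZCUST01", "BIRTH_DT", "Customer date of birth"),
    Field("ZORDER", "ORDER_DT", "Order Date"),
]


def tables_with_field_like(fields, text):
    """Return the sorted names of tables that have at least one field
    whose business name contains `text`, ignoring case."""
    needle = text.casefold()
    return sorted({f.table for f in fields if needle in f.business_name.casefold()})


print(tables_with_field_like(FIELDS, "date of birth"))  # ['PA0002', 'ZCUST01']
```

Matching on the description rather than the technical field name is the point of the exercise: technical names in packaged applications rarely reveal what a field holds.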

Figure 3: The search identified 84 tables that contain a field for Date of Birth.

On the right of the list is a Row Count which gives the number of records in each table. Quite a few have zero. This is not unusual in an SAP system, as SAP delivers a full set of features and tables that may or may not actually be used by a given customer. The query on ‘date of birth’ can be refined to filter out any tables with no data, which leaves just 10 tables (this may be very different in another SAP system, depending on which features and modules of SAP the customer uses).
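The row-count refinement is a simple filter. The sketch below assumes the row counts are available as a mapping from table name to record count. The table names and numbers are invented for illustration and do not come from the system shown in the screenshots.

```python
# Hypothetical row counts per candidate table (table name -> number of records).
row_counts = {"PA0002": 1520, "ZCUST01": 0, "ZHR_ARCHIVE": 0, "ZAPPLICANT": 87}


def non_empty(counts):
    """Keep only the tables that actually hold data."""
    return {table: n for table, n in counts.items() if n > 0}


populated = non_empty(row_counts)
print(sorted(populated))  # ['PA0002', 'ZAPPLICANT']
```

An empty table today may be populated later if a new module is switched on, so a filter like this is worth re-running periodically rather than treating the result as final.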

Figure 4: Excluding tables with no data has provided us with ten tables.

Having found a set of tables that contain likely Personal Data attributes, the results can be recorded using what Safyr calls a Subject Area. This is like a folder in which we can group tables, and it can be refined further by identifying individual fields. It’s easy to select the tables and add them to a Subject Area, and there is an option to ‘mark’ those fields that meet the selection criteria used (in this case ‘date of birth’ fields). The result is a group of tables that contain data and have a field with the string ‘date of birth’ in its ‘business name’.

Figure 5: The Subject Area for ‘Date of Birth’ now contains the ten tables we identified previously.

The ‘Marked Fields’ column shows how many fields in each table meet the search criteria. In the example above, table PA0002 has 3 such fields. The table details can be displayed to show the individual fields.
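As a rough model of the idea, a Subject Area can be thought of as a named group of tables, each carrying the list of fields that matched the search. The sketch below is an assumption about how one might represent this in Python, not a description of Safyr's internals. The function name, the dictionary layout and the sample tables are hypothetical.

```python
from collections import defaultdict

# (table, field, business name) tuples; hypothetical sample metadata.
FIELDS = [
    ("ZEMPLOYEE", "BIRTH_DATE", "Date of Birth"),
    ("ZEMPLOYEE", "SPOUSE_DOB", "Spouse date of birth"),
    ("ZEMPLOYEE", "SURNAME", "Last Name"),
    ("ZAPPLICANT", "DOB", "Applicant Date of Birth"),
]


def build_subject_area(name, fields, text):
    """Group tables under a named Subject Area and mark the matching fields."""
    needle = text.casefold()
    marked = defaultdict(list)
    for table, field, business_name in fields:
        if needle in business_name.casefold():
            marked[table].append(field)
    return {"name": name, "marked_fields": dict(marked)}


area = build_subject_area("Date of Birth", FIELDS, "date of birth")
for table, marked in area["marked_fields"].items():
    # The length of each list plays the role of the 'Marked Fields' count.
    print(table, len(marked), marked)
```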

Figure 6: You are able to identify the individual field that contains ‘Date of Birth’.

Two of the 3 fields are visible in Figure 6. We could repeat the same process for other Personal Data fields until we had assembled a set of GDPR Subject Areas that represent all the Personal Data categories that need to be assessed. Safyr then has features for merging these Subject Areas to create a consolidated list of Personal Data items. This brings together the Personal Data fields for each of the categories (Birth, Address, Credit Card Number, ...) into one integrated set. Here is the same table, the HR Master Record shown above, with the merged fields from these categories.

Figure 7: A consolidated list of all relevant fields.
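The merge step amounts to taking the union of the marked fields across categories, table by table. The following sketch shows one way to do that under the assumption that each Subject Area is a mapping from table name to marked fields. The category contents and table names are made up, and `merge_subject_areas` is a hypothetical helper rather than a Safyr feature name.

```python
# Hypothetical Subject Areas: category -> {table -> marked fields}.
areas = {
    "Birth": {"ZEMPLOYEE": ["BIRTH_DATE"]},
    "Address": {"ZEMPLOYEE": ["STREET", "CITY"], "ZCUSTOMER": ["POSTCODE"]},
    "Credit Card Number": {"ZCUSTOMER": ["CARD_NO"]},
}


def merge_subject_areas(areas):
    """Combine per-category Subject Areas into one map of
    table -> sorted list of (field, category) pairs."""
    merged = {}
    for category, tables in areas.items():
        for table, fields in tables.items():
            for field in fields:
                merged.setdefault(table, set()).add((field, category))
    return {table: sorted(entries) for table, entries in sorted(merged.items())}


for table, entries in merge_subject_areas(areas).items():
    print(table, entries)
```

Keeping the category next to each field means the consolidated list still records why a field was flagged, which helps when the list is reviewed later.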

Having identified and marked the Personal Data fields using the method described above, a next step might be to make the attributes for these fields easily available to a wider audience. Safyr can export to a number of formats, one of the most popular for GDPR being Excel. It’s easy to select exactly which properties to include in the spreadsheet.
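To illustrate choosing which properties end up in the spreadsheet, here is a small sketch that writes selected columns to CSV, a format that spreadsheet applications can open. It stands in for the tool's own export and does not reproduce it. The row layout, column names and `export_csv` helper are assumptions made for the example.

```python
import csv
import io

# Hypothetical consolidated list of marked Personal Data fields.
rows = [
    {"table": "ZEMPLOYEE", "field": "BIRTH_DATE",
     "business_name": "Date of Birth", "category": "Birth"},
    {"table": "ZCUSTOMER", "field": "POSTCODE",
     "business_name": "Postal Code", "category": "Address"},
]


def export_csv(rows, columns):
    """Return CSV text containing only the chosen columns, in the given order."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any properties that were not selected.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(export_csv(rows, ["table", "field", "category"]))
```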

Identifying candidate Personal Data attributes is but one step of any GDPR strategy. In the case of large application packages like SAP it can be a very challenging first step. And a final thought: unlike Y2K, GDPR is not a one-time job. Each organisation has a responsibility to monitor its storage of personal data on an ongoing basis.

A free trial of Safyr is available from the link below.

Silwood Technology

About the author

E-3 Magazine

Articles published through E-3 Magazine International. This includes press releases by our partners as well as articles and reports from the E-3 team of journalists.
