Version: Next

SAP Analytics Cloud

Important Capabilities

Capability	Status	Notes
Descriptions	✅	Enabled by default
Detect Deleted Entities	✅	Enabled via stateful ingestion
Platform Instance	✅	Enabled by default
Schema Metadata	✅	Enabled by default (only for Import Data Models)
Table-Level Lineage	✅	Enabled by default (only for Live Data Models)

Configuration Notes

Create an OAuth client in SAP Analytics Cloud as described in Manage OAuth Clients. The OAuth client must have the following properties:
- Purpose: API Access
- Access:
  - Data Import Service
- Authorization Grant: Client Credentials

Maintain connection mappings (optional):

To map individual connections in SAP Analytics Cloud to platforms, platform instances and environments, the connection_mapping configuration can be used within the recipe:

connection_mapping:
    MY_BW_CONNECTION:
        platform: bw
        platform_instance: PROD_BW
        env: PROD
    MY_HANA_CONNECTION:
        platform: hana
        platform_instance: PROD_HANA
        env: PROD

The key in the connection mapping dictionary represents the name of the connection created in SAP Analytics Cloud.
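
To illustrate what a connection mapping entry influences, the following minimal sketch resolves a connection name to a platform, platform instance and environment, and assembles a DataHub dataset URN from them. The helper `upstream_dataset_urn`, the `ConnectionMapping` dataclass, the fallback behaviour for unmapped connections and the dataset name `SALES.ORDERS` are assumptions made for illustration; the connector's internal logic may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConnectionMapping:
    """Mirrors the three documented keys of one connection_mapping entry."""

    platform: Optional[str] = None
    platform_instance: Optional[str] = None
    env: str = "PROD"  # documented default


def upstream_dataset_urn(
    connection_name: str,
    dataset_name: str,
    mappings: Dict[str, ConnectionMapping],
    default_platform: str,
) -> str:
    """Hypothetical helper: build a dataset URN for an upstream dataset.

    Assumption: an unmapped connection falls back to default_platform,
    no platform instance, and the PROD environment.
    """
    mapping = mappings.get(connection_name, ConnectionMapping())
    platform = mapping.platform or default_platform
    # When a platform instance is set, it is prefixed to the dataset name.
    name = (
        f"{mapping.platform_instance}.{dataset_name}"
        if mapping.platform_instance
        else dataset_name
    )
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{mapping.env})"


# Same entries as in the connection_mapping example above.
mappings = {
    "MY_BW_CONNECTION": ConnectionMapping("bw", "PROD_BW", "PROD"),
    "MY_HANA_CONNECTION": ConnectionMapping("hana", "PROD_HANA", "PROD"),
}

print(upstream_dataset_urn("MY_HANA_CONNECTION", "SALES.ORDERS", mappings, "hana"))
# urn:li:dataset:(urn:li:dataPlatform:hana,PROD_HANA.SALES.ORDERS,PROD)
```

Keeping the connection name as the dictionary key means a renamed connection in SAP Analytics Cloud needs a matching change in the recipe.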

Concept mapping

SAP Analytics Cloud	DataHub
`Story`	`Dashboard`
`Application`	`Dashboard`
`Live Data Model`	`Dataset`
`Import Data Model`	`Dataset`
`Model`	`Dataset`

Limitations

Only models that are used in a Story or an Application are ingested, because the API offers dedicated endpoints only for Stories and Applications, not for models.
Browse Paths for models cannot be created, because the API does not return the folder in which a model is saved.
Schema metadata is ingested only for Import Data Models, because the schema metadata of the other model types cannot be retrieved.
Lineage for Import Data Models cannot be ingested, because the API does not provide any information about it.
Currently, only SAP BW and SAP HANA are supported for ingesting the upstream lineage of Live Data Models. A warning is logged for all other connection types; to request support for one of them, open an issue on GitHub and include the warning message.
For some models (e.g., built-in models), it cannot be detected whether they are Live Data Models or Import Data Models. These models are therefore ingested only with the `Model` subtype.

CLI based Ingestion

Install the Plugin

The sac source works out of the box with acryl-datahub.

Starter Recipe

Check out the following recipe to get started with ingestion! See below for full configuration options.

For general pointers on writing and running a recipe, see our main recipe guide.

source:
    type: sac
    config:
        stateful_ingestion:
            enabled: true

        tenant_url: # Your SAP Analytics Cloud tenant URL, e.g. https://company.eu10.sapanalytics.cloud or https://company.eu10.hcs.cloud.sap
        token_url: # The token URL of your SAP Analytics Cloud tenant, e.g. https://company.eu10.hana.ondemand.com/oauth/token

        # Add secrets in the Secrets tab with matching names for each variable
        client_id: "${SAC_CLIENT_ID}" # Your SAP Analytics Cloud client id
        client_secret: "${SAC_CLIENT_SECRET}" # Your SAP Analytics Cloud client secret

        # ingest stories
        ingest_stories: true

        # ingest applications
        ingest_applications: true

        resource_id_pattern:
            allow:
                - .*

        resource_name_pattern:
            allow:
                - .*

        folder_pattern:
            allow:
                - .*

        connection_mapping:
            MY_BW_CONNECTION:
                platform: bw
                platform_instance: PROD_BW
                env: PROD
            MY_HANA_CONNECTION:
                platform: hana
                platform_instance: PROD_HANA
                env: PROD
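
The `token_url`, `client_id` and `client_secret` settings correspond to the Client Credentials grant configured on the OAuth client. As a rough sketch of what such a token request looks like under the generic OAuth 2.0 flow, the snippet below prepares a request without sending it. The function name `build_token_request` and the placeholder credentials are hypothetical, and sending the credentials via HTTP Basic authentication is an assumption about the endpoint; the connector itself may handle authentication differently.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    token_url: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Prepare (but do not send) an OAuth 2.0 client-credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    # The grant type is sent as a form-encoded request body.
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            # Assumption: client authentication via HTTP Basic.
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


request = build_token_request(
    "https://company.eu10.hana.ondemand.com/oauth/token",  # example token_url from the recipe
    "my-client-id",  # placeholder, not a real client id
    "my-client-secret",  # placeholder, not a real secret
)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; a successful response typically carries a JSON body with an `access_token` field.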

Config Details

Options
Schema

Note that a . is used to denote nested fields in the YAML recipe. Fields marked with ✅ are required.

Field	Description
client_id ✅ string	Client ID for the OAuth authentication
client_secret ✅ string(password)	Client secret for the OAuth authentication
tenant_url ✅ string	URL of the SAP Analytics Cloud tenant
token_url ✅ string	URL of the OAuth token endpoint of the SAP Analytics Cloud tenant
incremental_lineage boolean	When enabled, emits lineage as incremental to existing lineage already in DataHub. When disabled, re-states lineage on each run. Default: False
ingest_applications boolean	Controls whether Analytic Applications should be ingested. Default: True
ingest_import_data_model_schema_metadata boolean	Controls whether schema metadata of Import Data Models should be ingested (ingesting schema metadata of Import Data Models significantly increases overall ingestion time). Default: True
ingest_stories boolean	Controls whether Stories should be ingested. Default: True
platform_instance string	The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://datahubproject.io/docs/platform-instances/ for more details.
query_name_template string	Template for generating dataset urns of consumed queries. The placeholder {name} can be used within the template to insert the name of the query. Default: QUERY/{name}
env string	The environment that all assets produced by this connector belong to. Default: PROD
connection_mapping map(str,ConnectionMappingConfig)	Custom mappings for connections. Default: {}
connection_mapping.`key`.env string	The environment that this connection mapping belongs to Default: PROD
connection_mapping.`key`.platform string	The platform that this connection mapping belongs to
connection_mapping.`key`.platform_instance string	The instance of the platform that this connection mapping belongs to
folder_pattern AllowDenyPattern	Patterns for selecting folders that are to be included Default: {'allow': ['.*'], 'deny': [], 'ignoreCase': True}
folder_pattern.ignoreCase boolean	Whether to ignore case sensitivity during pattern matching. Default: True
folder_pattern.allow array	List of regex patterns to include in ingestion Default: ['.*']
folder_pattern.allow.string string
folder_pattern.deny array	List of regex patterns to exclude from ingestion. Default: []
folder_pattern.deny.string string
resource_id_pattern AllowDenyPattern	Patterns for selecting resource ids that are to be included Default: {'allow': ['.*'], 'deny': [], 'ignoreCase': True}
resource_id_pattern.ignoreCase boolean	Whether to ignore case sensitivity during pattern matching. Default: True
resource_id_pattern.allow array	List of regex patterns to include in ingestion Default: ['.*']
resource_id_pattern.allow.string string
resource_id_pattern.deny array	List of regex patterns to exclude from ingestion. Default: []
resource_id_pattern.deny.string string
resource_name_pattern AllowDenyPattern	Patterns for selecting resource names that are to be included Default: {'allow': ['.*'], 'deny': [], 'ignoreCase': True}
resource_name_pattern.ignoreCase boolean	Whether to ignore case sensitivity during pattern matching. Default: True
resource_name_pattern.allow array	List of regex patterns to include in ingestion Default: ['.*']
resource_name_pattern.allow.string string
resource_name_pattern.deny array	List of regex patterns to exclude from ingestion. Default: []
resource_name_pattern.deny.string string
stateful_ingestion StatefulStaleMetadataRemovalConfig	Stateful ingestion related configs
stateful_ingestion.enabled boolean	Whether or not to enable stateful ingestion. Default: False, unless a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, in which case it defaults to True
stateful_ingestion.remove_stale_metadata boolean	Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled. Default: True
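
The three pattern options (`folder_pattern`, `resource_id_pattern`, `resource_name_pattern`) share the same allow/deny/ignoreCase shape. The sketch below shows one plausible reading of these semantics: a value is kept when no deny pattern matches and at least one allow pattern matches. The function `is_allowed`, the use of `re.match` (which anchors at the start of the string) and the folder names are assumptions for illustration, not the library's actual implementation.

```python
import re
from typing import Sequence


def is_allowed(
    value: str,
    allow: Sequence[str] = (".*",),
    deny: Sequence[str] = (),
    ignore_case: bool = True,
) -> bool:
    """Return True if value passes the allow/deny filter (deny wins)."""
    flags = re.IGNORECASE if ignore_case else 0
    # Any matching deny pattern excludes the value outright.
    if any(re.match(pattern, value, flags) for pattern in deny):
        return False
    # Otherwise at least one allow pattern has to match.
    return any(re.match(pattern, value, flags) for pattern in allow)


# Hypothetical folder names.
folders = ["Public/Finance", "Public/Sandbox", "private/finance"]
selected = [
    folder
    for folder in folders
    if is_allowed(folder, allow=[r".*finance.*"], deny=[r".*sandbox.*"])
]
print(selected)  # ['Public/Finance', 'private/finance']
```

Because `re.match` only matches from the beginning of the string, a pattern such as `finance` without a leading `.*` would not select `Public/Finance` in this sketch.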

The JSONSchema for this configuration is inlined below.

{
  "title": "SACSourceConfig",
  "description": "Base configuration class for stateful ingestion for source configs to inherit from.",
  "type": "object",
  "properties": {
    "incremental_lineage": {
      "title": "Incremental Lineage",
      "description": "When enabled, emits lineage as incremental to existing lineage already in DataHub. When disabled, re-states lineage on each run.",
      "default": false,
      "type": "boolean"
    },
    "env": {
      "title": "Env",
      "description": "The environment that all assets produced by this connector belong to",
      "default": "PROD",
      "type": "string"
    },
    "platform_instance": {
      "title": "Platform Instance",
      "description": "The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://datahubproject.io/docs/platform-instances/ for more details.",
      "type": "string"
    },
    "stateful_ingestion": {
      "title": "Stateful Ingestion",
      "description": "Stateful ingestion related configs",
      "allOf": [
        {
          "$ref": "#/definitions/StatefulStaleMetadataRemovalConfig"
        }
      ]
    },
    "tenant_url": {
      "title": "Tenant Url",
      "description": "URL of the SAP Analytics Cloud tenant",
      "type": "string"
    },
    "token_url": {
      "title": "Token Url",
      "description": "URL of the OAuth token endpoint of the SAP Analytics Cloud tenant",
      "type": "string"
    },
    "client_id": {
      "title": "Client Id",
      "description": "Client ID for the OAuth authentication",
      "type": "string"
    },
    "client_secret": {
      "title": "Client Secret",
      "description": "Client secret for the OAuth authentication",
      "type": "string",
      "writeOnly": true,
      "format": "password"
    },
    "ingest_stories": {
      "title": "Ingest Stories",
      "description": "Controls whether Stories should be ingested",
      "default": true,
      "type": "boolean"
    },
    "ingest_applications": {
      "title": "Ingest Applications",
      "description": "Controls whether Analytic Applications should be ingested",
      "default": true,
      "type": "boolean"
    },
    "ingest_import_data_model_schema_metadata": {
      "title": "Ingest Import Data Model Schema Metadata",
      "description": "Controls whether schema metadata of Import Data Models should be ingested (ingesting schema metadata of Import Data Models significantly increases overall ingestion time)",
      "default": true,
      "type": "boolean"
    },
    "resource_id_pattern": {
      "title": "Resource Id Pattern",
      "description": "Patterns for selecting resource ids that are to be included",
      "default": {
        "allow": [
          ".*"
        ],
        "deny": [],
        "ignoreCase": true
      },
      "allOf": [
        {
          "$ref": "#/definitions/AllowDenyPattern"
        }
      ]
    },
    "resource_name_pattern": {
      "title": "Resource Name Pattern",
      "description": "Patterns for selecting resource names that are to be included",
      "default": {
        "allow": [
          ".*"
        ],
        "deny": [],
        "ignoreCase": true
      },
      "allOf": [
        {
          "$ref": "#/definitions/AllowDenyPattern"
        }
      ]
    },
    "folder_pattern": {
      "title": "Folder Pattern",
      "description": "Patterns for selecting folders that are to be included",
      "default": {
        "allow": [
          ".*"
        ],
        "deny": [],
        "ignoreCase": true
      },
      "allOf": [
        {
          "$ref": "#/definitions/AllowDenyPattern"
        }
      ]
    },
    "connection_mapping": {
      "title": "Connection Mapping",
      "description": "Custom mappings for connections",
      "default": {},
      "type": "object",
      "additionalProperties": {
        "$ref": "#/definitions/ConnectionMappingConfig"
      }
    },
    "query_name_template": {
      "title": "Query Name Template",
      "description": "Template for generating dataset urns of consumed queries, the placeholder {query} can be used within the template for inserting the name of the query",
      "default": "QUERY/{name}",
      "type": "string"
    }
  },
  "required": [
    "tenant_url",
    "token_url",
    "client_id",
    "client_secret"
  ],
  "additionalProperties": false,
  "definitions": {
    "DynamicTypedStateProviderConfig": {
      "title": "DynamicTypedStateProviderConfig",
      "type": "object",
      "properties": {
        "type": {
          "title": "Type",
          "description": "The type of the state provider to use. For DataHub use `datahub`",
          "type": "string"
        },
        "config": {
          "title": "Config",
          "description": "The configuration required for initializing the state provider. Default: The datahub_api config if set at pipeline level. Otherwise, the default DatahubClientConfig. See the defaults (https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/graph/client.py#L19).",
          "default": {},
          "type": "object"
        }
      },
      "required": [
        "type"
      ],
      "additionalProperties": false
    },
    "StatefulStaleMetadataRemovalConfig": {
      "title": "StatefulStaleMetadataRemovalConfig",
      "description": "Base specialized config for Stateful Ingestion with stale metadata removal capability.",
      "type": "object",
      "properties": {
        "enabled": {
          "title": "Enabled",
          "description": "Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, otherwise False",
          "default": false,
          "type": "boolean"
        },
        "remove_stale_metadata": {
          "title": "Remove Stale Metadata",
          "description": "Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled.",
          "default": true,
          "type": "boolean"
        }
      },
      "additionalProperties": false
    },
    "AllowDenyPattern": {
      "title": "AllowDenyPattern",
      "description": "A class to store allow deny regexes",
      "type": "object",
      "properties": {
        "allow": {
          "title": "Allow",
          "description": "List of regex patterns to include in ingestion",
          "default": [
            ".*"
          ],
          "type": "array",
          "items": {
            "type": "string"
          }
        },
        "deny": {
          "title": "Deny",
          "description": "List of regex patterns to exclude from ingestion.",
          "default": [],
          "type": "array",
          "items": {
            "type": "string"
          }
        },
        "ignoreCase": {
          "title": "Ignorecase",
          "description": "Whether to ignore case sensitivity during pattern matching.",
          "default": true,
          "type": "boolean"
        }
      },
      "additionalProperties": false
    },
    "ConnectionMappingConfig": {
      "title": "ConnectionMappingConfig",
      "description": "Any source that produces dataset urns in a single environment should inherit this class",
      "type": "object",
      "properties": {
        "env": {
          "title": "Env",
          "description": "The environment that this connection mapping belongs to",
          "default": "PROD",
          "type": "string"
        },
        "platform": {
          "title": "Platform",
          "description": "The platform that this connection mapping belongs to",
          "type": "string"
        },
        "platform_instance": {
          "title": "Platform Instance",
          "description": "The instance of the platform that this connection mapping belongs to",
          "type": "string"
        }
      },
      "additionalProperties": false
    }
  }
}
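
Since the schema lists four required fields and sets `additionalProperties` to false at the top level, a quick pre-flight check of the `config` block can catch missing or misspelled keys before a run. The following sketch uses only the field names from the schema above; the function `check_sac_config` is hypothetical and covers top-level keys only, so it is not a substitute for the validation that the ingestion framework performs itself.

```python
from typing import Any, Dict, List

# Copied from the "required" and "properties" sections of the schema.
REQUIRED_FIELDS = ["tenant_url", "token_url", "client_id", "client_secret"]
KNOWN_FIELDS = {
    "incremental_lineage", "env", "platform_instance", "stateful_ingestion",
    "tenant_url", "token_url", "client_id", "client_secret",
    "ingest_stories", "ingest_applications",
    "ingest_import_data_model_schema_metadata",
    "resource_id_pattern", "resource_name_pattern", "folder_pattern",
    "connection_mapping", "query_name_template",
}


def check_sac_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the top-level keys of a sac config."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in config
    ]
    # additionalProperties is false, so unknown keys are reported as well.
    problems += [
        f"unknown field: {name}"
        for name in sorted(config)
        if name not in KNOWN_FIELDS
    ]
    return problems


config = {
    "tenant_url": "https://company.eu10.sapanalytics.cloud",
    "client_id": "${SAC_CLIENT_ID}",
    "client_secret": "${SAC_CLIENT_SECRET}",
    "ingest_story": True,  # deliberate typo for ingest_stories
}
for problem in check_sac_config(config):
    print(problem)
# missing required field: token_url
# unknown field: ingest_story
```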

Code Coordinates

Class Name: datahub.ingestion.source.sac.sac.SACSource
Browse on GitHub

Questions

If you've got any questions on configuring ingestion for SAP Analytics Cloud, feel free to ping us on our Slack.
