Target audience: Customers who use Azure Data Factory (ADF) as a platform for orchestrating data movement and transformation.
This article shows how Azure Data Factory can be used to trigger and automate SAP data movements using the command line tool of Xtract Universal (XU).
Prerequisites:
- Xtract Universal is installed on a cloud VM and is accessible remotely over HTTP/S
- Customer has access to Azure Data Factory
Step 1: Create your SAP data extract in XU
In the XU Designer, configure your data extract with an SAP connection, a source object and a destination.
In this example, data from the SAP table KNA1 is extracted and stored in an Azure blob destination.
Step 2: Test your SAP data extract from a remote machine
On a remote machine, i.e. a machine other than your XU server, run the SAP data extract with the xu.exe command line utility. You can copy xu.exe and xu.exe.config from the XU server to a folder on the remote machine.
Running the extract remotely confirms that the XU server is reachable. The example executes a data extract named “KNA1” against the IP address of the XU server on port 8065. You can configure HTTPS to secure the connection.
Make sure the extract completes successfully.
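The remote test can also be scripted. The following is a minimal Python sketch, not part of Xtract Universal: `is_reachable` only checks that a TCP connection to the XU server can be opened, and `run_extract` runs whatever command line you pass to it and returns the exit code together with the console output. Both function names are hypothetical helpers introduced for illustration.

```python
import socket
import subprocess


def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def run_extract(command):
    """Run a command line and return (exit_code, stdout, stderr).

    `command` is the argument list you would otherwise type in the
    console, starting with the path to xu.exe.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr
```

For example, `is_reachable("<XU server IP>", 8065)` should return `True` before you try the extract itself. Pass `run_extract` the same xu.exe arguments that you use in the console; an exit code of 0 indicates success.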
Step 3: Configure an Azure Batch account
Detailed steps for configuring a Batch account in the Azure portal are described here.
3.1 A storage account must be associated with your Batch account. This can be a new storage account dedicated to batch processing or an existing storage account. Microsoft recommends a general-purpose v2 storage account in the same region as your Batch account (for better performance).
3.2 The Pool allocation mode (under Advanced) can stay at the default, Batch service. There is no need to select User subscription.
Step 4: Add a Pool to your Batch account
More information on this topic is available here.
4.1 The Pool provides the compute resources (VMs) that execute a task. In this case, the task is to run the command line utility xu.exe. xu.exe is not very resource-intensive, so choose a VM size based on whether you plan to use Azure Batch for other processing as well.
The selected Pool incurs Azure costs.
In the example, Windows Server 2019 Datacenter with the small disk configuration was used.
4.2 When you create the Pool, set Target dedicated nodes to at least 1.
Step 5: Upload xu.exe to storage account
In the storage account that you associated with your Azure Batch account in step 3 above, create a container for the Xtract Universal command line utility.
In this example, the container is named ‘xuexe’. Upload the files xu.exe and xu.exe.config from your Xtract Universal server installation to this container.
Step 6: Create Linked Service to Azure Batch in ADF
In your ADF, go to Connections and create a new Linked Service.
From the available Linked Services options, select the Compute category, then Azure Batch.
For the new Batch Linked Service, specify the Batch Account, Access Key and Batch URL of the Batch account that you created in step 3, and the name of the Pool that you added in step 4.
For Storage linked service name, create a new linked service that references the storage account that you configured in step 3.
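The settings entered in the ADF user interface correspond to a JSON definition of the linked service. The sketch below builds such a definition in Python, assuming the standard ADF schema for an Azure Batch linked service. The linked service names and all angle-bracket values are placeholders, not real resources, and `batch_linked_service` is a hypothetical helper.

```python
import json


def batch_linked_service(name, account_name, access_key, batch_uri,
                         pool_name, storage_linked_service):
    """Build an Azure Batch linked service definition as a dict."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBatch",
            "typeProperties": {
                "accountName": account_name,
                # The key is stored as a secure string in ADF.
                "accessKey": {"type": "SecureString", "value": access_key},
                "batchUri": batch_uri,
                "poolName": pool_name,
                # Reference to the storage linked service from step 3.
                "linkedServiceName": {
                    "referenceName": storage_linked_service,
                    "type": "LinkedServiceReference",
                },
            },
        },
    }


# All values below are placeholders for illustration only.
definition = batch_linked_service(
    name="XuBatchLinkedService",
    account_name="<batch account name>",
    access_key="<access key>",
    batch_uri="https://<account>.<region>.batch.azure.com",
    pool_name="<pool name>",
    storage_linked_service="XuStorageLinkedService",
)
print(json.dumps(definition, indent=2))
```

Comparing the printed JSON with the code view of your linked service in ADF is a quick way to verify that each field was filled in as intended.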
Step 7: Create an ADF Pipeline with Custom Activity
Create a new Pipeline and drag the Custom Activity from the Batch Service category into your pipeline.
On the General tab, provide a name for the activity (in this example, ‘KNA1’).
On the Azure Batch tab, select the Batch Linked Service from step 6.
On the Settings tab, specify the xu.exe command that you want to execute. This is the same command that you tested in step 2.
Also on the Settings tab, select the Storage Linked Service from step 6 and the container / folder path where the xu.exe file is located.
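The three tabs of the Custom Activity map to a small JSON fragment in the pipeline definition. The following Python sketch assembles that fragment, assuming the standard ADF schema for an activity of type Custom. The linked service names are placeholders, the command is deliberately left as a placeholder for the command line you verified in step 2, and `custom_activity` is a hypothetical helper.

```python
import json


def custom_activity(name, command, batch_linked_service,
                    storage_linked_service, folder_path):
    """Build the JSON fragment for an ADF Custom activity as a dict."""
    return {
        "name": name,                      # General tab
        "type": "Custom",
        "linkedServiceName": {             # Azure Batch tab
            "referenceName": batch_linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {                # Settings tab
            "command": command,
            "resourceLinkedService": {
                "referenceName": storage_linked_service,
                "type": "LinkedServiceReference",
            },
            "folderPath": folder_path,
        },
    }


# Placeholder values; replace the command with the one tested in step 2.
activity = custom_activity(
    name="KNA1",
    command="<xu.exe command line tested in step 2>",
    batch_linked_service="XuBatchLinkedService",
    storage_linked_service="XuStorageLinkedService",
    folder_path="xuexe",
)
print(json.dumps(activity, indent=2))
```

The folder path `xuexe` matches the container from step 5. Batch copies the files in that folder to the pool node before it runs the command, which is why xu.exe and xu.exe.config need to be uploaded there.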
Step 8: Debug your pipeline
Click the Debug button to test the execution of the SAP data extract.
You can review the Input and Output of the activity, including the exit code from xu.exe (0 if successful).
In your storage account from step 3, you will find a folder named adfjobs.
For every pipeline execution, there will be a subfolder with log information.
The files stderr.txt and stdout.txt will contain the output from xu.exe.
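If you download the subfolder of a pipeline run to a local machine, the two output files can be inspected with a few lines of Python. This is a minimal sketch that assumes the files are UTF-8 text and sit directly in the folder you pass in; `read_run_output` is a hypothetical helper. It only reads the files, so rely on the exit code from the activity output to decide whether the extract succeeded.

```python
from pathlib import Path


def read_run_output(run_folder):
    """Return the text of stdout.txt and stderr.txt from one run folder.

    A file that does not exist is returned as an empty string.
    """
    folder = Path(run_folder)
    output = {}
    for name in ("stdout.txt", "stderr.txt"):
        path = folder / name
        if path.is_file():
            # Assumption: UTF-8; undecodable bytes are replaced, not fatal.
            output[name] = path.read_text(encoding="utf-8", errors="replace")
        else:
            output[name] = ""
    return output
```

For example, `read_run_output("<local copy of the run subfolder>")["stderr.txt"]` returns the text of stderr.txt for that run.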