Managing Basic Data Loads Using Dataloader CLI

Hello, and welcome to Another Salesforce Blog! Here I will be posting solutions to problems that I couldn’t find an answer to in hopes of helping those who find themselves stuck when using the Salesforce platform.

User Story

We want to complete a data load into a production environment to insert and update a list of contacts from a .CSV file that we received from our data team. We have to do this load every day, and we’re doing it in production, so we want to cut down on time and human error.

Background

The Data Loader command-line interface (CLI) can be used on Windows to automate data loads. A quickstart is available in the Salesforce documentation here. The Salesforce documentation lays this process out well, so I will try not to spend too much time repeating it, and will instead walk through an application of it step by step, especially for those who are new to a CLI.

Solution

Step one – install Dataloader if you haven’t already.

The most recent version of Dataloader for Windows can be found in the Salesforce documentation here. I have mine installed directly under my C:\ drive for ease of access.

Step two – create the encryption key.

In order to encrypt our password and security token, we must first create an encryption key. This can be achieved by opening a command line, navigating to the \Dataloader\bin folder using the cd command, and then entering the encrypt.bat -k command.

For those who are unfamiliar with a command line, this would look like:

C:\Users\EvelynMaguire>cd c:\dataloader\bin
c:\dataloader\bin>encrypt.bat -k
Keyfile "C:\Users\EvelynMaguire\.dataloader\dataLoader.key" was created!

Step three – encrypt password.

The next step is to generate an encrypted password using our newly created encryption key. This can be completed using the command encrypt.bat -e <password><security token> "<encryption key file path>". Note that the password and security token are concatenated into a single string, with no space between them.

c:\dataloader\bin>encrypt.bat -e PASSWORDSecurityKey123456789 "C:\Users\EvelynMaguire\.dataloader\dataLoader.key"
The output string of encryption is:
c15a6ed0d1b72ade274a6c91cd8f825e2fd7e33d3ca6fb790f01deb10504b2223e6a7a591134468418b90d35fc0dfe50

This will generate an encrypted password string, which will need to be regenerated every time a password is changed, but can be left alone in the meantime. If you are setting up a script for multiple environments, the password and security key for each environment will need to be generated.

If multiple environments need to be set up for a data load, a config.properties file can be used to store login information for multiple environments.
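As an illustration, a minimal config.properties might look like the following. The property names come from the Data Loader process configuration parameters; the endpoint, username, password string, and key path are placeholders for your own values, and note that backslashes must be doubled in Java-properties files:

```properties
# config.properties - shared connection settings for this environment
sfdc.endpoint=https://login.salesforce.com
sfdc.username=user@mycompany.com
# generated with encrypt.bat -e using the key file referenced below
sfdc.password=c15a6ed0d1b72ade274a6c91cd8f825e2fd7e33d3ca6fb790f01deb10504b2223e6a7a591134468418b90d35fc0dfe50
process.encryptionKeyFile=C:\\Users\\EvelynMaguire\\.dataloader\\dataLoader.key
```

Keeping one such file per environment (for example, one for production and one for a sandbox) lets the rest of the script stay identical across orgs.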

Step four – create mapping file.

The easiest way to create a mapping file, and the way that I recommend, is to actually go through the steps in Data Loader and save a map. This way, if there are any invisible characters in your file, your file will still map properly.

The Create Map dialog box.

Manually (or automatically!) map your fields, and then click Save Mapping. Save this mapping where you will be uploading your files from.

File Name: TestMap, File Type: *.sdl
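If you would rather write the mapping by hand, .sdl files use Java-properties syntax, with the CSV column header on the left and the Salesforce API field name on the right. A hypothetical TestMap.sdl for a contact load (the CSV column names here are assumptions about your data team's file) might look like:

```properties
# TestMap.sdl - CSV column = Salesforce field
ID=Id
FIRST_NAME=FirstName
LAST_NAME=LastName
EMAIL=Email
```

The saved-from-the-UI approach is still safer, since it captures any invisible characters in your headers exactly as they appear in the file.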

Step five – create process-conf.xml file.

The process-conf.xml file can be created using the documentation found here. There are also sample files included in the Data Loader installation file, which can be found at C:\...\dataloader\samples.

We will be building off of the example in the Salesforce documentation to make a more complex file.

First, let’s establish the framework with which we are writing our XML file. We will be using Spring, which is an open source Java platform, and the most popular framework for Enterprise Java. A full tutorial for Spring can be found at TutorialsPoint here.

The Spring Framework has a Bean Module, which is “an object that is instantiated, assembled, and otherwise managed by a Spring IoC container.” Each bean contains configuration metadata, which instructs the container on how to create a bean, the bean’s lifecycle, and what the bean is dependent on. In our project, each bean will perform a different Data Loader function.

Next, let’s take a look at the sample process-conf.xml file that Data Loader gives us, located at C:\...\dataloader\samples\conf\process-conf.xml.

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
    <bean id="accountMasterProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
        <description>AccountMaster job gets the Customer record updates from ERP (Oracle financials) and uploads them to salesforce using 'upsert'.</description>
        <property name="name" value="accountMasterProcess"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.debugMessages" value="true"/>
                <entry key="sfdc.debugMessagesFile" value="c:\dataloader\samples\status\accountMasterSoapTrace.log"/>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <!-- password below has been encrypted using key file, therefore it will not work without the key setting: process.encryptionKeyFile
                the password is not a valid encrypted value, please generate the real value using encrypt.bat utility -->
                <entry key="sfdc.password" value="5555555555555555"/>
                <entry key="process.encryptionKeyFile" value="c:\users\username\sample.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.externalIdField" value="Oracle_Id__c"/>
                <entry key="sfdc.entity" value="Account"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="c:\dataloader\samples\conf\accountMasterMap.sdl"/>
                <entry key="dataAccess.name" value="queryAccount"/>
                <entry key="dataAccess.type" value="databaseRead"/>
                <entry key="process.initialLastRunDate" value="2005-12-01T00:00:00.000-0800"/>
            </map>
        </property>
    </bean>
    <bean id="opportunityUpsertProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
        <description>Opportunity Upsert job gets the Customer record updates from a CSV file and uploads them to salesforce using 'upsert'.</description>
        <property name="name" value="opportunityUpsertProcess"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <!-- password below has been encrypted using key file, therefore it will not work without the key setting: process.encryptionKeyFile
                the password is not a valid encrypted value, please generate the real value using encrypt.bat utility -->
                <entry key="sfdc.password" value="5555555555555555"/>
                <entry key="process.encryptionKeyFile" value="c:\users\username\sample.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.externalIdField" value="Oracle_Id__c"/>
                <entry key="sfdc.entity" value="Opportunity"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="c:\dataloader\samples\conf\opportunityUpsertMap.sdl"/>
                <entry key="dataAccess.name" value="c:\dataloader\samples\data\opportunityData.csv"/>
                <entry key="dataAccess.type" value="csvRead"/>
                <entry key="process.initialLastRunDate" value="2006-12-01T00:00:00.000-0800"/>
            </map>
        </property>
    </bean>
    <bean id="databaseAccountExtractProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
        <description>DatabaseAccountExtract job gets account info from salesforce and updates or inserts info into database.</description>
        <property name="name" value="databaseAccountExtract"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.debugMessages" value="false"/>
                <entry key="sfdc.debugMessagesFile" value="c:\dataloader\samples\status\sfdcSoapTrace.log"/>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <!-- password specified below is invalid, please generate one using the encrypt.bat utility -->
                <entry key="sfdc.password" value="1111111111111111"/>
                <entry key="process.encryptionKeyFile" value="c:\users\username\sample.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.entity" value="Account"/>
                <entry key="sfdc.extractionRequestSize" value="500"/>
                <entry key="sfdc.extractionSOQL" value="Select Id, Name, Type, ParentId, Phone, AccountNumber, Website, Sic, AnnualRevenue, NumberOfEmployees, TickerSymbol, Oracle_Id__c FROM Account"/>
                <entry key="process.operation" value="extract"/>
                <entry key="process.mappingFile" value="c:\dataloader\samples\conf\accountExtractToDbMap.sdl"/>
                <entry key="dataAccess.type" value="databaseWrite"/>
                <entry key="dataAccess.name" value="updateAccount"/>
            </map>
        </property>
    </bean>
    <bean id="csvAccountExtractProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
      <description>csvAccountExtract job gets account info from salesforce and saves info into a CSV file.</description>
        <property name="name" value="csvAccountExtract"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.debugMessages" value="false"/>
                <entry key="sfdc.debugMessagesFile" value="c:\dataloader\samples\status\sfdcSoapTrace.log"/>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <!-- password specified below is invalid, please generate one using the encrypt.bat utility -->
                <entry key="sfdc.password" value="1111111111111111"/>
                <entry key="process.encryptionKeyFile" value="c:\users\username\sample.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.entity" value="Account"/>
                <entry key="sfdc.extractionRequestSize" value="500"/>
                <entry key="sfdc.extractionSOQL" value="Select Id, Name, Type, ParentId, Phone, AccountNumber, Website, Sic, AnnualRevenue, NumberOfEmployees, TickerSymbol, Oracle_Id__c FROM Account"/>
                <entry key="process.operation" value="extract"/>
                <entry key="process.mappingFile" value="c:\dataloader\samples\conf\accountExtractMap.sdl"/>
                <entry key="dataAccess.type" value="csvWrite"/>
                <entry key="dataAccess.name" value="c:\dataloader\samples\data\extract.csv"/>
            </map>
        </property>
    </bean>
</beans>

We will specifically be looking at the “opportunityUpsertProcess” bean, as it most closely fits our need to insert and update the list of contacts given to us by our data team.

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
    <bean id="opportunityUpsertProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
        <description>Opportunity Upsert job gets the Customer record updates from a CSV file and uploads them to salesforce using 'upsert'.</description>
        <property name="name" value="opportunityUpsertProcess"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <!-- password below has been encrypted using key file, therefore it will not work without the key setting: process.encryptionKeyFile
                the password is not a valid encrypted value, please generate the real value using encrypt.bat utility -->
                <entry key="sfdc.password" value="5555555555555555"/>
                <entry key="process.encryptionKeyFile" value="c:\users\username\sample.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.externalIdField" value="Oracle_Id__c"/>
                <entry key="sfdc.entity" value="Opportunity"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="c:\dataloader\samples\conf\opportunityUpsertMap.sdl"/>
                <entry key="dataAccess.name" value="c:\dataloader\samples\data\opportunityData.csv"/>
                <entry key="dataAccess.type" value="csvRead"/>
                <entry key="process.initialLastRunDate" value="2006-12-01T00:00:00.000-0800"/>
            </map>
        </property>
    </bean>
</beans>

Let’s break down each line, and build our file.

First, we need to declare our XML document type and reference our DTD (Document Type Definition) file. This will allow us to define the structure of our XML document.

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">

Next, we declare the root <beans> element, which wraps our bean definitions. Each individual bean is then declared with a <bean /> tag, which is an XML analog of the Java @Bean annotation. The Spring Framework documentation and further definitions for this can be found here.

<beans>
...
</beans>

Let’s instantiate a bean called contactUpsertProcess, with a class of “com.salesforce.dataloader.process.ProcessRunner” (as defined by Salesforce documentation), and with singleton set to false. “Singleton” is a bean scope that allows for one shared instance of the bean. Further documentation is located here.

We also want to give our bean a thorough description, like “Contact Upsert job upserts the Contact records from a CSV file to Salesforce.” This will be enclosed within the <description /> tags.

Next, we want to define our property name as “contactUpsertProcess,” the same as the bean ID. This will be the command that we call using the Command Line Interface, so we want it to be something descriptive.

...
    <bean id="contactUpsertProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
        <description>Contact Upsert job upserts the Contact records from a CSV file to Salesforce.</description>
        <property name="name" value="contactUpsertProcess"/>
...
    </bean>
...

Lastly, we can modify the existing code to fit our needs. We will edit the relevant entries with our own values, including username, password, encryptionKeyFile path, externalIdField, the entity we are upserting, and the path to the mappingFile that we generated through Data Loader. The process configuration parameters can be found here.

...
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <entry key="sfdc.password" value="5555555555555555"/>
                <entry key="process.encryptionKeyFile" value="c:\users\username\sample.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.externalIdField" value="EXTERNAL ID FIELD"/>
                <entry key="sfdc.entity" value="Contact"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="c:\dataloader\samples\conf\TestMap.sdl"/>
                <entry key="dataAccess.name" value="c:\dataloader\samples\data\ContactData.csv"/>
                <entry key="dataAccess.type" value="csvRead"/>
                <entry key="process.initialLastRunDate" value="2006-12-01T00:00:00.000-0800"/>
            </map>
        </property>
    </bean>
</beans>

One thing to note is that the dataAccess.name entry will contain the path to the file that we are upserting.

If we want to specify the names and locations of our Error and Success files, we can add the following:

<entry key="process.outputSuccess" value="C:\Users\...\Documents\Data Loads\Contacts\Contact_Success.csv"/>
<entry key="process.outputError" value="C:\Users\...\Documents\Data Loads\Contacts\Contact_Error.csv"/>

I have my files in a folder called C:\Users\Evelyn\Documents\ASFB\DataLoader CLI, and my encryption file is located in C:\Users\Evelyn\.dataloader\dataLoader.key. The file that I am upserting is called extract.csv, and I am matching on the Contact field “Id”. My file, as a result, will look like this:

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
    <bean id="contactUpsertProcess"
          class="com.salesforce.dataloader.process.ProcessRunner"
          singleton="false">
        <description>Contact Upsert job upserts the Contact records from a CSV file to Salesforce.</description>
        <property name="name" value="contactUpsertProcess"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.endpoint" value="https://login.salesforce.com"/>
                <entry key="sfdc.username" value="user@mycompany.com"/>
                <entry key="sfdc.password" value="5555555555555555"/>
                <entry key="process.encryptionKeyFile" value="C:\Users\EvelynMaguire\.dataloader\dataLoader.key"/>
                <entry key="sfdc.timeoutSecs" value="600"/>
                <entry key="sfdc.loadBatchSize" value="200"/>
                <entry key="sfdc.externalIdField" value="Id"/>
                <entry key="sfdc.entity" value="Contact"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="C:\Users\...\Documents\ASFB\DataLoader CLI\TestMap.sdl"/>
                <entry key="dataAccess.name" value="C:\Users\...\Documents\ASFB\DataLoader CLI\extract.csv"/>
                <entry key="process.outputSuccess" value="C:\Users\...\Documents\ASFB\DataLoader CLI\Contact_Success.csv"/>
                <entry key="process.outputError" value="C:\Users\...\Documents\ASFB\DataLoader CLI\Contact_Error.csv"/>
                <entry key="dataAccess.type" value="csvRead"/>
                <entry key="process.initialLastRunDate" value="2006-12-01T00:00:00.000-0800"/>
            </map>
        </property>
    </bean>
</beans>

Step six – upsert Contacts.

First, make sure all of your files are in the right places, and update your process-conf.xml file with the proper filepaths. Your file structure will likely be different than mine! I like to have a folder for each type of data load that I do, and then move the files to a more appropriate location after the load is completed.

Next, we will navigate to the Data Loader bin folder. Again, this is done with the cd c:\...\dataloader\bin command.

cd c:\dataloader\bin

Next, we need the filepath of our process-conf.xml file. Again, mine is located at C:\Users\Evelyn\Documents\ASFB\DataLoader CLI.

To call our process, we use the process.bat command, followed by the filepath in quotes, followed by our process name, which we have declared as contactUpsertProcess.

process.bat "C:\Users\Evelyn\Documents\ASFB\DataLoader CLI" contactUpsertProcess

This will cause our command line to spit out a bunch of information, and after a few seconds to a few minutes, our contacts will be upserted!
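Since our user story calls for running this load every day, the invocation above can be wrapped in a small batch file and pointed at by Windows Task Scheduler, which removes the last bit of manual typing. This is a sketch under my folder layout; the batch file name, task name, and schedule time are all assumptions you would adjust to your own setup:

```
@echo off
rem daily_contact_upsert.bat - runs the contactUpsertProcess bean from process-conf.xml
cd /d c:\dataloader\bin
call process.bat "C:\Users\Evelyn\Documents\ASFB\DataLoader CLI" contactUpsertProcess
```

A scheduled task could then be registered once from the command line, for example: schtasks /create /tn "DailyContactUpsert" /tr "C:\Users\Evelyn\Documents\ASFB\DataLoader CLI\daily_contact_upsert.bat" /sc daily /st 06:00. Check the Error and Success files after each run before trusting the results.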

Upsert successful!
Contact Error and Success files.

Thanks for reading, let me know if you have any comments or questions!

-Evelyn, Another Salesforce Blog
