As advertised, Salesforce is the world No. 1 on demand CRM platform. It is quite popular nowadays, across different sizes and areas of businesses. From our clients, the platform is being used by sport utilities, government agencies, energy retailers, and computer and service retailers.
Unfortunately it still has its own problems, for example, data quality problem. Due to the openess of the Salesforce platform, it is nearly impossible to control the data quality from its own features, at least from our clients, data report generated from Salesforce is in horrible quality, definitely cannot be used for reporting/analysis purpose.
So what we could do is extracting the raw data from Salesforce, and do some cleansing task, load into SQL Server database, a standard ETL procedure. However it is difficult to find a free plugin/component you can use in your SSIS to extract Salesforce data. So what I would like to do, is to show how to retrieve the data from sales force through its API. (Keep in mind there are many ways to access Salesforce data, here is only one of them)
To begin our work, we need to have the API. Log into the Salesforce platform, go to the Setup page, on the left hand side, under the Build - Develop list view, you should see the API option.
Click this link, a list of WSDL options should be shown on the main panel. They are for different design purpose, as described on the page. What we need is the Enterprise WSDL, as it contains the most basic types we need in this example. Right click the link "Generate Enterprise WSDL", and save the file to your local folder.
Now before go any further, a quick notification about the WSDL. The downloaded WSDL can
NOT be used in the project directly. If we add the WSDL as a web service reference into the project, as shown below
When you compile and run the project, you will get error message like below
In short it is because the ListViewRecord, this complex type defined in the WSDL does not contain a data type. It is fine for some languages, because the columns "ListViewRecordColumn" contains the data type definition. But from .Net, we have to explicitly identify the data type.
So we need to manually fix this problem: Open the downloaded WSDL file, locate the node "ListViewRecord" as shown below
Under the tag </sequence>, we need to declare its data type by adding
<xsd:attribute name="tmp" type="xsd:string" />
Then it should look like
Now the API definition part is done. We need to collect some other information before we go to SSIS part.
To access the Salesforce API, we need to provide our login email, password, and the token. To get the token we need to go to My Setting of the Salesforce platform, on the left hand side, expand options under Personal node, you should see the option "Reset My Security Token"
By click it, on the main panel of the page, you will see the warning message
Click the button "Reset Security Token", you will see the confirmation message like below, and a system email will shot into your associated email address.
Now we have all the information we need from Salesforce side. But before we go into SSIS, a quick tip may save your time: you may want to set password never expires under a Dev environment. Otherwise every time you change your password, you need to reset your security token.
Now time to look into SSIS. What I would like to do is to use a script component to get the User data from Salesforce.
Firstly drop a data flow task onto the control flow panel.
Within in the data flow task, drag and drop a script component, and select source on the pop up window.
Double click the script component, and go to "Inputs and Outputs" tab. As you can see, I created four output columns for the default output, FirstName, LastName, Title, and Email. They will be used to hold data retrieved from the User entity in Salesforce. For them, I defined their data type as string, length as 255. (You can also see I created another output "Output_Err". Because Salesforce's API provide the login result class, it can be used to show standard or customised error messages. But it will not be covered in this post, as it could be a huge topic)
Now to use the API, we need to add the downloaded, and modified WSDL as a web service into the script project.
Now turn off serialisation assembly option as shown below. Otherwise we will see "no binary code" error in the package.
At this stage, the project configuration is down. The only remaining job is coding.
Add the name space
Then I declared a SforceService to hold the actual service.
Coding is always simple from my understanding, once we know what we need to do
Then drag a Union All component under the Script component, and add a data viewer between these two components
Execute the task, the user records are exported from Salesforce
So now you have the chance to cleanse the Salesforce data, import into SQL Server, and do your own reporting/analysis task.
Codes within the CreateNewOutputRows method:
public override void CreateNewOutputRows()
{
string email = "<login email>";
string password = "<login password>";
string token = "<security token>";
Service = new SforceService();
LoginResult LoginResult = Service.login(email, password + token);
string authurl = Service.Url;
Service.Url = LoginResult.serverUrl;
Service.SessionHeaderValue = new SessionHeader();
Service.SessionHeaderValue.sessionId = LoginResult.sessionId;
string sQuery = "SELECT FirstName, LastName, Title, Email FROM User";
QueryResult Qr = Service.query(sQuery);
if (Qr.size > 0)
{
foreach (sObject o in Qr.records)
{
if (o is User)
{
User u = (User)o;
Output0Buffer.AddRow();
Output0Buffer.FirstName = u.FirstName;
Output0Buffer.LastName = u.LastName;
Output0Buffer.Title = u.Title;
Output0Buffer.Email = u.Email;
}
}
}
}