Bulk Data API

Nick Simha, Technical Alliance Manager

Agenda
- Bulk Data API basics
- Demo
- Best practices
- Resource list

What is the Bulk Data API?
- A REST-based, asynchronous API optimized for loading large sets of data.

Why is it useful?
- Enables high-volume integration with Salesforce (volume)
- Enables integration that has to finish within a certain window of time (speed)
- Part of a suite of features that enable our customers to store very large data volumes in Salesforce:
  - Batch Apex
  - Skinny tables
  - Divisions
  - Custom indexing
  - Etc.

How does the Bulk Data API work?
- Loop until all records are sent (e.g. 50 times for 500k rows)
- The phases are decoupled: one doesn't wait for the other, and each can run in parallel
- Loop until all files are processed
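The second loop (wait until every file has been processed) could be sketched as a polling routine. This is only an illustration: `get_state` is a hypothetical callback standing in for one batch-status request, and the state names in `TERMINAL_STATES` are assumptions to verify against the API guide.

```python
import time

# Assumed names of the batch states that mean "no more work will happen".
TERMINAL_STATES = {"Completed", "Failed", "NotProcessed"}


def wait_for_batches(batch_ids, get_state, poll_seconds=5.0, sleep=time.sleep):
    """Poll until every batch reaches a terminal state; return the final states.

    get_state is a hypothetical callback: it takes a batch id and returns the
    current state string (in practice, the result of one status request).
    """
    final_states = {}
    pending = list(batch_ids)
    while pending:
        still_pending = []
        for batch_id in pending:
            state = get_state(batch_id)
            if state in TERMINAL_STATES:
                final_states[batch_id] = state
            else:
                still_pending.append(batch_id)
        pending = still_pending
        if pending:
            # Batches are processed asynchronously, so pause between polls.
            sleep(poll_seconds)
    return final_states
```

Because sending and processing are decoupled, a client can start polling while it is still submitting later batches.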

How can I call the Bulk Data API?
- Through Data Loader
- From any Web Services client
  - Java, C#, etc.
- From the command line!
- Supported by our integration partners

Using the Bulk API from a client
- Create Job
- Create Batch(es) and add to Job
  - The number of batches is determined by the amount of data and the limits on batch size.
- Close Job
- Retrieve Batch Status
- Retrieve Batch Result
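As a rough sketch, the client steps above map onto a sequence of HTTP calls. Only the job-creation path (`/services/async/17.0/job`) appears in this deck; the other paths follow the pattern I recall from the Bulk API guide and should be treated as assumptions. `bulk_api_steps` is a hypothetical helper that only lists the calls; it does not send anything.

```python
def bulk_api_steps(job_id, batch_ids, api_version="17.0"):
    """Return the (step, HTTP method, path) tuples for one bulk load.

    In a real client job_id and batch_ids come back from the create-job and
    add-batch responses; here they are passed in to keep the sketch static.
    """
    base = f"/services/async/{api_version}/job"
    steps = [("create job", "POST", base)]
    # One request per batch; the request body would carry the CSV data.
    steps += [("add batch", "POST", f"{base}/{job_id}/batch") for _ in batch_ids]
    # Closing is assumed to be a POST to the job resource with state=Closed.
    steps.append(("close job", "POST", f"{base}/{job_id}"))
    for batch_id in batch_ids:
        steps.append(("batch status", "GET", f"{base}/{job_id}/batch/{batch_id}"))
        steps.append(("batch result", "GET", f"{base}/{job_id}/batch/{batch_id}/result"))
    return steps
```

For example, `bulk_api_steps("JOB", ["B1", "B2"])` lists eight calls: one create, two add-batch, one close, and a status and result call for each batch.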

Sample Request

POST /services/async/17.0/job HTTP/1.1
User-Agent: curl/7.19.6 (i386-pc-win32) libcurl/7.19.6 OpenSSL/0.9.8k zlib/1.2.3
Host: na6.salesforce.com
Accept: */*
X-SFDC-Session: 00D80000000MD0n!AQgAQI1EfPPEyWvwuaD_IRpvSlrwm7Kr00e
Content-Type: application/xml; charset=UTF-8
Content-Length: 195

<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>insert</operation>
  <object>Contact</object>
  <contentType>CSV</contentType>
</jobInfo>
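A minimal sketch of how a client might assemble the same create-job request in code. The header names come from the sample request; the element names (`operation`, `object`, `contentType`) and the `www.force.com` namespace are my reading of the flattened sample and the API guide, so treat them as assumptions. `build_job_request` is a hypothetical helper that returns headers and body without sending them.

```python
import xml.etree.ElementTree as ET

# Namespace as I understand it from the API guide (assumption).
NAMESPACE = "http://www.force.com/2009/06/asyncapi/dataload"


def build_job_request(session_id, operation="insert", sobject="Contact", content_type="CSV"):
    """Build the headers and XML body for a create-job POST."""
    job = ET.Element("jobInfo", {"xmlns": NAMESPACE})
    ET.SubElement(job, "operation").text = operation
    ET.SubElement(job, "object").text = sobject
    ET.SubElement(job, "contentType").text = content_type
    # Prepend the XML declaration by hand so the output is explicit.
    body = b'<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(job, encoding="utf-8")
    headers = {
        "X-SFDC-Session": session_id,
        "Content-Type": "application/xml; charset=UTF-8",
        "Content-Length": str(len(body)),
    }
    return headers, body
```

The returned pair can be handed to any HTTP client and posted to `/services/async/17.0/job` on the instance host.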

See the Getting Started chapter in the API guide. Use curl's --trace-ascii option to capture the messages.

Demo
- Load 100K addresses of medical providers
- Cleanse the data

Bulk API - Some Additional Information
- Can monitor bulk loads in Builder (Monitoring -> Bulk Data Load Jobs)
- Doesn't handle attachments
- Governor limits
  - 500 batches per 24 hours
  - 10,000 records per batch
  - So a theoretical limit of 5M records per day
  - Caveat: batch size also needs to be less than 10MB
- Batch limits can be increased by engineering
  - Need to contact your SE
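To make the limits concrete, here is a small sketch that splits CSV rows into batches respecting both the record cap and the size cap. The constants mirror the numbers above; reading "10MB" as 10 * 1024 * 1024 bytes is an assumption, and `split_into_batches` is a hypothetical helper, not part of any Salesforce tool.

```python
MAX_RECORDS_PER_BATCH = 10_000
MAX_BATCH_BYTES = 10 * 1024 * 1024  # "less than 10MB"; exact byte count is an assumption
MAX_BATCHES_PER_DAY = 500


def split_into_batches(header, rows, max_records=MAX_RECORDS_PER_BATCH,
                       max_bytes=MAX_BATCH_BYTES):
    """Yield CSV strings, each starting with the header row.

    rows are already-formatted CSV lines without trailing newlines. A single
    row larger than max_bytes is still emitted on its own, not rejected.
    """
    header_line = header + "\n"
    header_size = len(header_line.encode("utf-8"))
    batch, size = [], header_size
    for row in rows:
        line = row + "\n"
        line_size = len(line.encode("utf-8"))
        # Start a new batch when either limit would be reached.
        if batch and (len(batch) >= max_records or size + line_size >= max_bytes):
            yield header_line + "".join(batch)
            batch, size = [], header_size
        batch.append(line)
        size += line_size
    if batch:
        yield header_line + "".join(batch)
```

With the default limits, 500k small rows produce 50 batches, and 500 batches of 10,000 records give the theoretical daily ceiling of 5,000,000 records. Wide rows hit the 10MB cap first, which lowers that ceiling.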

Best practices
- Combine Bulk API with Batch Apex to get optimal performance
  - Faster than complex triggers
  - Similar to the demo, this is a generic pattern that you can use in many scenarios
- Stick with parallel processing unless there is a reason not to do so
  - See the FAQ for scenarios when you would use serial processing
- Handling very large data volumes requires a comprehensive, holistic approach
  - Bulk API is one part of the solution

Resource list
- Bulk API doc: http://www.salesforce.com/us/developer/docs/api_asynch/index.htm