SalesForce Batch Compression Unexpected Error

Page 60 of the Salesforce Bulk API documentation says:

Compression
The only valid compression value is gzip. Compression is optional, but strongly recommended. Note that compression
doesn’t affect the character limits defined in Batch size.

Ref. http://www.salesforce.com/us/developer/docs/api_asynch/api_asynch.pdf

When I don’t zip my string, it works fine.

But when I zip my string with the Zip method shown further down, I get an error that says:

Got an unexpected error while processing BULK-API. Contact support with error ID: 424397084-63823 (-43640737)

It seems SalesForce isn’t decompressing my gzip’ed string… Am I doing this wrong?

I have added this to my HTTP request:

req.Headers.Add("Content-Encoding: gzip");

More Info

I log in with this:

SfdcBinding = new SforceService();

try
{
    CurrentLoginResult = SfdcBinding.login(UserName, Password);
}
catch (System.Web.Services.Protocols.SoapException e)
{
    // This is likely to be caused by a bad username or password
    SfdcBinding = null;
    throw (e);
}
catch (Exception e)
{
    // This is something else, probably a communication problem
    SfdcBinding = null;
    throw (e);
}

//Change the binding to the new endpoint
SfdcBinding.Url = CurrentLoginResult.serverUrl;

//Create a new session header object and set the session id to that returned by the login
SfdcBinding.SessionHeaderValue = new SessionHeader();
SfdcBinding.SessionHeaderValue.sessionId = CurrentLoginResult.sessionId;

I create a job by generating this standard XML from the documentation:

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo
xmlns="http://www.force.com/2009/06/asyncapi/dataload">
<id>750x0000000005LAAQ</id>
<operation>insert</operation>
<object>Contact</object>
<createdById>005x0000000wPWdAAM</createdById>
<createdDate>2009-09-01T16:42:46.000Z</createdDate>
<systemModstamp>2009-09-01T16:42:46.000Z</systemModstamp>
<state>Open</state>
<concurrencyMode>Parallel</concurrencyMode>
<contentType>CSV</contentType>
<numberBatchesQueued>0</numberBatchesQueued>
<numberBatchesInProgress>0</numberBatchesInProgress>
<numberBatchesCompleted>0</numberBatchesCompleted>
<numberBatchesFailed>0</numberBatchesFailed>
<numberBatchesTotal>0</numberBatchesTotal>
</jobInfo>

I generate a CSV like this:

FirstName,LastName,Department,Birthdate,Description
Tom,Jones,Marketing,1940-06-07Z,"Self-described as ""the top"" branding guru on the West
Coast"
Ian,Dury,R&D,,"World-renowned expert in fuzzy logic design.
Influential in technology purchases."

Finally, I gzip the CSV string like so:

public static string Zip(string text)
{
    byte[] buffer = System.Text.Encoding.Unicode.GetBytes(text);
    MemoryStream ms = new MemoryStream();
    using (System.IO.Compression.GZipStream zip = new System.IO.Compression.GZipStream(ms, System.IO.Compression.CompressionMode.Compress, true))
    {
        zip.Write(buffer, 0, buffer.Length);
    }

    ms.Position = 0;
    MemoryStream outStream = new MemoryStream();

    byte[] compressed = new byte[ms.Length];
    ms.Read(compressed, 0, compressed.Length);

    byte[] gzBuffer = new byte[compressed.Length + 4];
    System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
    System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
    return Convert.ToBase64String(gzBuffer);
}

And I send my HTTP POST requests like this:

private static string WebRequestPostData(string url, string postData, string sessionId, string contentType)
{
    System.Net.WebRequest req = System.Net.WebRequest.Create(url);

    req.Headers.Add("X-SFDC-Session: " + sessionId);
    req.Headers.Add("Content-Encoding: gzip");
    //req.ContentType = "application/xml";
    //req.ContentType = "text/csv";
    req.ContentType = contentType + "; UTF-8";
    req.Method = "POST";

    byte[] bytes = System.Text.Encoding.ASCII.GetBytes(postData);
    req.ContentLength = bytes.Length;

    using (Stream os = req.GetRequestStream())
    {
        os.Write(bytes, 0, bytes.Length);
    }

    using (System.Net.WebResponse resp = req.GetResponse())
    {
        if (resp == null) return null;

        using (System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream()))
        {
            return sr.ReadToEnd().Trim();
        }
    }
}

Answer

From Using Compression for Responses (page 13):

In API version 27.0 and later, Bulk API can optionally compress response data which reduces network traffic and improves
response time.
Responses are compressed if the client makes a request using the Accept-Encoding header, with a value of gzip.

So firstly, make sure you are specifying API version 27.0 or greater in the serverURL.

Secondly, most of the documentation references compression in the response rather than the request. It would probably still support compression in the request, but the real benefit with the bulk API will be compression in the response.

If you cast your WebRequest to an HttpWebRequest, you can use the AutomaticDecompression property to handle this for you (it adds the Accept-Encoding header and decompresses the response stream).
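As a rough sketch of that cast, assuming the classic System.Net.WebRequest API used in the question: the class name BulkRequestFactory and the method CreateBulkRequest are made up for illustration, and the request is only configured here, never sent.

```csharp
using System;
using System.Net;

public static class BulkRequestFactory
{
    // Hypothetical helper: builds (but does not send) a request whose
    // response the framework will decompress transparently.
    public static HttpWebRequest CreateBulkRequest(string url, string sessionId)
    {
        // WebRequest.Create returns an HttpWebRequest for http/https URLs.
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.Headers["X-SFDC-Session"] = sessionId;

        // Sends "Accept-Encoding: gzip" with the request and
        // un-gzips the response stream before you read it.
        req.AutomaticDecompression = DecompressionMethods.GZip;
        return req;
    }
}
```

With this in place, the response can be read with a plain StreamReader, exactly as in the uncompressed case.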

Back to the request headers… Try setting the request header as:

req.Headers["Content-Encoding"] = "gzip";

I’ve been successful in compressing the requests by inheriting from WebRequest and overriding GetRequestStream() and EndGetRequestStream() to inject a GZipStream that handles the compression. Unfortunately, I can’t share this directly as it is part of a product.

However, the following should give you a general idea of how to gzip the request (note: I haven’t actually run it):

string url = "...";

System.Net.WebRequest req = System.Net.WebRequest.Create(url);
req.Method = "POST";
req.ContentType = "text/csv";
req.Headers["Content-Encoding"] = "gzip";
req.Headers["X-SFDC-Session"] = sessionId;

// Wrap the request stream in a GZipStream so the body is compressed as it is written.
// Disposing the writer flushes and closes the gzip stream and then the request stream.
using (System.IO.Stream reqStream = req.GetRequestStream())
using (var gz = new System.IO.Compression.GZipStream(reqStream, System.IO.Compression.CompressionMode.Compress))
using (var sw = new System.IO.StreamWriter(gz, System.Text.Encoding.ASCII))
{
    sw.Write(postData);
}

using (System.Net.WebResponse resp = req.GetResponse())
{
    // read in response stream
}
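If you would rather build the compressed body up front (for example, to set ContentLength), a minimal sketch along the following lines may help. Unlike the Zip method in the question, it emits nothing but the gzip stream itself: no 4-byte length prefix and no Base64 step, since a body labelled Content-Encoding: gzip is expected to be raw gzip bytes, and that extra wrapping may be why the server could not decode the original payload. The class and method names are illustrative, and UTF-8 is an assumption based on the encoding declared in the job XML.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipPayload
{
    // Compress text into a raw gzip byte array (UTF-8 is an assumption).
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var ms = new MemoryStream())
        {
            // The GZipStream must be disposed before reading ms,
            // otherwise the gzip trailer has not been written yet.
            using (var gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                gz.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    // Inverse of Compress, handy for checking a payload locally.
    public static string Decompress(byte[] gzipped)
    {
        using (var input = new MemoryStream(gzipped))
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gz, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The array returned by Compress can be written straight to the stream from req.GetRequestStream(), with req.ContentLength set to its length and the Content-Encoding header left as gzip.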

Attribution
Source: Link, Question Author: user1477388, Answer Author: Daniel Ballinger
