Ghost processes / batch processes (NA34 and NA39)

3-30-2017: Just got a call from SF tier one wanting me to show them how to replicate the issue, SMH. They then pointed me to the dev forums and closed the case, ugh.


We are seeing multiple customers having issues with external systems logging transactions from Salesforce IPs during times when there are no batch processes running in Salesforce.

Background

  1. Daily batch that sends records to an external system (a rough sketch of this setup follows the list)
  2. Batch runs without issue daily, starting at 2AM
  3. Batch takes on average 5-10 minutes to complete
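
For context, a minimal sketch of the kind of setup described above. Everything in it (class name, query, flag field, endpoint) is illustrative, not our actual code:

    // Hypothetical shape of the nightly export: batch Apex with callouts enabled.
    global class NightlyExportBatch implements Database.Batchable<SObject>, Database.AllowsCallouts {
        global Database.QueryLocator start(Database.BatchableContext bc) {
            // Records flagged for export; the flag field is made up for this sketch.
            return Database.getQueryLocator(
                'SELECT Id, Name FROM Account WHERE Needs_Export__c = true');
        }
        global void execute(Database.BatchableContext bc, List<SObject> scope) {
            HttpRequest req = new HttpRequest();
            req.setEndpoint('callout:External_System/transactions'); // named credential, illustrative
            req.setMethod('POST');
            req.setBody(JSON.serialize(scope));
            HttpResponse res = new Http().send(req);
            // The real job also writes an error log entry and stamps each record as sent.
        }
        global void finish(Database.BatchableContext bc) {}
    }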

Within the last week or so, we have seen customers complaining that duplicate transactions are occurring daily.

Several of these customers have been running the same version of our software for over 5 years without any issues. No org changes have occurred (not sure about a possible org split), and nothing shows up in the audit trail. The same records have been sent on a routine basis to the external system (monthly/annually) without change. It just started happening out of the blue.

Heuristics

  1. Duplicate timestamps occur at varying times (e.g. 4:23AM, 5:31AM)
  2. NO batch processes are running in the org at the time (see the query sketch after this list)
  3. Some customers have Full Sandboxes, some do not
  4. IDs sent to the external system match the ID of a record in the production org
  5. No timestamps in that timeframe exist on the records
  6. Our error logging does not log anything
  7. No records indicating the communication took place are created
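
For #2, the verification is a straightforward AsyncApexJob query over the window in question, run as anonymous Apex. The dates below are placeholders:

    // List every async Apex job created in the window where the
    // ghost transactions appeared; an empty result backs up heuristic #2.
    Datetime windowStart = Datetime.newInstance(2017, 3, 30, 4, 0, 0);
    Datetime windowEnd   = Datetime.newInstance(2017, 3, 30, 6, 0, 0);
    for (AsyncApexJob job : [
            SELECT Id, JobType, ApexClass.Name, Status, CreatedDate, CompletedDate
            FROM AsyncApexJob
            WHERE CreatedDate >= :windowStart AND CreatedDate <= :windowEnd]) {
        System.debug(job);
    }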

Summary

We can find nothing that is running in the orgs that would be sending the records to the external system. Given some customers do not have full sandboxes, I ruled out the possibility of the process being run from a sandbox by mistake. Either way, the timestamps would suggest that is not the case as well.
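
If you can touch the payload, one way to make the sandbox question airtight is to stamp the sending org's identity on every outbound message, since a sandbox always has a different org ID than its production org. A minimal sketch (the envelope shape is made up):

    // Stamp the sending org on each outbound payload so the external system
    // can tell production from any sandbox copy.
    List<SObject> recordsToSend = [SELECT Id, Name FROM Account LIMIT 5]; // illustrative
    Map<String, Object> envelope = new Map<String, Object>{
        'orgId'     => UserInfo.getOrganizationId(),
        'isSandbox' => [SELECT IsSandbox FROM Organization LIMIT 1].IsSandbox,
        'records'   => recordsToSend
    };
    System.debug(JSON.serialize(envelope));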

Question

  • Is anyone else seeing this behavior?
  • Is it possible that there is a critical issue where shadow processes are being run for whatever reason by SF? Maybe an org split causing things to go haywire?

I know this is a bit broad and subjective, etc., but I would really appreciate it if anyone that may have seen something similar would chime in. Trying to put this in as a support ticket to SF would be, well, I would rather get married again, poke my eye with glass, pull all my fingernails off, you get the point.

I have never run into anything like this in the last 7 years working on the platform, and I am at a complete loss. My only hope is to find others that have seen a similar thing occurring in the last week or so.

Any suggestions on troubleshooting (other than contacting SF support), or reports of anything similar, would be helpful…

Answer

Not sure this is worthy of an answer, but it is too much to put as a comment. The only time I have seen anything remotely similar was with this, and I was able to prove to support that a job had run and created duplicates.

The info I got back from Tier 3 at the time:

As we discussed, during the execution, an internal server error on the asynchronous process for the batch resulted in a cloned message in the asynchronous queue not being removed, which resulted in duplicate messages getting worked.
As a result of unlucky timing when an exception is thrown, cloned messages may not be deleted and may actually execute in parallel with the actual messages.

A bug has been logged to fix this issue on Salesforce's end.

It is scheduled for a patch release next week.

‘Last week’ was just over 2 years ago. I can share my case # if you wish. The case status is currently: Closed – Bug Fix Submitted

Worth noting, our customer stopped using our app ~6 months afterward, but I never saw a recurrence in that time. It would also have been an EUn pod.
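
Until a platform fix lands, one defensive option (assuming the external system can deduplicate on a key, which is not a given) is to send a deterministic idempotency key with each transaction, so a cloned queue message replaying the same records is discarded downstream instead of logged twice. A rough sketch, continuing the hypothetical execute() from the question, where scope and req come from that earlier sketch:

    // Attach a deterministic idempotency key per record per logical run; a
    // cloned async message replaying the same scope produces the same keys,
    // which the external system can use to discard the duplicates.
    String runKey = Datetime.now().format('yyyy-MM-dd'); // one logical run per day
    List<Map<String, Object>> payload = new List<Map<String, Object>>();
    for (SObject rec : scope) {
        payload.add(new Map<String, Object>{
            'idempotencyKey' => String.valueOf(rec.Id) + ':' + runKey,
            'record'         => rec
        });
    }
    req.setBody(JSON.serialize(payload));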

Good luck.

Attribution
Source: Link, Question Author: Eric, Answer Author: Community
