Moving to parallel testing has made one of our builds run 4x faster but the test result files returned from:
after polling via:
vary on every run. This is illustrated by the last section of this Jenkins trend chart:
or this individual Jenkins test summary (showing the difference from the last build):
Is losing consistent test results the price you pay for parallel testing, or is there a way to get a consistent count?
The actual code that is running is here: https://github.com/claimvantage/sfdx-jenkins-shared-library/blob/master/vars/runApexTests.groovy
Adding the query that Daniel Ballinger suggests in our build confirms that the variation is that entire classes don’t get tested e.g.:
...
BenefitClaimedPaymentsControllerTest Completed (4/4)
BenefitReductionsTest Failed Could not run tests on class 01p1k000000EEbF because: connection was cancelled here
BenefitReductionsWithLookbackTest Completed (22/22)
BenefitReductionGraphsControllerTest Completed (1/1)
...
I just completed a parallel test run. Note that all these tests currently pass when run individually or in the synchronous test mode. After the parallel run I ran the same SOQL query that you did:
select Status, MethodsEnqueued, MethodsCompleted, MethodsFailed from ApexTestRunResult where AsyncApexJobId = '7071W00006mnz7p'
That came back with:
- Status: Completed
- MethodsEnqueued: 811
- MethodsCompleted: 797
- MethodsFailed: 13
This aligns somewhat with your findings. E.g. Why doesn’t
A query against ApexTestResult for the Apex Job showed only 797 records.
select Id, TestTimestamp, Outcome, ApexClassId, MethodName, AsyncApexJobId, QueueItemId, RunTime from ApexTestResult where AsyncApexJobId = '7071W00006mnz7p'
And that included the 13 marked as failures. So the first thing we can conclude is that ApexTestRunResult.MethodsFailed is actually included in the ApexTestRunResult.MethodsCompleted count, and there are 14 test methods unaccounted for (811 enqueued minus 797 results).
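The arithmetic behind that conclusion can be sketched in Python. This is only an illustration using the numbers from this run; the function name `unaccounted_methods` is made up for the example, and it assumes, as observed here, that MethodsFailed is already counted inside MethodsCompleted.

```python
def unaccounted_methods(run_result: dict) -> int:
    """Return how many enqueued test methods produced no result at all.

    Assumes (as observed in this run) that MethodsFailed is already
    counted inside MethodsCompleted, so it is not subtracted again.
    """
    return run_result["MethodsEnqueued"] - run_result["MethodsCompleted"]


# Values copied from the ApexTestRunResult query above.
run_result = {
    "Status": "Completed",
    "MethodsEnqueued": 811,
    "MethodsCompleted": 797,
    "MethodsFailed": 13,
}

missing = unaccounted_methods(run_result)
print(missing)  # 14
```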
I think the important details are in ApexTestQueueItem with the Status Failed.
select Id,ApexClassId, ApexClass.Name,Status,ExtendedStatus,ParentJobId,TestRunResultId,ShouldSkipCodeCoverage from ApexTestQueueItem where ParentJobId = '7071W00006mnz7p' order by Status desc
There are three test classes that outright failed to run. They failed with the statuses:
- Status: Failed
- ExtendedStatus: “Could not run tests on class 01p40000000HDZA because: connection was cancelled here”
I checked the number of test methods in the classes that failed. They were 2, 1, and 11 respectively, which explains my missing 14 test methods. Those test methods were neither Completed nor Failed. They just didn’t run.
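To spot such classes quickly, the ApexTestQueueItem records can be grouped by Status. The following is a rough sketch rather than anything the CLI provides: `summarize_queue_items` is a hypothetical helper, and it assumes the query results have already been fetched as a list of dicts.

```python
from collections import Counter


def summarize_queue_items(queue_items):
    """Count queue items per Status and collect the ones that failed.

    `queue_items` is assumed to be a list of dicts carrying the fields
    selected in the ApexTestQueueItem query (Status, ExtendedStatus, ...).
    """
    status_counts = Counter(item["Status"] for item in queue_items)
    failed = [item for item in queue_items if item["Status"] == "Failed"]
    return status_counts, failed


# Illustrative records: the failed entry uses the ExtendedStatus quoted
# above; the completed entry and its class ID are placeholders.
queue_items = [
    {
        "ApexClassId": "01p40000000HDZA",
        "Status": "Failed",
        "ExtendedStatus": "Could not run tests on class 01p40000000HDZA "
        "because: connection was cancelled here",
    },
    {"ApexClassId": "01pPLACEHOLDER01", "Status": "Completed", "ExtendedStatus": None},
]

status_counts, failed = summarize_queue_items(queue_items)
for item in failed:
    print(item["ApexClassId"], "->", item["ExtendedStatus"])
```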
In my scenario, there is a data isolation issue when running the tests in parallel that manifests as a “connection was cancelled here” error. As a result, the affected test methods aren’t run and no results come back for them. Because it is a concurrency error, the number of failures will vary from test run to test run. I’ve gone into some of the painful details of this in Speeding up Salesforce unit testing performance.
Your scenario is likely similar. You could try adding some sort of guard statement that checks ApexTestRunResult.ClassesCompleted to make sure entire classes aren’t being excluded from the results.
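A guard along those lines might look like the sketch below. It is an assumption-laden example, not part of the shared library: `guard_classes_completed` and `IncompleteTestRunError` are hypothetical names, the expected class count has to be supplied by the build, and whether ClassesCompleted excludes classes that failed to start would need to be confirmed against a real run.

```python
class IncompleteTestRunError(Exception):
    """Raised when fewer test classes completed than the build expected."""


def guard_classes_completed(run_result: dict, expected_classes: int) -> None:
    """Fail the build if whole test classes are missing from the run.

    `run_result` is assumed to be a dict holding the ClassesCompleted
    field of the ApexTestRunResult record; `expected_classes` is a
    number the caller already knows (for example, from the repository).
    """
    completed = run_result["ClassesCompleted"]
    if completed < expected_classes:
        raise IncompleteTestRunError(
            f"Only {completed} of {expected_classes} test classes completed"
        )


# Passes silently when every expected class completed.
guard_classes_completed({"ClassesCompleted": 3}, expected_classes=3)
```

Raising an exception rather than returning a flag means the build step fails visibly instead of reporting a smaller but green test count.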
With this being a concurrency issue, it would also be useful to control the number of test cases that run at any one time. The current limit of 30 is way too high when there are issues siloing test data. Please consider voting for the idea: Control the degree of parallelism when running apex tests in parallel
I’ll need to gather some more data to prove it, but my suspicion is that this is caused by using the Streaming API to monitor and report on the results.
It will either be a timing issue between when the test job finishes and when the last CometD message is received or possibly an outright failure to receive a message. Either way, the SFDX CLI command is missing a number of ApexTestResult (07M) records.
I’d take your test ApexJobIds and directly query for the corresponding ApexTestResult records. E.g.
select Id,TestTimestamp,Outcome,ApexClassId,MethodName,AsyncApexJobId,QueueItemId,RunTime from ApexTestResult where AsyncApexJobId = '7071W000052hsrXQAQ'
The other possibility is that the force:apex:test:run command isn’t queuing up all the existing test classes. If it is somehow not creating the ApexTestQueueItem records for a subset of Apex classes, then those tests would never run. I’d confirm that the expected number of records is created for the ApexJobId.
select Id,ApexClassId,Status,ExtendedStatus,ParentJobId,TestRunResultId,ShouldSkipCodeCoverage from ApexTestQueueItem where ParentJobId = '7071W000052hsrXQAQ'
There should be one record per Apex test class.
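That check can be automated with a simple set difference. The sketch below is illustrative only: `classes_not_queued` is a hypothetical helper, the IDs are placeholders, and it assumes you already have the list of expected test class IDs and the queried ApexTestQueueItem records as dicts.

```python
def classes_not_queued(expected_class_ids, queue_items):
    """Return expected test class IDs that have no ApexTestQueueItem.

    IDs are compared as plain strings, so both inputs should use the
    same ID length (15 or 18 characters).
    """
    queued = {item["ApexClassId"] for item in queue_items}
    return sorted(set(expected_class_ids) - queued)


# Placeholder IDs for illustration only.
expected = ["01pPLACEHOLDER01", "01pPLACEHOLDER02", "01pPLACEHOLDER03"]
queue_items = [
    {"ApexClassId": "01pPLACEHOLDER01", "Status": "Completed"},
    {"ApexClassId": "01pPLACEHOLDER03", "Status": "Failed"},
]

print(classes_not_queued(expected, queue_items))  # ['01pPLACEHOLDER02']
```

An empty list means every expected class was queued, so any missing results would have to come from the run or the reporting step instead.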