Is it more efficient to use a map and call .get(), or use a set and call .contains()?

Prelude

I did a couple of interwebs searches, and I used the following search terms here on SFSE…

apex map get set contains
apex map vs set
apex map set efficien*

…then looked through a number of Qs & As and some links within some of the As. If I missed the answer, please forgive and point me to it.


Background

I know just enough of Apex and other coding to be dangerous. Adhering to best practices as much as possible is preferable, but I don’t know all the best practices. I also know that some debates are akin to “Toilet paper over or under?”, in that there are lots of opinions but not often is there a real better answer. (There is a correct answer to the toilet paper question, btw.)

Item at hand

I was sent an Apex class with an invocable method that a different team had created and that our team needs to use. One bit of functionality was not working as stated, and some parts seemed to me like they could use improvement. So I modified the code, and it was sent back to the code owners for their consideration. They kept most of my changes, but they also made a blanket statement, and I want to know whether it is true. Not (necessarily) so that I can say that I am correct (if I am), but just to know as part of the aforementioned best practices.

Statement

Maps are more efficient [than Sets] when doing a lookup.


Question

Is it more efficient to use a map and call .get(), or use a set and call .contains()?


Code Comparison

Showing only the pertinent code (and changing names to protect the [not so] innocent), here is the original:

List<Group> groups = [SELECT Id, Name FROM Group WHERE Name LIKE :queryString];
Map<String, Group> groupMap = new Map<String, Group>();

for (Group g: groups) {
    groupMap.put(g.Name, g);
}

Group targetGroup;
String targetGroupName;

for (...) {
    targetGroupName = <something different every loop iteration>
    targetGroup = groupMap.get(targetGroupName);
    if(targetGroup != null){
        // Found the only one we're looking for, so set the output
        break;
    }
}

My suggested changes:

List<Group> groups = [SELECT Name FROM Group WHERE Name LIKE :queryString];
Set<String> groupNameSet = new Set<String>();

for (Group g: groups) {
    groupNameSet.add(g.Name);
}

String targetGroupName;

for (...) {
    targetGroupName = <something different every loop iteration>
    if (groupNameSet.contains(targetGroupName)) {
        // Found the only one we're looking for, so set the output
        break;
    }
}

UPDATES

After the first three answers, I just have to say that I love y’all here at SFSE.

2021.11.12
Normally I would choose an answer, because I don’t like seeing questions with answers that clearly do answer the question and one is not accepted. However, all four of these answers are awesome, and helpful, and each contributes something different from the others. So if y’all don’t mind, I’m not going to accept just one.

Answer

In pretty much every conceivable scenario, the number of records (or data in general) that we can work with in Salesforce in a single transaction means that any difference between approaches is likely to be immeasurable.

Things like this (using a Set vs. using a Map) fall under the category of micro-optimization. Micro-optimizations are a waste of time (to be clear, this is aimed at the other team's performance assertion). If you are looking to optimize something, you should do two things:

  1. Define a specific thing to be improved that can be measured (requests per second, heap space, cpu time, # of queries, etc…)
  2. Measure that thing before your change, make your change, then measure it again

Personally, given the information you’ve provided, I’d prefer using a Set here because it requires less typing.

Ok, but for real, which is faster?

My gut says the Set approach, and I think it’d take less heap space too. This should be a fairly simple thing to test, just a bit tedious to set up.

Here’s my anon apex:

DateTime beforeDTControl, beforeDT1, beforeDT2, afterDTControl, afterDT1, afterDT2;
Long dtControl;
Integer beforeCPUControl, beforeCPU1, beforeCPU2, afterCPUControl, afterCPU1, afterCPU2, cpuControl;
Integer beforeHeap1, beforeHeap2, afterHeap1, afterHeap2;

Integer iterations = 100000;

// Set up benchmark data
Map<String, Account> acctMap = new Map<String, Account>();
Set<String> acctNameSet = new Set<String>();
List<Integer> randomIndex = new List<Integer>();
Long heapBefore = Limits.getHeapSize();
for(Integer i = 0; i < iterations; i++){    
    String name = 'testAcct' + i;
    acctMap.put(name, new Account(Name = name));
    randomIndex.add((Math.random() * 499).intValue());
}
Long heapAfter = Limits.getHeapSize();
acctNameSet = acctMap.keySet();
Long heapAfter2 = Limits.getHeapSize();

system.debug(heapAfter - heapBefore);
system.debug(heapAfter2 - heapAfter);
Set<String> randomNamesSet = new Set<String>();
for(Integer i = 0; i < 500; i++){
    Integer rand = (Math.random() * 10000000).intValue();
    randomNamesSet.add('testAcct' + rand);
}
List<String> randomNamesList = new List<String>(randomNamesSet);

// Timing our common loop so that we can remove its effect on the results
beforeDTControl = DateTime.now();
beforeCPUControl = Limits.getCpuTime();
for(Integer i = 0; i < iterations; i++){
    String name = randomNamesList[randomIndex[i]];
}
afterCPUControl = Limits.getCpuTime();
afterDTControl = DateTime.now();

dtControl = afterDTControl.getTime() - beforeDTControl.getTime();
cpuControl = afterCPUControl - beforeCPUControl;

beforeDT1 = DateTime.now();
beforeCPU1 = Limits.getCpuTime();
for(Integer i = 0; i < iterations; i++){
    String name = randomNamesList[randomIndex[i]];
    Account acct = acctMap.get(name);
    Boolean contained = acct != null;
}
afterCPU1 = Limits.getCpuTime();
afterDT1 = DateTime.now();

beforeDT2 = DateTime.now();
beforeCPU2 = Limits.getCpuTime();
for(Integer i = 0; i < iterations; i++){
    String name = randomNamesList[randomIndex[i]];
    Boolean contained = acctNameSet.contains(name);
}
afterCPU2 = Limits.getCpuTime();
afterDT2 = DateTime.now();
    
system.debug(String.format('map.get approach:\nClock Time elapsed: {0}\nCPU Time used: {1}', new List<String>{
    String.valueOf(afterDT1.getTime() - beforeDT1.getTime() - dtControl),
    String.valueOf(afterCPU1 - beforeCPU1 - cpuControl)
}));
             
system.debug(String.format('set.contains approach:\nClock Time elapsed: {0}\nCPU Time used: {1}', new List<String>{
    String.valueOf(afterDT2.getTime() - beforeDT2.getTime() - dtControl),
    String.valueOf(afterCPU2 - beforeCPU2 - cpuControl)
}));

The results of 5 runs (100k operations) are as follows:

run   map.get()                 set.contains()
1     315ms clock, 297 CPU      346ms clock, 274 CPU
2     354ms clock, 279 CPU      391ms clock, 312 CPU
3     280ms clock, 257 CPU      230ms clock, 231 CPU
4     355ms clock, 275 CPU      292ms clock, 274 CPU
5     302ms clock, 294 CPU      338ms clock, 257 CPU

Conclusion

The run-to-run variance within each approach is on par with the difference between them, so there is no significant performance difference between the two approaches.

I was slightly kind to the map.get() approach in my measured runs by having the .get() and the comparison to null evaluated in a single expression (saving one assignment operation), but this difference is insignificant.

In either approach, you’re looking at a cost of around 0.00275 ms of CPU time per call (Limits.getCpuTime() reports milliseconds, and each run made 100,000 calls). Like I said, this is a micro-optimization. The biggest difference here would be in characters typed.
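
As a rough check on that per-call figure, here is a minimal anonymous Apex sketch that averages the CPU numbers from the five runs above and divides by the 100,000 lookups in each run. It assumes the CPU values are milliseconds, as reported by Limits.getCpuTime(). The variable names are made up for illustration.

```apex
// CPU times per run, copied from the results above (assumed to be milliseconds).
List<Integer> mapCpu = new List<Integer>{297, 279, 257, 275, 294};
List<Integer> setCpu = new List<Integer>{274, 312, 231, 274, 257};
Integer lookupsPerRun = 100000; // matches `iterations` in the benchmark

// Sum the CPU time across runs for each approach.
Decimal mapTotal = 0;
Decimal setTotal = 0;
for (Integer ms : mapCpu) { mapTotal += ms; }
for (Integer ms : setCpu) { setTotal += ms; }

// Average per run, then divide by the number of lookups in a run.
Decimal mapPerCall = mapTotal / mapCpu.size() / lookupsPerRun;
Decimal setPerCall = setTotal / setCpu.size() / lookupsPerRun;

System.debug('map.get() CPU ms per call: ' + mapPerCall);
System.debug('set.contains() CPU ms per call: ' + setPerCall);
```

The totals are 1402 ms for map.get() and 1348 ms for set.contains(), so the averages are 280.4 ms and 269.6 ms per run. That works out to a little under 0.003 ms per lookup either way, which is where the 0.00275 figure comes from.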

The Set approach also takes less heap space (since you aren’t storing the value as well as the key like you are with the Map approach). The Set approach takes roughly 57% of the heap space that the Map approach does in my test, and that would likely be even more in favor of the Set approach in real usage. Even if your query returns a single field, you automatically get the Id (and recordTypeId if you have a custom record type for the object).
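
If you want to check the heap difference in your own org, something like the following anonymous Apex might do. This is only a sketch, not the measurement used above: the collection size and variable names are hypothetical, the Set is built independently of the Map, and Limits.getHeapSize() is only an approximation, so treat the numbers as rough.

```apex
Integer n = 10000; // hypothetical collection size, kept small to stay well under the heap limit

// Heap used by a Set holding only the names.
Integer heapStart = Limits.getHeapSize();
Set<String> nameSet = new Set<String>();
for (Integer i = 0; i < n; i++) {
    nameSet.add('testAcct' + i);
}
Integer heapAfterSet = Limits.getHeapSize();

// Heap used by a Map holding the same names plus an sObject per name.
Map<String, Account> acctByName = new Map<String, Account>();
for (Integer i = 0; i < n; i++) {
    String name = 'testAcct' + i;
    acctByName.put(name, new Account(Name = name));
}
Integer heapAfterMap = Limits.getHeapSize();

System.debug('Set heap (bytes): ' + (heapAfterSet - heapStart));
System.debug('Map heap (bytes): ' + (heapAfterMap - heapAfterSet));
```

With queried records instead of in-memory ones, the Map values would also carry the Id (and any other queried fields), so the gap would likely be wider.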

Attribution
Source: Link, Question Author: Moonpie, Answer Author: Derek F
