# How to identify outliers in server uptime performance data?

I have a Python script that creates a list of lists of server uptime and performance data, where each sub-list (or ‘row’) contains a particular cluster’s stats. Nicely formatted, it looks something like this:

-------  -------------  ------------  ----------  -------------------
Cluster  %Availability  Requests/Sec  Errors/Sec  %Memory_Utilization
-------  -------------  ------------  ----------  -------------------
ams-a    98.099          1012         678          91
bos-a    98.099          1111         12           91
bos-b    55.123          1513         576          22
lax-a    99.110          988          10           89
pdx-a    98.123          1121         11           90
ord-b    75.005          1301         123          100
sjc-a    99.020          1000         10           88
...(so on)...


So in list form, it might look like:

[['ams-a', 98.099, 1012, 678, 91], ['bos-a', 98.099, 1111, 12, 91], ...]


### My question:

• What’s the best way to determine the outliers in each column? Or are outliers not necessarily the best way to attack the problem of finding ‘badness’?

In the data above, I’d definitely want to know about bos-b and ord-b, as well as ams-a since its error rate is so high, but the others can be discarded. Since higher is not necessarily worse in every column, nor is lower, I’m trying to figure out the most efficient way to do this. NumPy seems to get mentioned a lot for this sort of thing, but I’m not sure where to even start with it (sadly, I’m more sysadmin than statistician…). When I asked over at Stack Overflow, someone suggested using SciPy’s scoreatpercentile function (scipy.stats.scoreatpercentile) and throwing out anything over the 99th percentile. Does that seem like a good idea?
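For concreteness, the percentile idea could be sketched roughly as follows. This is only an illustration using `numpy.percentile` on the seven sample rows; the helper name `flag_by_percentile` and the 25th/75th percentile cut-offs are assumptions chosen for this tiny sample, not recommendations (with seven rows, a 99th-percentile cut-off would only ever flag the single largest value).

```python
import numpy as np

# Sample rows from the table above:
# cluster, %availability, requests/sec, errors/sec, %memory utilization
rows = [
    ["ams-a", 98.099, 1012, 678, 91],
    ["bos-a", 98.099, 1111, 12, 91],
    ["bos-b", 55.123, 1513, 576, 22],
    ["lax-a", 99.110, 988, 10, 89],
    ["pdx-a", 98.123, 1121, 11, 90],
    ["ord-b", 75.005, 1301, 123, 100],
    ["sjc-a", 99.020, 1000, 10, 88],
]

names = [r[0] for r in rows]
data = np.array([r[1:] for r in rows], dtype=float)


def flag_by_percentile(values, lower=None, upper=None):
    """Boolean mask of values below the `lower` or above the `upper` percentile.

    Hypothetical helper: pass only the side that counts as 'bad' for a column.
    """
    mask = np.zeros(len(values), dtype=bool)
    if lower is not None:
        mask |= values < np.percentile(values, lower)
    if upper is not None:
        mask |= values > np.percentile(values, upper)
    return mask


# Low availability is bad; a high error rate is bad.
# The 25/75 cut-offs are illustrative assumptions for this small sample.
low_availability = flag_by_percentile(data[:, 0], lower=25)
high_errors = flag_by_percentile(data[:, 2], upper=75)

flagged = sorted(
    name for name, bad in zip(names, low_availability | high_errors) if bad
)
print(flagged)
```

One thing this makes visible: a percentile rule flags roughly a fixed fraction of clusters in every run, whether or not anything is actually wrong.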

(Cross-posted from Stack Overflow, here: https://stackoverflow.com/questions/4606288)

Based on the way you phrase the question

> are outliers not necessarily the best way to attack the problem of finding ‘badness’?
As an example, if all of your servers were at $98 \pm 0.1$% availability, a server at 100% availability would be an outlier, as would a server at 97.6% availability. But both may be within your desired limits.
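The difference between "statistically unusual" and "outside desired limits" can be illustrated with a small sketch. The availability figures below are made up to match the example, the 97% minimum is an assumed service limit, and the outlier rule used (a modified z-score based on the median and the median absolute deviation, with the conventional 3.5 cut-off) is just one common choice, not something prescribed here.

```python
import numpy as np

# Hypothetical availability figures (%): most servers sit near 98%,
# plus the two servers from the example (100% and 97.6%).
availability = np.array(
    [98.0, 98.05, 97.95, 98.0, 98.1, 97.9, 98.0, 98.02, 97.98, 100.0, 97.6]
)

# Outlier view: modified z-score from the median and the median absolute
# deviation (MAD). 0.6745 and 3.5 are the conventional constants for this rule.
median = np.median(availability)
mad = np.median(np.abs(availability - median))
modified_z = 0.6745 * (availability - median) / mad
is_outlier = np.abs(modified_z) > 3.5

# Limit view: compare each server against an assumed minimum acceptable value.
MIN_AVAILABILITY = 97.0  # assumption for illustration only
violates_limit = availability < MIN_AVAILABILITY

print("outliers:", availability[is_outlier])
print("below limit:", availability[violates_limit])
```

Here both 100% and 97.6% are flagged as outliers, yet no server falls below the assumed limit, so an outlier test alone would raise alerts that nobody needs to act on.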