Monday, January 23, 2012

Bland - Altman plot in R

Not so long time ago I did a comparison study of two software packages, more precisely their ability to estimate random effects. The plain approach was to do correlations among the outcomes, and if they were high I assumed that both programs are OK. I presented this to my more experienced colleagues who suggested to use Bland-Altman plots to confirm the results. This is a reasonably simple technique to measure the agreement of the two outcomes. As they say in their paper Statistical methods for assessing agreement between two methods of clinical measurement (pdf linked, worthwile to read):

In clinical measurement comparison of a new measurement technique with an established one is often needed to see whether they agree sufficiently for the new to replace the old. Such investigations are often analysed inappropriately, notably by using correlation coefficients.The use of correlation is misleading. An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.

 Simple, yet beautifull technique.

Naturally I was searching for implementartion in R. I quickly found and installed a package called ResearchMethods, but personally I thought it has very few soft tuning possibilities (e.g. scaling or color change). I went deeper and found a quite well documented page with a custom/modified R function. 

I used this one, as I could scale the y axis, so the final plot was nicer. On the other hand I also found that this particular function was written for demonstration purposes, but it had difficulties to run on other data.

For example the limits of the y axis were hard coded to -60, 60, which is quite problematic if you are interested in much higher or smaller differences (In my case on the 0.01 level...). Also the data set names were hard coded into the funtion.

So I modified the code to a more generic function like this:

BAplot <- function(x,y,yAxisLim=c(-1,1),xlab="Average", ylab="Difference") {
   d <- ((x + y)/2)
   diff <- x - y        
   plot(diff ~ d,pch=16,ylim=yAxisLim,xlab=xlab,ylab=ylab)
   abline(h=mean(diff)-c(-2,0,2)*sd(diff),lty=2)
}
 You call it as:
BAplot(testSet1,testSet2,yAxisLim=c(-0.1,0.1))

The "testSet"s are the datasets to compare, the yAxisLim modifies the scaling of the axis according to your needs. You might modify the labels of x and y axis if you wish with xlab and ylab.