The R statistical package is widely used software; I use it myself to analyze the data for my studies. As any computer scientist knows, numeric representations of non-integer numbers in computers are mostly approximate, typically following the IEEE 754 standard, and R is no exception.

Due to the approximate nature of numeric variables, the result of an expression containing several operations may not be exact. How relevant are the errors due to the limited accuracy of the representation?

A typical example of a paradoxical result is the following:

```
> sum(c(1,2,-3))
[1] 0
> sum(c(1,2,-3)/10)
[1] 2.775558e-17
```

This result is due to the fact that the values 0.1, 0.2, and 0.3 cannot be represented exactly in the floating point representation. In the following I will recap the principles of floating point representation and provide some basic advice to improve the accuracy of results.

## Floating point representation

The floating point representation is based on the following formula:

value = (-1)^*s* × *c* × *b*^*e*

where

- ***s*** is the sign (represented by 1 bit)
- ***c*** is the coefficient (significand), which always takes the form 1.xxx
- ***e*** is the exponent, which can be positive or negative
- ***b*** is the base (typically 2) and is fixed

Since *c* is expressed in base *b* with a finite number of digits, some values turn into periodic numbers. The value 0.1 (decimal) is such an example; in binary fixed point notation it can be written as the periodic number 0.00011001100110011001100…

Using the floating point representation it can be represented as:

```
s = 0
c = 1.1001100110011…
e = -100 (binary, i.e. -4 in decimal)
```
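Since the periodic coefficient must be truncated to fit in a finite number of bits, 0.1 is stored as the closest representable double, which is slightly larger than one tenth. This can be verified directly in R by printing the value with more digits than the default:

```r
# print 0.1 with 20 decimal digits to expose the stored approximation
sprintf("%.20f", 0.1)
# the printed digits begin with 0.100000000000000005…
```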

In addition to floating point values, R also provides an integer representation, which is exact (not approximate).
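As a quick illustration (plain R, no extra packages): the `L` suffix and the colon operator produce integers, arithmetic among integers stays integer, but division always returns a double:

```r
typeof(1:10)          # "integer": the colon operator produces integers
typeof(1L + 1L)       # "integer": sums and products of integers stay exact
typeof(1L / 2L)       # "double": division always yields a floating point value
.Machine$integer.max  # the largest representable integer, 2^31 - 1
```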

## Relevance of accuracy errors

For several applications, the accuracy errors deriving from the limitations of the floating point representation are negligible in absolute terms. For instance, the sum we observed above has a magnitude of 10^-17, which for most applications falls well outside the relevant domain.

However, independently of the application and the relevant domain, one case is particularly critical: whenever we have to compare the result to some predefined value, e.g. to a significance level. Again with the previous numbers:

```
> sum(c(1,2,-3)/10) == 0
[1] FALSE
```

The consequences of a wrong comparison may be dramatic, up to drawing the wrong conclusion when applying inferential statistics.
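A common mitigation in R, rather than fixing the computation itself, is to compare within a tolerance instead of testing exact equality; the built-in `all.equal` does exactly this, with a default tolerance of about 1.5e-8:

```r
x <- sum(c(1,2,-3)/10)
x == 0                   # FALSE: exact comparison fails
isTRUE(all.equal(x, 0))  # TRUE: comparison within a small tolerance
abs(x) < 1e-9            # TRUE: an explicit tolerance chosen by hand
```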

## Precision improvement advice

Often, when dealing with floating point numbers, we have to accept the accuracy errors and apply a few tricks that can at most mitigate them (e.g. performing the operations between values of similar orders of magnitude first).
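For instance, grouping operands of similar magnitude first can preserve digits that would otherwise be lost:

```r
(1e16 + 1) - 1e16   # 0: the 1 is absorbed by the much larger 1e16
(1e16 - 1e16) + 1   # 1: cancelling the large values first preserves the 1
```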

There is one specific condition in which we can apply some basic techniques to minimize, and possibly avoid, such errors: when integer values are involved in a calculation that eventually results in a floating point number.

The simple rule is: first perform all the operations among integers, which yield integer results, and only at the end of the calculation perform the operations that return floating point values. For instance, in our running example, first perform the sum among integers, which returns an integer, and only later apply the division, which returns a floating point result:

```
> sum(c(1,2,-3))/10 == 0
[1] TRUE
```

It is not always so easy; sometimes you need to transform the formulas.

## A practical example

Let’s take a simple example: the function to compute the Vargha and Delaney A statistic. This is an effect-size measure that has been recommended – together with the similar Cliff’s delta – when parametric statistics cannot be applied.

Let us take the code reported in this post (slightly polished):

```r
AMeasure <- function(a,b){
  r = rank(c(a,b))
  r1 = sum(r[seq_along(a)])
  m = length(a)
  n = length(b)
  A = (r1/m - (m+1)/2)/n # critical
  A
}
```

On line 6 the formula for computing *A* uses three integer values: *r1*, *m*, and *n*. The computation suffers from accuracy errors, e.g.:

```
> AMeasure(1:10,9:18) == 0.02
[1] FALSE
```

Using the above rule, the formula can be replaced with a formally equivalent one that first performs all the integer-preserving operations (`*`, `-`, `+`) and only at the end performs the operation that leaves the integers (`/`):

```r
A = (2*r1 - m*(m+1)) / (2*m*n)
```

Such a change removes the accuracy error and allows for a correct comparison:

```
> AMeasure1(1:10,9:18) == 0.02
[1] TRUE
```

In practice, a small change can remove the problem. In this specific case the formula rewrite increased the number of operations from 5 to 7; overall, the resulting performance penalty is about a 3% increase in execution time, probably worthwhile given the accuracy improvement.

## Appendix: the whole source code

```r
## The original function
AMeasure <- function(a,b){
  r = rank(c(a,b))
  r1 = sum(r[seq_along(a)])
  m = length(a)
  n = length(b)
  A = (r1/m - (m+1)/2)/n # critical
  A
}

## The function with modified formula
AMeasure1 <- function(a,b){
  r = rank(c(a,b))
  r1 = sum(r[seq_along(a)])
  m = length(a)
  n = length(b)
  A = (2*r1 - m*(m+1))/(2*m*n)
  A
}

# input data
x = 1:10
y = 9:18
AMeasure(x,y) == 0.02
AMeasure1(x,y) == 0.02

# empirical time performance comparison
orig = c()
new = c()
for(n in 1:100){
  orig = c(orig, system.time(for(i in 1:5000) AMeasure(x,y))[1])
  new = c(new, system.time(for(i in 1:5000) AMeasure1(x,y))[1])
}

## visualize difference
boxplot(list(orig,new))
mtext(median(orig), side=3, at=1)
mtext(median(new), side=3, at=2)
perf.loss = (median(new) - median(orig)) / median(orig)
segments(1, median(orig), 2, median(new), col="red", lwd=2)
text(1.5, median(new), paste(round(perf.loss*100,1), "%", sep=""), col="red")
```