18 months of effsize

I developed the R package effsize one and a half years ago, in July 2013.

It is a package for efficient effect size computation. The algorithms have been optimized to remain efficient even with very large data sets. It contains functions to compute standardized effect sizes for experiments (Cohen's d, Hedges' g, Cliff's delta, Vargha and Delaney's A).
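To make the four measures concrete, here is a minimal base R sketch of their textbook definitions. The function names ending in _sketch are invented for this illustration and this is not the code of effsize; in particular the Cliff's delta version compares every pair of observations, so it is quadratic in time and memory, which is the kind of cost that becomes a problem on very large data sets.

```r
# Illustrative textbook definitions only (hypothetical names, not the effsize code).

# Cohen's d: difference of the means divided by the pooled standard deviation
cohen_d_sketch <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Hedges' g: Cohen's d multiplied by the usual small-sample correction factor
hedges_g_sketch <- function(x, y) {
  n <- length(x) + length(y)
  cohen_d_sketch(x, y) * (1 - 3 / (4 * n - 9))
}

# Cliff's delta: P(x > y) - P(x < y), estimated naively over all pairs
cliff_delta_sketch <- function(x, y) {
  mean(sign(outer(x, y, "-")))
}

# Vargha and Delaney's A: P(x > y) + 0.5 * P(x == y), a linear function of delta
vda_sketch <- function(x, y) {
  (cliff_delta_sketch(x, y) + 1) / 2
}

treatment <- c(1, 2, 3, 4, 5)
control   <- c(2, 3, 4, 5, 6)
cohen_d_sketch(treatment, control)
cliff_delta_sketch(treatment, control)
vda_sketch(treatment, control)
```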

Today I looked at the download statistics from http://cran-logs.rstudio.com/, the logs of the mirror linked to the most famous R IDE, RStudio. The statistics therefore refer to only one of the 100+ mirrors of the R project.
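As a rough sketch of how such a count can be obtained, the snippet below works on a tiny synthetic data frame that mimics three of the columns of the daily log files (date, package, ip_id). The rows are made up for illustration, and the sketch assumes that ip_id is only meaningful within a single day, so it counts distinct (date, ip_id) pairs per month rather than truly distinct users.

```r
# Synthetic rows in the layout of the daily CRAN log files (not real data)
log <- data.frame(
  date    = as.Date(c("2014-01-03", "2014-01-03", "2014-01-20",
                      "2014-02-01", "2014-02-01")),
  package = c("effsize", "effsize", "effsize", "ggplot2", "effsize"),
  ip_id   = c(1, 1, 2, 1, 7)
)

# Keep only the package of interest and tag each row with its month
eff <- log[log$package == "effsize", ]
eff$month <- format(eff$date, "%Y-%m")

# Drop repeated downloads from the same id on the same day,
# then count what is left in each month
eff <- unique(eff[, c("month", "date", "ip_id")])
monthly <- table(eff$month)
monthly
```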

The number of unique IP addresses that downloaded the package is shown in the following figure:


What I saw represents a success for me: during the last year an average of 311 distinct users per month installed the package.

Of course that figure is nothing compared to very famous packages, such as ggplot2 or plyr, which count monthly downloads on the order of 15000, i.e. about 50 times more than mine.

Nevertheless, for a small and very limited package, I consider it a success. I take it as a stimulus to devote my time to this small piece of code.


Bullet Graph in R

Bullet graphs, proposed by Stephen Few, are an effective and efficient visual representation for key indicators (e.g. KPIs). I find them highly suitable for building dashboards. For this purpose I developed an implementation in R to draw such graphs.

In summary, a bullet graph is a variation of a bar graph with two additional references: a thick line that marks the reference point (e.g. benchmark, goal, or previous value) and a background that identifies three levels (e.g. Low-Medium-High, Bad-Average-Good, etc.).

The figure below illustrates the main components, though a complete specification is available on Few’s web site.

Bullet Graph elements (from Wikipedia)
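To show how these components map onto base R graphics, here is a minimal sketch. The function bullet_sketch() and its arguments are hypothetical names chosen for this illustration, the values are arbitrary, and this is not the bulletgraph() function discussed in this post.

```r
# Hypothetical helper for illustration: draws one horizontal bullet graph.
# x      : the measured value (the bar)
# ref    : the reference point (the thick line)
# limits : four increasing values delimiting the three background levels
bullet_sketch <- function(x, ref, limits, name = "") {
  stopifnot(length(limits) == 4, !is.unsorted(limits))
  bands <- gray(c(0.6, 0.75, 0.9))  # three shades of gray, darkest first
  # Empty plot spanning the whole scale, without a vertical axis
  plot(NA, xlim = range(limits), ylim = c(0, 1), yaxt = "n",
       xlab = "", ylab = "", bty = "n")
  # Background: one rectangle per qualitative level
  rect(limits[1:3], 0.1, limits[2:4], 0.9, col = bands, border = NA)
  # The measure: a thinner, darker bar starting at the lower limit
  rect(limits[1], 0.35, x, 0.65, col = "black", border = NA)
  # The reference point: a thick vertical line
  segments(ref, 0.2, ref, 0.8, lwd = 4)
  # Label on the left-hand side
  mtext(name, side = 2, las = 1, line = 1)
  invisible(list(bar = c(limits[1], x), ref = ref))
}

geometry <- bullet_sketch(x = 70, ref = 80, limits = c(0, 50, 75, 100),
                          name = "Score")
```

The function returns the drawn geometry invisibly, which makes it easy to check what was plotted without inspecting the picture.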

The function bulletgraph() provides a simple interface to plot a bullet graph. For instance, to reproduce the example above we can use the following statements:

par(mfrow=c(2,1), mar=c(2,9,.1,1))

            name= "Revenue 2005 YTD",subname="(U.S. $ in thousands)",

            name= "Revenue 2005 YTD",subname="(U.S. $ in thousands)")

This generates the following diagram:


Example of generated bullet graph

The code is available as open-source under the GPL at:

Enjoy and let me know!

Update: added an option (colored=F) to draw a grayscale background (as recommended in the specification), and a subtitle to be able to reproduce the example.

Update 2: now the revised and tested code, with documentation too, is available on GitHub, here: https://github.com/mtorchiano/MTkR/wiki/Bullet-Graph

Effect size of R precision

The R statistical package is a widely used piece of software; I use it myself to analyze data for my studies. As any computer scientist knows, numeric representations of non-integer numbers in computers are mostly approximate. They typically follow the IEEE 754 standard, and R is no exception.

Due to the approximate nature of numeric variables, it is possible that the result of an expression containing several operations is not exact. How relevant are the errors due to representation accuracy?
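A classic small illustration of the issue, given here only as a sketch, is the sum 0.1 + 0.2: neither operand has an exact binary representation, so the result differs from 0.3 by a tiny amount. The variable names are chosen just for this example.

```r
a <- 0.1 + 0.2

# Exact comparison fails because of the representation error
exact_equal <- (a == 0.3)

# all.equal() compares within a numeric tolerance instead
near_equal <- isTRUE(all.equal(a, 0.3))

# Size of the error, in absolute terms and relative to the expected value
abs_err <- abs(a - 0.3)
rel_err <- abs_err / 0.3

exact_equal          # FALSE
near_equal           # TRUE
abs_err              # tiny but not zero
.Machine$double.eps  # spacing between 1 and the next representable double
```

The error is many orders of magnitude smaller than the values involved, which is why comparisons within a tolerance are usually preferred to exact equality on doubles.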

Continue reading

R optimization: midpoint

In several algorithms (e.g. binary search) you need to find the midpoint between two indexes. In practice you have to compute the integer average of the two indexes. In R there are several ways of performing the computation; three that may easily come to mind are:

(x+y) %/% 2
as.integer( (x+y)/2 )
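As a quick sanity check, and only as a sketch with made-up data, the two expressions shown above can be compared on random positive indexes. They agree on the value, although the first returns a double and the second an integer; they would differ for a negative sum, because %/% rounds down while as.integer() truncates toward zero.

```r
set.seed(42)
# Random positive indexes, small enough that x + y cannot overflow an integer
x <- sample.int(1000000L, 100000L, replace = TRUE)
y <- sample.int(1000000L, 100000L, replace = TRUE)

m1 <- (x + y) %/% 2           # integer division, result stored as double
m2 <- as.integer((x + y) / 2) # real division, then truncation to integer

all(m1 == m2)                 # same midpoints for positive indexes
typeof(m1)
typeof(m2)

# With a negative sum the two approaches no longer agree
(-3) %/% 2                    # rounds down
as.integer(-3 / 2)            # truncates toward zero

# Rough timing of each variant on the whole vectors
system.time(for (i in 1:100) (x + y) %/% 2)
system.time(for (i in 1:100) as.integer((x + y) / 2))
```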

Which one is the most efficient?

To find out, I ran a small experiment:

Continue reading