Sunday 10 February 2013

The importance of logarithmic transformation in 'natural' data


Reading Edward Tufte's book about data analysis in politics and policy

http://www.edwardtufte.com/tufte/dapp/

Edward Tufte is one of the gurus of data analysis and visualization [0], and in this chapter [1] he shows, in a very didactic and clear way, the importance of logarithmic transformation for data made of naturally occurring counts.
   [0] http://en.wikipedia.org/wiki/Edward_Tufte
   [1] http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003uF


This is a very clear and didactic explanation of the importance of logarithmic transformation that anyone doing data analysis in the natural sciences or epidemiology should read.

A very important point it raises is that regression analysis of a model DOES NOT TEST the relationship but SHOWS the proportionality GIVEN THAT THE MODEL IS TRUE.
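
To make this concrete, here is a minimal Python sketch of my own (it is not taken from Tufte's book). It fits a straight line to log(y) against log(x) by ordinary least squares. The data are synthetic and made up for illustration (y = 3 * x^2 exactly), and the function name fit_log_log is just a hypothetical helper. The fitted slope b is the proportionality: if the model log(y) = a + b*log(x) is true, a 1% change in x goes with roughly a b% change in y. The fit itself does not tell you whether that model is the right one.

```python
import math

def fit_log_log(xs, ys):
    """Least-squares fit of log(y) = a + b*log(x); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope = covariance(log x, log y) / variance(log x)
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Synthetic data (assumption, for illustration only): y = 3 * x^2
xs = [1, 10, 100, 1000, 10000]
ys = [3 * x ** 2 for x in xs]

a, b = fit_log_log(xs, ys)
# b should be close to the exponent 2, exp(a) close to the constant 3
print("slope b =", round(b, 6), " constant exp(a) =", round(math.exp(a), 6))
```

Note that the x values span four orders of magnitude. On the raw scale the largest point would dominate the fit, while on the log scale all five points carry comparable weight.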


The last part of this section has a bit more mathematics, which some biologists have probably forgotten already, but it is worth reading anyway.

I truly recommend reading this even though it is a very old book (ed. 1976).

Final note: Remember to add 1 to your data before the log transform in order to avoid log(0). Don't do that if you have negative numbers ;-). Another option is to add a small quantity to all your 0s.
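
A quick sketch of both options in Python, using only the standard library. The function names and the default offset of 0.5 are my own assumptions for illustration, not a recommendation from the book; pick an offset that makes sense for the scale of your data.

```python
import math

def log_plus_one(values):
    """Option 1: log(x + 1) for every value; refuses negative input."""
    if any(v < 0 for v in values):
        raise ValueError("log(x + 1) is not meant for negative values")
    # math.log1p(v) computes log(1 + v) accurately, also for small v
    return [math.log1p(v) for v in values]

def log_with_zero_offset(values, offset=0.5):
    """Option 2: replace only the zeros by a small offset, then take logs.

    The default offset of 0.5 is a hypothetical choice for count data.
    """
    if any(v < 0 for v in values):
        raise ValueError("log is undefined for negative values")
    return [math.log(v if v > 0 else offset) for v in values]

# Made-up counts, including a zero
counts = [0, 1, 9, 99]
print(log_plus_one(counts))
print(log_with_zero_offset(counts))
```

The two options treat the non-zero values differently: log(x + 1) shifts every observation, while the offset version leaves non-zero values untouched and only moves the zeros.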
