Wednesday 21 July 2010

Nice blog entries about GWAS QC checks

Three very nice blog entries from campus coworkers to read:

The first two are about how to scrutinize a GWAS, and the third is about the sociological, ethical and political issues of giving feedback to DNA donors:

* [Daniel MacArthur] Serious flaws revealed in "longevity genes" study
* [Jeff Barrett] How to read a genome-wide association study
* [Vincent Plagnol] Communicating genetic data to DNA donors


The points made in the first two blog entries are relevant to the recent Nature paper
"Prepublication data sharing", where one of the recommendations of the Toronto International Data Release Workshop is:

Editors and reviewers

As reviewers of manuscripts submitted for publication, scientists should be mindful that prepublication data sets are likely to have been released before extensive quality control is performed, and any unnoticed errors may cause problems in the analyses performed by third parties. Where the use of prepublication data is limited or not crucial to a study's conclusions, the reviewers should only expect the normal scientific practice of clear citation and interpretation. However, when the main conclusions of a study rely on a prepublication data set, reviewers should be satisfied that the quality of the data is described and taken into account in the analysis.

Participants at the Toronto meeting recommended that journals play an active part in the dialogue about rapid prepublication data release (both in their formal guide to authors and informal instructions to reviewers). Journal editors should remind reviewers that large-scale data sets may be subject to specific policies regarding how to cite and use the data. Ultimately, journal editors must rely on their reviewers' recommendations for reaching decisions about publication. However, encouraging reviewers to carefully check the conditions for using data that authors have not created themselves can help to raise both the quality of analysis and fairness in citation of published studies.

If reviewers start to ask for these checks, studies using big consortium data (WTCCC etc.) would be fine, but studies using data from small labs would face serious problems when no web page or metadata is available beyond the supplementary information (if any) of a low-impact paper. I wonder whether resources like EGA, dbGaP or Gen2Phen could offer tools that make these checks easier for users, referees and readers in the public metadata area (as the data is expected to be in one of these repositories), and take a very 'proactive attitude': asking submitters to provide this kind of information as completely as possible, and backing the request with this Nature paper or a similar one.


Finally, the image from Daniel's entry comparing the Manhattan plot from the Science longevity paper with the WTCCC Manhattan plots:
