Day two kicked off with a fantastic keynote by
Atul Butte, Associate Professor in Medicine and Pediatrics, Stanford University
School of Medicine: Translating a trillion points of data into therapies, diagnostics and new insights into disease [PDF] [Video on YouTube]. This one was well worth a separate blog post.
Butte starts his presentation with some great
examples of how the availability of a wealth of open data has already radically
changed biomedical research. Over one million datasets are now openly available
in the standardized GeneChip format. A search for breast cancer samples in the NCBI GEO DataSets database gives 40k results, more than the best lab will ever have
in their stores. And PubChem has more samples than all pharma companies
combined, completely open.
The availability of this data is leading to
new developments. Butte cites a recent study that, by combining datasets,
revealed ‘overfitting’: everybody does an experiment in exactly the same
way, leading to reproducible results that are irrelevant to the real world.
But this is tame compared to the change in the
whole science ecosystem with the advent of online marketplaces. Butte goes on
to show a number of thriving ecommerce sites - “add to shopping cart!” - where
samples can be bought for competitive prices. Conversant Bio is a
marketplace for discarded samples from hospitals, with identifiers stripped
off. Hospitals have limited freezer space, have biopsy samples that can be
sold, and presto. What about the ethics? "Ethics is a regional thing. They can get away with a
lot of stuff in Boston we can't do in Palo Alto." Now any lab can buy
research samples for a low price and develop new blood marker tests. A test for preeclampsia, the disease now best known from Downton Abbey, was recently developed this way.
Marketplaces have also sprung up for services, such as AssayDepot.com. This is a clearinghouse for medical
research services, including animal tests. Thousands of companies provide these
worldwide. Butte stresses that it is not just a race to the bottom and to China, but that this
also creates opportunity for new specialized research niches, such as a lab
specializing in mouse colonoscopies. It also makes it possible to do real double-blind
tests by simply buying more tests from different vendors (with different
certifications, just to diversify). This makes it especially interesting to
investigate other effects of tested and approved drugs. Which is a good thing,
because the old way of research on new drugs is not sustainable when patents
run out (the “pharma patent cliff of 2018”).
This new science ecosystem is built on top of the availability
of open data sets, but sustainability questions remain to be solved. Butte sees two players here: funders and the repositories themselves.
Incentives for sharing are lacking. Altmetrics are just beginning, and funders
need to kick in. Secondary use grants are an interesting new development.
Clinical trials may be the next big thing. They are the most expensive experiments in
the world, costing $200 million each. 50% fail, and not even a paper is written about them... Butte expects funders to start requiring publications on negative trials and publishing
of the raw data.
The international repositories are at the
moment mostly government funded, and this funding may run out. Butte thinks that mirroring
and distributing is the future. He also stresses that repositories need to
bring the cost down - outsourcing! - and show real use cases that will inspire people. The
repositories that will win are the ones that yield the best research.