This post is the third in a blog series focused on researchers’ experiences with open science practices, highlighting the impact of sharing data, code and protocols openly.
Professor Zornitza Stark is a clinical geneticist at the Victorian Clinical Genetics Services in Melbourne, Australia. She works in the field of translational genomics, specialising in rare diseases. As a senior researcher, Zornitza Stark has wide publishing experience, with recent articles published in Nature Portfolio journals, such as “Call to action to scale up research and clinical genomic data sharing” and “Feasibility, acceptability and clinical outcomes of the BabyScreen+ genomic newborn screening study”. She is also Associate Editor for npj Genomic Medicine and has frequently published in the European Journal of Human Genetics.
We asked Professor Stark to share her experiences of open science practices and the impact that open data sharing has on patients and their families.
I'm a clinician researcher, so my time is split between clinical work and research. My research tends to be very translational; it's closely related to the type of clinical work that I do, which is primarily with families affected by rare disease. There are several projects that we're currently working on, including genomic newborn screening. We are also exploring how to leverage machine learning and artificial intelligence approaches to scale up the analysis or reanalysis of genomic data. I'm heavily involved in novel gene discovery for rare conditions and research into how to speed up the implementation of genomic testing into clinical practice, in particular how to generate the evidence that governments and policymakers need to fund it as part of healthcare. A big portion of my work involves working out the health economics for specific clinical applications of genomics.
Translational genomics is a relatively new field, which involves the application of new technologies. Working out how to do it is heavily dependent on sharing knowledge, but genomics also depends on the sharing of data; we need data for as many individuals as possible to work out what is rare and significant.
As a community, we share data using specific genetics platforms, such as GeneMatcher, ClinVar, and Shariant (an Australian genomics-led variant sharing initiative). I also lead PanelApp Australia, a platform for sharing information on gene–disease associations for diagnostic use. Through this platform, we contribute to the Gene Curation Coalition, which is an international aggregation database of gene–disease relationships.
From the sharing of code right through to how to conduct an economic evaluation as a protocol, or publishing a paper open access, we’re sharing knowledge on how to do relatively new tasks, because we're all new to this field.
“Open science ensures we share knowledge as quickly as possible, because that's what impacts both research and clinical outcomes”
- Zornitza Stark, Victorian Clinical Genetics Services, Melbourne, Australia
I think most people are in medicine and research for altruistic reasons. I believe in the underlying premise of open science. We are also generally funded by public funds, and work in a public healthcare system. I strongly feel that public funds should be used to create public goods.
Open science also ensures we share knowledge as quickly as possible, because that's what impacts both research and clinical outcomes. And those two are very closely intertwined in my type of work.
For the area that I work in, there is a lot of dependence on infrastructure. We know that patients and research participants are willing to share their data, so that’s generally not a barrier, and most researchers feel the same way. When we have been held back, it has been through a lack of political will to resolve potential legal issues and the lack of investment in large-scale infrastructure that would enable much more data sharing to occur.
As an example, as a federated country, Australia has not been able to create a centralised Australian genomic dataset that could aggregate genomic data from across the country. This is due to a lack of investment in infrastructure and a lack of agreement between all the parties that we can share across borders. Contrast that with the National Genomic Research Library set up in England, which has managed to aggregate hundreds of thousands of datasets into a single infrastructure that can be queried by researchers.
“Sharing data openly has also had an impact on citations: being recognised by the international community as a contributor increases the visibility of our work.”
- Zornitza Stark, Victorian Clinical Genetics Services, Melbourne, Australia
We have been better at organising data sharing for other types of data, including data at gene variant level, which is crucially important in terms of facilitating the accurate interpretation of genomic data. Because it is considered to be more de-identified, there has been less objection, and we have been able to build the necessary infrastructure to share this data type. This infrastructure has enabled us, as a country, to participate in international databases such as ClinVar, massively accelerating our submissions to this open international resource.
Having our own data accessible in both national and international databases actually alerts us when we have potentially misinterpreted something. It allows researchers to get in touch with us if they disagree with our assessment, which is often very helpful because it can have an impact on patient diagnosis. From a research perspective, it has enabled us to participate in the assembly of international cohorts of patients with rare diseases because it has alerted other researchers that there are patients in Australia with those conditions and provided a means to get in touch with the clinicians and the families to participate in that international effort. This has helped us to understand the natural history of rare conditions better and to link families with emerging therapies for rare diseases.
Another area to highlight that is almost entirely dependent on open data sharing is the discovery of new gene–disease associations; data sharing massively accelerates gene discovery. For example, the discovery of the RNU4-2 gene a couple of years ago was made possible through large-scale data sharing; a cohort of over 100 patients was assembled within a matter of weeks. It was a groundbreaking discovery, because we think this one gene accounts for nearly 1% of all cases of intellectual disability.
Sharing data openly has also had an impact on citations; being recognised by the international community as a contributor increases the visibility of our work.
For us, open science is the accepted way of working. Occasionally, I guess people may feel reluctant to share their work because they feel anxious about competition. My role as a senior researcher is to deal with those anxieties and instead emphasise all the benefits of being part of the international community and large-scale collaboration. Working this way just massively increases our capacity to produce world-class research.
The principal issue is infrastructure as an enabling factor. We need much more strategic planning and high-level support that extends beyond individual projects or groups, backed with investment and the enabling infrastructure to magnify the benefits of open access to data. For example, several of the datasets that we've generated over the past decade are no longer available or accessible because the funding for a particular project finished. Although the participants consented to data sharing, with the lack of infrastructure for data deposition, these data have now been lost, which has been heartbreaking to watch.
“My advice is always to embrace open science and to be an active participant in the international community; that will enrich your research and also your life.”
- Zornitza Stark, Victorian Clinical Genetics Services, Melbourne, Australia
I think my advice is always to embrace open science and to be an active participant in the international community, and that will only enrich your research and also your life.
There is much greater recognition of the importance of open science, and there is momentum: we have seen investment in large-scale projects in several countries, including the US and UK. They have created the opportunity to build some of the enabling infrastructure to create trusted research environments and develop and implement the standards that will enable data sharing. The demonstration that something is possible, and how to do it, is really important to move people forward in creating similar investments and participating together in these endeavours, and I hope it will accelerate open science practices further.