Building trust through transparency: An open science conversation with Geir Kjetil Sandve

The Researcher's Source
By: Erika Pastrana, Mon Apr 13 2026

Vice President, Nature Research Journals Portfolio

Welcome to a new series of blog posts, where researchers share their experiences of open science practices and the impact that sharing open data, code or protocols can have. 

Geir Kjetil Sandve is Professor of Scientific Computing and Machine Learning at the University of Oslo, Norway. He develops machine learning methods for life sciences and public health, with recent work focusing on climate-sensitive disease prediction. As a senior researcher, Professor Sandve has extensive publishing, teaching, and supervision experience, with articles in journals such as Nature Machine Intelligence, including “Improving generalization of machine learning-identified biomarkers using causal modelling with examples from immune receptor diagnostics” and “The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires”.

We asked Professor Sandve to share his experiences of open science practices and the impact that sharing open code can have on public health and research in the age of deep learning. 

Can you briefly introduce yourself and your research area?  

I do research within machine learning, and often I do methodological research that is closely inspired by real use cases. I try to always base my new methodological projects on some concrete need. I have worked on machine learning in the life sciences for 20 years; I first worked in genomics, then pharmacoepidemiology, and, for some years, quite intensely with immune receptors, that is, adaptive immune cells and how they recognise foreign threats. Most recently, I'm spending most of my time on how climate change is affecting health, such as machine learning for malaria and dengue fever, and predicting outbreaks ahead of time.

How did you first become interested in open science practices?  

I think it was during my PhD, where we published quite a bit in the BMC series, including BMC Bioinformatics and Genome Biology, which published open access articles. I liked that for two reasons. As a computer science researcher, it’s not just about whether you can find a paper in the end but how quickly you can browse and navigate papers. Even though we had access to most of the papers we needed in my PhD, the open access ones were more convenient. That was my entry point. 

For open software, it was more about transparency. I could see that when anything was open science, it was much easier to double-check if things had been done correctly and feel reassured that I could trust the paper in detail. Also, it's much easier to build on. In my PhD, I spent most of my time on preparation and boilerplate aspects. When I went in a more open software direction, where I tried to create open science projects that were built on open software, I could focus more on new ideas using the context of existing work.

What motivated you to adopt open science practices?  

I went into research as a bit of an idealist. I think it's quite a tough sector to be in, so I feel we have to find motivation in some ideals of science. One is that if I do something, I want it to be as helpful to others as possible. I want others to be able to build on what I've done. The second is trust. In science, we shouldn't be forced to trust people. We should instead always be open to checking each other’s work.

You’re never 100% proud of your research code. You always know that if you had unlimited time, you would make it even better. Allowing yourself to share it and say, ‘Okay, this is what I had time for,’ and being able to trust that you’ve made the right prioritisations is important. If you have done something wrong, there is a big chance that somebody will catch it. 

Have you ever faced any particular challenges or barriers in undertaking open science practices? 

Data sensitivity is one challenge, but another big challenge is the computational cost of ensuring reproducibility, especially when analyses are run frequently. We have practices to ensure transparency, reproducibility, and unbiasedness of results, but if you want to go into deep learning, you simply cannot apply them realistically given computational budgets. I think we are entering a phase where sometimes computation is so intense and demanding that we have to sacrifice a bit of this transparency and reproducibility in order to save computational time.

Also, it’s one thing to share code, but for it to be valuable, people need to be able to run it. That is becoming harder, as it may depend on particular GPUs or system setups. It may also sometimes rely on closed models from large commercial companies or require access to computing centres that are not openly available. I think these are the main challenges we face in the machine learning community specifically.

Is there a particular success story or example you’re proud of that illustrates the benefits of open science?  

We have one ongoing project that is not yet academically published, but it has already had an impact. I’m collaborating with the HISP Centre to develop DHIS2, which is the world’s largest open-source health information system. It is used as the national health information system in around 70 countries, where vaccination data, medicine stocks, tuberculosis outbreaks, and other health indicators are tracked.  

I’ve worked with them for three years to develop machine learning methodologies to predict climate-sensitive disease outbreaks. While there are many open-source models for early warning systems, we are trying to build a platform that enables such models to be used operationally, as that can be quite challenging in practice. We are building an open community where everything is open source and where we share our ambitions and ideas transparently. The goal is to bring the community together so countries can retain control over the models and how they are used, make informed decisions, build local capacity, and communicate effectively with governments to ensure real-world impact. 

It has been fascinating to me because it goes beyond publishing open access papers or open-source code. It requires an open development cycle, where we contribute core software platform technologies to a shared community and develop them collaboratively. 

Another point is that open science is also a way of contributing to capacity building. For researchers from smaller institutions, open science (its processes, networks, software, and publications) is crucial for being part of a broad international community beyond their own office. 

Have open science practices influenced your career progression? 

Yes, I think so. When I was hired for my permanent position, they wanted someone who could interact with existing research. I was able to say that I bring with me a lot of open, actively developed code that others can contribute to. This is better for students, as they can contribute to open-source code, which facilitates collaboration and interaction with the research environment. Open science practices have also increased the impact of my work. Several people have been inspired by the code I've made, and I have gained collaborations both nationally and internationally based on it.

What advice would you give to researchers considering adopting open science practices? 

It is completely worth it. I think you gain so much by going in this direction. If you see it as a community and become conscious of your strengths and weaknesses, it will open so many collaboration and learning opportunities. Open science is a way of thinking about science and work that makes it feel meaningful and easier to stay motivated in the long run. 

I feel the point of open science is that work should be possible to reproduce or reuse in practice. To me, open science is not about whether it’s theoretically possible with unlimited time to build on something but about ensuring it’s open in a way that actually invites reuse, transparency, and reproducibility. Sharing code without putting yourself in others’ shoes and considering how it might realistically be reused is not truly in the spirit of open science. It's not about checking a box; it's about actually contributing.

Geir Kjetil Sandve, PhD, Professor of Scientific Computing and Machine Learning, University of Oslo, Norway 

Geir Kjetil Sandve studied computer science at the Norwegian University of Science and Technology (NTNU). During his PhD, he surveyed, benchmarked and developed machine learning methodology for motif discovery in biosequences. For his postdoctoral studies at the University of Oslo, Norway, he broadened his understanding of statistics, collaborating with biologists and statisticians to pioneer statistical analysis of genomic co-localization. Currently, his main focus is on doing his part to help make our research environment fun but productive, brutally honest but supportive, and visionary while delivering on our promise.


Related content

  • Best practices for transparency and reuse:
  1. How to share your research protocols and methods openly
  2. How to share your research code openly

  • Supporting open science practices:
  1. Why share your research data?
  2. Why sharing protocols matters
  3. Why sharing your code matters

Erika Pastrana

Erika Pastrana is the Vice President of the Nature Research and Reviews Journals, a distinguished collection of over 60 scientific publications that span diverse fields—from Nature Sustainability and Nature Reviews Psychology to Nature Medicine and Nature Reviews Genetics. Under her leadership since January 2025, these journals uphold the highest standards of scientific reproducibility, global impact, and a strong commitment to open science.

Erika began her editorial career in 2010 as an editor at Nature Methods, focusing on neuroscience. In 2014, she transitioned to Nature Communications as a Team Manager, and by 2017, she became Editorial Director of the Nature Research Journals division, overseeing editorial strategy for health and applied sciences.

She holds a degree in Biochemistry and Molecular Biology and a Ph.D. in Neuroscience from the Universidad Autónoma de Madrid, where she researched axonal regeneration in animal models of nervous system injury. Erika continued her scientific work with four years of postdoctoral research at Columbia University in New York.

In recognition of her contributions to scientific communication, Erika received the 2024 Communication Award from the Spanish Geographical Society.