Kaggle is bridging the gap between AI and medicine. Learn more about research’s role.

The Source
By: Lucy Frisch, Fri May 22 2020

Fast and reliable information is critical right now, and the name of the game is collaboration. Kaggle's AI-powered literature review project is looking to leverage expertise from three very different sectors (basic science research, clinical medicine, and artificial intelligence) to provide continuously updated literature reviews that help address the COVID-19 pandemic.

We had a chat with Dr. Tayab Waseem, the AI literature review project lead, who has extensive experience bridging the worlds of AI and medicine. He is excited about how this project will not only affect the pandemic response but also change the way research is done moving forward. Find out how Kaggle is using machine learning to help the research and medical communities, and what role publishers play in this space.


The need to swiftly develop treatment protocols and vaccines and to stop the spread of the virus has led to an explosion in research within the scientific and medical community. The literature is advancing so rapidly that the community is struggling to keep pace. In response, the White House Office of Science and Technology Policy challenged the machine learning community to develop an automated literature evaluation toolset for the medical research community. Leaders from the National Institutes of Health, the Allen Institute for AI, the Chan Zuckerberg Initiative, Microsoft, the National Library of Medicine, Georgetown University’s Center for Security and Emerging Technology, and various other medical organizations were recruited to take part. Over 128,000 manuscripts were converted into an AI-readable format, and the AI community on Kaggle developed tools to mine the literature and answer high-priority scientific questions in the form of an AI-powered literature review.

Dr. Waseem has a PhD in Immunology and is a Public Policy Fellow for the American Association of Immunologists. He’s the Director of Medical Informatics and AI Integration at the Wagner Macula & Retina Centers and a medical student at Eastern Virginia Medical School. He met Anthony Goldbloom at a Stanford Human-Centered Artificial Intelligence Conference and a beautiful collaboration quickly flourished.

What are Kaggle’s aims and scope with regard to COVID-19?

TW: To create an AI-powered literature review tool that makes writing literature reviews, rapid reviews, meta-analyses and editorials more efficient. The toolset, which can be easily adapted for other purposes, will allow scientists and health care providers to find the most current information and to quickly identify knowledge gaps and opportunities for further research.

What is the primary use case of AI-generated tables and how is it changing the way researchers work?

TW: Ideally, these tables will aggregate all the relevant data needed to provide a meaningful overview of the pertinent evidence. This will aid the rapid production of systematic literature reviews, meta-analyses, etc.

Literature reviews need to be done for grants, papers, applications, etc. This will significantly decrease the amount of time it takes to go through and look for literature, decrease the amount of time it takes to extract relevant information, find papers that may be missed with a traditional search, and also identify gaps in knowledge. 

How is Springer Nature’s CORD-19 data being used in this instance?

TW: We use all papers from all journals, including preprints, because of the rapid pace of the pandemic. We are also working with major medical journals to develop a tool that will be useful for the medical and scientific research community at large, and we are asking journals for specific questions that they want us to tackle with it.

We plan to publish the results of our studies in medical journals. The goal is not for these tables to live on Kaggle, but to be put in front of the audiences that need and will use them, such as on medical and institutional websites.

How is Kaggle optimizing open source code and how critical is open data at times like this?

TW: All the AI code used to generate summary tables is released as open source under a Creative Commons license. Without the articles being made open access and machine readable, this project wouldn't be possible. More generally, datasets like the Johns Hopkins University data on cases, recoveries and fatalities, and Google's and Apple's releases of mobility data, have been crucial to understanding the pandemic.

Feedback from experts

What feedback have you had from practitioners in the field so far?

TW: Currently our team has over 150 medical and scientific volunteers across 30 institutes working on creating an AI-driven live literature review tool. We asked a few of our team members across specialties, ranging from medical students, residents, and attending physicians to PhD students and postdocs, how this project will impact the future of their field. In their own words:

  • Jose Morey (M.D., Eisenhower Fellow, Chief Medical Innovation Officer - Liberty BioSecurity): Using AI to augment literature reviews is the epitome of man + machine moving the needle forward for medicine. It is a perfect example of human-centered AI at its best.
  • Maikel Boot, PhD (Postdoctoral Fellow at Yale University): Having an AI/ML search engine provide a landscape analysis with relevant metrics of any given research topic would be a game-changer for writing grants, chapters and reviews. Especially when fields are rapidly developing or have large bodies of literature, having a detailed overview of all relevant literature saves a lot of time.
  • Michael Stolz, M.D. (Surgery Resident at Northeast Georgia Medical Center): Increasing the efficiency of time spent looking for answers will allow me to spend more time caring for our patients using the latest and most relevant data available.
  • Lucas Buyon (PhD candidate at Harvard): The hope is that AI-powered approaches will dramatically reduce the time spent writing literature reviews. Furthermore, as the rate of scientific publication continues to increase, AI approaches may allow the publication frequency of subject reviews to better keep pace with the ever-growing rate of scientific output.
  • Jan Bremer (Medical student at University Medical Center Hamburg-Eppendorf): This AI-based literature review will lay the foundation for producing more transparent and reproducible results not subject to human biases in the future, which is important across scientific disciplines. In the COVID-19 pandemic, this tool will facilitate research in the medical community through fast and comprehensive management of the present flood of data.
  • Justin Zaremba (Medical student at EVMS): Searching for answers to scientific questions can be incredibly time consuming, and effort is often wasted analyzing sources that are ultimately irrelevant. Having a tool where I can type in a question and immediately be provided with a list of sources along with the relevant supporting data would not only simplify answering questions that may arise during my education, it would also help me make informed decisions as a future physician.

Next steps

TW: Artificial intelligence has continued to evolve in its applications for healthcare. This will mark the first time AI has been implemented to glean insights from academic journals to combat the COVID-19 pandemic. It will be a cornerstone for others to build upon and to further discoveries at a rate not previously possible.

Many institutions have literature reviews on their websites, but keeping them up to date with the pace at which information is being published is unsustainable. We are happy to partner with these institutions and use our AI tool to alleviate part of this burden.  

You can explore the COVID-19 staging area here. If you’re interested in participating in Kaggle’s utility study or our validation study in the near future, please reach out to Dr. Waseem.

All our interviews reflect the views and opinions of the interviewees.

Springer Nature is committed to supporting the research community. Visit our COVID-19 hub for the latest research, data, and resources for the research community.

Author: Lucy Frisch

Lucy Frisch is a Senior Marketing Manager leading the Content Marketing Programmes team, based in the New York office. She has a passion for storytelling and works to humanize the research published across Springer Nature with a focus on the researcher experience.
