Knowledge Distillation and Scientific Paper Production using Artificial Intelligence

NU 2020-227

INVENTORS
Dashun Wang*
Nima Dehmamy
Lu Liu
Woo Seong Jo

ABSTRACT
Given the vast number of research articles across the sciences and humanities, efficient retrieval and condensation of relevant information is crucial to our ability to use humanity's knowledge. While search engines and data mining allow us to find candidate articles or publications related to a query, casting the collected information in a coherent form, as humans do in presentations, review articles, or textbooks, has not yet been fully achieved. Here, Northwestern researchers showcase a first attempt at a pipeline for creating review articles, which combines science-of-science methods with a transformer-based seq2seq architecture to produce a complete review article. They assess the quality of each step of the pipeline and discuss challenges and future steps to improve the final output. The overall result is a proof of concept toward AI capable of coherently summarizing multiple textual sources, which can aid scientific writing. This could greatly reduce the burden of writing scientific articles and condensing knowledge, thereby accelerating the advancement of science.

APPLICATIONS  

  • Scientific/R&D fields: citation recommendation
  • Government/Politics/Policy-making

ADVANTAGES  

  • Citation recommendation based on several key signals, including co-citations and bibliometric features.
  • Organization of review papers using various natural language processing (NLP) tasks, which outperforms the recurrent neural networks (RNNs) used in existing methods.
  • Scientific paper summarization that builds on abstractive frameworks from news summarization, with models fine-tuned on scientific publications and their citation contexts. Training uses a curated dataset drawn from MAG and a methodology designed to teach the model domain differences.
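
As a rough illustration of the co-citation signal mentioned in the first advantage, the sketch below counts how often two papers appear in the same reference list and ranks candidates by how often they are co-cited with a set of seed papers. It is a minimal toy example, not the inventors' method: the function names (`cocitation_counts`, `recommend`), the `top_k` parameter, and the paper identifiers are all hypothetical, and a real system would presumably combine this signal with the other bibliometric features.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each unordered pair of papers shares a reference list."""
    counts = Counter()
    for refs in reference_lists:
        # De-duplicate and sort so each pair has one canonical (a, b) order.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(seeds, reference_lists, top_k=3):
    """Rank non-seed papers by total co-citation count with the seed papers."""
    seeds = set(seeds)
    scores = Counter()
    for (a, b), n in cocitation_counts(reference_lists).items():
        if a in seeds and b not in seeds:
            scores[b] += n
        elif b in seeds and a not in seeds:
            scores[a] += n
    # Highest score first; ties broken by identifier for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [paper for paper, _ in ranked[:top_k]]


# Hypothetical toy data: each inner list is one citing paper's references.
toy_reference_lists = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C"],
    ["A", "D", "E"],
]

print(recommend(["A"], toy_reference_lists))
```

Raw co-citation counts favor heavily cited papers, so a practical ranking would likely normalize the scores before combining them with other features.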

IP STATUS
A provisional application has been filed.
