Helium @ CL-SciSumm-19: Transfer learning for effective scientific research comprehension

Automatic research paper summarization has garnered significant interest in the research community in recent years. In this paper, we describe team Helium’s system for the CL-SciSumm shared task co-located with SIGIR 2019. We attempt the first task, which targets building an improved-recall system for identifying reference text spans from a given citing research paper (Task 1A) and constructing better models for comprehension of scientific facets (Task 1B). Our architecture incorporates transfer learning by utilising a combination of pretrained embeddings, which are subsequently used to build models for the given tasks. For Task 1A, we locate the text spans referred to by the citation text by creating paired text representations and employing pretrained embedding mechanisms in conjunction with XGBoost, a gradient boosted decision tree algorithm, to identify textual entailment. For Task 1B, we use the same pretrained embeddings with the RAKEL algorithm for multi-label classification. Our goal is to enable better scientific research comprehension, and we believe that an approach involving transfer learning can add value to the research community working on these tasks.
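The two modelling ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the paired text representations are built, so the concatenation of both vectors with their absolute difference and element-wise product is an assumption, as are the helper names `pair_features`, `rakel_partition` and `powerset_class`, the toy three-dimensional vectors, and the facet names. The label-set step follows the disjoint variant of RAKEL (random k-labelsets), in which each labelset becomes its own multi-class problem. Such feature vectors could then be passed to a classifier such as XGBoost, which is left out here to keep the sketch dependency-free.

```python
import random


def pair_features(u, v):
    """Combine two equal-length embedding vectors into one feature vector.

    Assumed scheme (not stated in the abstract): [u, v, |u - v|, u * v].
    """
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + diff + prod


def rakel_partition(labels, k, seed=0):
    """Split the label set into disjoint random subsets of at most k labels."""
    if k < 1:
        raise ValueError("k must be at least 1")
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + k] for i in range(0, len(shuffled), k)]


def powerset_class(active_labels, labelset):
    """Encode which labels of one labelset are active as a single class.

    Each distinct tuple is one class of the multi-class problem that a
    base classifier would be trained on for this labelset.
    """
    return tuple(int(label in active_labels) for label in labelset)


# Illustrative facet names and toy embeddings (assumptions, not source data).
FACETS = ["aim", "hypothesis", "implication", "method", "results"]
citation_vec = [0.5, -1.0, 2.0]
reference_vec = [0.5, 1.0, 1.0]

features = pair_features(citation_vec, reference_vec)
labelsets = rakel_partition(FACETS, k=2)
targets = [powerset_class({"method", "results"}, ls) for ls in labelsets]
```

Here `features` is the input row for one citation and candidate-span pair in a Task 1A style classifier, while `targets` holds one class per labelset for a citation annotated with two facets, as in a Task 1B style setup.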




Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2019) at the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)

Publication date: July 25, 2019

Bakhtiyar Syed, Vijayasaradhi Indurthi, Balaji Vasan Srinivasan, Vasudeva Varma

Research Areas: AI & Machine Learning, Content Intelligence, Natural Language Processing