Sequential short-text classification with neural networks

MIT PhD Thesis 2017

Publication date: May 30, 2017

Franck Dernoncourt

Medical practice too often fails to incorporate recent medical advances. The two main reasons are that over 25 million scholarly medical articles have been published, and medical practitioners do not have the time to perform literature reviews. Systematic reviews aim at summarizing published medical evidence, but writing them requires tremendous human efforts. In this thesis, we propose several natural language processing methods based on artificial neural networks to facilitate the completion of systematic reviews. In particular, we focus on short-text classification, to help authors of systematic reviews locate the desired information. We introduce several algorithms to perform sequential short-text classification, which outperform state-of-the-art algorithms. To facilitate the choice of hyperparameters, we present a method based on Gaussian processes. Lastly, we release PubMed 20k RCT, a new dataset for sequential sentence classification in randomized control trial abstracts.

Learn More