We present an empirical argument in favor of a cascaded architecture for neural text summarization. Summarization practices vary widely, but few, other than summarizing news articles, offer training data in the quantity required by end-to-end systems, which perform content selection and surface realization jointly to generate abstracts. Furthermore, such systems pose a challenge to summarization evaluation, as they force content selection to be evaluated together with text generation, yet evaluation of the latter remains an unsolved problem. In this paper, we present empirical evidence showing that a cascaded pipeline, which separately identifies important content pieces and stitches them together into a coherent text, performs comparably to or better than end-to-end systems, while also allowing for flexible content selection. Finally, we discuss how to take advantage of a cascaded pipeline in neural text summarization and shed light on important directions for future research.