Jiuxiang Gu is a Research Scientist in the Document Intelligence Lab (DIL) at Adobe Research, based at the Maryland site. He received his Ph.D. from Nanyang Technological University (NTU), Singapore, where he focused on combining deep learning and reasoning for computer vision tasks, including image captioning, cross-modal retrieval, scene graph generation, and visual question answering.

His research interests lie at the intersection of computer vision and natural language processing. His contact email is jigu@adobe.com, and more information is available on his personal website.

Publications

Customization Assistant for Text-to-image Generation

Zhou, Y., Zhang, R., Gu, J., Sun, T. (Jun. 17, 2024)

CVPR 2024

TRINS: Towards Multimodal Language Models that Can Read

Zhang, R., Zhang, Y., Chen, J., Zhou, Y., Gu, J., Chen, C., Sun, T. (Jun. 17, 2024)

CVPR 2024

DocScript: New Task, Dataset, and Models for Document-level Script Event Prediction

Mathur, P., Morariu, V., Garimella, A., Dernoncourt, F., Gu, J., Sawhney, R., Nakov, P., Manocha, D., Jain, R. (May 25, 2024)

LREC-COLING 2024

DocEdit: Language-guided Document Editing

Mathur, P., Jain, R., Gu, J., Dernoncourt, F., Manocha, D., Morariu, V. (Feb. 14, 2023)

AAAI 2023

LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents

Mathur, P., Jain, R., Mehra, A., Gu, J., Dernoncourt, F., Natarajan, A., Tran, Q., Kaynig-Fittkau, V., Nenkova, A., Manocha, D., Morariu, V. (Jan. 6, 2023)

WACV 2023

User-Entity Differential Privacy in Learning Natural Language Models

Lai, P., Phan, N., Sun, T., Jain, R., Dernoncourt, F., Gu, J., Barmpalios, N. (Dec. 20, 2022)

2022 IEEE International Conference on Big Data

EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval

Ma, H., Zhao, H., Lin, Z., Kale, A., Wang, Z., Yu, T., Gu, J., Choudhary, S., Xie, X. (Sep. 27, 2022)

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

DocLayoutTTS: Dataset and Baselines for Layout-informed Document-level Neural Speech Synthesis

Mathur, P., Dernoncourt, F., Tran, Q., Gu, J., Nenkova, A., Morariu, V., Jain, R., Manocha, D. (Sep. 22, 2022)

Interspeech 2022

DocTime: A Document-level Temporal Dependency Graph Parser

Mathur, P., Morariu, V., Kaynig-Fittkau, V., Gu, J., Dernoncourt, F., Tran, Q., Nenkova, A., Manocha, D., Jain, R. (Jul. 15, 2022)

NAACL 2022

Multi-Scale Aligned Distillation for Low-Resolution Detection

Qi, L., Kuen, J., Gu, J., Lin, Z., Wang, Y., Chen, Y., Li, Y., Jia, J. (Jun. 21, 2021)

Conference on Computer Vision and Pattern Recognition (CVPR'21)

Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU models

Du, M., Manjunatha, V., Jain, R., Deshpande, R., Dernoncourt, F., Gu, J., Sun, T., Hu, X. (Jun. 11, 2021)

NAACL 2021