Longxu Dou 窦隆绪

Hi, I am a Ph.D student at Harbin Institute of Technology, where I am a member of the Language Analysis Group of the HIT-SCIR Lab, under the supervision of Prof. Wanxiang Che.

Currently, I am a visiting student at the NUS-WING Lab, advised by Prof. Min-Yen Kan. Previously, I completed two wonderful internships at Microsoft Research Asia, working with Yan Gao, Jian-Guang Lou, Jinpeng Wang and Chin-Yew Lin.

Email  /  Google Scholar  /  LinkedIn  /  Github

Research

Currently, I am working on text-to-SQL semantic parsing, which could greatly facilitate the interaction between data analysts and databases. To summarize, our work advances the research by proposing (1) a unified text-to-SQL parser for better task generalization; (2) a multilingual text-to-SQL parser for realistic globalization requirements; and (3) a knowledgeable text-to-SQL parser for assisting domain experts.

Besides that, I am also very interested in (domain-)knowledge-intensive NLP, including (1) how to acquire domain-specific knowledge efficiently (user interaction & data mining) and (2) how to equip NLP models with domain knowledge effectively (implicit & explicit).

From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning
Qian Liu, Fan Zhou, Zhengbao Jiang, Longxu Dou, Min Lin
Preprint, 2023
MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
Bohan Li, Longxu Dou, Yutai Hou, Yunlong Feng, Honglin Mu, Wanxiang Che
Preprint, 2023
Controllable Data Augmentation for Context-Dependent Text-to-SQL
Dingzirui Wang, Longxu Dou, Wanxiang Che
Preprint, 2023
A Survey on Table-and-Text Hybrid QA: Definitions, Methods, Challenges and Future Directions
Dingzirui Wang, Longxu Dou, Wanxiang Che
Preprint, 2023
paper
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou
AAAI, 2023
paper / poster / slides / video
KnowSQL: Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Min-Yen Kan, Dechen Zhan, Jian-Guang Lou
EMNLP, 2022
paper / poster / slides / demo
UniSAr: A Unified Structure-Aware Autoregressive Language Model for Text-to-SQL Semantic Parsing
Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou
IJMLC Journal, 2022
paper / code
Transition-based Parser and Iterative Inference Parser
Longxu Dou, Yunlong Feng, Yuqiu Ji, Wanxiang Che, Ting Liu
CoNLL, 2020
paper / slides
A Unified Pipeline for Meaning Representation Parsing via Effective Encoding and Efficient Training
Wanxiang Che, Longxu Dou, Yang Xu, Yuxuan Wang, Yijia Liu, and Ting Liu
CoNLL, 2019
paper / poster / slides / supplement / code
Data2Text Studio: Automated Text Generation from Structured Data
Longxu Dou, Guanghui Qin, Jinpeng Wang, Jin-Ge Yao, and Chin-Yew Lin
EMNLP, 2018
paper / poster
Competitions

  • First Prize in CCIR Cup: Tabular and Textual Financial QA Challenge, 2022
  • Third Prize in WAIC Financial AntSQL Text-to-SQL Challenge, 2022
  • First Prize in SGCC Text-to-SQL AI Challenge Competition, 2021
  • First Prize in HUAWEI Cloud Text-to-SQL Competition, 2020
  • First Prize in CoNLL-2019 Meaning Representation Parsing Shared Task, 2019
Awards

  • Tencent Scholarship, 2022
  • First-Class Fresh-PhD Fellowship, 2018
  • National Scholarship, 2018
  • Outstanding Graduate of Harbin Institute of Technology, 2018
  • Stars of Tomorrow Internship Award of Microsoft Research Asia, 2018 & 2022
Education

Harbin Institute of Technology, Harbin, China

  • Ph.D in Computer Science, 2018.09 ~ 2023.10 (Expected)
  • B.E in Computer Science, 2014.09 ~ 2018.06

National University of Singapore, Singapore

  • Visiting Student, 2021.03 ~ 2023.09

Industry Research Experience

Microsoft Research Asia, Beijing, China

  • Research Intern at Data, Knowledge and Intelligence group, 2021.03 ~ 2022.09
  • Research Intern at Knowledge Computing group, 2017.07 ~ 2018.07

Design and source code from Jon Barron.