Davood Rafiei, PhD
Personal Website: https://drafiei.github.io/
Contact
Professor, Faculty of Science - Computing Science
- drafiei@ualberta.ca
Overview
Area of Study / Keywords
Database systems, Large language models, NLP, IR
About
Education
- B.Sc., Computer Engineering, Sharif University of Technology (Tehran), 1990
- M.Sc., Computer Science, University of Waterloo, 1995
- Ph.D., Computer Science, University of Toronto, 1999
Bio
Davood Rafiei is a Professor of Computing Science at the University of Alberta and a member of the Database Systems Research Group. He received his B.Sc. from Sharif University of Technology, his M.Math. from the University of Waterloo, and his Ph.D. from the University of Toronto. He joined the University of Alberta in 2000.
His research spans data management, artificial intelligence, and the Web, with a focus on building scalable systems for querying, integrating, and analyzing large and heterogeneous data collections.
Davood regularly serves on the program committees of leading conferences in databases (SIGMOD, VLDB, ICDE), artificial intelligence and natural language processing (NeurIPS, ACL, COLM), and Web and information retrieval (The Web Conference, SIGIR, CIKM). He has also held a variety of editorial and organizational roles within these research communities.
He was a Visiting Scientist at Google in Mountain View (2007–2008), a Visiting Professor at Kyoto University (2014), and a Visiting Professor at Université Paris Descartes (2015). He is a co-author of two books on natural language interfaces to databases.
Research
Research Interests
- Natural language interfaces to databases
- Data integration and data quality
- Large language models for structured data reasoning
- Information retrieval and question answering
- Web data management and analysis
- Entity resolution and knowledge discovery
Research Summary
I lead the Data Analytics and Language Intelligence (DALI) Lab at the University of Alberta. Our research lies at the intersection of data management, artificial intelligence, and the Web, with the goal of making large and complex data sources easier to access, integrate, and analyze.
A central focus of our work is the development of natural language interfaces to data, enabling users to query and interact with structured and semi-structured data using natural language. We also study data integration, including schema matching, entity resolution, and the construction of unified views over heterogeneous data sources. More recently, our research has explored how large language models can improve reasoning, automation, and user interaction in these domains.
Beyond these core areas, we investigate scalable techniques for searching, querying, and analyzing large collections of structured, semi-structured, and Web data. Our work spans topics such as information retrieval, question answering, Web tables, and knowledge discovery, with an emphasis on systems that combine theoretical foundations with practical impact.
Selected Projects
- Natural Language Interfaces to Databases (Text-to-SQL and Data-to-Text)
- Data Integration and Schema Matching
- Large Language Models for Structured Data Reasoning
- Web Tables and Open Data
- Entity Resolution
- Question Answering and Information Retrieval
Courses
CMPUT 291 - Introduction to File and Database Management
Basic concepts in computer data organization and information processing; entity-relationship model; relational model; SQL and other relational query languages; storage architecture; physical organization of data; access methods for relational data. Programming experience (e.g., Python) is required for the course project. Prerequisites: CMPUT 175 or 274, and 272. Corequisite: one of CMPUT 201 or 275. Credit may be obtained in only one of CMPUT 291, BTM 415, or MIS 415.
Featured Publications
Confidence estimation for text-to-SQL in large language models
Maleki, Sepideh Entezari, Pourreza, Mohammadreza, Rafiei, Davood
Proceedings of the AAAI Conference on Artificial Intelligence. 2026 January
TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations
Dargahi Nobari, Arash, Rafiei, Davood
VLDB. 2025 September
KidLM: Advancing language models for children - early insights and future directions
Nayeem, Mir Tafseer, Rafiei, Davood
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024 January
NormTab: Improving symbolic reasoning in LLMs through tabular data normalization
Nahid, Md Mahadi Hasan, Rafiei, Davood
Findings of the Association for Computational Linguistics: EMNLP 2024. 2024 January
TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
Nahid, Md Mahadi Hasan, Rafiei, Davood
NAACL. 2024 January
DTT: An example-driven tabular transformer for joinability by leveraging large language models
Dargahi Nobari, Arash, Rafiei, Davood
Proceedings of the ACM on Management of Data. 2024 January
DIN-SQL: Decomposed in-context learning of text-to-SQL with self-correction
Pourreza, Mohammadreza, Rafiei, Davood
Advances in Neural Information Processing Systems. 2023 January