I am a Professor in the Department of Computing Science at the University of Alberta. I earned a PhD from the University of Toronto in 2005 working on Web data management, and since then have worked on databases, the Web, information retrieval, and natural language processing, with emphasis on information extraction from semi-structured and unstructured sources. I was a principal investigator and the Leader of the Data Quality Theme of the NSERC Business Intelligence Network.
I am a co-PI of LINCS (Linked Open Data for Canadian Cultural Research), a CFI-funded consortium building open knowledge graphs to mobilize cultural and heritage artifacts, and a lead researcher with the Scotiabank Artificial Intelligence Research Initiative at the University of Alberta. I was a co-PI and the Leader of the Data Quality Theme of the NSERC Business Intelligence Network, and I have supervised 10 PhD students, 17 MSc students, and 3 post-doctoral fellows, who now work as faculty members or as researchers and engineers at tech companies such as Diffbot, Amazon, Intuit, and Borealis AI.
I serve on the Executive Committee of the Canadian Artificial Intelligence Association (CAIAC) and am an Associate Editor of Springer's Distributed and Parallel Databases. I have served as Associate Editor of the IEEE Transactions on Knowledge and Data Engineering (2015-2018), Elsevier's Computational Intelligence Journal (2015-2018), and the SIGMOD Record (2010-2014). I was the ACM SIGMOD Information Director and Web Editor of the Record from 2006 to 2012. I have served on the Program Committee of all major conferences on data management, the Web, and natural language processing, on multiple occasions, and I have co-chaired the Program Committee of the 3rd IEEE International Conference on Data Science and Advanced Analytics, the 28th Canadian Conference on Artificial Intelligence, the 1st and the 2nd ACM SIGMOD Workshops on Databases and Social Networks, the 3rd International Workshop on Data Engineering Meets the Semantic Web, and the 5th International XML Database Symposium (co-located with VLDB 2007).
I am the recipient of an Alberta Ingenuity New Faculty Award, an IBM Faculty Award, and the Best Paper Award at the 2010 IEEE Conference on Data Engineering, and I supervised the recipients of the Best Undergraduate Poster Award at the 2012 ACM SIGMOD Conference. I was a Visiting Scientist at the Max-Planck Institute for Informatics, Germany, from July 2014 to April 2015, and a Visiting Professor (BIT) at the Free University of Bozen-Bolzano, Italy, during the summer of 2008.
My areas of research are knowledge extraction, data management, information retrieval, and natural language processing. I also work on machine learning methods applied to these fields. I have supervised graduate-level research on the problems of named entity recognition, entity typing, and disambiguation; open relation extraction from text; understanding social processes in Wikipedia article authoring; mining citation networks; and semistructured data management.
I am passionate about open linked data and the Semantic Web, and I work with Digital Humanities colleagues on creating, indexing, and processing knowledge graphs out of heritage, cultural, scholarly, and literary work.
Introduction to information retrieval, focusing on algorithms and data structures for organizing and searching through large collections of documents, and on techniques for evaluating the quality of search results. Topics include Boolean retrieval, keyword and phrase queries, ranking, index optimization, practical machine-learning algorithms for text, and optimizations used by Web search engines. Prerequisites: CMPUT 201 and CMPUT 204, or 275. One of MATH 102, 125, or 127 is strongly recommended.
This course provides information and resources on teaching and research methods in computing science, and also gives an overview of the research done by faculty in the department. Ethics and professional development are included in this course. Required for all graduate students.
A major essay on an agreed topic.