Doctoral qualifying exam defense in Informatics takes place this Tuesday, the 10th.

The Coordination of the Graduate Program in Informatics (PPGI/Ufam) invites the academic community to the public session of the doctoral qualifying exam defense titled “Generation and ranking of candidate networks of relations for keyword search over relational databases”, by doctoral candidate Pericles da Silva Oliveira. The defense takes place this Tuesday, the 10th, at 9 a.m., in the seminar room of the Instituto de Computação, in the North sector of the campus. The examining committee consists of professors Altigran Soares da Silva, PPGI/Ufam (chair); João Marcos Bastos Cavalcanti, PPGI/Ufam (member); Edleno Silva de Moura, PPGI/Ufam (member); and Marco Antônio Casanova, DI/PUC-Rio (member).

 

ABSTRACT: Several systems proposed for processing keyword queries over relational databases rely on the generation and evaluation of Candidate Networks (CNs), i.e., networks of joined database relations that, when processed as SQL queries, provide a relevant answer to the input keyword query. Although the evaluation of CNs has been extensively addressed in the literature, problems related to efficiently generating and handling CNs have received much less attention. Generating useful CNs requires automatically locating, given a handful of keywords, relations in the database that may contain relevant pieces of information, and determining suitable ways of joining these relations to satisfy the implicit information need expressed by a user when formulating her query. In this proposal, we present two contributions related to the processing of Candidate Networks. As our first contribution, we present a novel approach for generating Candidate Networks based on the concept of Steiner Trees. This concept, although commonplace in other problems related to keyword queries over relational databases, has not been considered before for the CN generation problem. We show that it allows the generation of a compact set of CNs that leads to superior quality answers and that demands fewer resources in terms of processing time and memory used. As our second contribution, we initially argue that the number of possible Candidate Networks that can be generated by any algorithm is usually very high, but that, in fact, only very few of them produce answers relevant to the user and are indeed worth processing. Thus, there is no point in wasting resources processing useless CNs. Then, based on such an argument, we present an algorithm for ranking CNs based on their probability of producing relevant answers to the user. This relevance is estimated based on the current state of the underlying database using a probabilistic Bayesian model we have developed. By doing so, we are able to discard a large number of CNs, ultimately leading to better results in terms of quality and performance. Our claims and proposals are supported by a comprehensive set of experiments we carried out using several query sets and datasets used in previous related work, and whose results we report and analyse here.

 
