| Name | Description | Size | Format |
| --- | --- | --- | --- |
|  |  | 7.13 MB | Adobe PDF |
Authors
Abstract(s)
Despite the advances in the field of reverse image search, with increasingly robust and effective algorithms, there is still interest in refining search techniques so as to improve the user experience when searching for the images the user has in mind.

The main goal of this work was to develop an application for mobile devices (smartphones) that allows the user to find images through multimodal input. Thus, in addition to proposing searches through several modes (keywords, drawing, and images from the camera or already on the device), this dissertation proposes that the user be able to create an image from scratch by drawing, or edit/modify an existing image, receiving immediate feedback on each change/interaction. Throughout the search experience, the user can take the retrieved images they find relevant and progressively refine the search by editing them, converging on what they expect to find.

The implementation of this proposal was based on Google's Cloud Vision API, responsible for obtaining results from image input, the Google Custom Search API for retrieving images from text input, and the ATsketchkit framework, which enabled drawing, on Apple's iOS system.

Tests were carried out with a set of users with different levels of experience in image search and in drawing ability, making it possible to assess their preference among the different input methods, their satisfaction with the results obtained, and the usability of the prototype.
Despite the evolution in the field of reverse image search, with algorithms becoming more robust and effective, there is still interest in improving search techniques and the user experience when searching for the images the user has in mind. The main goal of this work was to develop an application for mobile devices (smartphones) that would allow the user to find images through multimodal inputs. Thus, in addition to proposing image search through different modes (keywords, drawing/sketching, and camera or device images), this dissertation proposes that users can create an image themselves by drawing, or edit/change an existing image, with feedback at the moment of each change/interaction. Throughout the search experience, users can take the retrieved images they find relevant and refine the search by editing them, converging on what they expect to find. The implementation of this proposal was based on the Google Cloud Vision API, responsible for obtaining the results, and the ATsketchkit framework, which allowed the creation of drawings, on Apple's iOS system. Tests were carried out with a set of users with different levels of experience in image search and different drawing abilities, allowing us to assess preference among the input methods, satisfaction with the images retrieved, and the usability of the prototype.
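For context, the image-input step described above can be reached through the Cloud Vision API's public REST endpoint with the WEB_DETECTION feature, which returns visually similar images for a query image. The sketch below, in Swift for iOS, is a minimal illustration under that assumption; the function name, API-key handling, and completion handling are hypothetical and not the dissertation's actual code.

```swift
import Foundation

// Minimal sketch: send an image to the Cloud Vision API annotate endpoint with
// the WEB_DETECTION feature, which returns (among other fields) visually
// similar images that a prototype could display as search results.
func searchSimilarImages(imageData: Data,
                         apiKey: String,
                         completion: @escaping (Result<Data, Error>) -> Void) {
    guard let url = URL(string: "https://vision.googleapis.com/v1/images:annotate?key=\(apiKey)") else {
        return
    }
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    // Request body: the base64-encoded query image and the web-detection feature.
    let body: [String: Any] = [
        "requests": [[
            "image": ["content": imageData.base64EncodedString()],
            "features": [["type": "WEB_DETECTION", "maxResults": 10]]
        ]]
    ]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)

    URLSession.shared.dataTask(with: request) { data, _, error in
        if let error = error {
            completion(.failure(error))
        } else if let data = data {
            // Raw JSON response; the webDetection.visuallySimilarImages entries
            // are the candidates a client would render in the results view.
            completion(.success(data))
        }
    }.resume()
}
```

The same request pattern applies whether the query image comes from the camera, the device library, or a sketch rasterized from the drawing canvas.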
Description
Keywords
Multimodal search; Reverse image search; Computer vision; Content-based image retrieval; Informatics Engineering; Faculdade de Ciências Exatas e da Engenharia