Size | Format |
---|---|
2.46 MB | Adobe PDF |
Abstract(s)
The need to store, process and analyse data is an increasingly present reality in companies whose business decisions depend heavily on digital platforms. The Data Warehouse concept was introduced to facilitate and improve the collection of essential business indicators.
The concept of Big Data emerged as the variety and volume of data available for analysis grew, and technologies were developed to meet the challenges this imposed. The digital transformation of entry and exit records in public transport produces large volumes of data that can be used for business analysis in the field [1].
This project aims to gather a set of Big Data technologies and to evaluate their storage capacity, the way their ETL processes are built, and their performance in answering a set of queries as the volume of mobility data grows. The data refer to entries on the buses of the public transport company Horários do Funchal.
The project includes a literature review of the concepts of Data Warehouse, dimensional models and Big Data, as well as of existing technologies and related work on Big Data manipulation. The application of these technologies in public transport was also examined as part of the state-of-the-art analysis.
The results show that some platforms adapt well to an increase in data volume, performing well both in the execution of ETL processes and in the execution of queries, compared with other technologies whose results are impractical for this type of study.
Keywords
Data warehouse; Big data; Dimensional model; Apache Hadoop; Apache Hive; Apache Spark; Presto; SQL Server; ETL; Docker; Public transport; AFC
Engenharia Informática (Computer Engineering). Faculdade de Ciências Exatas e da Engenharia