Abstract:
Nowadays, connected physical machines manage vast and diverse amounts
of data, often referred to as Big Data. This data originates from numerous heterogeneous sources and serves various purposes, including decision-making,
medical treatment support, diagnosis, and fast, relevant data access. Its volume and heterogeneity present a significant challenge for companies,
as they grapple with issues related to data storage, analysis, processing, and,
most notably, data integration. For this reason, companies need new tools and
techniques, such as ontologies for data integration and interoperability, to cope with integration difficulties. An ontology is formally defined
as an explicit specification of a shared conceptual understanding that can be interpreted by both humans and machines.
This master's thesis surveys the most important approaches to data integration
and proposes a new methodology that integrates multiple data sources using
ontologies and machine learning to facilitate and enhance data comprehension.