Data integration technologies  

Large volumes of data are produced constantly, and processing them is a complex task from both a theoretical and a technical perspective. The potential benefits of data increase when it is integrated from heterogeneous sources and processed in near real-time, minimizing the latency of useful information and knowledge. Stream processing, context processing and system adaptation technologies are used for this purpose. This course covers data integration technologies, with an emphasis on data stream processing and integration technologies such as Apache Spark and Apache Kafka. These are viewed in the context of the data life-cycle, which includes data integration, processing and interpretation, and the use of the acquired information for adapting systems in near real-time. In data integration, the logical integration process and the infrastructure solutions play equally important roles, since data is integrated in a distributed, horizontally scalable environment. Near real-time stream integration and system adaptation use cases based on Apache Spark, Apache Kafka, Apache Cassandra, Docker and CloudStack are covered as part of this course.

Outcomes:
- Ability to choose the most suitable data stream integration technology (exam)
- Ability to define a data integration solution at the logical level (exam and practical assignment)
- Ability to integrate data streams (practical assignment)
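To illustrate the kind of near real-time stream integration the description refers to, the following is a minimal PySpark sketch, not taken from the course materials: it reads a Kafka topic with Spark Structured Streaming and computes a windowed aggregate. The broker address, the topic name "sensor-events" and the message schema are assumptions made only for this example.

# Illustrative sketch: Kafka -> Spark Structured Streaming -> windowed aggregate.
# The topic name, broker address and schema below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = (SparkSession.builder
         .appName("stream-integration-sketch")
         .getOrCreate())

# Assumed JSON payload of each Kafka message.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Subscribe to a Kafka topic; Kafka delivers key and value as binary columns.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "sensor-events")
       .load())

# Parse the JSON value and compute per-device averages over 1-minute windows,
# tolerating up to 2 minutes of event-time lateness.
events = (raw
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))
averages = (events
            .withWatermark("event_time", "2 minutes")
            .groupBy(window(col("event_time"), "1 minute"), col("device_id"))
            .avg("value"))

# Emit updated aggregates to the console in near real-time.
query = (averages.writeStream
         .outputMode("update")
         .format("console")
         .start())
query.awaitTermination()

Running such a sketch requires the Spark Kafka connector (the spark-sql-kafka package matching the Spark version) on the classpath; in practice the sink would be a store such as Apache Cassandra rather than the console.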
Presential
English

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or HaDEA. Neither the European Union nor the granting authority can be held responsible for them. The statements made herein do not necessarily have the consent or agreement of the ASTRAIOS Consortium. These represent the opinion and findings of the author(s).