Fast Data Processing in Hyper-Scale Systems
Thesis posted on 16.11.2015, 15:53, authored by Marcel Tilly
The deluge of intelligent objects providing continuous access to data and services, together with the demand from developers and consumers to handle these data, requires us to rethink communication paradigms and middleware. Based on requirements collected from scenarios in the connected car, social networks, and the factory of the future, this thesis develops new concepts for fast data processing in hyper-scale systems. In hyper-scale systems, such as the Internet of Things, one emerging requirement is to process, procure, and provide information with almost zero latency. This thesis introduces new concepts for a middleware that enables fast communication by limiting information flow through filtering based on event policy obligations, combined with data processing techniques adopted from complex event processing. Fast data processing has to deal with continuous streams of events and provide a set of operators to manipulate, aggregate, and correlate data. This processing logic needs to be distributed. Distribution helps the system scale with the number of data sources (e.g. phones, cars, sensors) and allows processing to be parallelised by grouping and partitioning (e.g. by region). In our solution, event policies are injected as close as possible to where the data originates in order to reduce traffic. Filters, aggregations, and rules then process the data accordingly. Finally, communication paradigms, or interaction patterns, support mediation between classical service-based request-response interaction and event-based data exchange. Together, these elements form a middleware that enables fast data processing for hyper-scale systems.
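To make the idea of filtering at the source and aggregating by partition more concrete, the following is a minimal sketch in Python. It is not taken from the thesis: the `Event` type, the `apply_policy` and `average_by_region` functions, the sample readings, and the threshold of 100 are all hypothetical names and values chosen for illustration, and a policy is modelled here simply as a predicate over events.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Event:
    source: str   # hypothetical id of the emitting device, e.g. a car
    region: str   # partition key used for grouping
    value: float  # hypothetical measurement, e.g. a speed reading


# Assumption: a policy is modelled as a predicate evaluated at the data source.
Policy = Callable[[Event], bool]


def apply_policy(events: Iterable[Event], policy: Policy) -> Iterator[Event]:
    """Forward only the events the policy allows, so less data leaves the source."""
    return (e for e in events if policy(e))


def average_by_region(events: Iterable[Event]) -> dict[str, float]:
    """Aggregate forwarded events per region (the partition key)."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for e in events:
        totals[e.region] += e.value
        counts[e.region] += 1
    return {region: totals[region] / counts[region] for region in totals}


readings = [
    Event("car-1", "north", 130.0),
    Event("car-2", "north", 80.0),
    Event("car-3", "south", 150.0),
    Event("car-4", "north", 140.0),
]

# Hypothetical policy: only readings above 100 are forwarded.
forwarded = list(apply_policy(readings, lambda e: e.value > 100.0))
averages = average_by_region(forwarded)
```

In this sketch the filter runs before any aggregation, so only three of the four readings would travel onward, and each region could in principle be aggregated by a separate worker because the regions share no state.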