Greg Luck, Hazelcast
Palo Alto, California - Hazelcast, the open source in-memory data grid (IMDG), has announced the 0.4 release of Hazelcast Jet - an application-embeddable, distributed processing engine for big data stream and batch processing.
Major new functionality in Hazelcast Jet 0.4 includes event-time processing with tumbling, sliding and session windowing. With these new capabilities, users gain a stream processing architecture that provides a flexible mechanism to build and evaluate windows over continuous data streams. Hazelcast Jet is appropriate for applications such as sensor updates in IoT architectures (house thermostats, lighting systems), in-store e-commerce systems and social media platforms.
Stream processing has overtaken batch processing as a preferred method of processing big data sets for companies that require immediate insight into data. However, to get value from a stream, it must be partitioned: a fragment of the stream is taken and analysed. To assign data to windows during processing, each data element in the stream needs to be associated with a timestamp. In Hazelcast Jet 0.4 this is achieved via event-time processing, which uses a logical, data-dependent timestamp embedded in the event itself. However, a major drawback of event-time processing is that events may arrive out of order or late, so you can never be sure that you have seen all events in a given time window.
To alleviate this issue, the latest release of Hazelcast Jet also includes windowing functionality which enables users to evaluate stream processing jobs at regular time intervals, regardless of how many incoming messages the job is processing. Hazelcast Jet offers three types of windows:
● Fixed/tumbling - time is partitioned into same-length, non-overlapping chunks. Each event belongs to exactly one window.
● Sliding - windows have a fixed length, but consecutive windows start a fixed time interval (step) apart, and the step can be smaller than the window length, so windows may overlap. Typically the window length is a multiple of the step.
● Session - windows vary in size and are defined based on the data, which should carry a session identifier.
Additional features of Hazelcast Jet 0.4 include:
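The three window types above can be illustrated with a minimal sketch in plain Java. It does not use the Hazelcast Jet API; the class name WindowSketch, the method names and the maxGap inactivity timeout used to split sessions are all assumptions made for illustration. Timestamps are event-time values in arbitrary units, and the session method is assumed to receive only the events that share one session identifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical helper names, not part of Hazelcast Jet.
class WindowSketch {

    // Tumbling: every timestamp falls into exactly one window
    // [start, start + length). Returns that window's start.
    static long tumblingStart(long timestamp, long length) {
        return Math.floorDiv(timestamp, length) * length;
    }

    // Sliding: windows of the given length begin at every multiple of step.
    // Returns the start of each window [start, start + length) that
    // contains the timestamp, latest window first.
    static List<Long> slidingStarts(long timestamp, long length, long step) {
        List<Long> starts = new ArrayList<>();
        long latest = Math.floorDiv(timestamp, step) * step;
        for (long start = latest; start > timestamp - length; start -= step) {
            starts.add(start);
        }
        return starts;
    }

    // Session: events for one session identifier are grouped until the gap
    // between consecutive event times exceeds maxGap (an assumed timeout).
    // Sorting by event time first copes with out-of-order arrival.
    // Each result entry is {firstTimestamp, lastTimestamp}.
    static List<long[]> sessions(long[] timestamps, long maxGap) {
        long[] sorted = timestamps.clone();
        Arrays.sort(sorted);
        List<long[]> result = new ArrayList<>();
        for (long ts : sorted) {
            long[] current = result.isEmpty() ? null : result.get(result.size() - 1);
            if (current != null && ts - current[1] <= maxGap) {
                current[1] = ts;                     // extend the open session
            } else {
                result.add(new long[] {ts, ts});     // start a new session
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("tumbling start: " + tumblingStart(17, 10));
        System.out.println("sliding starts: " + slidingStarts(17, 10, 5));
        for (long[] s : sessions(new long[] {1, 3, 20, 4, 22}, 5)) {
            System.out.println("session: " + s[0] + " to " + s[1]);
        }
    }
}
```

With a window length of 10 and a step of 5, an event at time 17 belongs to one tumbling window but to two sliding windows, which shows why sliding windows overlap when the step is smaller than the length.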
● Users are now able to use the ICache/Hazelcast integration as a source and sink of data.
● java.util.stream can be used on top of ICache to enable basic data processing.
● Streaming File Connector - improved connector allows users to watch files and directories for changes.
● Hazelcast Jet code samples are now available which can be used as building blocks for Jet applications, providing a gradual learning experience.
In a new latency benchmark study, Hazelcast said Jet outperformed its competitors with a 40ms average latency for stream processing computations, which remained flat as message throughput increased. Flink and Spark's execution latencies were hundreds of milliseconds, rising to seconds at the higher message throughputs.
Greg Luck, CEO of Hazelcast, said: "The Jet project is progressing faster than we could have hoped. The new functionality in 0.4 brings stream processing for the first time. As with batch, we are achieving a new performance level, giving us a real edge over alternative market solutions."