19 Apr 2016
What is Apache Flink? It is a community-driven, open source framework for distributed big data analytics, like Hadoop and Spark. It aims to bridge the gap between MapReduce-like systems and shared-nothing parallel database systems.
Flink is built around a stream model, which it also applies to batch and SQL processing jobs. It includes libraries for complex event processing (essentially, a pattern detection system for streams), machine learning, and graph processing. Flink aims for more efficient in-memory processing than Spark: it has its own memory management system that reduces the amount of garbage collection performed by the JVM.
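To make the garbage collection point more concrete, here is a minimal plain-Java sketch of the general idea: keep records as bytes in a buffer that is allocated once, instead of creating one heap object per record. This is not Flink's implementation or API; the class `ManagedSegmentSketch`, its methods, and the fixed record layout are all hypothetical and chosen only for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: NOT Flink code or Flink's API.
final class ManagedSegmentSketch {
    // One record = 4-byte int key + 8-byte long value, stored back to back.
    private static final int RECORD_BYTES = Integer.BYTES + Long.BYTES;

    private final ByteBuffer segment; // allocated once, reused for every record
    private int count;                // number of records written so far

    public ManagedSegmentSketch(int capacityRecords) {
        this.segment = ByteBuffer.allocate(capacityRecords * RECORD_BYTES);
    }

    /** Writes a record into the buffer; returns false when the segment is full. */
    public boolean append(int key, long value) {
        if (segment.remaining() < RECORD_BYTES) {
            return false;
        }
        segment.putInt(key).putLong(value);
        count++;
        return true;
    }

    /** Sums the values for one key by reading bytes in place, creating no record objects. */
    public long sumForKey(int key) {
        long sum = 0L;
        for (int i = 0; i < count; i++) {
            int offset = i * RECORD_BYTES;
            if (segment.getInt(offset) == key) {
                sum += segment.getLong(offset + Integer.BYTES);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        ManagedSegmentSketch records = new ManagedSegmentSketch(1_000);
        for (int i = 0; i < 1_000; i++) {
            records.append(i % 10, i);
        }
        System.out.println(records.sumForKey(3));
    }
}
```

Because the records live inside a single long-lived buffer, the garbage collector has one object to track rather than one per record, which is the effect the paragraph above attributes to Flink's memory management.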
Flink includes several APIs for creating applications that use the Flink engine:
- DataStream API for unbounded streams, embedded in Java and Scala,
- DataSet API for static data, embedded in Java, Scala, and Python, and
- Table API with an SQL-like expression language, embedded in Java and Scala.
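The split between unbounded streams and static data in the list above fits a stream model in which a static data set is simply a stream that ends. The plain-Java sketch below illustrates that idea under stated assumptions: it does not use the Flink APIs, and `BoundedAsStreamSketch`, `RunningCount`, and `run` are hypothetical names introduced for this example. The same record-at-a-time operator is fed first from a finite list and then from an endless generator.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical illustration only: NOT the Flink DataStream or DataSet API.
final class BoundedAsStreamSketch {

    /** Operator that consumes one record at a time and keeps a running count per word. */
    static final class RunningCount {
        private final Map<String, Long> counts = new HashMap<>();

        void onRecord(String word) {
            counts.merge(word, 1L, Long::sum);
        }

        Map<String, Long> snapshot() {
            return new HashMap<>(counts);
        }
    }

    /** Feeds at most `limit` records from any source, bounded or not, into the operator. */
    static Map<String, Long> run(Iterator<String> source, long limit) {
        RunningCount operator = new RunningCount();
        for (long i = 0; i < limit && source.hasNext(); i++) {
            operator.onRecord(source.next());
        }
        return operator.snapshot();
    }

    public static void main(String[] args) {
        // Bounded input: the source ends on its own, so no limit is needed.
        Map<String, Long> batch =
                run(Arrays.asList("a", "b", "a").iterator(), Long.MAX_VALUE);

        // Unbounded input: an endless generator, observed for its first five records.
        Iterator<String> endless = Stream.iterate(0, i -> i + 1)
                .map(i -> i % 2 == 0 ? "even" : "odd")
                .iterator();
        Map<String, Long> firstFive = run(endless, 5);

        System.out.println(batch);
        System.out.println(firstFive);
    }
}
```

The operator itself never needs to know whether its input will end; only the driver loop differs, which is why one engine can serve both kinds of job.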
The Apache Flink community is pleased to announce the availability of the 1.0.0 release. The community has put significant effort into improving and extending Apache Flink since the last release, focusing on the experience of writing and executing data stream processing pipelines in production environments.
Flink can be downloaded from the Apache Flink download page. You don't have to install Hadoop to use Flink, but if you plan to use Flink with data stored in Hadoop, pick the download that matches your installed Hadoop version.