Spark SQL
•It introduces DataFrames, a data structure for structured (and semi-structured) data.
•DataFrames let us run SQL queries from within Spark programs.
•It provides SQL language support, with a command-line interface and ODBC/JDBC connectivity.
Apache Flink
•An open-source framework for distributed stream and batch data
processing.
•It focuses on processing large volumes of data with very low latency and
high fault tolerance on distributed systems.
•Its fault tolerance makes it well suited to streaming data processing.
•The key is that it periodically takes consistent snapshots (checkpoints) of
the application state. After a failure, the system restores the latest
snapshot and resumes processing from there.
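
The snapshot-and-restore idea can be sketched in a few lines of plain Python. This is a toy simulation and not Flink's API: the function run_with_checkpoints, its parameters checkpoint_every and fail_at, and the word-count state are all hypothetical, and it assumes the input can be replayed from a saved offset.

```python
import copy


def run_with_checkpoints(events, checkpoint_every=3, fail_at=None):
    """Count events from a replayable list, surviving one simulated crash."""
    # Last consistent snapshot: input position plus operator state.
    snapshot = {"offset": 0, "counts": {}}
    counts = {}
    offset = 0
    failed_once = False

    while offset < len(events):
        if fail_at is not None and offset == fail_at and not failed_once:
            # Simulated crash: discard in-memory state, restore the snapshot,
            # and replay the input from the saved offset.
            failed_once = True
            counts = copy.deepcopy(snapshot["counts"])
            offset = snapshot["offset"]
            continue

        word = events[offset]
        counts[word] = counts.get(word, 0) + 1
        offset += 1

        if offset % checkpoint_every == 0:
            # State and offset are saved together, so they stay consistent.
            snapshot = {"offset": offset, "counts": copy.deepcopy(counts)}

    return counts


events = ["a", "b", "a", "c", "b", "a", "a"]
without_failure = run_with_checkpoints(events)
with_failure = run_with_checkpoints(events, fail_at=5)
print(without_failure == with_failure)
```

Because the snapshot stores the input offset together with the state, the events processed after the last snapshot are replayed exactly once after recovery, and the final counts match those of a run without any failure.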