Flume

What is Flume?

• Is a distributed service for collecting, aggregating, and moving large amounts of data to a centralized data store
• Is an open-source project of the Apache Software Foundation
• Has the following features:
– Simple
– Reliable
– Fault tolerant
– Suitable for online analytic applications

Flume: Architecture

[Diagram: Web Server → Agent (Source → Channel → Sink) → HDFS]
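
The data flow in the diagram can be sketched as a Flume properties file. This is a minimal illustration rather than a production setup: the agent name a1 and the component names r1, c1, and k1 are arbitrary labels chosen here, and the sequence generator source and logger sink stand in for the web server and HDFS endpoints.

```properties
# Name the components of this agent (a1, r1, c1, k1 are illustrative labels)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: generates a running counter as test events
a1.sources.r1.type = seq

# Channel: buffers events in memory between source and sink
a1.channels.c1.type = memory

# Sink: writes each event to the agent's log
a1.sinks.k1.type = logger

# Wire the pieces together: a source may feed several channels
# (property "channels"), a sink drains exactly one (property "channel")
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```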

Flume Sources (Consume Events)

• Avro source
• Exec source
• Spooling Directory source
• Sequence Generator source
• Syslog source
• HTTP source
• Custom source
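
As one example from the list above, an Exec source could be configured roughly as follows. The agent and component names (a1, r1, c1) and the log path are assumptions made for illustration. Note that the Exec source cannot replay lines that are lost if the agent stops, so the Spooling Directory source is often preferred when reliability matters.

```properties
# Exec source: runs a command and turns each line of its output into an event
a1.sources.r1.type = exec
# Hypothetical web server access log; adjust to the actual location
a1.sources.r1.command = tail -F /var/log/httpd/access_log
# Channel(s) that receive the events
a1.sources.r1.channels = c1
```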

Flume Channels (Hold Events)

• Memory channel
• JDBC channel
• File channel
• Custom channel
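
The trade-off between the two most common channel types can be sketched in configuration form. The names a1, c1, and c2, the directory paths, and the capacity values are illustrative assumptions, not recommendations.

```properties
# Memory channel: fast, but queued events are lost if the agent process dies
a1.channels.c1.type = memory
# Maximum number of events held in the channel
a1.channels.c1.capacity = 10000
# Maximum number of events per transaction
a1.channels.c1.transactionCapacity = 1000

# File channel: persists events to disk so they survive an agent restart
a1.channels.c2.type = file
# Hypothetical local directories for the checkpoint and the event data
a1.channels.c2.checkpointDir = /var/flume/checkpoint
a1.channels.c2.dataDirs = /var/flume/data
```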

Flume Sinks (Deliver Events)

• HDFS sink
• Logger sink
• Avro sink
• IRC sink
• File Roll sink
• Null sink
• HBase sink
• AsyncHBaseSink
• ElasticSearchSink
• Custom sink
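
An HDFS sink, the destination shown in the architecture diagram, might be configured along these lines. The names a1, k1, and c1, the NameNode address, and the target path are assumptions for illustration only.

```properties
# HDFS sink: writes events from channel c1 into files on HDFS
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
# Hypothetical NameNode and directory; %Y-%m-%d creates one directory per day
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/weblogs/%Y-%m-%d
# Use the agent's local time for the date escapes instead of an event header
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Write plain event bodies rather than the default SequenceFile format
a1.sinks.k1.hdfs.fileType = DataStream
# Start a new file every 300 seconds
a1.sinks.k1.hdfs.rollInterval = 300
```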

Configuring Flume

1. Create a configuration file (flume.conf).
2. Store the file in the flume-ng/conf directory.
3. Configure the individual components (sources, channels, and sinks) in the file.
4. (Optional) Edit flume-env.sh.
5. Verify the installation by running the following command:
$ flume-ng help
