When talking about log collection, perhaps the first thing that comes to mind is Flume. Let's take a look at what Flume's official website says:

A Flume event is defined as a unit of data flow having a byte payload and an optional set of string attributes. A Flume agent is a (JVM) process that hosts the components through which events flow from an external source to the next destination (hop).

A Flume source consumes events delivered to it by an external source like a web server. The external source sends events to Flume in a format that is recognized by the target Flume source. For example, an Avro Flume source can be used to receive Avro events from Avro clients or other Flume agents in the flow that send events from an Avro sink. A similar flow can be defined using a Thrift Flume Source to receive events from a Thrift Sink or a Flume Thrift Rpc Client or Thrift clients written in any language generated from the Flume thrift protocol. When a Flume source receives an event, it stores it into one or more channels. The channel is a passive store that keeps the event until it's consumed by a Flume sink. The file channel is one example: it is backed by the local filesystem. The sink removes the event from the channel and puts it into an external repository like HDFS (via Flume HDFS sink) or forwards it to the Flume source of the next Flume agent (next hop) in the flow. The source and sink within the given agent run asynchronously with the events staged in the channel.

1. Basic architecture of Flume

[Figure: Flume architecture diagram, quoted from the official website]

From the above figure, we can see that the core of Flume is the agent, which includes a source, a channel, and a sink. Flume runs as agent processes and transmits data in units of events. An agent process instance includes the following three components.

source: Connects to the upstream data. It is responsible for receiving various types of log data, such as avro, exec, spooling directory, jms, etc., and writing them to the channel.

channel: A data buffer, introduced to reconcile the difference in processing speed between the source and the sink. The main types are memory channel, file channel, and kafka channel.

sink: Connects to the downstream data. It is responsible for pulling data from the channel and sending it to other business components, such as avro, hdfs, logger, file, etc.

Event: The basic unit of data transmission in Flume, and also the basic unit of a transaction.
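To make the relationship between source, channel, and sink more concrete, here is a minimal Python sketch of the idea. It is not Flume code and does not use any Flume API: the names `Event`, `source`, `sink`, and `run_agent` are hypothetical, the channel is just a bounded in-memory queue, and a plain list stands in for the external repository. Real Flume channels are also transactional, which this sketch does not try to model.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class Event:
    """Byte payload plus optional string attributes (headers)."""
    body: bytes
    headers: dict = field(default_factory=dict)


_STOP = object()  # sentinel: tells the sink that the source has finished


def source(lines, channel):
    """Turn each incoming line into an Event and stage it in the channel."""
    for line in lines:
        channel.put(Event(body=line.encode("utf-8"), headers={"origin": "demo"}))
    channel.put(_STOP)


def sink(channel, repository):
    """Remove events from the channel and hand them to the repository."""
    while True:
        item = channel.get()
        if item is _STOP:
            break
        repository.append(item)


def run_agent(lines, capacity=2):
    """Run one source and one sink concurrently around a bounded channel."""
    channel = queue.Queue(maxsize=capacity)  # put() blocks while the channel is full
    repository = []  # hypothetical stand-in for HDFS or a next-hop agent
    workers = [
        threading.Thread(target=source, args=(lines, channel)),
        threading.Thread(target=sink, args=(channel, repository)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return repository


if __name__ == "__main__":
    for event in run_agent(["GET /index.html", "GET /about.html", "POST /login"]):
        print(event.headers, event.body)
```

The small `capacity` is deliberate: when the sink is slower than the source, the source simply waits for room in the channel, which is the buffering role described above.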