Flume Sources
Ad -> Hadoop Training for Java Developers in just $69/3500INR visit www.HadoopExam.com
(Learn BigData)Hadoop Certification 300+ practice Questions visit www.HadoopExam.com
With channels and sinks, events are added to and removed from the
channel as part of a transaction. When you tail a file, however, the read
cannot be part of a transaction.
If the channel fails for any reason, there is no way to roll back the
tailed data and put it back.
Suppose log4j is configured to rotate (rename) the file when it reaches
1 MB in size, and the renamed file is as shown below.
/user/hadoopexam/access.log1
Now Flume is done with access.log1 and starts reading the file
access.log. It is unaware that another file, access.log2, was created,
so Apache Flume never reads that log.
As you can see, data can be lost when using TailSource. That is the
second reason why TailSource was discontinued after the Flume 0.9
release.
The exec source runs a command outside of Flume. The output of that
command is then ingested as events in Flume.
How to use the exec source?
Ans: Set the agent's source type property to exec, as below.
agents.sources.sourceid.type=exec
Define the channels, as below, to feed all the events to a particular channel.
agents.sources.sourceid.channels=channel1
You can also configure more than one channel, using a space as the separator.
Next, specify the remaining mandatory parameter: the command to be passed to the
operating system, as below.
agents.sources.sourceid.command=tail -F /user/hadoopexam/access.log
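The three properties above can be put together in one file. The following is a minimal sketch of a complete agent configuration, assuming an agent named agents (to match the property prefix used above). The memory channel, the logger sink, and the sink name sink1 are illustrative assumptions added only to make the example complete; they are not part of the original walkthrough.

```properties
# Declare the components of the agent named "agents".
agents.sources = sourceid
agents.channels = channel1
agents.sinks = sink1

# Exec source: runs the command and turns its output into events.
agents.sources.sourceid.type = exec
agents.sources.sourceid.command = tail -F /user/hadoopexam/access.log
# Space-separated list; more than one channel may be given here.
agents.sources.sourceid.channels = channel1

# Assumption: an in-memory channel, chosen only to keep the sketch short.
agents.channels.channel1.type = memory

# Assumption: a logger sink, so that ingested events show up in the agent log.
agents.sinks.sink1.type = logger
agents.sinks.sink1.channel = channel1
```

If this were saved as exec-agent.conf (a hypothetical file name), the agent could typically be started with: flume-ng agent --conf conf --conf-file exec-agent.conf --name agents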
Property          Required   Type                  Default
type              Yes        String                exec
channels          Yes        String                Space-separated list of channels
command           Yes        String
restart           No         boolean               FALSE
restartThrottle   No         long (milliseconds)   10000
logStdErr         No         boolean               FALSE
batchSize         No         int                   20
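As an illustration of the optional properties in the table, the sketch below sets each of them explicitly on an exec source. The property names come from the table; the values shown are example choices, and the comments describe the behaviour as commonly documented for the exec source, so treat them as a guide rather than a definitive reference.

```properties
# Mandatory properties.
agents.sources.sourceid.type = exec
agents.sources.sourceid.channels = channel1
agents.sources.sourceid.command = tail -F /user/hadoopexam/access.log

# Optional: restart the command if it exits (default is false).
agents.sources.sourceid.restart = true
# Optional: wait this many milliseconds before attempting a restart.
agents.sources.sourceid.restartThrottle = 10000
# Optional: also log what the command writes to standard error (default is false).
agents.sources.sourceid.logStdErr = true
# Optional: maximum number of lines sent to the channel at a time.
agents.sources.sourceid.batchSize = 20
```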