Hadoop 11: MapReduce

2013-08-13

MapReduce is a programming model proposed by Google for parallel computation over large data sets (larger than 1 TB).


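The MapReduce model

Map: applies a user-defined function to each input record and emits intermediate <key, value> pairs.
Reduce: merges all intermediate values that share the same key into the final result for that key.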
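
To make the two roles concrete, here is a minimal word-count sketch of a map function and a reduce function in plain Java. It does not use any Hadoop classes; the class name MapReduceModel and both method signatures are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.AbstractMap.SimpleEntry;

// Plain-Java model of the two functions; no Hadoop classes are used.
class MapReduceModel {

    // Map: one input line in, zero or more intermediate (word, 1) pairs out.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce: one key plus all of its values in, one merged value out.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(map("to be or not to be"));
        System.out.println(reduce("to", Arrays.asList(1, 1)));
    }
}
```

In a real job the framework, not the caller, collects the pairs produced by map and groups them by key before calling reduce.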

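How a MapReduce job runs

A MapReduce job run involves four independent entities:
1. Client: submits the MapReduce job.
2. JobTracker: coordinates the job run.
3. TaskTracker: runs the Map and Reduce tasks that the job has been split into.
4. Shared FileSystem (usually HDFS): used to share job files among the other entities.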

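1. Job submission
Calling waitForCompletion(true) on a Job submits the job to the jobtracker and waits for it to finish. The submission process does the following:
Asks the jobtracker for a new job ID (step 2) and computes the InputSplits for the job.
Copies the resources needed to run the job, including the job JAR file, the configuration file and the computed input splits,
to a directory named after the job ID in the jobtracker's filesystem. The job JAR is copied with a high replication factor
(controlled by mapred.submit.replication, default 10) (step 3). Finally, it tells the jobtracker that the job is ready to run
(step 4).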
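2. Job initialization
When the JobTracker receives the job, it places it in an internal queue, from which the job scheduler picks it up and initializes it
(step 5).
To build the list of tasks to run, the job scheduler first retrieves the InputSplits computed by the client from the shared filesystem (step 6).
It then creates one map task for each InputSplit.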
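3. Task assignment
Each TaskTracker runs a simple loop that periodically sends a heartbeat to the JobTracker. The heartbeat tells the jobtracker
that the tasktracker is alive. In it, the tasktracker also indicates whether it is ready to run a new task; if it is, the jobtracker allocates it one (step 7).
Before it can choose a task for the tasktracker, the jobtracker must choose a job to take the task from. By default Hadoop
MapReduce uses a FIFO scheduler;
the Fair Scheduler and the Capacity Scheduler are also available.
Once a job has been chosen, the jobtracker picks a task from it. Each tasktracker has a fixed number of slots for map tasks
and for reduce tasks, and tasks are assigned to the tasktracker's free slots.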
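
The heartbeat-driven assignment can be sketched as a toy model. This is only an illustration under simplifying assumptions: a single FIFO queue, one kind of slot, and none of the job priority or data-locality considerations a real scheduler takes into account. FifoAssignmentSketch, QueuedJob, submit and heartbeat are hypothetical names, not Hadoop APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of heartbeat-driven task assignment with a FIFO job queue.
// All class and method names are hypothetical, not Hadoop APIs.
class FifoAssignmentSketch {

    // A job is reduced to its name and a queue of pending task names.
    static class QueuedJob {
        final String name;
        final Deque<String> pendingTasks = new ArrayDeque<>();

        QueuedJob(String name, String... tasks) {
            this.name = name;
            for (String t : tasks) {
                pendingTasks.add(t);
            }
        }
    }

    private final Deque<QueuedJob> jobQueue = new ArrayDeque<>();

    void submit(QueuedJob job) {
        jobQueue.addLast(job); // FIFO: new jobs wait behind older ones
    }

    // Called when a tasktracker reports freeSlots idle slots in its heartbeat.
    // Returns the tasks handed back in the heartbeat response.
    List<String> heartbeat(int freeSlots) {
        List<String> assigned = new ArrayList<>();
        while (assigned.size() < freeSlots && !jobQueue.isEmpty()) {
            QueuedJob head = jobQueue.peekFirst();
            if (head.pendingTasks.isEmpty()) {
                jobQueue.removeFirst(); // oldest job has no tasks left to hand out
                continue;
            }
            assigned.add(head.name + ":" + head.pendingTasks.removeFirst());
        }
        return assigned;
    }
}
```

With two queued jobs, every heartbeat drains the oldest job first, and the second job only receives slots once the first has no pending tasks left.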
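4. Task execution
Once a tasktracker has been assigned a task, the next step is to run it.
First, it localizes the job JAR by copying it from the shared filesystem to the tasktracker's own filesystem.
The tasktracker also copies any files the application needs from the distributed cache to the local disk
(step 8). Second, the tasktracker creates a local working directory for the task and unpacks the contents of the JAR into it.
Third, the tasktracker creates a TaskRunner instance to run the task.
The TaskRunner launches a new child JVM to run each task (step 9).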

The following walks through the data flow of the wordcount example that ships with Hadoop:
hadoop jar hadoop-examples-1.0.4.jar wordcount /usr/input /usr/output

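After the job is submitted, the JobTracker assigns the Map and Reduce tasks.
In this example there are three Map tasks, M1, M2 and M3, and two Reduce tasks, R1 and R2. The Map and Reduce tasks
run on TaskTrackers, and each TaskTracker runs them in Java virtual machines.
The input data, typically stored in HDFS, is read through an InputFormat.
The source can be, for example, ASCII text files or a database accessed over JDBC.
The InputFormat divides the input into InputSplits,
here split1 through split5.
The InputFormat also supplies a RecordReader,
which turns each split into <k,v> records and passes them to map one at a time.
map processes each record and emits its results by calling context.write (OutputCollector.collect in the older API).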
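
As an illustration of how a split becomes <k,v> records, the sketch below assumes line-oriented text input in which the key is the byte offset of a line and the value is the line itself, which is how Hadoop's TextInputFormat presents text files. LineRecordSketch, Record and read are hypothetical names, and the sketch assumes '\n' line endings and UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a line-oriented RecordReader; not the Hadoop class.
class LineRecordSketch {

    // One <k,v> record: k is the byte offset of the line, v is the line text.
    static class Record {
        final long offset;
        final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    // Turns the text of one split into records, assuming '\n' line endings.
    static List<Record> read(String splitText) {
        List<Record> records = new ArrayList<>();
        long byteOffset = 0;
        int start = 0;
        while (start < splitText.length()) {
            int end = splitText.indexOf('\n', start);
            if (end < 0) {
                end = splitText.length(); // last line without a trailing newline
            }
            String line = splitText.substring(start, end);
            records.add(new Record(byteOffset, line));
            // Advance by the encoded length of the line plus one byte for '\n'.
            byteOffset += line.getBytes(StandardCharsets.UTF_8).length + 1;
            start = end + 1;
        }
        return records;
    }
}
```

Each Record returned by read corresponds to one call of the map function.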

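The output that the Mapper writes through the context is handed on to the Partitioner.


Between the Mapper and the Partitioner, an optional Combiner can be applied to the Mapper's output.
It groups the <k,v> pairs by key, so that the values for each key
form a list, and the Combiner merges each list locally on the map side before the data is transferred.
In the example, the output of M1 goes through the Combiner and is then passed to the Partitioner.
The Partitioner decides which Reduce task receives each key.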
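
A minimal sketch of these two map-side steps for word count follows. The partition expression mirrors the formula used by Hadoop's default HashPartitioner; everything else, including the class name CombineAndPartitionSketch and the use of plain strings as keys, is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of a word-count Combiner and a hash Partitioner.
class CombineAndPartitionSketch {

    // Combiner: sums the counts for each word emitted by one map task,
    // so that one (word, n) pair is sent instead of n separate (word, 1) pairs.
    static Map<String, Integer> combine(List<String> emittedWords) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (String word : emittedWords) {
            combined.merge(word, 1, Integer::sum);
        }
        return combined;
    }

    // Partitioner: maps a key to a reduce task index in [0, numReduceTasks).
    // The mask clears the sign bit so a negative hashCode cannot produce
    // a negative index.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

Because the partition depends only on the key, equal keys emitted by different map tasks always land on the same reduce task.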
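After Map, the Reduce side has three phases: Shuffle,
sort and reduce.
Shuffle: in Hadoop MapReduce, the Map output is partitioned by key.
Each Reducer is responsible for a subset of the keys emitted by the Mappers,
and all records with the same key go to the same Reducer.
The Reducer copies its share of each Mapper's output over HTTP.
Sort: because different Mappers may have emitted the same key,
the framework merge-sorts the Reducer's input so that <key,value> pairs with the same key are grouped together.
Reduce: after Shuffle and sort, the input to each Reducer has the form <key, (list of values)>.
Reducer.reduce is called for each such group, and the result is written through the OutputFormat to the DFS.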
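
The grouping that Shuffle and sort perform before reduce is called can be sketched in plain Java. This sketch merges everything in memory, whereas the real framework copies map output over the network and merge-sorts it; ShuffleSortSketch and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.AbstractMap.SimpleEntry;

// In-memory sketch of shuffle, sort and reduce for word count.
class ShuffleSortSketch {

    // Small helper to build one intermediate <key, value> pair.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Merges the outputs of several map tasks into <key, (list of values)>.
    // TreeMap keeps the keys in sorted order.
    static TreeMap<String, List<Integer>> shuffleAndSort(
            List<List<Map.Entry<String, Integer>>> mapOutputs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (List<Map.Entry<String, Integer>> oneMapOutput : mapOutputs) {
            for (Map.Entry<String, Integer> p : oneMapOutput) {
                grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>())
                       .add(p.getValue());
            }
        }
        return grouped;
    }

    // Reduce step for word count: sum the list of values of every key.
    static TreeMap<String, Integer> reduceAll(TreeMap<String, List<Integer>> grouped) {
        TreeMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            result.put(e.getKey(), sum);
        }
        return result;
    }
}
```

Passing the outputs of two map tasks that both emitted the same word yields a single entry for that word whose list holds both values, which reduceAll then sums.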