
BI BIG DATA

BIG DATA
Gartner : Big Data

Variety:

Velocity:

Volume:

TB to ZB

Oracle

BIG DATA

[Figure: Big Data sources and analytics. Sources shown: CRM, Billing, Location, CDRs, Network Devices, Internet, Blogs, e-Mail. Source categories: Traditional/Relational Data Sources; Non-Traditional/Non-Relational Data Sources. Processing blocks: Database & Warehouse; At-Rest Data Analytics; Internet Scale Data Analytics, Data Operations & Model Building.]



Big Data EDA

[Figure: Big Data EDA architecture. Labels: EDA; Oracle/GP; EDW (Oracle); EDW; ODS; Hadoop; CRM, BSS, OSS.]






Infiniband
20Gbps

1    3000    30000000
2    2       120
3    2       120
4    5       300
5    3T      3*1024*1024
6            242.8MB/s
7    30      150T
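
The unit conversions behind figures such as "3T = 3*1024*1024" and a 20Gbps link can be sketched as below. This is only an illustration of the arithmetic: it assumes binary storage units (1 TB = 1024 * 1024 MB) and decimal network units (1 Gbps = 10^9 bits per second), and the function names are hypothetical. It does not attempt to reproduce the other rows of the table, whose labels are not available.

```python
# Unit-conversion sketch. Assumptions: binary storage units, decimal link rates.

def tb_to_mb(terabytes: float) -> float:
    """Convert terabytes to megabytes using 1 TB = 1024 * 1024 MB."""
    return terabytes * 1024 * 1024


def gbps_to_mb_per_s(gigabits_per_second: float) -> float:
    """Convert a link rate in Gbps to MB/s (1 Gbps = 1e9 bits/s, 1 MB = 1024 * 1024 bytes)."""
    bytes_per_second = gigabits_per_second * 1_000_000_000 / 8
    return bytes_per_second / (1024 * 1024)


three_tb_in_mb = tb_to_mb(3)          # the "3T" row expressed in MB
link_mb_per_s = gbps_to_mb_per_s(20)  # raw rate of a 20Gbps link, before protocol overhead

print(three_tb_in_mb)
print(round(link_mb_per_s, 1))
```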

Hadoop

ZB

Big Data


MapReduce

Hadoop
HDFS
MapReduce
Hive
HBase

Hadoop

ODS

Heritrix

Hadoop HDFS

Hive
HDFS


2011/12/27 16:35:11 [debug] 243385#0: *11 LatnId=551
2011/12/27 16:35:11 [debug] 243385#0: *11 avscFileName=3504.avsc
2011/12/27 16:35:11 [debug] 243385#0: *11 svcName:DPRINT will be called.
2011/12/27 16:35:11 [debug] 243385#0: *11 BeginWrite:ret=1
2011/12/27 16:35:11 [debug] 243385#0: *11 sim tpcall success!
log_time, log_level, thread_info, log_detail

SELECT log_time, log_detail FROM log_table WHERE log_level = 'error';
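
As a rough illustration of how the raw log lines map onto the four columns, the sketch below splits each sample line into log_time, log_level, thread_info and log_detail, then applies the same filter as the query. The regular expression and the exact boundary between thread_info and log_detail are assumptions, and the helper names are hypothetical; in Hive the split would be defined by the table's row format rather than by Python code.

```python
import re

# Sample lines copied from the slide.
LOG_LINES = [
    "2011/12/27 16:35:11 [debug] 243385#0: *11 LatnId=551",
    "2011/12/27 16:35:11 [debug] 243385#0: *11 avscFileName=3504.avsc",
    "2011/12/27 16:35:11 [debug] 243385#0: *11 svcName:DPRINT will be called.",
    "2011/12/27 16:35:11 [debug] 243385#0: *11 BeginWrite:ret=1",
    "2011/12/27 16:35:11 [debug] 243385#0: *11 sim tpcall success!",
]

# Assumed layout: "<date> <time> [<level>] <pid>#<tid>: <detail>".
LINE_RE = re.compile(
    r"^(?P<log_time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<log_level>\w+)\] "
    r"(?P<thread_info>\d+#\d+): "
    r"(?P<log_detail>.*)$"
)


def parse_line(line):
    """Return a dict with the four columns, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def select_by_level(lines, level):
    """Mimic: SELECT log_time, log_detail FROM log_table WHERE log_level = level."""
    rows = (parse_line(line) for line in lines)
    return [(r["log_time"], r["log_detail"]) for r in rows if r and r["log_level"] == level]


rows = [parse_line(line) for line in LOG_LINES]
debug_rows = select_by_level(LOG_LINES, "debug")
error_rows = select_by_level(LOG_LINES, "error")  # the sample contains no error lines
```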

HBase

news.sina.com


MapReduce
Master

500012101.2

DataNode

Map

(201110, 40.27)
(201110, 149)
(201110, 25.15)
(201110, 138.05)
(201111, 197.5)
(201111, 128.25)
(201111, 302.74)
(201111, 156.45)
(201112, 277.39)
(201112, 129)
(201112, 156.17)
(201112, 130)

(201110, 40.27, 149, 25.15, 138.05)
(201111, 197.5, 128.25, 302.74, 156.45)
(201112, 277.39, 129, 156.17, 130)

Reduce

DataNode

(201110, 352.47)

DataNode

(201111, 784.94)
(201112, 692.56)
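
The map, group and reduce steps in this example can be imitated in a few lines of single-process Python. This is a minimal sketch of the idea, not Hadoop code: the function names are hypothetical, the map step simply passes the (month, amount) pairs through because the raw input records are not shown on the slide, and rounding to two decimals is an assumption made to avoid floating-point noise.

```python
from collections import defaultdict

# (month, amount) pairs as emitted by the Map step on the slide.
RECORDS = [
    ("201110", 40.27), ("201110", 149), ("201110", 25.15), ("201110", 138.05),
    ("201111", 197.5), ("201111", 128.25), ("201111", 302.74), ("201111", 156.45),
    ("201112", 277.39), ("201112", 129), ("201112", 156.17), ("201112", 130),
]


def map_phase(records):
    """Emit one (key, value) pair per record; here the key is the month."""
    for month, amount in records:
        yield month, amount


def shuffle(pairs):
    """Group all values by key, as the framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped):
    """Sum the values of each key; rounding keeps two decimal places."""
    return {key: round(sum(values), 2) for key, values in grouped.items()}


grouped = shuffle(map_phase(RECORDS))
totals = reduce_phase(grouped)
print(totals)
```

On a real cluster the pairs for different months may be reduced on different DataNodes, as the slide shows; the grouping by key is what makes that split possible.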

ETL

[Figure: ETL flow for URL records. Labels: URL, API, Hadoop, Itv.]
20111018102340-723938881 | 20111018102250601149905 | 20111018102340 | 189xxxxxxxx |
221.179.193.19 | 80 | weibo.cn |
http://weibo.cn/dpool/ttt/home.php?uid=1285846970&gsid=3_5bc65ef7862f7c9a315084e6aa8204391a29bf2f0d4bbc5645 |
http://weibo.cn/dpool/ttt/msg.php?uid=1285846970&gsid=3_5bc65ef7862f7c9a315084e6aa8204391a29bf2f0d4bbc5645 |
200 | text/vnd.wap.wml | wap | 550 | 19823 | 10114 | 14021 |
BREWApplet/0x20068888(BREW/3.1.5.20;DeviceId:180027;Lang:zhcn)ucweb-squid | 3 | WAP2.0 | GET | CTWAP
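
A first ETL step on a record like this one could look like the sketch below, which splits the pipe-delimited line (with its wrapped pieces joined) and breaks the URL into host, path and query parameters. The slide does not name the fields, so the positions used here (index 6 for the host, index 7 for the URL) and the helper names are assumptions for illustration only.

```python
from urllib.parse import urlsplit, parse_qs

# The sample record from the slide, with the wrapped lines joined.
RECORD = (
    "20111018102340-723938881 | 20111018102250601149905 | 20111018102340 | 189xxxxxxxx | "
    "221.179.193.19 | 80 | weibo.cn | "
    "http://weibo.cn/dpool/ttt/home.php?uid=1285846970"
    "&gsid=3_5bc65ef7862f7c9a315084e6aa8204391a29bf2f0d4bbc5645 | "
    "http://weibo.cn/dpool/ttt/msg.php?uid=1285846970"
    "&gsid=3_5bc65ef7862f7c9a315084e6aa8204391a29bf2f0d4bbc5645 | "
    "200 | text/vnd.wap.wml | wap | 550 | 19823 | 10114 | 14021 | "
    "BREWApplet/0x20068888(BREW/3.1.5.20;DeviceId:180027;Lang:zhcn)ucweb-squid | "
    "3 | WAP2.0 | GET | CTWAP"
)

# Hypothetical field positions; the slide does not label the columns.
HOST_IDX = 6
URL_IDX = 7


def split_record(record):
    """Split on '|' and trim the padding around each field."""
    return [field.strip() for field in record.split("|")]


fields = split_record(RECORD)
url = urlsplit(fields[URL_IDX])
query = parse_qs(url.query)

print(len(fields), fields[HOST_IDX], url.path, query["uid"])
```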



Q&A
