Self-Hosting
According to Amazon's calculations, "it generally costs between $19,000 and $25,000
per terabyte per year, at list prices, to build and run a good-sized data warehouse
on your own. Amazon Redshift, all-in, will cost you less than $1,000 per terabyte per
year."
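To make the per-terabyte figures concrete, the arithmetic can be sketched as below. This is only an illustration: the constants are the figures quoted above, the 10 TB warehouse size is a hypothetical input, and the Redshift number is an upper bound because the quote says "less than $1,000".

```python
# Figures quoted above, in USD per terabyte per year (list prices).
SELF_HOSTED_LOW = 19_000
SELF_HOSTED_HIGH = 25_000
REDSHIFT_UPPER_BOUND = 1_000  # quoted as "less than $1,000", so a ceiling


def annual_cost_range(terabytes):
    """Return (self-hosted low, self-hosted high, Redshift ceiling) in USD per year."""
    return (
        terabytes * SELF_HOSTED_LOW,
        terabytes * SELF_HOSTED_HIGH,
        terabytes * REDSHIFT_UPPER_BOUND,
    )


if __name__ == "__main__":
    # Hypothetical warehouse size, chosen only for illustration.
    low, high, redshift = annual_cost_range(10)
    print(f"Self-hosted: ${low:,} to ${high:,} per year")
    print(f"Redshift:    under ${redshift:,} per year")
    print(f"Self-hosting costs at least {low // redshift}x as much")
```

Because the cost is linear in the data size, the ratio between the two options (at least 19x on these figures) stays the same whatever size is plugged in.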
Columnar Data Storage
Advanced Compression
Supports Sort Key for better dynamic sorts
Can run on Virtualized Platforms
Index Support
Redshift
Teradata
HP Vertica
Oracle Database
EMC Greenplum
Available
Available
Available
Available
Available
Available
Available
Available
Available
Supported
Not Supported
Not Supported
Not Supported
Not Supported
Yes. Since Amazon Redshift is built upon PostgreSQL, it has the inherent capability to run on commodity machines running virtual platforms.
Not Available
Information not Available
Not Supported
Vertica 6.1 does support hardware virtual machines, but this is nowhere close to Redshift's offering of Data as a Service.
Not Supported
Information not Available
Information not Available
No Information Available
Supported
Supported
Available
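The sort key entry in the comparison above rests on a general idea: when data is stored sorted on a column, a range query can jump straight to the matching rows instead of scanning everything. The sketch below illustrates that idea only; it is not how Redshift implements sort keys, and the function names and sample values are hypothetical.

```python
from bisect import bisect_left, bisect_right


def range_scan_sorted(sorted_keys, low, high):
    """Find keys in [low, high] with two binary searches on sorted data."""
    start = bisect_left(sorted_keys, low)    # first position with key >= low
    end = bisect_right(sorted_keys, high)    # first position with key > high
    return sorted_keys[start:end]


def range_scan_unsorted(keys, low, high):
    """Find keys in [low, high] by checking every value (full scan)."""
    return [k for k in keys if low <= k <= high]


if __name__ == "__main__":
    # Hypothetical column values, e.g. order dates encoded as integers.
    unsorted = [20130905, 20130101, 20130927, 20130315, 20130620]
    stored_sorted = sorted(unsorted)

    print(range_scan_sorted(stored_sorted, 20130301, 20130630))
    print(sorted(range_scan_unsorted(unsorted, 20130301, 20130630)))
```

Both functions return the same rows; the difference is that the sorted version touches only the boundaries of the range, which is where the benefit of storing data in sort-key order comes from.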
Apache Hadoop is an open-source software framework for distributed storage and distributed
processing of Big Data.
Nodes Possible
    Redshift: 100
    Hadoop: Unlimited
Max Node Size
    Redshift: 16 TB
    Hadoop: Unlimited
Performance
    Redshift: Performs better at terabyte-level data (which is usually sufficient for most businesses)
    Hadoop: Performs better at petabyte-level data (only relevant for large businesses, which will want to maintain their own warehouse anyway)
Ease of Migration
    Redshift: As it uses PostgreSQL as the underlying database and SQL for queries, it is already familiar to most developers
    Hadoop: System administrators will need to learn Hadoop architecture and tools, as they are quite different, and developers will need to learn coding in Pig or MapReduce
Datatype Support
    Redshift: Limited. Presently no support for XML, data arrays etc.
    Hadoop: All datatypes supported
Thus we can conclude that Redshift is better suited to most businesses, except the
very large ones (such as a database for the entire Tata Group), where Hadoop might be
a better choice, albeit at a higher cost than Redshift.
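As a rough rule of thumb, the conclusion can be sketched from the "Nodes Possible" and "Max Node Size" figures in the Redshift and Hadoop comparison. The function names are hypothetical, and using raw cluster capacity as the cut-off is an assumption made for illustration; a real decision would also weigh cost, compression, data types and team skills.

```python
# Limits taken from the Redshift column of the comparison above.
REDSHIFT_MAX_NODES = 100
REDSHIFT_MAX_NODE_TB = 16


def redshift_capacity_tb():
    """Upper bound on raw Redshift cluster capacity implied by those limits."""
    return REDSHIFT_MAX_NODES * REDSHIFT_MAX_NODE_TB


def suggest_platform(data_tb):
    """Suggest a platform from data volume alone (an assumed, simplified rule)."""
    if data_tb <= redshift_capacity_tb():
        return "Redshift"
    return "Hadoop"


if __name__ == "__main__":
    print(f"Redshift ceiling: {redshift_capacity_tb()} TB")
    for size_tb in (50, 5_000):  # hypothetical data volumes
        print(f"{size_tb} TB -> {suggest_platform(size_tb)}")
```

On these figures the ceiling works out to 1,600 TB, which matches the observation that Redshift covers terabyte-level workloads while petabyte-scale data points towards Hadoop.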
http://www.informationweek.com/software/information-management/amazondebuts-low-cost-big-data-warehousing/d/d-id/1107568?
http://dwh-bi-etl-reviews.quora.com/Amazon-Redshift-%E2%80%93-Differentiatorsand-Limitations
http://www.vertica.com/2010/11/23/life-beyond-indices-the-query-benefits-ofstoring-sorted-data/
http://aws.amazon.com/documentation/redshift/
http://snowplowanalytics.com/blog/2013/09/27/how-much-does-snowplow-cost-torun/