www.semtech-solutions.co.nz
info@semtech-solutions.co.nz
Impala: What is it?
Ad hoc, real-time query for Hadoop
Open source
Developed by Cloudera
Based on Google's 2010 Dremel paper
Direct data access via the Impala engine
Future Hadoop Parquet update will
Direct data access
Query planning / coordination on data nodes
Node-based query engine
Low latency
Performance improvement
Query data on HDFS or HBase
Uses the same Hive QL syntax (SQL-like)
Has the Hue GUI
Allows table joins and aggregation
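As a rough illustration of the "table joins and aggregation" point, the sketch below shows the kind of SQL-like statement that could be written in Hive QL syntax. The table names (orders, customers), column names, and sample rows are hypothetical and not taken from these slides. To keep the example runnable without a cluster, Python's built-in sqlite3 module is used as a local stand-in; it is not Impala, and against a real cluster the same statement would instead be submitted through Hue or the impala-shell client.

```python
import sqlite3

# Hypothetical join + aggregation query in SQL-like syntax.
# Table and column names are illustrative assumptions only.
QUERY = """
SELECT c.region, COUNT(*) AS order_count, SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY c.region
"""


def run_demo():
    """Run QUERY against an in-memory SQLite database (a stand-in, not Impala)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, region TEXT)")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    # Made-up sample rows so the query has something to aggregate.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "north"), (2, "south")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (1, 5.0), (2, 7.5)],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

The query joins the two tables on the customer key and then groups by region, which is the join-then-aggregate pattern the slide refers to.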
Impala Performance
Impala delivers performance gains
Cached queries
Impala Formats
Supported formats
Text and Sequence Files, which can be compressed as
Snappy
GZIP
BZIP
Future support for
Impala Architecture
Impala Requirements
What does Impala need to run?
CentOS 6.2 or RHEL (Red Hat Enterprise Linux)
CDH 4.1 (Cloudera Hadoop Distribution)
Cloudera Manager (advised)
Contact Us
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
We offer IT project consultancy.
We are happy to hear about your problems.
You pay only for the hours you need to solve your problems.