On this page, I am putting together an index to help us all install, configure, and deploy a Hadoop ecosystem.
ZooKeeper
Hadoop
- How to deploy an HDFS environment
- Official site
- Hadoop Download Mirror Page
- A native go client for HDFS
- snakebite -- A pure python HDFS client
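As a rough illustration of what the pure Python client looks like in use, here is a minimal sketch that lists the entries under an HDFS directory with snakebite. The host `localhost`, the NameNode RPC port `8020`, and the helper names `make_client` and `list_hdfs_paths` are assumptions for illustration only; adjust them to your cluster. Note that the original snakebite package targets Python 2, so on Python 3 you may need a fork.

```python
def make_client(host="localhost", port=8020):
    """Create a snakebite client (host and port are assumed values)."""
    # Imported lazily so the helper below can be used without snakebite installed.
    from snakebite.client import Client
    return Client(host, port)


def list_hdfs_paths(client, root="/"):
    """Return the 'path' field of every entry directly under `root`.

    `client.ls` takes a list of paths and yields one dict per entry.
    """
    return [entry["path"] for entry in client.ls([root])]


# Usage against a running cluster (not executed here):
#   client = make_client("localhost", 8020)
#   for path in list_hdfs_paths(client, "/"):
#       print(path)
```

Keeping the listing logic separate from client creation makes it easy to try the helper with a stand-in object before pointing it at a real NameNode.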
HBase
Hive
Pig
Spark
Impala
Impala is written in C++ and works alongside Hive: it shares the Hive metastore and uses a SQL syntax close to HiveQL.
See its official documentation:
- Impala Tutorials
- Post-Installation Configuration for Impala
- Installing Impala from the Command Line
- Using HDFS Caching with Impala (CDH 5.1 or higher only)
Presto
Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.