Below are some reference architectures for Hadoop:
Cluster Hosts and Role Assignments
CDH 5 and Cloudera Manager 5 Requirements and Supported Versions
High Availability support
HDFS Directory Structure recommendation (Eric Sammer):
/data : Contains canonical, raw data sets ingested from other systems. Read-only to users.
/user/<username> : Home directories / scratch pads for users.
/etl : Contains ETL process queue directories.
/tmp : Scratch space for tools and users, with the sticky bit set (no guarantee on longevity).
/data/<dataset name>/<optional partitions> : Layout of an individual data set within /data.
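The layout above can be sketched as a small Python helper that prints the `hdfs dfs` commands for creating the top-level directories and builds a data set path. This is only an illustration under stated assumptions: the octal modes (other than the sticky bit on /tmp) are guesses that fit the descriptions, and the names `LAYOUT`, `layout_commands`, `dataset_path`, the `weblogs` data set, and the `year=`/`month=` partition naming are hypothetical, not part of the original recommendation.

```python
from typing import List, Sequence, Tuple

# Top-level directories from the recommendation, paired with octal modes.
# Only the sticky bit on /tmp is stated in the notes; the other modes are
# assumptions chosen to match the descriptions.
LAYOUT: List[Tuple[str, str]] = [
    ("/data", "755"),   # raw data sets, read-only to ordinary users (assumed mode)
    ("/user", "755"),   # parent of per-user home directories (assumed mode)
    ("/etl", "755"),    # ETL process queue directories (assumed mode)
    ("/tmp", "1777"),   # world-writable scratch space with the sticky bit set
]


def layout_commands() -> List[str]:
    """Return the shell commands that would create the layout (not executed here)."""
    commands: List[str] = []
    for path, mode in LAYOUT:
        commands.append(f"hdfs dfs -mkdir -p {path}")
        commands.append(f"hdfs dfs -chmod {mode} {path}")
    return commands


def dataset_path(dataset: str, partitions: Sequence[str] = ()) -> str:
    """Build a path of the form /data/<dataset name>/<optional partitions>."""
    return "/".join(["/data", dataset, *partitions])


if __name__ == "__main__":
    for command in layout_commands():
        print(command)
    # Hypothetical data set name and partition scheme, for illustration only.
    print(dataset_path("weblogs", ["year=2014", "month=05"]))
```

Generating the commands as strings, rather than running them, keeps the sketch safe to try on a machine without a cluster; the output can be reviewed and adjusted before being run by an HDFS superuser.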
Why Hadoop cannot replace traditional RDBMS at present:
Active Directory concepts: