Welcome to Apache™ Hadoop®!
Hadoop 201 -- Deeper into the Elephant
Prerequisites for Learning Hadoop – Hadoop Training
Hadoop Cluster – Architecture and Core Components
Hadoop 1.0 vs Hadoop 2.0
IBM Analytics - Hadoop
Hadoop Dev: IBM BigInsights for Hadoop Developer Community
IBM - What is the Hadoop Distributed File System (HDFS) - United States
Hadoop: The Definitive Guide - Tom White - Google Books
The Architecture of Open Source Applications: The Hadoop Distributed File System
Testing of several distributed file-systems (HDFS, Ceph and GlusterFS) for supporting the HEP experiments analysis: performance, comparison
MapR's Direct Access NFS vs. Hadoop FUSE
Installation
Scalable Spark/HDFS Setup using Docker — Medium
Getting started with HDFS on Kubernetes – Hasura
Mounting
MountableHDFS - Hadoop Wiki
Simplifying data management: NFS access to HDFS - Hortonworks
cemeyer/hadoofus: C, FUSE, libhdfs-compatible, out-of-order execution
remis-thoughts/native-hdfs-fuse: C, FUSE
cloudera/hdfs-nfs-proxy: Java, NFSv4
Tuning
How-to: Deploy Apache Hadoop Clusters Like a Boss - Cloudera Engineering Blog
Hadoop configuration & performance tuning
Ecosystem
The ecosystem is vast; these are the projects I came across.
spotify/luigi: Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Cascading | Application Platform for Enterprise Big Data
Apache YARN & Hadoop - Hortonworks
Spark tutorial: Get started with Apache Spark | InfoWorld
Spark