Abstract
The Hadoop framework has evolved to manage big data in the cloud. The Hadoop distributed file system and MapReduce, the vital components of this framework, provide scalable and fault-tolerant big data storage and processing services at a lower cost. However, Hadoop does not provide a robust mechanism for authenticating principals. In fact, the existing state-of-the-art authentication protocols are vulnerable to various security threats, such as man-in-the-middle, replay, password guessing, stolen-verifier, privileged-insider, identity compromise, impersonation, denial-of-service, online/offline dictionary, chosen plaintext, workstation compromise, and server-side compromise attacks. Besides these threats, the state-of-the-art mechanisms fail to address server-side data integrity and confidentiality issues. In addition, most of the existing authentication protocols follow a single-server-based user authentication strategy, which introduces single point of failure and single point of vulnerability issues. To address these limitations, in this paper we propose a fault-tolerant authentication protocol suitable for the Hadoop framework, called the efficient authentication protocol for Hadoop (HEAP). HEAP alleviates the major issues of the existing state-of-the-art authentication mechanisms presently deployed in Hadoop, namely operating-system-based authentication, the password-based approach, and delegated token-based schemes. HEAP follows a two-server-based authentication mechanism. HEAP authenticates the principal using a digital signature generation and verification strategy that utilizes both the advanced encryption standard and elliptic curve cryptography. The security analysis, comprising a formal analysis under the broadly accepted real-or-random (ROR) model and an informal (non-mathematical) analysis, shows that HEAP protects against several well-known attacks.
In addition, the formal security verification using the widely used Automated Validation of Internet Security Protocols and Applications (AVISPA) tool shows that HEAP is resilient against replay and man-in-the-middle attacks. Finally, the performance study indicates that the overheads incurred by HEAP are reasonable and comparable to those of other existing state-of-the-art authentication protocols. High security along with comparable overheads makes HEAP robust and practical for secure access to big data storage and processing services.