Kerberos on Hadoop

Kerberos is an authentication protocol that uses tickets to let nodes communicate securely over a non-secure network. Each user must periodically obtain or renew a ticket with the kinit command. In Kerberos, users are called principals. Principals fall into two broad groups:

  • System users – principals for communication between services in the Hadoop cluster
  • Common users

The Kerberos server itself is known as the Key Distribution Center (KDC). At a high level, it has three parts:

  • A database of the users and services (known as principals) that it knows about and their respective Kerberos passwords
  • An Authentication Server (AS) which performs the initial authentication and issues a Ticket Granting Ticket (TGT)
  • A Ticket Granting Server (TGS) that issues subsequent service tickets based on the initial TGT

How does it work?

A user principal requests authentication from the AS (1). The AS returns a TGT together with a session key that is encrypted using a key derived from the user principal’s Kerberos password (2), which is known only to the user principal and the KDC. The user principal decrypts the session key locally using its Kerberos password, and from that point forward, until the ticket expires, it can use the TGT to get service tickets from the TGS (3). Service tickets are what allow a principal to access various services (4)(5)(6).
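The exchange above can be sketched as a small, self-contained Python toy. This is only an illustration under loud assumptions: ToyKDC, seal, unseal, and the example principals are hypothetical names, the "encryption" is a throwaway XOR keystream with an HMAC tag rather than real Kerberos cryptography, and the authenticator that a real client sends to the TGS is left out. In this sketch the part of the AS reply that the user must open is sealed with the user's key, while the TGT itself is sealed with a key that only the KDC holds.

```python
import hashlib
import hmac
import json
import os
import time


def derive_key(secret: str) -> bytes:
    """Toy stand-in for Kerberos string-to-key: hash the password into 32 bytes."""
    return hashlib.sha256(secret.encode()).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Generate `length` pseudo-random bytes from key and nonce (toy only)."""
    stream, counter = b"", 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return stream[:length]


def seal(key: bytes, payload: dict) -> bytes:
    """Toy authenticated encryption. NOT real Kerberos crypto."""
    data = json.dumps(payload).encode()
    nonce = os.urandom(16)
    cipher = bytes(a ^ b for a, b in zip(data, _keystream(key, nonce, len(data))))
    tag = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    return nonce + tag + cipher


def unseal(key: bytes, blob: bytes) -> dict:
    """Verify the tag and decrypt; raises ValueError if the key is wrong."""
    nonce, tag, cipher = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered message")
    plain = bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))
    return json.loads(plain)


class ToyKDC:
    """Hypothetical KDC: a principal database plus AS and TGS steps."""

    def __init__(self, principals: dict):
        # Database of principals and their long-term keys.
        self.keys = {name: derive_key(pw) for name, pw in principals.items()}
        self.tgs_key = os.urandom(32)  # known only to the KDC

    def authenticate(self, user: str):
        """AS step (1)(2): return the TGT and a reply sealed with the user's key."""
        session_key = os.urandom(32).hex()
        tgt = seal(self.tgs_key, {"user": user, "session_key": session_key,
                                  "expires": time.time() + 3600})
        reply = seal(self.keys[user], {"session_key": session_key})
        return tgt, reply

    def grant_service_ticket(self, tgt: bytes, service: str) -> bytes:
        """TGS step (3): validate the TGT, then seal a ticket with the service's key."""
        claims = unseal(self.tgs_key, tgt)
        if claims["expires"] < time.time():
            raise ValueError("TGT expired")
        return seal(self.keys[service], {"user": claims["user"], "service": service})


# Hypothetical principals and passwords, for illustration only.
kdc = ToyKDC({"alice@EXAMPLE.COM": "alice-password",
              "hdfs/node1@EXAMPLE.COM": "service-key"})
tgt, reply = kdc.authenticate("alice@EXAMPLE.COM")
session = unseal(derive_key("alice-password"), reply)  # only the right password opens this
ticket = kdc.grant_service_ticket(tgt, "hdfs/node1@EXAMPLE.COM")
claims = unseal(derive_key("service-key"), ticket)  # the service opens it with its own key
print(claims["user"], "may access", claims["service"])
```

The user never sends a password over the network in this toy: the password only serves to open the AS reply locally, and the service trusts the ticket because only the KDC could have sealed it with the service's key.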

Because cluster resources (hosts or services) cannot provide a password each time to decrypt the TGT, they use a special file, called a keytab, which contains the resource principal’s authentication credentials. The set of hosts, users, and services over which the Kerberos server has control is called a realm.

To run Hadoop commands, you first need to obtain a Kerberos ticket with the kinit command: kinit [-kt user_keytab username]. Once that is done, you can list your tickets with: klist.
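For scripted jobs, the same two commands can be wrapped in a small helper. The following is a minimal sketch, assuming MIT Kerberos client tools are on the PATH; kinit_command, has_valid_ticket, and ensure_ticket are hypothetical helper names, and the principal and keytab path are placeholders. It relies on klist -s, which prints nothing and reports through its exit status whether a valid credentials cache exists.

```python
import subprocess
from typing import List, Optional


def kinit_command(principal: str, keytab: Optional[str] = None) -> List[str]:
    """Build the kinit argument list; with a keytab there is no password prompt."""
    if keytab:
        return ["kinit", "-kt", keytab, principal]
    return ["kinit", principal]


def has_valid_ticket() -> bool:
    """Return True if `klist -s` exits with status 0 (valid credentials cache)."""
    try:
        return subprocess.run(["klist", "-s"]).returncode == 0
    except FileNotFoundError:
        return False  # Kerberos client tools are not installed


def ensure_ticket(principal: str, keytab: Optional[str] = None) -> None:
    """Run kinit only when no valid ticket is cached."""
    if not has_valid_ticket():
        subprocess.run(kinit_command(principal, keytab), check=True)


# Placeholder principal and keytab path; this only prints the command.
print(" ".join(kinit_command("alice@EXAMPLE.COM", "/path/to/alice.keytab")))
```

Calling ensure_ticket(...) before launching a Hadoop command keeps long-running scripts from failing when the cached ticket has expired.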

Terminology

Term                               Description
Key Distribution Center, or KDC    The trusted source for authentication in a Kerberos-enabled environment.
Kerberos KDC Server                The machine, or server, that serves as the Key Distribution Center (KDC).
Kerberos Client                    Any machine in the cluster that authenticates against the KDC.
Principal                          The unique name of a user or service that authenticates against the KDC.
Keytab                             A file that includes one or more principals and their keys.
Realm                              The Kerberos network that includes a KDC and a number of Clients.
KDC Admin Account                  An administrative account used by Ambari to create principals and generate keytabs in the KDC.
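To make the Principal and Realm terms concrete: a principal name is conventionally written as primary/instance@REALM, where the instance part is optional (service principals usually carry a host name there). The sketch below splits such a name into its parts. parse_principal and the example names are hypothetical, and escaped separator characters are not handled.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # user or service name
    instance: Optional[str]  # often a host name for service principals
    realm: Optional[str]     # Kerberos realm, conventionally upper case


def parse_principal(name: str) -> Principal:
    """Split 'primary/instance@REALM'; instance and realm may be absent."""
    rest, _, realm = name.partition("@")
    primary, _, instance = rest.partition("/")
    if not primary:
        raise ValueError(f"invalid principal: {name!r}")
    return Principal(primary, instance or None, realm or None)


# Hypothetical example names.
print(parse_principal("hdfs/node1.example.com@EXAMPLE.COM"))
print(parse_principal("alice@EXAMPLE.COM"))
```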

