• Authentication – Verifying credentials to reliably identify a user
• Authorization – Limiting the user’s access to a given resource
• User – Individual identified by underlying authentication system
• Group – A set of users, maintained by the authentication system
• Privilege – An instruction or rule that allows access to an object
• Role – A set of privileges; a template to combine multiple access rules
• Authorization models – Define the objects subject to authorization rules and the granularity of actions allowed. For example, in the SQL model, the objects can be databases or tables, and the actions are SELECT, INSERT, and CREATE. For the Search model, the objects are indexes, collections, and documents; the access modes are query and update.
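To make the terms above concrete, here is a minimal sketch of how privileges, roles, and groups could combine into an authorization check. It assumes roles are granted to groups rather than to individual users; the class and function names, the `analyst` role, and the `sales_db.orders` object are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Privilege:
    obj: str      # object the rule applies to, e.g. a table or an index
    action: str   # allowed action, e.g. "SELECT" or "query"


@dataclass
class Role:
    name: str
    privileges: frozenset = field(default_factory=frozenset)


def is_authorized(user_groups, group_roles, obj, action):
    """Return True if any role granted to any of the user's groups
    contains a privilege for this (object, action) pair."""
    wanted = Privilege(obj, action)
    for group in user_groups:
        for role in group_roles.get(group, []):
            if wanted in role.privileges:
                return True
    return False


# Hypothetical setup: the "execs" group holds the "analyst" role.
analyst = Role("analyst", frozenset({Privilege("sales_db.orders", "SELECT")}))
group_roles = {"execs": [analyst]}

print(is_authorized(["execs"], group_roles, "sales_db.orders", "SELECT"))
print(is_authorized(["execs"], group_roles, "sales_db.orders", "INSERT"))
```

Authentication supplies the user and the group list; the check itself only ever looks at groups, roles, and privileges.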
HDFS ACLs overview:
The user identity mechanism is extrinsic to HDFS itself. There is no provision within HDFS for creating user identities, establishing groups, or processing user credentials.
If you’ve ever used POSIX ACLs on a Linux file system, then you already know how ACLs work in HDFS. Best practice is to rely on traditional permission bits to implement most permission requirements, and define a smaller number of ACLs to augment the permission bits with a few exceptional rules.
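The way permission bits and ACL entries interact can be sketched as a small access check. This is a simplified model of the POSIX-style evaluation order (owner, named users, groups, other, with the mask filtering named-user and group entries), not HDFS source code. The dictionary layout, the users `diana` and `eve`, and the sample ACL values are assumptions made for illustration.

```python
def to_set(bits):
    """Turn a string such as 'r-x' into the set {'r', 'x'}."""
    return set(bits) - {"-"}


def allowed(user, user_groups, acl, wanted):
    """Simplified POSIX-style ACL check for one permission ('r', 'w' or 'x')."""
    mask = to_set(acl["mask"])
    # 1. The owner is checked against the owner entry; the mask does not apply.
    if user == acl["owner"]:
        return wanted in to_set(acl["user"])
    # 2. A named user entry, filtered by the mask.
    if user in acl["named_users"]:
        return wanted in (to_set(acl["named_users"][user]) & mask)
    # 3. The owning group and any named groups, each filtered by the mask.
    group_entries = {acl["group_owner"]: acl["group"], **acl["named_groups"]}
    matched = [bits for g, bits in group_entries.items() if g in user_groups]
    if matched:
        return any(wanted in (to_set(bits) & mask) for bits in matched)
    # 4. Everyone else falls through to the "other" entry.
    return wanted in to_set(acl["other"])


# Illustrative ACL: permission bits rw-r----- plus one exceptional group rule.
acl = {
    "owner": "bruce", "group_owner": "sales",
    "user": "rw-", "group": "r--", "other": "---",
    "named_users": {}, "named_groups": {"execs": "r--"},
    "mask": "r--",
}

print(allowed("bruce", ["sales"], acl, "w"))
print(allowed("diana", ["execs"], acl, "r"))
print(allowed("eve", ["marketing"], acl, "r"))
```

The permission bits cover the common case (owner, owning group, other), and the single named-group entry is the "exceptional rule" layered on top.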
To set and get file access control lists (ACLs), use the file system shell commands, setfacl and getfacl.
# To list all ACLs for the file located at /user/hdfs/file
sudo -u hdfs hdfs dfs -getfacl /user/hdfs/file
# -R: Use this option to recursively list ACLs for all files and directories.
sudo -u hdfs hdfs dfs -getfacl -R /user/hdfs/file
hdfs dfs -setfacl [-R] [-b|-k -m|-x <acl_spec> <path>]|[--set <acl_spec> <path>]
# COMMAND OPTIONS
# <path>: Path to the file or directory for which ACLs should be set.
# -R: Use this option to recursively set ACLs for all files and directories.
# -b: Revoke all permissions except the base ACLs for user, groups and others.
# -k: Remove the default ACL.
# -m: Add new permissions to the ACL with this option. Does not affect existing permissions.
# -x: Remove only the ACL specified.
# <acl_spec>: Comma-separated list of ACL permissions.
# --set: Use this option to completely replace the existing ACL for the path specified. Previous ACL entries will no longer apply.
# To give user ben read & write permission over /user/hdfs/file
hdfs dfs -setfacl -m user:ben:rw- /user/hdfs/file
# To remove user alice's ACL entry for /user/hdfs/file
hdfs dfs -setfacl -x user:alice /user/hdfs/file
# To give user hadoop read & write access, and group or others read-only access
hdfs dfs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /user/hdfs/file
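The difference between -m, -x, and --set can be illustrated with a small in-memory model. This is only a sketch of the semantics described above: the function names are hypothetical, `default:` entries are not handled, and mask recalculation is left out.

```python
def parse_acl_spec(spec):
    """Parse 'user:ben:rw-,group::r--' into {(type, name): perms}.
    Entries without permissions (as used with -x) map to None."""
    entries = {}
    for item in spec.split(","):
        parts = item.split(":")
        kind = parts[0]
        name = parts[1] if len(parts) > 1 else ""
        perms = parts[2] if len(parts) > 2 else None
        entries[(kind, name)] = perms
    return entries


def modify(acl, spec):
    """-m: add or update the listed entries, keep everything else."""
    return {**acl, **parse_acl_spec(spec)}


def remove(acl, spec):
    """-x: remove only the listed entries."""
    drop = parse_acl_spec(spec).keys()
    return {k: v for k, v in acl.items() if k not in drop}


def replace(acl, spec):
    """--set: discard the old ACL and use the new spec as the whole ACL."""
    return parse_acl_spec(spec)


acl = parse_acl_spec("user::rw-,user:alice:r--,group::r--,other::r--")
acl = modify(acl, "user:ben:rw-")      # alice is still present
acl = remove(acl, "user:alice")        # only alice's entry is gone
acl = replace(acl, "user::rw-,user:hadoop:rw-,group::r--,other::r--")
print(sorted(acl))
```

After the final step, ben's entry is gone as well, because --set does not merge with the previous ACL.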
For the following file, we will give the “execs” group read permission:
-rw-r----- 3 bruce sales 0 2014-03-04 16:31 /sales-data
hdfs dfs -setfacl -m group:execs:r-- /sales-data
Check results by running getfacl.
hdfs dfs -getfacl /sales-data
# file: /sales-data
# owner: bruce
# group: sales
Default ACLs define the ACL that newly created child files and directories receive automatically.
Set a default ACL on the parent directory, then create two subdirectories.
hdfs dfs -setfacl -m default:group:execs:r-x /monthly-sales-data
hdfs dfs -mkdir /monthly-sales-data/JAN
hdfs dfs -mkdir /monthly-sales-data/FEB
Verify that HDFS has automatically applied the default ACL to the subdirectories.
hdfs dfs -getfacl -R /monthly-sales-data
# file: /monthly-sales-data/FEB
# owner: bruce
# group: sales
The default ACL is copied from the parent directory to the child file or child directory at time of creation. Subsequent changes to the parent directory’s default ACL do not alter the ACLs of existing children.
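The copy-at-creation behavior can be sketched with a toy directory class. This is a simplified model, not HDFS code: the `Directory` class and its methods are hypothetical, and the interaction with the creation mode and umask is ignored.

```python
class Directory:
    """Toy directory that carries an access ACL and a default ACL."""

    def __init__(self, name, access_acl=None, default_acl=None):
        self.name = name
        # dict(...) takes a copy, so later changes to the parent do not leak in.
        self.access_acl = dict(access_acl or {})
        self.default_acl = dict(default_acl or {})

    def set_default(self, entry, perms):
        self.default_acl[entry] = perms

    def mkdir(self, name):
        # A new child directory receives the parent's default ACL as both
        # its access ACL and its own default ACL, copied at creation time.
        return Directory(name, access_acl=self.default_acl,
                         default_acl=self.default_acl)


parent = Directory("/monthly-sales-data")
parent.set_default("group:execs", "r-x")
jan = parent.mkdir("JAN")

# Changing the parent's default ACL afterwards only affects new children.
parent.set_default("group:execs", "---")
feb = parent.mkdir("FEB")

print(jan.access_acl)
print(feb.access_acl)
```

JAN keeps the entry it was created with, while FEB picks up the changed default, which is why existing children must be updated explicitly (for example with -R).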
For more information on using HDFS ACLs, see the HDFS Permissions Guide on the Apache website.
LDAP and AD concepts:
What are CN, OU, DC?
String    X.500 AttributeType
------------------------------
CN        commonName
L         localityName
ST        stateOrProvinceName
O         organizationName
OU        organizationalUnitName
C         countryName
STREET    streetAddress
DC        domainComponent
UID       userid
What does the string from that query mean?
The string ("CN=Dev-India,OU=Distribution Groups,DC=gp,DC=gl,DC=google,DC=com") is a path in a hierarchical structure (the DIT, or Directory Information Tree) and should be read from right (root) to left (leaf).
It is a DN (Distinguished Name): a series of comma-separated key/value pairs that uniquely identifies an entry in the directory hierarchy. In other words, the DN is the entry’s fully qualified name.
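Reading the DN from right to left can be shown with a few lines of Python. This is a minimal sketch that splits on plain commas and does not handle escaped commas or multi-valued components; the `parse_dn` helper is a hypothetical name, not part of any LDAP library.

```python
def parse_dn(dn):
    """Split a DN into (attribute, value) pairs, leaf first.
    Simplification: escaped commas inside values are not supported."""
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr.upper(), value))
    return pairs


dn = "CN=Dev-India,OU=Distribution Groups,DC=gp,DC=gl,DC=google,DC=com"
pairs = parse_dn(dn)

# Walk the tree from the root (rightmost component) down to the leaf.
root_to_leaf = list(reversed(pairs))
for attr, value in root_to_leaf:
    print(attr, value)

# The DC components, taken in written order, spell out the DNS-style domain.
domain = ".".join(value for attr, value in pairs if attr == "DC")
print(domain)
```

Here the leaf is the CN entry, its container is the OU, and the DC components together name the domain the entry lives in.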
Here you can see an example where I added some more possible entries.
The actual path is represented using green.
The following paths represent DNs (and their value depends on what you want to get after the query is run):