Thursday, February 8, 2018

Securing Apache Sqoop - part III

This is the third and final post in a series on securing Apache Sqoop. The first post looked at how to set up Apache Sqoop to perform a simple use case: transferring a file from HDFS to Apache Kafka. The second post showed how to secure Apache Sqoop with Apache Ranger. This post looks at an alternative way of implementing authorization in Apache Sqoop, namely Apache Sentry.

1) Install the Apache Sentry Sqoop plugin

If you have not done so already, please follow the steps in the earlier tutorial to set up Apache Sqoop. Download the binary distribution of Apache Sentry (2.0.0 was used for the purposes of this tutorial). Verify that the signature is valid and that the message digests match, and extract it to ${sentry.home}.
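Verifying a download typically means checking the detached PGP signature and comparing a message digest. The following is a minimal sketch, assuming GnuPG and "sha512sum" are installed and that the Sentry release KEYS have already been imported. The archive name is a guess at the 2.0.0 binary artifact, and "sha512_matches" is a helper invented for this example:

```shell
#!/bin/sh
# Compare a file's SHA-512 digest with an expected hex string.
# Returns 0 on a match and non-zero otherwise (hypothetical helper).
sha512_matches() {
  file="$1"
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Hypothetical archive name; adjust to the file actually downloaded.
ARCHIVE="apache-sentry-2.0.0-bin.tar.gz"

# Check the detached signature only when both files are present.
if [ -f "$ARCHIVE" ] && [ -f "$ARCHIVE.asc" ]; then
  gpg --verify "$ARCHIVE.asc" "$ARCHIVE"
fi
```

The digest check would then be run as 'sha512_matches "$ARCHIVE" <digest>', using the digest published alongside the release.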

a) Configure sqoop.properties

We need to configure Apache Sqoop to use Apache Sentry for authorization. Edit 'conf/sqoop.properties' and add the following properties:
  • org.apache.sqoop.security.authentication.type=SIMPLE
  • org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.SimpleAuthenticationHandler
  • org.apache.sqoop.security.authorization.handler=org.apache.sentry.sqoop.authz.SentryAuthorizationHandler
  • org.apache.sqoop.security.authorization.access_controller=org.apache.sentry.sqoop.authz.SentryAccessController
  • org.apache.sqoop.security.authorization.validator=org.apache.sentry.sqoop.authz.SentryAuthorizationValidator
  • org.apache.sqoop.security.authorization.server_name=SqoopServer1
  • sentry.sqoop.site.url=file:./conf/sentry-site.xml
In addition, we need to add some of the Sentry jars to the Sqoop classpath. Add the following property to 'conf/sqoop.properties', replacing "${sentry.home}" with the directory where Sentry was extracted:
  • org.apache.sqoop.classpath.extra=${sentry.home}/lib/sentry-binding-sqoop-2.0.0.jar:${sentry.home}/lib/sentry-core-common-2.0.0.jar:${sentry.home}/lib/sentry-core-model-sqoop-2.0.0.jar:${sentry.home}/lib/sentry-provider-file-2.0.0.jar:${sentry.home}/lib/sentry-provider-common-2.0.0.jar:${sentry.home}/lib/sentry-provider-db-2.0.0.jar:${sentry.home}/lib/shiro-core-1.4.0.jar:${sentry.home}/lib/sentry-policy-engine-2.0.0.jar:${sentry.home}/lib/sentry-policy-common-2.0.0.jar
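
Because this value is long and easy to mistype, it can help to generate it. The following is a minimal sketch, assuming the Sentry distribution was extracted to the directory held in a "SENTRY_HOME" environment variable (the default path below is hypothetical) and that the jar names match the 2.0.0 list in the property above:

```shell
#!/bin/sh
# Assumption: SENTRY_HOME is where the Sentry 2.0.0 binary distribution was
# extracted; the default below is a hypothetical location.
SENTRY_HOME="${SENTRY_HOME:-/opt/apache-sentry-2.0.0-bin}"
SENTRY_VERSION="2.0.0"

# Build the colon-separated list of Sentry jars named in the property above.
classpath=""
for module in binding-sqoop core-common core-model-sqoop provider-file \
              provider-common provider-db policy-engine policy-common; do
  jar="${SENTRY_HOME}/lib/sentry-${module}-${SENTRY_VERSION}.jar"
  classpath="${classpath:+${classpath}:}${jar}"
done

# Shiro is versioned independently of Sentry.
classpath="${classpath}:${SENTRY_HOME}/lib/shiro-core-1.4.0.jar"

# Print the line to paste into conf/sqoop.properties.
echo "org.apache.sqoop.classpath.extra=${classpath}"
```

The jars are emitted in a slightly different order than in the property above, which should not matter for classpath resolution.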

b) Add Apache Sentry configuration files

Next we will configure the Apache Sentry authorization plugin. Create a new file in the Sqoop "conf" directory called "sentry-site.xml" with the following content (substituting the correct directory for "sentry.sqoop.provider.resource"):
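
A minimal sketch of such a file, written here with a shell heredoc so that the path is filled in automatically. It assumes the property names "sentry.sqoop.provider", "sentry.sqoop.provider.backend" and "sentry.sqoop.provider.resource" as read by the Sentry Sqoop binding, and that "SQOOP_HOME" (a variable introduced for this example) is the absolute path of the Sqoop installation:

```shell
#!/bin/sh
# Assumption: SQOOP_HOME is the absolute path of the Sqoop installation;
# it defaults to the current directory.
SQOOP_HOME="${SQOOP_HOME:-$(pwd)}"
mkdir -p "${SQOOP_HOME}/conf"

# Write conf/sentry-site.xml. The property and class names are assumptions
# based on the Sentry 2.0.0 Sqoop binding; verify them against the Sentry
# version in use.
cat > "${SQOOP_HOME}/conf/sentry-site.xml" <<EOF
<configuration>
  <!-- Resolve the groups of a user from the policy file itself -->
  <property>
    <name>sentry.sqoop.provider</name>
    <value>org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider</value>
  </property>
  <!-- Read authorization privileges from a local file -->
  <property>
    <name>sentry.sqoop.provider.backend</name>
    <value>org.apache.sentry.provider.file.SimpleFileProviderBackend</value>
  </property>
  <!-- Location of the policy file -->
  <property>
    <name>sentry.sqoop.provider.resource</name>
    <value>file://${SQOOP_HOME}/conf/sentry.ini</value>
  </property>
</configuration>
EOF
```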

This configuration essentially says that the authorization privileges are stored in a local file, and that the groups for authenticated users should be retrieved from this same file.

Finally, we need to specify the authorization privileges. Create a new file in the Sqoop "conf" directory called "sentry.ini" with the following content, substituting the name of the user running the Sqoop shell for "colm":
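
A sketch of what the policy file might contain, again written with a heredoc. The section layout ([users], [groups], [roles]) and the privilege string follow Sentry's file-based policy format as best understood here and should be treated as assumptions. The group and role names ("admin", "admin_role") and the "SQOOP_USER" variable are made up for illustration, while "SqoopServer1" matches the server name set in sqoop.properties:

```shell
#!/bin/sh
# Assumption: SQOOP_HOME is the absolute path of the Sqoop installation.
SQOOP_HOME="${SQOOP_HOME:-$(pwd)}"
# User running the Sqoop shell; "colm" is the example user from the text.
SQOOP_USER="${SQOOP_USER:-colm}"
mkdir -p "${SQOOP_HOME}/conf"

# [users]  maps a user to one or more groups
# [groups] maps a group to one or more roles
# [roles]  grants a role privileges; here every action on the server
#          named in sqoop.properties (privilege syntax is an assumption)
cat > "${SQOOP_HOME}/conf/sentry.ini" <<EOF
[users]
${SQOOP_USER}=admin

[groups]
admin=admin_role

[roles]
admin_role=server=SqoopServer1->action=ALL
EOF
```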

2) Test authorization 

Now start Apache Sqoop ("bin/sqoop2-server start") and start the shell ("bin/sqoop2-shell"). The "show connector" command should list the full range of Sqoop connectors, as authorization has succeeded. To check that access is denied to users without sufficient privileges, change the "ALL" permission in 'conf/sentry.ini' to "WRITE", and restart the server and the shell. This time access is not granted, and "show connector" should return a blank list.
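
The permission change can also be scripted. A minimal sketch, assuming the privilege appears in sentry.ini as "action=ALL" and that the commands are run from the Sqoop installation directory. "downgrade_privilege" is a helper name invented for this example:

```shell
#!/bin/sh
# Rewrite "action=ALL" to "action=WRITE" in the given policy file,
# keeping the original as <file>.bak (hypothetical helper).
downgrade_privilege() {
  ini="$1"
  cp "$ini" "$ini.bak" &&
    sed 's/action=ALL/action=WRITE/' "$ini.bak" > "$ini"
}

# Apply the change and restart the server only when run inside a Sqoop
# installation.
if [ -f conf/sentry.ini ] && [ -x bin/sqoop2-server ]; then
  downgrade_privilege conf/sentry.ini
  bin/sqoop2-server stop
  bin/sqoop2-server start
fi
```

Afterwards, start "bin/sqoop2-shell" again and run "show connector". Copying 'conf/sentry.ini.bak' back over 'conf/sentry.ini' restores the original policy.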
