Monday, February 12, 2018

The Apache Sentry security service - part I

Apache Sentry is a role-based authorization solution for a number of big-data projects. I have previously blogged about how to install the authorization plugin to secure various deployments, e.g.:
For all of these tutorials, the authorization privileges were stored in a configuration file local to the deployment. However this is just a "test configuration" to get simple examples up and running quickly. For production scenarios, Apache Sentry offers a central security service, which stores the user roles and privileges in a database, and provides an RPC service that the Sentry authorization plugins can invoke on. In this article, we will show how to set up the Apache Sentry security service in a couple of different ways.

1) Installing the Apache Sentry security service manually

Download the binary distribution of Apache Sentry (2.0.0 was used for the purposes of this tutorial). Verify that the signature is valid and that the message digests match, and extract it to ${sentry.home}. In addition, download a compatible version of Apache Hadoop (2.7.5 was used for the purposes of this tutorial). Set the "HADOOP_HOME" environment variable to point to the Hadoop distribution.

First we need to specify two configuration files, "sentry-site.xml" which contains the Sentry configuration, and "sentry.ini" which defines the user/group information for the user who will be invoking on the Sentry security service. You can download sample configuration files here. Copy these files to the root directory of "${sentry.home}". Edit the 'sentry.ini' file and replace 'user' with the user who will be invoking on the security service (such as "kafka" or "solr"). The other entries will be ignored - 'sentry-site.xml' defines that a user must belong to the "admin" group to invoke on the security service successfully.

Finally configure the database and start the Apache Sentry security service via:
  • bin/sentry --command schema-tool --conffile sentry-site.xml --dbType derby --initSchema
  • bin/sentry --command service -c sentry-site.xml
2) Installing the Apache Sentry security service via docker

Instead of having to download and configure Apache Sentry and Hadoop, a simpler way to get started is to download a pre-made docker image that I created. The DockerFile is available here and the docker image is available here. Note that this docker image is only for testing use, as the security service is not secured with kerberos and it uses the default credentials. Download and run the docker image with:
  • docker pull coheigea/sentry
  • docker run -p 8038:8038 coheigea/sentry
Once the container has started, we need to update the 'sentry.ini' file with the username that we are going to use to invoke on the Apache Sentry security service. Get the "id" of the running container via "docker ps" and then run "docker exec -it <id> bash". Edit 'sentry.ini' and change 'user' to the username you are using.

In the next tutorial we will look at how to manually invoke on the security service.

No comments:

Post a Comment