Setting up ElasticSearch with Basic Auth and SSL for use with Python

I'm interested in learning to use ElasticSearch, so I thought I'd document how I set it up on my EC2 instance. Because I wanted to write code on my laptop, I needed to expose ElasticSearch over the public internet, which added a bit of extra complexity. In rough outline: install ElasticSearch, run it under supervisord, put Nginx in front of it with a self-signed SSL certificate and basic auth, then connect from Python.

I'm running Ubuntu 14.04 on the server.

Installing packages

Let's begin by installing Java, supervisord and nginx:

$ sudo apt-get install openjdk-7-jre-headless supervisor nginx

Now we can grab ElasticSearch. At the time of writing, the current version is 1.4.2. The commands below will create an elasticsearch user for the ElasticSearch process (kind of like how you might use www-data to run your web server), then install the ElasticSearch app into the new user's home directory.

$ export ESVER="1.4.2"
$ export ESHOME="/home/elasticsearch"
$ cd /tmp
$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-$ESVER.tar.gz
$ tar xzf elasticsearch-$ESVER.tar.gz
$ sudo useradd elasticsearch
$ sudo mkdir -p $ESHOME/bin/
$ sudo cp -R /tmp/elasticsearch-$ESVER/ $ESHOME/bin/
$ sudo ln -s $ESHOME/bin/elasticsearch-$ESVER $ESHOME/bin/elasticsearch
$ sudo mkdir $ESHOME/bin/elasticsearch/logs/
$ sudo chown -R elasticsearch:elasticsearch /home/elasticsearch/

Kicking the tires

Now's a good time to see if our install worked correctly. Once you've verified the server starts up, kill it with Ctrl+C.

$ sudo su elasticsearch
$ cd $HOME/bin/elasticsearch/bin
$ ./elasticsearch

Supervisor

To run this process automatically on boot, and to manage processes in general, I like to use supervisord. Daemon configs are placed in /etc/supervisor/conf.d. We will add a new file elastic-search.conf (or .ini depending on your config) to this directory containing information for running ElasticSearch. I specified some Java options instructing ElasticSearch to use very little RAM (a 64 MB maximum heap); depending on the size of your server you may want to increase these values.

[program:elasticsearch]
command=/home/elasticsearch/bin/elasticsearch/bin/elasticsearch
directory=/home/elasticsearch/bin/elasticsearch/
user=elasticsearch
environment=JAVA_OPTS="-Xmx64m -Xms32m",ES_MIN_MEM=32m,ES_MAX_MEM=64m
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/home/elasticsearch/bin/elasticsearch/logs/supervisor.out.log
stderr_logfile=/home/elasticsearch/bin/elasticsearch/logs/supervisor.out.err

The following commands will start ElasticSearch with supervisord:

$ sudo supervisorctl reread
$ sudo supervisorctl update
$ sudo supervisorctl status
elasticsearch           RUNNING    pid 31632, uptime 0:00:20

We can use curl to check on things:

$ curl localhost:9200

Output:

{
  "status" : 200,
  "name" : "Blah blah",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}
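
The same check can be scripted. Below is a minimal sketch, assuming Python 3 and the default port 9200; the function names summarize_info and fetch_info are my own and not part of any library.

```python
import json
from urllib.request import urlopen


def summarize_info(body):
    """Pull the cluster name and version number out of the JSON returned by GET /."""
    info = json.loads(body)
    return info["cluster_name"], info["version"]["number"]


def fetch_info(url="http://localhost:9200", timeout=5):
    """Fetch the root endpoint and summarize it (assumes the default local port)."""
    with urlopen(url, timeout=timeout) as resp:
        return summarize_info(resp.read().decode("utf-8"))


# On the server, with ElasticSearch running:
#   cluster, version = fetch_info()
```

Keeping the parsing separate from the HTTP call makes it easy to reuse summarize_info on a response saved from curl.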

Self-signed SSL certificate

Basic auth sends the username and password with every request in an easily decoded form, so the connection from my laptop to ElasticSearch needs to be encrypted. A self-signed SSL certificate is enough for that.

$ sudo su
# mkdir /etc/nginx/certs
# cd /etc/nginx/certs
# openssl genrsa 2048 > host.key
# openssl req -new -x509 -nodes -sha256 -days 3650 -key host.key > host.cert
# openssl x509 -noout -fingerprint -text < host.cert > host.info
# cat host.cert host.key > host.pem
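
Since nothing vouches for a self-signed certificate, it can be worth comparing fingerprints by hand. Here is a rough sketch, assuming Python 3; cert_fingerprint is a hypothetical helper name. It hashes the DER form of a PEM certificate, which is what openssl's -fingerprint option does (as far as I know openssl prints a SHA-1 fingerprint by default, so pass "sha1" to compare against host.info).

```python
import hashlib
import ssl


def cert_fingerprint(pem_text, algorithm="sha256"):
    """Return a colon-separated, upper-case hex fingerprint of a PEM certificate."""
    # Convert the PEM text to raw DER bytes; the fingerprint is a hash of those bytes.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.new(algorithm, der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# On the server:
#   print(cert_fingerprint(open("/etc/nginx/certs/host.cert").read(), "sha1"))
# From the laptop, once Nginx is serving the certificate (hostname is a placeholder):
#   pem = ssl.get_server_certificate(("your.hostname.com", 9999))
#   print(cert_fingerprint(pem, "sha1"))
```

If the two values match, the laptop is talking to the server that holds host.key.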

Create basic auth credentials

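Basic auth credentials will be stored in /etc/nginx/es-password. We will use htpasswd to generate the credentials file; on Ubuntu it is provided by the apache2-utils package, so install that first if the command is not found. The following line will create a user elasticsearch and prompt for a password: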

# htpasswd -c /etc/nginx/es-password elasticsearch
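
To see why these credentials should only ever travel over SSL, here is a small sketch (assuming Python 3) of what a client sends for basic auth. The helper names are mine; the header format itself is standard HTTP basic auth: the word Basic followed by the base64 encoding of user:password.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header a client sends for HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


def decode_basic_auth(header_value):
    """Reverse the encoding, as anyone who can read the request could."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a basic auth header")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


header = basic_auth_header("elasticsearch", "yourpassword")
print(decode_basic_auth(header["Authorization"]))
```

Base64 is an encoding, not encryption, so without SSL the password is effectively sent in the clear.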

Nginx

The final step on the server side will be to set up Nginx to proxy public internet connections to the ElasticSearch server running on localhost. Create a file named elastic-search in /etc/nginx/sites-enabled/ and add the following contents:

upstream elasticsearch {
  server 127.0.0.1:9200;
  keepalive 15;
}

server {
  listen 9999 ssl;

  ssl_certificate /etc/nginx/certs/host.cert;
  ssl_certificate_key /etc/nginx/certs/host.key;
  ssl_session_timeout 5m;
  ssl_protocols TLSv1.2 TLSv1.1 TLSv1;
  ssl_ciphers HIGH:!aNULL:!eNULL:!LOW:!MD5;
  ssl_prefer_server_ciphers on;

  auth_basic "ElasticSearch";
  auth_basic_user_file /etc/nginx/es-password;

  location / {
    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
  }
}

Now restart Nginx to pick up the changes:

$ sudo /etc/init.d/nginx restart

EC2 Security Groups

If you are using EC2, you may need to add a rule to your instance's security group to allow inbound connections on port 9999 (or whatever port you specified in the Nginx config).
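
A quick way to tell whether the security group is letting traffic through is to try a plain TCP connection from the laptop. This is a minimal sketch, assuming Python 3; port_is_open is a hypothetical helper and the hostname is a placeholder.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


# From the laptop:
#   port_is_open("your.hostname.com", 9999)
```

A timeout usually points at the security group (packets are silently dropped), while an immediate refusal suggests the port is reachable but Nginx is not listening on it.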

Connecting from the Laptop

With ElasticSearch up and running on the server, let's install the Python client and test things out.

$ pip install elasticsearch requests

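To connect from Python we will use the requests-specific transport, which provides SSL and basic auth support. Because the certificate is self-signed, verify_certs is set to False; the traffic is still encrypted, but the client does not check the server's identity. Open a Python shell and try the following commands: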

>>> from elasticsearch import Elasticsearch, RequestsHttpConnection
>>> es = Elasticsearch(
...     ['your.hostname.com:9999'],
...     connection_class=RequestsHttpConnection,
...     http_auth=('elasticsearch', 'yourpassword'),
...     use_ssl=True,
...     verify_certs=False)
>>> es.info()
{u'cluster_name': u'elasticsearch',
 u'name': u'Blah blah',
 u'status': 200,
 u'tagline': u'You Know, for Search',
 u'version': {u'build_hash': u'927caff6f05403e936c20bf4529f144f0c89fd8c',
  u'build_snapshot': False,
  u'build_timestamp': u'2014-12-16T14:11:12Z',
  u'lucene_version': u'4.10.2',
  u'number': u'1.4.2'}}
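
Once this moves from a shell session into a script, it is probably better not to hard-code the password. One possible sketch, assuming Python 3: the environment variable names ES_HOST, ES_USER and ES_PASSWORD and the helper es_connection_kwargs are my own invention, not something the client library defines.

```python
import os


def es_connection_kwargs(env=None):
    """Build the keyword arguments used above from environment variables."""
    env = os.environ if env is None else env
    return {
        "hosts": [env.get("ES_HOST", "your.hostname.com:9999")],
        # ES_PASSWORD has no default on purpose: a missing password raises KeyError.
        "http_auth": (env.get("ES_USER", "elasticsearch"), env["ES_PASSWORD"]),
        "use_ssl": True,
        "verify_certs": False,  # self-signed certificate, as in the session above
    }


# Usage, with the client installed:
#   from elasticsearch import Elasticsearch, RequestsHttpConnection
#   es = Elasticsearch(connection_class=RequestsHttpConnection,
#                      **es_connection_kwargs())
```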

That's it! Now it's time to start actually writing some code...

Comments (7)

Gary C. | jan 11 2015, at 02:29pm

I wish I could take credit for that, Charles, but it isn't my work. I'm still trying to wrap my mind around how to integrate Elasticsearch (ES) into an existing web app project that currently makes exclusive use of PostgreSQL. I'm considering removing all of the existing Pg full-text search functionality and replacing it with ES.

Charles | jan 10 2015, at 10:10pm

Thanks, Gary! I ran across your project the other day while googling around, it looks very interesting.

Gary C. | jan 10 2015, at 11:27am

One minor typo, Charles: You import Elasticsearch (which is correct), but you instantiate with ElasticSearch (capital "S").

Thank you, again, for the excellent and informative post.

Gary C. | jan 10 2015, at 10:45am

Charles,

Thank you for taking the time to document your progress. I appreciate the relevance of many of your articles. I thought you may also be interested in https://github.com/oliver006/elasticsearch-gmail. It helped fill-in some of the gaps in my understanding.

Nicolas | jan 08 2015, at 08:11am

Thanks for documenting this!

Charles | jan 07 2015, at 10:31am

Tomato, tomahto?

Someone | jan 07 2015, at 10:03am

Elasticsearch can be (and should be) installed via the repository: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html

There is also no reason to install supervisor. Just use /etc/init.d/elasticsearch.

Any reason not to do it this way?

