5/17/2018

Solr DIH Configuration File with an encrypted password

The data import handler is configured in solrconfig.xml via a requestHandler, name="/dataimport", which references a DIH configuration document of your choosing.
solrconfig.xml example:
<requestHandler name="/dataimport" class="solr.DataImportHandler">
  <lst name="defaults">
    <str name="config">config.xml</str>
  </lst>
</requestHandler>
The config file has many options; in short, this is where you configure the database connection string and reference your JDBC driver. Full details are in the Solr documentation for the Data Import Handler. The examples that ship with the Solr distribution use a plain-text username and password, which anyone with access to the Admin UI can potentially view:
http://hostname:8983/solr/ > select the collection from the drop-down > click Dataimport > expand Configuration
Obviously, we do not want to store the username and password in plain text. The DIH configuration supports an encrypted password, with the decryption key stored in a separate file. (If you're interested in the contributors discussing the security implementation, there are more details here.)
The process for configuring this encryption is as follows:
  1. Encrypt your password
    (The -n flag is essential for the echo commands: it suppresses the trailing newline, which would otherwise become part of the password or key. Errors caused by omitting -n include "Error decoding password" and "Bad password, algorithm, mode or padding".)
    1. Write your current DB password to a file
      echo -n "mypassword" > /data/solrtmp/collection/conf/pwd.txt
      
    2. Encrypt the password:
      openssl enc -aes-128-cbc -a -salt -in /data/solrtmp/collection/conf/pwd.txt
      During encryption, openssl prompts you to enter a key.
      The output of the command is the Base64-encoded encrypted password (not a hash), which is used as the password value in the config file.
    3. Write the key used above to encrypt the password to a new file:
      (you can name the file anything you like)
      echo -n "mykey" > /data/solrtmp/collection/conf/key.txt
      
    4. Remove the plain text password file:
      rm /data/solrtmp/collection/conf/pwd.txt
  2. Configure file permissions so that only the solr account can access the key file:

    sudo chown solr:solr /data/solrtmp/collection/conf/key.txt
    Then, running as the solr user:
    chmod 600 /data/solrtmp/collection/conf/key.txt
  3. Copy the decryption key to all servers, or repeat the above steps on each server.
    (Any directory will work; just make sure the path is the same on every node and matches the encryptKeyFile value in the config.)
  4. Put the details in your config file:
    In your config.xml file (it can be named anything), enter the values for user, password, and encryptKeyFile:
    <dataConfig>
      <dataSource driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:@.../..." user="solrservice" password="U2FsdGVkX1/mzOZi9P2iBUPEbtaHo/7SO+nOQTqqHrw=" encryptKeyFile="/data/solrtmp/collection/conf/key.txt" />
      <document>
        ...
      </document>
    </dataConfig>
  5. If you run in SolrCloud mode, you will need to upload the config.xml file to ZooKeeper.
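
One way to do the upload is the zk cp helper in the bin/solr script. The following is a minimal sketch, not a definitive procedure: the install path /opt/solr, the ZooKeeper address localhost:2181, the configset name "collection", and the function name upload_dih_config are all assumptions that will differ per installation. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch: copy the DIH config file into the configset stored in ZooKeeper.
# All default values below are assumptions; override them via the environment.
SOLR_BIN="${SOLR_BIN:-/opt/solr/bin/solr}"        # assumed install location
ZK_HOST="${ZK_HOST:-localhost:2181}"              # assumed ZooKeeper address
CONFIGSET="${CONFIGSET:-collection}"              # assumed configset name
LOCAL_CONF="${LOCAL_CONF:-/data/solrtmp/collection/conf/config.xml}"
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print only, 0 = execute

upload_dih_config() {
    src="file:$LOCAL_CONF"
    dest="zk:/configs/$CONFIGSET/config.xml"
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be executed without touching ZooKeeper.
        echo "$SOLR_BIN zk cp $src $dest -z $ZK_HOST"
    else
        "$SOLR_BIN" zk cp "$src" "$dest" -z "$ZK_HOST"
    fi
}

upload_dih_config
```

Run it with DRY_RUN=0 once the printed command looks right for your cluster.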

You have now configured an encrypted database password for Solr to use with the Data Import Handler. You can test it by running a full-import.
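
Before running the import, the encryption steps can be sanity-checked locally. The sketch below is an illustration under stated assumptions, not part of the original procedure. It works in a temporary directory, uses the placeholder values mypassword and mykey, and reads the key non-interactively with -pass file: instead of the prompt. It also pins -md md5, on the assumption that the decrypting side expects the digest that older OpenSSL releases used by default (OpenSSL 1.1.0 changed the default to SHA-256). It encrypts, decrypts again, and reports whether the round trip preserved the password.

```shell
#!/bin/sh
set -eu
# Sketch: encrypt a password with a key file, then decrypt it to verify.
WORK_DIR=$(mktemp -d)
PWD_FILE="$WORK_DIR/pwd.txt"
KEY_FILE="$WORK_DIR/key.txt"

# printf '%s' writes no trailing newline, like echo -n.
printf '%s' "mypassword" > "$PWD_FILE"
printf '%s' "mykey" > "$KEY_FILE"
chmod 600 "$KEY_FILE"

# The byte count should equal the password length (10) if no newline slipped in.
PASS_BYTES=$(( $(wc -c < "$PWD_FILE") ))
KEY_MODE=$(ls -l "$KEY_FILE" | cut -c1-10)

# -md md5 is an assumption (see above); -pass file: avoids the interactive prompt.
ENCRYPTED=$(openssl enc -aes-128-cbc -a -salt -md md5 \
    -in "$PWD_FILE" -pass "file:$KEY_FILE")

# Decrypt with the same key file and options to confirm the value is usable.
DECRYPTED=$(printf '%s\n' "$ENCRYPTED" | openssl enc -d -aes-128-cbc -a \
    -md md5 -pass "file:$KEY_FILE")

rm -rf "$WORK_DIR"

echo "password bytes: $PASS_BYTES, key file mode: $KEY_MODE"
echo "encrypted value: $ENCRYPTED"
if [ "$DECRYPTED" = "mypassword" ]; then
    echo "round trip OK"
else
    echo "round trip FAILED" >&2
    exit 1
fi
```

The encrypted value starts with U2FsdGVkX1, the Base64 encoding of the "Salted__" header that openssl writes, which matches the shape of the password value in the config above. The rest of the value differs on every run because the salt is random.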

Commands that may be of interest while running an import:

The status of a data import can be viewed by requesting the following URL:
http://hostname:8983/solr/[collection]/dataimport?command=status
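
The same handler can be called from a shell. This is a rough sketch that assumes curl is installed; hostname:8983 and mycollection are placeholders, and the helper name dih_url is made up for illustration. It prints the URLs and only contacts Solr when RUN=1 is set.

```shell
#!/bin/sh
# Sketch: build DIH request URLs and optionally call them with curl.
dih_url() {
    # $1 = host:port, $2 = collection, $3 = DIH command (e.g. status)
    printf 'http://%s/solr/%s/dataimport?command=%s\n' "$1" "$2" "$3"
}

SOLR_HOST="${SOLR_HOST:-hostname:8983}"       # placeholder host
COLLECTION="${COLLECTION:-mycollection}"      # placeholder collection

STATUS_URL=$(dih_url "$SOLR_HOST" "$COLLECTION" status)
IMPORT_URL=$(dih_url "$SOLR_HOST" "$COLLECTION" full-import)
echo "$STATUS_URL"
echo "$IMPORT_URL"

# Only contact Solr when explicitly requested, e.g. RUN=1 sh dih.sh
if [ "${RUN:-0}" = "1" ]; then
    curl -s "$IMPORT_URL"     # start the import
    curl -s "$STATUS_URL"     # check progress; repeat until it finishes
fi
```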

If you know the request ID of an asynchronous Collections API call, you can check its status with:
http://hostname:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1526317641074