
Overview:

This is the first of three articles discussing the process to stand up an OSM Nominatim instance. If you found this page, you probably are looking for:

  • A geocoding alternative to the costly enterprise API licenses
  • Ability to reverse geocode (lat/long to physical street address) … fast and reliably
  • Use acquired geocoding data offline or within back-end analytic/reporting systems with minimal license restrictions (see http://www.openstreetmap.org/copyright)

To the Rescue: OpenStreetMap (OSM) Nominatim

The OpenStreetMap (OSM) Nominatim project is a full reverse geocoding solution that searches OSM data by name and address and generates synthetic addresses. It can be found at

An OSM Nominatim instance can be spun up to target your use case, and there are data subsets that are limited to states, regions, countries and continents; geofabrik.de is one of many GIS sites providing daily OSM planet file extracts. There are several Nominatim incubator projects in the wild that address all areas of GIS.

A Word of Warning …

This path is not for the faint of heart, though rewarding it is. The OSM blend that will be discussed in this and the following articles will consist of a stock OSM Nominatim system, loaded with the OSM North America planet file extract and supplemented with the US Census TIGER address data. Even with this reduced location set, the data load and provisioning process will take days to complete within a local CentOS environment with continuous use of 16GB RAM, four-core processing power, and about 1TB of drive space.

Upon completion of this three-part article, the PostgreSQL data set files will be about 240GB and can be copied up to your target operating virtual/cloud server of choice, such as Linode. This process can be repeated offline to refresh your data. With that said, let's get to the prerequisites.

Prerequisites:

  • Familiarity with CentOS 7 and basic administration, as this is a Linux LAMP open-source solution, but with PostgreSQL (aka Postgres) as the RDBMS
  • Time and patience:
    • The base CentOS 7 server configuration will take about 25 minutes, plus the usual time for yum updates
    • The OSM Nominatim, scripts and Postgres configuration will take 45 – 60 minutes, depending on your comfort level
    • The following article shall discuss the data loading process, which will take several days of uninterrupted operation to complete.
    • A UPS is highly recommended, as any hiccup in the process will require a restart.
  • A good amount of RAM; the VMware system here at the home bunker lab had 16GB allocated to the instance.
  • Lots of hard-drive space, as you will be copying and expanding a big data set.
    • I had 1TB of space available, and used about 300GB of that during this process.
  • As many cores as you can muster. I had a 4-core 2.5 GHz processor that was solely dedicated to the VM instance during the installation process.
  • I recommend reading through many of the other Nominatim setup blogs and articles, as there are several variations to this brew.
  • See disclaimer
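
Before starting, it can help to compare the target machine against the figures quoted above. The following is only a sketch: the function names check_minimum and preflight are made up for illustration, the thresholds (4 cores, 16GB RAM, 300GB free) are simply the numbers from my lab rather than hard requirements, and the disk check assumes GNU df as shipped with CentOS 7.

```shell
#!/usr/bin/env bash
# check_minimum LABEL ACTUAL REQUIRED: report whether ACTUAL (an integer) meets REQUIRED.
check_minimum() {
    local label="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "OK   $label: $actual (wanted >= $required)"
    else
        echo "LOW  $label: $actual (wanted >= $required)"
        return 1
    fi
}

# preflight [DIR]: compare this machine against the figures quoted in the article.
# DIR is the directory whose filesystem will hold the data (defaults to /).
preflight() {
    local dir="${1:-/}" cores mem_gb disk_gb
    cores=$(nproc)
    # MemTotal is reported in kB; round to the nearest GB.
    mem_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 + 0.5 }' /proc/meminfo)
    # Available space on the filesystem holding DIR, in GB (GNU df).
    disk_gb=$(df -BG --output=avail "$dir" | tail -n 1 | tr -dc '0-9')
    check_minimum "CPU cores" "$cores" 4
    check_minimum "RAM (GB)" "$mem_gb" 16
    check_minimum "Free disk (GB)" "$disk_gb" 300
}
```

Running preflight /var/lib/pgsql after sourcing the snippet prints one OK or LOW line per resource; a LOW line is a warning to plan for a longer import, not necessarily a blocker.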

1. Base Server Configuration

echo "Step 1: Assuming you have a fresh load of CentOS 7, let's start with an update" 
yum update -y

echo "Step 2: Standard CentOS repositories don't contain all the required packages"
echo "you need to enable the EPEL repository as well"
yum install -y epel-release
yum install -y wget unzip nano
yum install -y gdal-python
yum install -y java

echo "Step 3: Now some additional packages required by Nominatim"

yum install -y postgresql-server postgresql-contrib postgresql-devel postgis postgis-utils \
    git cmake make gcc gcc-c++ libtool policycoreutils-python \
    php-pgsql php php-pear php-pear-DB libpqxx-devel proj-epsg \
    bzip2-devel proj-devel geos-devel libxml2-devel boost-devel expat-devel zlib-devel

echo "Optional Step: If you want to run the test suite, you need to install the following"
yum install -y python-pip python-Levenshtein python-psycopg2 php-phpunit-PHPUnit
pip install --user --upgrade pip setuptools lettuce==0.2.18 six==1.9 haversine Shapely pytidylib
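
After the package installs, a quick sanity check is to confirm that the main tools actually landed on the PATH. This is a minimal sketch; missing_commands is a hypothetical helper name, and the tool names in the usage comment are my assumption about which binaries the packages above provide.

```shell
#!/usr/bin/env bash
# missing_commands NAME...: print each NAME that is not found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example on the server (prints nothing if every tool is available):
#   missing_commands git cmake make gcc g++ php psql wget unzip java
```

An empty result means every listed tool was found; any name printed points at a package that did not install cleanly.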

2. Memory Overcommit

There shall be a lot of processing during the loading, and large files shall be read. Here are the memory overcommit settings that I used with success. Edit /etc/sysctl.conf as follows:

nano /etc/sysctl.conf

vm.overcommit_memory = 1
vm.swappiness = 10
vm.vfs_cache_pressure = 50
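
Edits to /etc/sysctl.conf only take effect once they are loaded, for example with sysctl -p or a reboot. The sketch below reads a key back out of the file so it can be compared with the live kernel value; note that the kernel exposes the overcommit setting as vm.overcommit_memory. The helper name sysctl_conf_value is hypothetical.

```shell
#!/usr/bin/env bash
# sysctl_conf_value FILE KEY: print the value assigned to KEY in a sysctl.conf-style
# file, ignoring comment lines and whitespace around "=". The last assignment wins.
sysctl_conf_value() {
    local file="$1" key="$2"
    awk -F= -v key="$key" '
        /^[ \t]*[#;]/ { next }                 # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)       # normalise "key = value" spacing
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                val = v
            }
        }
        END { if (val != "") print val }
    ' "$file"
}

# Example on the server (sysctl -n prints the live kernel value):
#   sudo sysctl -p
#   for key in vm.overcommit_memory vm.swappiness vm.vfs_cache_pressure; do
#       echo "$key: file=$(sysctl_conf_value /etc/sysctl.conf "$key") live=$(sysctl -n "$key")"
#   done
```

If the file and live values differ for any key, the setting was either misspelled in the file or not yet loaded.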

3. OSM Osmosis Tool Install

Osmosis is a small Java-based command-line tool that assists in the import and reconciliation of OSM data sources and core data.

su root
cd /opt
mkdir osmosis
cd osmosis
wget http://bretth.dev.openstreetmap.org/osmosis-build/osmosis-latest.tgz

tar xvfz osmosis-latest.tgz

rm osmosis-latest.tgz

chmod a+x bin/osmosis

ls -las bin/osmosis
ln -s /opt/osmosis/bin/osmosis /bin/osmosis
ln -s /opt/osmosis/bin/osmosis-extract-apidb-0.6 /bin/osmosis-extract-apidb-0.6
ln -s /opt/osmosis/bin/osmosis-extract-mysql-0.6 /bin/osmosis-extract-mysql-0.6
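
A dangling symlink is easy to miss here, since ln -s succeeds even when its target does not exist. As a rough sketch (check_executables is a made-up helper name), the following reports whether each path resolves to something executable:

```shell
#!/usr/bin/env bash
# check_executables PATH...: report whether each PATH resolves to an executable file.
# A symlink whose target is missing is reported as broken.
check_executables() {
    local path status=0
    for path in "$@"; do
        if [ -x "$path" ]; then
            echo "ok      $path"
        else
            echo "broken  $path"
            status=1
        fi
    done
    return "$status"
}

# Example on the server:
#   check_executables /bin/osmosis /bin/osmosis-extract-apidb-0.6 /bin/osmosis-extract-mysql-0.6
```

Any line marked broken means the corresponding file was not present in the extracted archive, so the link can simply be removed.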

4. Setup nominatim User

Nominatim will run as a global service on your machine. It is therefore best to install it under its own separate user account. In the following we assume this user is called nominatim and the installation will be in /srv/nominatim.

useradd -d /srv/nominatim -s /bin/bash -m nominatim

usermod -aG wheel nominatim
passwd nominatim

echo "Make sure that system servers can read from the home directory"
chmod a+x /srv/nominatim

echo "Let's also create an osmadmin user to continue installation and manage Postgres"
useradd osmadmin -s /bin/bash -m
passwd osmadmin
usermod -aG wheel osmadmin
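
Since later steps rely on sudo from these accounts, it is worth confirming that both users really ended up in the wheel group. A minimal sketch, with user_in_group as a hypothetical helper built on id -nG:

```shell
#!/usr/bin/env bash
# user_in_group USER GROUP: succeed if USER exists and is a member of GROUP.
user_in_group() {
    local user="$1" group="$2" g
    # id -nG prints the user's group names separated by spaces.
    for g in $(id -nG "$user" 2>/dev/null); do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# Example on the server:
#   for u in nominatim osmadmin; do
#       user_in_group "$u" wheel || echo "$u is not in wheel"
#   done
```

Group changes made with usermod only apply to new login sessions, so log out and back in before testing sudo.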

5. Base Server Config: PostgreSQL

We need the default PostgreSQL 9.2 installed so that the default include and lib directories and components are available to the Nominatim CMake scripts.

NOTE: I attempted to use the latest PostgreSQL 9.6, but encountered various issues mapping the geocoding libraries, so I fell back to the stock 9.2 version. In later updates to this article, I will revisit installing a later version of PostgreSQL to take advantage of the new features and GIS functions.

  1. Initial postgres configuration:
systemctl stop postgresql
mkdir /home/postgres

usermod -d /home/postgres postgres

cp -r /etc/skel/. /home/postgres
chown postgres:postgres /home/postgres
 
usermod -s /bin/bash postgres
 
sudo postgresql-setup initdb
sudo systemctl enable postgresql

2. Tuning postgresql.conf for OSM Nominatim:

nano /var/lib/pgsql/data/postgresql.conf  

listen_addresses = '*'
port = 5432
shared_buffers = 2GB
maintenance_work_mem = 2GB
work_mem = 50MB
effective_cache_size = 2GB
synchronous_commit = off
checkpoint_segments = 100 # only for postgresql <= 9.4
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
# The values for shared_buffers, maintenance_work_mem, work_mem and effective_cache_size
# seem to work fine for a 32GB RAM machine. Adjust to your setup.
# For the initial import, you should also set:
fsync = off
full_page_writes = off

3. As the Nominatim instructions note, don’t forget to re-enable fsync and full_page_writes after the initial import, or you risk database corruption. Autovacuum must not be switched off, because it ensures that the tables are frequently analyzed.

4. Now start the postgresql service after updating this config file.
systemctl restart postgresql

5. Log out as root, and log in as osmadmin.

6. Add two postgres users: one for the user that does the import and another for the webserver, which should access the database only for reading:

sudo -u postgres createuser -s nominatim

sudo -u postgres createuser apache 

7. Log out and back in as root.

8. Change the password of the Linux postgres user so you can use it to log in to postgres.
passwd postgres
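
The import-only settings from step 3 are easy to forget once the data load has finished. The sketch below flips fsync and full_page_writes back to on in a postgresql.conf-style file; restore_durability is a hypothetical helper, and the path in the usage comment is the one used in this article.

```shell
#!/usr/bin/env bash
# restore_durability FILE: set fsync and full_page_writes back to "on".
# Only uncommented assignments are rewritten; any trailing comment on those
# two lines is dropped. The file is rewritten in place so that its ownership
# and permissions are kept.
restore_durability() {
    local conf="$1"
    sed -E \
        -e 's/^[[:space:]]*fsync[[:space:]]*=.*/fsync = on/' \
        -e 's/^[[:space:]]*full_page_writes[[:space:]]*=.*/full_page_writes = on/' \
        "$conf" > "$conf.tmp" \
        && cat "$conf.tmp" > "$conf" \
        && rm -f "$conf.tmp"
}

# Example on the server, as root, once the initial import has completed:
#   restore_durability /var/lib/pgsql/data/postgresql.conf
#   systemctl restart postgresql
```

Taking a copy of postgresql.conf before running this makes it easy to diff the result and confirm that only the two intended lines changed.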

6. Enable Ports 80 and 443

Allow the default HTTP and HTTPS ports, 80 and 443, through firewalld:

sudo firewall-cmd --permanent --add-port=80/tcp

sudo firewall-cmd --permanent --add-port=443/tcp

echo "Reload the firewall config"
sudo firewall-cmd --reload

echo "List enabled rules"
sudo firewall-cmd --list-all

The above should display something similar to:

public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources:
  services: dhcpv6-client ssh
  ports: 443/tcp 80/tcp
  protocols:
  masquerade: no
  forward-ports:
  sourceports:
  icmp-blocks:
  rich rules ...
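
Rather than reading the listing by eye, the open ports can be checked in a script. This is a sketch only: port_listed is a made-up helper that looks for an entry in the space-separated output of firewall-cmd --list-ports.

```shell
#!/usr/bin/env bash
# port_listed "PORT_LIST" PORT/PROTO: succeed if PORT/PROTO appears in the
# space-separated PORT_LIST (the format printed by firewall-cmd --list-ports).
port_listed() {
    local ports="$1" wanted="$2" p
    for p in $ports; do
        [ "$p" = "$wanted" ] && return 0
    done
    return 1
}

# Example on the server:
#   ports=$(sudo firewall-cmd --list-ports)
#   for want in 80/tcp 443/tcp; do
#       port_listed "$ports" "$want" || echo "missing: $want"
#   done
```

For a single port, sudo firewall-cmd --query-port=80/tcp answers the same question directly with yes or no.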

7. Installing and Building OSM Nominatim

If you are still with us, it's now time to build the OSM Nominatim core and configure the associated PHP scripts.

  1. Get the source code from GitHub and change into the source directory
    cd /srv/nominatim/
    sudo git clone --recursive git://github.com/twain47/Nominatim.git
    cd Nominatim
    ls -las
  2. The code must be built in a separate directory. Create this directory, then configure and build Nominatim in there:
sudo mkdir /srv/nominatim/build
cd /srv/nominatim/build
sudo cmake /srv/nominatim/Nominatim
sudo make

3. While still in the /srv/nominatim/build directory, you need to create a minimal configuration file that tells nominatim the name of your webserver user and the URL of the website.

sudo tee /srv/nominatim/build/settings/local.php << EOF
<?php
 @define('CONST_Database_Web_User', 'apache');
 @define('CONST_Website_BaseURL', '/nominatim/');
EOF

IMPORTANT: Make sure there are no spaces or blank lines at the beginning of /srv/nominatim/build/settings/local.php

4. In my build, it seemed that Nominatim likes to be run from a specific nested directory, so let's copy the build scripts there:
cp /srv/nominatim/build /srv/nominatim/Nominatim/ -R

5. Change ownership of the service directory to nominatim
chown nominatim:nominatim /srv/nominatim/ -R 
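
The warning about local.php in step 3 can be checked mechanically: the file must begin with the PHP opening tag and nothing else. A minimal sketch, with starts_with_php_tag as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# starts_with_php_tag FILE: succeed only if the first five bytes of FILE are
# exactly "<?php", i.e. there is no leading space or blank line.
starts_with_php_tag() {
    [ "$(head -c 5 "$1")" = "<?php" ]
}

# Example on the server:
#   starts_with_php_tag /srv/nominatim/build/settings/local.php \
#       || echo "local.php has leading whitespace or is missing"
```

If the check fails, re-running the tee command from step 3 is the simplest way to recreate the file cleanly.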

8. Setting Up Apache Server

  1. You need to create an alias to the website directory in your apache configuration. Add a separate nominatim configuration to your webserver:
sudo tee /etc/httpd/conf.d/nominatim.conf << EOFAPACHECONF
<Directory "/srv/nominatim/Nominatim/build/website">
  Options FollowSymLinks MultiViews
  AddType text/html   .php
  Require all granted
</Directory>
 
Alias /nominatim /srv/nominatim/Nominatim/build/website
EOFAPACHECONF
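
A common slip at this point is an Alias that points at a directory that does not exist, for example if the build directory was not copied as in the previous section. The sketch below extracts the Alias target from the config so it can be tested; alias_target is a hypothetical helper and assumes the path is written without quotes, as it is above.

```shell
#!/usr/bin/env bash
# alias_target FILE: print the filesystem path of the first Alias directive in
# an Apache config file (third field of "Alias /url /path"; unquoted paths only).
alias_target() {
    awk '$1 == "Alias" { print $3; exit }' "$1"
}

# Example on the server:
#   target=$(alias_target /etc/httpd/conf.d/nominatim.conf)
#   [ -d "$target" ] || echo "Alias target does not exist: $target"
#   sudo apachectl configtest
```

apachectl configtest only checks the syntax of the configuration, which is why the directory test is done separately.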

9. CentOS SELINUX and Nominatim

No CentOS build is complete without facing the intricacies of SELinux … ha. Here is the process I used. Note that I ran restorecon after each statement.

sudo semanage fcontext -a -t httpd_sys_content_t "/srv/nominatim/Nominatim/(website|lib|settings)(/.*)?" 
sudo restorecon -R -v /srv/nominatim/Nominatim
 
sudo semanage fcontext -a -t httpd_sys_content_t "/srv/nominatim/Nominatim/build/(website|lib|settings)(/.*)?"
sudo restorecon -R -v /srv/nominatim/Nominatim/build
 
sudo semanage fcontext -a -t httpd_sys_content_t "/srv/nominatim/build/(website|lib|settings)(/.*)?"
sudo restorecon -R -v /srv/nominatim 
 
sudo semanage fcontext -a -t lib_t "/srv/nominatim/build/module/nominatim.so" 
sudo restorecon -R -v /srv/nominatim/build/module 
 
sudo semanage fcontext -a -t lib_t "/srv/nominatim/Nominatim/module(/.*)?" 
sudo restorecon -R -v /srv/nominatim/Nominatim/module  
 
sudo semanage fcontext -a -t lib_t "/srv/nominatim/Nominatim/build/module/nominatim.so" 
sudo restorecon -R -v /srv/nominatim/Nominatim/build/module
ls -lZ
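
Because every semanage rule above is followed by a restorecon, the pair can be wrapped in one small function. This is only a sketch of that pattern, not the way I ran it: run and label_and_restore are hypothetical helpers, and with DRY_RUN left at its default of 1 the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# run CMD...: print the command when DRY_RUN=1 (the default here), otherwise
# execute it. The printed form is for reading; quoting is not preserved.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# label_and_restore TYPE PATTERN DIR: register an SELinux file context rule for
# PATTERN, then relabel DIR so the rule takes effect.
label_and_restore() {
    local type="$1" pattern="$2" dir="$3"
    run sudo semanage fcontext -a -t "$type" "$pattern"
    run sudo restorecon -R -v "$dir"
}

# Example (prints the two commands; set DRY_RUN=0 to actually run them):
#   label_and_restore httpd_sys_content_t \
#       "/srv/nominatim/build/(website|lib|settings)(/.*)?" /srv/nominatim
```

Reviewing the dry-run output first makes it easier to spot a mistyped pattern before it is registered with semanage.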

3. Then reload Apache:
sudo systemctl restart httpd

4. Reboot to reset SELinux:
shutdown -r now

10. Make a Snapshot of the Server

This is a good time to create a VM snapshot of the server and configuration. As stated before, the data import process takes a significant amount of time, and having a snapshot can help if you need to restart this process or create a cloned system.

Next step …

At this point, we are done with the base server configuration and OSM Nominatim software installation. The next article shall address OSM planet file loading and US Census data upload. paulsDevBlog.End();