Apache Integration
This page explains how to integrate DocFetcher Server into an existing Apache web server running on Linux so that DocFetcher Server’s web interface can be accessed via a subdirectory of the website, e.g., https://example.com/search/. It is assumed that Apache is already up and running and accepts HTTPS traffic, and that the reader has some familiarity with Apache.
In a nutshell, this is how our setup works:
When DocFetcher Server runs, it will grab a port, e.g., 31190 for HTTP and 31191 for HTTPS, and handle requests on that port.
Apache will be configured to redirect traffic arriving at the /search subdirectory to the /search/ subdirectory.
Apache will be configured to redirect traffic arriving at the /search/ subdirectory to DocFetcher Server’s port.
The redirection from /search to /search/ is important: Without it, if a client were to visit https://example.com/search instead of https://example.com/search/, DocFetcher Server’s web interface would not work correctly, due to the quirky ways in which relative URLs work.
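To illustrate why the trailing slash matters, here is a minimal sketch of how a browser resolves a plain relative URL against the page address. The function name `resolve_relative` and the file name `style.css` are hypothetical and only serve the illustration. The function handles simple relative paths only, not `../` segments or query strings.

```shell
# Simplified relative-URL resolution: drop everything after the last "/"
# of the base URL, then append the relative reference.
# (Hypothetical helper; covers only plain relative paths such as "style.css".)
resolve_relative() {
    base="$1"
    ref="$2"
    printf '%s\n' "${base%/*}/${ref}"
}

resolve_relative "https://example.com/search" "style.css"
# prints https://example.com/style.css (outside /search/, so not handled by DocFetcher Server)

resolve_relative "https://example.com/search/" "style.css"
# prints https://example.com/search/style.css (inside /search/, as intended)
```

Without the trailing slash, the last path segment is treated like a file name and replaced, so the relative request lands outside the /search/ subdirectory.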
For the redirection from /search to /search/ we can use mod_rewrite, and for the redirection from /search/ to DocFetcher Server’s port we can use mod_proxy and mod_proxy_http. Run the following command to enable these Apache modules:
sudo a2enmod rewrite proxy proxy_http
Now open the Apache HTTPS site configuration in a text editor, e.g.:
sudo nano /etc/apache2/sites-available/default-ssl.conf
At this point there’s a choice between DocFetcher Server handling HTTP or HTTPS traffic on the port Apache redirects to. If the DocFetcher Server instance is just an internal server, then HTTP is fine. But if for some reason it needs to handle traffic from over the internet in addition to traffic coming from Apache, then HTTPS is needed.
If you’re going with HTTP, and assuming DocFetcher Server expects HTTP traffic on port 31190, you can append the following at the end of the <VirtualHost> block:
RewriteEngine on
RewriteRule ^/search$ /search/ [R]
ProxyPass /search/ http://127.0.0.1:31190/
ProxyPassReverse /search/ http://127.0.0.1:31190/
If DocFetcher Server is running on a different computer than Apache, replace the occurrences of 127.0.0.1 with the address of the computer DocFetcher Server is running on. Furthermore, be sure not to leave out any trailing slashes. They are important: the path prefix and the target URL must either both end with a slash or both omit it.
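The role of the trailing slashes can be seen in a simplified model of the prefix substitution that ProxyPass performs. This is a sketch for illustration, not Apache's actual implementation. The function name `map_to_backend` and the path `index.html` are hypothetical.

```shell
# Simplified model of ProxyPass: if the request path starts with the prefix,
# the prefix is cut off and the remainder is appended verbatim to the target.
# (Hypothetical helper; real mod_proxy does more than this.)
map_to_backend() {
    prefix="$1"
    target="$2"
    path="$3"
    case "$path" in
        "$prefix"*) printf '%s\n' "${target}${path#"$prefix"}" ;;
        *)          printf 'not proxied\n' ;;
    esac
}

map_to_backend /search/ http://127.0.0.1:31190/ /search/index.html
# prints http://127.0.0.1:31190/index.html

map_to_backend /search/ http://127.0.0.1:31190 /search/index.html
# prints http://127.0.0.1:31190index.html (malformed: slash missing on the target)

map_to_backend /search/ http://127.0.0.1:31190/ /search
# prints not proxied (hence the separate rewrite rule for /search)
```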
If you’re going with HTTPS, and assuming DocFetcher Server expects HTTPS traffic on port 31191, you can append the following at the end of the <VirtualHost> block:
RewriteEngine on
RewriteRule ^/search$ /search/ [R]
SSLProxyEngine on
SSLProxyVerify none
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
SSLProxyCheckPeerExpire off
ProxyPass /search/ https://127.0.0.1:31191/
ProxyPassReverse /search/ https://127.0.0.1:31191/
Again, replace 127.0.0.1 with the correct address if necessary, and leave all trailing slashes intact.
The four directives SSLProxyVerify, SSLProxyCheckPeerCN, SSLProxyCheckPeerName and SSLProxyCheckPeerExpire above are only needed if you don’t provide DocFetcher Server with a proper SSL certificate, in which case it falls back to the self-signed SSL certificate it ships with.
The SSLProxy directives used for forwarding traffic to an HTTPS port are provided by mod_ssl, so run this command to make sure it’s enabled:
sudo a2enmod ssl
Finally, restart Apache, using a command like the following:
sudo systemctl restart apache2
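After the restart, the setup can be checked from the command line. The following is a minimal sketch, assuming curl is installed and the site is reachable at https://example.com. The function name `check_search_endpoint` is hypothetical.

```shell
# Print the HTTP status (and redirect target, if any) for both URL variants.
# (Hypothetical helper; pass the site's base URL without a trailing slash.)
check_search_endpoint() {
    base_url="$1"
    # Without trailing slash: the RewriteRule should answer with a redirect.
    curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$base_url/search"
    # With trailing slash: the request should be passed on to DocFetcher Server.
    curl -s -o /dev/null -w '%{http_code}\n' "$base_url/search/"
}
```

Running `check_search_endpoint https://example.com` should print a 302 status followed by the /search/ URL on the first line, since the [R] flag issues a 302 redirect by default. The second line shows the status DocFetcher Server answers with. A 502 or 503 there usually means Apache could not reach DocFetcher Server’s port.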