First Launch

Unpacking the Application

Unpacking: The first step of installing DocFetcher Server is to unpack the downloaded package to a location of your choice. On Windows and Linux, this means unpacking the contents of a zip archive. On macOS, it means mounting and opening a disk image and moving the application folder out of it. Do NOT unpack or move the application on top of an older or newer version of itself; otherwise, application files may get mixed up, leading to unpredictable behavior. Also make sure that the user account under which the application will run has read and write permissions for the application folder.

Portable application: Once unpacked, the application is a self-contained piece of software that runs directly from its folder and stores all settings and indexes in that folder. In other words, it is a portable application.

Non-portable use: While the application is portable by design, it can also be made non-portable by replacing one or more subfolders in the application folder with NTFS junctions or symlinks. In particular, if you want the application to store its settings and/or indexes somewhere other than its own folder, you can replace the subfolder conf and/or the subfolder indexes with NTFS junctions or symlinks. Note that running multiple instances of DocFetcher Server that share the same conf and indexes folders is not supported and may lead to file corruption and crashes.
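As a rough sketch of the symlink approach, the following Python helper moves a subfolder such as conf out of the application folder and leaves a symlink in its place. The function name relocate_subfolder and all paths are hypothetical and not part of DocFetcher Server. On Windows, creating symlinks may require elevated rights, in which case an NTFS junction is the alternative. Stop the server before trying anything like this.

```python
import os
import shutil
from pathlib import Path


def relocate_subfolder(app_dir, name, target_dir):
    """Move app_dir/name to target_dir/name and leave a symlink behind.

    Hypothetical helper for illustration; not shipped with DocFetcher Server.
    """
    source = Path(app_dir) / name
    target = (Path(target_dir) / name).resolve()
    if source.is_symlink():
        raise RuntimeError(f"{source} is already a symlink")
    if target.exists():
        raise RuntimeError(f"{target} already exists")
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    if source.exists():
        # Keep the existing settings or indexes by moving them.
        shutil.move(str(source), str(target))
    else:
        target.mkdir()
    # The application keeps using app_dir/name, which now points elsewhere.
    os.symlink(target, source, target_is_directory=True)
    return target
```

A call such as relocate_subfolder("/path/to/docfetcherserver", "conf", "/srv/docfetcher-data") would then keep the settings under /srv/docfetcher-data/conf, with both paths being placeholders.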

Upgrading: Because DocFetcher Server is a portable application and has no auto-update feature, the proper way to upgrade to a newer version is to unpack the new version to a separate folder and then copy or move the conf and indexes subfolders from the old version to the new version. If you’ve modified the launch scripts or other application files in any way, these must be copied or moved as well. And again, do NOT unpack the new version on top of the old version.
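The copy step of an upgrade could be scripted roughly as follows. This is a sketch under assumptions: the helper name carry_over_data is made up, both server versions are stopped, and any conf or indexes folder found in the freshly unpacked version is assumed to contain only defaults that may be replaced.

```python
import shutil
from pathlib import Path


def carry_over_data(old_dir, new_dir, subfolders=("conf", "indexes")):
    """Copy settings and indexes from an old installation into a fresh one."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    if old_dir.resolve() == new_dir.resolve():
        raise ValueError("old and new folders must differ")
    copied = []
    for name in subfolders:
        source = old_dir / name
        if not source.is_dir():
            continue  # nothing to carry over for this subfolder
        destination = new_dir / name
        if destination.exists():
            # Assumption: the new version's folder holds only defaults.
            shutil.rmtree(destination)
        shutil.copytree(source, destination)
        copied.append(name)
    return copied
```

Copying rather than moving leaves the old installation intact until the new one has been verified. Modified launch scripts are not handled here and would still need to be carried over by hand.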

Starting the Server (Windows)

This section explains how to start the server on Windows. If you’re using Linux or macOS instead, skip to the next section.

Executables: On Windows, the DocFetcher Server folder contains the following executables:

DocFetcherServer.exe
DocFetcherServerw.exe
server-install.exe
server-start.exe
server-stop.exe
server-uninstall.exe

The number of these executables and the similarity of their names may cause some confusion, so a careful reading of the following explanation is recommended.

Windows service: First of all, DocFetcher Server isn’t a regular application that you can just run by double-clicking an executable. Instead, DocFetcher Server must be installed and run as a Windows service. To install it, double-click the server-install.exe executable and confirm when the executable asks for admin rights. And if you no longer need DocFetcher Server, you also cannot simply remove it by deleting the application folder. Instead, you have to first uninstall the Windows service, by running server-uninstall.exe.

Do not rename or move the application folder: After installing DocFetcher Server as a Windows service, do not rename or move the DocFetcher Server folder, as this will break the Windows service. Before renaming or moving the folder, you first have to uninstall the Windows service.

Foreground vs. background: While DocFetcher Server is installed as a Windows service, you can run it either in the foreground or in the background. The former is useful for testing and troubleshooting because you get to see the application’s output in a console window. The latter is how you would normally run the application. Running the application both in the foreground and in the background is not supported; if you try that, the DocFetcher Server instance that comes first will run and the other instance will fail.

Starting and stopping: To run DocFetcher Server in the foreground, double-click the DocFetcherServer.exe executable. This opens a console window in which the application will run. To exit the application, press Ctrl + C in the console window twice (!). To run DocFetcher Server in the background, double-click the server-start.exe executable. To stop the background process, double-click the server-stop.exe executable. After you start the application as either a foreground or a background process, it may seem that nothing more is happening, but that is expected: the server is up and running and ready to serve clients through their web browsers.

Starting when system boots: Installing the Windows service by running server-install.exe does not start the service immediately. It will, however, be started in the background the next time the system boots.

Windows service GUI: The one executable not discussed so far is DocFetcherServerw.exe. This executable can be run once the Windows service is installed, and it provides a simple GUI for:

configuring the Windows service,
starting and stopping the Windows service, and
seeing whether the Windows service is running at the moment.

Visibility of mapped drives: By default, DocFetcher Server’s Windows service runs under the Local System account. Because mapped drives are specific to a user account, they may not be visible when you try to select them for indexing within DocFetcher Server. Probably the easiest solution is to index the network resource via its UNC path, i.e., a path of the form \\server\share\folder. The alternative is to perform some complex system-admin gymnastics to create a mapped drive that is accessible from the Local System account, or from whatever account the DocFetcher Server instance runs under. Note that you can use the DocFetcherServerw.exe GUI to specify the user account under which DocFetcher Server runs. Normally, though, you would use the Local System account, because that way DocFetcher Server runs even when no users are logged in.

Log files: In addition to the console output that you get when running DocFetcher Server in the foreground, the application’s log files may also help with troubleshooting: C:\Windows\System32\LogFiles\Apache\docfetcherserver*.log

Starting the Server (Linux and macOS)

As a foreground process: To start DocFetcher Server on Linux or macOS, open a terminal window, use the cd command to navigate to the DocFetcher Server folder, e.g., cd /path/to/docfetcherserver, and then run the command ./server-start-foreground.sh. This will launch DocFetcher Server as a foreground process. If the command didn’t work, try the following command to mark all scripts in the current folder as executable: chmod +x *.sh. After starting the server process, there will be a flurry of terminal output, and then it will seem like the process is hanging. However, that’s how it’s supposed to work – the server is just up and running and ready to serve clients through their web browsers. You can stop the server process by pressing Ctrl + C (Linux) or Command + . (macOS) in the terminal.

As a background process: Later on, when the application is ready to be used for real, you will want to run it as a background process. In that mode, the commands ./server-start.sh and ./server-stop.sh start and stop the background process, respectively. You will probably also want to use whatever facilities your operating system provides to make the server-start.sh script run when the system boots up.

Testing and troubleshooting: For testing and troubleshooting purposes, running DocFetcher Server as a foreground process is more suitable, as it allows one to see its output and any potentially useful error messages in the terminal.

The Web Interface

Accessing the web interface: Once the server is up and running as described above, you can access its web interface locally, i.e., on the same computer it runs on, by opening your web browser and visiting the address http://localhost:31190/. Warning: Do not omit the trailing slash in the address. The web interface won’t work correctly without it!

Server warm-up: When running DocFetcher Server as a background process, one thing to keep in mind is that after starting up the server, it may take a while for the server to become ready to serve content to clients. The main reason for this is that the server needs to load its indexes into memory, and this takes longer the bigger and more numerous the indexes are.
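A script that starts the background process and then needs the web interface can wait for the warm-up to finish by polling the address. The sketch below is one way to do this in Python. It assumes that the server does not answer HTTP requests at all until it is ready, which is not confirmed here, and the function names, timeout and polling interval are arbitrary choices.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://localhost:31190/", timeout=120.0,
                     interval=2.0, probe=None):
    """Poll url until it answers or timeout seconds pass; return True/False."""
    probe = probe or _http_probe
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def _http_probe(url):
    """Return True if the server answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused or timed out: not up yet
```

Calling wait_until_ready() with its defaults checks the local address every two seconds for up to two minutes. The probe parameter exists only so that the check can be swapped out, for example in tests.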

User Area and Admin Area: DocFetcher Server’s web interface is divided into two areas, the User Area and the Admin Area. The User Area is where users can run searches and retrieve the result files. The Admin Area is where server admins can manage indexes and user access. You can navigate between User Area and Admin Area by clicking the aptly labeled links “User Area” and “Admin Area” placed at the top or bottom of the respective page.

Indexing before searching: Having just set up a DocFetcher Server instance, if you go to the User Area and try to enter a query in the search field, you will get an error message rather than results. This is because searching files requires indexing those files beforehand, something DocFetcher Server won’t do unless you tell it to. The topic of indexing is covered in a later section, Indexing.

Searching via URL: When opening the User Area in your browser, you can append a ?q= parameter to the URL to run a search immediately after the User Area is opened. For example, if the User Area is at the address https://example.com/search, you can navigate to https://example.com/search/?q=dog cat to search for dog cat immediately after the User Area is opened. If the query contains special characters, they must be encoded according to an encoding scheme known as percent-encoding or URL encoding. For example, the query +dog +cat would have to be encoded as %2Bdog+%2Bcat, where %2B stands for the plus sign and + stands for the space. The resulting URL would then be: https://example.com/search/?q=%2Bdog+%2Bcat. Various online tools can perform this encoding; just search for “url encode online”.
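If such URLs are generated by a script rather than by hand, the encoding can be left to a standard library. A minimal Python sketch follows. The helper name search_url is made up, and the address is the example address from above.

```python
from urllib.parse import urlencode


def search_url(user_area_url, query):
    """Build a User Area URL that runs `query` as soon as the page opens."""
    # urlencode percent-encodes special characters and turns spaces into "+".
    return user_area_url.rstrip("/") + "/?" + urlencode({"q": query})


print(search_url("https://example.com/search", "+dog +cat"))
# prints https://example.com/search/?q=%2Bdog+%2Bcat
```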

Browser requirements: Officially, only the most recent version of the following four web browsers is supported: Chrome, Firefox, Safari and Edge. Older versions of these browsers and any version of other browsers may also work, but are not officially supported. In addition, cookies and the so-called local storage must be enabled in the browser. The web interface uses these to remember the states of various checkboxes across sessions, for instance. JavaScript must be enabled as well, since the web interface makes heavy use of it. Last but not least, the web interface was designed only for desktop browsers, not browsers on tablets and smartphones.

Client Limit

Client limit: Following the pricing model of DocFetcher Server, every instance of the application has a specific client limit, i.e., a limit on how many users can simultaneously access the User Area. For example, if your DocFetcher Server instance has a client limit of 10 users, then at most 10 users will be able to access the User Area at any given moment. Any additional users trying to access the User Area will see a rejection page until one of the 10 active users navigates away from the User Area and thus becomes inactive. Also, note that only the User Area counts towards the client limit, not the Admin Area.

Client limit upgrade: According to the pricing model of DocFetcher Server, you can purchase an upgrade of the application to a higher client limit. Downgrading to a lower client limit is not possible. To help you decide whether a client limit upgrade is in order, there’s a “Client Limit” section in the Admin Area that tells you the client limit of your DocFetcher Server instance and how many active users there are at the moment. How to upgrade to a higher client limit is explained here.

Client limit and browser sessions: An important detail to be aware of is that the “clients” that are counted towards the client limit aren’t identified by IP addresses, but by browser sessions. For example, if a user were to (theoretically) access the User Area of a DocFetcher Server instance through a Chrome window, a Chrome Incognito window, a Firefox window and a Firefox private window, all at the same time and on the same computer, then this would count as four clients, not one. On the other hand, if the user were to access the User Area through two non-Incognito Chrome tabs or windows, then this would count as only one client – that’s because those Chrome tabs or windows would actually share the same browser session. The reason why DocFetcher Server counts browser sessions rather than IP addresses is entirely technical: IP addresses are generally not a reliable way to identify users, because, for example, multiple clients accessing a server from behind a proxy server would be seen by the server as having the same IP address.
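The difference between counting by session and counting by IP address can be illustrated with a toy model. The following Python sketch is not DocFetcher Server’s actual implementation; the data layout, the session names and the IP address are invented for illustration.

```python
def count_clients(requests):
    """Count clients by distinct session ID, ignoring the IP address."""
    return len({session_id for _ip, session_id in requests})


# (ip_address, session_id) pairs from one computer; all values are made up.
requests = [
    ("203.0.113.5", "chrome-regular"),    # Chrome tab 1
    ("203.0.113.5", "chrome-regular"),    # Chrome tab 2 shares the session
    ("203.0.113.5", "chrome-incognito"),
    ("203.0.113.5", "firefox-regular"),
    ("203.0.113.5", "firefox-private"),
]
print(count_clients(requests))  # prints 4
```

Counting distinct IP addresses instead would yield 1 here, and would also yield 1 for several different users behind the same proxy server.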

Session stealing: The intended pricing model of DocFetcher Server is that each server instance supports a fixed maximum number of users, with a fixed price for each additional user. However, the fact that DocFetcher Server counts users by browser sessions rather than IP addresses leads to some counter-intuitive limitations:

Accessing a DocFetcher Server instance from multiple browsers on the same computer counts as multiple clients rather than as a single client. (For example, imagine a web developer testing a web app in multiple browsers on the same computer.)
Accessing a DocFetcher Server instance from multiple computers counts as multiple clients, even if there’s actually only one person switching back and forth between these computers. (For example, imagine a system administrator managing multiple computers.)

For these special use cases, DocFetcher Server offers an optional workaround called session stealing, which means that once the client limit is reached, new users are allowed to take over existing sessions, thus “stealing” them from their previous owners. The latter would then get served the rejection page instead. In this way, it becomes more convenient for a user to switch between browsers on a single computer, or to switch between multiple computers.

Enabling session stealing: The ability to steal sessions is enabled by default for a client limit of 1, and disabled by default for higher client limits. To enable session stealing, go to the Client Limit section of the Admin Area. Once it is enabled, it shows up as a Steal Session link on the rejection page.

How the session to steal is chosen: Upon clicking the Steal Session link, the session to steal is chosen automatically, according to the following algorithm:

If the current user is logged in, try to steal an existing session from that exact same user.
If that fails, try to steal an existing session from a user with the same IP address as the current user.
If that also fails, steal the oldest session across all users.
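The three steps above can be sketched in Python as follows. This is an illustration only, not the server’s actual code. The Session fields and the function name are hypothetical, “oldest” is interpreted as earliest creation time, and picking the oldest candidate within the first two steps is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    session_id: str
    ip_address: str
    created_at: float               # smaller value means older session
    username: Optional[str] = None  # None if the user is not logged in


def choose_session_to_steal(sessions, ip_address, username=None):
    """Pick the session to steal, following the three steps listed above."""
    if not sessions:
        return None
    # Step 1: prefer a session of the same logged-in user.
    if username is not None:
        same_user = [s for s in sessions if s.username == username]
        if same_user:
            return min(same_user, key=lambda s: s.created_at)
    # Step 2: otherwise, a session from the same IP address.
    same_ip = [s for s in sessions if s.ip_address == ip_address]
    if same_ip:
        return min(same_ip, key=lambda s: s.created_at)
    # Step 3: otherwise, the oldest session across all users.
    return min(sessions, key=lambda s: s.created_at)
```

The ordering means that a user switching browsers or computers tends to displace only their own earlier session, and other users lose a session only when neither the username nor the IP address matches.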