OCR module installation

Introduction

The OCR module is an optional module that indexes all content for full-text search: not only traditional documents, but also images, audio and video.

NOTE: the module supports only ASCII characters and left-to-right text.

Installation requirements

The OCR module can be installed on all system shards to automatically balance the work load.

OCR module operation

The phases of OCR module operation are described below:

Phase  Description
1      Screenshot evidence images and all types of documents awaiting conversion are saved in a queue separate from the one for evidence awaiting analysis.
2      The OCR module reads an image or document from the queue and converts it into text. This operation can take from 1 second up to 5-10 seconds, depending on the number of words to be acquired.
3      The text of each image or document is saved in the database and tagged as full-text.
4      Conversion times and tags for each image are saved in the module log.
5      The text is made available to the Analyst on the evidence list page, where it can be searched through the Info field, and on the detailed evidence page.
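
The five phases can be pictured with a minimal Python sketch. This is an illustration only, not the module's actual code: the names Evidence, run_ocr_phases and search are hypothetical, the database is stood in for by a dictionary, the log by a list, and the converter is a placeholder supplied by the caller instead of a real OCR engine.

```python
import queue
import time
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """Hypothetical stand-in for an image or document awaiting conversion."""
    evidence_id: int
    payload: bytes                       # raw image or document bytes
    text: str = ""                       # filled in during phase 2
    tags: list = field(default_factory=list)


def run_ocr_phases(pending, convert, database, log):
    """Drain the conversion queue and apply phases 2 to 4 to each item."""
    while True:
        try:
            item = pending.get_nowait()          # phase 2: read from the queue
        except queue.Empty:
            break
        started = time.perf_counter()
        item.text = convert(item.payload)        # phase 2: convert to text
        elapsed = time.perf_counter() - started
        item.tags.append("full-text")            # phase 3: tag as full-text
        database[item.evidence_id] = item        # phase 3: save in the database
        log.append((item.evidence_id, elapsed, tuple(item.tags)))  # phase 4


def search(database, word):
    """Phase 5: return the ids of evidence whose text contains the word."""
    return sorted(eid for eid, ev in database.items()
                  if word.lower() in ev.text.lower())


# Phase 1: items awaiting conversion sit in their own queue.
pending = queue.Queue()
pending.put(Evidence(1, b"invoice total 42"))
pending.put(Evidence(2, b"meeting notes"))

database, log = {}, []
# Placeholder converter: decodes ASCII bytes instead of running real OCR.
run_ocr_phases(pending, lambda raw: raw.decode("ascii"), database, log)
print(search(database, "invoice"))  # prints [1]
```

Keeping the conversion queue apart from the analysis queue, as in phase 1, means that slow conversions do not hold up evidence that is waiting to be analyzed.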

Space occupied by tagged text in the database

Each piece of screenshot evidence occupies more space in the database because it is always accompanied by its tagged text. The increase in space cannot be predicted since it depends on both the number of screenshots acquired from the agent and the number of words in each screenshot.
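
Although the increase cannot be predicted in advance, a rough order of magnitude can be worked out once average figures have been observed on a running system. The sketch below is a back-of-the-envelope estimate only. The function name and all the sample figures are assumptions, and database storage and index overhead are ignored, so the real increase will be larger.

```python
def estimated_text_bytes(screenshots, avg_words, avg_word_len=5):
    """Rough size of the converted text, ignoring database overhead.

    Assumes 1 byte per character (the module handles ASCII only) and one
    separating space per word. All inputs are observed or assumed averages.
    """
    return screenshots * avg_words * (avg_word_len + 1)


# Assumed figures: 10,000 screenshots averaging 200 words each.
size = estimated_text_bytes(10_000, 200)
print(f"{size / 1_000_000:.1f} MB")  # prints 12.0 MB
```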

OCR module work load

The OCR module is CPU-intensive when converting a screenshot, but it runs at a lower priority than other processes.

As a result, the CPU load is only noticeable when the system displays the converted image text during evidence analysis.

It is best to install the module on a Shard from the start rather than on the Master Node, which already runs many processes.

Symptoms of excessive load

Check how long it takes for the text to be displayed in the detail page of a single piece of evidence, and check the times recorded in the module log when images are acquired. If these times are excessive and another server is available (i.e., one hosting another shard database or the Master Node), install another OCR module on it.

This way the work load will be divided amongst all installed modules.
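
The check on logged times can be sketched as follows. The format of the module log is not described here, so the example assumes the conversion times have already been extracted from it as a list of seconds. The function name and the 10-second limit are assumptions; the limit is taken from the upper end of the conversion time quoted for phase 2.

```python
from statistics import mean


def summarize_conversion_times(times_s, limit_s=10.0):
    """Summarize conversion times (in seconds) and flag an excessive average.

    `limit_s` is an assumed threshold, not a value defined by the module.
    """
    if not times_s:
        return {"count": 0, "mean_s": 0.0, "max_s": 0.0, "excessive": False}
    avg = mean(times_s)
    return {
        "count": len(times_s),
        "mean_s": avg,
        "max_s": max(times_s),
        "excessive": avg > limit_s,
    }


# Hypothetical sample: times read by hand from the module log.
print(summarize_conversion_times([12.0, 14.0])["excessive"])  # prints True
```

An average well above the limit on a shard whose CPU is otherwise idle suggests that adding an OCR module on another server is worth considering.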

OCR module installation

To install an OCR module in the back-end environment:

Steps  Result
1      Insert the CD with the installation package. Run file in folder : the first wizard window appears.
2      Click Next.
3      Follow the on-screen steps until installation has completed: the module will begin converting images the first time a screenshot type of evidence is received.

Checking correct OCR module operation

To check whether image conversion to text is too slow, check how long it takes for the button to appear in the evidence details page.

Uninstall

The OCR module can be uninstalled from the Windows Control Panel.

NOTE: uninstalling an OCR module does not affect text that has already been converted and tagged.