
OCR module

Introduction

The OCR module is an optional module that indexes all content for full-text search: not only traditional documents, but also images, audio and video. It also runs face detection on images to help intelligence create target profiles.

NOTE: the module supports only ASCII characters and left-to-right text.

Installation

The OCR module is automatically installed and enabled when the Master node and any additional shards are installed.

NOTE: the module is only enabled if included in the license.

OCR module operation

The OCR module operates in the following phases:

Phase   Description
1       Screenshot images and documents of all types that await conversion are saved in a queue separate from the one holding evidence awaiting analysis.
2       The OCR module reads each image or document from the queue and converts it into text. Conversion takes from 1 second up to 5-10 seconds, depending on the number of words to be extracted.
3       The text of each image or document is saved in the database and tagged as full-text.
4       The conversion time and tags for each image are saved in the module log.
5       The text is made available to the Analyst on the evidence list page, where it can be searched through the Info field, and on the evidence detail page.
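
The phases above can be pictured with a minimal Python sketch. This is an illustration only, not the module's actual implementation: the `Evidence` class, the `fake_convert` placeholder (which merely decodes ASCII bytes instead of running OCR), the in-memory dictionary standing in for the database and the log message layout are all assumptions made for the example.

```python
import logging
import time
from collections import deque
from dataclasses import dataclass, field

log = logging.getLogger("ocr_sketch")  # hypothetical logger name


@dataclass
class Evidence:
    """Hypothetical evidence record used only in this sketch."""
    evidence_id: int
    payload: bytes                      # raw image or document content
    text: str = ""
    tags: set = field(default_factory=set)


def fake_convert(payload: bytes) -> str:
    """Placeholder for the real conversion step: keeps ASCII bytes only."""
    return payload.decode("ascii", errors="ignore")


def process_queue(queue: deque, database: dict, convert=fake_convert) -> list:
    """Drain the conversion queue and return (evidence_id, seconds) pairs."""
    timings = []
    while queue:
        item = queue.popleft()                  # phase 2: read from the queue
        start = time.perf_counter()
        item.text = convert(item.payload)       # phase 2: convert to text
        elapsed = time.perf_counter() - start
        item.tags.add("full-text")              # phase 3: tag the text
        database[item.evidence_id] = item       # phase 3: save it
        log.info("evidence %s converted in %.3fs, tags=%s",
                 item.evidence_id, elapsed, sorted(item.tags))  # phase 4
        timings.append((item.evidence_id, elapsed))
    return timings


def search(database: dict, term: str) -> list:
    """Phase 5: case-insensitive full-text search over the stored text."""
    needle = term.lower()
    return [eid for eid, ev in database.items() if needle in ev.text.lower()]


if __name__ == "__main__":
    # Phase 1: items awaiting conversion sit in their own queue.
    pending = deque([Evidence(1, b"hello world"), Evidence(2, b"other text")])
    db = {}
    process_queue(pending, db)
    print(search(db, "HELLO"))
```

Keeping the conversion queue separate from the analysis queue, as in phase 1, means that a slow conversion delays only the availability of the text and not the evidence itself.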

Space occupied by tagged text in the database

Each piece of screenshot evidence occupies more space in the database because it is always accompanied by its tagged text. The increase in space cannot be predicted since it depends on both the number of screenshots acquired from the agent and the number of words in each screenshot.
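
Although the increase cannot be known in advance, the way the two factors combine can be sketched with simple arithmetic once they have been measured on a running system. The function below is a rough illustration only: the function name, the default average word length and the sample figures are assumptions, and the result covers the raw ASCII text alone, not any database or index overhead.

```python
def estimate_text_bytes(screenshots: int, avg_words: float,
                        avg_word_len: float = 5.0) -> int:
    """Rough size of the extracted text in bytes.

    Assumes one byte per ASCII character plus one separator per word.
    """
    if screenshots < 0 or avg_words < 0 or avg_word_len < 0:
        raise ValueError("all inputs must be non-negative")
    return round(screenshots * avg_words * (avg_word_len + 1))


if __name__ == "__main__":
    # Hypothetical figures: 10,000 screenshots with 200 words each.
    print(estimate_text_bytes(10_000, 200))  # 12000000
```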

OCR module work load

The OCR module makes heavy use of the CPU when converting a screenshot, but it runs at a lower priority than other processes.

The CPU load is therefore noticeable only when the system displays the converted image text during evidence analysis.

Symptoms of excessive load

Check how long it takes for the text to be displayed in the detail page of a single piece of evidence, and check the conversion times recorded in the log when images are acquired. If these times are deemed excessive, add a shard to the current installation.

This way the work load is divided among all installed OCR modules.
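
As an aid to judging whether the recorded times are excessive, the following sketch summarizes a list of conversion times. It assumes the times have already been copied out of the module log by hand, since the log format is not described here. The 10-second limit reflects the upper end of the range given in the phase table, while the function names and the tolerated fraction of slow conversions are hypothetical choices.

```python
from statistics import mean


def summarize_conversion_times(seconds, limit=10.0):
    """Return basic statistics for a non-empty list of conversion times."""
    if not seconds:
        raise ValueError("no conversion times supplied")
    slow = [s for s in seconds if s > limit]
    return {
        "count": len(seconds),
        "mean": mean(seconds),
        "max": max(seconds),
        "slow_fraction": len(slow) / len(seconds),
    }


def looks_overloaded(seconds, limit=10.0, tolerated_fraction=0.2):
    """True when more than the tolerated share of conversions exceeds the limit."""
    stats = summarize_conversion_times(seconds, limit)
    return stats["slow_fraction"] > tolerated_fraction


if __name__ == "__main__":
    sample = [2, 3, 12, 15, 4]       # hypothetical times in seconds
    print(looks_overloaded(sample))  # True: 2 of 5 conversions exceed 10 s
```

A result of `True` is only a hint that adding a shard may be worthwhile; the final judgment remains with the system administrator.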

Checking correct OCR module operations

To check whether image conversion to text is too slow, check how long it takes for the button to appear in the evidence details page.

Disabling or re-enabling the OCR module

To disable or re-enable the OCR module, from the Master node Windows command prompt, run the following commands:

Result: the OCR module is disabled/re-enabled along with all shards.

NOTE: disabling the OCR module does not affect text that has already been converted and tagged.

RCS9.6 | User manual | © COPYRIGHT 2015