Basic Operation

Overview

Kemangi takes a plain text file and processes each line of the text according to the tasks you specify.

Plain Text File

A plain text file is one that can be opened with a basic text editor on your system (Windows: Notepad; Linux: gedit, vi, etc.). If its contents display cleanly, it is a plain text file.

A typical plain text file has a .txt extension.
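If you would rather check a file programmatically than open it in an editor, the short Python sketch below shows one possible heuristic. It is not part of Kemangi. The function name looks_like_plain_text is hypothetical, and the check assumes the text is ASCII or UTF-8; which encodings Kemangi itself accepts is not stated here.

```python
def looks_like_plain_text(data: bytes) -> bool:
    """Heuristic only: treat data as plain text if it has no NUL bytes
    and decodes as UTF-8 (an assumption; other encodings will be rejected)."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Ordinary text passes the check.
print(looks_like_plain_text("Hello, Kemangi\n".encode("utf-8")))
# Bytes resembling the start of a zip-based document (such as .odt) do not.
print(looks_like_plain_text(b"PK\x03\x04\x00\x00"))
```

To check a real file, read it in binary mode, for example with open(path, "rb"), and pass the bytes to the function.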

To preprocess text, you need to supply an input file, a list of tasks, and an output target.

Input File

Select the input file by clicking the “Browse” button. Remember that the file must be plain text. Kemangi can’t process more complex formats such as .doc or .odt.

List of Tasks

You can add several tasks, and Kemangi will run them in the given order. To add a task, click “Add task”. A window will appear in which you can choose the kind of task to add.

Some tasks may need additional input. For example, removal of your own stop words requires a text file, provided by you, containing the list of stop words.

Some tasks may also include other tasks. For example, stop words removal already includes case folding.
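The behavior described in this section can be pictured with the small Python sketch below. It is an illustration only, not Kemangi's actual code. All names (case_folding, make_stop_word_removal, run_tasks) are hypothetical, and splitting words on whitespace is an assumption.

```python
from typing import Callable, Iterable, List

# A task is modeled here as a function that turns one line into another.
Task = Callable[[str], str]


def case_folding(line: str) -> str:
    """Convert the line to lowercase."""
    return line.lower()


def make_stop_word_removal(stop_words: Iterable[str]) -> Task:
    """Build a stop-word removal task from a user-supplied word list."""
    stop_set = {word.lower() for word in stop_words}

    def task(line: str) -> str:
        # This task includes case folding, so it is applied first.
        words = case_folding(line).split()
        return " ".join(word for word in words if word not in stop_set)

    return task


def run_tasks(lines: Iterable[str], tasks: List[Task]) -> List[str]:
    """Apply every task to every line, in the order the tasks were given."""
    result = []
    for line in lines:
        for task in tasks:
            line = task(line)
        result.append(line)
    return result


tasks = [make_stop_word_removal(["the", "a", "of"])]
print(run_tasks(["The Scent of a Basil Leaf"], tasks))  # ['scent basil leaf']
```

Because the stop-word task already lowercases the line, adding a separate case folding task before it would not change the result.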

Output Target

Choose the output target by clicking the “Browse” button, then proceed as you normally would when saving a file.

Start Processing

Click the “Start” button and text processing will start.

When an unexpected error occurs (for example, a task requires an internet connection but yours is lost), Kemangi writes the latest preprocessed text to “intermediateResult.bak” in Kemangi’s directory. It is a plain text file, and you can open it with any basic text editor.
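The recovery behavior can be sketched in Python as follows. This is a hedged illustration, not Kemangi's implementation. The names run_with_backup and failing_task are hypothetical, and the sketch assumes that "latest preprocessed text" means the output of the last task that finished successfully.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def run_with_backup(lines: Iterable[str],
                    tasks: List[Callable[[str], str]],
                    backup_path: str) -> List[str]:
    """Run tasks in order; on an error, save the last completed result."""
    current = list(lines)
    try:
        for task in tasks:
            # 'current' is only replaced once a task finishes on every line.
            current = [task(line) for line in current]
    except Exception:
        with open(backup_path, "w", encoding="utf-8") as backup_file:
            backup_file.write("\n".join(current))
        raise
    return current


def failing_task(line: str) -> str:
    # Stands in for a task that needs internet access and loses it.
    raise ConnectionError("connection lost")


# A temporary directory stands in for Kemangi's directory in this demo.
backup = os.path.join(tempfile.mkdtemp(), "intermediateResult.bak")
try:
    run_with_backup(["Hello World"], [str.lower, failing_task], backup)
except ConnectionError:
    with open(backup, encoding="utf-8") as backup_file:
        print(backup_file.read())  # hello world
```

In this sketch the backup holds the text as it stood after lowercasing, because that was the last task to complete before the simulated connection failure.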