As part of transcribing recordings, Conversation Analyzer categorizes the text of each transcript: it identifies key phrases using the defined rules and records the subcategory or category those rules belong to. A category is a collection of subcategories, each of which contains a series of rules. Each rule consists of a word or phrase and the party who said it. If the transcript contains the word or phrase, and the specified party spoke it, Conversation Analyzer matches the transcript against the category.
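The category, subcategory, and rule hierarchy described above can be pictured with a minimal Python sketch. This is an illustration only, not Conversation Analyzer's implementation: the class names, the `categorize` function, the party labels, and the case-insensitive whole-word matching are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    phrase: str  # word or phrase to look for
    party: str   # who must have said it (labels here are hypothetical)


@dataclass
class Subcategory:
    name: str
    rules: list[Rule] = field(default_factory=list)


@dataclass
class Category:
    name: str
    subcategories: list[Subcategory] = field(default_factory=list)


@dataclass(frozen=True)
class Utterance:
    party: str
    text: str


def _tokens(text: str) -> list[str]:
    # Assumption: case-insensitive, whitespace-delimited whole-word matching.
    return text.casefold().split()


def rule_matches(rule: Rule, utterance: Utterance) -> bool:
    """True if the specified party spoke the rule's word or phrase."""
    if rule.party != utterance.party:
        return False
    phrase, words = _tokens(rule.phrase), _tokens(utterance.text)
    n = len(phrase)
    return n > 0 and any(
        words[i:i + n] == phrase for i in range(len(words) - n + 1)
    )


def categorize(transcript: list[Utterance],
               categories: list[Category]) -> list[tuple[str, str]]:
    """Return each (category, subcategory) pair with at least one matching rule."""
    matches = []
    for category in categories:
        for sub in category.subcategories:
            if any(rule_matches(rule, utt)
                   for rule in sub.rules for utt in transcript):
                matches.append((category.name, sub.name))
    return matches
```

In this sketch, a rule for the phrase "refund" with the party "customer" matches only when the customer says the word; the same word spoken by the agent produces no match.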

...

Valid Expression and Find field values contain only alphanumeric, apostrophe and space characters; that is, values can contain spaces (U+0020), apostrophes (U+0027), and characters from the following Unicode categories:

Unicode Category Name
Description
Ll

Letter, Lowercase.

For example, a-z, ᵯ, ḅ, ṥ, ở, ﬓ

Lu

Letter, Uppercase.

For example, A-Z, Ý, Ŧ, Ǣ, Щ, 𝕐

Lt

Letter, Titlecase.

For example, Dž, ᾎ, ᾟ, ᾭ

Lo

Letter, Other.

For example, ª, ܗ, 爨

The Mongolian letter "Manchu Ali Gali Lha" (U+18AA) is not allowed within Expression and Find values. This character is used internally within the categorization engine. If the character appears within spoken text, Conversation Analyzer treats it as an apostrophe.

Lm

Letter, Modifier.

For example, ʰ, ᵓ, 〲, ꟹ

Mn

Mark, Nonspacing.

For example, ុ, ᜴

Nd

Number, Decimal Digit.

For example, 0-9, ۳, ૮, ๗

Pc

Punctuation, Connector.

For example, _, ‿, ⁀, ⁔, ︳, ︴, ﹍, ﹎, ﹏, ＿

This category includes ten characters; the most commonly used is the LOW LINE character (_), U+005F.
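The character rules above can be checked with a short Python sketch built on the standard `unicodedata` module. It is an assumption-laden illustration rather than the product's validator: the function names are hypothetical, and the allowed set covers only the categories listed in this excerpt, so extend it if the full table names more.

```python
import unicodedata

# Unicode general categories listed in the table above (this excerpt only).
ALLOWED_CATEGORIES = {"Ll", "Lu", "Lt", "Lo", "Lm", "Mn", "Nd", "Pc"}

# Space (U+0020) and apostrophe (U+0027) are allowed explicitly.
ALLOWED_LITERALS = {"\u0020", "\u0027"}

# U+18AA is reserved for internal use and is rejected explicitly.
RESERVED = {"\u18AA"}


def invalid_characters(value: str) -> list[str]:
    """Return the characters in value that the rules above do not allow."""
    return [
        ch for ch in value
        if ch in RESERVED
        or (ch not in ALLOWED_LITERALS
            and unicodedata.category(ch) not in ALLOWED_CATEGORIES)
    ]


def is_valid_value(value: str) -> bool:
    """True if every character in value is allowed (hypothetical helper)."""
    return not invalid_characters(value)
```

For instance, `is_valid_value("don't cancel my account")` is true under these rules, while `invalid_characters("50% off")` reports the percent sign, which belongs to a punctuation category that is not in the list.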

...