Along with applying categorization rules to a conversation transcript, Conversation Analyzer applies substitution rules to refine the output. Substitution rules replace words that are often incorrectly transcribed and correct the spelling of words. You will most likely need these rules for proper nouns, such as place, company, or product names. For example, Conversation Analyzer may transcribe 'Basingstoke' as 'Beijing spoke'. Create rules that replace the incorrect word or words. Substitution rules can also replace sensitive information such as credit card details. For example, you can replace specified text with text such as '(redacted)', '(removed)', or 'xxxxxxxxxxxxxx'.
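The idea of a substitution rule can be sketched outside the product in a few lines of Python. This is only an illustration of the concept, not how Conversation Analyzer implements it: the `SUBSTITUTIONS` table and the `substitute` function are hypothetical names, and plain case-insensitive phrase matching is an assumption.

```python
import re

# Hypothetical table: mis-transcribed phrase -> corrected text.
SUBSTITUTIONS = {
    "Beijing spoke": "Basingstoke",
}


def substitute(transcript: str, table: dict[str, str]) -> str:
    """Replace each listed phrase with its correction (whole words, any case)."""
    for wrong, right in table.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement keeps the corrected text literal,
        # so backslashes in it are never treated as escapes.
        transcript = re.sub(
            pattern, lambda _m, r=right: r, transcript, flags=re.IGNORECASE
        )
    return transcript


print(substitute("Our office is in Beijing spoke", SUBSTITUTIONS))
# prints: Our office is in Basingstoke
```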
...
In the following example, the categorization profile SubstitutionRules contains three substitution rules.
...
The categorization expression language describes the format of the value in the Search phrase field. The language supports simple values, where the presence of the exact word or phrase results in a match. For information about writing expressions, see Categorization expression language.
Replace with
When creating or editing a substitution rule, define the value that will replace the found text in the Replace with field.
...
Examples of overlapping rules
Example 1. We want to replace "credit card" with "payment method" and remove the credit card number.

Transcription text: My credit card is 1234567890123456

Substitution rules:
Rule 1: Search phrase: credit card; Replace with: payment method
Rule 2: Search phrase: credit card #*; Words between: 5; Replace with: (credit card information redacted)

Intended text: My (credit card information redacted)
Processed text: My payment method is 1234567890123456

Why: Rules 1 and 2 overlap. In this scenario, Conversation Analyzer applies rule 1, because rule 1 has higher priority, and discards rule 2. The result is that the credit card number is still exposed.
Solution: Redact first, substitute after.
Example 2. We want to remove all strings of three or more digits because they can contain sensitive information. However, we want to label PIN numbers differently from credit card numbers.

Transcription text: My PIN is 1234

Substitution rules:
Rule 1: Search phrase: ###*; Replace with: (redacted)
Rule 2: Search phrase: credit card ################; Words between: 5; Replace with: (credit card has been redacted)
Rule 3: Search phrase: PIN ####; Words between: 5; Replace with: (PIN has been redacted)

Intended text: My (PIN has been redacted)
Processed text: My PIN is (redacted)

Why: Rules 1 and 3 overlap. In this scenario, Conversation Analyzer applies rule 1, because rule 1 has higher priority, and discards rule 3. The result is that the more general replacement "(redacted)" is applied instead of the more specific "(PIN has been redacted)".
Solution: Write more specific rules first, followed by more general (catch-all) rules later.
Example 3. Due to the highly sensitive nature of passwords, we want to remove user account names and remove all text that contains a password.

Transcription text: My account name is administrator and my password is Jupiter, with upper case J

Substitution rules:
Rule 1: Search phrase: account name is *; Replace with: (account name redacted)
Rule 2: Search phrase: * password *; Replace with: (password redacted)

Intended text: My (account name redacted) and (password redacted)
Processed text: My (account name redacted)

Why: In this scenario, Conversation Analyzer applies rule 1, because rule 1 has higher priority than rule 2. In removing the account name, rule 1 removes the whole of the password text too. Rule 2 does not match the remaining text.
Solution: Write your rules in order of most sensitive to least sensitive. Avoid using operators like * and ~ as much as possible.
Example 4. For a dogwalking service, we want to improve the transcription with more accurate, business-related words.

Transcription text: I have a big hunting dog

Substitution rules:
Rule 1: Search phrase: big hunting dog; Replace with: hound
Rule 2: Search phrase: I have * dog; Replace with: I am a dog owner
Rule 3: Search phrase: have; Replace with: look after

Processed text: I look after a hound

Why: In this scenario, Conversation Analyzer applies rule 1. Rule 2 overlaps rule 1, so Conversation Analyzer discards rule 2. Rule 3 overlaps rule 2 only, but because Conversation Analyzer has discarded rule 2, rule 3 can be applied.
Solution: Write your substitution rules in order of importance.
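The priority behavior in these examples can be reproduced with a small simulation. The sketch below is not the Conversation Analyzer implementation: `Rule` and `apply_rules` are hypothetical names, and the Python regular expressions are hand-written stand-ins for the search phrases, not the categorization expression language. It assumes only what the examples describe: rules are tried in priority order, and a match that overlaps text already claimed by a higher-priority rule is discarded.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    pattern: str      # Python regex standing in for the Search phrase (assumption)
    replacement: str  # value of the Replace with field


def apply_rules(text: str, rules: list[Rule]) -> str:
    """Apply rules in list order; earlier rules have higher priority."""
    accepted: list[tuple[int, int, str]] = []
    for rule in rules:
        for m in re.finditer(rule.pattern, text, flags=re.IGNORECASE):
            start, end = m.span()
            # Discard a match that overlaps text claimed by an earlier rule.
            if any(start < e and s < end for s, e, _ in accepted):
                continue
            accepted.append((start, end, rule.replacement))
    # Replace from right to left so earlier offsets stay valid.
    for start, end, repl in sorted(accepted, reverse=True):
        text = text[:start] + repl + text[end:]
    return text


TEXT = "My credit card is 1234567890123456"
substitute_rule = Rule(r"credit card", "payment method")
# "credit card", up to five words in between, then a run of digits.
redact_rule = Rule(
    r"credit card(?:\s+\S+){0,5}?\s+\d+", "(credit card information redacted)"
)

print(apply_rules(TEXT, [substitute_rule, redact_rule]))
# prints: My payment method is 1234567890123456
print(apply_rules(TEXT, [redact_rule, substitute_rule]))
# prints: My (credit card information redacted)
```

Swapping the two rules is the "redact first, substitute after" fix from Example 1. Under the same assumptions, the three rules of Example 4 produce "I look after a hound", because the discarded rule 2 no longer blocks rule 3.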
...