XXXX redacted: Difference between revisions
Content added Content deleted
Thundergnat (talk | contribs) m (Clarify) |
Thundergnat (talk | contribs) m (more clarification) |
||
Line 2: | Line 2: | ||
You've been given a contract from a three letter abbreviation government agency. They want a program to automatically redact sensitive information from documents to be released to the public. They want fine control over what gets redacted though. |
You've been given a contract from a three letter abbreviation government agency. They want a program to automatically redact sensitive information from documents to be released to the public. They want fine control over what gets redacted though. |
||
Given a piece of free-form, possibly Unicode text, (assume text only, no markup or formatting codes) they want to be able to redact: whole words, (case sensitive or insensitive) or partial words, (case sensitive or insensitive). Further, they want the option to "overkill" redact a partial word. Overkill redact means if the word ''contains'' the redact target, even |
Given a piece of free-form, possibly Unicode text, (assume text only, no markup or formatting codes) they want to be able to redact: whole words, (case sensitive or insensitive) or partial words, (case sensitive or insensitive). Further, they want the option to "overkill" redact a partial word. Overkill redact means if the word ''contains'' the redact target, even if is only part of the word, redact the '''entire''' word. |
||
For our purposes, a "word" here, means: a character group, separated by white-space and possibly punctuation; not necessarily strictly alphabetic characters. To "redact" a word or partial word, replace each character in the redaction target with a capital letter 'X'. There should be the same number of graphemes in the final redacted word as there were in the non-redacted word. |
For our purposes, a "word" here, means: a character group, separated by white-space and possibly punctuation; not necessarily strictly alphabetic characters. To "redact" a word or partial word, replace each character in the redaction target with a capital letter 'X'. There should be the same number of graphemes in the final redacted word as there were in the non-redacted word. |