
Talk:Determine sentence type



I'm a little puzzled why this task has a "Recursion" category tag. AFAICT, none of the existing examples use recursion, and I would be hard pressed to come up with a way that recursion could be useful for this task. --Thundergnat (talk) 09:03, 7 November 2021 (UTC)

I can't see what this task has to do with recursion either. Another puzzling aspect is that the text you're supposed to use is actually 4 sentences, not one. So just by doing the task, you automatically qualify for extra credit! --PureFox (talk) 09:24, 7 November 2021 (UTC)

Is the task description written in English or American?

The task description is grammatically incorrect in either:

Use this sentence: "hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it"

is not a sentence in either English or American.

In English it should be:

Use (these sentences) (this paragraph): "Hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it."

In American it should be:

Use (these sentences) (this paragraph): "Hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it".

(see use of the period (or full stop) when a sentence ends with a quotation).

It is also possible to use exclamation marks within a sentence. Full stops are frequently used within a sentence, e.g. N. Galloway. The last sentence uses 4. A sentence may also end with !?. The task description "Search for the last used punctuation in a sentence" may cover this but the solutions don't, Factor excepted. More test cases are required.

Factor uses a list of 'common' abbreviations including i.e. but not e.g. (see How and when do you write etc, ie and eg?), which advises:

However, some style guides do say that ‘eg’ and ‘ie’ should have full stops. (And to emphasise that fact, my grammar checker has just ‘helpfully’ underlined those terms now that I’ve typed them.)

Sometimes even dictionaries don’t help. Collins English Dictionary, for example, says that e.g., eg. and eg are all acceptable. But confusingly, it lists only i.e. (not ie or ie.), which makes no sense.

In short: you can write etc, ie and eg with or without full stops. But make sure you pick one style for all abbreviations and stick to it.

Factor appears to require a mixture, which makes no sense. --Nigel Galloway (talk) 16:03, 8 November 2021 (UTC)

Consider it to be Engl. It's sort-of Engl-ish.
If it was a pre-requisite that tasks be grammatically and semantically correct, about half (or more) of the existing tasks would not exist. While I don't disagree about the suspect task text, or the (lack of) coverage of edge conditions, whining about it without proposing any alternatives is counter-productive. To be fair, just about any rule you could come up with regarding the structure and layout of English (American or otherwise) has some counter-example, so it will be difficult to come up with a comprehensive solution. The Lingua::EN::Sentence module used in the Raku entry has several large tables of abbreviations, grammatical constructs, and exceptions to try to intelligently break blocks of text into sensible sentences; but even that is probably only about 95-98% accurate, especially if you start throwing deliberately obfuscatory constructs at it. Is it good enough? Depends on your set of circumstances. Is it better than a few regexes? I would hope so, but again, depends on your requirements. --Thundergnat (talk) 17:24, 8 November 2021 (UTC)
Sooner or later we will be seeing junk/ironic tasks generated by GPT-3, or something like it.
I don't think this has started yet, but we do seem to be falling into some kind of hyper-inflationary gravity well or general collapse of quality. I don't know why ...
Time to think about making the pre-requisites fractionally more visible and robust?
Triage by punctuation mark seems to lack a certain something ... Hout (talk) 19:40, 8 November 2021 (UTC)
I hope you are not accusing me of "whining about it without proposing any alternatives" and being "counter-productive". I proposed 2 alternatives, one following English grammatical conventions and one following American grammatical conventions. The task author wrote the task description using 5 sentences, only one of which had End of Sentence Punctuation, an exclamation mark. I have followed your suggestion and added End of Sentence Punctuation using English grammatical conventions. Note that the task now does not make sense: there is no such thing as Neutral Sentence Type. Every sentence must have End of Sentence Punctuation, even in American! --Nigel Galloway (talk) 14:07, 9 November 2021 (UTC)
You are right about "Neutral sentences", but is there such a thing as a Serious sentence either?
Your change has invalidated several (most?) of the existing samples, including that of the task author, also making it impossible to detect a "neutral sentence" (which appears to be defined for the task as something that doesn't end in ?, ! or .).
The task doesn't specify that the sentences are English or American or anything else.
Would it be better if the task said:
Parse the following according to this (somewhat informal) grammar:
paragraph = sentence+ neutral-sentence?
sentence = ( exclamation | question | serious )
exclamation = neutral-sentence "!"
question = neutral-sentence "?"
serious = neutral-sentence "."
neutral-sentence = <any-character-except-?!.>+
(postfix ? means optional, postfix + means 1-or-more, things in double-quotes indicate literal characters that must appear as written.)
The names of the terminals and non-terminals in the grammar should not be interpreted as conferring any meaning as words in any particular natural language, even if they are spelt the same way (apart from "any-character-except-..."). --Tigerofdarkness (talk) 18:47, 9 November 2021 (UTC)
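For illustration only, here is a minimal Python sketch of a recognizer for the informal grammar above. The names parse_paragraph, SENTENCE and NAMES are hypothetical, and rejecting input that does not fit the grammar with a ValueError is an assumption; the proposal does not say what should happen in that case.

```python
import re

# neutral-sentence = one or more characters other than ?, ! or .
# sentence         = neutral-sentence followed by exactly one of ! ? .
SENTENCE = re.compile(r"([^?!.]+)([?!.])")
NAMES = {"!": "exclamation", "?": "question", ".": "serious"}


def parse_paragraph(text):
    """Return (kind, body) pairs for: paragraph = sentence+ neutral-sentence?

    Raises ValueError when the text does not fit the grammar (an assumption;
    the grammar itself does not say how to treat non-matching input).
    """
    result = []
    pos = 0
    while True:
        match = SENTENCE.match(text, pos)
        if match is None:
            break
        result.append((NAMES[match.group(2)], match.group(1)))
        pos = match.end()
    if not result:
        raise ValueError("a paragraph needs at least one terminated sentence")
    rest = text[pos:]
    if rest:
        # Anything left over must be a trailing neutral-sentence, so it may
        # not contain punctuation (this rejects doubled marks such as "!?").
        if re.search(r"[?!.]", rest):
            raise ValueError("unexpected punctuation at offset %d" % pos)
        result.append(("neutral", rest))
    return result


sample = ("hi there, how are you today? I'd like to present to you the "
          "washing machine 9001. You have been nominated to win one of "
          "these! Just make sure you don't break it")
for kind, body in parse_paragraph(sample):
    print(kind, "->", body.strip())
```

Read literally, the grammar rejects a doubled mark such as "!?" and any text with no terminated sentence at all, which the sketch reports as errors rather than guessing a type.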
It would not. This is just a complicated way of expressing the algorithm: given a string S, for each character C in S, if C is '.' output 'S', if C is '!' output 'E', if C is '?' output 'Q'. If the last character in S is not one of '.', '?', or '!' output 'N'. The string could be "xxxxxxx.xxxxxx?xxxx!xxxx"; this 'task' has nothing to do with sentences! --Nigel Galloway (talk) 17:51, 11 November 2021 (UTC)
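The character scan described in the comment above can be sketched in a few lines of Python. The function name classify and the "|" separator are assumptions made for illustration, as is returning an empty result for an empty string, which the description leaves open.

```python
def classify(s):
    """Emit one code per '.', '!' or '?' in s, plus 'N' for an unterminated tail."""
    marks = {".": "S", "!": "E", "?": "Q"}
    # One output code for every punctuation character, in order of appearance.
    codes = [marks[c] for c in s if c in marks]
    # A non-empty string whose last character is not a mark ends "neutral".
    if s and s[-1] not in marks:
        codes.append("N")
    return "|".join(codes)


print(classify("xxxxxxx.xxxxxx?xxxx!xxxx"))  # prints S|Q|E|N
```

The scan never looks at anything except the three punctuation characters, so a string with no words in it is classified just as readily as the task's sample text.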