Talk:I before E except after C

From Rosetta Code

EE pronunciation

It may be more useful to evaluate plausibility of "I before E except after C whenever the pronunciation is EE".

So we are looking for plausibility of "I before E whenever the pronunciation is EE (except after C)" and the plausibility of "E before I after C, or whenever the pronunciation is not EE".

Could we integrate that into this task?

Markhobley 10:30, 7 January 2013 (UTC)

Hi Mark. I did see that mentioned as a way to make the phrase more accurate, but I dismissed it because:
  1. I would need to get hold of word pronunciations.
  2. I really wanted to add to the call that this particular aide-mémoire should be dropped altogether.
--Paddy3118 15:11, 7 January 2013 (UTC)
For the pronunciations, just add a flag against each value: Y - pronunciation is EE, N - pronunciation is not EE.
We could then evaluate plausibility with and without factoring in the pronunciation and compare the results.
This would make sense in terms of a real evaluation of the phrase. Should it be "I before E except after C", or "I before E except after C whenever the pronunciation is EE", or is neither phrase plausible? The solutions to this task could provide an answer.
What happens after C if the pronunciation is not EE? Does the EI become IE? Maybe the solutions could also determine the plausibility of "Pronunciation is NOT EE, but still EI after C".
Markhobley 01:04, 21 January 2013 (UTC)
Hi Mark, can you think of a way to automate "just add a flag against each value ..."? --Paddy3118 07:30, 21 January 2013 (UTC)
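
A minimal sketch of the flag idea discussed above, in Python. It does not automate the flagging: the word list and its Y/N flags are a hypothetical hand-made sample, not data from the task's dictionary. The names `PATTERNS`, `tally` and `plausible` are invented for illustration, and the 2:1 threshold in `plausible` is an assumption about how the task defines plausibility.

```python
import re
from collections import Counter

# Hypothetical hand-flagged sample: 'Y' means the ie/ei is pronounced EE, 'N' means it is not.
FLAGGED = [
    ("believe", "Y"), ("field", "Y"), ("receive", "Y"), ("ceiling", "Y"),
    ("science", "N"), ("their", "N"), ("height", "N"), ("friend", "N"),
]

# 'ie' and 'ei' only match when NOT preceded by 'c'; 'cie' and 'cei' cover the after-C cases.
PATTERNS = {
    "ie": re.compile(r"(?<!c)ie"),
    "ei": re.compile(r"(?<!c)ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}

def tally(flagged, ee_only=False):
    """Count words per category, optionally keeping only words flagged 'Y'."""
    counts = Counter()
    for word, flag in flagged:
        if ee_only and flag != "Y":
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(word):
                counts[name] += 1
    return counts

def plausible(supporting, opposing, ratio=2):
    """Assumed rule: plausible when support exceeds `ratio` times the opposition."""
    return supporting > ratio * opposing

for label, ee_only in (("all words", False), ("EE words only", True)):
    c = tally(FLAGGED, ee_only)
    print(label, dict(c),
          "I before E when not after C:", plausible(c["ie"], c["ei"]),
          "E before I after C:", plausible(c["cei"], c["cie"]))
```

Comparing the two printed lines shows how the verdict can change once the pronunciation flag is factored in. The hard part raised above, obtaining the flags automatically, is left open.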

Multiple IE or EI in the same word

Should we count words or occurrences of "ie" and "ei"?

How should we handle words that have both "ie" and "ei", or multiple "ie" or multiple "ei"? --PauliKL 09:35, 21 January 2013 (UTC)

It is OK if a word is in more than one group. --Paddy3118 20:22, 21 January 2013 (UTC)
Since these are rare, they do not have a significant influence on the result. Note also that it's entirely possible that the dictionary will change (or will have changed), but we do not expect that this will be enough of a change to matter. If we were concerned with exact counts, instead of plausibility, the task would need to be structured differently. --Rdm (talk) 17:08, 1 April 2013 (UTC)
Specifically, four words with both ei and ie: eightieth liechtenstein meier weierstrass, one word with multiple instances of ie: siegfried, and four words with multiple instances of ei: einstein einsteinian einsteinium weinstein -- that's a total of nine words, none of which have a significant 'c' prefix (the only word with 'c' uses it in 'ch'), and that's just not enough to matter for this task. (A much bigger issue is "what is it that decides whether two distinct spellings are the same word or a different word".) --Rdm (talk) 17:19, 1 April 2013 (UTC)
Thanks for the stats Rdm. I guess I could add "Words that could be in multiple categories should be counted in those multiple categories" to the task to nail it down, as I wasn't thinking of doing anything more than showing how tenuous the "rule" was. At the moment though I think people referring to the talk page should be OK. --Paddy3118 (talk) 07:06, 2 April 2013 (UTC)
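
To make the words-versus-occurrences question concrete, here is a small Python sketch that counts both ways on four of the words listed above. The function names `count_words` and `count_occurrences` are hypothetical. Counting each word once per category it belongs to follows the "more than one group" reading given in this thread.

```python
import re
from collections import Counter

# 'ie' and 'ei' only match when NOT preceded by 'c'; 'cie' and 'cei' cover the after-C cases.
PATTERNS = {
    "ie": re.compile(r"(?<!c)ie"),
    "ei": re.compile(r"(?<!c)ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}

def count_words(words):
    """Each word adds 1 to every category it matches, however many times it matches."""
    counts = Counter()
    for word in words:
        for name, pattern in PATTERNS.items():
            if pattern.search(word):
                counts[name] += 1
    return counts

def count_occurrences(words):
    """Each individual match adds 1, so 'siegfried' adds 2 to 'ie'."""
    counts = Counter()
    for word in words:
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(word))
    return counts

SAMPLE = ["eightieth", "liechtenstein", "siegfried", "einstein"]
print("by word:      ", dict(count_words(SAMPLE)))
print("by occurrence:", dict(count_occurrences(SAMPLE)))
```

On a full dictionary the two counts differ only by the handful of words identified above, which is why the choice does not affect plausibility.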

Stretch goal

Added after this comment in the J entry:

Note that if we looked at frequency of use for words, instead of considering all words to have equal weights, we might come up with a different answer.

I found a list of word frequencies and saw that it could be done. --Paddy3118 (talk) 18:53, 16 April 2013 (UTC)
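
A rough Python sketch of what frequency weighting changes. The frequencies below are made-up illustrative numbers, not values from the word-frequency list mentioned above, and `weighted_tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# 'ie' and 'ei' only match when NOT preceded by 'c'; 'cie' and 'cei' cover the after-C cases.
PATTERNS = {
    "ie": re.compile(r"(?<!c)ie"),
    "ei": re.compile(r"(?<!c)ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}

# Hypothetical frequencies (word -> usage count), invented purely for illustration.
FREQS = {
    "their": 1000, "believe": 100, "friend": 90, "field": 80,
    "science": 40, "receive": 30, "height": 20,
}

def weighted_tally(freqs, weighted=True):
    """Sum frequencies per category; with weighted=False every word counts as 1."""
    counts = Counter()
    for word, freq in freqs.items():
        for name, pattern in PATTERNS.items():
            if pattern.search(word):
                counts[name] += freq if weighted else 1
    return counts

print("equal weights:     ", dict(weighted_tally(FREQS, weighted=False)))
print("frequency weighted:", dict(weighted_tally(FREQS)))
```

With these invented numbers, a single very common word such as "their" outweighs several rarer "ie" words, which is the kind of shift the J entry's comment anticipates.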

C solution

lex, yacc, flex, and bison aren't separate languages at Rosetta Code, nor should they be. I maintain that the flex program is a proper C solution, and that such examples deserve the spotlight. --LambertDW 01:41, 7 May 2013 (UTC)

On modern C++ as a scripting language

The blog post "Translating a Rosetta Code entry from Perl to C++" by A. Sinan Unur comments on the existing C++ code and compares modern C++ with scripting languages for this task.
--Paddy3118 (talk) 18:35, 3 July 2015 (UTC)