Selection bias in clinical sciences: Difference between revisions
Revision as of 08:40, 28 September 2022
(Initial task creation and Python example) |
m (clean up task formatting) |
Line 3:
In epidemiology, retrospective analyses have well-known limitations compared to prospective studies.
One such limitation is the
and untreated groups about whom the data is collected. For example, a treatment may have only been
given to persons who were less severely ill, which would bias the results in favor of such subjects
Line 12:
retrospective study is the topic of this task.
The real, historical example approximated in this task is a study of persons who, over the course
of 180 days, may or may not have become infected with Covid-19. Before becoming ill, these subjects may
or may not have taken an available medication, usually in doses of 3, 6, or 9 mg daily.
study medication:
* The probability of starting the medication, for anyone not already taking it, was 0.5% per day. For those who had started, the probability of taking it on any given day was 50 times higher, at 25% per day, since most who started the medication continued to take it to some extent.
* The dose on any day it is taken is chosen at random from 3, 6, and 9 mg. The cumulative dose to date determines the group the subject is in, unless the subject develops Covid-19. If a subject was diagnosed with Covid-19, their group at the time of that diagnosis is the one used in the statistical analysis.
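The two rules above can be sketched for a single subject as follows. This is only one possible reading of the rules, not the task's reference solution: the names `START_PROB`, `CONTINUE_PROB`, `DOSES_MG` and `simulate_cumulative_dose` are hypothetical, and the sketch deliberately stops at the cumulative dose, leaving out group assignment and infection.

```python
import random

START_PROB = 0.005     # daily chance of a first dose for a subject not yet treated
CONTINUE_PROB = 0.25   # daily chance of a dose once treatment has started
DOSES_MG = (3, 6, 9)   # possible daily doses in mg
STUDY_DAYS = 180


def simulate_cumulative_dose(rng, days=STUDY_DAYS):
    """Return the cumulative dose (mg) at the end of each study day."""
    started = False
    cumulative = 0
    history = []
    for _ in range(days):
        # The chance of a dose today depends on whether treatment has begun.
        dose_prob = CONTINUE_PROB if started else START_PROB
        if rng.random() < dose_prob:
            started = True
            cumulative += rng.choice(DOSES_MG)
        history.append(cumulative)
    return history


rng = random.Random(2022)  # fixed seed (arbitrary) so the run is repeatable
history = simulate_cumulative_dose(rng)
print("Cumulative dose after 180 days:", history[-1], "mg")
```

Because the cumulative dose can only grow with time, a subject needs to stay in the study for a while before reaching a high total.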
;Task:
* Use at least 1000 subjects in the simulation over the 180 days (historically, the study size was 80,000).
* Statistics used are to be the Kruskal statistic for the analysis of multiple groups, with the boolean study outcome variable whether the subject got Covid-19 during the study period, analyzed versus category.
* You should get a statistical result highly favoring the REGULAR group.
; Stretch task: show monthly outcomes.
A note regarding outcome: by simulation design, all subjects must have an IDENTICAL risk of developing Covid-19, namely 0.1 percent (p = 0.001) per day. Because of this design, any statistical differences between the groups CANNOT come from an influence of the treatment on that risk, but must come from some other feature of the study design.
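As a rough check on what that identical daily risk implies, the expected share of subjects infected by a given day can be computed directly. This is a small sketch that assumes the daily risks are independent; `cumulative_risk` is a hypothetical helper name, not part of the task.

```python
DAILY_RISK = 0.001   # identical for every subject, treated or not
STUDY_DAYS = 180


def cumulative_risk(daily_risk, days):
    """Probability of at least one infection in `days` independent daily trials."""
    return 1 - (1 - daily_risk) ** days


for day in (30, 90, STUDY_DAYS):
    print(f"Day {day}: {cumulative_risk(DAILY_RISK, day):.2%}")
```

Under this assumption, roughly one subject in six is expected to be infected by day 180, whichever group they end up in.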
;See also:
Line 147 ⟶ 136:
s.had_covid for s in population if s.category == IRREGULAR]
regular = [s.had_covid for s in population if s.category == REGULAR]
print('\
Line 153 ⟶ 142:
</syntaxhighlight>{{out}}
<pre>
Total subjects: 1,000
Day 30:
Untreated: N = 872, with infection = 25
Irregular Use: N = 128, with infection = 2
Regular Use: N = 0, with infection = 0
Day 60:
Untreated: N = 755, with infection = 55
Irregular Use: N = 222, with infection = 8
Regular Use: N = 23, with infection = 1
Day 90:
Untreated: N = 671, with infection = 70
Irregular Use: N = 219, with infection = 13
Regular Use: N = 110, with infection = 4
At midpoint, Infection case percentages are:
Untreated : 10.432190760059612
Irregulars: 5.936073059360731
Regulars : 3.6363636363636362
Day 120:
Untreated: N = 600, with infection = 88
Irregular Use: N = 189, with infection = 17
Regular Use: N = 211, with infection = 8
Day 150:
Untreated: N = 514, with infection = 108
Irregular Use: N = 194, with infection = 21
Regular Use: N = 292, with infection = 16
Day 180:
Untreated: N = 447, with infection = 119
Irregular Use: N = 189, with infection = 26
Regular Use: N = 364, with infection = 26
At study end, Infection case percentages are:
Untreated : 26.62192393736018 of group size of 447
Irregulars: 13.756613756613756 of group size of 189
Regulars : 7.142857142857143 of group size of 364
Final statistics: KruskalResult(statistic=55.48204323818349, pvalue=8.95833684545873e-13)
</pre>
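The `KruskalResult` line in the output above suggests the example uses SciPy's `scipy.stats.kruskal`. As an illustration of what that statistic measures, here is a dependency-free sketch of the Kruskal-Wallis H test with tie correction, applied to the Day 180 counts from the output. The function names are hypothetical, and the p-value shortcut only holds for exactly three groups (2 degrees of freedom).

```python
import math
from collections import Counter


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for exactly three groups, with the usual tie correction."""
    groups = (a, b, c)
    pooled = [v for g in groups for v in g]
    n = len(pooled)

    # Tied observations share the mean of the ranks they would occupy.
    counts = Counter(pooled)
    mid_rank = {}
    next_rank = 1
    for value in sorted(counts):
        ties = counts[value]
        mid_rank[value] = next_rank + (ties - 1) / 2
        next_rank += ties

    rank_term = sum(sum(mid_rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction; it is zero (and H undefined) if every value is identical.
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    h /= correction

    # With 3 groups there are 2 degrees of freedom, where the chi-square
    # survival function reduces to exp(-H / 2).
    return h, math.exp(-h / 2)


def outcomes(group_size, infected):
    """Boolean had-Covid list for a group, built from its summary counts."""
    return [True] * infected + [False] * (group_size - infected)


# Day 180 counts taken from the sample output above.
untreated = outcomes(447, 119)
irregular = outcomes(189, 26)
regular = outcomes(364, 26)

h_stat, p_value = kruskal_wallis_three_groups(untreated, irregular, regular)
print(f"H = {h_stat:.4f}, p = {p_value:.3e}")
```

Run on these counts, the sketch should agree with the statistic and p-value printed in the sample output, which confirms that the reported result follows from the final group sizes and infection counts alone.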