crushllka.blogg.se

P-hacking versus data dredging

An illustration of how opportunities for false positives can mount up when one has a large dataset and a flexible approach to analysis. Suppose a researcher is interested in testing whether handedness is associated with attention deficit hyperactivity disorder (ADHD), and has access to a large dataset that has measures of both hand skill and hand preference at 6 years and 10 years. Suppose further that there is no true association in the population.

The green node corresponds to the two-group comparison, where probability of obtaining a p-value < [...]. A researcher who compares ADHD (A) and typical (T) children (green node) may be disappointed to find the comparison is nonsignificant. He might then decide to do a subgroup analysis, because the association looks different in older (O) and younger children (Y) (purple nodes). He might then realise that results vary for measures of hand skill (S) vs hand preference (P) (blue nodes). He might then decide to subdivide the sample by gender (M vs F) (orange nodes) and according to whether children are from urban (U) rather than rural (R) areas (khaki nodes).

If all these combinations of possibilities were to be considered, the chance of finding at least one 'significant' result rises to .56 (assuming the different choices are independent). When doing numerous comparisons, there are legitimate methods for adjusting p-values to guard against false positives, but in practice these are often ignored. If the researcher presents what looks like a significant finding (ADHD and typical young, urban females differ on a measure of relative hand skill), which is in reality selected from a wide range of possible contrasts, then uncorrected p-values can be highly misleading.

Figure title is inspired by Gelman, A., & Loken, E. The garden of forking paths: Why multiple comparisons can be a problem, even when there is no 'fishing expedition' or 'p-hacking' and the research hypothesis was posited ahead of time. Figure created by Tim Brock.

I never saw the "data mining bias" term before.
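The arithmetic behind the forking-paths figure can be checked with a short Python sketch. It makes two assumptions that are not stated in the text here: a conventional significance threshold of 0.05, and Holm's step-down procedure as one example of a legitimate p-value adjustment (the caption does not say which method it has in mind). The names `familywise_error_rate` and `holm_adjust` are hypothetical helpers written for this illustration.

```python
from itertools import product

# Assumed threshold: 0.05 is the conventional choice, not a value taken from the text.
ALPHA = 0.05

# The four binary forks named in the caption: age, measure, gender, area.
FORKS = {
    "age": ("O", "Y"),
    "measure": ("S", "P"),
    "gender": ("M", "F"),
    "area": ("U", "R"),
}

# Every route through the tree is one possible comparison.
paths = list(product(*FORKS.values()))


def familywise_error_rate(n_tests, alpha=ALPHA):
    """Chance of at least one p < alpha in n independent tests with no true effect."""
    return 1.0 - (1.0 - alpha) ** n_tests


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    print(len(paths))                                   # 16
    print(round(familywise_error_rate(len(paths)), 2))  # 0.56
```

Under these assumptions the 16 combinations reproduce the .56 quoted above. A single comparison with a raw p-value of 0.04 would no longer fall below 0.05 after Holm adjustment across 16 tests, since the smallest p-value in the set is multiplied by 16.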





