I am trying to find out whether there is a significant difference in the mean value of a biomarker between two groups. I am using
t.test
in R. Themean(SD)
values are1142(1079)
and864(922)
in the groups. But the p-values of the test shows the difference is not statistically significant. Can someone please help me? I am sharing thedput
of the dataframe below.structure(list(ANGPTL7 = c(2.5, 205, 885, 1915, 835, 1685, 625, 1615, 84.9999999999999, 1175, 2695, 235, 1025, 2.5, 2915, 825, 255, 1085, 1815, 2.5, 205, 985, 2.5, 705, 435, 555, 2045, 135, 15, 975, 2285, 1905, 515, 74.9999999999999, 25, 815, 1075, 2.5, 1115, 3115, 64.9999999999999, 64.9999999999999, 325, 595, 285, 2.5, 2.5, 345, 5.00000000000001, 215, 3465, 555, 855, 3745, 25, 305, 2.5, 2.5, 15, 115, 565, 94.9999999999999, 1005, 575, 405, 2.5, 1855, 1795, 145, 2555, 1705, 74.9999999999999, 735, 375, 2.5, 475, 1675, 1105, 345, 385, 3195, 115, 1475, 205, 545, 1265, 485, 1135, 2595, 3305, 305, 575, 1415, 2925, 3125, 2795, 3125, 1775, 1125, 15, 1695, 1225, 1625, 3175, 3185, 1445, 3065, 785, 855, 1115, 145, 595, 435, 185, 345, 2455, 1885), OSA_status = c("Non-OSA", "OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "OSA", "OSA", "Non-OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "Non-OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "Non-OSA", "Non-OSA", "Non-OSA", "OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "Non-OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "OSA", "Non-OSA", "OSA", "OSA")), row.names = c(NA, -117L), class = c("tbl_df", "tbl", "data.frame"))
EDIT This data example is great for R users but a small or large pain for everyone else. This format may or may not be easier, depending.
id ANGPTL7 OSA_status 1 2.5 Non-OSA 2 205.0 OSA 3 885.0 Non-OSA 4 1915.0 OSA 5 835.0 Non-OSA 6 1685.0 OSA 7 625.0 Non-OSA 8 1615.0 OSA 9 85.0 Non-OSA 10 1175.0 Non-OSA 11 2695.0 Non-OSA 12 235.0 OSA 13 1025.0 OSA 14 2.5 OSA 15 2915.0 OSA 16 825.0 OSA 17 255.0 Non-OSA 18 1085.0 Non-OSA 19 1815.0 OSA 20 2.5 OSA 21 205.0 Non-OSA 22 985.0 Non-OSA 23 2.5 OSA 24 705.0 Non-OSA 25 435.0 OSA 26 555.0 OSA 27 2045.0 OSA 28 135.0 OSA 29 15.0 Non-OSA 30 975.0 Non-OSA 31 2285.0 Non-OSA 32 1905.0 OSA 33 515.0 OSA 34 75.0 OSA 35 25.0 OSA 36 815.0 Non-OSA 37 1075.0 OSA 38 2.5 OSA 39 1115.0 OSA 40 3115.0 Non-OSA 41 65.0 Non-OSA 42 65.0 Non-OSA 43 325.0 Non-OSA 44 595.0 Non-OSA 45 285.0 OSA 46 2.5 Non-OSA 47 2.5 Non-OSA 48 345.0 Non-OSA 49 5.0 OSA 50 215.0 Non-OSA 51 3465.0 OSA 52 555.0 OSA 53 855.0 OSA 54 3745.0 OSA 55 25.0 Non-OSA 56 305.0 Non-OSA 57 2.5 OSA 58 2.5 OSA 59 15.0 Non-OSA 60 115.0 OSA 61 565.0 OSA 62 95.0 Non-OSA 63 1005.0 Non-OSA 64 575.0 Non-OSA 65 405.0 Non-OSA 66 2.5 Non-OSA 67 1855.0 OSA 68 1795.0 OSA 69 145.0 OSA 70 2555.0 OSA 71 1705.0 OSA 72 75.0 OSA 73 735.0 OSA 74 375.0 OSA 75 2.5 OSA 76 475.0 OSA 77 1675.0 OSA 78 1105.0 Non-OSA 79 345.0 OSA 80 385.0 Non-OSA 81 3195.0 OSA 82 115.0 Non-OSA 83 1475.0 Non-OSA 84 205.0 OSA 85 545.0 Non-OSA 86 1265.0 OSA 87 485.0 Non-OSA 88 1135.0 OSA 89 2595.0 OSA 90 3305.0 OSA 91 305.0 OSA 92 575.0 Non-OSA 93 1415.0 Non-OSA 94 2925.0 Non-OSA 95 3125.0 Non-OSA 96 2795.0 Non-OSA 97 3125.0 OSA 98 1775.0 Non-OSA 99 1125.0 Non-OSA 100 15.0 Non-OSA 101 1695.0 OSA 102 1225.0 Non-OSA 103 1625.0 OSA 104 3175.0 OSA 105 3185.0 OSA 106 1445.0 OSA 107 3065.0 Non-OSA 108 785.0 Non-OSA 109 855.0 OSA 110 1115.0 OSA 111 145.0 OSA 112 595.0 OSA 113 435.0 OSA 114 185.0 OSA 115 345.0 Non-OSA 116 2455.0 OSA 117 1885.0 OSA
Answer
I agree with @pikachu that the standard deviations are too large
compared with the difference between means for a t test to find
a significant difference.
Thank you for posting your data. It is always a good idea to take a look at some graphic displays
of the data before doing formal tests.
Stripcharts of observations in the two groups do not show a meaningful difference
in locations relative to the variability of the samples.
stripchart(ANGPTL7 ~OSA_status, pch="|", ylim=c(.5,2.5))
Here are boxplots of the two groups. The ‘notches’ in the the sides of
the boxes are nonparametric confidence intervals, calibrated so that
overlapping notches tend to indicate no significant difference in location.
boxplot(ANGPTL7 ~ OSA_status, notch=T,
col="skyblue2", horizontal=T)
Even with sample sizes as large as these, I would be reluctant to do a two-sample t test on account of the marked skewness of the data. I would do a nonparametric two-sample Wilcoxon rank sum test (which also shows no
significant difference).
wilcox.test(ANGPTL7 ~ OSA_status)
Wilcoxon rank sum test with continuity correction
data: ANGPTL7 by OSA_status
W = 1456.5, p-value = 0.2139
alternative hypothesis: true location shift is not equal to 0
Attribution
Source : Link , Question Author : arshad , Answer Author : BruceET