R/post-hoc-test.R
run_posthoc_test.Rd
Multiple-group tests, such as ANOVA and the Kruskal-Wallis rank sum test, can be used to identify features that differ significantly across all groups. Post hoc tests are then used to identify which specific pairs of groups differ in their means.
a phyloseq::phyloseq object
character, the name of the sample variable that defines the groups
character, the method used to transform the microbial abundances. See transform_abundances() for more details. The options include:
"identity": return the original data without any transformation (default).
"log10": the transformation is log10(object); if the data contain zeros, the transformation is log10(1 + object).
"log10p": the transformation is log10(1 + object).
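The three transformation options can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's transform_abundances() implementation: the helper name transform_sketch and the toy matrix abund are hypothetical.

```r
# Hypothetical toy abundance matrix (taxa in rows, samples in columns).
abund <- matrix(
  c(0, 9, 99, 999),
  nrow = 2,
  dimnames = list(c("taxon1", "taxon2"), c("s1", "s2"))
)

# Illustrative helper only; not the package's transform_abundances().
transform_sketch <- function(x, method = c("identity", "log10", "log10p")) {
  method <- match.arg(method)
  switch(method,
    identity = x,
    # "log10" falls back to log10(1 + x) when the data contain zeros
    log10 = if (any(x == 0)) log10(1 + x) else log10(x),
    log10p = log10(1 + x)
  )
}

# abund contains a zero, so "log10" and "log10p" give the same result here
transform_sketch(abund, "log10")
```

Because the "log10" option switches formula depending on whether zeros are present, "log10p" is the more predictable choice when results from different data sets need to be comparable.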
the method used to normalize the microbial abundance data. See normalize() for more details. Options include:
an integer, e.g. 1e6 (default): per-sample normalization that rescales the sum of the values in each sample to 1e6.
"none": do not normalize.
"rarefy": random subsampling counts to the smallest library size in the data set.
"TSS": total sum scaling, also referred to as "relative abundance"; abundances are normalized by dividing by the corresponding sample library size.
"TMM": trimmed mean of m-values. First, a sample is chosen as reference. The scaling factor is then derived using a weighted trimmed mean over the differences of the log-transformed gene-count fold-change between the sample and the reference.
"RLE": relative log expression. RLE uses a pseudo-reference calculated as the geometric mean of the gene-specific abundances over all samples. The scaling factors are then calculated as the median of the gene count ratios between the samples and the reference.
"CSS": cumulative sum scaling, calculates scaling factors as the cumulative sum of gene abundances up to a data-derived threshold.
"CLR": centered log-ratio normalization.
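To make a few of these options concrete, the following base R sketch shows TSS, integer scaling, and CLR on a toy count matrix. It is an illustration only and does not reproduce the package's normalize(): the matrix counts is hypothetical, and the pseudocount of 1 used before the CLR log is an assumption made here to avoid log(0).

```r
# Hypothetical toy counts (taxa in rows, samples in columns).
counts <- matrix(
  c(10, 30, 60, 5, 5, 90),
  nrow = 3,
  dimnames = list(paste0("taxon", 1:3), c("s1", "s2"))
)

# TSS: divide each sample (column) by its library size.
tss <- sweep(counts, 2, colSums(counts), "/")

# Integer option (e.g. 1e6): rescale so that each sample sums to 1e6.
scaled <- tss * 1e6

# CLR: log abundance minus the mean log abundance within each sample.
# The pseudocount of 1 is an assumption of this sketch.
clr <- apply(counts + 1, 2, function(x) log(x) - mean(log(x)))

colSums(tss)     # each sample sums to 1
colSums(scaled)  # each sample sums to 1e6
colSums(clr)     # each sample sums to (numerically) 0
```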
arguments passed to specific normalization methods
confidence level, default 0.95
one of "tukey", "games_howell", "scheffe", "welch_uncorrected", defining the method for the pairwise comparisons. See details for more information.
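For orientation, a Tukey-style pairwise comparison at a given confidence level can be sketched for a single feature with base R's aov() and TukeyHSD(). This is not necessarily how run_posthoc_test() computes its results internally; the data frame toy and its simulated values are hypothetical.

```r
set.seed(1)

# Hypothetical abundances of one feature in three groups of ten samples.
toy <- data.frame(
  abundance = c(
    rnorm(10, mean = 0.2, sd = 0.05),
    rnorm(10, mean = 0.3, sd = 0.05),
    rnorm(10, mean = 0.5, sd = 0.05)
  ),
  group = factor(rep(c("A", "B", "C"), each = 10))
)

# Omnibus one-way ANOVA, followed by Tukey's honest significant differences.
fit <- aov(abundance ~ group, data = toy)
res <- TukeyHSD(fit, "group", conf.level = 0.95)

# One row per pair of groups; the columns hold the difference in means,
# the lower and upper confidence bounds, and the adjusted p value.
res$group
```

Raising conf.level widens the reported confidence intervals, which is the role the confidence level argument plays for the pairwise comparisons described above.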
a postHocTest object
data(enterotypes_arumugam)
ps <- phyloseq::subset_samples(
  enterotypes_arumugam,
  Enterotype %in% c("Enterotype 3", "Enterotype 2", "Enterotype 1")
) %>%
  phyloseq::subset_taxa(Phylum == "Bacteroidetes")
pht <- run_posthoc_test(ps, group = "Enterotype")
pht
#> postHocTest-class object
#> Pairwise test result of 13 features, DataFrameList object, each DataFrame has five variables:
#> comparisons : pair groups to test which separated by '-'
#> diff_mean: difference in mean proportions
#> pvalue : post hoc test p values
#> ci_lower : lower confidence interval
#> ci_upper : upper confidence interval
#> Posthoc multiple comparisons of means using tukey method