Perform significant test by comparing the pairwise log ratios between all features.
a phyloseq-class
object.
character, the variable to set the group.
character vector, the confounding variables to be adjusted.
default character(0)
, indicating no confounding variable.
character to specify taxonomic rank to perform
differential analysis on. Should be one of
phyloseq::rank_names(phyloseq)
, or "all" means to summarize the taxa by
the top taxa ranks (summarize_taxa(ps, level = rank_names(ps)[1])
), or
"none" means perform differential analysis on the original taxa
(taxa_names(phyloseq)
, e.g., OTU or ASV).
character, the methods used to transform the microbial
abundance. See transform_abundances()
for more details. The
options include:
"identity", return the original data without any transformation.
"log10", the transformation is log10(object)
, and if the data contains
zeros the transformation is log10(1 + object)
.
"log10p", the transformation is log10(1 + object)
.
the methods used to normalize the microbial abundance data. See
normalize()
for more details.
Options include:
"none": do not normalize.
"rarefy": random subsampling counts to the smallest library size in the data set.
"TSS": total sum scaling, also referred to as "relative abundance", the abundances were normalized by dividing the corresponding sample library size.
"TMM": trimmed mean of m-values. First, a sample is chosen as reference. The scaling factor is then derived using a weighted trimmed mean over the differences of the log-transformed gene-count fold-change between the sample and the reference.
"RLE", relative log expression, RLE uses a pseudo-reference calculated using the geometric mean of the gene-specific abundances over all samples. The scaling factors are then calculated as the median of the gene counts ratios between the samples and the reference.
"CSS": cumulative sum scaling, calculates scaling factors as the cumulative sum of gene abundances up to a data-derived threshold.
"CLR": centered log-ratio normalization.
"CPM": pre-sample normalization of the sum of the values to 1e+06.
named list
. other arguments passed to specific
normalization methods. Most users will not need to pass any additional
arguments here.
method for multiple test correction, default none
,
for more details see stats::p.adjust.
significance level for each of the statistical tests, default 0.05.
lower bound for the proportion for the W-statistic, default 0.7.
a microbiomeMarker object, in which the slot
of
marker_table
contains four variables:
feature
, significantly different features.
enrich_group
, the class of the differential features enriched.
effect_size
, differential means for two groups, or F statistic for more
than two groups.
W
, the W-statistic, number of features that a single feature is tested
to be significantly different against.
In an experiment with only two treatments, this tests the following hypothesis for feature \(i\):
$$H_{0i}: E(log(\mu_i^1)) = E(log(\mu_i^2))$$
where \(\mu_i^1\) and \(\mu_i^2\) are the mean abundances for feature \(i\) in the two groups.
The developers of this method recommend the following significance tests
if there are 2 groups, use non-parametric Wilcoxon rank sum test
stats::wilcox.test()
. If there are more than 2 groups, use nonparametric
stats::kruskal.test()
or one-way ANOVA stats::aov()
.
Mandal et al. "Analysis of composition of microbiomes: a novel method for studying microbial composition", Microbial Ecology in Health & Disease, (2015), 26.
# \donttest{
data(enterotypes_arumugam)
ps <- phyloseq::subset_samples(
enterotypes_arumugam,
Enterotype %in% c("Enterotype 3", "Enterotype 2")
)
run_ancom(ps, group = "Enterotype")
#> microbiomeMarker-class inherited from phyloseq-class
#> normalization method: [ TSS ]
#> microbiome marker identity method: [ ANCOM ]
#> marker_table() Marker Table: [ 13 microbiome markers with 4 variables ]
#> otu_table() OTU Table: [ 235 taxa and 24 samples ]
#> sample_data() Sample Data: [ 24 samples by 9 sample variables ]
#> tax_table() Taxonomy Table: [ 235 taxa by 1 taxonomic ranks ]
# }