Perform differential analysis using ALDEx2

run_aldex(
  ps,
  group,
  taxa_rank = "all",
  transform = c("identity", "log10", "log10p"),
  norm = "none",
  norm_para = list(),
  method = c("t.test", "wilcox.test", "kruskal", "glm_anova"),
  p_adjust = c("none", "fdr", "bonferroni", "holm", "hochberg", "hommel", "BH", "BY"),
  pvalue_cutoff = 0.05,
  mc_samples = 128,
  denom = c("all", "iqlr", "zero", "lvha"),
  paired = FALSE
)

Arguments

ps

a phyloseq::phyloseq object

group

character, the name of the variable in the sample data that defines the groups

taxa_rank

character, the taxonomic rank on which to perform the differential analysis. Should be one of phyloseq::rank_names(ps); "all", which summarizes the taxa by the top taxonomic rank (summarize_taxa(ps, level = rank_names(ps)[1])); or "none", which performs the differential analysis on the original taxa (taxa_names(ps), e.g., OTU or ASV).

transform

character, the method used to transform the microbial abundance. See transform_abundances() for more details. The options include:

  • "identity", return the original data without any transformation (default).

  • "log10", the transformation is log10(object), and if the data contains zeros the transformation is log10(1 + object).

  • "log10p", the transformation is log10(1 + object).
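
As a rough illustration of the three options, the base R sketch below applies them to a small made-up abundance matrix. The helper transform_sketch() is hypothetical and only mirrors the rules described above; it is not the package's transform_abundances() implementation.

```r
# Hypothetical helper mirroring the documented rules; not the package code.
transform_sketch <- function(x, transform = c("identity", "log10", "log10p")) {
  transform <- match.arg(transform)
  switch(transform,
    identity = x,
    # fall back to log10(1 + x) when any zero is present
    log10 = if (any(x == 0)) log10(1 + x) else log10(x),
    log10p = log10(1 + x)
  )
}

# Made-up abundances: rows are taxa, columns are samples
abd <- matrix(c(0, 9, 99, 999), nrow = 2,
              dimnames = list(c("taxon1", "taxon2"), c("s1", "s2")))

res_identity <- transform_sketch(abd, "identity")
res_log10 <- transform_sketch(abd, "log10")
res_log10p <- transform_sketch(abd, "log10p")
```

Because the made-up matrix contains a zero, "log10" and "log10p" give the same result here; on strictly positive data they would differ.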

norm

the method used to normalize the microbial abundance data. See normalize() for more details. Options include:

  • "none": do not normalize.

  • "rarefy": random subsampling counts to the smallest library size in the data set.

  • "TSS": total sum scaling, also referred to as "relative abundance"; the abundances are normalized by dividing by the corresponding sample library size.

  • "TMM": trimmed mean of m-values. First, a sample is chosen as reference. The scaling factor is then derived using a weighted trimmed mean over the differences of the log-transformed gene-count fold-change between the sample and the reference.

  • "RLE": relative log expression. RLE uses a pseudo-reference calculated as the geometric mean of the gene-specific abundances over all samples. The scaling factors are then calculated as the median of the gene count ratios between the samples and the reference.

  • "CSS": cumulative sum scaling, calculates scaling factors as the cumulative sum of gene abundances up to a data-derived threshold.

  • "CLR": centered log-ratio normalization.

  • "CPM": per-sample normalization of the sum of the values to 1e+06.
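
To make the simpler scaling options concrete, the sketch below computes TSS, CPM and CLR by hand in base R on a made-up count matrix. It is an illustration of the definitions above, not the package's normalize() code; in particular, the pseudocount of 1 used for CLR is an assumption made here to avoid taking the log of zero.

```r
# Made-up count matrix: rows are taxa, columns are samples
counts <- matrix(c(10, 30, 60, 5, 5, 40), nrow = 3,
                 dimnames = list(paste0("taxon", 1:3), c("s1", "s2")))

lib_size <- colSums(counts)

# TSS: divide each sample by its library size, so every column sums to 1
tss <- sweep(counts, 2, lib_size, "/")

# CPM: same scaling, but every column sums to 1e+06
cpm <- tss * 1e6

# CLR: log abundance minus the per-sample mean log abundance.
# The pseudocount of 1 is an assumption of this sketch.
log_counts <- log(counts + 1)
clr <- sweep(log_counts, 2, colMeans(log_counts), "-")
```

After CLR, the values in each sample sum to zero, which is a quick way to check the centering.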

norm_para

arguments passed to specific normalization methods

method

test method. Options include "t.test" and "wilcox.test" for two-group comparisons, and "kruskal" and "glm_anova" for multiple-group comparisons.

p_adjust

method for multiple testing correction, default "none". See stats::p.adjust() for more details.
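
The choice of correction can change how many features pass pvalue_cutoff. The sketch below calls stats::p.adjust() directly on made-up p-values; it assumes, for illustration only, that the cutoff is compared against the adjusted values. Note that "fdr" is an alias for "BH" in stats::p.adjust().

```r
# Made-up raw p-values for four features
p_raw <- c(0.01, 0.02, 0.03, 0.04)

p_none <- stats::p.adjust(p_raw, method = "none")
p_bh <- stats::p.adjust(p_raw, method = "BH")
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")

# Number of features at or below a cutoff of 0.05 under each correction
n_none <- sum(p_none <= 0.05)
n_bh <- sum(p_bh <= 0.05)
n_bonf <- sum(p_bonf <= 0.05)
```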

pvalue_cutoff

cutoff of p value, default 0.05.

mc_samples

integer, the number of Monte Carlo samples used to estimate the underlying distributions; 128 is usually sufficient.
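
To give a feel for what a Monte Carlo sample is here, the sketch below draws 128 Dirichlet instances of one sample's proportions in base R. It rests on our understanding that ALDEx2 models each sample with a Dirichlet distribution parameterised by the counts plus a small prior; the prior of 0.5 and the gamma-based sampler are assumptions of this sketch, not code taken from ALDEx2 or this package.

```r
set.seed(1)

# Made-up counts for one sample across four features
counts <- c(taxonA = 0, taxonB = 12, taxonC = 30, taxonD = 58)
mc_samples <- 128

# Draw from a Dirichlet distribution by normalising independent gamma draws.
# The prior of 0.5 added to every count is an assumption of this sketch.
alpha <- counts + 0.5
gamma_draws <- matrix(
  rgamma(mc_samples * length(alpha), shape = rep(alpha, each = mc_samples)),
  nrow = mc_samples,
  dimnames = list(NULL, names(alpha))
)

# Each row is one Monte Carlo instance of the sample's proportions
mc_props <- gamma_draws / rowSums(gamma_draws)

mc_mean <- colMeans(mc_props)
```

Every row of mc_props sums to 1, and a feature with a zero count still receives a small non-zero proportion in each instance because of the prior.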

denom

character string, specifies which features are used as the denominator for the geometric mean calculation. Options are:

  • "all", with all features.

  • "iqlr", accounts for data with systematic variation and centers the features on the set of features whose variance lies between the lower and upper quartiles of variance.

  • "zero", a more extreme case where there are many non-zero features in one condition but many zeros in another. In this case the geometric mean of each group is calculated using the set of per-group non-zero features.

  • "lvha", with housekeeping features, i.e. features with low variance and high abundance.
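
The difference between the "all" and "iqlr" denominators can be sketched in base R as follows. This is a simplified illustration on a single made-up count matrix: ALDEx2 itself works on Monte Carlo instances rather than raw counts, and the pseudocount of 0.5 is an assumption made here.

```r
# Made-up count matrix: rows are features, columns are samples
counts <- matrix(
  c(10, 12, 11, 13,
    50, 48, 52, 51,
    5, 40, 6, 45,
    100, 20, 90, 25,
    30, 31, 29, 33),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("f", 1:5), paste0("s", 1:4))
)
log_counts <- log2(counts + 0.5)

# "all": centre each sample on the mean log abundance of all features
# (equivalent to dividing by the geometric mean of all features)
clr_all <- sweep(log_counts, 2, colMeans(log_counts), "-")

# "iqlr": keep features whose variance across samples lies between the
# first and third quartile of all feature variances ...
feature_var <- apply(clr_all, 1, var)
q <- quantile(feature_var, c(0.25, 0.75))
iqlr_set <- which(feature_var >= q[1] & feature_var <= q[2])

# ... and centre each sample on the geometric mean of that subset only
iqlr_center <- colMeans(log_counts[iqlr_set, , drop = FALSE])
clr_iqlr <- sweep(log_counts, 2, iqlr_center, "-")
```

With "iqlr", only the features in iqlr_set average to zero within each sample; the highly variable features are expressed relative to that more stable reference.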

paired

logical, whether to perform paired tests; only applies to methods "t.test" and "wilcox.test".
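
The effect of pairing can be seen with base R's t.test() on made-up values for a single feature measured twice on the same subjects. This only illustrates the statistical idea; it does not show how run_aldex() or ALDEx2 match up paired samples internally.

```r
# Made-up values for one feature, measured on the same six subjects
# under two conditions (same subject order in both vectors)
before <- c(1.2, 0.8, 1.5, 1.1, 0.9, 1.4)
after <- c(1.6, 1.1, 1.9, 1.3, 1.2, 1.9)

unpaired <- t.test(before, after, paired = FALSE)
paired <- t.test(before, after, paired = TRUE)

# A paired t-test is a one-sample t-test on the within-subject differences
diff_test <- t.test(before - after)

p_values <- c(unpaired = unpaired$p.value, paired = paired$p.value)
```

Because the within-subject differences are consistent in this toy data, the paired test yields a smaller p-value than the unpaired one.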

Value

a microbiomeMarker object.

References

Fernandes, A.D., Reid, J.N., Macklaim, J.M. et al. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome 2, 15 (2014).

Examples

data(enterotypes_arumugam)
ps <- phyloseq::subset_samples(
    enterotypes_arumugam,
    Enterotype %in% c("Enterotype 3", "Enterotype 2")
)
run_aldex(ps, group = "Enterotype")
#> operating in serial mode
#> Warning: Not all reads are integers, the reads are ceiled to integers.
#>    Raw reads is recommended from the ALDEx2 paper.
#> operating in serial mode
#> computing center with all features
#> New names:
#>  `` -> `...1`
#>  `` -> `...2`
#>  `` -> `...3`
#>  `` -> `...4`
#>  `` -> `...5`
#>  `` -> `...6`
#>  `` -> `...7`
#>  `` -> `...8`
#>  `` -> `...9`
#>  `` -> `...10`
#>  `` -> `...11`
#>  `` -> `...12`
#>  `` -> `...13`
#>  `` -> `...14`
#>  `` -> `...15`
#>  `` -> `...16`
#>  `` -> `...17`
#>  `` -> `...18`
#>  `` -> `...19`
#>  `` -> `...20`
#>  `` -> `...21`
#>  `` -> `...22`
#>  `` -> `...23`
#>  `` -> `...24`
#>  `` -> `...25`
#>  `` -> `...26`
#>  `` -> `...27`
#>  `` -> `...28`
#>  `` -> `...29`
#>  `` -> `...30`
#>  `` -> `...31`
#>  `` -> `...32`
#>  `` -> `...33`
#>  `` -> `...34`
#>  `` -> `...35`
#>  `` -> `...36`
#>  `` -> `...37`
#>  `` -> `...38`
#>  `` -> `...39`
#>  `` -> `...40`
#>  `` -> `...41`
#>  `` -> `...42`
#>  `` -> `...43`
#>  `` -> `...44`
#>  `` -> `...45`
#>  `` -> `...46`
#>  `` -> `...47`
#>  `` -> `...48`
#>  `` -> `...49`
#>  `` -> `...50`
#>  `` -> `...51`
#>  `` -> `...52`
#>  `` -> `...53`
#>  `` -> `...54`
#>  `` -> `...55`
#>  `` -> `...56`
#>  `` -> `...57`
#>  `` -> `...58`
#>  `` -> `...59`
#>  `` -> `...60`
#>  `` -> `...61`
#>  `` -> `...62`
#>  `` -> `...63`
#>  `` -> `...64`
#>  `` -> `...65`
#>  `` -> `...66`
#>  `` -> `...67`
#>  `` -> `...68`
#>  `` -> `...69`
#>  `` -> `...70`
#>  `` -> `...71`
#>  `` -> `...72`
#>  `` -> `...73`
#>  `` -> `...74`
#>  `` -> `...75`
#>  `` -> `...76`
#>  `` -> `...77`
#>  `` -> `...78`
#>  `` -> `...79`
#>  `` -> `...80`
#>  `` -> `...81`
#>  `` -> `...82`
#>  `` -> `...83`
#>  `` -> `...84`
#>  `` -> `...85`
#>  `` -> `...86`
#>  `` -> `...87`
#>  `` -> `...88`
#>  `` -> `...89`
#>  `` -> `...90`
#>  `` -> `...91`
#>  `` -> `...92`
#>  `` -> `...93`
#>  `` -> `...94`
#>  `` -> `...95`
#>  `` -> `...96`
#>  `` -> `...97`
#>  `` -> `...98`
#>  `` -> `...99`
#>  `` -> `...100`
#>  `` -> `...101`
#>  `` -> `...102`
#>  `` -> `...103`
#>  `` -> `...104`
#>  `` -> `...105`
#>  `` -> `...106`
#>  `` -> `...107`
#>  `` -> `...108`
#>  `` -> `...109`
#>  `` -> `...110`
#>  `` -> `...111`
#>  `` -> `...112`
#>  `` -> `...113`
#>  `` -> `...114`
#>  `` -> `...115`
#>  `` -> `...116`
#>  `` -> `...117`
#>  `` -> `...118`
#>  `` -> `...119`
#>  `` -> `...120`
#>  `` -> `...121`
#>  `` -> `...122`
#>  `` -> `...123`
#>  `` -> `...124`
#>  `` -> `...125`
#>  `` -> `...126`
#>  `` -> `...127`
#>  `` -> `...128`
#> microbiomeMarker-class inherited from phyloseq-class
#> normalization method:              [ none ]
#> microbiome marker identity method: [ ALDEx2_t.test ]
#> marker_table() Marker Table:       [ 8 microbiome markers with 5 variables ]
#> otu_table()    OTU Table:          [ 235 taxa and  24 samples ]
#> sample_data()  Sample Data:        [ 24 samples by  9 sample variables ]
#> tax_table()    Taxonomy Table:     [ 235 taxa by 1 taxonomic ranks ]