Create a strollur object from mothur outputs

The `read_mothur()` function reads the file types produced by mothur and creates a `strollur` object.

To generate the input files, you can follow Pat's MiSeq example analysis.

Usage

read_mothur(
  fasta = NULL,
  count = NULL,
  taxonomy = NULL,
  otu_list = NULL,
  asv_list = NULL,
  phylo_list = NULL,
  design = NULL,
  cons_taxonomy = NULL,
  otu_shared = NULL,
  asv_shared = NULL,
  phylo_shared = NULL,
  sample_tree = NULL,
  sequence_tree = NULL,
  dataset_name = ""
)

Arguments

fasta: filename, a FASTA-formatted file containing sequence strings.
count: filename, a mothur count file
taxonomy: filename, a mothur taxonomy file, created by classify.seqs
otu_list: filename, a mothur list file containing OTU bin assignments. The otu_list file is created by cluster, cluster.split, and cluster.fit.
asv_list: filename, a mothur list file containing ASV bin assignments. The asv_list file is created by cluster using the 'unique' method.
phylo_list: filename, a mothur list file containing phylotype bin assignments. The phylo_list file is created by phylotype.
design: filename, a mothur design file
cons_taxonomy: filename, a mothur consensus taxonomy file. The cons_taxonomy file is created by classify.otu.
otu_shared: filename, a mothur shared file containing the abundance of each OTU bin in each sample.
asv_shared: filename, a mothur shared file containing the abundance of each ASV bin in each sample.
phylo_shared: filename, a mothur shared file containing the abundance of each phylotype bin in each sample.
sample_tree: filename, a tree that relates samples. The sample tree is created by tree.shared. For best results, we recommend running tree.shared with subsample = true and using the 'ave.tre' output.
sequence_tree: filename, a tree that relates sequences. The sequence tree is created by clearcut. We DO NOT recommend using sequence trees. With the ever-growing size of modern datasets, sequence trees can be difficult or impossible to build without hitting a memory limit.
dataset_name: A string containing a name for your dataset.

Value

A strollur object

Note

consensus taxonomy: The `strollur` object generates consensus taxonomies for you from the sequence taxonomy assignments. You only need to provide the ".cons.taxonomy" file if you are not providing sequence taxonomy assignments.
shared / rabund file: The `strollur` object generates shared and rabund data for you from the OTU assignments in the list file and the count data. You only need to provide the ".shared" file if you are not providing the list and count files.
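
To illustrate why a ".shared" file is redundant when the list and count files are available, here is a minimal base R sketch of the idea. It is not the package's implementation. The objects `otu_list`, `count`, `bin_of`, and `shared` are hypothetical toy stand-ins for the contents of a mothur list file and count file, and the sequence and sample names are made up.

```r
# Toy stand-in for a list file: each bin names its member sequences.
otu_list <- list(
  Otu1 = c("seqA", "seqB"),
  Otu2 = c("seqC")
)

# Toy stand-in for a count file: abundance of each sequence in each sample.
count <- data.frame(
  sequence = c("seqA", "seqB", "seqC"),
  sample1 = c(5, 2, 1),
  sample2 = c(0, 3, 4)
)

# Build a lookup from sequence name to bin name.
bin_of <- rep(names(otu_list), lengths(otu_list))
names(bin_of) <- unlist(otu_list, use.names = FALSE)

# Put the per-sample counts in a numeric matrix with one row per sequence.
counts <- as.matrix(count[, c("sample1", "sample2")])
rownames(counts) <- as.character(count$sequence)

# Sum sequence counts within each bin, then transpose so that
# rows are samples and columns are bins, as in a shared table.
bin_counts <- rowsum(counts, group = bin_of[rownames(counts)])
shared <- t(bin_counts)

# Total abundance of each bin across all samples.
bin_totals <- colSums(shared)

print(shared)
print(bin_totals)
```

The same reasoning applies to ASV and phylotype bins: any list file combined with the count data determines the corresponding per-sample bin abundances.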

References

Schloss, P.D., Westcott, S.L., Ryabin, T., Hall, J.R., Hartmann, M., Hollister, E.B., Lesniewski, R.A., Oakley, B.B., Parks, D.H., Robinson, C.J., Sahl, J.W., Stres, B., Thallinger, G.G., Van Horn, D.J. and Weber, C.F. (2009). Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology 75:7537-7541. <doi:10.1128/AEM.01541-09>

Examples

# For datasets that include sequence data:

data <- read_mothur(
  fasta = strollur_example("final.fasta.gz"),
  count = strollur_example("final.count_table.gz"),
  taxonomy = strollur_example("final.taxonomy.gz"),
  design = strollur_example("mouse.time.design"),
  otu_list = strollur_example("final.opti_mcc.list.gz"),
  asv_list = strollur_example("final.asv.list.gz"),
  phylo_list = strollur_example("final.tx.list.gz"),
  sample_tree = strollur_example("final.opti_mcc.jclass.ave.tre"),
  dataset_name = "miseq_sop"
)
#> Added 2425 sequences.
#> Assigned 2425 sequence abundances.
#> Assigned 2425 sequence taxonomies.
#> Assigned 531 otu bins.
#> Assigned 2425 asv bins.
#> Assigned 63 phylotype bins.
#> Assigned 19 samples to treatments.

# For datasets with only OTU data:

data <- read_mothur(
  otu_shared = strollur_example("final.opti_mcc.shared"),
  cons_taxonomy = strollur_example(
    "final.cons.taxonomy"
  ),
  design = strollur_example("mouse.time.design"),
  sample_tree = strollur_example("final.opti_mcc.jclass.ave.tre"),
  dataset_name = "miseq_sop"
)
#> Assigned 531 otu bins.
#> Assigned 19 samples to treatments.
#> Assigned 531 otu bin taxonomies.