mothur v.1.13.0

As everyone heads back to a new academic year, we are happy to announce the release of mothur v.1.13.0!

Since the last release we have been head deep in processing the raw 16S rRNA data for the US Human Microbiome Project - that’s 90 *million* sequences for those of you keeping track at home. That also means that mothur is strong enough to process a large number of sequences. We are quite excited about this because many people did not think that an OTU-based analysis without using heuristics was possible. We were also able to provide phylotype-based and phylogeny-based analysis of the data, all using mothur. Now comes the difficult task of converting data into results, conclusions, and ultimately meaningful biology. The side effect of all this is that we have revamped a number of features to improve their speed through better programming and parallelization of various commands (e.g. unifrac.weighted and unifrac.unweighted). The basic pipeline that we used is explained in the Costello stool analysis.

In addition, we are grateful for the questions, suggestions, and bug reports that have been trickling in. You’ll notice that many of the feature updates listed below are the requests of people on the mothur forum. We have also added several calculators for measuring community evenness and expanded the options for 8 different functions. In the last release we provided a wrapper for clearcut, which required you to have a separate clearcut executable. From here out, this is no longer necessary as clearcut is hard-wired into mothur. We’re looking forward to doing the same thing with CatchAll and Metastats. As always, please keep the praise, complaints, suggestions, and 6 packs coming.

Pat Schloss

Feature updates

  • Added “[error]” to the start of every error message that is output to the logfile to make it easy to grep errors from large logfiles.
  • screen.seqs no longer outputs *bad.fasta, *bad.names, *bad.groups
  • heatmap.bin can now take a relabund file as an input.
  • added relabund parameter to read.otu command.
  • added clearcut source to mothur so that clearcut exe is no longer needed.
  • increased speed of unifrac commands, changed default of random parameter to false.
  • changed remove.seqs and get.seqs to make dups=T the default
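
The “[error]” prefix described in the first item above makes logfiles easy to filter with grep or a few lines of script. The following is a minimal Python sketch of that idea, not part of mothur itself. It assumes the tag appears at the start of the line and matches it case-insensitively; the function name find_errors and the sample log text are made up for illustration and are not real mothur output.

```python
import io


def find_errors(log_lines, tag="[error]"):
    """Return (line_number, text) pairs for lines that start with the tag.

    Matching is case-insensitive and ignores leading whitespace; both
    choices are assumptions of this sketch, not documented mothur behavior.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if line.lstrip().lower().startswith(tag.lower()):
            hits.append((number, line.rstrip("\n")))
    return hits


# Placeholder logfile contents, invented for this example.
sample_log = io.StringIO(
    "first ordinary log line\n"
    "[error] placeholder message, not real mothur output\n"
    "last ordinary log line\n"
)

errors = find_errors(sample_log)
for number, text in errors:
    print(f"{number}: {text}")
```

To scan a real logfile, pass an open file handle instead of the in-memory sample, since the function only needs an iterable of lines.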

Bug fixes

  • fixed bug with venn command if no valid calculators are given.
  • fixed bug with collect.single, rarefaction.single and summary.single: when run one after another in shared mode, the second command would only output results for the last group.
  • deleted .names.temp that was left behind on windows version.
  • changed classify.otu to assume the list file contains all sequences instead of just uniques.
  • fixed bug with sffinfo trimming the first base.
  • classify.seqs was adding an extra unclassified at the end of all classifications.
  • phylo.diversity - the collect and rarefaction data suggested that each group being analyzed had the same number of sequences, namely the total number of sequences. It should instead be the number of sequences in each group. So if the largest sample only contains 1000 sequences, the data should only extend to 1000 sequences, not to the total number of unique sequences across all groups.
  • fixed bug in which hard was not applied to the cutoff change reported in the output of cluster.split and cluster.
  • fixed bug that occurred with some of the calculators that caused negative results if the numbers were larger than the largest possible int for a given platform.
  • fixed ambiguous output for align.seqs flip=T.
  • fixed bug that truncated accession names in mpi-enabled version if a file with Windows line endings was used.
  • modified pre.cluster to make it faster
  • fixed bug with tree.shared
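
The calculator fix above concerns values that exceed the largest possible int on a platform. As a rough illustration of how such a value can turn negative, the Python sketch below wraps a number into a 32-bit signed integer. The 32-bit width and the helper name as_int32 are assumptions made for the example. It does not reproduce mothur's own code or show how the bug was actually fixed.

```python
import ctypes

# Largest value a 32-bit signed integer can hold (assumed width for this sketch).
INT32_MAX = 2**31 - 1


def as_int32(value):
    """Store value in a 32-bit signed int, wrapping around on overflow."""
    return ctypes.c_int32(value).value


count = INT32_MAX            # still representable
wrapped = as_int32(count + 1)  # one past the limit wraps to a negative number

print(count, wrapped)
```

Holding such counts in a wider type (for example a 64-bit integer or a floating-point value) avoids the wraparound for numbers of this size.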