mothur v.1.22.0

After a three-month layoff, we are happy to announce the release of mothur v.1.22.0. A lot has been going on around here (including the “release” of a new Schloss son) so don’t think we’re slacking!

There are a number of important updates to mothur that you are sure to enjoy. First, to mark the acceptance of a manuscript in PLoS ONE describing a sequence analysis pipeline, we are announcing several commands: trim.flows, shhh.flows, and seq.error. shhh.flows is our implementation of Chris Quince’s PyroNoise algorithm, and seq.error allows you to compare mock community sequence data to reference sequences to measure error rates. Second, we have added a number of features to various commands that you are sure to find useful. For example, it is now possible to give a group file to pre.cluster and the chimera checking tools to perform those algorithms on a sample-by-sample basis. The results should be comparable, but this is a more elegant way of doing the analysis, and you will be rewarded by the ability to parallelize the operations and make the overall process much faster. Third, Windows users will be happy to learn that we are in the process of parallelizing all of the appropriate commands. This release allows Windows users to parallelize align.seqs, dist.seqs, summary.seqs, and classify.seqs. This list will grow in future releases.
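To see why sample-by-sample processing lends itself to parallelization, here is a minimal Python sketch (not mothur's own code). It assumes a group file is a two-column, tab-separated list of sequence name and group name; the function names, the sample names, and the use of len as a stand-in for the per-sample step are all hypothetical and only serve the illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_group(group_lines):
    """Map each group (sample) name to the sequence names assigned to it.

    Assumes each non-empty line is "<sequence name><TAB><group name>".
    """
    by_group = defaultdict(list)
    for line in group_lines:
        line = line.strip()
        if not line:
            continue
        seq_name, group = line.split("\t")
        by_group[group].append(seq_name)
    return dict(by_group)


def run_per_group(by_group, step, workers=2):
    """Apply `step` to each sample's sequences independently.

    `step` is a hypothetical stand-in for a per-sample algorithm.
    Because no sample depends on another, the calls can run concurrently.
    """
    groups = sorted(by_group)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda g: step(by_group[g]), groups)
        return dict(zip(groups, results))


# Made-up example data: three sequences from two samples.
group_lines = [
    "seq1\tstoolA",
    "seq2\tstoolA",
    "seq3\tstoolB",
]
by_group = split_by_group(group_lines)
counts = run_per_group(by_group, len)  # len stands in for the real work
```

Each sample's sequences form a self-contained unit of work, so the per-sample step needs no shared state and can be handed to as many workers as are available.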

We are also no longer going to update the Costello stool analysis pipeline. This was necessary because the original data did not include the flowgram data, making it impossible to implement our new pipeline, which incorporates shhh.flows. The new pipeline, which is the subject of the PLoS ONE manuscript, is documented on the Schloss SOP wiki page. This is the culmination of many years of effort and we are happy to have this behind us! The upshot is that we were able to reduce the error rate from ~0.8% to ~0.01%. If you would like a copy of the manuscript before it is posted online, please shoot us an email. We will be running another workshop in Michigan from December 19-21 that will cover all the fine points of this pipeline, the available options, and how to modify it for your analysis. If you or anyone in your lab would be interested in attending, let us know.

As always, keep the citations coming! :)

Pat Schloss

New commands

  • trim.flows - trim flowgram data to get ready to run through shhh.flows
  • shhh.flows - the mothur-based re-implementation of PyroNoise (this had been hidden as shhh.seqs in earlier versions)
  • seq.error - a function that assesses error rates in sequencing data
  • - summarizes taxonomy file.
  • count.groups - counts the number of sequences in a given group or set of groups
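As a rough illustration of what measuring an error rate against a reference means, the Python sketch below counts mismatched positions between an aligned sequence and its reference. This is a simplified assumption made for the example, not the calculation seq.error performs; the function name and the sequences are made up.

```python
def error_rate(query, reference):
    """Fraction of aligned positions where the query differs from the reference.

    Assumes both sequences are already aligned to the same length, with "-"
    marking gaps. Columns that are gaps in both sequences are skipped.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    errors = 0
    for q, r in zip(query.upper(), reference.upper()):
        if q == "-" and r == "-":
            continue  # gap in both: nothing to compare
        compared += 1
        if q != r:
            errors += 1  # substitution, insertion, or deletion
    return errors / compared if compared else 0.0


# Made-up example: one substitution and one deletion over 8 positions.
rate = error_rate("ACGTAC-T", "ACGTTCGT")
```

With a mock community, the reference sequences are known in advance, so a comparison of this kind can be made for every read and summarized across the whole data set.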

Feature updates

  • align.seqs, dist.seqs, summary.seqs, classify.seqs - added multiple processors option for Windows users
  • make.shared - removed ordergroup parameter
  • sub.sample - improved speed for sampling of fasta and list files
  • rarefaction.single - single output file for each rarefaction calculator when running with a shared file.
  • sub.sample - now outputs a unique fasta file and new name file
  • unique.seqs - name file prints in same order as fasta file
  • summary.seqs - added a column indicating the number of sequences represented by the different cutoffs in the output
  • summary.seqs - added mean values to output
  • pre.cluster - added group and bygroup parameters
  • chimera.uchime - added group parameter so that when checking with reference=self, you can check on a bygroup basis. When using reference=self and a group file, you can also use multiple processors.
  • chimera.slayer - added group parameter so that when checking with reference=self, you can check on a bygroup basis.
  • cluster.split - changed default cutoffs for command. cutoff=0.25 and taxlevel=3.
  • added leading ‘0’ to OTU number labels.

Bug fixes

Wiki updates