mothur v.1.8.0

We are happy to release mothur v.1.8.0 on the great American holiday of Groundhog Day! Well, it’s been a wild couple of months since the last release of mothur: within hours of sending out the announcement for the last version, dotur 2.0 - Ruth - was born. Blame the delay on a new baby, Christmas and New Year’s, and a crazy January full of grant deadlines and conferences. But as you’ve come to expect, we have added a number of new features and bug fixes. We’ve been encouraged by the expanding traffic on the mothur forum and appreciate everyone’s willingness to answer other people’s questions.

We are pretty excited about the new commands that we have added to mothur and the long list of features added to existing commands. I am grateful that people finally pushed us towards incorporating the readline library, which allows you to use the arrow keys to move around and recall previous commands in the Mac and Linux environments. Thank you for all your suggestions and please keep them coming! It is amazing how far mothur has come in the last year and much of it is due to the input of users. Starting with this release we are also providing precompiled binaries for Mac OS, so you no longer need Xcode to compile the source code. Let us know what you think of this option.

Going forward we have many new ideas and are anxious to get them out to you. Some things we’re working on include incorporating non-parametric estimation tools, wrapping clearcut so you can build trees within mothur, and parallelizing the clustering pipeline. As always, feel free to cite mothur liberally and if a paper you publish uses mothur, feel free to tell us about it on the user forum.

New commands

  • mothur uses the readline library to allow use of arrow keys and other shortcuts in interactive mode.
  • added mgcluster command which incorporates the functionality of the mg-dotur program.
  • added pcoa command to generate principal coordinate plots
  • added pre.cluster command, which clusters sequences using a method similar to that described in the forthcoming paper by Huse et al.
  • added set.dir command allowing you to set an input file directory and / or redirect the output files generated by mothur.
  • added otu.hierarchy command which relates otus at different distance levels
  • added average neighbor and nearest neighbor methods to hcluster command

New features

  • mothur now allows for comments in fasta files in the following format, which will make analyzing output from the RDP easier:

     >#=GC_SS_cons
     ::::::::<<<<<_______>>>>>,{{{{-{{{{{{,{{{{{{{{{-.---{{{-.{{.{.,,...
     >FYV22AL02GBAKV
     ------------------------------------------------.-------.--.-.--...

    The sequence =GC_SS_cons is ignored and only FYV22AL02GBAKV is read in.

  • added flip and threshold parameters as well as a progress indicator to align.seqs command.
  • added list parameter to list.seqs, get.seqs and remove.seqs commands.
  • added probs parameter to the classify.seqs command, so you can choose not to report the confidence scores with the bayesian method.
  • added iters and name parameters to classify.seqs command.
  • added distance search method to classify.seqs
  • added sorted parameter to get.oturep command, so the output can be sorted by sequence name, OTU number, OTU size, or group. The default is no sorting.
  • added large parameter to get.oturep command, so if your distance matrix is too large to fit in ram you can still run the command.
  • removed the groups parameter from get.sharedseqs and in its place added the unique and shared parameters.
  • added file checking to read.dist to fix an error that occurred in the libshuff command if the files did not match.
  • you can now use a distance matrix as input for the heatmap.sim command.
  • added unclassified bins to classify.seqs command output.
  • the summary.shared command now uses all=false by default.
  • added distance parameter to unifrac.weighted and unifrac.unweighted commands that allows you to output a distance matrix from the command.
  • added random parameter to unifrac.weighted and unifrac.unweighted commands that allows you to shut off the comparison to random trees.
  • get.oturep command now outputs a .rep.names file.
  • get.oturep command - when selecting the representative, if multiple sequences share the smallest maximum distance to the other members of the bin, the sequence with the smallest average distance is selected. Previously, the first sequence found with the smallest maximum distance was selected, so this change may produce different results from those calculated by previous mothur versions.
  • you can now read a shared file and then run the .single commands.
  • you can now input multiple fasta files to the align.seqs and classify.seqs commands by separating filenames with dashes, e.g. fasta=abrecovery.fasta-amazon.fasta
  • added a timestamp to the mothur logfiles (it’s Unix time)
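
The new tie-breaking rule in get.oturep can be illustrated with a short Python sketch. This is not mothur's source code, just a minimal restatement of the rule described above. The function name pick_representative, the sequence names, and the dictionary used to hold pairwise distances are all made up for the example.

```python
from statistics import mean


def pick_representative(names, dist):
    """Pick the bin member with the smallest maximum distance to the others.

    Ties on the maximum distance are broken by the smallest average distance.
    `dist` is a hypothetical lookup: frozenset({a, b}) -> distance between a and b.
    """
    if len(names) == 1:
        return names[0]

    def score(name):
        # Distances from this sequence to every other member of the bin.
        others = [dist[frozenset((name, other))] for other in names if other != name]
        # Tuples compare element by element: maximum first, then average.
        return (max(others), mean(others))

    return min(names, key=score)


# Illustrative bin: every sequence has a maximum distance of 0.03,
# so the average distance decides.
bin_members = ["seqA", "seqB", "seqC", "seqD"]
pairs = {
    ("seqA", "seqB"): 0.02,
    ("seqA", "seqC"): 0.03,
    ("seqA", "seqD"): 0.03,
    ("seqB", "seqC"): 0.01,
    ("seqB", "seqD"): 0.03,
    ("seqC", "seqD"): 0.03,
}
distances = {frozenset(pair): d for pair, d in pairs.items()}

print(pick_representative(bin_members, distances))  # seqB
```

In this made-up bin the older behavior, which took the first sequence found with the smallest maximum distance, would have returned seqA. The new rule returns seqB because its average distance to the rest of the bin is the smallest.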

Bug fixes

  • added warning about average neighbor clustering near cutoff.
  • fixed a bug in how the morisitahorn index was being calculated
  • mothur’s logfiles are now time stamped and not overwritten with each execution of mothur.
  • fixed bug in aligner that caused a bus error if your candidate sequence had more bases than your longest template sequence.
  • fixed bug in merge.files command that gave a “cannot open file” error in the windows version.
  • added formatting to mothur to make phylip formatted distance files compatible with other software tools.
  • corrected error in help for align.seqs and classify.seqs, which gave incorrect defaults for the gapopen and gapextend parameters.
  • mothur now recognizes ./ and ../ in file names