

Revision as of 16:53, 30 September 2019 by Westcott (Talk | contribs)


The split.groups command reads a fasta, flow, fastq or list file together with a group or count file, and generates a separate file of the matching type for each group. To run this tutorial, please download the Esophagus dataset.

Default Settings

mothur > split.groups(fasta=esophagus.fasta, group=esophagus.groups)


mothur > split.groups(flow=esophagus.flow, count=esophagus.count_table)

The fasta command above generates three files, one per group: esophagus.B.fasta, esophagus.C.fasta and esophagus.D.fasta.


mothur > split.groups(list=full.list, count=full.count_table)


mothur > split.groups(fastq=full.fastq, count=full.count_table)
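
The fasta and group case can be illustrated with a short Python sketch. This is not mothur's implementation; it only shows the idea of routing each sequence to a per-group file. The function names (parse_fasta, split_fasta_by_group) are hypothetical, and the group file is assumed to hold one sequence name and one group name per line, separated by whitespace.

```python
from collections import defaultdict


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first token of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_fasta_by_group(fasta_text, group_text, prefix):
    """Map output file names to FASTA text, one entry per group."""
    # Assumed group-file layout: sequence name, whitespace, group name.
    seq_to_group = {}
    for line in group_text.splitlines():
        if line.strip():
            seq_name, group = line.split()[:2]
            seq_to_group[seq_name] = group

    per_group = defaultdict(list)
    for name, seq in parse_fasta(fasta_text):
        group = seq_to_group.get(name)
        if group is not None:  # sequences without a group are skipped here
            per_group[group].append(f">{name}\n{seq}\n")

    # File names follow the prefix.group.fasta pattern used in this tutorial.
    return {
        f"{prefix}.{group}.fasta": "".join(records)
        for group, records in sorted(per_group.items())
    }


# Toy data, not taken from the Esophagus dataset.
fasta = ">s1\nACGT\n>s2\nGGCC\n>s3\nTTAA\n"
groups = "s1\tB\ns2\tC\ns3\tB\n"
files = split_fasta_by_group(fasta, groups, "esophagus")
for file_name, content in files.items():
    print(file_name, content.count(">"))
```

With the toy input, sequences s1 and s3 land in esophagus.B.fasta and s2 lands in esophagus.C.fasta.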



The name parameter allows you to supply a name file along with your fasta file; a name file will then be generated for each group.

mothur > split.groups(fasta=esophagus.fasta, group=esophagus.groups, name=esophagus.names)


The count file is similar to the name file in that it is used to represent the number of duplicate sequences for a given representative sequence. A count file will be created for each group.

mothur > split.groups(fasta=esophagus.fasta, count=esophagus.count_table)
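
As a rough illustration of what a per-group count file could contain, the Python sketch below splits a small count table into one table per group. It is a sketch under assumptions, not mothur's code: the layout (a tab-separated header with a sequence-name column, a total column, then one column per group) is assumed, the function name split_count_table is hypothetical, and dropping sequences with a zero count for a group is this example's choice.

```python
def split_count_table(count_text):
    """Return {group: count-table text} holding only that group's sequences."""
    lines = [ln for ln in count_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    # Assumed layout: sequence name, total, then one column per group.
    group_names = header[2:]

    per_group = {g: [f"{header[0]}\t{header[1]}\t{g}"] for g in group_names}
    for line in lines[1:]:
        fields = line.split("\t")
        seq_name, counts = fields[0], [int(x) for x in fields[2:]]
        for group, count in zip(group_names, counts):
            if count > 0:
                # In a single-group table the total equals the group count.
                per_group[group].append(f"{seq_name}\t{count}\t{count}")
    return {g: "\n".join(rows) + "\n" for g, rows in per_group.items()}


# Toy table, not taken from the Esophagus dataset.
table = (
    "Representative_Sequence\ttotal\tB\tC\n"
    "s1\t5\t3\t2\n"
    "s2\t4\t0\t4\n"
)
tables = split_count_table(table)
print(tables["B"])
```

Here s2 has no reads in group B, so the table for B lists only s1, while the table for C lists both sequences with their C counts.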


The groups parameter allows you to select the groups to create files for. Group names are separated by dashes. For example, if you set groups=B-C, you will only get the esophagus.B.fasta, esophagus.B.names, esophagus.C.fasta and esophagus.C.names files.

mothur > split.groups(fasta=esophagus.fasta, group=esophagus.groups, name=esophagus.names, groups=B-C)
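
The effect of a dash-separated groups value can be sketched in Python as follows. The helper names (select_groups, expected_files) are hypothetical, and raising an error for an unknown group is a choice made for this sketch rather than a statement about how mothur responds.

```python
def select_groups(groups_option, available):
    """Return the requested groups, in the order given, checked against `available`."""
    requested = [g for g in groups_option.split("-") if g]
    unknown = [g for g in requested if g not in available]
    if unknown:
        # Sketch-only behaviour: fail loudly on a group that is not present.
        raise ValueError(f"unknown group(s): {', '.join(unknown)}")
    return requested


def expected_files(prefix, groups, extensions=("fasta", "names")):
    """List the per-group output file names for the selected groups."""
    return [f"{prefix}.{g}.{ext}" for g in groups for ext in extensions]


chosen = select_groups("B-C", {"B", "C", "D"})
print(expected_files("esophagus", chosen))
```

For groups=B-C this lists the same four file names as the example above, and group D is left out.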


The flow parameter is used to input your flow file.


Revisions

  • 1.28.0 - Added count parameter.
  • 1.32.0 - Bug Fix: not splitting properly with a count file.
  • 1.40.0 - Added flow parameter.
  • 1.42.0 - Added fastq parameter. #499
  • 1.43.0 - Added list parameter. #624