
Split.groups


The split.groups command reads a fasta or flow file together with a group or count file and generates a fasta or flow file for each group. To run this tutorial, please download the Esophagus dataset.


Default Settings

mothur > split.groups(fasta=esophagus.fasta, group=esophagus.groups)

or

mothur > split.groups(flow=esophagus.flow, count=esophagus.count_table)

With the fasta input, this command generates three files: esophagus.B.fasta, esophagus.C.fasta and esophagus.D.fasta.
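
To illustrate what splitting by group means, here is a minimal Python sketch. It is not mothur's implementation. It assumes the group file holds one sequence name and one group name per line, separated by whitespace, and that the name in a fasta header is the first token after ">". The function name split_fasta_by_group is hypothetical.

```python
from collections import defaultdict


def split_fasta_by_group(fasta_text, group_text):
    """Return {group_name: fasta_text} for sequences listed in the group file."""
    # Assumed group file layout: "sequence_name<whitespace>group_name" per line.
    seq_to_group = {}
    for line in group_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            seq_to_group[parts[0]] = parts[1]

    per_group = defaultdict(list)
    current_group = None
    for line in fasta_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(">"):
            # The sequence name is the first token of the header line.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            current_group = seq_to_group.get(name)
        # Sequences without a group assignment are skipped.
        if current_group is not None:
            per_group[current_group].append(line)

    return {g: "\n".join(lines) + "\n" for g, lines in per_group.items()}


if __name__ == "__main__":
    fasta = ">seq1\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n"
    groups = "seq1\tB\nseq2\tC\nseq3\tB\n"
    for group, text in sorted(split_fasta_by_group(fasta, groups).items()):
        print(group, repr(text))
```

Each value in the returned dictionary corresponds to the content of one per-group file, such as esophagus.B.fasta in the example above.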


Options

name

The name parameter allows you to supply a name file along with your fasta file; a name file will then be generated for each group.

mothur > split.groups(fasta=esophagus.fasta, group=esophagus.groups, name=esophagus.names)

count

The count file is similar to the name file in that it is used to represent the number of duplicate sequences for a given representative sequence. A count file will be created for each group.

mothur > split.groups(fasta=esophagus.fasta, count=esophagus.count_table)
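
The per-group bookkeeping behind a count file can be sketched in Python as follows. This is only an illustration, not mothur's code. It assumes the uncompressed, tab-separated count table layout: a header row with the representative sequence column, a total column, and then one column per group. The function name split_counts_by_group is hypothetical, and the sketch returns dictionaries rather than reproducing the exact files mothur writes.

```python
def split_counts_by_group(count_text):
    """Return {group: {sequence_name: count}} keeping only nonzero counts."""
    # Skip blank lines and any comment lines starting with "#".
    lines = [ln for ln in count_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    header = lines[0].split("\t")
    # Assumed layout: column 0 is the sequence name, column 1 the total,
    # and the remaining columns are per-group counts.
    group_names = header[2:]
    per_group = {g: {} for g in group_names}
    for line in lines[1:]:
        fields = line.split("\t")
        name = fields[0]
        for group, value in zip(group_names, fields[2:]):
            count = int(value)
            if count > 0:
                per_group[group][name] = count
    return per_group


if __name__ == "__main__":
    table = (
        "Representative_Sequence\ttotal\tB\tC\n"
        "seq1\t3\t2\t1\n"
        "seq2\t4\t0\t4\n"
    )
    print(split_counts_by_group(table))
```

A representative sequence appears in a group's result only if at least one of its duplicates belongs to that group.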

groups

The groups parameter allows you to select the groups to create files for. For example, if you set groups=B-C, you will only get the esophagus.B.fasta, esophagus.B.names, esophagus.C.fasta and esophagus.C.names files.

mothur > split.groups(fasta=esophagus.fasta, group=esophagus.groups, name=esophagus.names, groups=B-C)
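
The effect of a dash-separated groups value such as B-C can be sketched in Python. This is a hedged illustration under the assumption that group names themselves contain no dashes; select_groups is a hypothetical helper, not part of mothur.

```python
def select_groups(seq_to_group, groups_option):
    """Keep only sequences whose group is named in a dash-separated option."""
    # "B-C" becomes {"B", "C"}; assumes group names contain no dashes.
    wanted = set(groups_option.split("-"))
    return {name: group for name, group in seq_to_group.items() if group in wanted}


if __name__ == "__main__":
    assignments = {"seq1": "B", "seq2": "C", "seq3": "D"}
    print(select_groups(assignments, "B-C"))
```

Sequences assigned to group D are dropped, so no files would be produced for that group.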

flow

The flow parameter is used to input your flow file.

Revisions

  • 1.28.0 - Added count parameter
  • 1.32.0 - Bug Fix: not splitting properly with a count file.
  • 1.40.0 - Added flow parameter.