create.database
The create.database command reads a list file, *.cons.taxonomy, *.rep.fasta, *.rep.names or *.rep.count_table and optional group file, and creates a database file. To run the following tutorial please download: Example Files
Default Settings
The create.database command parameters are repfasta, list, shared, relabund, repname, constaxonomy, group and label. List, relabund or shared, and count & constaxonomy are required. NOTE: Make SURE the repfasta, repnames or count and constaxonomy are for the same label as the listfile.
mothur > get.oturep(list=final.an.list, cutoff=0.03, fasta=final.fasta, column=final.dist, name=final.names)
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(list=final.an.list, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)
or with a count file:
mothur > get.oturep(list=final.an.unique_list, cutoff=0.03, fasta=final.fasta, column=final.dist, count=final.count_table)
mothur > classify.otu(list=final.an.unique_list, count=final.count_table, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(list=final.an.unique_list, label=0.03, repfasta=final.an.0.03.rep.fasta, count=final.an.0.03.rep.count_table, constaxonomy=final.an.0.03.cons.taxonomy)
or with a shared file:
mothur > get.oturep(list=final.an.list, cutoff=0.03, fasta=final.fasta, column=final.dist, name=final.names)
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(shared=final.an.shared, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)
or with a relabund file:
mothur > get.oturep(list=final.an.list, cutoff=0.03, fasta=final.fasta, column=final.dist, name=final.names)
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(relabund=final.an.relabund, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)
If you open the final.an.database file you will see:
OTUNumber Abundance repSeqName repSeq OTUConTaxonomy
1 6307 GQY1XT001C296C A-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
2 5124 GQY1XT001A3TJI G-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
3 3177 GQY1XT001CS2B8 G-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
4 2947 GQY1XT001CD9IB G-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
...
list
The list parameter allows you to provide your list file. mothur creates this file by running the cluster, cluster.split or phylotype commands.
shared
The shared parameter allows you to provide your shared file. mothur creates this file by running the make.shared command.
relabund
The relabund parameter allows you to provide your relative abundance file. mothur creates this file by running the get.relabund command.
count
The count file is the count file outputted by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, count=yourCountFile)
constaxonomy
The constaxonomy file is the taxonomy file outputted by classify.otu(list=yourListfile, name=final.names, taxonomy=yourTaxonomyFile)
Options
repfasta
The repfasta file is fasta file outputted by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, name=yourNameFile) and is optional.
repname
The repname file is the name file outputted by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, name=yourNameFile) and is optional.
group
The group file is optional and will just give you the abundance breakdown by group.
mothur > create.database(list=final.an.list, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy, group=final.groups)
If you open the final.an.database file you will see:
OTUNumber F003D000 ... F003D150 repSeqName repSeq OTUConTaxonomy
1 422 1012 ... 492 GQY1XT001C296C A-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);...
2 413 186 ... 707 GQY1XT001A3TJI G-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);...
3 279 238 ... 342 GQY1XT001CS2B8 G-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);...
4 255 194 ... 410 GQY1XT001CD9IB G-GC--GA-G-A-A-G-T-A ... GT-GAA Bacteria(100);"Bacteroidetes"(100);...
...
label
The label parameter allows you to specify a label to be used from your list file.
Revisions
- 1.25.0 - First Introduced
- 1.26.0 - added shared parameter
- 1.31.0 - added count parameter
- 1.39.0 - Makes refasta and repnames parameters optional
- 1.40.0 - Speed and memory improvements for shared files. #357 , #347
- 1.41.0 - Adds relabund option. #478