18S rDNA classification issues

Use this forum to address questions about the use of various commands available in mothur

Post by Grothjan » Mon Jul 27, 2015 1:55 pm

I've been trying to wrap my head around some of the more complex portions of coding in R so I can give it a shot. How would you go about trying to cull the taxonomies?

Post by awil » Mon Dec 21, 2015 2:08 am

I have the same problem classifying 18S taxonomy. Has this issue been fixed? It is difficult to arrange the taxonomy into 6 levels.
Thank you very much.

Post by Grothjan » Thu Feb 25, 2016 5:59 pm

The short answer is no. As far as I can tell, this problem still persists in the V123 taxonomy as well. I've come back to it lately and gone straight to the source, working with the SILVA files directly, but it is slow going. I'll post an update if I have a breakthrough.

Post by cryomics » Wed Mar 30, 2016 2:50 am

As others have noted, running classify.seqs with the provided SILVA database gives you only the top 6 levels:

Code:

M03580_15_000000000-AHFYP_1_2115_20833_19910	Eukaryota(100);Opisthokonta(100);Holozoa(100);Metazoa_(Animalia)(100);Eumetazoa(100);Bilateria(100);

But if you make your own taxonomy file from the alignment headers, you can get more:

Code:

grep '>' silva.nr_v123.align | cut -f1,3 -d$'\t' | cut -f2 -d'>' > silva.nr_v123.long.tax

Classifying against that file gives:

Code:

M03580_15_000000000-AHFYP_1_2115_20833_19910	Eukaryota(100);Opisthokonta(100);Holozoa(100);Metazoa_(Animalia)(100);Eumetazoa(100);Bilateria(100);Arthropoda(100);Crustacea(100);Maxillopoda(100);Copepoda(100);Calanoida(100);unclassified;unclassified;unclassified;

What will break downstream if these files have more than 6 levels?
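
For anyone unsure what the grep/cut pipeline above actually does, here is a minimal sketch on a toy file. The header layout (sequence ID, a middle field, then the taxonomy, all tab-separated) is an assumption inferred from the `cut -f1,3` call, and the `toy.*` file names and sequence names are made up for illustration. The last step counts how many levels each lineage has, which shows how uneven the depth can be:

```shell
# Toy stand-in for silva.nr_v123.align. The three tab-separated header
# fields (ID, middle field, taxonomy) are an assumption, not real SILVA data.
workdir=$(mktemp -d)
printf '>seqA\t100\tEukaryota;Opisthokonta;Holozoa;Metazoa;Eumetazoa;Bilateria;Arthropoda;\n' > "$workdir/toy.align"
printf 'ACGT--ACGT\n' >> "$workdir/toy.align"
printf '>seqB\t100\tEukaryota;Archaeplastida;Chloroplastida;\n' >> "$workdir/toy.align"
printf 'ACGT--AC-T\n' >> "$workdir/toy.align"

# Same idea as the pipeline above: keep header lines, take fields 1 and 3
# (cut splits on tabs by default), then drop the leading '>'.
grep '>' "$workdir/toy.align" | cut -f1,3 | cut -f2 -d'>' > "$workdir/toy.long.tax"

# Count taxonomy levels per sequence. Each lineage ends in ';', so split()
# returns one empty trailing piece that is subtracted here.
awk -F'\t' '{ n = split($2, parts, ";") - 1; print $1 "\t" n }' \
    "$workdir/toy.long.tax" > "$workdir/toy.depth"
cat "$workdir/toy.depth"
```

On the toy input this reports 7 levels for seqA and 3 for seqB. Running the same awk line on a real taxonomy file should show how far the lineages are from a uniform depth.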

Post by cryomics » Wed Mar 30, 2016 6:36 pm

OK, I wrote a script in R that pulls in the SILVA taxonomy map, then re-maps the headers from silva.nr_v123.align to conform to a set of specified taxonomic levels. If a level is missing for a sequence, it invents one based on the closest higher taxonomic level. For my purposes this is better than a million "Incertae Sedis", but YMMV.

You can find the script on GitHub: convert_silva_taxonomy.r
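
The "invent a level from the closest higher one" idea can be illustrated without R. The sketch below is not the script mentioned above, which maps names onto specified ranks using the SILVA taxonomy map. It is a much cruder stand-in that only pads or truncates each lineage to a fixed depth. The file names, the target depth of 4, and the `_unclassified` suffix are all hypothetical choices for the example:

```shell
# Toy two-column taxonomy file (ID <tab> lineage); names are illustrative.
workdir=$(mktemp -d)
printf 'seqA\tEukaryota;Opisthokonta;Holozoa;\n' > "$workdir/toy.long.tax"
printf 'seqB\tEukaryota;Archaeplastida;Chloroplastida;Charophyta;Embryophyta;\n' >> "$workdir/toy.long.tax"

levels=4   # hypothetical target depth

awk -F'\t' -v levels="$levels" '{
    n = split($2, name, ";") - 1      # trailing ";" leaves one empty piece
    last = "unknown"                  # fallback if a lineage is empty
    out = ""
    for (i = 1; i <= levels; i++) {
        if (i <= n) {                 # real name available at this depth
            last = name[i]
            out = out name[i] ";"
        } else {                      # pad using the closest higher name
            out = out last "_unclassified;"
        }
    }
    print $1 "\t" out
}' "$workdir/toy.long.tax" > "$workdir/toy.fixed.tax"
cat "$workdir/toy.fixed.tax"
```

Here seqA is padded with `Holozoa_unclassified;` and seqB is cut off after `Charophyta;`. Note the limitation: position in the lineage is not the same as rank, so this gives every sequence the same number of levels but does not guarantee that column 4 means the same rank for every sequence. Handling that properly is what the rank mapping in the R script is for.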

Post by giderk » Wed Mar 08, 2017 10:51 pm

@cryomics I tried to run your R script and it went well until this line:

Code:

tax.in <- read.table("silva.nr_v123.full", header=FALSE, stringsAsFactors=FALSE, sep="\t")

I get an error because I don't have this file. Where can I obtain it?

Post by westcott » Mon Mar 13, 2017 5:12 pm

The SILVA reference files are available here: https://mothur.org/wiki/Silva_reference_files

Post by pschloss » Tue Mar 21, 2017 3:47 pm

In @cryomics's R code there is a line you need to run in bash first:

Code:

grep '>' silva.nr_v123.align | cut -f1,3 | cut -f2 -d'>' > silva.nr_v123.full

This will generate silva.nr_v123.full.
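
Before handing the generated file to `read.table`, it may be worth checking that it has one row per sequence and exactly two tab-separated columns. A minimal sketch, run here on a toy alignment so it is self-contained; the `toy.*` names and the header layout are assumptions, and the real file names would be swapped in:

```shell
workdir=$(mktemp -d)
align="$workdir/toy.align"   # stand-in for silva.nr_v123.align
full="$workdir/toy.full"     # stand-in for silva.nr_v123.full
printf '>seqA\t100\tEukaryota;Opisthokonta;\nACGT\n>seqB\t100\tEukaryota;Archaeplastida;\nAC-T\n' > "$align"

# The command from the post above, applied to the toy file.
grep '>' "$align" | cut -f1,3 | cut -f2 -d'>' > "$full"

# One row per header line, and every row should have exactly 2 columns.
n_seqs=$(grep -c '>' "$align")
n_rows=$(wc -l < "$full" | tr -d ' ')
bad_rows=$(awk -F'\t' 'NF != 2' "$full" | wc -l | tr -d ' ')
echo "sequences=$n_seqs rows=$n_rows malformed=$bad_rows"
```

If `rows` differs from `sequences`, or `malformed` is not 0, the headers probably do not have the layout the `cut` call expects.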
