Assign sequence classifications to a strollur object

Note: if you assign both sequence taxonomies and bins, strollur will determine the consensus taxonomy for each bin automatically.

Usage

xdev_assign_sequence_taxonomy_tidy(
  data,
  table,
  reference = NULL,
  sequence_name = "sequence_name",
  level = "level",
  taxonomy = "taxonomy",
  confidence = "confidence",
  verbose = TRUE
)

Arguments

data

a strollur object.

table

a data.frame containing sequence taxonomy assignments.

reference

an optional list created by the function new_reference(). Default is NULL.

sequence_name

a string giving the name of the column in 'table' that contains the sequence names. Default column name is 'sequence_name'.

level

a string giving the name of the column in 'table' that contains the taxonomy levels. Default column name is 'level'.

taxonomy

a string giving the name of the column in 'table' that contains the sequence taxonomies. Default column name is 'taxonomy'.

confidence

a string giving the name of the column in 'table' that contains the confidence score of each taxonomy assignment. Default column name is 'confidence'.

verbose

a logical indicating whether to print progress messages. Default is TRUE.
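
The str() output in the Examples suggests that 'table' is laid out tidily, with one row per sequence per taxonomy level. A minimal sketch of building such a table by hand in base R follows. The sequence names, taxa and confidence values are invented for illustration, and the layout is inferred from the example data rather than from a formal specification.

```r
# Hypothetical hand-built table: one row per sequence per taxonomy level.
# Sequence names, taxa and confidence values are invented for illustration.
ranks <- c("Bacteria", "Firmicutes", "Clostridia")

my_table <- data.frame(
  sequence_name = rep(c("seq1", "seq2"), each = length(ranks)),
  level         = rep(seq_along(ranks), times = 2),
  taxonomy      = rep(ranks, times = 2),
  confidence    = c(100L, 98L, 95L, 100L, 90L, 72L),
  stringsAsFactors = FALSE
)

# Check that the four default column names are present
stopifnot(all(
  c("sequence_name", "level", "taxonomy", "confidence") %in% names(my_table)
))

# Number of distinct sequences described by the table
length(unique(my_table$sequence_name))
#> [1] 2
```

A table shaped like this uses the default column names, so it could presumably be passed as 'table' without setting the column-name arguments.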

Value

an updated strollur object

Examples


sequence_classifications <- readRDS(strollur_example("miseq_tidy_taxonomy.rds"))
str(sequence_classifications)
#> 'data.frame':	14550 obs. of  4 variables:
#>  $ sequence_name: chr  "M00967_43_000000000-A3JHG_1_2101_16474_12783" "M00967_43_000000000-A3JHG_1_2101_16474_12783" "M00967_43_000000000-A3JHG_1_2101_16474_12783" "M00967_43_000000000-A3JHG_1_2101_16474_12783" ...
#>  $ level        : int  1 2 3 4 5 6 1 2 3 4 ...
#>  $ taxonomy     : chr  "Bacteria" "\"Bacteroidetes\"" "\"Bacteroidia\"" "\"Bacteroidales\"" ...
#>  $ confidence   : int  100 100 99 99 88 88 100 100 100 100 ...

data <- new_dataset("my_dataset")

xdev_assign_sequence_taxonomy_tidy(data, sequence_classifications)
#> Assigned 2425 sequence taxonomies.
#> my_dataset:
#> 
#> 
#> Number of unique seqs: 2425 
#> Total number of seqs: 2425 
#> 
#> Total number of sequence classifications: 2425 
#> 

# With the reference parameter you can add information about the reference
# you used to classify your sequences. You can also add references using the
# 'add_references' function.

reference <- new_reference(
  "trainset9_032012.pds.zip", "9_032012",
  "classification by mothur2 v1.0 using default options", "",
  "https://mothur.s3.us-east-2.amazonaws.com/wiki/trainset9_032012.pds.zip"
)

xdev_assign_sequence_taxonomy_tidy(data, sequence_classifications, reference)
#> Assigned 2425 sequence taxonomies.
#> Added 1 resource references.
#> my_dataset:
#> 
#> 
#> Number of unique seqs: 2425 
#> Total number of seqs: 2425 
#> 
#> Total number of sequence classifications: 2425 
#> Total number of resource references: 1 
#>
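
If your table uses different column names, the sequence_name, level, taxonomy and confidence arguments can point the function at them. The sketch below assumes the package is installed under the name strollur and exports the functions used on this page. The column names (seq_id, rank, taxon, score) and the data are hypothetical.

```r
# Hypothetical table with non-default column names
custom_table <- data.frame(
  seq_id = rep(c("seq1", "seq2"), each = 2),
  rank   = rep(1:2, times = 2),
  taxon  = rep(c("Bacteria", "Firmicutes"), times = 2),
  score  = c(100L, 97L, 100L, 85L),
  stringsAsFactors = FALSE
)

# Only run the strollur call when the package is available (assumed name)
if (requireNamespace("strollur", quietly = TRUE)) {
  data <- strollur::new_dataset("my_dataset")

  # Map each expected column to its name in custom_table
  strollur::xdev_assign_sequence_taxonomy_tidy(
    data, custom_table,
    sequence_name = "seq_id",
    level = "rank",
    taxonomy = "taxon",
    confidence = "score"
  )
}
```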