Assign sequence classifications to a strollur object
Note: if you assign both sequence taxonomies and bins, strollur will find the consensus taxonomy for each bin for you.
Usage
xdev_assign_sequence_taxonomy_tidy(
data,
table,
reference = NULL,
sequence_name = "sequence_name",
level = "level",
taxonomy = "taxonomy",
confidence = "confidence",
verbose = TRUE
)
Arguments
- data,
a strollur object
- table,
a data.frame containing sequence taxonomy assignments
- reference,
a list created by the function [new_reference]. Optional.
- sequence_name,
a string containing the name of the column in 'table' that contains the sequence names. Default column name is 'sequence_name'.
- level,
a string containing the name of the column in 'table' that contains the taxonomy levels. Default column name is 'level'.
- taxonomy,
a string containing the name of the column in 'table' that contains the sequence taxonomies. Default column name is 'taxonomy'.
- confidence,
a string containing the name of the column in 'table' that contains the confidence score of each taxonomy assignment. Default column name is 'confidence'.
- verbose,
a boolean indicating whether to print progress messages. Default = TRUE.
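As a rough illustration of the layout these arguments describe, the sketch below builds a small tidy table by hand in base R, with one row per sequence per taxonomic level. The column names `seq_id`, `rank`, `taxon`, and `conf`, and the sequence names, are hypothetical; they are chosen only to show how a table with non-default column names would map onto the `sequence_name`, `level`, `taxonomy`, and `confidence` arguments. The final call is left commented out because it assumes strollur is installed and attached.

```r
# Hypothetical tidy taxonomy table: one row per sequence per level.
# Column names deliberately differ from the defaults.
tidy_tax <- data.frame(
  seq_id = rep(c("seq1", "seq2"), each = 3),
  rank   = rep(1:3, times = 2),
  taxon  = c("Bacteria", "Bacteroidetes", "Bacteroidia",
             "Bacteria", "Firmicutes", "Clostridia"),
  conf   = c(100L, 100L, 99L, 100L, 98L, 95L),
  stringsAsFactors = FALSE
)

# Optional sanity checks (an assumption about what is useful, not a
# documented requirement): each sequence has three rows, and no
# sequence/level pair appears twice.
levels_per_seq <- table(tidy_tax$seq_id)
stopifnot(all(levels_per_seq == 3),
          !anyDuplicated(tidy_tax[c("seq_id", "rank")]))

# Assumes strollur is attached; the column-name arguments point the
# function at the non-default columns.
# data <- new_dataset("my_dataset")
# xdev_assign_sequence_taxonomy_tidy(data, tidy_tax,
#   sequence_name = "seq_id", level = "rank",
#   taxonomy = "taxon", confidence = "conf")
```

If your table already uses the default column names, only `data` and `table` need to be supplied.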
Value
an updated strollur object
Examples
sequence_classifications <- readRDS(strollur_example("miseq_tidy_taxonomy.rds"))
str(sequence_classifications)
#> 'data.frame': 14550 obs. of 4 variables:
#> $ sequence_name: chr "M00967_43_000000000-A3JHG_1_2101_16474_12783" "M00967_43_000000000-A3JHG_1_2101_16474_12783" "M00967_43_000000000-A3JHG_1_2101_16474_12783" "M00967_43_000000000-A3JHG_1_2101_16474_12783" ...
#> $ level : int 1 2 3 4 5 6 1 2 3 4 ...
#> $ taxonomy : chr "Bacteria" "\"Bacteroidetes\"" "\"Bacteroidia\"" "\"Bacteroidales\"" ...
#> $ confidence : int 100 100 99 99 88 88 100 100 100 100 ...
data <- new_dataset("my_dataset")
xdev_assign_sequence_taxonomy_tidy(data, sequence_classifications)
#> Assigned 2425 sequence taxonomies.
#> my_dataset:
#>
#>
#> Number of unique seqs: 2425
#> Total number of seqs: 2425
#>
#> Total number of sequence classifications: 2425
#>
# With the reference parameter you can add information about the reference
# you used to classify your sequences. You can also add references using the
# 'add_references' function.
reference <- new_reference("trainset9_032012.pds.zip", "9_032012",
  "classification by mothur2 v1.0 using default options", "",
  "https://mothur.s3.us-east-2.amazonaws.com/wiki/trainset9_032012.pds.zip")
xdev_assign_sequence_taxonomy_tidy(data, sequence_classifications, reference)
#> Assigned 2425 sequence taxonomies.
#> Added 1 resource references.
#> my_dataset:
#>
#>
#> Number of unique seqs: 2425
#> Total number of seqs: 2425
#>
#> Total number of sequence classifications: 2425
#> Total number of resource references: 1
#>
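If your classifications are not yet in the tidy layout shown by `str(sequence_classifications)` above, the following base R sketch shows one possible way to reshape them. It assumes the input is a named character vector of semicolon-separated `Taxon(confidence)` strings, as a mothur-style classifier writes them; the helper `to_tidy` and the sequence names are hypothetical and not part of strollur, which may offer its own readers for such files.

```r
# Hypothetical input: taxonomy strings keyed by sequence name.
raw <- c(
  seqA = "Bacteria(100);Firmicutes(98);Clostridia(95);",
  seqB = "Bacteria(100);Bacteroidetes(100);Bacteroidia(99);"
)

# Split one taxonomy string into one row per level.
to_tidy <- function(name, tax_string) {
  taxa <- strsplit(tax_string, ";", fixed = TRUE)[[1]]
  taxa <- taxa[nzchar(taxa)]  # drop any empty trailing piece
  data.frame(
    sequence_name = name,
    level         = seq_along(taxa),
    # Strip the trailing "(NN)" to get the taxon name.
    taxonomy      = sub("\\([0-9]+\\)$", "", taxa),
    # Capture the number inside the trailing parentheses.
    confidence    = as.integer(sub("^.*\\(([0-9]+)\\)$", "\\1", taxa)),
    stringsAsFactors = FALSE
  )
}

# Stack the per-sequence tables into a single tidy data.frame.
tidy <- do.call(rbind, Map(to_tidy, names(raw), raw))
rownames(tidy) <- NULL
```

The resulting `tidy` data.frame uses the default column names, so it could be passed as `table` without setting any of the column-name arguments.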
