Store and Transfer Data Associated with Amplicon Sequence Analysis • strollur

Overview

The strollur package stores the data associated with your amplicon sequence analysis: nucleotide sequences, abundances, sample and treatment assignments, taxonomic classifications, sequence bin assignments, metadata, trees, and various reports. It is designed to make that data easy to share across multiple R packages, and it provides utility functions to import data from mothur, qiime2, dada2, and phyloseq.

add() adds sequences, reports, metadata, and resource references
assign() assigns abundances, classifications, bins, samples, treatments, and more
names() gets the names of sequences, bins, samples, treatments and reports
count() gets the number of sequences, bins, samples and treatments
abundance() gets the abundances for sequences, bins, samples, and treatments
report() gets FASTA sequences, sequence and classification reports, bin assignments, sample assignments, metadata, sequence data reports, custom reports, resource references, and scrapped data reports
summary() summarizes sequences, custom reports, and scrapped data

Installation

You can install the CRAN version with:

install.packages("strollur")

Development version

You can install the development version of strollur from GitHub with:

pak::pak("mothur/strollur")
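The pak::pak() call only works if pak itself is already installed. A minimal sketch of a guarded install, using base R only, might look like the following. The helper names needs_install() and install_strollur_dev() are hypothetical and are not part of strollur or pak.

```r
# Hypothetical helper: TRUE when a package is not yet installed.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Hypothetical helper: install pak first if it is missing,
# then install the development version of strollur from GitHub.
install_strollur_dev <- function() {
  if (needs_install("pak")) {
    install.packages("pak")
  }
  pak::pak("mothur/strollur")
}

# Nothing is installed just by defining the helpers; call explicitly:
# install_strollur_dev()
```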

Usage

The example below builds a strollur object: it adds FASTA sequence data, then assigns sequence abundances (which carry the sample and treatment assignments), bins, and taxonomic classifications.

library(strollur)

fasta_data <- read_fasta(strollur_example("final.fasta.gz"))
abundance_table <- readRDS(strollur_example("miseq_abundance_by_sample.rds"))
bin_table <- readRDS(strollur_example("miseq_list_otu.rds"))
classification_data <- read_mothur_taxonomy(taxonomy = strollur_example("final.taxonomy.gz"))

data <- new_dataset(dataset_name = "example")

add(data, table = fasta_data, type = "sequence")
#> Added 2425 sequences.
assign(data, table = abundance_table, type = "sequence_abundance")
#> Assigned 2425 sequence abundances.
assign(data, table = bin_table, type = "bin", bin_type = "otu")
#> Assigned 531 otu bins.
assign(data, table = classification_data, type = "sequence_taxonomy")
#> Assigned 2425 sequence taxonomies.

data
#> example:
#> 
#>             starts ends nbases ambigs polymers numns   numseqs
#> Minimum:         1  375    249      0        3     0      1.00
#> 2.5%-tile:       1  375    252      0        3     0   2850.08
#> 25%-tile:        1  375    252      0        4     0  28491.75
#> Median:          1  375    252      0        4     0  56982.50
#> 75%-tile:        1  375    253      0        5     0  85473.25
#> 97.5%-tile:      1  375    253      0        6     0 111114.93
#> Maximum:         1  375    256      0        6     0 113963.00
#> Mean:            1  375    252      0        4     0      0.00
#> 
#> Number of unique seqs: 2425 
#> Total number of seqs: 113963 
#> 
#> Total number of samples: 19 
#> Total number of treatments: 2 
#> Total number of otus: 531 
#> Total number of otu bin classifications: 531 
#> Total number of sequence classifications: 2425
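Once a dataset is populated, the accessor functions listed in the Overview can be used to read data back out. The sketch below is illustrative only. The dataset is rebuilt with calls taken from the example above, but the type arguments passed to count(), names(), and report(), and their values, are assumptions patterned on add() and assign(). Check ?count, ?names, and ?report for the actual signatures before relying on them.

```r
# Run only when strollur is available, so the sketch is safe to source anywhere.
has_strollur <- requireNamespace("strollur", quietly = TRUE)

if (has_strollur) {
  library(strollur)

  # Rebuild a small dataset using only calls shown in the Usage example.
  fasta_data <- read_fasta(strollur_example("final.fasta.gz"))
  data <- new_dataset(dataset_name = "example")
  add(data, table = fasta_data, type = "sequence")

  # ASSUMPTION: each accessor takes the dataset plus a `type` argument.
  # The type strings below are guesses, not confirmed API.
  n_seqs <- count(data, type = "sequences")
  seq_names <- names(data, type = "sequences")
  fasta_report <- report(data, type = "fasta")

  print(n_seqs)
  print(head(seq_names))
  print(head(fasta_report))
}
```

Because add() and assign() are called above without reassigning data, the dataset appears to be modified in place, so the same object can be queried directly afterwards.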

Getting help

If you encounter a problem, please file an issue on GitHub and include a minimal reproducible example.

Contributing

If there is a feature you’d like to see included, please let us know! Pull requests are welcome on GitHub.

Code of Conduct

Please note that the strollur project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.