Home

Overview

Welcome to the Shiny Genotyping Server

The DAPC Server is designed for the analysis of multilocus genotyping data, specifically micro/minisatellite data and multilocus VNTR analysis (MLVA) data, from haploid microorganisms (bacteria and fungi).

GitHub

I recommend downloading the code from GitHub and running the application locally, to avoid server problems and to get more detailed feedback.

https://github.com/Aucomte/ShinyGenotyping

Citations

The DAPC tab was adapted from code written by the adegenet team, who have their own DAPC Shiny application: https://github.com/thibautjombart/adegenet

Citation for adegenet:

Jombart T. (2008) adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24: 1403-1405. doi:10.1093/bioinformatics/btn129

Jombart T. and Ahmed I. (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi: 10.1093/bioinformatics/btr521

Citation for the DAPC:

Jombart T, Devillard S and Balloux F (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics 11:94. doi:10.1186/1471-2156-11-94

http://adegenet.r-forge.r-project.org/

Citation for poppr:

Kamvar ZN, Tabima JF, Grünwald NJ (2014). “Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction.” PeerJ, 2, e281. ISSN 2167-8359, doi: 10.7717/peerj.281, https://doi.org/10.7717/peerj.281.

Kamvar ZN, Brooks JC, Grünwald NJ (2015). “Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality.” Front. Genet., 6, 208. doi: 10.3389/fgene.2015.00208, https://doi.org/10.3389/fgene.2015.00208.

Input data

input
Genotype object

First, submit your input data (in CSV or tab-delimited text format), either as a repetition file or as GeneMapper output (see the example file). Do not forget to hit Submit files! After submitting, fill in the Genotype object tab to indicate your loci and the variable to use as the population, then create the genind object that will be used for the rest of the analysis.

Genemapper table :

Metadata table :

repetition table :

Final Table :

genind = adegenet class for individual genotypes
(rdocumentation)
First, select the loci and the population variable (e.g. Country for the test dataset) you want to work with, then hit the Submit button. If you change the parameters, do not forget to hit Submit again.

Each haplotype has been numbered and has a specific allelic profile. Each strain has a corresponding haplotype.

Table 1: Haplotypes and Strains.

Table 2: Allelic profiles of the haplotypes.
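The relationship between the two tables can be illustrated with a minimal Python sketch. It is not the application's own code (the application is written in R); the function name `number_haplotypes`, the strain names, and the repeat counts are all hypothetical. It assumes a haplotype is simply a distinct allelic profile, numbered in order of first appearance.

```python
def number_haplotypes(profiles):
    """Assign a haplotype number to each distinct allelic profile.

    profiles: dict mapping strain name -> tuple of repeat counts (one per locus).
    Returns (strain_to_haplotype, haplotype_to_profile).
    """
    haplotype_of_profile = {}  # allelic profile -> haplotype number
    strain_to_haplotype = {}   # corresponds to Table 1
    for strain, profile in profiles.items():
        key = tuple(profile)
        if key not in haplotype_of_profile:
            # New profile: give it the next free haplotype number
            haplotype_of_profile[key] = len(haplotype_of_profile) + 1
        strain_to_haplotype[strain] = haplotype_of_profile[key]
    # Invert the mapping to get the profile of each haplotype (Table 2)
    haplotype_to_profile = {n: p for p, n in haplotype_of_profile.items()}
    return strain_to_haplotype, haplotype_to_profile


# Hypothetical data: three strains typed at three loci
example = {"strainA": (3, 7, 2), "strainB": (3, 7, 2), "strainC": (4, 7, 2)}
strains, haplotypes = number_haplotypes(example)
print(strains)     # strain -> haplotype number
print(haplotypes)  # haplotype number -> allelic profile
```

Here strainA and strainB share the same profile and therefore the same haplotype number, while strainC differs at the first locus and gets its own.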

Find K : the number of clusters

Snapclust
FindClusters

Two methods are available to select the number of clusters for DAPC clustering:
- Snapclust is a fast maximum-likelihood method, combining the advantages of both model-based and geometric approaches (Beugin et al., 2018). The optimal number of clusters (k) is estimated using the Akaike, Kullback and Bayesian information criteria (AIC, KIC and BIC, respectively). Ten runs of the Expectation-Maximisation (EM) algorithm are advised to estimate an accurate k and the probability of assignment (Q) of each individual to each of the k inferred clusters.
- The function find.clusters runs successive k-means clustering with an increasing number of clusters (k), and the optimal number of clusters is selected based on the lowest Bayesian information criterion (BIC) (Jombart et al., 2010). 10 to 20 runs are advised to estimate an accurate k.
References:
Beugin, M.-P., Gayet, T., Pontier, D., Devillard, S. and Jombart, T. (2018) A fast likelihood solution to the genetic clustering problem. Methods in Ecology and Evolution, 9, 4. doi: 10.1111/2041-210X.12968.
Jombart, T., Devillard, S. and Balloux, F. (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet., 11, 94. doi: 10.1186/1471-2156-11-94.

Ideally, the lowest AIC/BIC corresponds to the best model.
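As an illustration of this rule, here is a minimal Python sketch using the textbook definitions AIC = 2p - 2 ln L and BIC = p ln(n) - 2 ln L, where p is the number of free parameters, n the number of individuals and L the likelihood. The log-likelihoods and parameter counts below are made up for the example, and the R packages used by the application may count parameters or compute the criteria differently.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion (textbook form)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (textbook form)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def best_k(scores):
    """Return the k whose criterion value is lowest."""
    return min(scores, key=scores.get)


# Made-up values for illustration only:
# k -> (log-likelihood, number of free parameters)
fits = {1: (-520.0, 10), 2: (-470.0, 21), 3: (-455.0, 32), 4: (-450.0, 43)}
n_individuals = 100  # hypothetical sample size

aic_scores = {k: aic(ll, p) for k, (ll, p) in fits.items()}
bic_scores = {k: bic(ll, p, n_individuals) for k, (ll, p) in fits.items()}

print("best k by AIC:", best_k(aic_scores))
print("best k by BIC:", best_k(bic_scores))
```

With these made-up numbers the two criteria disagree, because BIC penalises extra parameters more heavily than AIC. This is why it is worth inspecting the whole curve rather than a single minimum.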

SnapClust Analysis

Compoplot
Group Representation

Statistics

PCA

WARNING: works only if a population is defined for the genind object. Can take a long time to run.

Status

Data

Display

Diversity by locus, estimated by PopGenReport :

Pairwise FST :

Basic statistics per locus (hierfstat) :

The number of alleles used for rarefaction :

Rarefied allele counts :

Missing data by locus and by population :

Find K : the number of clusters

DAPC Analysis

Graphical parameters

Aesthetics

Cross-validation

Scatter Plot

Summary

Compoplot

Number of selected vs. unselected alleles

List of selected alleles

Names of selected alleles

Contributions of selected alleles to discriminant axis

Loading Plot

SnapClust Analysis

SNAPCLUST PARAMETER

Data

Display