SIAT

Effortlessly visualize and analyze a measure of symptom intensity as a quantitative response variable in relation to several experimental factors.

Quickly get a sense of what is in your disease assay data.

First, go to the 'Input Table' tab to upload a file with your data set. It must be formatted in a 'long format' with one row per symptom measurement and columns describing the levels of the experimental factors associated with this numeric value (e.g. plant genotype, strain, replicate ID, experiment ID, etc.). From there you can use the tools accessible in the left-hand side menu to filter, aggregate, visualize, transform and export your data in an intuitive and user-friendly fashion.

A short description of the data analysis tools:

Home : This is the page you are currently reading, with an overview of SIAT.

Input table : This is where to upload your data set with values organized in a long or 'tidy' format (https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html). You can provide files containing one of several field and decimal separators. It is then possible to filter your data and export it. Downstream analysis steps with the various tools will be performed on this filtered data.

Mean/Sd : This enables the computation of standard aggregate values (mean, standard deviation, count of observations) of the quantitative response variable conditioned on the levels of one or several experimental factors.

Boxplot : Plot individual data points on a box and whisker plot, conditioned on up to three experimental variables (e.g. distribution of symptom by strain and genotype across several experiments).

Barplot : Plot symptom mean and standard deviation as a barplot, conditioned on up to three experimental variables (e.g. mean symptom by strain and genotype across several experiments).

Heatmap : For very large data sets, it may be useful to display the means of symptom measurements in the compact form of a heatmap with levels of experimental variables as rows and columns. It also enables clustering of factor levels in rows and columns.

Plot Time Series : Plot symptom mean and standard deviation as a function of a time variable (e.g. date of the experiment). The plot can be further conditioned on other experimental variables.

Anova : Conducts an analysis of variance (ANOVA) to identify experimental factors that significantly affect the magnitude of symptoms. It also performs Tukey multiple comparisons of means and displays the results in a searchable table.

Means Comparison : When a factor is found to significantly affect the symptom readout, this tool performs multiple comparisons of means with a variety of methods to find means that differ significantly from each other, and adds the significant comparisons on top of a data plot.

PCA : Principal component analysis, a dimension reduction technique for multivariate data sets. This is useful, for example, to identify clusters of isolates with similar virulence patterns or, conversely, to identify groups of host genotypes with related susceptibility profiles.

Categorical Analysis : In phytopathology, quantitative symptom measures are often converted into an ordinal scale representing a disease index. This index is then used to group pathogen isolates into 'races' based on identical disease index profiles on a set of plant host genotypes. This tool performs this conversion, visualizes the results and exports a table of races.

Upload your data file using the 'Browse...' button. It must be formatted in a 'long' or 'tidy' format with one row per symptom measurement and columns describing the levels of the experimental factors associated with this numeric value (e.g. plant genotype, strain, replicate ID, experiment ID, etc.). Specify the type of field and decimal separators used to represent data in your file. Once there is no error message (Data Validation) and your data displays correctly in the table below, you can start your analysis with the tools.
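To illustrate the expected layout, here is a minimal Python sketch of reading a long-format file that uses a semicolon as field separator and a comma as decimal separator. It is not the application's own import code; the column names (strain, genotype, replicate, lesion_length) and the helper read_long_table are hypothetical.

```python
import csv
import io

# Hypothetical file content: one row per symptom measurement,
# semicolon field separator, comma decimal separator.
RAW = """strain;genotype;replicate;lesion_length
S1;G1;1;12,5
S1;G2;1;3,0
S2;G1;1;8,25
"""

def read_long_table(text, field_sep=";", decimal_sep=",", response="lesion_length"):
    """Parse a long-format table and convert the response column to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text), delimiter=field_sep):
        # Normalize the decimal separator before the numeric conversion.
        record[response] = float(record[response].replace(decimal_sep, "."))
        rows.append(record)
    return rows

rows = read_long_table(RAW)
for row in rows:
    print(row)
```

A value that cannot be converted raises a ValueError here, which is roughly the kind of problem a data validation step is meant to surface.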

Download a test file

CSV File

Browse...

Select the response variable

Separator

Semicolon

Tab

decimal

Comma

Dot

Data Validation:

Normality test : Shapiro-Wilk:

logarithmic transformation

You can filter your data set by specifying criteria in the boxes on top of the columns below. All analyses will be done with the filtered data set.

Group data points by levels of one or several experimental factor(s) and calculate aggregated values : the number of observations (Count), the median, the mean and the standard deviation (Sd) of the selected response variable (e.g. symptom length).
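The aggregation described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the records, the column names and the summarize helper are hypothetical, and Sd is taken here to be the sample standard deviation.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical long-format observations.
observations = [
    {"strain": "S1", "genotype": "G1", "value": 10.0},
    {"strain": "S1", "genotype": "G1", "value": 14.0},
    {"strain": "S1", "genotype": "G2", "value": 3.0},
    {"strain": "S1", "genotype": "G2", "value": 5.0},
    {"strain": "S1", "genotype": "G2", "value": 7.0},
]

def summarize(rows, factors, response):
    """Group rows by the levels of `factors` and aggregate `response`."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in factors)
        groups[key].append(row[response])
    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "Count": len(values),
            "Median": median(values),
            "Mean": mean(values),
            # The sample standard deviation needs at least two observations.
            "Sd": stdev(values) if len(values) > 1 else float("nan"),
        }
    return summary

summary = summarize(observations, ["strain", "genotype"], "value")
for key, stats in summary.items():
    print(key, stats)
```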

Select the response variable

Select experimental factors

Data summary by variable:

Examine the influence of experimental factors on the continuous response variable (e.g. symptom intensity) using analysis of variance (ANOVA). If two experimental factors are selected (maximum) the model will automatically include an interaction term. Note that these tests will only be valid if your data uses a balanced experimental design and meets ANOVA's assumptions.
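As a reminder of what the test statistic measures, the following Python sketch computes the F statistic of a one-way ANOVA by hand. It is deliberately simpler than the two-factor model with interaction described above, it does not compute a p-value, and it is not the application's implementation; the function name and the data are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical symptom scores for three strains, three replicates each (balanced).
scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(scores)
print(f_stat, df_b, df_w)
```

A large F means that the differences between group means are large relative to the spread within groups.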

Select the response variable

Select the factor(s)

Anova :

Download Anova Plot

Tukey HSD tests results: post hoc comparisons on each combination of factor levels in the model.

Principal component analysis : Dimensionality reduction method which transforms a large data set into a simplified representation capturing most of the information of the original data set. This is useful in exploratory data analysis, for example to identify strains with a similar virulence profile on a set of plant genotypes.

Select the response variable (e.g. symptom intensity)
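Before a PCA can be run, long-format data has to be reshaped into a table with one row per individual and one column per variable, and the columns are usually centered and scaled. The Python sketch below shows that preparation step only, not the PCA itself. It assumes that replicate values are averaged per cell and that scaling uses the sample standard deviation; the helpers pivot_means and standardize and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def pivot_means(rows, individual, variable, response):
    """Build an individuals x variables matrix of mean response values.

    Assumes every individual/variable combination is observed at least once.
    """
    cells = defaultdict(list)
    for row in rows:
        cells[(row[individual], row[variable])].append(row[response])
    individuals = sorted({key[0] for key in cells})
    variables = sorted({key[1] for key in cells})
    matrix = [[mean(cells[(i, v)]) for v in variables] for i in individuals]
    return individuals, variables, matrix

def standardize(matrix, center=True, scale=True):
    """Center and/or scale each column (assumes no column is constant)."""
    new_columns = []
    for column in zip(*matrix):
        m = mean(column) if center else 0.0
        s = stdev(column) if scale else 1.0
        new_columns.append([(x - m) / s for x in column])
    return [list(r) for r in zip(*new_columns)]

# Hypothetical data: strains are individuals, plant genotypes are variables.
data = [
    {"strain": "S1", "genotype": "G1", "value": 2.0},
    {"strain": "S1", "genotype": "G1", "value": 4.0},
    {"strain": "S1", "genotype": "G2", "value": 10.0},
    {"strain": "S2", "genotype": "G1", "value": 5.0},
    {"strain": "S2", "genotype": "G2", "value": 20.0},
    {"strain": "S3", "genotype": "G1", "value": 7.0},
    {"strain": "S3", "genotype": "G2", "value": 30.0},
]
individuals, variables, matrix = pivot_means(data, "strain", "genotype", "value")
standardized = standardize(matrix)
print(individuals, variables)
print(standardized)
```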

Select factor that will define individuals (e.g. strains)

Select factor that will define variables (e.g. plant genotype)

Scale variables to unit variance

Center variables

Number of axes

Axes to plot

Download Plot ind

Download Plot Var

Download Plot VP

Download Plot Both

Heatmap visualization of average values of the response variable as a function of the levels of two experimental variables (rows and columns). Rows and columns can optionally be ordered by hierarchical clustering (a dendrogram is added on top and/or on the side of the color matrix).

Select the response variable

Select the factor displayed in rows

Select the factor displayed in columns

Add clustering for:

col

row

Convert your symptom intensity data into a qualitative index for plant pathogen race analysis: Pathogen races (also referred to as physiological races or pathotypes) are defined by their profile of pathogenicity on a defined set of differential host cultivars (i.e. a set of host genotypes that each carry a distinct profile of resistance genes). Oftentimes, pathogenicity is defined as an ordered categorical variable (ordinal) with levels depicting disease outcome (e.g. Resistant < Moderately Susceptible < Susceptible).

This tool takes mean symptom measures and converts them into categories defined by the user. This categorical data is then plotted as a heatmap where it is straightforward to observe the clustering of virulence profiles into races. Furthermore, unique pathogenicity profiles (i.e. races) in the data set are computed and assigned to each strain in the table summarizing the output data. Finally, the distribution of categories for each individual (levels of the variable displayed in rows) is displayed in the stacked barplot.
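The conversion and race assignment described above can be sketched as follows in Python. This is an illustration under assumptions, not the tool's code: the thresholds, the mean values and the helper names are hypothetical, and a value equal to a threshold is arbitrarily placed in the upper category.

```python
from bisect import bisect_right

def categorize(value, thresholds):
    """Return the 1-based category of `value` given ascending thresholds."""
    return bisect_right(thresholds, value) + 1

def assign_races(mean_table, thresholds):
    """Map each strain to its category profile and to a race number.

    Strains sharing an identical profile across all host genotypes
    receive the same race number.
    """
    profiles = {
        strain: tuple(categorize(v, thresholds) for v in values)
        for strain, values in mean_table.items()
    }
    race_ids = {}
    races = {}
    for strain in sorted(profiles):
        profile = profiles[strain]
        if profile not in race_ids:
            race_ids[profile] = len(race_ids) + 1
        races[strain] = race_ids[profile]
    return profiles, races

# Hypothetical mean symptom values of three strains on three plant lines.
mean_table = {
    "S1": [2.0, 12.0, 7.0],
    "S2": [3.0, 15.0, 6.0],
    "S3": [11.0, 1.0, 8.0],
}
# Two thresholds define three categories.
profiles, races = assign_races(mean_table, thresholds=[5.0, 10.0])
print(profiles)
print(races)
```

Here S1 and S2 differ in their raw means but share the same category profile, so they fall into the same race.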

Select the response variable

Select the factor displayed in rows (e.g. Strain)

Select the factor displayed in columns (e.g. Plant line)

Clustering:

col

row

Subdivide your data set into several categories of resistance:

Number of categories

Threshold between the categories 1 & 2

Threshold between the categories 2 & 3

Threshold between the categories 3 & 4

Threshold between the categories 4 & 5

Threshold between the categories 5 & 6

Plot individual data points together with 'standard' box and whisker representations, conditioned on experimental factors.

Select the response variable (y)

Select the factor for the x-axis (x)

Select a factor for coloring based on its levels (fill)

Select a third factor to generate one plot per level of this factor in a grid

Order the X axis by median

Download Plot Visu

This tool plots aggregates of data values (mean and standard deviation) conditioned on one or several experimental variables.

Select the response variable (y)

Select the factor for the x-axis (x)

Select a factor for coloring based on its levels (fill)

Select a third factor to generate one plot per level of this factor (grid)

Download Barplot

This tool may be particularly useful if there is a time variable in your data set (e.g. date of the experiment) and you want to plot values over time on the x-axis. It plots aggregates of data values (mean and standard deviation) conditioned on one or several experimental variables.

Select the response variable (y)

Select the variable for the x-axis

Specify its time format (e.g. 27/02/2018 -> dmy)

not a time format

dmy

ymd
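The time format options above name the order of day, month and year in the date strings. As a rough Python illustration (the application itself may parse dates differently, and the '/' separator and the parse_time helper are assumptions):

```python
from datetime import date, datetime

# Assumed mapping from the option names to strptime patterns.
FORMATS = {"dmy": "%d/%m/%Y", "ymd": "%Y/%m/%d"}

def parse_time(value, time_format):
    """Parse `value` as a date, or return it unchanged if it is not a time."""
    if time_format == "not a time format":
        return value
    return datetime.strptime(value, FORMATS[time_format]).date()

print(parse_time("27/02/2018", "dmy"))
print(parse_time("2018/02/27", "ymd"))
```

Choosing the wrong order either raises an error or, for ambiguous dates such as 03/04/2018, silently swaps day and month, so it is worth checking the x-axis after plotting.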

Select a factor for plot faceting (grid y)

Select a factor for plot faceting (grid x)

Select a factor for grouping/coloring on each sub-plot (z)

Smoothing

smooth

Download Plot Time

Download Report