New ERGO Feature: Violin Plots for Expression Analysis

We’ve expanded ERGO’s rich set of visualizations with a new plot type: the violin plot. A violin plot visualizes the distribution of a set of values (in our case, log fold change). It serves a similar purpose to a box plot but has a few advantages. Instead of showing only the median, quartiles, minimum, and maximum, a violin plot uses kernel density estimation to show the shape of the distribution of the underlying points.

[Figure: violin plots of log fold change (lfc) by functional category]

In the above plot we can see that although the median log fold change (lfc) for “Reactive oxygen detoxification” is negative, this category also contains a group of genes with a positive lfc, centered around 2.5. In addition, “Bacterial Photosynthesis” has a positive median lfc, and most of the genes in this category cluster around the median.


A kernel density estimator (KDE) is similar to a histogram. A histogram, however, depends on the choice of bin width and bin placement and is discontinuous, whereas KDE uses a kernel function to create a smooth, continuous curve that estimates the probability density of a random variable. This allows us to make inferences about the underlying population from a limited number of samples. In a violin plot, the density curve is mirrored about the central axis, which can make the plot look like a violin.
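To make the idea concrete, here is a minimal sketch of a kernel density estimate and its mirrored violin outline, written in plain Python. It is not ERGO's implementation. The function names, the sample values, the grid, and the bandwidth of 0.6 are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(samples, x, bandwidth, kernel=gaussian_kernel):
    """Kernel density estimate at x: the average of one kernel bump
    per sample, each bump scaled by the bandwidth."""
    n = len(samples)
    return sum(kernel((x - s) / bandwidth) for s in samples) / (n * bandwidth)

# Hypothetical log fold change values (made up for illustration).
lfc = [-1.6, -1.2, -1.0, -0.8, -0.5, 2.3, 2.5, 2.7]

# Evaluate the density on an evenly spaced grid from -5.0 to 6.0.
step = 0.1
grid = [-5.0 + i * step for i in range(111)]
density = [kde(lfc, x, bandwidth=0.6) for x in grid]

# Mirror the curve: at each lfc value the violin spans from -d to +d.
violin_outline = [(-d, x, d) for x, d in zip(grid, density)]
```

Each tuple in `violin_outline` gives the left edge, the lfc value, and the right edge of the violin at that height. A plotting library would fill the area between the two edges.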


ERGO provides two different kernel functions: Gaussian and Epanechnikov. In addition to the kernel function, the choice of bandwidth is also important. The bandwidth parameter balances ‘smoothness’ against the amount of underlying structure that is represented in the density plot: a larger bandwidth gives a smoother curve, while a smaller bandwidth preserves more detail (see below).


Epanechnikov with a bandwidth of 2


Epanechnikov with a bandwidth of 1.2
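The effect of bandwidth can also be sketched in a few lines of plain Python. This is a rough illustration, not ERGO's code. The sample values, grid, and function names are assumptions, and only the bandwidths 2 and 1.2 come from the figures above.

```python
def epanechnikov_kernel(u):
    """Epanechnikov kernel: a parabola on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kde(samples, x, bandwidth):
    """Kernel density estimate at x using the Epanechnikov kernel."""
    n = len(samples)
    total = sum(epanechnikov_kernel((x - s) / bandwidth) for s in samples)
    return total / (n * bandwidth)

# Hypothetical log fold change values (made up for illustration):
# a negative cluster plus a smaller group of genes near 2.5.
lfc = [-1.6, -1.2, -1.0, -0.8, -0.5, 2.3, 2.5, 2.7]

# Evaluate both bandwidths on the same grid from -5.0 to 6.0.
step = 0.1
grid = [-5.0 + i * step for i in range(111)]
wide = [kde(lfc, x, 2.0) for x in grid]
narrow = [kde(lfc, x, 1.2) for x in grid]

# Compare peak heights, and the density in the gap between the two
# groups of samples (x = 0.8).
print(f"peak height, bandwidth 2.0: {max(wide):.3f}")
print(f"peak height, bandwidth 1.2: {max(narrow):.3f}")
print(f"density at 0.8, bandwidth 2.0: {kde(lfc, 0.8, 2.0):.3f}")
print(f"density at 0.8, bandwidth 1.2: {kde(lfc, 0.8, 1.2):.3f}")
```

With these made-up values, the smaller bandwidth produces taller peaks and leaves the gap between the two groups empty, while the larger bandwidth spreads each sample far enough to partly fill the gap.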

Want to learn more about ERGO or give it a try?

References:

M.P. Wand, M.C. Jones. Kernel Smoothing. 1st ed. New York: Chapman & Hall; 1995. 212p. doi: 10.1007/978-1-4899-4493-1