In this guide, we’ll walk you through the installation of the
BioLizardStyleR package and provide a brief overview of its
functions using a practical example. The BioLizardStyleR
package helps you make ggplot2 plots in the BioLizard style. It includes
an integrated font installer, a dedicated ggplot2 theme, custom color
functions, and a tool to add a BioLizard footer and export the plot.
The box below shows how to BioLizard-ify a Rmd knit to an HTML using a CSS style file. In the YAML header, you can set output to html_document with two options: toc: true and toc_float: true.
@import url('https://fonts.googleapis.com/css?family=Red+Hat+Display');
/*** default styling ***/
html, body, h1, h2, h3, h4, h5, h6 {
font-family: 'Red Hat Display', sans-serif;
}
div.main-container {
background-color: white;
max-width: 1200px !important;
padding: 40px;
margin-top: 50px;
margin-bottom: 50px;
border-radius: 15px;
box-shadow: 0px 10px 30px rgba(0, 0, 0, 0.2);
}
h4.author em {
font-size: 1rem;
margin-top: 2rem;
font-weight: 600;
color: #000000;
}
h4.date em {
color: #000000;
font-size: 1rem;
}
h1:not(.title) {
color: #009944;
}
h2 {
color: #0d47a1;
}
h3 {
color: #FF6F59;
}
body {
background: linear-gradient(180deg, #faf4ed 0%, #badeed 100%);
background-attachment: scroll;
}
To install the BioLizardStyleR package directly from
GitHub, you will need to utilize the devtools package. The
package requires our signature font “Red Hat Display”. If the font is
not yet installed, installation will be attempted automatically when you
load the library.
if (!requireNamespace("devtools", quietly = TRUE)) {
install.packages("devtools")
}
# make sure to look up BioConductor packages (used in the vignette)
setRepositories(ind = c(1:6, 8))
devtools::install_github("lizard-bio/nature-grade-visualization-playground", subdir="BioLizardStyleR")
For building the vignette, you also need the pasilla
dataset. In case of issues with the installation of this package, you
might have to install libbz2 (libbz2-dev (deb), libbz2-devel (rpm),
bzip2 (brew)).
library(BioLizardStyleR)
#> Loading required package: ggplot2
#> Red Hat Display loaded via Google Fonts. Ready for BLZ style!
If automatic font installation fails on your system, you can install the font manually using the following steps:
Transforming your ggplot to match the BioLizard style involves a three-step process. Let’s illustrate this on a basic ggplot.
library(ggplot2)
data("mtcars")
mtcars$gear <- factor(mtcars$gear, levels = c(3, 4, 5), ordered = TRUE)
testplot <- ggplot2::ggplot(data = mtcars, ggplot2::aes(x = hp, y = mpg)) +
ggplot2::geom_point(ggplot2::aes(color = gear),size=3) +
ggplot2::labs(title = "Miles per Gallon vs. Horsepower",
x = "Horsepower",
y = "Miles per Gallon",
color = "Gears")
testplot
In the initial step, we implement the biolizard theme through the
lizard_style() function. This not only modifies various
theme settings but also adjusts the font. An important note here is that
the lizard theme can only act as a starting template! Depending on your
specific plot, title length, variables in use, and other factors, you
might need to make further theme adjustments.
testplot <- testplot + lizard_style()
testplot
Tip: you can use
theme_set(lizard_style()) at the beginning of your script
or report to set the theme to lizard_style() in all plots, without
having to add + lizard_style() to each plot individually.
Te reset the theme to the ggplot2 default, you can call
reset_theme_settings().
Once you’ve styled your plot, you can enhance it with a BioLizard color palette. You have the option to choose from qualitative, sequential, or divergent palettes tailored for either discrete or continuous variables. These palettes incorporate the distinct hues of the BioLizard house colors, while ensuring they are color-blind friendly and perceptually uniform. For a deeper understanding of each palette, it’s recommended to consult the documentation for the respective function.
The coloring functions within the package adhere to a flexible structure: scale_color/fill_biolizard. The type (discrete/continuous) and the scheme must be determined as function parameters.
There are 7 available schemes:
data("mtcars")
mtcars$gear <- as.factor(mtcars$gear)
# Base plot
testplot <- ggplot2::ggplot(data = mtcars, ggplot2::aes(x = hp, y = mpg)) +
ggplot2::geom_point(aes(color = gear), size=3) +
ggplot2::labs(
title = "Miles per Gallon vs. Horsepower",
x = "Horsepower",
y = "Miles per Gallon",
color = "Gears"
)
# Applying the qualitative color scale
testplot_qualitative <- testplot +
scale_color_biolizard(type = "discrete", scheme = "qualitative") +
lizard_style()
testplot_qualitative
# Applying the sequential color scales (assuming you have a different dataset or aesthetic)
testplot_sequential <- testplot +
scale_color_biolizard(type = "discrete", scheme = "sequential") +
lizard_style()
testplot_sequential
## sequential scale inspired by BioLizard beige and blue (assuming you have a different dataset or aesthetic)
testplot_sequential <- testplot +
scale_color_biolizard(type = "discrete", scheme = "beige_blue") +
lizard_style()
testplot_sequential
# Applying the divergent color scale (assuming you have a different dataset or aesthetic)
testplot_divergent <- testplot +
scale_color_biolizard(type = "discrete", scheme = "divergent") +
lizard_style()
testplot_divergent
It is possible to reverse the color scales by setting
reverse = TRUE
testplot_divergent <- testplot +
scale_color_biolizard(type = "discrete", scheme = "divergent", reverse = TRUE) +
lizard_style()
testplot_divergent
The discrete color palettes can also be called directly, in case you would like to use them outside ggplot.
#use color palette
mycolors <- biolizard_pal_sequential(3, reverse = TRUE)
gearcolors <- mycolors[as.factor(mtcars$gear)]
graphics::plot(x = mtcars$hp, y = mtcars$mpg, pch = 19, col = gearcolors, family = "Red Hat Display")
graphics::legend("topright", legend = paste("Gears", 3:5), col = mycolors, pch = 19, bty = "n")
# #use BLZ base colors
# mycolors <- c(blz_green, blz_blue, blz_yellow)
# gearcolors <- mycolors[as.factor(mtcars$gear)]
#
# graphics::plot(x = mtcars$hp, y = mtcars$mpg, pch = 19, col = gearcolors, family = "Red Hat Display")
# graphics::legend("topright", legend = paste("Gears", 3:5), col = mycolors, pch = 19, bty = "n")
x = 1:8
y = 1:8
df <- data.frame(x=x, y=y)
ggplot2::ggplot(df, ggplot2::aes(x=x, y=y, fill = factor(x))) +
ggplot2::geom_col() +
scale_fill_biolizard() +
lizard_style()
ids <- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))
values <- data.frame(
id = ids,
value = c(3, 3.1, 3.1, 3.2, 3.15, 3.5)
)
positions <- data.frame(
id = rep(ids, each = 4),
x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5, 1.1, 0.3,
0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7, 1, 1.5,
2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
)
datapoly <- merge(values, positions, by = c("id"))
ggplot2::ggplot(datapoly, ggplot2::aes(x = x, y = y)) +
ggplot2::geom_polygon(ggplot2::aes(group = id, fill = id)) +
scale_fill_biolizard(type = "discrete", scheme = "beige_gn_blue") +
lizard_style()
ggplot2::ggplot(data = mtcars, ggplot2::aes(x = as.factor(gear), y = mpg)) +
ggplot2::geom_boxplot() +
ggplot2::labs(
title = "Miles per Gallon vs. Gears",
x = "Gears",
y = "Miles per Gallon"
) +
lizard_style()
A violin plot is similar to a boxplot, but captures more information on the data distribution:
ggplot2::ggplot(data = mtcars, ggplot2::aes(x = as.factor(gear), y = mpg)) +
ggplot2::geom_violin() +
ggplot2::labs(
title = "Miles per Gallon vs. Gears",
x = "Gears",
y = "Miles per Gallon"
) +
lizard_style()
ggplot2::ggplot(data = mtcars, ggplot2::aes(x = mpg)) +
ggplot2::geom_density() +
lizard_style()
ggplot2::ggplot(data = mtcars, ggplot2::aes(x = hp, y = mpg)) +
ggplot2::geom_smooth() +
ggplot2::geom_point(ggplot2::aes(color= mpg)) +
ggplot2::labs(
title = "Miles per Gallon vs. horsepower",
x = "Horsepower",
y = "Miles per Gallon",
color = "Miles per Gallon",
fill = "Miles per Gallon"
) +
scale_color_biolizard(type = "continuous", scheme = "sequential") +
lizard_style()
#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Faceting is a great way to lay out related plots side-by-side, with the axes either on the same scale or on different scales.
# use facet_wrap to stratify plots by one variable
ggplot2::ggplot(mtcars, ggplot2::aes(x=mpg, y=hp)) +
ggplot2::geom_point() +
ggplot2::facet_wrap(ggplot2::vars(gear)) +
lizard_style() +
ggplot2::theme(axis.text.x = ggplot2::element_text(size = 8, angle = 90, vjust = 0.5))
# use facet_grid for two variables
ggplot2::ggplot(mtcars, ggplot2::aes(x=mpg, y=hp)) +
ggplot2::geom_point() +
ggplot2::facet_grid(rows = ggplot2::vars(gear), cols = ggplot2::vars(carb)) +
lizard_style() +
ggplot2::theme(axis.text.x = ggplot2::element_text(size = 8, angle = 90, vjust = 0.5))
Note that the minimal style makes it difficult to distinguish the
different blocks from each other. Readability of the figure is improved
by plotting the axes for each plot, but they are removed by
facet_grid and facet_wrap when the plots are
on the same scale. The similar functions facet_rep_grid and
facet_rep_wrap from the lemon package are more suitable
with the lizard style, since they will show all axes.
library("lemon")
# use facet_wrap to stratify plots by one variable
ggplot2::ggplot(mtcars, ggplot2::aes(x=mpg, y=hp)) +
ggplot2::geom_point() +
lemon::facet_rep_wrap(ggplot2::vars(gear)) +
lizard_style() +
ggplot2::theme(axis.text.x = ggplot2::element_text(size = 8, angle = 90, vjust = 0.5))
# use facet_grid for two variables
ggplot2::ggplot(mtcars, ggplot2::aes(x=mpg, y=hp)) +
ggplot2::geom_point() +
lemon::facet_rep_grid(rows = ggplot2::vars(gear), cols = ggplot2::vars(carb)) +
lizard_style() +
ggplot2::theme(axis.text.x = ggplot2::element_text(size = 8, angle = 90, vjust = 0.5))
Treemap plots are a powerful way of visualizing hierarchical data such as GO terms, for example. The size of the rectangles corresponds to the value you want to show (e.g. p value), and related rectangles are grouped together. This example makes use of the Pokémon dataset from the highcharter package and is based on https://yjunechoe.github.io/posts/2020-06-30-treemap-with-ggplot/.
In this example, more colors are requested than are present in the
qualitative color scales. The biolizard_pal_hue palette was
developped for situations like this. It is a specific version of the
scales::pal_hue function which is used by default by
ggplot2, that starts at Biolizard’s signature green.
The treemapify package allows plotting treemaps with ggplot2.
library(treemapify)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data("pokemon", package = "highcharter")
# Cleaning up data for a treemap
data <- pokemon |>
dplyr::select(pokemon, type_1, type_2) |>
dplyr::mutate(type_2 = ifelse(is.na(type_2), paste("only", type_1), type_2)) |>
dplyr::group_by(type_1, type_2) |>
dplyr::summarise(n = length(pokemon)) |>
dplyr::ungroup()
#> `summarise()` has grouped output by 'type_1'. You can override using the
#> `.groups` argument.
head(data)
#> # A tibble: 6 × 3
#> type_1 type_2 n
#> <chr> <chr> <int>
#> 1 bug electric 4
#> 2 bug fairy 2
#> 3 bug fighting 3
#> 4 bug fire 2
#> 5 bug flying 13
#> 6 bug ghost 1
# plot the treemap
ggplot2::ggplot(data, ggplot2::aes(area = n, label = type_2, fill = type_1,
subgroup = type_1)) +
ggplot2::ggtitle("Number of pokémon stratified by type") +
# 1. Draw type_2 borders and fill colors
treemapify::geom_treemap() +
# 2. Draw type_1 borders
treemapify::geom_treemap_subgroup_border() +
# 3. Print type_1 text
treemapify::geom_treemap_subgroup_text(place = "centre", grow = T, alpha = 0.5, colour = "black",
fontface = "italic", min.size = 0,
family = "Red Hat Display") + #need to add "Red Hat Display" manually as this text is not accessed in theme()
# 4. Print type_2 text
treemapify::geom_treemap_text(colour = "white", place = "topleft", reflow = T,
family = "Red Hat Display") + #need to add "Red Hat Display" manually as this text is not accessed in theme()
# 5. BLZ style
scale_fill_biolizard(scheme = "beige_gn_blue") +
lizard_style() +
ggplot2::theme(legend.position = "none")
However, this is not ideal, since the font of the text still has to be defined manually, and thus the lizard_style() function does not work very well for these treemaps, even though they are ggplot2-based. Another option is to use the ‘treemap’ library, and move away from ggplot2 altogether, which generally leads to more aesthetically pleasing plots:
library(treemap)
# get colors from the biolizard hues palette
ncols <- length(unique(data$type_1))
palette <- biolizard_pal_beige_gn_blue(ncols)
treemap::treemap(data, index=c("type_1", "type_2"), vSize="n", type="index",
title="Number of pokémon stratified by type",
palette=palette, #set the BLZ color palette
fontcolor.labels=c("#FFFFFFDD", "#00000080"), bg.labels=0, border.col="#00000080",
fontfamily.title = "Red Hat Display", fontfamily.labels = "Red Hat Display", fontfamily.legend = "Red Hat Display") #set the Red Hat Display font
Idea from “A ggplot2 Tutorial for Beautiful Plotting in R”.
example_file <- system.file("extdata", "chicago-nmmaps-custom.csv", package = "BioLizardStyleR")
chic <- utils::read.csv(example_file) #An example dataset
g <- ggplot2::ggplot(chic, ggplot2::aes(x = season, y = o3,
color = season)) +
ggplot2::labs(x = "Season", y = "Ozone") +
ggplot2::geom_violin(linewidth = 1, alpha = .5) +
ggplot2::geom_jitter(alpha = .25, width = .3) +
ggplot2::coord_flip()
g + lizard_style() +
scale_color_biolizard()
library(corrr)
library(forcats)
corm <-
chic |>
dplyr::select(temp, dewpoint, pm10, o3) |>
corrr::correlate(diagonal = 1) |>
corrr::shave(upper = FALSE)
#> Correlation computed with
#> • Method: 'pearson'
#> • Missing treated using: 'pairwise.complete.obs'
corm <- corm |>
tidyr::pivot_longer(
cols = -term,
names_to = "colname",
values_to = "corr"
) |>
dplyr::mutate(
rowname = forcats::fct_inorder(term),
colname = forcats::fct_inorder(colname),
label = dplyr::if_else(is.na(corr), "", sprintf("%1.2f", corr))
)
g <- ggplot2::ggplot(corm, ggplot2::aes(rowname, fct_rev(colname),
fill = corr)) +
ggplot2::geom_tile() +
ggplot2::geom_text(ggplot2::aes(label = label)) +
ggplot2::coord_fixed(expand = FALSE) +
ggplot2::labs(x = NULL, y = NULL)
g + scale_fill_biolizard(type='continuous',scheme='divergent',limits = c(-1, 1),
na.value = "white", name="Pearson\nCorrelation") +
lizard_style() +
ggplot2::theme(legend.position = "inside", legend.position.inside = c(.95, .8))
g <- ggplot2::ggplot(chic, ggplot2::aes(temp, o3)) +
ggplot2::geom_hex(color = "grey") +
ggplot2::labs(x = "Temperature (°F)",
y = "Ozone Level",
title = 'How Temperature Affects Ozone Levels',
subtitle ='Hexagonal Binning of Ozone Levels vs. Temperature in °F')
g <- g + lizard_style() +
scale_fill_biolizard(type = 'continuous',scheme = 'beige_blue')
g
finalise_lizardplot(g,source="Source: The Chicago dataset (https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/)")
#> Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, :
#> font family 'Nunito Sans 10pt Medium' not found, will use 'sans' instead
#> Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, :
#> font family 'Nunito Sans 10pt Medium' not found, will use 'sans' instead
#> Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, :
#> font family 'Nunito Sans 10pt Medium' not found, will use 'sans' instead
ggplotly can be used to make a ggplot interactive, for
example to use in a html report. See an example of in interactive PCA
plot and volcano plot below. To generate these plots, we used the
per-gene read counts of the pasilla
dataset. Make sure that you do not apply
lizard_style() to the ggplot call when you want to use
ggplotly. The function lizard_layout() applies
the lizard theme to a plotly object.
Note: to succesfully install Pasilla, you might have to manually install libbz2 (libbz2-dev (deb), libbz2-devel (rpm), bzip2 (brew)).
library(pasilla)
#> Loading required package: DEXSeq
#> Loading required package: BiocParallel
#> Loading required package: Biobase
#> Loading required package: BiocGenerics
#> Loading required package: generics
#>
#> Attaching package: 'generics'
#> The following object is masked from 'package:dplyr':
#>
#> explain
#> The following objects are masked from 'package:base':
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
#>
#> Attaching package: 'BiocGenerics'
#> The following object is masked from 'package:dplyr':
#>
#> combine
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> anyDuplicated, aperm, append, as.data.frame, basename, cbind,
#> colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
#> get, grep, grepl, is.unsorted, lapply, Map, mapply, match, mget,
#> order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#> rbind, Reduce, rownames, sapply, saveRDS, table, tapply, unique,
#> unsplit, which.max, which.min
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#>
#> Attaching package: 'matrixStats'
#> The following objects are masked from 'package:Biobase':
#>
#> anyMissing, rowMedians
#> The following object is masked from 'package:dplyr':
#>
#> count
#>
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#>
#> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#> colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#> colWeightedMeans, colWeightedMedians, colWeightedSds,
#> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#> rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#> rowWeightedSds, rowWeightedVars
#> The following object is masked from 'package:Biobase':
#>
#> rowMedians
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:dplyr':
#>
#> first, rename
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> expand.grid, I, unname
#> Loading required package: IRanges
#>
#> Attaching package: 'IRanges'
#> The following objects are masked from 'package:dplyr':
#>
#> collapse, desc, slice
#> Loading required package: GenomeInfoDb
#> Loading required package: DESeq2
#> Loading required package: AnnotationDbi
#>
#> Attaching package: 'AnnotationDbi'
#> The following object is masked from 'package:dplyr':
#>
#> select
#> Loading required package: RColorBrewer
library(plotly)
#>
#> Attaching package: 'plotly'
#> The following object is masked from 'package:AnnotationDbi':
#>
#> select
#> The following object is masked from 'package:IRanges':
#>
#> slice
#> The following object is masked from 'package:S4Vectors':
#>
#> rename
#> The following object is masked from 'package:ggplot2':
#>
#> last_plot
#> The following object is masked from 'package:stats':
#>
#> filter
#> The following object is masked from 'package:graphics':
#>
#> layout
# get gene counts
datafiles <- system.file("extdata", package="pasilla", mustWork=TRUE)
count.table <- utils::read.table(file.path(datafiles, "pasilla_gene_counts.tsv"), header=TRUE, row.names=1, quote="", comment.char="" )
head(count.table)
#> untreated1 untreated2 untreated3 untreated4 treated1 treated2
#> FBgn0000003 0 0 0 0 0 0
#> FBgn0000008 92 161 76 70 140 88
#> FBgn0000014 5 1 0 0 4 0
#> FBgn0000015 0 2 1 2 1 0
#> FBgn0000017 4664 8714 3564 3150 6205 3072
#> FBgn0000018 583 761 245 310 722 299
#> treated3
#> FBgn0000003 1
#> FBgn0000008 70
#> FBgn0000014 0
#> FBgn0000015 0
#> FBgn0000017 3334
#> FBgn0000018 308
# get metadata
SampleAnno <- utils::read.csv(file.path(datafiles, "pasilla_sample_annotation.csv"))
head(SampleAnno)
#> file condition type number.of.lanes total.number.of.reads
#> 1 treated1fb treated single-read 5 35158667
#> 2 treated2fb treated paired-end 2 12242535 (x2)
#> 3 treated3fb treated paired-end 2 12443664 (x2)
#> 4 untreated1fb untreated single-read 2 17812866
#> 5 untreated2fb untreated single-read 6 34284521
#> 6 untreated3fb untreated paired-end 2 10542625 (x2)
#> exon.counts
#> 1 15679615
#> 2 15620018
#> 3 12733865
#> 4 14924838
#> 5 20764558
#> 6 10283129
# DE analysis with edgeR
library(edgeR)
#> Loading required package: limma
#>
#> Attaching package: 'limma'
#> The following object is masked from 'package:DEXSeq':
#>
#> plotMA
#> The following object is masked from 'package:DESeq2':
#>
#> plotMA
#> The following object is masked from 'package:BiocGenerics':
#>
#> plotMA
Group <- as.factor(c(rep("untreated", 4), rep("treated", 3)))
y <- edgeR::DGEList(counts = count.table, group = Group)
# filter genes
keep <- edgeR::filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]
# normalize counts
y <- edgeR::calcNormFactors(y)
norm.counts <- edgeR::cpm(y, normalized.lib.sizes = TRUE, log = T)
# plot interactive PCA
mds <- limma::plotMDS(y, gene.selection="common", plot = F)
toplot <- data.frame(Dim1 = mds$x, Dim2 = mds$y, Group = Group, sample = colnames(mds$distance.matrix.squared))
pcaPlot <- ggplot2::ggplot(toplot, ggplot2::aes(Dim1, Dim2, colour = Group)) +
ggplot2::geom_point() +
ggplot2::geom_text(ggplot2::aes(label = sample), size = 3, nudge_y = 0.05) +
ggplot2::coord_fixed() +
ggplot2::xlab(paste0("PC1: ", round(mds$var.explained[1]*100), "% variance")) +
ggplot2::ylab(paste0("PC2: ", round(mds$var.explained[2]*100), "% variance")) +
scale_color_biolizard()
plotly::ggplotly(pcaPlot) |> lizard_layout()
# The lizard style is added through lizard_layout() on plotly instead of lizard_style() in ggplot
# fit general linear model
design <- stats::model.matrix(~ 0 + Group)
colnames(design) <- levels(Group)
y <- edgeR::estimateDisp(y, design)
fit <- edgeR::glmQLFit(y, design)
# fit contrasts of interest
my.contrasts <- limma::makeContrasts(treatedVsUntreated = treated-untreated,
levels=design)
lrt <- edgeR::glmQLFTest(fit, contrast = my.contrasts)
DE <- edgeR::topTags(lrt, n="all")$table
colnames(DE) <- c("log2FoldChange", "logCPM", "LR", "pvalue", "padj")
DE <- DE[order(DE$padj), ]
DE <- DE[!is.na(DE$padj), ]
head(DE)
#> log2FoldChange logCPM LR pvalue padj
#> FBgn0039155 -4.605077 5.882320 1032.5624 3.519114e-11 2.786786e-07
#> FBgn0025111 2.906975 6.925429 718.6612 2.001949e-10 7.926717e-07
#> FBgn0029167 -2.189661 8.222828 527.6780 8.769068e-10 2.151196e-06
#> FBgn0035085 -2.549919 5.685708 504.5058 1.086600e-09 2.151196e-06
#> FBgn0003360 -3.171310 8.451316 467.1444 1.567356e-09 2.482378e-06
#> FBgn0039827 -4.149923 4.399320 412.0585 2.861632e-09 3.476204e-06
# interactive volcano plot
p <- DE |>
ggplot2::ggplot(ggplot2::aes(x = log2FoldChange, y = -log10(pvalue))) +
ggplot2::geom_point(ggplot2::aes(color = padj < 0.05)) +
scale_color_biolizard(reverse = TRUE)
plotly::ggplotly(p) |> lizard_layout()