转载

Lesson 08 for Plotting in R for Biologists

taoyan:R语言中文社区特约作家,伪码农,R语言爱好者,爱开源。

个人博客: https://ytlogos.github.io/

这节课是最后一节课,主要将如何绘制热图(heatmap)。实际上关于热图的绘制,我以前写了一篇博客:R语言学习笔记之热图绘制 ,里面写的十分详细。但是今天热图绘制主要利用一个新的R包ComplexHeatmap进行绘制。


包安装

source("https://bioconductor.org/biocLite.R")

options(BioC_mirror="http://mirrors.ustc.edu.cn/bioc/")

biocLite("ComplexHeatmap")

数据加载

library(ComplexHeatmap)

my_data <- read.table("copy_number_data.txt", sep = "\t", quote="", stringsAsFactors = FALSE, header = TRUE)

head(my_data[,1:6]) 

##    CHR     START       END X.SRR089523. X.SRR089526. X.SRR089529.

## 1 chr1  53440429  53494914    1.1002112    1.1000844    1.0487301

## 2 chr1 105459037 105514187    1.1860780    0.6553897    0.4980016

## 3 chr1 183998520 184048557    1.3791250    1.2024487    0.5735184

## 4 chr1 236135655 236186012    0.8987158    1.1182392    0.6396842

## 5 chr2  38944803  38996507    1.0788216    0.9663390    1.0473623

## 6 chr2  97248366  97298651    1.0459629    1.0658991    1.0473623  

#绘制热图前先进行矩阵化

my_matrix <- as.matrix(my_data[, c(4:100)])

head(my_matrix[,1:4]) 

##      X.SRR089523. X.SRR089526. X.SRR089529. X.SRR089533.

## [1,]    1.1002112    1.1000844    1.0487301    0.5656784

## [2,]    1.1860780    0.6553897    0.4980016    0.5656784

## [3,]    1.3791250    1.2024487    0.5735184    1.0362551

## [4,]    0.8987158    1.1182392    0.6396842    0.4659238

## [5,]    1.0788216    0.9663390    1.0473623    1.1702930

## [6,]    1.0459629    1.0658991    1.0473623    1.2298119  

 #将染色体信息存储好以便后续热图注释

chromosome_info <- data.frame(chrom=my_data$CHR)

   

可视化

Heatmap(my_matrix)

   可以看到使用默认参数热图时十分不美观,可以进行部分调整

 #坐标旋转

my_matrix <- t(my_matrix)

Heatmap(my_matrix)

坐标标签还是十分乱,还得进行调整

Heatmap(my_matrix, cluster_columns = FALSE)#列不聚类

将坐标标签位置置于左边   

Heatmap(my_matrix,

cluster_columns = FALSE,

row_names_side = "left",

row_hclust_side = "left",

row_names_gp = gpar(cex=0.6)

)

Heatmap(my_matrix,

cluster_columns = FALSE,

row_names_side = "left",

row_hclust_side = "left",

row_names_gp = gpar(cex=0.6),

row_hclust_width = unit(3, "cm"))

热图绘制的时候可以使用不同的距离计算方式以及聚类方法,具体的用法可以参考我以前的文章:R语言学习笔记之热图绘制。 

Heatmap(my_matrix,

cluster_columns = FALSE,

row_names_side = "left",

row_hclust_side = "left",

row_names_gp = gpar(cex=0.6),

row_hclust_width = unit(3, "cm"),

clustering_distance_rows = "maximum",

clustering_method_rows = "ward.D")

   

热图注释

主要是对聚成的类进行颜色标记

library(dendextend)

dend <- hclust(dist(my_matrix, method = "maximum"), method = "ward.D")

Heatmap(my_matrix,

cluster_columns = FALSE,

row_names_side = "left",

row_hclust_side = "left",

row_names_gp = gpar(cex=0.6),

row_hclust_width = unit(3, "cm"),

cluster_rows = color_branches(dend, k=3))

将热图分类

随便你先分成几类,但是一般来说有一个最佳分类数,可参考我的另一篇博客:R语言学习笔记之聚类分析

Heatmap(my_matrix,

cluster_columns = FALSE,

row_names_side = "left",

row_hclust_side = "left",

row_names_gp = gpar(cex=0.6),

row_hclust_width = unit(3, "cm"),

clustering_distance_rows = "maximum",

clustering_method_rows = "ward.D",

km=2)

   

根据染色体信息进行注释 

chromosome_colors <- c(rep(c("black","white"),11), "red")

names(chromosome_colors) <- paste("chr",c(seq(1:22), "X"), sep = "")

Heatmap(my_matrix,

cluster_columns = FALSE,

row_names_side = "left",

row_hclust_side = "left",

row_names_gp = gpar(cex=0.6),

row_hclust_width = unit(3, "cm"),

clustering_distance_rows = "maximum",

clustering_method_rows = "ward.D",

km=2,

bottom_annotation = HeatmapAnnotation(chromosome_info, col = list(chrom=chromosome_colors), show_legend = FALSE))

 SessionInfo 

sessionInfo()   

## R version 3.4.3 (2017-11-30)

## Platform: x86_64-w64-mingw32/x64 (64-bit)

## Running under: Windows 10 x64 (build 16299)

##

## Matrix products: default

##

## locale:

## [1] LC_COLLATE=Chinese (Simplified)_China.936

## [2] LC_CTYPE=Chinese (Simplified)_China.936  

## [3] LC_MONETARY=Chinese (Simplified)_China.936

## [4] LC_NUMERIC=C                              

## [5] LC_TIME=Chinese (Simplified)_China.936    

##

## attached base packages:

## [1] grid      stats     graphics  grDevices utils     datasets  methods  

## [8] base    

##

## other attached packages:

## [1] dendextend_1.6.0      ComplexHeatmap_1.17.1

##

## loaded via a namespace (and not attached):

##  [1] Rcpp_0.12.15         DEoptimR_1.0-8       compiler_3.4.3      

##  [4] pillar_1.1.0         RColorBrewer_1.1-2   plyr_1.8.4          

##  [7] viridis_0.4.1        class_7.3-14         prabclus_2.2-6      

## [10] tools_3.4.3          digest_0.6.15        mclust_5.4          

## [13] viridisLite_0.3.0    evaluate_0.10.1      tibble_1.4.2        

## [16] gtable_0.2.0         lattice_0.20-35      rlang_0.1.6        

## [19] yaml_2.1.16          mvtnorm_1.0-7        gridExtra_2.3      

## [22] trimcluster_0.1-2    stringr_1.2.0        knitr_1.19          

## [25] cluster_2.0.6        GlobalOptions_0.0.12 fpc_2.1-11          

## [28] diptest_0.75-7       nnet_7.3-12          stats4_3.4.3        

## [31] rprojroot_1.3-2      robustbase_0.92-8    GetoptLong_0.1.6    

## [34] flexmix_2.3-14       rmarkdown_1.8        kernlab_0.9-25      

## [37] ggplot2_2.2.1.9000   magrittr_1.5         whisker_0.3-2      

## [40] modeltools_0.2-21    backports_1.1.2      scales_0.5.0.9000  

## [43] htmltools_0.3.6      MASS_7.3-48          shape_1.4.3        

## [46] circlize_0.4.3       colorspace_1.3-2     stringi_1.1.6      

## [49] lazyeval_0.2.1       munsell_0.4.3        rjson_0.2.15

往期文章

R语言可视化学习笔记之相关矩阵可视化包ggcorrplot

R语言学习笔记之相关性矩阵分析及其可视化

ggplot2学习笔记系列之利用ggplot2绘制误差棒及显著性标记

ggplot2学习笔记系列之主题(theme)设置

用circlize包绘制circos-plot

利用gganimate可视化R-Ladies发展情况

一篇关于国旗与奥运会奖牌的可视化笔记

利用ggseqlogo绘制seqlogo图

R语言data manipulation学习笔记之创建变量、重命名、数据融合

R语言data manipulation学习笔记之subset data

R语言可视化学习笔记之gganimate包

创建属于自己的调色板

Lesson 01 for Plotting in R for Biologists

Lesson 02&03 for Plotting in R for Biologists

Lesson 04 for Plotting in R for Biologists

Lesson 05 for Plotting in R for Biologists

Lesson 06 for Plotting in R for Biologists

Lesson 07 for Plotting in R for Biologists

R语言学习笔记之热图绘制

正文到此结束
Loading...