R Part Two: Graph

2.1 Basic graph functions

In this tutorial, we will see how to use R to make some very basic types of graphs, which are likely to be used in almost any kind of analysis. The examples will give you a feel for how much can be accomplished with very little R code, which is one big reason why R is a good choice of analysis platform.

Creating scatter plots

How to make scatter plots using some very simple commands

See the data in the "cars" object. This object is automatically loaded in the R environment as example data.

> cars

> cars[1:10,]

> colnames(cars)

Plot the relationship between dist and speed

> plot(cars$dist~cars$speed)

Adjust the parameters

> plot(cars$dist~cars$speed, # y~x
main="Relationship between distance and speed", # Plot title
xlab="Speed (mph)", # X axis title
ylab="Distance travelled (ft)", # Y axis title
xlim=c(0,30), # Set x axis limits from 0 to 30
ylim=c(0,140), # Set y axis limits from 0 to 140
col="red", # Set the color of plotting symbol to red
pch=19) # Set the plotting symbol to solid circles
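The same plot can also be written with the data= argument of the formula interface, which avoids repeating cars$ on both sides of the formula. A minimal sketch, assuming the built-in cars data set:

```r
# Formula interface with data=: dist and speed are looked up inside cars
plot(dist ~ speed, data = cars,
     main = "Relationship between distance and speed",
     col = "red", pch = 19)   # pch = 19 draws solid circles
```

With data= the column names are used as the default axis labels, so xlab and ylab only need to be set when a more descriptive title is wanted.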

Creating line graphs

How to make a line graph used for looking at trends

Create some data

> x <- c(1:5); y <- 2*x

Plot the line graph

> plot(y~x, type="l", #Specify type of plot as l for line
col="blue")
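Trend plots usually compare more than one series. The sketch below adds a second line to an existing plot with lines(); the series y2 is a hypothetical example introduced only for illustration.

```r
x <- 1:5
y <- 2 * x
y2 <- x^2                            # hypothetical second series

# Set ylim from both series so the second line is not clipped
plot(y ~ x, type = "l", col = "blue", ylim = range(c(y, y2)))
lines(x, y2, col = "red", lty = 2)   # lines() adds to the current plot
legend("topleft", legend = c("y = 2x", "y = x^2"),
       col = c("blue", "red"), lty = c(1, 2))
```

plot() always starts a new figure, while lines() and legend() draw on top of the one that is already open.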

Creating bar charts

How to make a bar chart for comparing values across categories

See the data and sort it by weight (wt), heaviest first

> mtcars

> mymtcars<-mtcars[rev(order(mtcars$wt)),]

Keep the five heaviest cars and plot the bar chart

> mymtcars<-mymtcars[1:5,]

> barplot(mymtcars$wt,names.arg= rownames(mymtcars),col="black")
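Long car names tend to overlap under vertical bars. One possible variation, sketched here with an assumed margin setting, draws the bars horizontally and uses the bar midpoints returned by barplot() to print the values.

```r
# Five heaviest cars, selected as in the example above
top5 <- mtcars[order(mtcars$wt, decreasing = TRUE), ][1:5, ]

op <- par(mar = c(5, 10, 2, 2))     # widen the left margin for the names
# barplot() returns the midpoint of each bar, useful for placing labels
mids <- barplot(top5$wt, names.arg = rownames(top5), col = "grey",
                horiz = TRUE, las = 1, xlab = "Weight (1000 lbs)")
text(x = top5$wt, y = mids, labels = top5$wt, pos = 2)
par(op)                             # restore the previous margins
```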

Creating histograms and density plots

By default, hist() puts the frequency, that is the number of values falling in each range, on the Y axis.

> hist(rnorm(1000))

The density() function shows the same kind of distribution as a smooth curve by computing a kernel density estimate.

> plot(density(rnorm(1000)))
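To compare the two views on the same sample, the histogram can be drawn on the density scale and the kernel density estimate overlaid. A minimal sketch; the seed is arbitrary and only makes the figure reproducible.

```r
set.seed(1)                      # arbitrary seed
x <- rnorm(1000)

# freq = FALSE puts density, not counts, on the Y axis
h <- hist(x, freq = FALSE, main = "Histogram with density curve")
lines(density(x), col = "red", lwd = 2)
```

On the density scale the bar areas sum to one, which is what makes the histogram and the curve directly comparable.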

Creating box plots

Boxplots can be created for individual variables or for variables by group. The format is boxplot(x, data=), where x is a formula and data= denotes the data frame providing the data.

> boxplot(mpg~cyl,data=mtcars, main="Car Mileage Data", xlab="Number of Cylinders", ylab="Miles Per Gallon")
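The numbers behind each box can be inspected without drawing anything. The sketch below uses plot = FALSE to return the summary statistics for the same formula.

```r
# plot = FALSE returns the summary statistics without drawing
b <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

b$names   # group labels taken from cyl
b$n       # number of cars in each group
b$stats   # rows: lower whisker, lower hinge, median, upper hinge, upper whisker
b$out     # values that would be drawn as individual outlier points
```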

Creating heat maps

Heat maps are colorful images, which are very useful for summarizing a large amount of data by highlighting hotspots or key trends in the data.

> heatmap(as.matrix(mtcars), Rowv=NA, Colv=NA, col = heat.colors(256), scale="column", margins=c(2,8), main = "Car characteristics by Model")
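The scale="column" argument matters here because the mtcars variables are measured on very different scales. As a rough illustration of what column scaling does, the sketch below standardizes each column with scale(), which should be broadly equivalent to the centring and scaling heatmap() applies.

```r
m <- as.matrix(mtcars)

# Centre each column to mean 0 and scale it to standard deviation 1
z <- scale(m)

round(colMeans(z), 10)   # column means after scaling
apply(z, 2, sd)          # column standard deviations after scaling
range(m[, "disp"])       # raw ranges differ widely between variables
range(m[, "wt"])
```

Without scaling, columns with large raw values such as disp would dominate the color range and the other variables would look almost uniform.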

Creating maps

> install.packages("maps")

> library("maps")

> map('world', fill = TRUE,col=heat.colors(10))

Creating three-dimensional surface plots

> install.packages("rgl")

> library("rgl")

> volcano

> z<-2*volcano

> x<-10*(1:nrow(z))

> y<-10*(1:ncol(z))

> contour(x, y=y, z=z, xlab="Metres West",ylab="Metres North", main="Topography of Maunga Whau Volcano")

> zlim<-range(z)

> zlen<-zlim[2]-zlim[1]+1

> colorlut<-terrain.colors(zlen)

> col<-colorlut[z-zlim[1]+1]

> open3d()

> surface3d(x,y,z,color=col,back="lines")
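If installing rgl is not an option, base R can draw a static perspective view of the same surface with persp(). This is a sketch of an alternative technique, not a replacement for the interactive rgl window; the viewing angles and colors are illustrative choices.

```r
z <- 2 * volcano
x <- 10 * (1:nrow(z))
y <- 10 * (1:ncol(z))

# theta and phi set the viewing angles; shade adds simple lighting
vt <- persp(x, y, z, theta = 135, phi = 30, col = "green3",
            scale = FALSE, ltheta = -120, shade = 0.75,
            border = NA, box = FALSE)
```

persp() returns the viewing transformation matrix, which can be passed to trans3d() to add points or lines to the figure.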

2.2 Adding linear model lines to a scatter plot

A very simple example for linear regression

> mtcars

> plot(mtcars$mpg~mtcars$disp)

> lmfit<-lm(mtcars$mpg~mtcars$disp)

> abline(lmfit)
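The fitted object holds more than the line itself. The sketch below repeats the fit with the data= argument, which gives shorter coefficient names, and then reads off the intercept, slope and R-squared.

```r
# data= keeps the formula short and the coefficient names readable
lmfit <- lm(mpg ~ disp, data = mtcars)

plot(mpg ~ disp, data = mtcars)
abline(lmfit, col = "blue")

coef(lmfit)                  # intercept and slope of the fitted line
summary(lmfit)$r.squared     # proportion of variance explained
```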

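A non-linear fit from nls() can be overlaid in the same way. Draw the new data first, otherwise lines() adds the curve to the previous mtcars plot

> x<- -(1:100)/10

> y<- 100+10*exp(x/2)+rnorm(x)/10

> plot(x,y)

> nlmod <- nls(y ~ Const + A * exp(B*x), trace=TRUE)

> lines(x,predict(nlmod),col="red")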
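When no starting values are given, nls() falls back to a default guess for every parameter and issues a warning. A more explicit sketch passes start=; the values used here are simply the ones the data were simulated from, so with real data they would have to be guessed from a plot instead.

```r
set.seed(42)                     # arbitrary seed for reproducibility
x <- -(1:100) / 10
y <- 100 + 10 * exp(x / 2) + rnorm(x) / 10

# start= gives nls() an initial guess for each parameter
# (illustrative values, taken from the simulation above)
nlmod <- nls(y ~ Const + A * exp(B * x),
             start = list(Const = 100, A = 10, B = 0.5))

plot(x, y)
lines(x, predict(nlmod), col = "red")
coef(nlmod)                      # estimates of Const, A and B
```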

2.3 Adding non-parametric model curves with lowess

Lowess is defined by a complex algorithm, the Ratfor original of which (by W. S. Cleveland) can be found in the R sources as file 'src/appl/lowess.doc'. Normally a local linear polynomial fit is used, but under some circumstances (see the file) a local constant fit can be used. 'Local' is defined by the distance to the floor(f*n)th nearest neighbour, and tricubic weighting is used for x which fall within the neighbourhood.

> plot(cars)

> lines(lowess(cars),col="blue")

> lines(lowess(cars, f=0.3), col = "orange")
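The effect of f is easier to see by looking at what lowess() returns and at the neighbourhood size each setting implies. A small sketch using the floor(f*n) rule quoted above:

```r
lo_default <- lowess(cars)            # f = 2/3 by default
lo_tight   <- lowess(cars, f = 0.3)   # smaller span follows the data more closely

str(lo_default)                       # list with sorted x and smoothed y

n <- nrow(cars)
floor(c(2/3, 0.3) * n)                # neighbourhood sizes implied by each f
```

A smaller f uses fewer neighbours for each local fit, which is why the orange curve is wigglier than the blue one.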

2.4 Making scatter plots with smoothed density representation

> n<-10000

> x<-matrix(rnorm(n),ncol=2)

> y<-matrix(rnorm(n,mean=3,sd=1.5),ncol=2)

> smoothScatter(x,y)

> x<-matrix(rnorm(n),ncol=2)

> y<-x+rnorm(n,sd=5)

> smoothScatter(x,y)

2.5 Displaying data density on axes

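The rug() function draws a tick mark on the axis for every observation, so the density of the data can be read directly off the axis

> x<-rnorm(1000)

> plot(density(x))

> rug(x)

Download the sequencing reads compare table and read it in

> reads<-read.table("B1-30vsB1-47per200k.txt")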

> plot(reads[,2],reads[,3])

> rug(reads[,2])

> rug(reads[,3],side=2,col="red",ticksize=0.02)
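The reads table is not distributed with R, so the same two-axis rug can be tried on built-in data. A minimal sketch using cars in place of the reads file:

```r
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")

# jitter() spreads tied values slightly so overlapping ticks stay visible
rug(jitter(cars$speed))                                  # bottom axis (side = 1)
rug(jitter(cars$dist), side = 2, col = "red", ticksize = 0.02)
```

side=1 is the default and draws along the X axis; side=2 draws along the Y axis.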

Summary

R functions covered in this part
plot
barplot
hist
density
boxplot
heatmap

tongyinbio@hku.hk bbru@hku.hk 13th-Feb 2017