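Advanced Graphs, Part 3
5. ggplot2 Graphs
The ggplot2 package, created by Hadley Wickham, provides a powerful graphics language for building elegant and complex plots. It has become very popular in the R community in recent years. Originally based on Leland Wilkinson's The Grammar of Graphics, ggplot2 lets you create graphs of univariate and multivariate numerical data in a straightforward way. Categorical data can be represented by color, symbol, and size. Building the basic plot framework is relatively simple.
Mastering the ggplot2 language can be challenging (see the Going Further section below for useful resources). A helper function called qplot() (quick plot) hides much of this complexity when you are creating standard graphs.
5.1 qplot()
The qplot() function can be used to create the most common graph types. Although it does not expose the full power of ggplot2, it can produce a very wide range of useful plots. Its format is: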
qplot(x, y, data=, color=, shape=, size=, alpha=, geom=, method=, formula=, facets=, xlim=, ylim=, xlab=, ylab=, main=, sub=)
where the options are:
Option | Description
alpha | Alpha transparency for overlapping elements, expressed as a fraction between 0 (complete transparency) and 1 (complete opacity)
color, shape, size, fill | Associates the levels of a variable with symbol color, shape, or size. For line plots, color associates levels of a variable with line color. For density and box plots, fill associates fill colors with a variable. Legends are drawn automatically.
data | Specifies a data frame
facets | Creates a trellis graph by specifying conditioning variables. Its value is expressed as rowvar ~ colvar. To create trellis graphs based on a single conditioning variable, use rowvar ~ . or . ~ colvar.
geom | Specifies the geometric objects that define the graph type. The geom option is expressed as a character vector with one or more entries. geom values include "point", "smooth", "boxplot", "line", "histogram", "density", "bar", and "jitter".
main, sub | Character vectors specifying the title and subtitle
method, formula | If geom="smooth", a loess fit line and confidence limits are added by default. When the number of observations is greater than 1,000, a more efficient smoothing algorithm is employed. Methods include "lm" for regression, "gam" for generalized additive models, and "rlm" for robust regression. The formula parameter gives the form of the fit. For example, to add simple linear regression lines, you'd specify geom="smooth", method="lm", formula=y~x. Changing the formula to y~poly(x,2) would produce a quadratic fit. Note that the formula uses the letters x and y, not the names of the variables. For method="gam", be sure to load the mgcv package. For method="rlm", load the MASS package.
x, y | Specifies the variables placed on the horizontal and vertical axes. For univariate plots (for example, histograms), omit y.
xlab, ylab | Character vectors specifying horizontal and vertical axis labels
xlim, ylim | Two-element numeric vectors giving the minimum and maximum values for the horizontal and vertical axes, respectively
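Notes:
• At present, ggplot2 does not support 3D graphs or mosaic plots.
• Use I(value) to specify a fixed value. For example, size=z makes the size of the plotted points or lines depend on the values of a variable z. In contrast, size=I(3) sets every point or line to the constant size 3, regardless of the data.
Here are some examples using the automotive data contained in the mtcars data frame (mileage, weight, number of gears, number of cylinders, and so on).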
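The difference between mapping a variable and fixing a value can be seen in a small sketch. It assumes ggplot2 is installed and uses the built-in mtcars data; the object names p_mapped and p_fixed are illustrative only. In recent ggplot2 releases qplot() is deprecated and may emit a warning, but it still runs.

```r
library(ggplot2)

# size = hp maps point size to the hp variable, so a size legend is drawn
p_mapped <- qplot(wt, mpg, data = mtcars, size = hp)

# size = I(3) fixes every point at the constant size 3, independent of the data
p_fixed <- qplot(wt, mpg, data = mtcars, size = I(3))

print(p_mapped)
print(p_fixed)
```

The same I() wrapper works for the other aesthetics, for example color=I("blue") or alpha=I(.5).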
# ggplot2 examples
library(ggplot2)
# create factors with value labels
mtcars$gear <- factor(mtcars$gear, levels=c(3,4,5),
   labels=c("3gears","4gears","5gears"))
mtcars$am <- factor(mtcars$am, levels=c(0,1),
   labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl, levels=c(4,6,8),
   labels=c("4cyl","6cyl","8cyl"))
# Kernel density plots for mpg
# grouped by number of gears (indicated by color)
qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
   main="Distribution of Gas Mileage", xlab="Miles Per Gallon",
   ylab="Density")
# Scatterplot of mpg vs. hp for each combination of gears and cylinders
# in each facet, transmission type is represented by shape and color
qplot(hp, mpg, data=mtcars, shape=am, color=am,
   facets=gear~cyl, size=I(3),
   xlab="Horsepower", ylab="Miles per Gallon")
# Separate regressions of mpg on weight for each number of cylinders
qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"),
   method="lm", formula=y~x, color=cyl,
   main="Regression of MPG on Weight",
   xlab="Weight", ylab="Miles per Gallon")
# Boxplots of mpg by number of gears
# observations (points) are overlaid and jittered
qplot(gear, mpg, data=mtcars, geom=c("boxplot", "jitter"),
   fill=gear, main="Mileage by Gear Number",
   xlab="", ylab="Miles per Gallon")
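The per-cylinder regression plot above can also be written with ggplot() and explicit layers, which passes the smoothing method and formula directly to geom_smooth(). This is only a sketch, assuming a current ggplot2 installation; the names cars and p_reg are illustrative and do not come from the original examples.

```r
library(ggplot2)

# Work on a copy so the built-in mtcars data is left untouched
cars <- mtcars
cars$cyl <- factor(cars$cyl, levels = c(4, 6, 8),
                   labels = c("4cyl", "6cyl", "8cyl"))

# Points plus one least-squares line (with confidence band) per cylinder group
p_reg <- ggplot(cars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  labs(title = "Regression of MPG on Weight",
       x = "Weight", y = "Miles per Gallon")

print(p_reg)
```

Replacing the formula with y ~ poly(x, 2) would fit a quadratic curve per group instead of a straight line.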
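5.2 Customizing ggplot2 Graphs
Unlike base R graphs, ggplot2 graphs are not affected by most of the settings made with the par() function. They can be modified with the theme() function and by adding graphical parameters within the qplot() function. For greater control, use ggplot() and the other functions provided by the package. Note that ggplot2 functions can be chained with the "+" sign to build up the final plot.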
library(ggplot2)
p <- qplot(hp, mpg, data=mtcars, shape=am, color=am,
   facets=gear~cyl, main="Scatterplots of MPG vs. Horsepower",
   xlab="Horsepower", ylab="Miles per Gallon")
# White background and black grid lines
p + theme_bw()
# Large brown bold italics labels
# and legend placed at top of plot
p + theme(axis.title=element_text(face="bold.italic",
   size=12, color="brown"), legend.position="top")
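For finer control, the same faceted scatterplot can be assembled layer by layer with ggplot() instead of qplot(). The following is a minimal sketch under the assumption that ggplot2 is installed; cars, p2, and p2_styled are illustrative names, and the factor labels simply repeat the ones used earlier.

```r
library(ggplot2)

# Copy mtcars and rebuild the labelled factors used in the earlier examples
cars <- mtcars
cars$gear <- factor(cars$gear, levels = c(3, 4, 5),
                    labels = c("3gears", "4gears", "5gears"))
cars$am   <- factor(cars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))
cars$cyl  <- factor(cars$cyl, levels = c(4, 6, 8),
                    labels = c("4cyl", "6cyl", "8cyl"))

# Data and aesthetics first, then a point layer, facets, and labels
p2 <- ggplot(cars, aes(x = hp, y = mpg, shape = am, color = am)) +
  geom_point(size = 3) +
  facet_grid(gear ~ cyl) +
  labs(title = "Scatterplots of MPG vs. Horsepower",
       x = "Horsepower", y = "Miles per Gallon")

# Each "+" returns a new plot object, so theme changes can be stacked
p2_styled <- p2 +
  theme_bw() +
  theme(axis.title = element_text(face = "bold.italic", size = 12,
                                  color = "brown"),
        legend.position = "top")

print(p2_styled)
```

The order matters here: theme_bw() is a complete theme, so it goes first and the individual theme() adjustments are added after it.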
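5.3 Going Further
We have only scratched the surface here. To learn more, see the ggplot reference site and Winston Chang's excellent Cookbook for R site. Although slightly out of date, ggplot2: Elegant Graphics for Data Analysis remains the definitive book on the subject.
6. Probability Plots
This section describes creating probability plots in R for both teaching and data analysis.
6.1 Probability Plots for Teaching and Demonstration
When I was a college professor, I often had to draw normal distributions by hand. They always came out looking like cute little bunny rabbits. What can I say?
R makes it easy to draw probability distributions and demonstrate statistical concepts. The common probability distributions available in R are listed below.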
Distribution | R name | Distribution | R name
Beta | beta | Lognormal | lnorm
Binomial | binom | Negative Binomial | nbinom
Cauchy | cauchy | Normal | norm
Chi-square | chisq | Poisson | pois
Exponential | exp | Student t | t
F | f | Uniform | unif
Gamma | gamma | Tukey | tukey
Geometric | geom | Weibull | weibull
Hypergeometric | hyper | Wilcoxon | wilcox
Logistic | logis | |
For a more comprehensive list, see Statistical Distributions on the R wiki. Each distribution has corresponding functions of the following form:
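In base R, these functions are named by adding a one-letter prefix to the distribution's R name: d for the density, p for the cumulative probability, q for the quantile, and r for random generation (a few entries, such as tukey, provide only some of these). The sketch below uses the normal distribution; the numeric arguments and the seed are arbitrary values chosen for illustration.

```r
# Density, cumulative probability, and quantile
# for the standard normal distribution (R name: norm)
dnorm(0)        # height of the density curve at x = 0
pnorm(1.96)     # P(X <= 1.96)
qnorm(0.975)    # value below which 97.5% of the distribution lies

# Random draws from a normal distribution with mean 50 and sd 10
set.seed(1234)  # arbitrary seed, only so the draws are reproducible
draws <- rnorm(5, mean = 50, sd = 10)

# Plot the standard normal density curve
x <- seq(-4, 4, length.out = 200)
plot(x, dnorm(x), type = "l",
     xlab = "x", ylab = "Density",
     main = "Standard Normal Distribution")
```

Swapping norm for another name from the table, for example dexp() or pbinom(), follows the same pattern, with each distribution taking its own parameters.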