Finding Peak Values For a Density Distribution

0.1 Libraries

library(data.table)
library(ggplot2)

1 Find Distribution Peak

If we want to get the x and y values for a distribution we can use the density function.

This uses Old Faithful data which has the wait time between each eruption and the duration of the eruptions in minutes.

head(faithful)
##   eruptions waiting
## 1     3.600      79
## 2     1.800      54
## 3     3.333      74
## 4     2.283      62
## 5     4.533      85
## 6     2.883      55

So plotting the waiting times we can see that it’s bimodal.

ggplot(faithful, aes(waiting)) + geom_density()

Using the density function we can see different basic statistics for the wait times.

density(faithful$waiting)
## 
## Call:
##  density.default(x = faithful$waiting)
## 
## Data: faithful$waiting (272 obs.);   Bandwidth 'bw' = 3.988
## 
##        x                y            
##  Min.   : 31.04   Min.   :5.880e-06  
##  1st Qu.: 50.27   1st Qu.:1.949e-03  
##  Median : 69.50   Median :1.224e-02  
##  Mean   : 69.50   Mean   :1.299e-02  
##  3rd Qu.: 88.73   3rd Qu.:1.909e-02  
##  Max.   :107.96   Max.   :3.660e-02

The density function will also give us x,y values for the distribution.

head(density(faithful$waiting)$y)
## [1] 8.951193e-06 1.017743e-05 1.152920e-05 1.301184e-05 1.474573e-05
## [6] 1.666005e-05
head(density(faithful$waiting)$x)
## [1] 31.03732 31.18786 31.33840 31.48894 31.63948 31.79002

This tells us which value in the distribution is highest on the y axis

which.max(density(faithful$waiting)$y)
## [1] 326

And using that y max value we can find where that lies on the x-axis

density(faithful$waiting)$x[326]
## [1] 79.96245

And add a line to the density plot.

ggplot(faithful, aes(waiting)) + geom_density() + geom_vline(xintercept = density(faithful$waiting)$x[326])

2 Find The Second Peak

So, quickly, I’m finding the values of x that are less than 65, then finding the peak y value in that range of x values, then plotting the whole thing. You’ll figure it out.

So first this will list all values of the Y axis where the X axis is less than 65

density(faithful$waiting)$y[density(faithful$waiting)$x < 65]
##   [1] 8.951193e-06 1.017743e-05 1.152920e-05 1.301184e-05 1.474573e-05
##   [6] 1.666005e-05 1.875665e-05 2.112854e-05 2.380540e-05 2.673200e-05
##  [11] 2.994350e-05 3.364019e-05 3.767360e-05 4.205664e-05 4.701996e-05
##  [16] 5.250930e-05 5.846285e-05 6.501305e-05 7.239169e-05 8.037679e-05
##  [21] 8.898839e-05 9.873037e-05 1.093073e-04 1.206897e-04 1.332219e-04
##  [26] 1.470600e-04 1.619169e-04 1.778747e-04 1.957596e-04 2.149124e-04
##  [31] 2.353680e-04 2.578628e-04 2.822526e-04 3.082358e-04 3.361586e-04
##  [36] 3.668419e-04 3.994423e-04 4.340033e-04 4.718905e-04 5.122978e-04
##  [41] 5.550213e-04 6.008712e-04 6.503538e-04 7.025264e-04 7.574535e-04
##  [46] 8.173289e-04 8.802754e-04 9.463413e-04 1.017012e-03 1.092055e-03
##  [51] 1.170589e-03 1.253146e-03 1.341556e-03 1.433803e-03 1.529928e-03
##  [56] 1.632233e-03 1.739318e-03 1.850574e-03 1.967151e-03 2.090021e-03
##  [61] 2.217292e-03 2.348987e-03 2.488201e-03 2.632120e-03 2.780597e-03
##  [66] 2.935406e-03 3.096313e-03 3.261813e-03 3.432322e-03 3.610219e-03
##  [71] 3.792640e-03 3.979568e-03 4.173207e-03 4.372083e-03 4.575265e-03
##  [76] 4.783616e-03 4.998110e-03 5.216602e-03 5.439044e-03 5.667677e-03
##  [81] 5.900180e-03 6.136204e-03 6.376776e-03 6.621671e-03 6.869570e-03
##  [86] 7.120538e-03 7.375928e-03 7.633731e-03 7.893867e-03 8.157116e-03
##  [91] 8.422644e-03 8.689852e-03 8.958871e-03 9.229795e-03 9.501710e-03
##  [96] 9.774528e-03 1.004835e-02 1.032253e-02 1.059691e-02 1.087131e-02
## [101] 1.114530e-02 1.141877e-02 1.169164e-02 1.196297e-02 1.223309e-02
## [106] 1.250193e-02 1.276862e-02 1.303301e-02 1.329544e-02 1.355549e-02
## [111] 1.381182e-02 1.406554e-02 1.431657e-02 1.456307e-02 1.480587e-02
## [116] 1.504537e-02 1.528051e-02 1.551028e-02 1.573616e-02 1.595807e-02
## [121] 1.617293e-02 1.638321e-02 1.658898e-02 1.678828e-02 1.698110e-02
## [126] 1.716891e-02 1.735113e-02 1.752479e-02 1.769297e-02 1.785562e-02
## [131] 1.800969e-02 1.815677e-02 1.829794e-02 1.843179e-02 1.855653e-02
## [136] 1.867504e-02 1.878730e-02 1.888916e-02 1.898414e-02 1.907266e-02
## [141] 1.915244e-02 1.922332e-02 1.928766e-02 1.934509e-02 1.939168e-02
## [146] 1.943175e-02 1.946531e-02 1.948934e-02 1.950544e-02 1.951520e-02
## [151] 1.951750e-02 1.951034e-02 1.949710e-02 1.947781e-02 1.944910e-02
## [156] 1.941403e-02 1.937330e-02 1.932535e-02 1.926992e-02 1.920932e-02
## [161] 1.914354e-02 1.906951e-02 1.899084e-02 1.890761e-02 1.881811e-02
## [166] 1.872335e-02 1.862461e-02 1.852147e-02 1.841268e-02 1.830053e-02
## [171] 1.818507e-02 1.806484e-02 1.794135e-02 1.781517e-02 1.768575e-02
## [176] 1.755299e-02 1.741809e-02 1.728114e-02 1.714110e-02 1.699942e-02
## [181] 1.685620e-02 1.671106e-02 1.656437e-02 1.641664e-02 1.626785e-02
## [186] 1.611783e-02 1.596721e-02 1.581606e-02 1.566431e-02 1.551232e-02
## [191] 1.536020e-02 1.520804e-02 1.505609e-02 1.490441e-02 1.475306e-02
## [196] 1.460250e-02 1.445263e-02 1.430349e-02 1.415545e-02 1.400878e-02
## [201] 1.386322e-02 1.371891e-02 1.357685e-02 1.343634e-02 1.329742e-02
## [206] 1.316109e-02 1.302717e-02 1.289532e-02 1.276603e-02 1.264039e-02
## [211] 1.251732e-02 1.239691e-02 1.228107e-02 1.216868e-02 1.205950e-02
## [216] 1.195473e-02 1.185505e-02 1.175920e-02 1.166733e-02 1.158255e-02
## [221] 1.150225e-02 1.142652e-02 1.135766e-02 1.129526e-02 1.123812e-02
## [226] 1.118714e-02

Then I can find the max y value that is that range of x.

MaxY<- max(density(faithful$waiting)$y[density(faithful$waiting)$x < 65])

This will tell me where in the vector of y values this Max Y value appears.

which(density(faithful$waiting)$y == MaxY)
## [1] 151

Then we can use that location to find the complimentary x value and plot it using ggplot

density(faithful$waiting)$x[151]
## [1] 53.61815


ggplot(faithful, aes(waiting)) + geom_density() + geom_vline(xintercept = density(faithful$waiting)$x[151])

3 Find The Trough In The Distribution Between The Peaks

Why? Why not.


So first I’ll create the x and y values for this distribution.

DensityFaithfulY <- density(faithful$waiting)$y
DensityFaithfulX <- density(faithful$waiting)$x

Now we can isolate the range of values where X is less than 75 and greater than 55.

DensityFaithfulX < 75 & DensityFaithfulX > 55
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [144] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [155] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
## [166]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [177]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [188]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [199]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [210]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [221]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [232]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [243]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [254]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [265]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [276]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [287]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
## [298] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [309] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [320] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [331] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [342] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [353] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [364] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [375] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [386] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [397] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [408] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [419] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [430] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [441] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [452] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [463] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [474] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [485] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [496] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [507] FALSE FALSE FALSE FALSE FALSE FALSE

Get the minimum Y density value.

MinYDensity<- min(DensityFaithfulY[DensityFaithfulX < 75 & DensityFaithfulX > 55])
MinYDensity
## [1] 0.01104003

Find the location in the Y Axis Vector that has this minimum density.

which(DensityFaithfulY == MinYDensity)
## [1] 232

Then identify where on the x axis the minimum y value falls and plot.

DensityFaithfulX[232]
## [1] 65.8118

ggplot(faithful, aes(waiting)) + geom_density() + geom_vline(xintercept = density(faithful$waiting)$x[232])

4 Density Peak Function

We can also use this function to find the x and y axis for the peak of the density.

densMode <- function(x){
    td <- density(x)
    maxDens <- which.max(td$y)
    list(x=td$x[maxDens], y=td$y[maxDens])
}

densMode(faithful$waiting)
## $x
## [1] 79.96245
## 
## $y
## [1] 0.03660056


There’s also this which returns x values.

peakx <- DensityFaithfulX[which(diff(sign(diff(DensityFaithfulY)))==-2)]
peakx
## [1] 53.46761 79.81191

minx <- DensityFaithfulX[which(diff(sign(diff(DensityFaithfulY)))==2)]
minx
## [1] 65.66126

It’s close but not quite right. It gives vector numbers of 150 and 325 for the peaks but the actual numbers are 151 and 326.

which(diff(sign(diff(DensityFaithfulY)))==-2)
## [1] 150 325

which.max(density(faithful$waiting)$y)
## [1] 326
which(density(faithful$waiting)$y == MaxY)
## [1] 151