Exercise dashboard

I posted a while back about using joy plots for heart rate data. Over the past couple of months, I grew tired of opening RStudio every time I wanted to look through my fitness tracker data. I decided to create a shiny dashboard that I can load via web browser. This involved setting up a shiny server in a dockerized container that runs an Rmd file. I’ve had issues getting plots to size themselves appropriately using base shiny, so I used the flexdashboard package to created a dashboard that will automatically resize.

I wanted to be able to look at individual workouts as well as weekly statistics. To remain consistent with the heart rate app I use, MotiFIT, I copied the charting style for individual workouts. The screenshot at right shows how the app displays heart rate. I find this view useful during weightlifting sessions because I can tell when I’ve rested sufficiently to start another set.

The dashboard has a date selector so you can use small multiples to compare several workouts at once. The last workout in the multiples corresponds to the screenshot from the app, and the similarity is apparent.

The weekly view shows the aggregate amount of time spent in each heart rate zone. You can see a big surge in the number of hours of exercise in September 2017, which is when I started playing in a tennis league. There are noticeable gaps in December and January, when I got sick and then suffered a couple of injuries. I’ve slowly added more hours back as I’ve rehabbed the injuries.

There are a couple of other views, but I’m still trying to decide how to display some of the concepts effectively. The code for the flexdashboard is below.

---
title: "Exercise Dashboard"
runtime: shiny
output:
flexdashboard::flex_dashboard:
orientation: rows
vertical_layout: fill
---

```{r setup, include=FALSE}

library(lubridate)
library(readr)
library(tidyverse)
library(plotly)
library(flexdashboard)
library(ggridges)
library(data.table)
library(scales)

ifelse(
dir.exists("/shiny-server/data/dir"),
datadir <- "shiny-server/data/dir",
datadir <- "/local/data/dir"
)
```

```{r load HR data}

out.file <- data.frame(timestamp = as.POSIXct(character()), bpm = integer(), stringsAsFactors = FALSE)
file.names <- dir(datadir, pattern = "^.*HeartRateData.*.csv") # You'll need to set datadir

library(doParallel)
registerDoParallel()

dir.traversal <- system.time({
# out.file <- foreach (file.name = tail(file.names, 10), .combine = "rbind") %dopar% {
out.file <- foreach (file.name = file.names, .combine = "rbind") %dopar% {
# for (file.name in file.names) {
# exported files are in at least 2 different encodings, so we're going to guess using the guess_encoding function from the readr package
# We can't use readr's read_csv or data.table's fread because some of the encodings are UTF-16LE, which makes this process s
encoding <- guess_encoding(paste(datadir, "/", file.name, sep=""), n_max = 1000)
file <- read.csv(
paste(datadir, "/", file.name, sep=""),
skip = 1,
sep = ",",
strip.white = TRUE,
stringsAsFactors = FALSE,
fileEncoding = toString(encoding[1,1])
)

# exported data has 2 or three columns, so if it's three, we're going to join the date and time fields
if (length(file) > 2) {
temp.date <- file[,1]
temp.time <- trimws(file[,2])
temp.bpm <- trimws(file[,3])

# date format is Wed Feb 8 2017
temp.datetime <- as.POSIXct(paste(temp.date, temp.time, sep=" "), format = "%a %b %d %Y %H:%M")

temp.df <- data.frame(temp.datetime, temp.bpm, stringsAsFactors = FALSE)
names(temp.df) <- c("timestamp", "bpm")
file <- temp.df
rm(temp.df)
} else {
names(file) <- c("timestamp", "bpm")
file$timestamp <- as.POSIXct(file$timestamp)
}

file$name <- file.name

#out.file <- rbind(out.file, file)
file
}
})

```

```{r add week number }
# Used to calculate week number below
start.date <- as.Date(as.character(min(out.file$timestamp)))

week.starts <- seq(from = start.date, to = as.Date(Sys.Date()), by = 7)
week.starts <- data.frame(
week = 0:(NROW(week.starts) - 1),
`week of` = as.Date(week.starts)
)

HR <- out.file %>%
mutate(
week = as.numeric(as.Date(timestamp, tz = Sys.timezone()) - start.date) %/% 7,
bpm = as.numeric(bpm)
) %>%
left_join(week.starts, by = c("week" = "week")) %>%
filter(! is.na(week)) %>%
arrange(week)

HR$week.of <- factor(HR$week.of)
HR$`week.of` <- factor(HR$`week.of`, ordered = T, rev(unique(HR$`week.of`)))
# HR$week <- factor(HR$week, ordered = TRUE, levels = rev(unique(HR$week)))

```

```{r set HR zones}

cuts <- c(-Inf, 109, 123, 138, 164, Inf)
labs <- c("Warm Up", "Fitness", "Endurance", "Hardcore", "Red Line")
HR$zone <- cut(HR$bpm, breaks = cuts, labels = labs, include.lowest=TRUE, ordered_result = TRUE)
fitness.rainbow <- c("royalblue", "royalblue", "green", "yellow", "orange", "red")
rects <- data.frame(ystart = cuts[1:5], yend = cuts[2:6], zone = factor(labs, levels = rev(labs), ordered = TRUE))

bpm.min <- reactive({
min(HR.filtered()$bpm, na.rm = T)
})
bpm.max <- reactive({
max(HR.filtered()$bpm, na.rm = T)
})

zone.breaks <- reactive({
c(
bpm.min(),
(bpm.min() + cuts[2]) / 2,
(cuts[2] + cuts[3]) / 2,
(cuts[3] + cuts[4]) / 2,
(cuts[4] + cuts[5]) / 2,
bpm.max()
)
})

```

```{r make HR reactive }

HR.filtered <- reactive({
HR %>%
filter(as.Date(timestamp, tz = Sys.timezone()) >= input$date[1]) %>%
filter(as.Date(timestamp, tz = Sys.timezone()) <= input$date[2])
})

```

```{r calc durations }

HR.duration <- reactive({
HR.filtered() %>%
mutate(date = as.Date(format(timestamp, "%Y-%m-%d"))) %>%
group_by(name, date) %>% # name is filename so two files from same date don't skew calculations
summarize(start = min(timestamp, na.rm = T),
end = max(timestamp, na.rm = T),
`duration (m)` = round(as.numeric(difftime(end, start, units = "mins")), 1),
`average bpm` = round(mean(as.numeric(bpm)), 0),
sd = round(sd(as.numeric(bpm)), 1),
max.bpm = max(as.numeric(bpm), na.rm = T)) %>%
ungroup() %>%
mutate(start = format(start, "%H:%M:%S"),
end = format(end, "%H:%M:%S"))
})

HR.daily.duration <- reactive({
HR %>%
filter(as.Date(timestamp, tz = Sys.timezone()) >= input$dailyDate[1]) %>%
filter(as.Date(timestamp, tz = Sys.timezone()) <= input$dailyDate[2]) %>%
mutate(date = as.Date(format(timestamp, "%Y-%m-%d"))) %>%
group_by(name, date) %>%
summarize(start = min(timestamp, na.rm = T),
end = max(timestamp, na.rm = T),
`duration (m)` = round(as.numeric(difftime(end, start, units = "mins")), 1),
`average bpm` = round(mean(as.numeric(bpm)), 0),
sd = round(sd(as.numeric(bpm)), 1),
max.bpm = max(as.numeric(bpm), na.rm = T)) %>%
ungroup() %>%
mutate(start = format(start, "%H:%M:%S"),
end = format(end, "%H:%M:%S"))
})

HR.zone.duration.weekly <- reactive({
HR.filtered() %>%
group_by(week, zone) %>%
summarize(
`duration (s)` = n(),
`duration (m)` = round(n() / 60, 1),
`duration (h)` = round(n() / 3600, 1)
) %>%
ungroup() %>%
left_join(week.starts, by = "week") %>%
filter(! is.na(week)) %>%
arrange(week)
})

HR.weekly.duration <- reactive({
HR.zone.duration.weekly() %>%
group_by(week, `week.of`) %>%
summarize(
`duration (h)` = sum(`duration (h)`),
`exercise sessions` = n()
) %>%
ungroup() %>%
arrange(week)
})

HR.zone.duration.daily <- reactive({
HR %>%
mutate(date = as.Date(timestamp, tz = Sys.timezone())) %>%
filter(date >= input$dailyDate[1]) %>%
filter(date <= input$dailyDate[2]) %>%
group_by(date, zone) %>%
summarize(
`duration (s)` = n(),
`duration (m)` = round(n() / 60, 1),
`duration (h)` = round(n() / 3600, 1)
) %>%
ungroup()
})

```

Weekly
=======================================================================

Inputs {.sidebar}
-----------------------------------------------------------------------

Enter a date range for the weekly charts:

```{r input date range for weekly chart }

dateRangeInput("date", "Date Range", start = as.Date(min(HR$timestamp, na.rm = T), tz = Sys.timezone()))

```

Column
-----------------------------------------------------------------------

### Weekly Exercise

```{r weekly barplot}

renderPlotly({
ggplot(data = HR.zone.duration.weekly(), aes(x = `week.of`, y = `duration (h)`, fill = zone)) +
geom_bar(
stat = "identity",
position = "stack"
) +
scale_fill_manual(values = fitness.rainbow[2:6]) +
scale_y_continuous(breaks = seq(from = 0, to = 8, by = 2), minor_breaks = seq(from = 0, to = 8, by = 1))
})

```

Column
-----------------------------------------------------------------------

### Heartrate Zones

```{r ridgeline plot}

renderPlot({
ggplot(
HR.filtered(),
aes(x = bpm, y = `week.of`, fill = ..x..)
) +
scale_fill_gradientn(
colors = fitness.rainbow,
breaks = zone.breaks()
) +
scale_x_continuous(breaks = function(x) {seq(from = 0, to = max(x), by = 10)}) +
geom_density_ridges_gradient(na.rm = TRUE, col = "grey70", scale = 4) +
theme_ridges(font_size = 7) +
theme(
legend.position = "none"
)
})

```

### Weekly Durations

```{r total duration bar plot}

renderPlotly({
ggplot(data = HR.weekly.duration(), aes(x = `week.of`, y = `duration (h)`)) +
geom_bar(
aes(
text = paste(
"week: ", week, "<br>",
"# of sessions: ", `exercise sessions`,
sep = ""
)
),
stat = "identity"
) +
geom_smooth(span = 0.35)
})

# renderTable({
# head(HR.weekly.duration())
# })

```

Daily {data-orientation=rows}
=======================================================================

Inputs {.sidebar}
-----------------------------------------------------------------------

Enter a date range for the daily charts:

```{r input date range for daily calcs }

dateRangeInput("dailyDate", "Date Range", start = as.Date(Sys.Date() %m+% days(-15)))

```

Column
-----------------------------------------------------------------------

### Daily Heartrate Zones

```{r individual exercise zones}

renderPlotly({
ggplot(data = HR.zone.duration.daily(), aes(x = date, y = `duration (m)`, fill = zone)) +
geom_bar(
stat = "identity",
position = "dodge"
) +
scale_fill_manual(values = fitness.rainbow[2:6]) +
scale_y_continuous(breaks = seq(from = 0, to = max(HR.zone.duration.daily()$`duration (m)`, na.rm = TRUE), by = 15)) +
theme(
legend.position = "none"
)
})

```

### Daily Workouts, BPM vs. Duration

```{r bubble plot}

renderPlotly({
ggplot(HR.daily.duration(), aes(x = `duration (m)`, y = `average bpm`)) +
geom_point(aes(color = `average bpm` + sd, size = -sd), alpha = 0.5) +
scale_color_gradientn(
colors = fitness.rainbow[2:6], # "royalblue" "green" "yellow" "orange" "red"
values = rescale(zone.breaks(), to = c(0, 1)), # 54 81.5 116 130.5 151 190
na.value = "black",
breaks = cuts, # -Inf 109 123 138 164 Inf
limits = c(min(zone.breaks()), max(zone.breaks()))
) +
scale_x_continuous(
limits = c(0, as.numeric(max(HR.daily.duration()$`duration (m)`))),
breaks = seq(from = 0, to = as.numeric(max(HR.daily.duration()$`duration (m)`)), by = 15)
) +
scale_y_continuous(
limits = c(min(zone.breaks()), max(zone.breaks())),
breaks = cuts[2:5],
minor_breaks = NULL
) +
#geom_smooth(span = 0.35) +
theme(
legend.position = "none"
)
})

# renderTable({
# head(HR.daily.duration())
# })

# renderText({
# rescale(
# HR.daily.duration()$max.bpm,
# from = c(min(zone.breaks()), max(zone.breaks())),
# to = c(0, 1)
# )
# })

```

Column
-----------------------------------------------------------------------

### Heartrate Curves

```{r individual exercise plots}

renderPlot({

temp <- HR.filtered() %>%
# temp <- HR %>%
mutate(date = as.Date(timestamp, tz = Sys.timezone())) %>%
filter(as.Date(timestamp, tz = Sys.timezone()) >= input$dailyDate[1]) %>%
filter(as.Date(timestamp, tz = Sys.timezone()) <= input$dailyDate[2])

ggplot() +
geom_rect(data = rects, aes(ymin = ystart, ymax = yend, fill = zone), xmin=-Inf, xmax=Inf, inherit.aes = FALSE) +
geom_ribbon(data = temp, aes(x = timestamp, ymax = bpm, ymin = min(temp$bpm)), color = "white", fill="grey90", alpha = .5) +
scale_fill_manual(values = rev(fitness.rainbow[2:6])) +
theme_minimal() +
facet_wrap(~ name, scales = "free_x") +
scale_x_datetime(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0), breaks = cuts[2:5]) +
theme(panel.grid = element_blank(), panel.border = element_blank())
})

```

The “joy” of plotting heartrate data

There’s been quite a few posts lately in the R world about the joyplot. I emailed my dataviz instructor from last fall this wonderful post that shows how Congress has become more polarized on one side of the aisle. He pointed out this isn’t as new as everyone thinks it is, as Tufte gives an example of it in one of his books from the early 1900s (this image is from slide 71 of this presentation).

In any event, joyplots are great for comparing values over time. I’ve been wearing a heartrate monitor while at the gym since February, and the MotiFit app lets you export that data to a CSV file. I’ve been saving these files each day after working out, but I haven’t done much with it yet. After gathering 6 months of data, it seemed high time to play around with joyplots!

The first thing I needed to do was combine all the files, which wasn’t as straightforward as I hoped. The MotiFit app exported data in two different formats and two different encodings, so I had to account for this. I wound up guessing the encodings from the readr package, and then combining the date and time columns where necessary.

out.file <- data.frame()
file.names <- dir(datadir, pattern = "HeartRateData.+?.csv") # You'll need to set datadir

library(readr)
for (i in 1:length(file.names)){
  # exported files are in at least 2 different encodings, so we're going to guess using the guess_encoding function from the readr package 
  encoding <- guess_encoding(paste(datadir, "/", file.names[i], sep=""), n_max = 1000)
  file <- read.csv(
    paste(datadir, "/", file.names[i], sep=""), 
    skip = 1,
    stringsAsFactors = FALSE, 
    fileEncoding = toString(encoding[1,1])
  )
  
  # exported data has 2 or three columns, so if it's three, we're going to join the date and time fields
  if (length(file) > 2) {
    temp.date <- file[,1]
    temp.time <- trimws(file[,2])
    temp.bpm <- trimws(file[,3])
    
    # date format is Wed Feb 8 2017
    temp.datetime <- as.POSIXct(paste(temp.date, temp.time, sep=" "), format = "%a %b %d %Y %H:%M")
    
    temp.df <- data.frame(temp.datetime, temp.bpm)
    names(temp.df) <- c("timestamp", "bpm")
    file <- temp.df
    rm(temp.df)
  } else {
    names(file) <- c("timestamp", "bpm")
    file$timestamp <- as.POSIXct(file$timestamp)
  }
  
  out.file <- rbind(out.file, file)
}

Next, I added a week number column. I’ve been lifting since March on the Wendler 5/3/1 program, which has 4-week cycles of varying intensity (including a deload week). So, the intensity of exercise depends primarily on which week it is.

library(dplyr)

# Used to calculate week number below
start.date <- min(as.Date(out.file$timestamp), na.rm=TRUE)

HR <- out.file %>%
  mutate(
    week = factor(as.numeric(as.Date(timestamp) - start.date) %/% 7, ordered = F),
    bpm = as.numeric(bpm)
  ) %>%
  filter(! is.na(week)) %>%
  arrange(week)

# Reorder factor so oldest week is at top
HR$week <- factor(HR$week, ordered = TRUE, levels = rev(unique(HR$week)))

small multiples

Now that we have the data we need, we can apply the ggjoy! I’m using a custom color scale based on the figures the MotiFit app gives me. To make these look right, however, you need to set the breaks at the midpoint between each value (something I learned in the BMI post).

Here’s the result:

library(ggjoy)

bpm.min <- min(HR$bpm, na.rm = T)
bpm.max <- max(HR$bpm, na.rm = T)

breaks <- c(
  bpm.min,
  (bpm.min + 109) / 2,
  (109 + 123) / 2, 
  (123 + 138) / 2, 
  (138 + 164) / 2,
  bpm.max
)

ggplot(
  HR,
  aes(x = bpm, y = week, height = ..density.., fill = ..x..)
) +
  scale_fill_gradientn(
    colors = c("royalblue", "royalblue", "green", "yellow", "orange", "red"),
    breaks = breaks
  ) +
  geom_joy_gradient(na.rm = TRUE, col = "grey70", scale = 4) +
  theme_joy(font_size = 10) +
  theme(
    legend.position = "none"
  )

The thing that jumps out at me is that you can really identify which weeks I was good about doing cardio work at a target heart rate around 140, and which weeks I wasn’t.

Interactive BMI Chart

bmi-chart

I was recently listening to the #WhoIsFat Joe Rogan podcast where comedians Bert Kreischer and Tom Segura had their weight loss challenge weigh-ins. The challenge was for both guys to get out of the “obese” category and into the merely “overweight” category. If one made it and the other didn’t, the loser would pay for a trip to Paris for the winner. If both made it, fellow comedian Ari Shaffir would pay. At the weigh-in, Ari questioned Tom’s height and whether he made it to the overweight BMI. I googled BMI charts to see whether Ari was right. However, the interactivity of the ones I found left something to be desired.

Coincidentally, the same my wife texted me the same day about an impressive BMI (around 50) in a case she was handling. She practices Social Security disability law, and weight, BMI, diabetes, etc. often arise in a disability determination. Between the bad interactive examples I found on Google and my wife’s comments about her case, I decided to make an interactive BMI chart she can use at work.

Since we’re in the US, we’re using imperial units. We’ll go from 100 to 300 pounds in weight and 5’ to 6’6" in height. We’ll use factors instead of continuous variables so we can label height in feet and inches, rather than just inches.

weights <- seq(from = 100, to = 300, by = 5)
heights <- seq(from = 78, to = 60)

df <- data.frame(height = factor(paste(floor(heights / 12), "'", heights %% 12, "\"", sep=""), labels = rev(paste(floor(heights / 12), "'", heights %% 12, "\"", sep="")), ordered = TRUE))
df$height <- sort(df$height, decreasing = TRUE)

Next, we’ll perform the BMI calculations. That requires conversion from inches to meters and pounds to kilograms. We’ll create a grid of the BMIs that is a text representation of the chart.

for(x in weights){
  bmi.column <- c() 
  for (y in heights){
    # inches to meters
    meters <- y * 0.0254
    # pounds to kgs
    kgs <- x * 0.453592
    
    bmi <- round(kgs / (meters * meters), 1)
    bmi.column <- c(bmi.column, bmi)
  }
    df <- cbind(df, bmi.column)
}
names(df) <- c("height", weights)

In order to plot the chart, we need to translate the data into key->value pairs that ggplot can use. We’ll calculate the max and min BMIs so we can set our color scales as well.

library(reshape2)
df <- melt(df, id.vars = c("height"))
names(df) <- c("height", "weight", "bmi")
min.bmi <- min(df$bmi)
max.bmi <- max(df$bmi)

I wanted to create a nice gradient between each BMI level. Here are the BMI levels I used:

  • < 18: underweight
  • 18-25: normal BMI
  • 25-30: overweight
  • 30-35: obese
  • 35+: morbidly obese
colors <- c(
  "darkgoldenrod", 
  "goldenrod", 
  "green", 
  "yellow", 
  "red", 
  "purple",
  "purple4"
)
values <- c(
  min.bmi, 
  (min.bmi + 18) / 2,
  (18 + 25) / 2, 
  (25 + 30) / 2, 
  (30 + 35) / 2,
  (35 + 40) / 2,
  max.bmi
)
breaks <- c(
  min.bmi,
  (min.bmi + 18) / 2,
  (18 + 25) / 2, 
  (25 + 30) / 2, 
  (30 + 35) / 2,
  40,
  max.bmi
)
labels <- c( 
  "",
  "Underweight", 
  "Ideal", 
  "Overweight", 
  "Obese", 
  "Morbidly Obese",
  ""
)

All the prep work is done now. Let’s plot! Note that the text is commented out. I found that the mouseover brushing didn’t work as well with the labels printed, so I took them out.

library(ggplot2)
library(scales)
gg <- ggplot(df, aes(x = weight, y = height)) +
  geom_raster(aes(fill = bmi), interpolate = TRUE) +
  scale_fill_gradientn("BMI",
                       colors = colors, 
                       guide = "colorbar", 
                       values = rescale(values, to = c(0,1), 
                                        from = range(df$bmi)
                                        ),
                       labels = labels,
                       breaks = breaks
                      ) +
  #geom_text(aes(label = bmi), size = 3) + 
  xlab("weight") +
  scale_x_discrete(breaks = seq(from = 100, to = 300, by = 25))

  library(plotly)
  (ggplotly(gg))
bmi-chart

I’m not thrilled by how much purple is in the legend, but I couldn’t figure out how to shrink the top end of the legend.

What do you think of this plot? What would you do differently?

Thanks for reading!