Improving app performance with profvis

Posit Solutions Engineering (Lisa Anders)

Posit, PBC

Why you should be load testing

You want to know what to prioritize to improve your application
Often what’s holding your application back isn’t intuitive

Illustration from Hadley Wickham’s talk “The Joy of Functional Programming (for Data Science)” by Allison Horst

Profiling

Profile an app to understand where it spends the bulk of its time. The result is often surprising and may point to a specific function or command as the issue, rather than to a need to overhaul the app or change how it runs on the server.

profvis - Profvis is a tool for helping you to understand how R spends its time.

library(profvis)

# general code example
profvis({
  data(diamonds, package = "ggplot2")
  
  plot(price ~ carat, data = diamonds)
  m <- lm(price ~ carat, data = diamonds)
  abline(m, col = "red")
})

# shiny app example
profvis({ shiny::runApp() })

Profiling

For more information refer to the support article

On top is the code, and on the bottom is a flame graph. In the flame graph, the horizontal direction represents time in milliseconds, and the vertical direction represents the call stack. Looking at the bottom-most items on the stack, most of the time, about 2 seconds, is spent in plot, and then a much smaller amount of time is spent in lm, and almost no time at all is spent in abline – it doesn’t even show up on the flame graph.

Each block in the flame graph represents a call to a function, or possibly multiple calls to the same function. The width of the block is proportional to the amount of time spent in that function. When a function calls another function, another block is added on top of it in the flame graph.

The profiling data has some limitations: some internal R functions don’t show up in the flame graph, and it offers no insight into code that’s implemented in languages other than R (e.g. C, C++, or Fortran).
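As a rough illustration of the data behind the flame graph, the sketch below uses base R's sampling profiler, Rprof(), which profvis builds on. The functions slow_step() and fast_step() are hypothetical stand-ins for app code, and the durations are illustrative rather than measured results.

```r
# Hypothetical stand-ins for two steps in an app; names are illustrative only.
slow_step <- function(seconds = 0.5) {
  # Busy-wait so the sampling profiler has CPU time to record.
  start <- proc.time()[["elapsed"]]
  while (proc.time()[["elapsed"]] - start < seconds) {
    sqrt(runif(1000))
  }
  invisible(NULL)
}

fast_step <- function() {
  sum(1:10)
}

profile_file <- tempfile(fileext = ".out")

Rprof(profile_file, interval = 0.01)  # start sampling the call stack
slow_step()
fast_step()
Rprof(NULL)                           # stop sampling

# by.total lists time spent in each function, including the functions it calls.
profile_summary <- summaryRprof(profile_file)
head(profile_summary$by.total)
```

Because the profiler samples the call stack at intervals, a very fast call such as fast_step() may not appear at all, which matches the abline behaviour described above.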

Load testing

Combining load testing with profiling gives a very granular view of where performance issues occur. Often, low-usage apps appear to perform well, only to struggle as more users access the content, because multiple users share the same R or Python process.
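A back-of-the-envelope sketch of why sharing a process matters, assuming a single R process that handles requests one at a time and users who all trigger the same computation at once. The numbers are illustrative, not measurements, and real apps are rarely this uniform.

```r
# Assumption: one single-threaded process handles requests one at a time,
# and every user triggers the same computation at the same moment.
seconds_per_request <- 2          # illustrative value, not a measurement
simultaneous_users <- c(1, 5, 20)

# The last user in the queue waits for everyone ahead of them.
worst_case_wait <- simultaneous_users * seconds_per_request

data.frame(
  users = simultaneous_users,
  worst_case_wait_seconds = worst_case_wait
)
```

Under these assumptions a step that feels fine for one user becomes a long wait for the last of twenty, which is the kind of effect a load test is meant to surface.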

shinyloadtest - Load testing helps developers and administrators estimate how many users their application can support.

Load testing Overview

The steps:

Part 1: Record a typical user session for the app.

shinyloadtest::record_session('https://shinyapp.example.com/')

Part 2: Replay the session in parallel, simulating many simultaneous users accessing the app.

shinycannon recording.log https://shinyapp.example.com/ --workers 5 --loaded-duration-minutes 2 --output-dir run1

Part 3: Analyze the results of the load test and determine if the app performed well enough.

df <- shinyloadtest::load_runs("run1")
shinyloadtest::shinyloadtest_report(df, "run1.html")

Let’s look, in more detail, at running this from Workbench for apps deployed to Connect.

Part 1: User Recording

The Connect API key is stored as the R environment variable CONNECT_API_KEY. It can be added or edited using the usethis package:

library(usethis)
usethis::edit_r_environ()
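After saving the file and restarting R, a quick check along these lines can confirm that the key is visible to the session. This is a minimal sketch: the helper name has_connect_api_key() is hypothetical, and the .Renviron entry is assumed to look like CONNECT_API_KEY=<your key>.

```r
# Hypothetical helper: TRUE when the variable is set to a non-empty value.
has_connect_api_key <- function(name = "CONNECT_API_KEY") {
  nzchar(Sys.getenv(name, unset = ""))
}

if (!has_connect_api_key()) {
  message("CONNECT_API_KEY is not set; add it to .Renviron and restart R.")
}
```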

Create “Recording” of a typical user’s interaction

library(shinyloadtest)

shinyloadtest::record_session(
  target_app_url='https://colorado.posit.co/rsc/content/bec1d4bc-2ab7-4ba3-9bd6-b9e336bf3ff9/', 
  connect_api_key=Sys.getenv("CONNECT_API_KEY"))

Use the URL for the app from the “open solo” mode on Connect.

Solo mode is important. For example, this URL, when opened with developer options, doesn’t work: https://colorado.posit.co/rsc/connect/#/apps/bec1d4bc-2ab7-4ba3-9bd6-b9e336bf3ff9/access

Alternatively, we can programmatically create the recording using shinytest2.

Part 2: Load testing, install Shinycannon

In your terminal, set the environment variable for the Connect API key, replacing the placeholder with your key (use set on Windows, export on macOS or Linux).

export SHINYCANNON_CONNECT_API_KEY=<add your key here>

Verify that it was set (the Windows command prompt uses %SHINYCANNON_CONNECT_API_KEY%; macOS and Linux use $SHINYCANNON_CONNECT_API_KEY).

echo $SHINYCANNON_CONNECT_API_KEY

Installing shinycannon is optional on Linux: the jar file can be called directly (useful in organizations where system installation is restricted for security reasons).

Test that shinycannon works by calling the help documentation with:

cd test
java -jar shinycannon-1.1.3-dd43f6b.jar -h

Part 2: Load testing, continued

We will run the load test once for each number of simulated simultaneous users, saving the results to a different folder each time:

java -jar shinycannon-1.1.3-dd43f6b.jar recording.log https://colorado.posit.co/rsc/content/d2c40c48-ae0b-48d8-888a-e8626322565d/ --workers 1 --loaded-duration-minutes 2 --output-dir run1 --overwrite-output

java -jar shinycannon-1.1.3-dd43f6b.jar recording.log https://colorado.posit.co/rsc/content/d2c40c48-ae0b-48d8-888a-e8626322565d/ --workers 5 --loaded-duration-minutes 2 --output-dir run2 --overwrite-output

Alternatively, we could run the command from R with the system() function. Set the API key with Sys.setenv() first so that the shinycannon process started by system() inherits it (an export issued in its own system() call would not persist into later calls):

connect_api_key <- Sys.getenv("CONNECT_API_KEY")

# Works on Windows, macOS, and Linux; child processes inherit the variable
Sys.setenv(SHINYCANNON_CONNECT_API_KEY = connect_api_key)

target_url <- "https://colorado.posit.co/rsc/content/d2c40c48-ae0b-48d8-888a-e8626322565d/"
workers <- 1
dir <- "run1"
system(
  sprintf(
    "java -jar shinycannon-1.1.3-dd43f6b.jar recording.log %s --workers %s --loaded-duration-minutes 2 --output-dir %s --overwrite-output",
    target_url, workers, dir
  )
)
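To cover several worker counts in one go, the same command can be generated in a loop. The sketch below only builds and prints the commands; the jar name and URL are copied from the example above, and the worker counts and run folder names are illustrative.

```r
# Values copied from the example above; the worker counts are illustrative.
jar_file <- "shinycannon-1.1.3-dd43f6b.jar"
target_url <- "https://colorado.posit.co/rsc/content/d2c40c48-ae0b-48d8-888a-e8626322565d/"
worker_counts <- c(1, 5)

# Build one command per worker count, writing to run1, run2, ...
commands <- sprintf(
  "java -jar %s recording.log %s --workers %d --loaded-duration-minutes 2 --output-dir run%d --overwrite-output",
  jar_file, target_url, as.integer(worker_counts), seq_along(worker_counts)
)

# Print the commands for review; replace print() with system() to run them.
for (command in commands) {
  print(command)
}
```

Printing first makes it easy to check the URL and output folders before starting runs that each take a couple of minutes.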

Part 3: Analyze the results

Reference the documentation to understand the different charts: https://rstudio.github.io/shinyloadtest/articles/analyzing-load-test-logs.html?q=output#report-output

library(shinyloadtest)

df <- load_runs(
  `1 user` = "run1",
  `5 users` = "run2"
)

shinyloadtest_report(df, "report.html")

Notes:

RMarkdown and various dependencies will need to be installed.
After running this, depending on your organization’s security policies, it may help to open the report as a “preview” rather than in a web browser (the error message will be something like “CORS restricted”).
It may also help to set self_contained = TRUE or self_contained = FALSE, depending on any error messages encountered.

Load testing output

For more information refer to shinyloadtest

Load testing output: Impact on session duration

For more information refer to shinyloadtest

What about Python?

Python profiling tool: https://jiffyclub.github.io/snakeviz/
Python load testing tool: https://locust.io/

Where to go from here?

Optional/Backup

Data best practices

Apply data best practices and see if that improves performance:

Pull data on a schedule
Reduce the data being loaded
Select a faster data storage system, for example by pinning arrow files
Use caching
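As one illustration of the caching item, here is a minimal in-memory cache in base R. The load_data() function is a hypothetical stand-in for a slow query or file read; in practice, packages such as memoise or Shiny's bindCache() provide more complete caching.

```r
# Assumption: load_data() stands in for a slow query or file read.
load_data <- function(dataset_name) {
  Sys.sleep(0.2)  # simulate a slow data source
  data.frame(dataset = dataset_name, value = seq_len(3))
}

# A simple in-memory cache keyed by dataset name.
data_cache <- new.env(parent = emptyenv())

load_data_cached <- function(dataset_name) {
  if (!exists(dataset_name, envir = data_cache, inherits = FALSE)) {
    assign(dataset_name, load_data(dataset_name), envir = data_cache)
  }
  get(dataset_name, envir = data_cache, inherits = FALSE)
}

# The first call pays the loading cost; the second is served from the cache.
first_call <- system.time(load_data_cached("sales"))[["elapsed"]]
second_call <- system.time(load_data_cached("sales"))[["elapsed"]]
```

This sketch never invalidates the cache, so it only suits data that does not change while the process is running.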

Async

As a last resort we can consider async.

In general, async is only useful when specific steps take a long time to run, since running them asynchronously frees up the process to serve other users. It is usually saved for last because it tends to be the most challenging option to implement.

When using async, it is particularly important to encourage developers to include additional debug messages, for example with log4r in R. These let developers trace errors back to the session and connection.
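A minimal sketch of a log line that carries a session identifier, written in base R so that it runs without extra packages. The helper format_log_line() and the identifier "abc123" are hypothetical; in a Shiny server function the identifier could come from session$token, and the resulting string could be passed to a logger such as log4r instead of message().

```r
# Hypothetical helper: prefix each message with a timestamp and session id.
format_log_line <- function(session_id, text) {
  sprintf(
    "[%s] session=%s %s",
    format(Sys.time(), "%Y-%m-%d %H:%M:%S"), session_id, text
  )
}

log_line <- format_log_line("abc123", "started long-running query")
message(log_line)
```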

When to know it’s a real server issue

Ask the questions:

Are all applications impacted?
Has performance gotten worse over time?

Use the tools:

Admin dashboard
Logs
Scheduled content timetable
Any additional monitoring through Prometheus/Graphite/etc.
Reclaim space by reducing the number of bundles stored on the server