A Model for Predicting Response Times

One of the more challenging types of performance problem is the dreaded scenario in which the problem occurs only intermittently and is difficult to trap with your performance measurement tools. In such a situation, trace data can be an irreplaceable ally, allowing you to understand even phenomena that you’ve not yet actually measured.

Recently, a client came to us with a problem:

“We have this batch job. It processes pretty much the same amount of data every time we run it. It usually runs in a little over an hour, but sometimes—out of the blue—it’ll run nearly two and a half hours. We have no idea when it’s going to happen. There must be a pattern to it; we just can’t figure out what it is. It was slow last Tuesday, but it’s not slow every Tuesday. It’s slow sometimes between three and four o’clock, but not always, and sometimes it’s slow at other times. We thought maybe it was interference with our daily batch jobs, but we’ve proven that that’s not it, either. We just can’t correlate it to anything…”

The easiest way for us to kill a problem like this is to collect two trace files: (1) a trace of a fast execution (our baseline), and (2) a trace of a slow execution. Then we could see in vivid color what’s different when the program runs slowly.

The client ran a baseline trace of a fast execution right away, but they still haven’t traced the program in the act of running slowly. Here’s the top-level profile for the fast run (created by our Method R Workbench software):

The thing that really caught my eye in this profile was the mean duration per “db file sequential read” call. I expect disk read calls to consume 2–5 ms, but this program did over a million read calls with an average read latency of just 0.849 ms.

That’s awesome, but what’s going to happen someday when the storage array is too busy to serve up sub-millisecond read calls? Well, the profile makes it easy to find out. With a simple spreadsheet, I can calculate what the total job duration would be if any of the call counts or durations were to change.

The “baseline execution” table contains data from the profile. The “forecast execution” table contains the same rows and columns, but the duration for each subroutine uses the formula duration = calls × duration per call, so that the duration will recalculate whenever a call count or latency value in a green cell changes. Now I can what-if to my heart’s content.
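
For readers who prefer to see the arithmetic without a spreadsheet, here is a minimal sketch of the same what-if model in Python. The 0.849 ms read latency comes from the profile quoted above; the exact call count and the “everything else” row are hypothetical placeholders, chosen only so that the totals are consistent with the figures quoted in this post.

```python
# A minimal what-if model of a response time profile: each row is
# (calls, mean seconds per call), and the job's total duration is
# nothing more than the sum of calls x seconds per call.
#
# The 0.000849 s read latency comes from the profile; the call count
# and the "everything else" row are hypothetical placeholders.
baseline = {
    "db file sequential read": (1_023_815, 0.000849),   # ~869 s
    "everything else combined": (1, 3_488.8),           # ~3,489 s
}

def total_duration(profile):
    # Total duration = sum over subroutines of calls x duration per call.
    return sum(calls * per_call for calls, per_call in profile.values())

print(f"baseline total: {total_duration(baseline):,.0f} s")   # ~4,358 s

# Forecast: hold everything else equal, but let the storage array's
# read latency degrade to a fairly ordinary 5 ms.
forecast = dict(baseline)
calls, _ = forecast["db file sequential read"]
forecast["db file sequential read"] = (calls, 0.005)
print(f"forecast total: {total_duration(forecast):,.0f} s")   # ~8,608 s
```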

The 0.005090 value shown in the figure is the answer to the question, “If all else were held equal, what ‘db file sequential read’ latency would it take to drive the job’s total duration to 8,700 s (which is approximately double its 4,358 s value)?” The Tools › Goal Seek… feature makes it easy to figure out. The latency that would do that is 0.005090 s—about 5 ms.
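
Goal Seek here is just solving that same linear model for a single unknown. Assuming the same hypothetical call count as in the sketch above (a little over a million reads, consistent with the profile), the required latency falls out algebraically:

```python
# Goal seek by hand: holding every other row constant, what read latency
# would drive the total job duration to the 8,700 s target?
calls = 1_023_815                  # hypothetical: "over a million read calls"
baseline_total = 4_358.0           # s, total duration of the fast baseline run
baseline_read_latency = 0.000849   # s per call, from the profile
target_total = 8_700.0             # s, roughly double the baseline

# Everything except the read calls is held constant, so the extra time
# must come entirely from extra latency on each read call.
everything_else = baseline_total - calls * baseline_read_latency
required_latency = (target_total - everything_else) / calls
print(f"{required_latency:.6f} s per read call")   # ~0.005090 s, about 5 ms
```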

This is interesting, because 5 ms is a typical read call latency on a lot of systems I see. …At a latency that is typical for lots of people, the job’s duration would double—like it has been doing intermittently.

Of course, the model doesn’t explain why the job did run slowly (we’ll still need the second trace file for that), but it certainly shows why it will run slowly if disk I/O latencies degrade. This program’s sensitivity to perturbations in disk latency is our cue to eliminate as many OS read calls as we can.

Once you can comprehend any program’s duration as nothing more complicated than the sum of a list of count × latency values, you’ll have a new lens through which previously intractable performance problems can surrender to logical explanation. Being able to “see” how a program spends your time can help you understand how that program will behave even under conditions that you have not yet measured. This is a vital capability to have.

This post is an excerpt from a new chapter in the forthcoming Third Edition of Cary Millsap’s The Method R Guide to Mastering Oracle Trace Data.

Author’s edit 2019-04-12: The Method R Guide to Mastering Oracle Trace Data is now released and available for purchase.

Posted by Cary Millsap on 2017-11-14
Tags: excel, method r workbench, model, profile
