Do you have SQL in your system that doesn’t use placeholders, like this?
[3r3dhkb0z824v] select stuff from t where id=18432... [5wamvs45j6nh4] select stuff from t where id=4286... [ih5x9lgg492nk] select stuff from t where id=329971...
Statements like these are generated dynamically by procedural programs written in Java, PHP, C#, etc. (or even PL/SQL if you work really hard). Programs that generate SQL statements like these create a lot of performance problems:
- They spend too much CPU time on database parse calls;
- They serialize on library cache and shared pool latches, which diminishes your application’s scalability;
- They abuse the library cache, causing your application to consume way more memory than it ought to;
- And they create unnecessary network congestion.
It’s easy to detect this kind of a problem in your trace files with the mrskew utility in the Method R Workbench package.
Counting Shareable SQL Statements
The secret to counting shareable SQL statements is the mrskew $ssqlid
variable (note the double-‘s’). In the example shown above, all of these similar but distinct SQL statements would have different $sqlid
values (the ones shown), but they would all have the same $ssqlid
value.
With just a little bit of magic, primarily with the --init
, --group
, and --select
command line options, you can write a mrskew program that can count shareable statements:
# Count SQLID values that share the same SSQLID. # --init=' my %h = (); sub f($) { my ($ss, $s) = @_; if ($h{$ss}{$s}) { # If we've seen this SSQLID/SQLID combo before, then return a count of zero new things seen. return 0; } else { # Otherwise, record its existence and return that we've seen a new one. return $h{$ss}{$s} = 1; } } ' --name=parse --group='$ssqlid' --glabel=SSQLID --select='f($ssqlid,$sqlid)' --slabel=DISTINCT-TEXTS --precision=0
Saving the Report in an RC File
Yuk, right? There probably aren’t a whole lot of people in this world who are good today at writing these things, but fortunately, you have us at Method R for that.
Of course, for any sequence of arguments as complicated as this, you’d never want to have to figure this out twice (or even type it in twice), so this is a perfect candidate for saving in an RC file. Just echo the value of your environment variable called MRTOOLS_RCPATH
to find out where mrskew is going to search for your RC files. Put a cut-and-paste copy of this program shown above in a file in one of your rcpath directories. I called my file ssql
. Now, when I need it, I simply use the option --rc=ssql
, and presto, I get a report that looks like this:
mrskew p1_ora_1492.trc --rc=ssql SSQLID DISTINCT-TEXTS % CALLS MEAN MIN MAX ---------- -------------- ------ ----- ---- --- --- 4151812497 70 20.0% 70 1 1 1 3642320257 70 20.0% 70 1 1 1 2047770123 70 20.0% 70 1 1 1 1928547239 70 20.0% 70 1 1 1 1138917066 69 19.7% 69 1 1 1 3957414185 1 0.3% 349 0 0 1 ---------- -------------- ------ ----- ---- --- --- TOTAL (6) 350 100.0% 698 1 0 1
This table represents a program execution that Oracle’s tkprof would have reported as using 350 distinct SQL statements, and it would have reported statistics on each one of those statements individually. Each one consumes about the same total amount of time, so you’d be left looking at this list of 350 not-great-but-not-horrible-looking statements with this nothing-I-can-do feeling. But seeing the information the mrskew way tells you a lot more. This table tells you that there were really only six SQL statements that a developer actually wrote. The code then generated roughly 70 copies of each (of the first five of the six shown here) during the execution of this report.
Now I know there are no more than six places in the code that I need to go fix. And with a report like the following, I can know exactly which one to attack first:
$ mrskew p1_ora_1492.trc --name=parse --group='"[$ssqlid] ".substr($sql,0,60)' --where='$sqlid !~ /^#/' '"[$ssqlid] ".substr($sql,0,60)' DURATION % CALLS MEAN MIN MAX ------------------------------------------------------------------------- ---------- ------ ----- -------- -------- -------- [1138917066] select "T0"."C0" "C0", "T0"."C1" "C1", "T0"."C2" "C2", "T0". 192.792694 26.2% 69 2.794097 2.367640 4.498316 [4151812497] select "T0"."C0" "C0", "T0"."C1" "C1", "T0"."C2" "C2", "T0". 145.986808 19.9% 70 2.085526 1.556764 3.306498 [1928547239] with "INVOICE_DETAIL_SQL" as (SELECT DISTINCT RIH.REF_INVOIC 137.010170 18.6% 70 1.957288 1.569761 2.691591 [3642320257] with "INVOICE_DETAIL_SQL" as (SELECT DISTINCT RIH.REF_INVOIC 129.994236 17.7% 70 1.857061 1.364792 2.628600 [2047770123] with "INVOICE_DETAIL_SQL" as (SELECT DISTINCT RIH.REF_INVOIC 125.465926 17.1% 70 1.792370 1.333797 2.915557 [3957414185] SELECT DISTINCT RIH.REF_INVOICE_NUMBER ,RIH.INVOICE_NUMBER , 4.176363 0.6% 349 0.011967 0.000000 4.135371 ------------------------------------------------------------------------- ---------- ------ ----- -------- -------- -------- TOTAL (6) 735.426197 100.0% 698 1.053619 0.000000 4.498316
Ha! I might have gotten lucky. Looks like there are really only three SQL statements in this application worth bothering over. Now I know exactly which one to look for first in my application code.