Changes
Jump to navigation
Jump to search
m
tt --> code
{{Module rating |beta<!-- Values: pre-alpha • alpha • beta • release • protected -- If a rating not needed/relevant, delete this template call -->}}
<!-- Categories go at the bottom of this page and interwikis go in Wikidata. -->
The purpose of this module is to provide a simple method to do timing analysis for [[w:en:performance tuning|performance tuning]] of Lua-functions, that is [[w:en:Profiling (computer programming)|profiling of functions]]. Usually measuring the duration of function calls is part of a larger effort to identify bottlenecks and problematic code. This is not a full-blown profiler, as it is not possible to do line-tracing (call-graph profiling) in the current set up. Its only purpose is to measure execution time (flat profiling), and to do this interactively from the debug console (ie. on a single function).
''As it is said several times in this text; the timing is not accurate and it should only be used as a coarse measure of the approximate run time and load of a function. Do not attempt to fine-tune a module to just fit into the maximum allowed time slot for a module, keep well below the limit!''
The timing analysis is called with an executable function, and optionally a count (size, default 100) of each test set and a number of such test sets (default 10). The total number of calls will be ''count × sets'' and gives the mean runtime for the function. The standard deviation is calculated from the number of ''sets'' and will only be a coarse estimate. If only run with a single set the standard deviation will go away, but even if it isn't measured the execution time will still vary.
The profiler does two passes over the total calls, one with a dummy function and one with the actual function. This makes it possible to get a baseline for how heavy the additional code are, and then we can later subtract this from the measurement of the function we are testing. If the function under test is to simple those to execution times will be close and the uncertainty in the difference can be high. If the execution time is similar to the standard deviation, then the analysis should be rerun with larger or more sets.
During development it became clear that 100 calls in 10 sets is usually enough to get good estimates, but do not put overly much weight on those numbers. If one function is twice, or tree times, or even ten times slower, never mind as long as it runs in constant or linear time. If something is 100 times slower or have indications that it runs in exponential time, then you should probably consider other algorithms.
==Usage==
This module is not for use in ordinary templates, or on pages, it is to be used in the debug console below the edit window on module pages. For use of the module in a sandboxed environment, look further down.
=== Testing a plain function ===
For example, assume we open a page [[Module:HelloWorld]], then we can write a call to profile the function p.hello(). The code in the page looks something like the example below, but local versions may be a little different. (At ''nowiki'' a similar code is at [[w:no:Module:HelloWorld2]].)
The following code is a stripped down version of [[Module:HelloWorld]]. See the page for a full version.
<syntaxhighlight lang="lua">
my_object = {};
my_object.hello = function( frame )
local str = "Hello World!"
return str
end
return my_object
</syntaxhighlight>
Access to this code will be through a generic ''p'' in the debug console. Assuming the previous code the profiler can be called as
<syntaxhighlight lang="lua">
=require 'Module:Timing'(p.hello)
</syntaxhighlight>
A call like that produced for the same type of module a report like the following at enwiki
<pre>
=require 'Module:Timing'(p.hello)
Each call was running for about 1.2e-09 seconds.
Mean runtime for each set was 1.2e-07 seconds,
with standard deviation of 1.9e-06 seconds,
minimum 8.9e-06, maximum 1.5e-05.
Total time spent was about 2.5e-04 seconds.
Relative load is estimated to 1.0.
</pre>
In this the important information is <code>Each call was running for about 8.5120000000022E-9 seconds</code>. This says how long the function actually run.
Note that this function is very simple, and therefore the run time for the function gets close to the run time for the baseline. As a result the run time for each set gets close to the standard deviation.
=== Testing with arguments ===
The test call can take additional arguments. It will pass those on unaltered to the function under test. Most important, they are ''not'' cloned, that is they are passed on as is, and therefore if the function under test alters them then errors will accumulate. This can happen for the types ''function'', ''table'', ''userdata'' and ''thread''. We don't use ''userdata'' and ''thread'' on Wikipedia. That leaves ''nil'', ''boolean'', ''number'' and ''string'' as passed by value and safe against changes.
Strings are really pass by reference, but they are immutable and has copy on write.
A common example on how to pass in an argument is how to do this for the current frame
<syntaxhighlight lang="lua">
=require 'Module:Timing'(p.hello, mw.getCurrentFrame())
</syntaxhighlight>
This produce the following output
<pre>
=require 'Module:Timing'(p.hello, mw.getCurrentFrame())
Each call was running for about 3.0e-09 seconds.
Mean runtime for each set was 3.0e-07 seconds,
with standard deviation of 2.0e-07 seconds,
minimum 4.8e-06, maximum 5.2e-06.
Total time spent was about 1.2e-04 seconds.
Relative load is estimated to 1.1.
</pre>
Using arguments like this does not impose very much additional load, and most of it can be factored out.
=== Testing with a wrapper function ===
If a call need additional setup, or parameters, then those can be provided in a wrapper function. That can be achieved by wrapping up the function in an additional function like the following, and then timing the resulting function.
Do not add wrappers to very simple functions, the timing will be far off, as is clearly seen in these examples!
====Wrapper in the debugger ====
The wrapper function can de defined in the debugger, that is added in the text area for debugging. Add a separate function like the following
<syntaxhighlight lang="lua">
function wrap() return p.hello(mw.getCurrentFrame()) end
=require 'Module:Timing'(wrap)
</syntaxhighlight>
A call sequence like that produced the following output
<pre>
function wrap() return p.hello(mw.getCurrentFrame()) end
=require 'Module:Timing'(wrap)
Each call was running for about 1.9e-07 seconds.
Mean runtime for each set was 1.9e-05 seconds,
with standard deviation of 3.4e-07 seconds,
minimum 6.7e-06, maximum 7.3e-06.
Total time spent was about 3.7e-04 seconds.
Relative load is estimated to 3.7.
</pre>
Note that the duration of the call is much higher than the bare function itself! The inspected function must be a bit heavy, otherwise the wrapper will dominate the result.
An alternate form with an anonymous function is
<syntaxhighlight lang="lua">
=require 'Module:Timing'(function() return p.hello(mw.getCurrentFrame()) end)
</syntaxhighlight>
In general you should precompute as much as possible to avoid unnecessary computations inside the loop, like this
<syntaxhighlight lang="lua">
my_frame=mw.getCurrentFrame()
=require 'Module:Timing'(function() return p.hello(my_frame) end)
</syntaxhighlight>
====Wrapper in the module====
Sometimes it is better to put the wrapper in the module itself
<syntaxhighlight lang="lua">
my_object = {};
my_object.hello = function( frame )
local str = "Hello World!"
return str
end
my_object.wrap = function ()
return my_object.hello(mw.getCurrentFrame())
end
return my_object
</syntaxhighlight>
That produced the following output
<pre>
=require 'Module:Timing'(p.wrap)
Each call was running for about 1.9e-07 seconds.
Mean runtime for each set was 1.9e-05 seconds,
with standard deviation of 7.0e-06 seconds,
minimum 6.3e-06, maximum 7.1e-06.
Total time spent was about 3.7e-04 seconds.
Relative load is estimated to 4.0.
</pre>
Also this wrapper is quite heavy compared to the tested function itself.
In general you should precompute as much as possible to avoid unnecessary computations inside the loop, like this
<syntaxhighlight lang="lua">
my_object = {};
my_object.hello = function( frame )
local str = "Hello World!"
return str
end
my_object.frame = mw.getCurrentFrame()
my_object.wrap = function ()
return my_object.hello(my_object.frame)
end
return my_object
</syntaxhighlight>
===Testing in a sandboxed environment===
If the module under test is sandboxed, that is under [[Special:MyPage/sandbox]], then the debugger is not directly available and we must put the testing inside the module itself. If we have a replacement module, then we add a requirement and a helper function
Sometimes it is better to put the wrapper in the module under inspection itself
<syntaxhighlight lang="lua">
local timing = require 'Module:Timing'
my_object = {};
my_object.hello = function( frame )
local str = "Hi there World!"
return str
end
my_object.TIMING = function (frame)
return timing(my_object.hello)
end
return my_object
</syntaxhighlight>
This can them be used in [[Special:TemplateSandbox]] with a prefix from [[Special:MyPage/sandbox]] and any "render page" we see fit for the purpose. Often we want a specific page though to get access to additional data like the Wikidata item. Fill inn an invoke-call for our helper function
<syntaxhighlight lang="html5">
{{#invoke|HelloWorld|TIMING}}
</syntaxhighlight>
and we will have a timing report for the helper call in the module.
==Gadget==
There is an additional gadget (to be continued…)
Not sure if this is really useful, will be getting back to this.
==Note of advice==
Simple functions gets close to the baseline in run time and because of that will be very noisy. When calculating a mean we actually do a low-pass filtering, removing high frequency noise. Still there will be noise and the only way to avoid it is to make longer runs (larger sets and more sets). Take note of the maximum and minimum run time, if they are to far from each other, or the standard deviation gets large, then it is very likely that the load situation on the server makes it difficult to get good results.
Note that a function consisting of one or two lines can be just fractions slower than the baseline function, but add a few function calls and the load explodes.
If the function is very simple, use the timing info. If the function is rather complex, use the relative load info.
Do not trust the measurement if they seems fishy, do trust your instinct. Especially, do not use higher order functions without thorough investigation even if they look very promising on simple cases!
==See also==
* [[w:en:Average]] ''The code use an arithmetic mean''
* [[w:en:Standard deviation]] ''The code use variance and standard deviation''
* [[w:en:Sum of normally distributed random variables]] ''The code makes an assumption that delays are independent random occurences, which is not quite right''
<includeonly>{{#ifeq:{{SUBPAGENAME}}|sandbox||
<!-- Categories go here and interwikis go in Wikidata. -->
[[Category:Lua metamodules]]
}}</includeonly>
<!-- Categories go at the bottom of this page and interwikis go in Wikidata. -->
The purpose of this module is to provide a simple method to do timing analysis for [[w:en:performance tuning|performance tuning]] of Lua-functions, that is [[w:en:Profiling (computer programming)|profiling of functions]]. Usually measuring the duration of function calls is part of a larger effort to identify bottlenecks and problematic code. This is not a full-blown profiler, as it is not possible to do line-tracing (call-graph profiling) in the current set up. Its only purpose is to measure execution time (flat profiling), and to do this interactively from the debug console (ie. on a single function).
''As it is said several times in this text; the timing is not accurate and it should only be used as a coarse measure of the approximate run time and load of a function. Do not attempt to fine-tune a module to just fit into the maximum allowed time slot for a module, keep well below the limit!''
The timing analysis is called with an executable function, and optionally a count (size, default 100) of each test set and a number of such test sets (default 10). The total number of calls will be ''count × sets'' and gives the mean runtime for the function. The standard deviation is calculated from the number of ''sets'' and will only be a coarse estimate. If only run with a single set the standard deviation will go away, but even if it isn't measured the execution time will still vary.
The profiler does two passes over the total calls, one with a dummy function and one with the actual function. This makes it possible to get a baseline for how heavy the additional code are, and then we can later subtract this from the measurement of the function we are testing. If the function under test is to simple those to execution times will be close and the uncertainty in the difference can be high. If the execution time is similar to the standard deviation, then the analysis should be rerun with larger or more sets.
During development it became clear that 100 calls in 10 sets is usually enough to get good estimates, but do not put overly much weight on those numbers. If one function is twice, or tree times, or even ten times slower, never mind as long as it runs in constant or linear time. If something is 100 times slower or have indications that it runs in exponential time, then you should probably consider other algorithms.
==Usage==
This module is not for use in ordinary templates, or on pages, it is to be used in the debug console below the edit window on module pages. For use of the module in a sandboxed environment, look further down.
=== Testing a plain function ===
For example, assume we open a page [[Module:HelloWorld]], then we can write a call to profile the function p.hello(). The code in the page looks something like the example below, but local versions may be a little different. (At ''nowiki'' a similar code is at [[w:no:Module:HelloWorld2]].)
The following code is a stripped down version of [[Module:HelloWorld]]. See the page for a full version.
<syntaxhighlight lang="lua">
my_object = {};
my_object.hello = function( frame )
local str = "Hello World!"
return str
end
return my_object
</syntaxhighlight>
Access to this code will be through a generic ''p'' in the debug console. Assuming the previous code the profiler can be called as
<syntaxhighlight lang="lua">
=require 'Module:Timing'(p.hello)
</syntaxhighlight>
A call like that produced for the same type of module a report like the following at enwiki
<pre>
=require 'Module:Timing'(p.hello)
Each call was running for about 1.2e-09 seconds.
Mean runtime for each set was 1.2e-07 seconds,
with standard deviation of 1.9e-06 seconds,
minimum 8.9e-06, maximum 1.5e-05.
Total time spent was about 2.5e-04 seconds.
Relative load is estimated to 1.0.
</pre>
In this the important information is <code>Each call was running for about 8.5120000000022E-9 seconds</code>. This says how long the function actually run.
Note that this function is very simple, and therefore the run time for the function gets close to the run time for the baseline. As a result the run time for each set gets close to the standard deviation.
=== Testing with arguments ===
The test call can take additional arguments. It will pass those on unaltered to the function under test. Most important, they are ''not'' cloned, that is they are passed on as is, and therefore if the function under test alters them then errors will accumulate. This can happen for the types ''function'', ''table'', ''userdata'' and ''thread''. We don't use ''userdata'' and ''thread'' on Wikipedia. That leaves ''nil'', ''boolean'', ''number'' and ''string'' as passed by value and safe against changes.
Strings are really pass by reference, but they are immutable and has copy on write.
A common example on how to pass in an argument is how to do this for the current frame
<syntaxhighlight lang="lua">
=require 'Module:Timing'(p.hello, mw.getCurrentFrame())
</syntaxhighlight>
This produce the following output
<pre>
=require 'Module:Timing'(p.hello, mw.getCurrentFrame())
Each call was running for about 3.0e-09 seconds.
Mean runtime for each set was 3.0e-07 seconds,
with standard deviation of 2.0e-07 seconds,
minimum 4.8e-06, maximum 5.2e-06.
Total time spent was about 1.2e-04 seconds.
Relative load is estimated to 1.1.
</pre>
Using arguments like this does not impose very much additional load, and most of it can be factored out.
=== Testing with a wrapper function ===
If a call need additional setup, or parameters, then those can be provided in a wrapper function. That can be achieved by wrapping up the function in an additional function like the following, and then timing the resulting function.
Do not add wrappers to very simple functions, the timing will be far off, as is clearly seen in these examples!
====Wrapper in the debugger ====
The wrapper function can de defined in the debugger, that is added in the text area for debugging. Add a separate function like the following
<syntaxhighlight lang="lua">
function wrap() return p.hello(mw.getCurrentFrame()) end
=require 'Module:Timing'(wrap)
</syntaxhighlight>
A call sequence like that produced the following output
<pre>
function wrap() return p.hello(mw.getCurrentFrame()) end
=require 'Module:Timing'(wrap)
Each call was running for about 1.9e-07 seconds.
Mean runtime for each set was 1.9e-05 seconds,
with standard deviation of 3.4e-07 seconds,
minimum 6.7e-06, maximum 7.3e-06.
Total time spent was about 3.7e-04 seconds.
Relative load is estimated to 3.7.
</pre>
Note that the duration of the call is much higher than the bare function itself! The inspected function must be a bit heavy, otherwise the wrapper will dominate the result.
An alternate form with an anonymous function is
<syntaxhighlight lang="lua">
=require 'Module:Timing'(function() return p.hello(mw.getCurrentFrame()) end)
</syntaxhighlight>
In general you should precompute as much as possible to avoid unnecessary computations inside the loop, like this
<syntaxhighlight lang="lua">
my_frame=mw.getCurrentFrame()
=require 'Module:Timing'(function() return p.hello(my_frame) end)
</syntaxhighlight>
====Wrapper in the module====
Sometimes it is better to put the wrapper in the module itself
<syntaxhighlight lang="lua">
my_object = {};
my_object.hello = function( frame )
local str = "Hello World!"
return str
end
my_object.wrap = function ()
return my_object.hello(mw.getCurrentFrame())
end
return my_object
</syntaxhighlight>
That produced the following output
<pre>
=require 'Module:Timing'(p.wrap)
Each call was running for about 1.9e-07 seconds.
Mean runtime for each set was 1.9e-05 seconds,
with standard deviation of 7.0e-06 seconds,
minimum 6.3e-06, maximum 7.1e-06.
Total time spent was about 3.7e-04 seconds.
Relative load is estimated to 4.0.
</pre>
Also this wrapper is quite heavy compared to the tested function itself.
In general you should precompute as much as possible to avoid unnecessary computations inside the loop, like this
<syntaxhighlight lang="lua">
my_object = {};
my_object.hello = function( frame )
local str = "Hello World!"
return str
end
my_object.frame = mw.getCurrentFrame()
my_object.wrap = function ()
return my_object.hello(my_object.frame)
end
return my_object
</syntaxhighlight>
===Testing in a sandboxed environment===
If the module under test is sandboxed, that is under [[Special:MyPage/sandbox]], then the debugger is not directly available and we must put the testing inside the module itself. If we have a replacement module, then we add a requirement and a helper function
Sometimes it is better to put the wrapper in the module under inspection itself
<syntaxhighlight lang="lua">
local timing = require 'Module:Timing'
my_object = {};
my_object.hello = function( frame )
local str = "Hi there World!"
return str
end
my_object.TIMING = function (frame)
return timing(my_object.hello)
end
return my_object
</syntaxhighlight>
This can them be used in [[Special:TemplateSandbox]] with a prefix from [[Special:MyPage/sandbox]] and any "render page" we see fit for the purpose. Often we want a specific page though to get access to additional data like the Wikidata item. Fill inn an invoke-call for our helper function
<syntaxhighlight lang="html5">
{{#invoke|HelloWorld|TIMING}}
</syntaxhighlight>
and we will have a timing report for the helper call in the module.
==Gadget==
There is an additional gadget (to be continued…)
Not sure if this is really useful, will be getting back to this.
==Note of advice==
Simple functions gets close to the baseline in run time and because of that will be very noisy. When calculating a mean we actually do a low-pass filtering, removing high frequency noise. Still there will be noise and the only way to avoid it is to make longer runs (larger sets and more sets). Take note of the maximum and minimum run time, if they are to far from each other, or the standard deviation gets large, then it is very likely that the load situation on the server makes it difficult to get good results.
Note that a function consisting of one or two lines can be just fractions slower than the baseline function, but add a few function calls and the load explodes.
If the function is very simple, use the timing info. If the function is rather complex, use the relative load info.
Do not trust the measurement if they seems fishy, do trust your instinct. Especially, do not use higher order functions without thorough investigation even if they look very promising on simple cases!
==See also==
* [[w:en:Average]] ''The code use an arithmetic mean''
* [[w:en:Standard deviation]] ''The code use variance and standard deviation''
* [[w:en:Sum of normally distributed random variables]] ''The code makes an assumption that delays are independent random occurences, which is not quite right''
<includeonly>{{#ifeq:{{SUBPAGENAME}}|sandbox||
<!-- Categories go here and interwikis go in Wikidata. -->
[[Category:Lua metamodules]]
}}</includeonly>