The Oyranos library became noticeably slower during the last development cycle for 0.9.6. That is pretty normal: new features were added and more ideas were waiting for implementation, which left little room to polish every detail. Over the last two weeks I took a break from that and mainly searched for bottlenecks in the code base, to bring performance back to a satisfactory level. One good starting point for optimisations in Oyranos is the set of speed tests inside the test suite. But those only help to find a few starting points. What I wished were easy is to see where code paths spend lots of time and, perhaps, which line in a source file takes the most computation time.
I knew the oprofile suite from the old days. So I installed it on my openSUSE machine, but did not have much success getting call graphs to work. A web search for “Linux profiling” brought me to an article on pixel beat and to perf. I found the article very informative and do not want to duplicate it here. The perf tools are impressive. The sample recording needs to run as root. On the other hand, the obtained sample information is quite useful. Most perf tools are text based, so getting to the hot spots is not straightforward for my taste. However, the pixel beat site names a few graphical data representations and has a screenshot of kcachegrind. The last link under misc leads to flame graphs. Flame graphs are amazing representations of what happens inside Oyranos performance-wise. They show in a very intuitive way which code paths take the most time. The graphs are zoomable SVGs.
Here is an example from oyranos-profiles, with and without the expensive hash computation:
Computation time is much reduced. Another bottleneck was expensive DB access. I had talked with Markus about that some time ago but forgot to implement it. The corresponding flame graph reminded me of that open issue. After some optimisation the DB bottleneck is much reduced.
The commands to create the data are:
root$ perf record -g my-command
user$ perf-flame-graph.sh my-command-graph-title
… with perf-flame-graph.sh somewhere in your path:
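#!/bin/sh
# Location of the FlameGraph checkout (adjust to your system)
path=/path/to/FlameGraph
# The first argument names the output SVG; default to "perf"
output="$1"
if [ "$output" = "" ]; then
output="perf"
fi
# Fall back to /tmp when TMPDIR is not set
: "${TMPDIR:=/tmp}"
# Fold the stacks recorded in perf.data, render them as SVG and open the result
perf script | "$path"/stackcollapse-perf.pl > "$TMPDIR/$USER-out.perf-folded"
"$path"/flamegraph.pl "$TMPDIR/$USER-out.perf-folded" > "$TMPDIR/$USER-$output.svg"
firefox "$TMPDIR/$USER-$output.svg"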
One needs perf and FlameGraph, a set of Perl scripts, installed. The above script is just a typing shortcut.
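The folded file written by stackcollapse-perf.pl is plain text, one line per unique stack: the frames joined by semicolons, then a space and the sample count. For a quick text-only look at the hot spots, without opening the SVG, something like the following could work. It is only a rough sketch; hot_leaves is a made-up helper name, not part of perf or FlameGraph, and it only sums samples by the innermost (leaf) function.

```shell
#!/bin/sh
# hot_leaves (hypothetical helper): sum the sample counts per leaf function
# from a folded-stacks file and print them, hottest first.
# Each input line is expected to look like: "cmd;caller;callee 42"
hot_leaves() {
    awk '{
        count = $NF                     # last field is the sample count
        stack = $0
        sub(/ [0-9]+$/, "", stack)      # strip the count, keep the stack
        n = split(stack, frames, ";")   # frames[n] is the leaf function
        total[frames[n]] += count
    }
    END { for (f in total) print total[f], f }' "$1" | sort -k1,1nr -k2,2
}
```

Usage would be along the lines of: hot_leaves "$TMPDIR/$USER-out.perf-folded" | head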
Have you tried valgrind --tool=callgrind + kcachegrind?
If so, was the data better or worse than with oprofile?
Br,
Christoph
Thanks for pointing that out. I tried valgrind/callgrind, and kcachegrind shows lots of information. The displayed data is similar to the flame graph, especially the callmap. However, I can understand the cost distribution of call graphs much more easily from the flame graph SVG.
In kcachegrind as it is, one needs to examine the callgraph view and look around for the cost attribute, the small blue graphs with a percentage attached, or click on the largest area in the callermap.
The callgraph in kcachegrind could be improved by showing the cost not just literally but more prominently in graphical form. The colour of each function rectangle in the callgraph could hint at the hottest zones at a glance. At the moment the colours appear almost random to me.