wiki:instruments

What is Apple Instruments?

Apple Instruments is an application profiler that can be used to find performance issues by using a multitude of separate "instruments." These instruments collect and display relevant data such as: time, i/o use, network use, memory allocations and leaks, and much more. For simplicity, we will only be using the time instrument to profile ISSM. In this case, the time instrument measures how long it takes to run a certain test in ISSM. It will also generate a list of all functions used during the program's execution, as well as individual times for those functions.

Installing Instruments

First, you must be running Mac OSX in order to access and run Instruments.

If you have XCode installed, then Instruments is already installed! The Instruments application is located in the following folder:

/Applications/Xcode.app/Contents/Applications/

If you want to access the application from Finder, go to Applications, right click on Xcode and select "Show Package Contents." Then navigate to Contents/Applications/ to find the Instruments application. To make finding and opening Instruments easier, you can either create an alias of Instruments.app on your desktop or pin it to the dock.

If you do not have XCode installed, you need to install the command line tools. You can install command line tools by running this command on the terminal:

xcode-select --install

Instruments will then be located in the directory above.

Setting up a profiler test

Before using Instruments, we need to execute a test from MATLAB. Under $ISSM_DIR/test/NightlyRun/ create a profile_test.m file and paste the following code inside the newly created file.

%Test Name: PigTranHO
md=triangle(model(),'../Exp/Pig.exp',8000.);
md=setmask(md,'../Exp/PigShelves.exp','../Exp/PigIslands.exp');
md=parameterize(md,'../Par/Pig.par');
md=extrude(md,10,1.);
md=setflowequation(md,'HO','all');
md.cluster=generic('name',oshostname(),'np',1);
md.timestepping.time_step=0.5;
md.verbose=verbose('all');
md=solve(md,TransientSolutionEnum());

Since Instruments is a sampling profiler where the data collected is statistically approximated, it is important to create a test that will run for more than a few seconds. Tests that run longer will allow Instruments to collect more accurate sampling data. The test above should take ~26 seconds to execute in MATLAB.

When the test is run, a folder is generated under $ISSM_DIR/execution/ that should have the following format:

$ISSM_DIR/execution/profile_test-MONTH-DAY-YEAR-TIME
example: profile_test-06-30-2016-16-07-26-99507

In order to provide correct arguments to Instruments, we need to save the arguments generated in profile_test.queue. The profile_test.queue file and the portion that needs to be saved for later use is shown below.

[profile_test.queue]
#!/bin/sh
mpiexec -np 1 /Users/username/ISSM/issm/trunk/bin/issm.exe TransientSolution /Users/username/ISSM/issm/trunk/execution/profile_test-06-30-2016-16-07-26-99507 profile_test

[arguments of profile_test.queue to save]
TransientSolution /Users/username/ISSM/issm/trunk/execution/profile_test-06-30-2016-16-07-26-99507 profile_test

Setting up Instruments

In order to run Instruments on the test above, we need to open Instruments and select the Time Profiler.

http://i.imgur.com/wJO4Lkt.png

After selecting the Time Profiler, you will need to attach the target process to Instruments. To do this, select the button that says "All Processes" towards the top of the Instruments window. This option is shown highlighted by a red rectangle in step 1 of the image below. Then, select "Choose Target" to choose the ISSM executable. This option is highlighted in step 2 of the image below.

http://i.imgur.com/0zCNfA3.jpg

Next, navigate to $ISSM_DIR/bin/ and select issm.exe as the target process. The final step is adding arguments to the Arguments section below the target process list. The arguments that need to be added are from the end of the "Setting up a profiler test" section. Below is a screenshot of what your target should look like (with the exception of different arguments).

http://i.imgur.com/VWgJEpv.jpg

Profiling your ISSM test

After setting up the target, we can finally start to profile ISSM. To do this, just hit the record button and let Instruments run. The record button is just a red circle at the top by the pause button. In order to get a better idea of how ISSM is performing, it might be necessary to do a few "runs." This just means recording a new run once the previous run has finished. Recording multiple runs allows Instruments and ISSM to overcome any initial setup operations that might falsely inflate the time, compared to a normal scenario. Below is an image showing what happens when we create multiple runs.

We can also view each run by selecting the drop down arrow on the left of the "CPU Usage" label. If a run is unsuccessful or was accidentally paused before completion, we can delete that specific run by selecting it and going to Instrument -> Delete Run # from the main menu.

http://i.imgur.com/K0p992f.jpg

Obtaining Useful Information from a Run

As you might have noticed, there is nothing useful from the default view. In order to see what is happening, we need to change a few settings. To start, click the gear icon located in the middle right hand side of the screen (next to an icon that looks like an 'E'). This shows us the display settings. Under the "Call Tree" options, select: Separate by Thread, Invert Call Tree, Hide System Libraries, and Flatten Recursion. Make sure that these are the only selected options.

This will clean up the view of the call tree and will only show relevant functions ordered by time (from highest to lowest). The image below shows all of the settings needed to obtain accurate information about what functions within ISSM take the most time. The first red rectangle highlights where the display settings icon is while the second red rectangle shows where the "Call Tree" options that need to be changed are.

http://i.imgur.com/j8kCtok.jpg

Call Tree Option Explanations

From the previous section, we selected Separate by Thread, Invert Call Tree, Hide System Libraries, and Flatten Recursion. This section will explain the options that were selected and not selected.

Separate by Thread

For multi-threaded applications, or for applications that call a multi-threaded library, turning this setting on allows us to see what is executed under each thread during run time. This is left on by default, and is extremely useful for profiling multi-threaded applications or tests that might call multi-threaded functions.

Invert Call Tree

This is probably the most important option you can enable if you want to find the most time-consuming functions. Within the call tree there is a running time column and a self time column. The running time displays a total running time for that function, including other functions that are called from that specific function. The self time displays how much time is actually spent inside the function, excluding other function calls.

This concept can be better understood using the following example: If functionA() spends 200ms before calling functionB(), and functionB() takes 400ms, then functionA() will have a running time of 600ms while functionB() only has a running time of 400ms. This makes it seem like functionA() takes the most time, but in reality functionB() is actually the slowest function. With invert call tree on we examine leaves of the call tree. This means that functions which do not call other functions are shown first. With this setting on, sorting by run time actually displays the correct run time and therefore gives us a better idea of what we can optimize.

Hide System Libraries

This setting cleans up our call tree view by hiding away functions we don't have to worry about.

Flatten Recursion

Instead of creating a separate symbol for each level of recursion, only a single function name and symbol is generated for the entire recursive call. This allows us to see what the total time is in a recursive function instead of seeing a partial amount of time for each level of recursion.

Top Functions

This setting will order symbols and functions by total running time. This option isn't needed because the results we use are already sorted by self time. Turning this one while Invert Call Tree is selected will only make the results of your profiling run harder to understand.

Inspecting Functions

The power of Instruments is that you can see what takes the most time in a specific function. In order to do this, just double click on the function name you want to inspect further. Initially, when a function is selected, you will only be presented with an assembly view. This view shows all the assembly functions that take place when the code is executed. There is also a percentage next to an assembly routine that corresponds to a line of code within the function selected.

Since ISSM is not built with XCode, in order to view the corresponding source code relating to each assembly routine, we must locate the source file we want to examine. To do this, select the "designate source" button to locate the file. This will look like a small red exclamation point. Select the file that contains the function you want to examine and then reload the view by going back to the original call tree and then clicking on the function again. The image below shows the steps to take to designate a source file for MatrixMultiply(), and the resulting view you should see.

http://i.imgur.com/ME8LzU8.jpg

By designating a source file for MatrixMultiply(), we can now see that the inner loop and corresponding matrix operations take the most time out of the entire function. This gives us an idea of what we need to start optimizing first.

Known Issues

Below are some issues I've run into in the past while setting up Instruments. Hopefully everything goes smoothly, but if for some reason Instruments doesn't work as intended, check the questions below.

The call tree has mangled, greyed out symbols

By mangled greyed out symbols, I mean seemingly random codes that you can't actually focus on. A picture of what it might look like is shown below.

http://i.imgur.com/aWlLPC6.png

The issue is caused by trying to use the wrong dSYM file when attaching a process. This is caused by using the incorrect issm.exe or by using another issm executable. To fix this, you need to be sure that you have set the target process as issm.exe under $ISSM_DIR/bin/issm.exe. Any other process, even if it is an executable will result in mismatching library symbols, and will cause all functions to be mangled.

Issues locating dSYM libraries or UUID errors when locating dSYM files

When attempting to locate a source file, you might get an error like this:

No new libraries to symbolicate
The specified path didn't locate a dSYM for any of the selected libraries.

Or you might get an error like this when attempting to view a source file in relation to the assembly code:

The UUID of '/Users/me/project/trunk/lib/libCore.0.dylib.dSYM' 
does not match the UUID for 'libCore.0.dylib'

This might be caused by a corrupt dSYM file. In order to validate that this is the root cause, you can run two simple commands in terminal:

dwarfdump --uuid "libCore.0.dylib.dSYM" | cut -d' ' -f2
dwarfdump --uuid "libCore.0.dylib" | cut -d' ' -f2

If the numbers do not match up, then you have a corrupt dSYM file. In order to fix this, run make install under trunk/src/c and then re-configure and rebuild ISSM. This should remake the dSYM file that is corrupt.

The source code view appears to have percentages in impossible areas of the code

Sometimes when viewing the source code, the percentages of time that are usually next to a source code line appear to be mismatched. For example sometimes it will say 50% of the function time is spent in a blank line or spent in a function that isn't even called from the function you are inspecting.

This problem will happen when using an old view of the source code after remaking trunk/src/c. So, if any changes are made to the source code, a new run will have to occur within Instruments and the source code will have to be relocated. Otherwise, it will be referencing the old source code, making it appear incorrect.

Last modified 9 years ago Last modified on 07/01/16 23:27:03
Note: See TracWiki for help on using the wiki.