@@ -30,44 +30,85 @@ section, type "make parsec". The resulting histograms are in the following file
3030 - parsec_experiment/freqmine_rd_l1_output/freqmine-PID.reuse.hpcrun.bin
3131 - parsec_experiment/freqmine_rd_spatial_l1_output/freqmine-PID.reuse.hpcrun.bin
3232
33- Running ReuseTracker to profile An Individual Application
34- ===============
33+
34+ Usage on Intel
35+ ==============
3536- To profile the private cache temporal reuse distance of an application, run the following command.
3637
3738HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 <./your_executable> your_args
3839
40+ - To profile the private cache spatial reuse distance of an application, run the following command.
41+
42+ HPCRUN_WP_REUSE_PROFILE_TYPE="SPATIAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 <./your_executable> your_args
43+
3944- To profile the shared cache temporal reuse distance of an application, run the following command.
4045
4146HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=true HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 <./your_executable> your_args
4247
48+ - To profile the shared cache spatial reuse distance of an application, run the following command.
49+
50+ HPCRUN_WP_REUSE_PROFILE_TYPE="SPATIAL" HPCRUN_PROFILE_L3=true HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 <./your_executable> your_args
51+
52+
53+ Usage on AMD
54+ ============
55+ - To profile the private cache temporal reuse distance of an application, run the following command.
56+
57+ HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_AMD_REUSETRACKER -e IBS_OP@100000 -e AMD_L1_DATA_ACCESS@100000000 <./your_executable> your_args
58+
59+ - To profile the private cache spatial reuse distance of an application, run the following command.
60+
61+ HPCRUN_WP_REUSE_PROFILE_TYPE="SPATIAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_AMD_REUSETRACKER -e IBS_OP@100000 -e AMD_L1_DATA_ACCESS@100000000 <./your_executable> your_args
62+
63+ - To profile the shared cache temporal reuse distance of an application, run the following command.
64+
65+ HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=true HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_AMD_REUSETRACKER -e IBS_OP@100000 -e AMD_L1_DATA_ACCESS@100000000 <./your_executable> your_args
66+
67+ - To profile the shared cache spatial reuse distance of an application, run the following command.
68+
69+ HPCRUN_WP_REUSE_PROFILE_TYPE="SPATIAL" HPCRUN_PROFILE_L3=true HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_AMD_REUSETRACKER -e IBS_OP@100000 -e AMD_L1_DATA_ACCESS@100000000 <./your_executable> your_args
70+
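The eight commands above differ only in the profile type, the HPCRUN_PROFILE_L3 setting, and the event list. The following is a minimal sketch of a helper that assembles the matching command line and prints it without running it. The function name build_reusetracker_cmd and its argument order are hypothetical and not part of ReuseTracker; the sketch also assumes that each event specification is written as a single shell word, as in the AMD commands.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReuseTracker): print the hpcrun command
# line for a given vendor, profile type, and cache level. Nothing is executed.
# The body runs in a subshell, so its variables do not leak to the caller.
build_reusetracker_cmd() (
    vendor="$1"   # intel | amd
    type="$2"     # TEMPORAL | SPATIAL
    cache="$3"    # private | shared
    shift 3       # the rest: executable and its arguments

    case "$type" in
        TEMPORAL|SPATIAL) ;;
        *) echo "unknown profile type: $type" >&2; exit 1 ;;
    esac
    case "$cache" in
        private) l3=false ;;   # private cache: HPCRUN_PROFILE_L3=false
        shared)  l3=true ;;    # shared cache:  HPCRUN_PROFILE_L3=true
        *) echo "unknown cache level: $cache" >&2; exit 1 ;;
    esac
    case "$vendor" in
        intel) events="-e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000" ;;
        amd)   events="-e WP_AMD_REUSETRACKER -e IBS_OP@100000 -e AMD_L1_DATA_ACCESS@100000000" ;;
        *) echo "unknown vendor: $vendor" >&2; exit 1 ;;
    esac

    printf 'HPCRUN_WP_REUSE_PROFILE_TYPE="%s" HPCRUN_PROFILE_L3=%s HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun %s %s\n' \
        "$type" "$l3" "$events" "$*"
)

# Example: shared cache spatial reuse distance on AMD
build_reusetracker_cmd amd SPATIAL shared ./your_executable your_args
```

Printing the command instead of running it keeps the sketch safe to try on a machine without ReuseTracker installed; the printed line can be pasted into a shell once it looks right.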
71+
72+ Attribution to Locations in Source Code
73+ =======================================
4374
44- - To attribute the detected uses, reuses and the total of reuse distances captured between each use-reuse pair to their locations in source code lines and program stacks,
75+ To attribute the detected uses, reuses and the total of reuse distances captured between each use-reuse pair to their locations in source code lines and program stacks,
76+ you need to take the following steps:
77+
78+ 1. Compile the code that you want to profile with the "-g" flag so that debugging information is available.
79+
80+ 2. To attribute the detected uses and reuses to their locations in source code lines and program stacks,
4581you need to take the following steps:
4682
4783a. Download and extract a binary release of hpcviewer from
4884http://hpctoolkit.org/download/hpcviewer/latest/hpcviewer-linux.gtk.x86_64.tgz
4985
5086b. Run ReuseTracker on a program to be profiled
5187
52- For example: HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -o name_of_output_folder -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 <./your_executable> your_args
88+ On Intel:
89+
90+ HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 <./your_executable> your_args
91+
92+ On AMD:
5393
94+ HPCRUN_WP_REUSE_PROFILE_TYPE="TEMPORAL" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true reusetracker-bin/bin/hpcrun -e WP_AMD_REUSETRACKER -e IBS_OP@100000 -e AMD_L1_DATA_ACCESS@100000000 <./your_executable> your_args
5495
5596c. Extract the static program structure from the profiled program by using hpcstruct
5697
57- hpcstruct <./your_executable>
98+ reusetracker-bin/bin/hpcstruct <./your_executable>
5899
59100The output of hpcstruct is <./your_executable>.hpcstruct.
60101
61102d. Generate an experiment result database using hpcprof
62103
63- hpcprof -S <./your_executable>.hpcstruct -o <name of database > name_of_output_folder
104+ reusetracker-bin/bin/hpcprof -S <./your_executable>.hpcstruct -o <name of database> <name of output folder>
64105
65106The output of hpcprof is a folder named <name of database>.
66107
67108e. Use hpcviewer to read the content of the experiment result database in a GUI
68109
69110hpcviewer/hpcviewer <name of database>
70111
71- Information on program stack and source code lines is available in the Scope column, and
72- the other columns display numbers of detected reuses, total time distance, and
73- total reuse distance that correspond to the source code lines.
112+ Information on program stack and source code lines is available in the Scope column,
113+ and the other columns display numbers of detected reuses, total time distance,
114+ and total reuse distance that correspond to the source code lines.
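Steps b to e can be chained in one script. The following dry-run sketch only prints the four commands in order, so the sequence can be checked before anything is executed. The function name print_attribution_steps and its three parameters are hypothetical. Passing the measurement folder to hpcrun with -o follows the earlier form of step b and is an assumption here, and the executable is assumed to be compiled with "-g" already. The Intel private cache temporal command is used for step b.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for steps b-e instead of running them.
# print_attribution_steps and its parameters are hypothetical names.
print_attribution_steps() (
    exe="$1"      # executable to profile, assumed to be built with -g
    outdir="$2"   # measurement folder; writing it with -o is an assumption
    db="$3"       # name of the database folder created by hpcprof
    bin=reusetracker-bin/bin

    # b. run ReuseTracker (Intel, private cache, temporal reuse distance)
    printf '%s\n' "HPCRUN_WP_REUSE_PROFILE_TYPE=\"TEMPORAL\" HPCRUN_PROFILE_L3=false HPCRUN_WP_REUSE_BIN_SCHEME=4000,2 HPCRUN_WP_CACHELINE_INVALIDATION=true HPCRUN_WP_DONT_FIX_IP=true HPCRUN_WP_DONT_DISASSEMBLE_TRIGGER_ADDRESS=true $bin/hpcrun -o $outdir -e WP_REUSETRACKER -e MEM_UOPS_RETIRED:ALL_LOADS@100000 -e MEM_UOPS_RETIRED:ALL_STORES@100000 $exe"
    # c. extract the static program structure
    printf '%s\n' "$bin/hpcstruct $exe"
    # d. generate the experiment result database
    printf '%s\n' "$bin/hpcprof -S $exe.hpcstruct -o $db $outdir"
    # e. open the database in hpcviewer
    printf '%s\n' "hpcviewer/hpcviewer $db"
)

print_attribution_steps ./your_executable name_of_output_folder name_of_database
```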