aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2022-10-04perf test: Add kernel lock contention testGravatar Namhyung Kim 1-0/+73
Add a new shell test to check if both normal 'perf lock record' + contention and BPF (with -b) option are working. Use 'perf bench sched messaging' as a workload since it creates some contention for sending and receiving messages. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220924004221.841024-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock: Add -q/--quiet option to suppress header and debug messagesGravatar Namhyung Kim 2-11/+20
Like in 'perf report', this option is to suppress header and debug messages. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220924004221.841024-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock: Add -E/--entries optionGravatar Namhyung Kim 2-5/+25
Like in 'perf top', the -E option can limit number of entries to print. It can be useful when users want to see top N contended locks only. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220924004221.841024-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: waiting.sh: Parameterize timeoutsGravatar Adrian Hunter 1-9/+17
Let helper functions accept a parameter to specify time out values in tenths of a second. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220914080150.5888-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Move helper functions for waitingGravatar Adrian Hunter 2-64/+73
Move helper functions for waiting to a separate file so they can be shared. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220914080150.5888-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Add per-thread testGravatar Adrian Hunter 1-0/+247
When tracing the kernel with Intel PT, text_poke events are recorded per-cpu. In per-thread mode that results in a mixture of per-thread and per-cpu events and mmaps. Check that happens correctly. The debug output from perf record -vvv is recorded and then awk used to process the debug messages that indicate what file descriptors were opened and whether they were mmapped or set-output. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220912083412.7058-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf tools: Add debug messages and comments for testingGravatar Adrian Hunter 3-0/+12
Add debug messages to enable scripts to track aspects of 'perf record' behaviour. The messages will be consumed after 'perf record' has run, with the exception of "perf record has started" which is consequently flushed. Put comments so developers know which messages are also being used by test scripts. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Add more output in preparation for more testsGravatar Adrian Hunter 1-1/+11
When there are more tests it won't be obvious which test failed. Add more output so that it is. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Fix return checkingGravatar Adrian Hunter 1-3/+3
The use of set -e will cause a function that returns non-zero to terminate the script unless the result is consumed by || for example. That is OK if there is only 1 test function, but not if there are more. Prepare for more by using ||. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Use quotes around variable expansionGravatar Adrian Hunter 1-6/+6
As suggested by shellcheck, use quotes around variable expansion. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Use grep -c instead of grep plus wc -lGravatar Adrian Hunter 1-1/+1
As suggested by shellcheck, use grep -c instead of grep plus wc -l Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Stop using backticksGravatar Adrian Hunter 1-1/+1
As suggested by shellcheck, stop using backticks. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Stop using exprGravatar Adrian Hunter 1-3/+3
As suggested by shellcheck, stop using expr. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Fix redirectionGravatar Adrian Hunter 1-1/+1
As reported by shellcheck, 2>&1 must come after >/dev/null Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Use a temp directoryGravatar Adrian Hunter 1-4/+10
Create a directory for temporary files so that mktemp needs to be used only once. It also enables more temp files to be added without having to add them also to the cleanup. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: test_intel_pt.sh: Add cleanup functionGravatar Adrian Hunter 1-2/+16
Add a cleanup function that will still clean up if the script is terminated prematurely. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf tests: Fix 'perf probe' error log check in skip_if_no_debuginfoGravatar Athira Rajeev 1-1/+1
The perf probe related tests like probe_vfs_getname.sh which is in "tools/perf/tests/shell" directory have dependency on debuginfo information in the kernel. Currently debuginfo check is handled by skip_if_no_debuginfo function in the file "lib/probe_vfs_getname.sh". skip_if_no_debuginfo function looks for this specific error log from perf probe to skip the testcase: <<>> Failed to find the path for the kernel|Debuginfo-analysis is not supported <>> But in some case, like this one in powerpc, while running this test, observed error logs is: <<>> The /lib/modules/<version>/build/vmlinux file has no debug information. Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package. Error: Failed to add events. <<>> Update the skip_if_no_debuginfo function to include the above error, to skip the test in these scenarios too. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220916104904.99798-1-atrajeev@linux.vnet.ibm.com Reviewed-By: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf annotate: Toggle full address <-> offset displayGravatar Namhyung Kim 3-3/+26
Handle 'f' key to toggle the display offset and full address. Obviously it only works when users set to see disassembler output ('o' key). It'd be useful when users want to see the full virtual address in the TUI annotate browser. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf tools: Add 'addr' sort keyGravatar Namhyung Kim 5-1/+43
Sometimes users want to see actual (virtual) address of sampled instructions. Add a new 'addr' sort key to display the raw addresses. $ perf record -o- true | perf report -i- -s addr # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] # # Total Lost Samples: 0 # # Samples: 12 of event 'cycles:u' # Event count (approx.): 252512 # # Overhead Address # ........ .................. # 42.96% 0x7f96f08443d7 29.55% 0x7f96f0859b50 14.76% 0x7f96f0852e02 8.30% 0x7f96f0855028 4.43% 0xffffffff8de01087 Note that it just compares and displays the sample ip. Each process can have a different memory layout and the ip will be different even if they run the same binary. So this sort key is mostly meaningful for per-process profile data. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf inject: Clarify build-id options a little bitGravatar Namhyung Kim 1-2/+4
Update the documentation of --build-id and --buildid-all options to clarify the difference between them. The former requires full sample processing to find which DSOs are actually used. While the latter simply injects every DSO's build-id from MMAP{,2} records, skipping SAMPLEs. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf record: Fix a segfault in record__read_lost_samples()Gravatar Namhyung Kim 1-0/+6
When it fails to open events record__open() returns without setting the session->evlist. Then it gets a segfault in the function trying to read lost sample counts. You can easily reproduce it as a normal user like: $ perf record -p 1 true ... perf: Segmentation fault ... Skip the function if it has no evlist. And add more protection for evsels which are not properly initialized. Fixes: a49aa8a54e861af1 ("perf record: Read and inject LOST_SAMPLES events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220909235024.278281-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf top: Fix error code in cmd_top()Gravatar Shang XiaoJing 1-0/+3
There are three error paths which return success: 1. Propagate the errno from evlist__create_maps() if it failed. 2. Return -EINVAL if top.sb_evlist is NULL. 3. Return -EINVAL if evlist__add_bpf_sb_event() failed. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220922141438.22487-4-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf stat: Merge cases in process_evlistGravatar Shang XiaoJing 1-3/+1
As two cases in process_evlist has same behavior, make the first fall through to the second. Commiter notes: Added __fallthrough, the kernel has "fallthrough", we need to make tools/ use it. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220922141438.22487-3-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf genelf: Fix error code in jit_write_elf()Gravatar Shang XiaoJing 1-0/+1
The error code is set to -1 at the beginning of jit_write_elf(), but it is assigned by jit_add_eh_frame_info() in the middle, hence the following error can only return the error code of jit_add_eh_frame_info(). Reset the error code to the default value after being assigned by jit_add_eh_frame_info(). Fixes: 086f9f3d7897d808 ("perf jit: Generate .eh_frame/.eh_frame_hdr in DSO") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stefano Sanfilippo <ssanfilippo@chromium.org> Link: https://lore.kernel.org/r/20220922141438.22487-2-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock contention: Skip stack trace from BPFGravatar Namhyung Kim 2-4/+6
Currently it collects stack traces to max size then skip entries. Because we don't have control how to skip perf callchains. But BPF can do it with bpf_get_stackid() with a flag. Say we have max-stack=4 and stack-skip=2, we get these stack traces. Before: After: .---> +---+ <--. .---> +---+ <--. | | | | | | | | | +---+ usable | +---+ | max | | | max | | | stack +---+ <--' stack +---+ usable | | X | | | | | | +---+ skip | +---+ | | | X | | | | | `---> +---+ `---> +---+ <--' <=== collection | X | +---+ skip | X | +---+ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock contention: Allow to change stack depth and skipGravatar Namhyung Kim 4-9/+28
It needs stack traces to find callers of locks. To minimize the performance overhead it only collects up to 8 entries for each stack trace. And it skips first 3 entries as they came from BPF, tracepoint and lock functions which are not interested for most users. But it turned out that those numbers are different in some configuration. Using fixed number can result in non meaningful caller names. Let's make them adjustable with --stack-depth and --skip-stack options. On my setup, the default output is like below: # /perf lock con -ab -F contended,wait_total sleep 3 contended total wait type caller 28 4.55 ms rwlock:W __bpf_trace_contention_begin+0xb 33 1.67 ms rwlock:W __bpf_trace_contention_begin+0xb 12 580.28 us spinlock __bpf_trace_contention_begin+0xb 60 240.54 us rwsem:R __bpf_trace_contention_begin+0xb 27 64.45 us spinlock __bpf_trace_contention_begin+0xb If I change the stack skip to 5, the result will be like: # perf lock con -ab -F contended,wait_total --stack-skip 5 sleep 3 contended total wait type caller 32 715.45 us spinlock folio_lruvec_lock_irqsave+0x61 26 550.22 us spinlock folio_lruvec_lock_irqsave+0x61 15 486.93 us rwsem:R mmap_read_lock+0x13 12 139.66 us rwsem:W vm_mmap_pgoff+0x93 1 7.04 us spinlock tick_do_update_jiffies64+0x25 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock contention: Show full callstack with -v optionGravatar Namhyung Kim 3-4/+57
Currently it shows a caller function for each entry, but users need to see the full call stacks sometimes. Use -v/--verbose option to do that. # perf lock con -a -b -v sleep 3 Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols contended total wait max wait avg wait type caller 1 10.74 us 10.74 us 10.74 us spinlock __bpf_trace_contention_begin+0xb 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffbb8b8e75 bpf_trace_run2+0x35 0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb 0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5 0xffffffffbc1c26ff _raw_spin_lock+0x1f 0xffffffffbb841015 tick_do_update_jiffies64+0x25 0xffffffffbb8409ee tick_irq_enter+0x9e 1 7.70 us 7.70 us 7.70 us spinlock __bpf_trace_contention_begin+0xb 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffbb8b8e75 bpf_trace_run2+0x35 0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb 0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5 0xffffffffbc1c26ff _raw_spin_lock+0x1f 0xffffffffbb7bc27e raw_spin_rq_lock_nested+0xe 0xffffffffbb7cef9c load_balance+0x66c Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock contention: Factor out get_symbol_name_offset()Gravatar Namhyung Kim 1-9/+19
It's to convert addr to symbol+offset. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: Add basic core_wide expression testGravatar Ian Rogers 1-0/+13
Add basic test for coverage, similar to #smt_on. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf metrics: Wire up core_wideGravatar Ian Rogers 8-43/+134
Pass state necessary for core_wide into the expression parser. Add system_wide and user_requested_cpu_list to perf_stat_config to make it available at display time. evlist isn't used as the evlist__create_maps, that computes user_requested_cpus, needs the list of events which is generated by the metric. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf stat: Delay metric parsingGravatar Ian Rogers 3-18/+39
Having metric parsing as part of argument processing causes issues as flags like metric-no-group may be specified later. It also denies the opportunity to optimize the events on SMT systems where fewer events may be possible if we know the target is system-wide. Move metric parsing to after command line option parsing. Because of how stat runs this moves the parsing after record/report which fail to work with metrics currently anyway. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf topology: Add core_wideGravatar Ian Rogers 4-0/+70
It is possible to optimize metrics when all SMT threads (CPUs) on a core are measuring events in system wide mode. For example, TMA metrics defines CORE_CLKS for Sandybrdige as: if SMT is disabled: CPU_CLK_UNHALTED.THREAD if SMT is enabled and recording on all SMT threads: CPU_CLK_UNHALTED.THREAD_ANY / 2 if SMT is enabled and not recording on all SMT threads: (CPU_CLK_UNHALTED.THREAD/2)* (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK ) That is two more events are necessary when not gathering counts on all SMT threads. To distinguish all SMT threads on a core vs system wide (all CPUs) call the new property core wide. Add a core wide test that determines the property from user requested CPUs, the topology and system wide. System wide is required as other processes running on a SMT thread will change the counts. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf smt: Compute SMT from topologyGravatar Ian Rogers 6-101/+49
The topology records sibling threads. Rather than computing SMT using siblings in sysfs, reuse the values in topology. This only applies when the file smt/active isn't available. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf expr: Move the scanner_ctx into the parse_ctxGravatar Ian Rogers 5-14/+11
We currently maintain the two independently and copy from one to the other. This is a burden when additional scanner context values are necessary, so combine them. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf pmu: Remove perf_pmu_lex() needless declarationGravatar Gaosheng Cui 1-2/+0
It builds without it, perhaps with some older combination of flex/bison we needed this, clean it up a bit removing this. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220909044542.1087870-3-cuigaosheng1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf sort: Remove hist_entry__sort_list() and sort__first_dimension() ↵Gravatar Gaosheng Cui 1-2/+0
leftover declarations The hist_entry__sort_list and sort__first_dimension functions have been removed in commit cfaa154b2335d4c8 ("perf tools: Get rid of obsolete hist_entry__sort_list"), remove them. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220909044542.1087870-2-cuigaosheng1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf test: Skip sigtrap test on old kernelsGravatar Namhyung Kim 1-1/+64
If it runs on an old kernel, perf_event_open would fail because of the new fields sigtrap and sig_data. Just skipping the test could miss an actual bug in the kernel. Let's check BTF (when we have libbpf) if it has the sigtrap field in the perf_event_attr. Otherwise, we can check it with a minimal event config. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Song Liu <song@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # Using BTF to check for the struct members Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908230150.4105955-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf sched: Factor out destroy_tasks()Gravatar Namhyung Kim 1-2/+22
Add destroy_tasks() as a counterpart of create_tasks() and put the thread safety notations there. After join, it destroys semaphores too. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908225448.4105056-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf cpumap: Add range data encodingGravatar Ian Rogers 5-87/+166
Often cpumaps encode a range of all CPUs, add a compact encoding that doesn't require a bit mask or list of all CPUs. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.king@intel.com> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Kees Kook <keescook@chromium.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220614143353.1559597-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf events: Prefer union over variable length arrayGravatar Ian Rogers 4-34/+27
It is possible for casts to introduce alignment issues, prefer a union for perf_record_event_update. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.king@intel.com> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Kees Kook <keescook@chromium.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220614143353.1559597-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf vendor events: Update events for Neoverse E1Gravatar Nick Forrington 18-267/+2
These CPUs contain the same PMU events (as per the Arm Technical Reference manuals for Cortex A65 and Neoverse E1) This de-duplicates event data, and avoids issues in previous E1 event data (not present in A65 data) * Missing implementation defined events * Inclusion of events that are not implemented: - L1D_CACHE_ALLOCATE - SAMPLE_POP - SAMPLE_FEED - SAMPLE_FILTRATE - SAMPLE_COLLISION Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Nick Forrington <nick.forrington@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220907154932.60808-1-nick.forrington@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf timechart: Add p_state_end helperGravatar Shang XiaoJing 1-19/+18
Wrap repeated code in helper functions p_state_end, which alloc a new power_event recording last pstate, and insert to the head of tchart->power_events. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908021141.27134-5-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf timechart: Add create_pidcomm helperGravatar Shang XiaoJing 1-12/+16
Wrap repeated code combined with alloc of per_pidcomm in helper function create_pidcomm. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908021141.27134-4-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf lock: Add get_key_by_aggr_mode helperGravatar Shang XiaoJing 1-76/+53
Wrap repeated code in helper functions get_key_by_aggr_mode and get_key_by_aggr_mode_simple, which assign the value to key based on aggregation mode. Note that for the conditions not support LOCK_AGGR_CALLER, should call get_key_by_aggr_mode_simple directly. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908021141.27134-3-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf trace: Use zalloc() to save initialization of syscall_statsGravatar Shang XiaoJing 1-4/+1
As most members of syscall_stats is set to 0 in thread__update_stats, using zalloc() directly. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908021141.27134-2-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf vendor events arm64: Move REMOTE_ACCESS to "memory" categoryGravatar Nick Forrington 6-15/+9
Move REMOTE_ACCESS event from other.json to memory.json for Neoverse CPUs. This is consistent with other Arm (Cortex) CPUs. Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Nick Forrington <nick.forrington@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220908112519.64614-1-nick.forrington@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf intel-pt: Remove first line of log dumped on errorGravatar Adrian Hunter 1-5/+28
Instead of printing "(first line may be sliced)", always remove the first line of the debug log if the buffer has wrapped when dumping on error. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220905073424.3971-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf intel-pt: Support itrace option flag d+e to log on errorGravatar Adrian Hunter 4-5/+117
Pass d+e option and log size via intel_pt_log_enable(). Allocate a buffer for log messages and provide intel_pt_log_dump_buf() to dump and reset the buffer upon decoder errors. Example: $ sudo perf record -e intel_pt// sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.094 MB perf.data ] $ sudo perf config itrace.debug-log-buffer-size=300 $ sudo perf script --itrace=ed+e+o | head -20 Dumping debug log buffer (first line may be sliced) Other ffffffff96ca22f6: 48 89 e5 Other ffffffff96ca22f9: 65 48 8b 05 ff e0 38 69 Other ffffffff96ca2301: 48 3d c0 a5 c1 98 Other ffffffff96ca2307: 74 08 Jcc +8 ffffffff96ca2311: 5d Other ffffffff96ca2312: c3 Ret ERROR: Bad RET compression (TNT=N) at 0xffffffff96ca2312 End of debug log buffer dump instruction trace error type 1 time 15913.537143482 cpu 5 pid 36292 tid 36292 ip 0xffffffff96ca2312 code 6: Trace doesn't match instruction Dumping debug log buffer (first line may be sliced) Other ffffffff96ce7fe9: f6 47 2e 20 Other ffffffff96ce7fed: 74 11 Jcc +17 ffffffff96ce7fef: 48 8b 87 28 0a 00 00 Other ffffffff96ce7ff6: 5d Other ffffffff96ce7ff7: 48 8b 40 18 Other ffffffff96ce7ffb: c3 Ret ERROR: Bad RET compression (TNT=N) at 0xffffffff96ce7ffb Warning: 8 instruction trace errors Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220905073424.3971-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf intel-pt: Improve object code read error messageGravatar Adrian Hunter 1-1/+2
The offset is more readable in hex instead of decimal. Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220905073424.3971-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf intel-pt: Improve man page layout slightlyGravatar Adrian Hunter 1-0/+8
Improve man page layout slightly by adding blank lines. Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220905073424.3971-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>