Support multi-version Lua (5.3/5.4/5.5) with dynamic detection
Major changes:
- Add dynamic Lua version detection via DWARF debug info
- Split lua.h into version-specific headers (lua_5_3_6.h, lua_5_4_0.h, lua_5_5_0.h)
- Build separate BPF skeletons for each Lua version
- Use DWARF to precisely locate the L variable position
Bug fixes:
- Fix tsslen macro in lua_5_4_0.h: use shrlen != 0xFF check instead of tt field
- Fix lua_get_file in lua_5_5_0.h: use strisshr() for short string detection
- Fix build.rs typo: cargo:cargo -> cargo:rerun-if-changed
- Add missing rerun-if-changed for new lua header files
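The `tsslen` fix above replaces a type-tag test with a sentinel test on `shrlen`. A minimal Rust sketch of that selection logic follows. The `TStringHeader` struct is a hypothetical two-field stand-in for the length fields of Lua's `TString`, not the project's actual BPF header, and the `0xFF` marker is taken from the commit message.

```rust
/// Sentinel stored in `shrlen` when the string is a long string.
/// Assumption: mirrors the `0xFF` marker named in the commit message.
const LONG_STRING_MARKER: u8 = 0xFF;

/// Hypothetical, simplified stand-in for the length fields of a Lua `TString`.
struct TStringHeader {
    /// Length of a short string, or `LONG_STRING_MARKER` for a long string.
    shrlen: u8,
    /// Length of a long string; only meaningful when `shrlen` is the marker.
    lnglen: usize,
}

/// Returns the string length, choosing the field by the `shrlen` marker
/// instead of inspecting a type tag.
fn tsslen(s: &TStringHeader) -> usize {
    if s.shrlen != LONG_STRING_MARKER {
        s.shrlen as usize
    } else {
        s.lnglen
    }
}
```

With this check, a header whose `shrlen` is `5` reports length 5 regardless of `lnglen`, while a header whose `shrlen` is `0xFF` falls through to `lnglen`.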
Other changes:
- Update dependencies to latest versions
- Change default sample frequency from 1000Hz to 100Hz
- Bump version to 0.2
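The build-script fix listed under bug fixes can be sketched as follows. This is an illustration rather than the project's `build.rs`: the header names come from the commit message, while the `src/bpf` directory and both function names are assumptions made for the example. `cargo:rerun-if-changed=PATH` is the Cargo directive that makes a build script rerun when the given file changes.

```rust
/// Header names taken from the commit message.
const LUA_HEADERS: [&str; 3] = ["lua_5_3_6.h", "lua_5_4_0.h", "lua_5_5_0.h"];

/// Builds one `cargo:rerun-if-changed` directive per version-specific header.
/// `dir` is the directory holding the headers (hypothetical in this sketch).
fn rerun_directives(dir: &str) -> Vec<String> {
    LUA_HEADERS
        .iter()
        .map(|header| format!("cargo:rerun-if-changed={}/{}", dir, header))
        .collect()
}

/// Prints the directives to stdout, which is where Cargo reads them from
/// when this is called inside a build script's `main`.
fn emit_rerun_directives(dir: &str) {
    for line in rerun_directives(dir) {
        println!("{}", line);
    }
}
```

Calling `emit_rerun_directives("src/bpf")` from a build script's `main` would register all three headers, so editing any one of them triggers a rebuild of the BPF skeletons.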
-`lua-perf` is a performance profiling tool implemented based on `eBPF`, currently supporting `Lua 5.4`.
+`lua-perf` is a performance profiling tool implemented based on `eBPF`, supporting `Lua 5.3`, `Lua 5.4`, and `Lua 5.5`.
## Features
- Provides performance analysis for mixed `C` and `Lua` code, as well as pure `C` code.
- Uses stack sampling technique with minimal performance impact on the target process, making it suitable for production environments.
- Performs stack backtracing in the kernel space using `eh-frame`, eliminating the need for the target process to use the `-fno-omit-frame-pointer` option to preserve stack frame pointers.
+- Automatically detects the Lua version of the target process, with no manual specification required.
+- Precisely locates the `L` variable position via DWARF debug information, supporting GCC/Clang O0~O3 optimization levels.
## Requirements
@@ -19,7 +21,7 @@ To use `lua-perf`, make sure you meet the following requirements:
To generate flame graphs, you need to use `lua-perf` in conjunction with the [FlameGraph](https://github.com/brendangregg/FlameGraph.git) tool. Here's how you can do it:
-1. First, run the command `sudo lua-perf -p <pid> -f <HZ>` to sample the call stacks of the target process and generate a `perf.fold` file in the current directory. `<pid>` is the process ID of the target process, which can be a process inside a Docker container or a process on the host machine. `<HZ>` is the stack sampling frequency, with a default value of `1000` (1000 samples per second).
+1. First, run the command `sudo lua-perf -p <pid> -f <HZ>` to sample the call stacks of the target process and generate a `perf.fold` file in the current directory. `<pid>` is the process ID of the target process, which can be a process inside a Docker container or a process on the host machine. `<HZ>` is the stack sampling frequency, with a default value of `100` (100 samples per second).
2. Next, convert the `perf.fold` file to a flame graph by running `./FlameGraph/flamegraph.pl perf.fold > perf.svg`.
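For readers unfamiliar with the input that `flamegraph.pl` consumes, the sketch below shows how sampled stacks collapse into folded lines of the form `frame;frame;frame count`. It assumes that `perf.fold` follows this common folded-stack layout; the function name and sample frames are hypothetical and do not reflect `lua-perf` internals.

```rust
use std::collections::BTreeMap;

/// Collapses sampled stacks (outermost frame first) into folded lines,
/// one line per distinct stack: "a;b;c count".
fn fold_stacks(samples: &[Vec<&str>]) -> Vec<String> {
    // BTreeMap keeps the output sorted by stack string, so it is deterministic.
    let mut counts: BTreeMap<String, u64> = BTreeMap::new();
    for frames in samples {
        *counts.entry(frames.join(";")).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .map(|(stack, n)| format!("{} {}", stack, n))
        .collect()
}
```

Each identical stack contributes one to its line's count, so at 100 samples per second a line with count 50 corresponds to roughly half a second spent in that stack.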
@@ -35,15 +37,13 @@ In the BPF program, bpf_trace_printk is used to print logs. If you suspect any a
```
sudo mount -t tracefs nodev /sys/kernel/tracing
sudo cat /sys/kernel/debug/tracing/trace_pipe
-These commands will help you access the logs and view them. If you have any further questions, feel free to ask.
```
## Known Issues
`lua-perf` currently has the following known issues:
- Lack of support for `CFA_expression`, which may result in failed stack backtracing in extreme cases.
-- When analyzing Lua stacks, the search for the `L` pointer is currently done by assuming it is stored in register `rbx`, which is correct for most cases with `GCC -O2`. However, depending on the optimization level of GCC, the value of `L` may be stored in a different register, leading to failures in Lua stack analysis.
- The analysis of `CFA` instructions does not handle `vdso` at the moment, causing stack backtracing failures for function calls in `vdso`.
- The process of merging C stacks and Lua stacks uses a heuristic strategy, which may have some flaws in extreme cases (none have been found so far).
@@ -53,7 +53,5 @@ The following tasks are planned for `lua-perf`:
- Support for `CFA_expression`
- Support for `vdso`
-- Dynamic analysis of the `L` register
- Optimization of the merging strategy for C stacks and Lua stacks