bdev/rbd: integrate SpdkContextWQ for reactor thread execution #17
Open
baum wants to merge 614 commits into ceph:ceph-nvmeof-v25.09 from
The keep alive timeout should be less than the admin or IO timeout. Otherwise, it defeats the purpose of the keep alive feature, which is meant to detect communication failures i.e. communication failure should be detected before the expiration of any IO or admin command, rather than after the keep alive timeout. Change-Id: I16dc9064e9d1127d15549d2e5946cf5c23282b57 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27006 Reviewed-by: Marcin Gałecki <marcin.galecki@dell.com> Reviewed-by: Krzysztof Goreczny <krzysztof.goreczny@dell.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
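The ordering rule above can be sketched as a small validation helper. This is only an illustration: `struct demo_timeout_opts`, its field names, and the convention that 0 means "disabled" are assumptions made for the example, not SPDK's actual option layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option set; names and the "0 = disabled" convention are assumptions. */
struct demo_timeout_opts {
    uint32_t keep_alive_timeout_ms;
    uint32_t admin_timeout_ms;
    uint32_t io_timeout_ms;
};

/* Returns true when keep alive would expire before every enabled command timeout. */
bool
demo_timeout_opts_valid(const struct demo_timeout_opts *opts)
{
    assert(opts != NULL);

    if (opts->keep_alive_timeout_ms == 0) {
        return true; /* keep alive disabled, nothing to compare */
    }
    if (opts->admin_timeout_ms != 0 &&
        opts->keep_alive_timeout_ms >= opts->admin_timeout_ms) {
        return false;
    }
    if (opts->io_timeout_ms != 0 &&
        opts->keep_alive_timeout_ms >= opts->io_timeout_ms) {
        return false;
    }
    return true;
}
```

With a 10 s keep alive and 30 s command timeouts the check passes; a keep alive equal to the admin timeout is rejected because the failure would no longer be detected first.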
Keep Alive is not just about the controller detecting failures. It also enables the host to detect either transport or controller failures. See the relevant sections of the NVMe specification rev 2.2 for details. The Linux NVMe/bdev implementation also reuses existing I/O and admin timeout mechanisms and overrides the request timeout value with KATO. This approach makes sense, as it is up to the upper layer to decide how to handle timeouts when tracking is requested and enabled (by using spdk_nvme_ctrlr_register_timeout_callback). However, the Linux implementation uses traffic based keep alive rather than command based.

3.9 Keep Alive
The Keep Alive capability uses the Keep Alive Timer on a controller as a watchdog timer to detect communication failures (e.g., transport failure, host failure, or controller failure) between a host and a controller.

3.9.3.2 Command Based Keep Alive on the Host
(...) If a host detects a Keep Alive Timeout and has outstanding commands for which that host has not received completions (refer to section 3.4.5), then it is strongly recommended that the host take the steps described in section 9.6 to avoid possible data corruption caused by interaction between outstanding commands and subsequent commands submitted by that host to another controller.
(...) The host detects a Keep Alive Timeout if the host sends a Keep Alive command and does not receive a completion for the Keep Alive command before KATT elapses from when the Keep Alive command was sent.

Section 9.6 describes Communication Loss Handling and is subject to further improvements.
Change-Id: Ifa614c3ccbc0852b0a712286ae05f193c4604f35 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27007 Reviewed-by: Marcin Gałecki <marcin.galecki@dell.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com> Community-CI: Mellanox Build Bot Reviewed-by: Krzysztof Goreczny <krzysztof.goreczny@dell.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Two branches were removed from hot path, and readability was increased as the admin qpair is now handled within a single scope. Change-Id: Iebfa1bfb8eac7b314260846f846d1394078f2739 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27017 Community-CI: Mellanox Build Bot Reviewed-by: Krzysztof Goreczny <krzysztof.goreczny@dell.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Marcin Gałecki <marcin.galecki@dell.com>
Change ONCS bits naming to conform to NVMe 2.2 specification definitions. This alignment will allow for easier grepping through the code. Change-Id: I712bf82af6b0d7a7c842b7e278592575096725ba Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27306 Reviewed-by: Marcin Gałecki <marcin.galecki@dell.com> Community-CI: Mellanox Build Bot Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com>
Change FUSES bits naming to conform to NVMe 2.2 specification definitions. This alignment will allow for easier grepping through the code. Change-Id: Ie4be86c5d61761e327b3122e613e9a4da9ca2cbe Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27307 Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Community-CI: Mellanox Build Bot Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Marcin Gałecki <marcin.galecki@dell.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com>
ncurses-devel : Header and development files for ncurses ncurses-libs : Ncurses Libraries ncurses-compat : Ncurses compatibility libraries found in spdk/spdk-ci#132 Change-Id: Ie3512e0b1bdfd3beb846afc0f27815e09716719a Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27330 Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com>
If AIO is a backend block device then we're in the kernel realm in terms of task status and 'uninterruptible sleep' is a valid state for the SPDK process to be in. AIO is used for interrupt mode testing. Fix busy CPU check by treating it the same way as R (Running). Long term fix, as suggested in below comment, would be to track TSC in reactor struct, however it requires more work. This one should resolve the issue short-term. https://review.spdk.io/c/spdk/spdk/+/26187/comment/1ef59015_332894ec/ Fixes spdk#3634 Change-Id: Id1d8b2f0b1a45187bfa2e95edfc12511534b5bc8 Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@dell.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27326 Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
regex without rpc_ prefix will catch more cases Change-Id: If2d7de685dd4461d9e9ede58d095ea1842221173 Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27334 Reviewed-by: Jim Harris <jim.harris@nvidia.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Update submodule after recent cherry-pick with fixes for building with clang 21. Change-Id: I4a227a59c3045119d5abced3f9649b8db8b766c4 Signed-off-by: Karol Latecki <karol.latecki@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27344 Community-CI: Mellanox Build Bot Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Explicitly install lld linker, as we require it for clang builds, and it is not shipped by default in some distros (e.g. Fedora 42 and 43). Lld is installed instead of ld.gold. Gold is deprecated in Fedora 43. Change-Id: Iee0eeacd879428dd8f01edbe8479a2906f3aea8b Signed-off-by: Karol Latecki <karol.latecki@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27249 Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Community-CI: Mellanox Build Bot Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
renamed 'name' to 'key_name' to match c-code, schema and python found when linting c-code and schema Change-Id: I84c1dfabc96b89f00cead56a6310cb8752ac7f77 Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27336 Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com>
'no_auto_visible' moved inside the 'namespace' object found when linting c-code and schema Change-Id: Ic847b395082b53e139c1e2190e0ba9a1a3143caa Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27337 Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com>
renamed 'property' to 'ftl_property' to match c-code, schema and python found when linting c-code and schema Change-Id: I335a575f6058726b12072dfe5f021e60e8961955 Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27350 Reviewed-by: Jim Harris <jim.harris@nvidia.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Use RPC NAME with prefix "_decoders". This is a continuation of earlier work; linting is done in the next separate patch.
Change-Id: I1992ff1b99a636dd65aeeffbd97f83620fd638de
Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27311
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <jim.harris@nvidia.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Stats are protected by lock in bdev_nvme_update_nvme_error_stat (exec on IO threads) whereas missed in bdev_nvme_reset_device_stat (exec on app thread). This can lead to actually not clearing the stats while requested. Stats are also not protected in bdev_nvme_dump_device_stat_json (exec on app thread) but as it doesn't race with bdev_nvme_reset_device_stat this seems fine. While here added assertion checks for app thread in functions using module's interfaces to reset/dump stats. Change-Id: I166aa47e7c97c950df9b92cd736a11f44e3cec9c Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27160 Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Reviewed-by: Changpeng Liu <changpeliu@tencent.com>
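A minimal sketch of the fix described above, assuming a pthread mutex guards the counters. `struct demo_err_stat` and the function names are hypothetical stand-ins, not the bdev_nvme structures. The point is only that the reset path takes the same lock as the update path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NUM_STATUS_TYPES 8

/* Hypothetical stand-in for per-device error counters. */
struct demo_err_stat {
    pthread_mutex_t lock;
    uint32_t status_type[DEMO_NUM_STATUS_TYPES];
};

void
demo_err_stat_init(struct demo_err_stat *stat)
{
    pthread_mutex_init(&stat->lock, NULL);
    memset(stat->status_type, 0, sizeof(stat->status_type));
}

/* Update path (I/O threads in the commit message): increment under the lock. */
void
demo_err_stat_update(struct demo_err_stat *stat, unsigned int sct)
{
    assert(sct < DEMO_NUM_STATUS_TYPES);
    pthread_mutex_lock(&stat->lock);
    stat->status_type[sct]++;
    pthread_mutex_unlock(&stat->lock);
}

/* Reset path (app thread): clear under the same lock so a concurrent
 * increment cannot interleave with the clear. */
void
demo_err_stat_reset(struct demo_err_stat *stat)
{
    pthread_mutex_lock(&stat->lock);
    memset(stat->status_type, 0, sizeof(stat->status_type));
    pthread_mutex_unlock(&stat->lock);
}
```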
This seems harmless, as nbdev is meant to be released soon, but it is not done immediately because spdk_io_device_unregister is async, and some flows might still try to get the first nvme_ns from that list. Found by code inspection. Change-Id: Ic3b07eb836d9c43a8e746418f405bc2fe9ac8ae7 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27162 Reviewed-by: Changpeng Liu <changpeliu@tencent.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Community-CI: Mellanox Build Bot
It was a bool and occupied 1 byte followed by a 3-byte hole in the 2nd cache line. By changing it to an unnamed struct we can occupy 4 bytes and have 31 bits reserved for other flags. This is a preparatory patch for adding a new flag.
Change-Id: I9e9d36d4fb4f39c31409574f01ae3a2f55f028b1
Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27163
Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Reviewed-by: Changpeng Liu <changpeliu@tencent.com>
Community-CI: Mellanox Build Bot
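The layout change can be illustrated with two toy structs. The field names below are hypothetical (the commit message does not name the field), and the size check assumes a typical ABI where `uint32_t` is 4-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Before (hypothetical): a lone bool followed by an implicit 3-byte hole. */
struct demo_before {
    uint32_t a;
    bool flag;
    /* 3 bytes of compiler-inserted padding on typical ABIs */
    uint32_t b;
};

/* After (hypothetical): an unnamed struct of bit-fields fills the same 4 bytes,
 * leaving 31 bits explicitly reserved for future flags. */
struct demo_after {
    uint32_t a;
    struct {
        uint32_t flag : 1;
        uint32_t reserved : 31;
    };
    uint32_t b;
};

/* The struct size is unchanged, so later fields do not shift. */
static_assert(sizeof(struct demo_before) == sizeof(struct demo_after),
              "flag word must not change the struct size");
```

Callers still write `obj.flag = 1`, since members of an unnamed struct are accessed as if they belonged to the enclosing struct (C11).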
This aggregates flag into 2nd cache line and allows to get rid of 3/7 byte holes. Holes are marked explicitly now. By having 4/8 reserved fields it is easier to put some u32 or pointer there without impacting the size or shifting other fields. Reserved fields can be used later without impacting ABI. Change-Id: I055eb67590bf68db271b5ea0ff89f1a2edd96983 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27164 Community-CI: Mellanox Build Bot Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Changpeng Liu <changpeliu@tencent.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
The module’s .get_memory_domains function can be invoked on non app thread during spdk_bdev_open_ext_v2, while some of the objects e.g. in bdev/nvme module (nvme_ns, ctrlr) are modified (or even released) on the app thread causing races and leading to undefined behavior. The fix is to cache the memory_domains_supported in the bdev itself during spdk_bdev_register. This already runs on the app thread and is considered thread safe. There are other SPDK in-tree spdk_bdev_get_memory_domains usages in bdev's rpc hence should run on app thread already. As there is no clear thread contract for spdk_bdev_get_memory_domains app thread assertion check is not included. To make it fully thread safe either contract needs to be added or all domains cached in bdev. Change-Id: I3a6f63ef6d846e45e4f58e36578d4e8d45f95554 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27165 Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Community-CI: Mellanox Build Bot Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Changpeng Liu <changpeliu@tencent.com>
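The caching idea can be sketched as follows. All `demo_*` names are hypothetical, and treating a positive return value as "supported" is an assumption made for the example. The sketch shows only the shape of the fix: the module callback runs once at registration, and later readers look at the cached flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_bdev;

/* Hypothetical callback table, loosely modeled on a bdev function table. */
struct demo_fn_table {
    /* Returns the number of memory domains, or a negative errno. */
    int (*get_memory_domains)(struct demo_bdev *bdev);
};

struct demo_bdev {
    const struct demo_fn_table *fn_table;
    bool memory_domains_supported; /* cached once at registration */
};

/* Registration path (app thread): query the module once and cache the answer. */
void
demo_bdev_register(struct demo_bdev *bdev)
{
    int rc = 0;

    assert(bdev != NULL && bdev->fn_table != NULL);
    if (bdev->fn_table->get_memory_domains != NULL) {
        rc = bdev->fn_table->get_memory_domains(bdev);
    }
    bdev->memory_domains_supported = rc > 0;
}

/* Open path (any thread): reads only the cached flag, never calls the module. */
bool
demo_bdev_memory_domains_supported(const struct demo_bdev *bdev)
{
    return bdev->memory_domains_supported;
}

/* Example module callback that reports two domains. */
int
demo_two_domains(struct demo_bdev *bdev)
{
    (void)bdev;
    return 2;
}
```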
It is just for readability - to give the logical operation a name. Change-Id: I5719ae95fabac735503b410beea1792cac4dae81 Signed-off-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27222 Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeliu@tencent.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
For better memory related debuggability and accounting, this patch adds an API that retrieves the stats related to the heaps. Change-Id: I8538e47d64f4796407e6a243dd28c0d8731e8c81 Signed-off-by: Umang patel <umang.patel@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27117 Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Community-CI: Mellanox Build Bot Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
By default the "install" and "uninstall" actions don't operate in a virtual env. "uv pip uninstall" does not create a venv itself and expects one to be already present before uninstalling, which results in a "No virtual environment found" error. Check if we're running in a virtual env. If not, add the "--system" flag so that the system Python is used instead.
Fixes spdk#3799
Change-Id: Ie4d7f34efcbaa5f5694637d92349d404b14186ee
Signed-off-by: Karol Latecki <karol.latecki@nutanix.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27363
Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
spdk_nvmf_ctrlr_connect has no call sites and no test coverage, and we believe no one is using it anymore. Mark it as deprecated with planned removal in 26.05. Update out-of-date comments that referenced this function.
Change-Id: I936ee9cdf4ec3b0e3f36ad6a99f37be583c42f6e
Signed-off-by: Joel Cunningham <joel.cunningham@oracle.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27360
Reviewed-by: Jim Harris <jim.harris@nvidia.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Community-CI: Mellanox Build Bot
Moves the name copy ahead of the first log. This will allow us to ensure the logging consistently includes the bdev name in a subsequent change.
Change-Id: I27ace531bdc2eb4a9b6a48e62b78eec7a0c89942
Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27358
Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
Saves bdev_io into variable to simplify completion calls. Change-Id: I2f7a50085129b8e20f5e88913840a9a9b4abe0da Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27361 Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
Helps identify the affected bdev. Keeps consistency across several logs.
Change-Id: I35d9af56c99f7901917b6a5165ab402c4be1bdc7
Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27340
Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
For details see a14a949 bdev: clarify thread contract for spdk_bdev_unregister[_by_name] 57358e8 bdev: clarify thread contract for spdk_bdev_finish Change-Id: Icbd83d63d3941e5b75a2a43aa9cc653d85c66ca2 Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27359 Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
The help message can be ignored; the group still enforces proper mutual exclusivity. Also adjusted help messages per gerrit review. Change-Id: If4866d7d29af5860a95ebe8e35c02348a2194fcb Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27348 Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz@tzawadzki.com>
found by linting the code Change-Id: I17cd814f571539e4b185868a4d15c528a743cc2d Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27342 Reviewed-by: Jim Harris <jim.harris@nvidia.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Fixes spdk#3786 After glibc update we no longer can do pthread_join() in exit handler or we get deadlocked. Please see relevant glibc commits - [1] and [2]. This affected only the two fuzzing applications due to LLVMFuzzerRunDriver() calling exit() on its own. See explanation here [3]. [1] https://sourceware.org/git/?p=glibc.git;a=commit;h=f6ba993e0cda0ca5554fd47b00e6a87be5fdf05e [2] https://sourceware.org/git/?p=glibc.git;a=commit;h=c6af8a9a3ce137a9704825d173be22a2b2d9cb49 [3] (0674ead) llvm_nvme_fuzz: add exit handler Change-Id: If2f1c641704f5b1a2ba2026028d26915bcbef03c Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@nutanix.com> Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27265 Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com> Tested-by: SPDK Automated Test System <spdkbot@gmail.com> Reviewed-by: Karol Latecki <karol.latecki@nutanix.com> Reviewed-by: Jim Harris <jim.harris@nvidia.com>
Motivation: see in NVM Express: "Receipt of reserved coded values in defined fields in commands shall be reported as an error". The real issue with the code before the fix was that an ESX initiator that supports XCOPY sent Set-feature HBS selecting a mode that supports XCOPY. The target accepted the command because 'struct spdk_nvme_host_behavior' was extended in the TP supporting XCOPY. As a result, the initiator tried to perform hardware accelerated XCOPY, which is not supported by the target. After applying the fix, the ESX initiator performs a series of reads and writes for volume copy instead of using XCOPY.
Change-Id: I9552e7b2bdc214c3309a1d51549b2e4398f62f04
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit df30d3e)
(cherry picked from commit c51d93d)
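The "reject reserved coded values" rule can be sketched with a toy payload. The struct, the enum, and the set of defined values are hypothetical and are not the Host Behavior Support layout; the sketch only shows a target accepting the values it implements and failing everything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coded field with two defined values; 2..255 are reserved here. */
enum demo_mode {
    DEMO_MODE_DISABLED = 0,
    DEMO_MODE_ENABLED = 1,
};

/* Hypothetical feature payload carrying the coded field. */
struct demo_feature_payload {
    uint8_t mode;
};

/* Returns true only for coded values this target implements.
 * On false, the caller is expected to complete the command with an error. */
bool
demo_feature_payload_valid(const struct demo_feature_payload *payload)
{
    assert(payload != NULL);

    switch (payload->mode) {
    case DEMO_MODE_DISABLED:
    case DEMO_MODE_ENABLED:
        return true;
    default:
        return false; /* reserved coded value */
    }
}
```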
The new code handles reservations when an NVMe host
is connected to different nvmeof GWs.
Each nvmeof GW starts spdk for handling nvmf-tgt.
The feature is implemented partially in a new spdk
file that redefines the handlers of the reservation
'load' and 'update' operations.
Operations related specifically to the rbd device are
called from the new spdk file via the commonly used
bdev->fn_table and handled by the bdev_rbd module.
Required:
1. The spdk reservation logic is not changed. Only the
reservation ops (load, update) are redirected to
another file.
2. Reservation info is stored in a persistent place
per namespace: the rbd image metadata.
3. A change of NS reservation in one of the GWs
should trigger loading of the reservation in the other
GWs' spdk modules (rbd watcher logic).
Added component 'reservation' for logging.
The check-format script was run.
Change-Id: I5b187c6ac5eab123fb488c0e6030a668272e81f0
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit 382b568)
(cherry picked from commit 6edd5fd)
Implemented TP4097 for the nvmf target. The Cancel command was added to the NVMe protocol to improve the Abort mechanism. For example, Abort commands can only be sent on the Admin queue, and the Admin queue is limited in size, so we cannot abort many commands at once. The commands that need to be aborted are in the IO queues, so it might take too long to get the command onto the Admin queue and look for it in the IO queue. The Cancel command is implemented on the IO queue, which gives a much better chance of actually catching and aborting the command. We can also issue as many Cancel commands as the queue allows for IO commands. Implemented as a synchronous flow. Added log component io-cancel, fixed spdk formatting.
Change-Id: I7d5448d697fa09f41549b4934aab81c2e13fc55b
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit d0b226a)
(cherry picked from commit 14eadb7)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 7a564bd)
Signed-off-by: gadi-didi <gadi.didi@ibm.com> (cherry picked from commit 5db008e)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit a7b321f)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 6852fcc)
1. Implemented TCP NVMe-oF read/write IO statistics per controller. Statistics are stored in each QP and presented in buckets by IO size. In each bucket the total delay, bdev delay, networking delay and QoS delay are measured; min, max and mean values are dumped for each type. NVMf TCP IO statistics are collected in the tcp.c transport.
2. Also implemented a new RPC to dump nvmf IO stats per controller and to reset IO stats.
3. Alloc from pool for tcp_req_stats and zmalloc for qp_stats.
4. RPC to enable/disable stats.
Change-Id: I069c4d30e72d68a0b11cf598492c6e3c04ceee1b
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit dfa7458)
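The bucketed min/max/mean accounting described above could look roughly like this. The bucket boundaries, the microsecond unit, and every `demo_*` name are assumptions for illustration; the actual patch tracks several delay types per bucket, while this sketch tracks one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_BUCKETS 4

/* Hypothetical bucket upper bounds in bytes; the last bucket is open-ended. */
static const uint32_t demo_bucket_limit[DEMO_NUM_BUCKETS - 1] = { 4096, 65536, 1048576 };

/* One delay type within one bucket. */
struct demo_lat_stat {
    uint64_t count;
    uint64_t total_us;
    uint64_t min_us;
    uint64_t max_us;
};

/* Map an I/O size to its bucket index. */
unsigned int
demo_bucket_for_size(uint32_t io_size)
{
    unsigned int i;

    for (i = 0; i < DEMO_NUM_BUCKETS - 1; i++) {
        if (io_size <= demo_bucket_limit[i]) {
            return i;
        }
    }
    return DEMO_NUM_BUCKETS - 1;
}

/* Fold one latency sample into the running min/max/total. */
void
demo_lat_stat_add(struct demo_lat_stat *stat, uint64_t latency_us)
{
    assert(stat != NULL);
    if (stat->count == 0 || latency_us < stat->min_us) {
        stat->min_us = latency_us;
    }
    if (latency_us > stat->max_us) {
        stat->max_us = latency_us;
    }
    stat->total_us += latency_us;
    stat->count++;
}

/* Integer mean; 0 when no samples were recorded. */
uint64_t
demo_lat_stat_mean(const struct demo_lat_stat *stat)
{
    return stat->count == 0 ? 0 : stat->total_us / stat->count;
}
```

Keeping only count, total, min and max per bucket keeps the per-request cost constant, and resetting the stats is a plain zero-fill of the bucket array.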
Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 5fb0c44)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 77d121f)
Add --system flag to uv pip install command (via USE_SYSTEM_PYTHON variable) to ensure the system Python3 interpreter is used when installing the SPDK Python package. This fixes the RPM dependency issue where the package was incorrectly requiring /var/spdk/dependencies/pip/bin/python3 instead of /usr/sbin/python3.

Fixes issue spdk#3841 and addresses the uv dependency tracking issue documented at astral-sh/uv#13198

Before fix:
    rpm -qpR spdk-v26.05-1.x86_64.rpm | grep python
    /var/spdk/dependencies/pip/bin/python3

After fix:
    rpm -qpR spdk-v26.05-1.x86_64.rpm | grep python
    /usr/sbin/python3

Also fix %build working directory for generated specs. On Fedora 43 (RPM 4.20+), rpmbuild introduces a separate "build directory" with a -build suffix. When the spec is generated via rpmspec -P, the %setup macro is expanded to literal shell commands, so rpmbuild no longer sets %buildsubdir. This causes %build to start in the parent build directory instead of the source directory, making git submodule update --init fail with "not a git repository". Add an explicit cd into the source directory at the start of %build. In --build-in-place mode the cd fails harmlessly and is silently ignored.

Change-Id: I31930ac55edbc0e6d4f46f6384d6f0fcdb721359
Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/27986
Reviewed-by: Jim Harris <jim.harris@nvidia.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
Reviewed-by: Przemyslaw Wielgo <przemyslaw.wielgo@dell.com>
Reviewed-by: Konrad Sztyber <ksztyber@nvidia.com>
(cherry picked from commit 0e38790)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit cf59213) Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
added 3 commits
April 28, 2026 21:02
Add SpdkContextWQ implementation to schedule RBD I/O operations on SPDK reactor threads instead of the ASIO thread pool. This includes:
- New SpdkContextWQ class implementing the ContextWQ interface
- Reactor thread selection via bdev_rbd_find_reactor_thread()
- Integration with the ContextWQ Ceph API
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
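The general shape of such a work queue, where completions are queued and later run by the owning reactor's poll loop, can be sketched in plain C. This is not the SpdkContextWQ code from the PR: the `demo_*` names are hypothetical, and the ring below is single-threaded. A real cross-thread handoff would need a thread-safe mechanism, which this sketch deliberately leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WQ_DEPTH 16

/* Callback shape assumed for the sketch: an opaque context plus a result code. */
typedef void (*demo_ctx_fn)(void *ctx, int result);

struct demo_wq_entry {
    demo_ctx_fn fn;
    void *ctx;
    int result;
};

/* Fixed-size ring; head is the next entry to run, tail the next free slot. */
struct demo_wq {
    struct demo_wq_entry entries[DEMO_WQ_DEPTH];
    size_t head;
    size_t tail;
};

/* Queue a completion instead of running it on the caller's thread.
 * Returns false when the ring is full. */
bool
demo_wq_queue(struct demo_wq *wq, demo_ctx_fn fn, void *ctx, int result)
{
    size_t next_tail = (wq->tail + 1) % DEMO_WQ_DEPTH;

    assert(fn != NULL);
    if (next_tail == wq->head) {
        return false;
    }
    wq->entries[wq->tail] = (struct demo_wq_entry){ fn, ctx, result };
    wq->tail = next_tail;
    return true;
}

/* Called from the reactor's poll loop: run everything queued so far
 * and report how many callbacks were executed. */
size_t
demo_wq_poll(struct demo_wq *wq)
{
    size_t ran = 0;

    while (wq->head != wq->tail) {
        struct demo_wq_entry e = wq->entries[wq->head];

        wq->head = (wq->head + 1) % DEMO_WQ_DEPTH;
        e.fn(e.ctx, e.result);
        ran++;
    }
    return ran;
}

/* Example callback: stores the result into the int that ctx points to. */
void
demo_store_result(void *ctx, int result)
{
    *(int *)ctx = result;
}
```

The callback does not run at queue time; it runs only when the poll function is called, which is what moves execution onto the polling thread.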
Reactor threads are discovered and cached, and each new RBD bdev gets the next reactor in round-robin order so SpdkContextWQ work is spread across cores instead of a single reactor. Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
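Round-robin assignment over a cached list of reactors can be sketched as below. The cache struct, the opaque `void *` thread handles, and the function name are hypothetical. The sketch also assumes assignment happens from a single thread, so no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_REACTORS 64

/* Hypothetical cache of discovered reactor threads (opaque handles here). */
struct demo_reactor_cache {
    void *threads[DEMO_MAX_REACTORS];
    size_t count; /* number of valid entries in threads[] */
    size_t next;  /* index handed to the next new bdev */
};

/* Return the reactor for the next new bdev and advance the cursor,
 * wrapping around so work is spread across all cached reactors. */
void *
demo_next_reactor(struct demo_reactor_cache *cache)
{
    void *thread;

    assert(cache->count <= DEMO_MAX_REACTORS);
    if (cache->count == 0) {
        return NULL; /* nothing discovered yet */
    }
    thread = cache->threads[cache->next];
    cache->next = (cache->next + 1) % cache->count;
    return thread;
}
```

With three cached reactors, four consecutive calls return the first, second, third, and then the first again.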
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>