Remove the modulo operations in spsc by sgued · Pull Request #652 · rust-embedded/heapless

sgued · 2026-04-02T07:21:06Z

These modulo operations used to be well optimized when N was a power of 2. However, the consumer and producer now use view types that make N runtime-dependent, which prevents the compiler from optimizing the modulo operations even when N is always a power of 2.

This patch leverages the fact that head and tail are always kept lower than N to replace the modulo operations with a simple if, which the compiler optimizes well, leaving no branch in the generated code.
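As a minimal sketch of the idea (not the code from this patch): `increment` is a hypothetical helper that advances a ring-buffer index by one slot. It relies on the same invariant described above, that the index is always lower than `n`, so a single comparison can stand in for the modulo.

```rust
/// Advance a ring-buffer index by one slot, wrapping back to 0 at `n`.
///
/// Hypothetical helper for illustration only. It assumes the invariant
/// `i < n`, which is what makes the `%` unnecessary.
fn increment(i: usize, n: usize) -> usize {
    // Cannot overflow: i < n <= usize::MAX, so i + 1 <= usize::MAX.
    let next = i + 1;
    // One comparison replaces `(i + 1) % n`. Compilers can usually lower
    // this to a conditional move or select instead of a division.
    if next == n {
        0
    } else {
        next
    }
}
```

For any `i < n`, `increment(i, n)` returns the same value as `(i + 1) % n`, including when `n` is not a power of two and is only known at run time.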

Closes #650

sgued · 2026-04-02T07:21:58Z

I wish we had some benchmarking infra in place. This would help me make sure this actually solves the problem and would have helped detect it in the first place.

sgued · 2026-04-02T07:45:01Z

Clippy lint fixes are already included in #644

sgued · 2026-04-02T07:45:37Z

@823984418 does this fix your performance issues?


jannic · 2026-04-07T17:00:12Z

src/spsc.rs

            let head = self.rb.head.load(Ordering::Relaxed);

-            let i = (head + self.index) % self.rb.n();
+            let i = head + self.index;


Could this (theoretically) overflow? Say n is larger than half of usize::MAX, the queue has already wrapped around so that head is close to the end of the underlying array, and index is large?

(Very unlikely in practice, and I only had this idea because I wondered why QueueInner::len uses wrapping_add/wrapping_sub).
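To make the concern concrete, here is a small sketch using a hypothetical `naive_offset` helper (not part of heapless). It forms `head + index` with `checked_add`, so an overflow shows up as `None` rather than as a panic or a silently wrapped value.

```rust
/// Form `head + index` before any reduction into the range `0..n`.
///
/// Hypothetical helper for illustration only. Returns `None` when the
/// sum does not fit in a `usize`.
fn naive_offset(head: usize, index: usize) -> Option<usize> {
    head.checked_add(index)
}
```

With `n = usize::MAX - 1`, the values `head = usize::MAX - 2` and `index = 3` are both lower than `n`, yet `naive_offset(head, index)` returns `None` because the sum exceeds `usize::MAX`.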

sgued · 2026-04-07 (edited)

Good catch.

This would be a bug even with the original implementation if n is not a divisor of usize::MAX + 1 (that is, not a power of two).

I don't think this is the same as for len. In len we're doing wrapping operations because we "know" the subtraction is going to go negative (and thus wrap) when current_head > current_tail. We could in theory remove the wrapping operations by changing the order, but then it's susceptible to the same bug: it could in theory overflow if N is too close to usize::MAX.

I'll take a look at fixing this.
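The point about len can be sketched as follows. `ring_len` is a hypothetical free function, not the actual `QueueInner::len`, and it assumes `head < n` and `tail < n`. When `tail` is behind `head`, the subtraction wraps, and adding `n` wraps it back into range. Writing the same thing as `tail + n - head` with plain operators could overflow on `tail + n` when `n` is close to `usize::MAX`.

```rust
/// Number of occupied slots between `head` and `tail` in a ring of `n` slots.
///
/// Hypothetical helper for illustration only; assumes `head < n` and
/// `tail < n`.
fn ring_len(head: usize, tail: usize, n: usize) -> usize {
    if tail >= head {
        tail - head
    } else {
        // `tail - head` would be negative, so let it wrap around
        // 2^BITS, then add `n` to wrap back into `0..n`.
        tail.wrapping_sub(head).wrapping_add(n)
    }
}
```

For example, with `n = 5`, `head = 4` and `tail = 1`, the occupied slots are 4 and 0, and `ring_len(4, 1, 5)` returns 2. The wrapping calls never trigger overflow checks, even in debug builds.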

jannic · 2026-04-08

> We could in theory remove the wrapping operations by changing the order, but then it's susceptible to the same bug: it could in theory overflow if N is too close to usize::MAX.

Yes, sorry, my comment about len was a bit short. Of course, as it's currently written, the wrapping_sub/wrapping_add calls are required. I was thinking about removing them by changing the order. But then:
a) as you noticed, it would have the same potential overflow issue
b) wrapping operations might be more efficient, because they don't require any overflow checks (when those are enabled, e.g. in debug mode or by setting overflow-checks = true in [profile.release])

So I decided not to suggest removing the wrapping operations.

sgued · 2026-04-09 (edited)

I added a fix for this in f438685.

Thanks Jannic for the find: rust-embedded#652 (comment). If N == usize::MAX, there is the possibility of a panic in len(). If N >= usize::MAX, then in the iterator code, self.index + self.head could overflow. The operations are now slightly more complex and a bit slower, but thanks to compiler optimizations they don't introduce branches, only conditional instructions (cmov, csel, it). All added tests fail without the fix.
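The code of that commit is not reproduced on this page. One way to get an overflow-free offset, sketched here with a hypothetical `wrapped_offset` helper and the assumptions `head < n` and `index <= n`, is to compare `index` against the distance from `head` to the end of the buffer, so that `head + index` is only formed when it is known to stay below `n`.

```rust
/// Compute `(head + index) mod n` without forming an overflowing sum.
///
/// Hypothetical helper for illustration only; assumes `head < n` and
/// `index <= n`.
fn wrapped_offset(head: usize, index: usize, n: usize) -> usize {
    // Slots from `head` to the end of the buffer; at least 1 since head < n.
    let to_end = n - head;
    if index >= to_end {
        // The offset passes the end: continue counting from slot 0.
        index - to_end
    } else {
        // head + index < n here, so the addition cannot overflow.
        head + index
    }
}
```

With `n = usize::MAX - 1`, `head = usize::MAX - 2` and `index = 3`, the plain sum would overflow, while `wrapped_offset` returns 2. The single comparison is the kind of code that compilers can typically turn into a conditional instruction rather than a branch.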

sgued force-pushed the rem-perf branch from 24e86eb to 4ed150a (April 7, 2026 10:02)

sgued force-pushed the rem-perf branch from 4ed150a to 0950363 (April 7, 2026 10:05)

jannic mentioned this pull request on Apr 7, 2026 in #650, "The %(rem) operation seems to cause unexpected overhead." (open)

jannic reviewed on Apr 7, 2026

sgued force-pushed the rem-perf branch from e0ea5f8 to 3516395 (April 9, 2026 21:06)

sgued force-pushed the rem-perf branch from 3516395 to f438685 (April 9, 2026 21:08)

sgued wants to merge 2 commits into rust-embedded:main from sgued:rem-perf
