Skip to content

fix(content-type): fix ensure parameter causing duplicate entries and skipped items in paginated results#35300

Open
rjvelazco wants to merge 8 commits intomainfrom
issue-35213-bug-content-drive-dotassetfileasset-duplicate-in-content-type-filter-after-reload-and-become-unselectable
Open

fix(content-type): fix ensure parameter causing duplicate entries and skipped items in paginated results#35300
rjvelazco wants to merge 8 commits intomainfrom
issue-35213-bug-content-drive-dotassetfileasset-duplicate-in-content-type-filter-after-reload-and-become-unselectable

Conversation

@rjvelazco
Copy link
Copy Markdown
Member

@rjvelazco rjvelazco commented Apr 13, 2026

What's the problem?

The /api/v1/contenttype endpoint supports an ensure parameter — a way to guarantee that a specific content type appears in the first page of results, even if it wouldn't land there in normal alphabetical order. Content Drive uses this so that when you reload the page with a content type already selected in the URL (e.g. ?filters=contentType:dotAsset), the multiselect dropdown can immediately show your pre-selected item.

The problem: the pagination math didn't account for the slot that the ensure item "borrowed" from page 1. This caused two bugs at once:

Bug 1 — The ensured item appears twice

Consider 7 content types sorted alphabetically and a request for 3 per page with ensure=FileAsset (which naturally lives at position 5):

Page Expected Actual (before fix)
1 (offset=0) [Blog, DotAsset, FileAsset] [Blog, DotAsset, FileAsset]
2 (offset=3) [Generic, News, Product] [Generic, News, FileAsset] ❌ FileAsset again!
3 (offset=6) [Widget] [Widget]

Bug 2 — A content type goes missing forever

Because FileAsset was "borrowed" to page 1, it displaced the item that should have been in the 3rd slot. That item — Generic in the example — gets pushed to page 2. But the offset for page 2 was never adjusted, so the database query skips right over it:

Content type Natural position Where it ended up
Blog 0 Page 1 ✅
DotAsset 1 Page 1 ✅
FileAsset 5 Page 1 ✅ (via ensure)
Generic 2 Nowhere — permanently skipped
News 3 Page 2 ✅
Product 4 Page 2 ✅
Widget 6 Page 3 ✅

So after scrolling through the entire list, FileAsset showed up twice and Generic never appeared at all.


Root Cause

Both bugs trace back to two lines in ContentTypeAPIImpl.search():

Problem 1 — The ensure item lookup was called with the client's offset:

// Before: on page 2 (offset=3), this query asks for 'FileAsset' starting at row 3.
// FileAsset is only 1 row — OFFSET 3 skips past it and returns nothing.
// With no IDs in the exclusion set, the normal query returns FileAsset again at its natural position.
contentTypeFactory.find(requestedContentTypes, null, offset, limit, orderBy);

Problem 2 — The normal paginated query used the raw client offset, unaware that page 1 had "borrowed" a slot:

// Before: on page 2, offset=3 skips positions 0,1,2 — but position 2 was displaced by ensure on page 1.
// That displaced item is never reachable.
performSearch(..., offset, includedIds);

The Fix

Three targeted changes in ContentTypeAPIImpl.search():

  1. Always look up ensure items from offset=0 — their IDs are needed every time to build the exclusion list for the normal query.
  2. Only add ensure items to the response on page 1 (offset == 0) — subsequent pages don't re-include them.
  3. Subtract ensureCount from the offset on pages 2+ — this shifts the window back so the displaced item is no longer skipped.
// After: ensure items always resolved from offset=0
contentTypeFactory.find(requestedContentTypes, null, 0, limit, orderBy);

// Their IDs always go into the exclusion set — prevents them reappearing in normal results
includedIds.add(contentType.inode());
includedIds.add(contentType.id());

// Only add to the response body on page 1
if (offset == 0 && ...) returnTypes.add(contentType);

// Adjust the offset on pages 2+ to recover displaced items
int adjustedOffset = offset > 0
    ? Math.max(0, offset - authorizedIncluded.size())
    : offset;

Result with the fix:

Page offset adjustedOffset Normal query returns Final page
1 0 0 Blog, DotAsset (limit=2, FileAsset excluded) Blog, DotAsset, FileAsset
2 3 2 Generic, News, Product (offset 2, FileAsset excluded) Generic, News, Product
3 6 5 Widget Widget

All 7 items returned exactly once.


Proposed Changes

  • ContentTypeAPIImpl.search() — fix ensure item lookup offset, gate page-1-only inclusion, apply adjusted offset to normal paginated query

Additional Info

This fix covers the single base-type path (the path taken by Content Drive when no base-type filter is active — the scenario that triggered the original bug report). The multi-type UNION path (ContentTypeFactoryImpl.searchMultipleTypes) has analogous code and the same latent bugs, but it is only reached when multiple base-type filters are selected simultaneously — a separate, lower-priority follow-up.

Videos

Issue

video.mov

Fix

video.mov

This PR fixes: #35213

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 13, 2026

Claude finished @nollymar's task in 3m 38s —— View job


PR Review

  • Read changed files
  • Analyze the fix logic
  • Review test coverage
  • Post findings

The core fix (always resolve ensure items from offset=0, only emit them on page 1, always exclude their IDs) is sound for the described scenario. Two latent bugs in the offset adjustment math are worth flagging.


Bug 1 — Offset drift when the ensure item sits inside a later page's DB batch

performSearch advances its internal rollingOffset by the raw batch count (rollingOffset += rawContentTypes.size()), not by the filtered count. When an ensure item is excluded in-memory mid-batch, the rolling offset overshoots, and the next client page starts at the same DB row — producing a duplicate.

Concrete example with ensure=D (natural position 3), limit=3:

Step DB offset DB returns After exclude D rollingOffset after
Page 2, pass 1 2 C, D, E C, E 5
Page 2, pass 2 5 F F
Page 2 result C, E, F
Page 3, pass 1 5 (=6−1) F, G F, G
Page 3 result F, G ← duplicate

This triggers whenever adjustedOffset lands inside a window where the excluded ensure item is present and triggers the while-loop's extra fetch. The fix would need to track the actual last DB row consumed by a page and feed that as the next page's adjustedOffset — not just offset - ensureCount. Fix this →


Bug 2 — Offset over-adjusted when the ensure item is already in page 1's natural range

adjustedOffset = offset - authorizedIncluded.size() always subtracts ensureCount on pages 2+, regardless of whether the ensure item actually displaced anything. If the ensure item sorts naturally into page 1 (e.g. ensure=B where B is at position 1, limit=3), page 1 looks correct but page 2 starts one row too early — causing the last item from page 1 to reappear.

Concrete example with ensure=B (natural position 1), limit=3:

Page adjustedOffset DB returns (excludes B) Result
1 0 A, C (limit 2) A, B, C (B prepended) ✅
2 2 (=3−1) C, D, E C, D, E ← C duplicated ❌

The adjustment should only subtract the count of ensure items whose natural positions are below offset (i.e. they were truly displaced from the current page's window). Fix this →


Minor: Tests don't cover the edge cases above

All three ensure tests use items at positions 5 or 4–5 of 7 with limit=3 — well beyond the page 1 boundary. No coverage for:

  • ensure item at natural position 0, 1, or 2 (within page 1's natural range → Bug 2)
  • ensure item at position 3 (the exact page boundary → Bug 1)

The reported regression is fixed for the production scenario (dotAsset/FileAsset late in the sort order), but the offset math remains fragile for edge cases.


Unrelated: removal of inode() from the exclusion set (line ~988–994)

The change from dual inode()+id() to id() only is clean and consistent within this method. As long as no other code path adds to includedIds via inode() (none visible here), this is safe.


Bottom line: The fix resolves the original bug (FileAsset/dotAsset duplicating on page 2). Two correctness gaps exist in the adjustedOffset calculation for ensure items near or within page 1's natural range — these should be covered by tests and fixed before the offset math is relied on more broadly.

@rjvelazco rjvelazco changed the title fix(Content Drive): duplicated contenttype in Content Drive filter li… fix(content-type): fix ensure parameter causing duplicate entries and skipped items in paginated results Apr 13, 2026
…set-duplicate-in-content-type-filter-after-reload-and-become-unselectable
…set-duplicate-in-content-type-filter-after-reload-and-become-unselectable
…set-duplicate-in-content-type-filter-after-reload-and-become-unselectable
Content type id and inode are the same value, so tracking both in
includedIds was redundant. Removed duplicate inode() calls in favor
of id() only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nollymar nollymar enabled auto-merge April 15, 2026 22:26
…set-duplicate-in-content-type-filter-after-reload-and-become-unselectable
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[BUG] Content Drive: dotAsset/FileAsset duplicate in content type filter after reload and become unselectable

4 participants