Skip to content

fix: fix npe on pp after rename folder (#35260)#35298

Open
gortiz-dotcms wants to merge 4 commits intomainfrom
issue-35260-npe-pp-after-rename
Open

fix: fix npe on pp after rename folder (#35260)#35298
gortiz-dotcms wants to merge 4 commits intomainfrom
issue-35260-npe-pp-after-rename

Conversation

@gortiz-dotcms
Copy link
Copy Markdown
Member

@gortiz-dotcms gortiz-dotcms commented Apr 13, 2026

This PR fixes: #35260

Problem

Push-publishing a renamed folder (with content inside, or with child subfolders) failed in two ways:

  1. NPE in dependency processorPushPublishigDependencyProcesor looked up the folder by its old identifier, which no longer existed after rename, producing a NullPointerException on folder.getPath().
  2. Content not included in bundle — contentlets inside the renamed folder were excluded because their version_ts was not updated after the rename, so push-publish considered them unchanged. The same applied to content inside child subfolders.
  3. "Conflicts between Folders" on the receiver — child subfolders were arriving in the bundle with the same UUID they had before the rename. On the receiver side this caused a path mismatch that triggered a conflict error.

Root cause

These problems pre-date the recent folder-rename refactors (PRs #35086 and #35176). The rename workflow replaced the original per-item move() approach with a bulk SQL update of identifier.parent_path, but did not replicate two side effects that move() performed automatically:

  • Bumping contentlet_version_info.version_ts so push-publish detects the content as changed.
  • Giving child subfolders new deterministic identifiers (UUIDs) when their path changes.

Fix

Content not detected as changed

A bulk UPDATE contentlet_version_info SET version_ts is issued for all non-folder assets under the renamed path, followed by targeted IdentifierCache eviction per language. This replicates what the old per-contentlet move() did, but in a single SQL statement.

Child subfolder identity

Folder identifiers in dotCMS are deterministic: the UUID is a hash of assetType:hostname:parentPath:folderName. When a parent folder is renamed, every child subfolder's path changes — meaning each child must receive a new identifier. This was the original behavior before the rename refactors.

The fix restores that original approach: for each child subfolder (sorted parent-before-child so findFolderByPath resolves correctly), getNewFolderRecord() is called to create a new record with a new deterministic UUID. The bulk updateChildPaths SQL is kept for non-folder identifiers only (AND asset_type != 'folder'), and old child folder records are deleted in child-before-parent order to satisfy the check_child_assets DB trigger. Because every folder in the push bundle now carries a new UUID, the receiver always takes the create path in FolderHandler — no move() cascade, no conflict.

Removed bumpSubFolderModDateForSubtree

A previous attempt bumped folder.mod_date for child subfolders to force push-publish to include them. This is no longer needed: getNewFolderRecord() sets modDate = new Date(), so every new child folder record is automatically newer than any previous push date and will be included without manual bumping. The bump also caused the "Conflicts between Folders" receiver error and has been removed.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 13, 2026

Claude finished @gortiz-dotcms's task in 3m 59s —— View job


🔍 dotCMS Backend Review

  • Gather context and read conventions
  • Get Java diff
  • Run specialized sub-agent reviews (Security, Database, Java Standards, REST API)
  • Consolidate findings and post review

3 findings posted — all 🟡 Medium. Security and REST API sub-agents returned no issues. Full review in the separate backend review comment.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 13, 2026

🔍 dotCMS Backend Review

[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.java:1092

Log injection (carry-over from previous review — still not fixed). rootPath is composed from user-supplied newName and is embedded directly into Logger.debug() without CR/LF sanitization. A user with folder-rename permission who creates a name containing \r\n can inject fake log entries (OWASP A09 log spoofing). Note: FolderAPIImpl sanitizes the name locally for its own logging, but that sanitization is not propagated to the factory.

Logger.debug(FolderFactoryImpl.class,
    "Bumped version_ts and evicted version-info cache for " + affected.size()
        + " contentlet(s) under path '" + rootPath + "'");   // line 1092-1094
// Same pattern at line 1123-1124 in evictContentletCacheForSubtree()

💡 Strip CR/LF before logging: rootPath.replaceAll("[\r\n]", " ") — or add the validation upstream in FolderAPIImpl.validateFolderName().

Fix this →


[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.java:1058

Duplicate DB round-trips in bumpVersionTsForSubtree(). The method first SELECTs all affected identifiers (to collect IDs for cache eviction), then issues a separate UPDATE with the identical WHERE clause. While transactionally safe under @WrapInTransaction, this doubles the scan of the identifier table for large folder trees. If PostgreSQL's RETURNING clause were used the SELECT could be eliminated entirely.

// Round-trip 1: SELECT i.id FROM identifier i WHERE parent_path LIKE ? …
final List<Map<String, Object>> affected = new DotConnect().setSQL("SELECT i.id …").loadObjectResults();

// Round-trip 2: UPDATE contentlet_version_info … WHERE identifier IN (SELECT i.id … same WHERE)
new DotConnect().executeUpdate("UPDATE contentlet_version_info … WHERE identifier IN (SELECT …)", …);

💡 Collapse into one query using UPDATE contentlet_version_info SET version_ts = ? WHERE identifier IN (SELECT …) RETURNING identifier (PostgreSQL), or run the SELECT first and feed the ID list into the UPDATE with an IN (?, ?, …) to avoid re-scanning identifier.


Next steps

  • 🟡 Both findings are mechanical fixes — you can ask me to handle them inline: @claude fix log injection in FolderFactoryImpl.java
  • Every new push triggers a fresh review automatically

View job run

gortiz-dotcms and others added 3 commits April 14, 2026 08:34
…UID behavior

Child subfolder identifiers are no longer preserved after a parent rename —
each subfolder gets a new deterministic UUID (path change = identity change),
so the old UUIDs are deleted. Updated both renameFolder tests to assert the
new behavior: old identifiers are null, sub-folders are findable by new path
with new UUIDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…E order fix

- Add version_ts assertion to renameFolder_updatesChildrenAndSubChildrenPaths:
  captures a timestamp before the rename and asserts all three contentlets
  (file in parent, page in parent, file in sub-folder) have version_ts > beforeRename,
  protecting the core push-publish fix against regression.

- Flip SELECT/UPDATE order in bumpVersionTsForSubtree: run the UPDATE first,
  then SELECT affected rows for cache eviction. The previous order had a race
  window where a concurrent INSERT between SELECT and UPDATE would be bumped
  but its cache entry would not be evicted. Running UPDATE first closes that gap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix ClassCastException in test: use loadObjectResults() instead of
  loadResults() so version_ts comes back as Timestamp, not String
- Fix variant mismatch (High): restrict UPDATE to DEFAULT variant only
  (AND variant_id = ?) so the UPDATE scope aligns exactly with the
  cache eviction, which already targets DEFAULT via
  removeContentletVersionInfoToCache(id, lang)
- Fix duplicate round-trips (Medium): SELECT affected identifiers first,
  then feed the collected ID list into the UPDATE via an explicit IN (...)
  to avoid scanning the identifier table twice

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 14, 2026

🔍 dotCMS Backend Review

[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.javabumpVersionTsForSubtree

Unbounded IN clause can exceed PostgreSQL's 32,767-parameter limit. ids.size() + 2 bind params are allocated into a single Object[] and passed to one executeUpdate. On a large site with more than ~32,765 non-folder assets under the renamed path, the JDBC driver throws PSQLException: too many bind parameters and the entire rename transaction rolls back. The SELECT is fine; only the UPDATE is affected.

final String placeholders = ids.stream().map(id -> "?").collect(Collectors.joining(", "));
final Object[] params = new Object[ids.size() + 2];
// ...all ids in one shot...
new DotConnect().executeUpdate(
    "UPDATE contentlet_version_info SET version_ts = ? WHERE identifier IN (" + placeholders + ") ...", params);

💡 Partition ids into chunks of 1,000 and loop: for (int start = 0; start < ids.size(); start += 1_000) { List<String> chunk = ids.subList(start, Math.min(start + 1_000, ids.size())); ... }


[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.javabumpVersionTsForSubtree and evictContentletCacheForSubtree (carry-over — still not fixed)

Log injection via user-supplied rootPath. bumpVersionTsForSubtree receives newPath (= parentPath + newName + "/") where newName is user-supplied. A user with rename permission who creates a folder name containing \r\n can inject fake log entries (OWASP A09 log spoofing). The same pattern appears in evictContentletCacheForSubtree where rootPath is oldPath, also partially user-derived.

Logger.debug(FolderFactoryImpl.class,
    "Bumped version_ts and evicted version-info cache for " + ids.size()
        + " contentlet(s) under path '" + rootPath + "'");   // rootPath contains user-supplied newName

💡 Strip CR/LF before logging: rootPath.replaceAll("[\r\n]", " ") — or add the validation upstream in FolderAPIImpl so names containing CR/LF are rejected before reaching the factory.


[🟡 Medium] dotCMS/src/main/java/com/dotmarketing/portlets/folders/business/FolderFactoryImpl.javabumpVersionTsForSubtree

new Date() passed as a TIMESTAMP bind parameter instead of Timestamp. JDBC accepts java.util.Date for timestamp columns, but java.sql.Timestamp is the semantically correct type and is already used elsewhere in this file (e.g. new Timestamp(new Date().getTime())). Inconsistency here can surface as precision loss or driver-specific behaviour on some JDBC implementations.

params[0] = new Date();   // java.util.Date — should be java.sql.Timestamp

💡 Replace with params[0] = new Timestamp(System.currentTimeMillis()); — consistent with the rest of FolderFactoryImpl.


Next steps

  • 🟡 Both the IN-clause chunking and new Date() fix are mechanical — you can ask me to apply them inline: @claude fix unbounded IN clause in bumpVersionTsForSubtree / @claude fix new Date() to Timestamp in bumpVersionTsForSubtree
  • 🟡 Log injection: fix or reject upstream — @claude fix log injection in FolderFactoryImpl.java
  • Every new push triggers a fresh review automatically

@dsilvam dsilvam requested a review from wezell April 15, 2026 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Null Pointer Exception when push publishing a folder after rename

3 participants