[#73523] Implement WorkPackage semantic ID allocation system#22581
[#73523] Implement WorkPackage semantic ID allocation system#22581
WorkPackage semantic ID allocation system#22581Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces a semantic Work Package identifier system ({PROJECT}-{SEQ} like PROJ-42) that remains resolvable across project renames and work package moves by maintaining an alias registry, while also updating routing and services to use the new identifier forms.
Changes:
- Adds DB support for per-project sequence allocation (
projects.wp_sequence_counter), per-WP sequence/semantic ID (work_packages.sequence_number,work_packages.semantic_id), and a global alias registry (work_package_semantic_aliases). - Introduces model concerns to allocate/update semantic IDs on WP create/move and project rename, plus lookup helpers (
find_by_identifier). - Updates routes/controller to accept semantic IDs in URLs and adds integration/model specs for the new behavior.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
db/migrate/20260326100000_create_work_package_semantic_ids.rb |
Adds columns, alias table, and indexes for semantic ID storage + lookup. |
app/models/work_package/semantic_identifier.rb |
Implements semantic ID allocation, alias writes, and identifier resolution helpers. |
app/models/projects/semantic_identifier.rb |
Implements project-rename propagation for semantic IDs and aliases. |
app/models/work_package_semantic_alias.rb |
Adds alias model with validations/associations and documentation. |
app/services/work_packages/update_service.rb |
Hooks WP move handling to update semantic IDs after project changes. |
app/services/projects/update_service.rb |
Hooks project identifier changes to update semantic IDs/aliases. |
app/controllers/work_packages_controller.rb |
Updates WP lookup to support semantic IDs. |
config/routes.rb |
Broadens WP show route constraint to include semantic IDs. |
config/locales/en.yml |
Adds i18n labels for the alias model attributes. |
.rubocop.yml |
Allows new find_by_* methods under Rails/DynamicFindBy. |
spec/services/work_packages/semantic_ids/integration_spec.rb |
End-to-end coverage for create/move/rename/multi-step scenarios. |
spec/models/work_package/semantic_identifier_spec.rb |
Unit coverage for lookup and write-side behavior. |
spec/models/work_package_semantic_alias_spec.rb |
Validations/association coverage for alias model. |
judithroth
left a comment
There was a problem hiding this comment.
Ok, maybe I did not yet grasp every detail, but here are some first thoughts about this. It's coming along nice!
| def rewrite_semantic_ids(like_pattern, prefix, new_prefix) | ||
| WorkPackage | ||
| .where("semantic_id LIKE ?", like_pattern) | ||
| .update_all(["semantic_id = REPLACE(semantic_id, ?, ?)", prefix, new_prefix]) |
There was a problem hiding this comment.
Since the number of WorkPackages and the number of WorkPackageSemanticAliases should be roughly the same (differing only by the ones that were moved to another project) I wonder if we should use in_batches for both? (if we stay with doing it through ActiveRecord - if both could be done in the database directly that would also be nice).
| @@ -29,6 +29,7 @@ | |||
| #++ | |||
|
|
|||
| class WorkPackage < ApplicationRecord | |||
There was a problem hiding this comment.
Okay, I was thinking around the "batch-insert-and-change" code (see my last comments.
At first I thought we would do something like (horrible pseudo-code incoming):
transaction do
wp.identifier = new_identifier
wp.save!
WorkPackageSemanticAliases.create!(identifier: new_identifier, work_package_id: id)
endwhen we detect changes on the identifier.
But compared to your solution, this will be slower since we're going record by record. So I strongly prefer your solution!
However, I was thinking if we should somehow disallow changes to the identifier of a WorkPackage, to make sure, no one is messing with a WorkPackages identifier manually. On Projects, it's allowed to change the identifier of the record, so someone else could easily somehow not understand that just changing the identifier on a WorkPackage in isolation "breaks" our system.
There was a problem hiding this comment.
However, I was thinking if we should somehow disallow changes to the
identifierof a WorkPackage, to make sure, no one is messing with a WorkPackages identifier manually
I gave it some thought yesterday and couldn't figure out a way that didn't seem needlessly messy (e.g. introducing a thread-local lock variable that is momentarily bypassed in semantic helpers).
This would mainly be a warning to our developers, so I guess the closest thing that comes to mind is some extra custom validation on the model. But then, we would probably have to make it check some state, which already feels like an overkill. 🤔
There was a problem hiding this comment.
Side note: If we decide to keep that sequence number field for pagination purposes, this is where an extra validation actually makes a lot of sense -- just a simple one-liner that verifies that both fields are set at once, as opposed to only identifier or only sequence number. That should address the denormalization concerns from another comment.
There was a problem hiding this comment.
Side note: If we decide to keep that sequence number field for pagination purposes, this is where an extra validation actually makes a lot of sense -- just a simple one-liner that verifies that both fields are set at once, as opposed to only identifier or only sequence number. That should address the denormalization concerns from another comment.
=> Add WP validator to ensure setting identifier + semantic number at the same time
There was a problem hiding this comment.
Thanks for adding the follow-up work packages, Tom! I don't have a solution handy for the topic I raised in my original comment, but I still think this is important. So I added an open point for it: https://community.openproject.org/wp/73713
|
🍊 55 commits for this PR (rubocop × 5, "cosmetics", "remove empty line", "rename", "lint"). Should be squashed before merge. |
This is where I would really rather appreciate squash merge. 😢 If we do a cosmetic squash + force push just to have things cleaned up, on GitHub it often completely wipes out the linearity of PR discussion and makes the individual threads actually harder to track down. If the merge policy is iron-clad, I'm considering closing this PR, re-pushing squashed version to a new PR, then merging that one instead. (I know, the alternative is to also write "nicer" commits from the very beginning. But I think it's very rare that a long-running PR consists primarily of self-contained, meaningful and testable commits.) |
AFAIK, it's not iron clad- we have loads of commits in dev that are less "nicer" :) - 🍏 your call |
WorkPackage semantic ID resolution system (query-optimized)WorkPackage semantic ID resolution system
144bf65 to
46027a4
Compare
46027a4 to
c4debc8
Compare
OK, after squashing this all into a single commit just now, I have to admit that at least the discussion doesn't get lost in the ether anymore. So, this is where GitHub seems to have improved over the last few years. 👍 |
WorkPackage semantic ID resolution systemWorkPackage semantic ID allocation system
Ticket
https://community.openproject.org/projects/communicator-stream/work_packages/73436
https://community.openproject.org/projects/communicator-stream/work_packages/73523
What are you trying to accomplish?
Add support for semantic WP identifiers in the format of
{project identifier}-{sequence number}(e.g. "PROJ-42").The semantic WP identifier gets generated into the
WorkPackage.identifiercolumn via anafter_createhook. This hook only functions ifSetting::WorkPackageIdentifieris set tosemantic(which is currently opt-in and hidden behind a feature flag).The semantic WP identifiers have to be stable across time, i.e. even if the identifier changes multiple times (due to project rename or WP move), all old identifiers must still resolve to the same WP.
Additionally, we need to support "ghost identifiers", which are identifiers that have technically never been generated by the system, but we should still somehow support them to allow flexible aliases. Example:
PROJcontains WPPROJ-1PROJgets renamed toPROJ_NEWPROJ_NEWhas a new WP addedPROJ-2needs to resolve toPROJ_NEW-2What approach did you choose and why?
Project.wp_sequence_counterto track per-project WP sequencesWorkPackage.sequence_numberto provide numeric representation of the sequence part of the IDWorkPackage.identifierto reflect the current semantic ID of the work packageWorkPackageSemanticAliasto track all possible handles for each WP — both current and retired identifiersThe flow consists of the following steps, firing off a single query for the happy path, and 2 for expired/ghost identifiers.
work_packages.semantic_id(1 query).WorkPackageSemanticAlias(1 additional query).404.Notes on initialization: The
wp_sequence_counter,sequence_number,semantic_id, and alias records for existing work packages are initialized by the appropriate admin procedure when a user enables the semantic identifier option in admin settings. New work packages created after enabling the feature are handled automatically on insert/move.Tradeoffs:
REPLACE-based bulk update forsemantic_idto keep rename cost bounded.WorkPackageinsert & move become a bit heavier as now they also have to write down aliases.Merge checklist