Infer WorkflowRun from CreateAction in five-safes SHACL profiles #104

Open
EttoreM wants to merge 6 commits into develop from 43-run-createaction-checks-only-against-the-createactions-that-the-profile-cares-about

Conversation

@EttoreM EttoreM commented Mar 20, 2026

Closes #43.

This PR implements a solution for treating the workflow-execution CreateAction as a dedicated ro-crate:WorkflowRun entity across the Five Safes SHACL profiles.

At a high level:

  • A hidden SHACL rule was added to the relevant TTL files to infer the triple CreateAction -> rdf:type -> ro-crate:WorkflowRun for the CreateAction whose instrument matches RootDataEntity -> mainEntity.

  • All SHACL constraints that actually concern the workflow run were then retargeted from the generic CreateAction to WorkflowRun, with corresponding updates to shape names and messages.

  • Tests were updated accordingly.
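
For readers less familiar with SHACL rules, the inference described above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not the rule shipped in the TTL files: triples are represented as tuples of prefixed-name strings, the root is assumed to already carry the ro-crate:RootDataEntity type, and the function name `infer_workflow_runs` and the sample node identifiers are hypothetical.

```python
# Plain-Python model of the inference; the PR itself implements it as a SHACL rule.
RDF_TYPE = "rdf:type"


def infer_workflow_runs(triples):
    """Return the input triples plus an inferred ro-crate:WorkflowRun type
    for every CreateAction whose instrument is the root's mainEntity."""
    triples = set(triples)
    roots = {s for s, p, o in triples
             if p == RDF_TYPE and o == "ro-crate:RootDataEntity"}
    main_entities = {o for s, p, o in triples
                     if p == "schema:mainEntity" and s in roots}
    create_actions = {s for s, p, o in triples
                      if p == RDF_TYPE and o == "schema:CreateAction"}
    inferred = {
        (s, RDF_TYPE, "ro-crate:WorkflowRun")
        for s, p, o in triples
        if p == "schema:instrument" and s in create_actions and o in main_entities
    }
    return triples | inferred


# Hypothetical crate: "#run" executes the main workflow, "#sign-off" does not.
crate = {
    ("./", RDF_TYPE, "ro-crate:RootDataEntity"),
    ("./", "schema:mainEntity", "#workflow"),
    ("#run", RDF_TYPE, "schema:CreateAction"),
    ("#run", "schema:instrument", "#workflow"),
    ("#sign-off", RDF_TYPE, "schema:CreateAction"),
    ("#sign-off", "schema:instrument", "#other-tool"),
}
result = infer_workflow_runs(crate)
```

Only `#run` gains the extra type here, so shapes targeting WorkflowRun would skip `#sign-off` entirely.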

One important testing detail:

  • In the Python tests, the graph mutations still target entities of type CreateAction, not WorkflowRun.

  • This is intentional: WorkflowRun is inferred by SHACL during validation, so it is safer for the tests to alter the source CreateAction nodes in the RO-Crate metadata graph.

  • This remains correct as long as the tests are understood to target the specific CreateAction that is effectively the workflow run, and not any other CreateAction that may also be present in the crate.
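
A rough sketch of what this means for a test, again in plain Python with tuples standing in for the RDF graph. The helper names, the node identifiers, and the choice of schema:endTime as the property to remove are all hypothetical and are not taken from the actual test suite; the point is only that the mutation is applied to the source CreateAction that qualifies as the workflow run, before any inference has happened.

```python
RDF_TYPE = "rdf:type"


def find_workflow_run_action(triples, root="./"):
    """Return the CreateAction nodes whose instrument is the root's mainEntity."""
    main_entities = {o for s, p, o in triples
                     if s == root and p == "schema:mainEntity"}
    create_actions = {s for s, p, o in triples
                      if p == RDF_TYPE and o == "schema:CreateAction"}
    return {s for s, p, o in triples
            if p == "schema:instrument" and s in create_actions and o in main_entities}


def remove_property(triples, subject, predicate):
    """Return a copy of the graph without any (subject, predicate, *) triples."""
    return {(s, p, o) for s, p, o in triples
            if not (s == subject and p == predicate)}


# Source graph as it would be loaded from the crate: no WorkflowRun type yet.
crate = {
    ("./", "schema:mainEntity", "#workflow"),
    ("#run", RDF_TYPE, "schema:CreateAction"),
    ("#run", "schema:instrument", "#workflow"),
    ("#run", "schema:endTime", "2026-03-20T11:00:00Z"),
    ("#sign-off", RDF_TYPE, "schema:CreateAction"),
    ("#sign-off", "schema:instrument", "#other-tool"),
    ("#sign-off", "schema:endTime", "2026-03-20T12:00:00Z"),
}

# Mutate only the CreateAction that is effectively the workflow run.
(target,) = find_workflow_run_action(crate)
mutated = remove_property(crate, target, "schema:endTime")
```

The other CreateAction keeps its properties, so a validation failure after the mutation can be attributed to the workflow run alone.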

A separate consistency note:

  • In prefixes.ttl, the RO-Crate namespace is exposed in SPARQL via rocrate (without the dash), while elsewhere in the codebase we also use ro-crate.
    This did not originate in this PR, but it is something we may want to clean up in future to align prefix syntax across the codebase.

@EttoreM EttoreM self-assigned this Mar 20, 2026
@EttoreM EttoreM requested review from douglowe and elichad March 20, 2026 11:08
@douglowe douglowe requested a review from alexhambley March 24, 2026 15:40

@douglowe douglowe left a comment

I've had a read of Eli's original issue, and I think I understand now what we're doing (I didn't before - which I think may have led me to think we were doing something different here).

Many of the changes look sensible to me, and should do what we want. I've a couple of changes I'd request though.

  1. Should we use five-safes-crate:WorkflowRun instead of ro-crate:WorkflowRun?
  2. We should be able to transform the RDF graph once, and then use the ro-crate:WorkflowRun tag in all our tests after that. Can you create a 0_workflowrun.ttl file in the 'must' folder, which contains only this transformation - perhaps that would work?
  3. Should we be mentioning WorkflowRun in the message strings which tests return? If this is an internal tag then will this confuse users?


@alexhambley alexhambley left a comment

Happy with this I think, just a bit concerned about the maintainability of the duplicated code block across the rules. :)

EttoreM commented Mar 25, 2026

@alexhambley @douglowe

OK, I managed to centralise the WorkflowRun inference used across five-safes-crate into a single hidden SHACL rule in 0_workflow_run_inference.ttl, and removed the duplicated FindWorkflowRunAction blocks from the individual requirement files.

Also made the dependency on the RO-Crate root explicit: FindRootDataEntity in 2_root_data_entity_metadata.ttl now has sh:order 0, while FindWorkflowRunAction has sh:order 1 and targets ro-crate:RootDataEntity rather than re-deriving the root from schema:Dataset. This ensures the root is inferred first, then the WorkflowRun, before the other validation shapes run.
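
Why the ordering matters can be shown with a small plain-Python model. It assumes that rules run in ascending sh:order and that triples inferred by an earlier rule are visible to later ones. How FindRootDataEntity actually locates the root is not shown in this thread, so the sketch assumes it follows schema:about from the metadata descriptor; the function bodies and node identifiers are hypothetical.

```python
RDF_TYPE = "rdf:type"


def find_root_data_entity(triples):
    # Assumption: the root is whatever the metadata descriptor is "about".
    return {(o, RDF_TYPE, "ro-crate:RootDataEntity") for s, p, o in triples
            if s == "ro-crate-metadata.json" and p == "schema:about"}


def find_workflow_run_action(triples):
    # Depends on the root type having been inferred already.
    roots = {s for s, p, o in triples
             if p == RDF_TYPE and o == "ro-crate:RootDataEntity"}
    main_entities = {o for s, p, o in triples
                     if p == "schema:mainEntity" and s in roots}
    return {(s, RDF_TYPE, "ro-crate:WorkflowRun") for s, p, o in triples
            if p == "schema:instrument" and o in main_entities
            and (s, RDF_TYPE, "schema:CreateAction") in triples}


def apply_rules(triples, rules):
    """Run (order, rule) pairs in ascending order, feeding inferred triples forward."""
    graph = set(triples)
    for _, rule in sorted(rules, key=lambda pair: pair[0]):
        graph |= rule(graph)
    return graph


crate = {
    ("ro-crate-metadata.json", "schema:about", "./"),
    ("./", "schema:mainEntity", "#workflow"),
    ("#run", RDF_TYPE, "schema:CreateAction"),
    ("#run", "schema:instrument", "#workflow"),
}

# Orders as described above: root first (0), then WorkflowRun (1).
ordered = apply_rules(crate, [(1, find_workflow_run_action), (0, find_root_data_entity)])
# Swapping the orders means the root type is not there yet when it is needed.
swapped = apply_rules(crate, [(0, find_workflow_run_action), (1, find_root_data_entity)])
```

With the orders swapped, the WorkflowRun type is never inferred, which is the failure mode the explicit sh:order values guard against.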

@EttoreM EttoreM requested review from alexhambley and douglowe March 25, 2026 16:40

@douglowe douglowe left a comment

Thanks for merging the definition into one file - that looks good.

Can you review the messages that rules return, as well as the tests you've removed? I think we should make sure that the messages that the end user gets mention CreateAction not WorkflowRun.

@douglowe

@elichad & @alexhambley - I've finished off the last couple of message strings for Ettore, as he's on leave now. If you're okay with the PR now I'll merge it.

Development

Successfully merging this pull request may close these issues.

Run CreateAction checks only against the CreateAction(s) that the profile cares about

4 participants