Validate CustomResourceConfig shape + GLT-backend compatibility by kmontemayor2-sc · Pull Request #627 · Snapchat/GiGL

kmontemayor2-sc · 2026-05-06T23:08:33Z

Adds two validators wired into the existing pre-run validation chain:

1. A bypass in `_validate_machine_config` (via the trainer / inferencer resource-config-valid checks) so a `CustomResourceConfig` trainer / inferencer skips machine-shape checks, since the new oneof arm carries no machine spec to validate.
2. `check_custom_resource_config_shape`, which raises if a populated `CustomResourceConfig` has an empty `command`. Pure shape check, no subprocess spawn.
3. `check_custom_resource_config_requires_glt_backend`, an unconditional gate that pairs `CustomResourceConfig` with `feature_flags.should_run_glt_backend`. The v1 dispatchers do not route through `launch_custom`; this catches the misconfig at validate time.

Both new gates run unconditionally inside `kfp_validation_checks` for both Trainer and Inferencer components.
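The two gates described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `CustomResourceConfigStub` is a hypothetical stand-in for the real proto message, and the function signatures, the `ValueError` exception type, and passing `should_run_glt_backend` as a plain boolean are guesses, not the PR's actual code. Only the function names and the `command` / `args` fields come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CustomResourceConfigStub:
    """Hypothetical stand-in for the CustomResourceConfig proto message."""

    command: str = ""
    args: List[str] = field(default_factory=list)


def check_custom_resource_config_shape(
    config: Optional[CustomResourceConfigStub],
) -> None:
    """Raise if a populated config has an empty command. Spawns no subprocess."""
    if config is None:
        return  # the oneof arm is not set, so there is nothing to check
    if not config.command:
        raise ValueError("CustomResourceConfig.command must be non-empty")


def check_custom_resource_config_requires_glt_backend(
    config: Optional[CustomResourceConfigStub],
    should_run_glt_backend: bool,  # assumed to mirror feature_flags.should_run_glt_backend
) -> None:
    """Raise if a custom config is set while the GLT backend flag is off."""
    if config is not None and not should_run_glt_backend:
        raise ValueError(
            "CustomResourceConfig requires feature_flags.should_run_glt_backend"
        )
```

Both checks are pure functions of the config, which is what lets them run at validate time before anything is launched.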

Introduces a new `oneof` arm on `TrainerResourceConfig` / `InferencerResourceConfig` that lets callers describe a launcher as a shell command + positional args, instead of a fixed-shape Vertex AI / KFP / local resource config. The proto carries no semantics here — the dispatcher is added in a follow-up PR; this commit only ships the message, regenerated bindings, and the wrapper-property update so downstream code can read `wrapper.trainer_config` and get a `CustomResourceConfig` back. The diff includes a long tail of cosmetic Scala changes outside `gigl_resource_config/` because scalapbc regenerates every sibling proto's emitted source whenever any one proto in the same directory changes. Reviewers can scope to `CustomResourceConfig.scala` and the `*ResourceConfig.scala` siblings that gain the new oneof case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Implements `launch_custom`, a thin shim that takes a populated `CustomResourceConfig` and shells out via `subprocess.run(shell=True, check=True)`. The proto's `command` is a shell snippet (so leading `KEY=VALUE` env assignments parse naturally) and `args[]` are individually `shlex.quote`-d before joining, so values containing whitespace survive the shell pass. The dispatcher performs no template substitution: `command` and `args[]` are taken verbatim, and any placeholder text reaches `subprocess.run` literally. Consumers that want runtime-context substitution (e.g. ${gigl:foo}) should resolve it at YAML-load time before the proto reaches this module. No call site in the rest of the repo invokes `launch_custom` yet — wiring is added in a follow-up PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
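To make the quoting behaviour concrete, here is a minimal sketch of how the shell line is assembled, following the join shown in the reviewed diff. The helper name `build_shell_line` and the sample command and arguments are hypothetical; the sketch only builds the string and does not run it.

```python
import shlex
from typing import Sequence


def build_shell_line(command: str, args: Sequence[str]) -> str:
    """Join a verbatim shell snippet with individually quoted args."""
    return " ".join([command, *(shlex.quote(a) for a in args)])


shell_line = build_shell_line(
    "FOO=bar python -m my.module",  # hypothetical snippet with a leading env assignment
    ["--name", "hello world", "${gigl:foo}"],  # placeholder text is not substituted
)
print(shell_line)
# FOO=bar python -m my.module --name 'hello world' '${gigl:foo}'

# Splitting with shell rules recovers each arg as a single token.
print(shlex.split(shell_line)[-2:])
# ['hello world', '${gigl:foo}']
```

Note that `command` is not quoted at all, so it is interpreted by the shell as written, while each entry of `args[]` reaches the launched program as one literal argument.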

Adds two validators wired into the existing pre-run validation chain: 1. A bypass in `_validate_machine_config` (via the trainer / inferencer resource-config-valid checks) so a `CustomResourceConfig` trainer / inferencer skips machine-shape checks — the new oneof arm carries no machine spec to validate. 2. `check_custom_resource_config_shape`, which raises if a populated `CustomResourceConfig` has an empty `command`. Pure shape check, no subprocess spawn. 3. `check_custom_resource_config_requires_glt_backend`, an unconditional gate that pairs `CustomResourceConfig` with `feature_flags.should_run_glt_backend`. The v1 dispatchers do not route through `launch_custom`; this catches the misconfig at validate time. Both new gates run unconditionally inside `kfp_validation_checks` for both Trainer and Inferencer components. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

kmontemayor2-sc · 2026-05-06T23:08:39Z

/all_test

github-actions · 2026-05-06T23:08:49Z

GiGL Automation

@ 23:08:49UTC : 🔄 Lint Test started.

@ 23:13:51UTC : ❌ Workflow failed.
Please check the logs for more details.

github-actions · 2026-05-06T23:08:50Z

GiGL Automation

@ 23:08:49UTC : 🔄 C++ Unit Test started.

@ 23:11:21UTC : ✅ Workflow completed successfully.

github-actions · 2026-05-06T23:08:54Z

GiGL Automation

@ 23:08:54UTC : 🔄 E2E Test started.

@ 00:34:40UTC : ✅ Workflow completed successfully.

github-actions · 2026-05-06T23:08:54Z

GiGL Automation

@ 23:08:54UTC : 🔄 Python Unit Test started.

@ 00:20:28UTC : ✅ Workflow completed successfully.

github-actions · 2026-05-06T23:08:55Z

GiGL Automation

@ 23:08:54UTC : 🔄 Scala Unit Test started.

@ 23:16:55UTC : ✅ Workflow completed successfully.

github-actions · 2026-05-06T23:08:58Z

GiGL Automation

@ 23:08:58UTC : 🔄 Integration Test started.

@ 00:28:47UTC : ✅ Workflow completed successfully.

semgrep-code-snapchat · 2026-05-06T23:11:16Z

+
+    shell_line = " ".join([command, *(shlex.quote(a) for a in args)])
+    logger.info(f"Launching {component.name} via subprocess: {shell_line!r}")
+    subprocess.run(shell_line, shell=True, check=True)


Semgrep identified an issue in your code:

The subprocess.run() call uses shell=True on user-controlled command strings, allowing shell injection attacks that could execute arbitrary commands with the process's privileges.

More details about this

The subprocess.run() call executes shell_line with shell=True, which spawns a shell process to interpret the command. This is dangerous because if resolved_command or any element in resolved_args comes from untrusted input (e.g., user-provided configuration, external data), an attacker can inject arbitrary shell commands.

For example, if an attacker controls custom_resource_config.command and sets it to id; rm -rf /, the shell will execute both id and the destructive rm -rf / command. Even though shlex.quote() escapes individual arguments, it doesn't protect against injection in the command itself—only in the args list. An attacker who controls the command field can bypass this protection entirely.

Exploit scenario:

Attacker provides custom_resource_config.command = "echo test; cat /etc/passwd #"

After joining with args, shell_line becomes something like "echo test; cat /etc/passwd #"

When subprocess.run() executes this with shell=True, the shell interprets the semicolon as a command separator

Both echo test and cat /etc/passwd execute, leaking sensitive system files

If the process runs with elevated privileges, the attacker can exfiltrate or modify sensitive data

The shell inherits environment variables and settings from the parent process, which further expands the attack surface. Using shell=False would treat the entire string as a single command name rather than allowing shell metacharacters to be interpreted.

To resolve this comment:

💡 Follow autofix suggestion

Suggested change

-    subprocess.run(shell_line, shell=True, check=True)
+    subprocess.run(shell_line, shell=False, check=True)

View step-by-step instructions

Replace subprocess.run(shell_line, shell=True, check=True) with an invocation that does not use shell=True.

Change the code to directly pass the command and arguments as a list, rather than joining into a string. Use: subprocess.run([resolved_command] + resolved_args, check=True).

Remove any uses of shlex.quote when building the argument list, since quoting is only necessary when passing a string to the shell.

Ensure that resolved_command and every item in resolved_args are unquoted strings representing the command and its arguments.

Passing the command and arguments as a list with shell=False (the default) is safer, because it avoids any interpretation by a shell, preventing command injection vulnerabilities.
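The list-based invocation from the step-by-step instructions could be sketched as follows. This is an illustration, not the PR's code: `run_without_shell` and its `extra_env` parameter are hypothetical. Two caveats apply. First, with an argv list `command` must be a single program name, so a shell snippet with leading `KEY=VALUE` assignments would no longer work and the environment would have to be passed through `env=` instead. Second, simply flipping `shell=False` while still passing the joined `shell_line` string would, on POSIX, try to execute the whole string as one program name.

```python
import os
import subprocess
import sys
from typing import Mapping, Optional, Sequence


def run_without_shell(
    program: str,
    args: Sequence[str],
    extra_env: Optional[Mapping[str, str]] = None,
) -> subprocess.CompletedProcess:
    """Run program with args as an argv list; no shell interprets the values."""
    env = None
    if extra_env is not None:
        # Replaces the KEY=VALUE prefix a shell snippet would otherwise carry.
        env = {**os.environ, **extra_env}
    return subprocess.run(
        [program, *args], check=True, env=env, capture_output=True, text=True
    )


# A value full of shell metacharacters arrives as one literal argument.
result = run_without_shell(
    sys.executable,
    ["-c", "import sys; print(sys.argv[1])", "id; echo injected"],
)
print(result.stdout.strip())
# id; echo injected
```

The trade-off is that this form gives up the shell features the PR relies on (env assignments, pipes, redirects inside `command`) in exchange for removing shell interpretation entirely.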

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

/fp <comment> for false positive

/ar <comment> for acceptable risk

/other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by subprocess-shell-true.

You can view more details about this finding in the Semgrep AppSec Platform.

kmontemayor and others added 3 commits May 6, 2026 22:56

semgrep-code-snapchat Bot reviewed May 6, 2026

kmontemayor2-sc wants to merge 3 commits into main from kmonte/custom-resource-config-pr3-validation
