Skip to content

spec: configurable word delimiters for word-deletion shortcuts (#10348)#10459

Open
lonexreb wants to merge 4 commits intowarpdotdev:masterfrom
lonexreb:spec/10348-configurable-word-delimiters
Open

spec: configurable word delimiters for word-deletion shortcuts (#10348)#10459
lonexreb wants to merge 4 commits intowarpdotdev:masterfrom
lonexreb:spec/10348-configurable-word-delimiters

Conversation

@lonexreb
Copy link
Copy Markdown
Contributor

@lonexreb lonexreb commented May 8, 2026

Summary

Spec for #10348. Makes the set of characters that act as word boundaries configurable for all four word-cursor shortcuts: Delete Word Left, Delete Word Right, Move Cursor Word Left, Move Cursor Word Right. Today these shortcuts treat path-like strings (e.g. /var/www/example.com/logs) as a single word; with a configurable delimiter set including /, ., -, _, :, =, the shortcuts split per segment, matching VS Code, JetBrains, and Sublime conventions.

What's specified

  • New editor.word_delimiters setting (string of literal characters) with a sensible default: "/.-_:= \t\n\r\x0b\x0c".
  • Per-input-context overrides: terminal.word_delimiters, agent.word_delimiters (both fall back to editor.word_delimiters, then to the built-in default).
  • Whitespace is always a delimiter regardless of configuration (B5) — prevents broken configs.
  • Delimiter run collapse semantics matching modern editors (B7).
  • Acceptance criteria (A1–A9), test plan (T1–T10), implementation pointers, and a V1.5 deferral for regex-based boundaries.

This spec affects 4 shortcuts (Delete Word Left/Right, Move Cursor Word Left/Right) so they share one boundary definition rather than diverging.

Test plan

Spec-only PR. Tests T1–T10 will be implemented alongside the code change.

  • T1: default behavior on path string
  • T2: custom delimiter set
  • T3: context override precedence
  • T4: empty setting fallback
  • T5: whitespace forced as delimiter
  • T6: delimiter run collapse on foo//bar
  • T7: word-navigation parity with deletion
  • T8: live effect without restart
  • T9: setting persists across restart
  • T10: UTF-8 multi-byte safety

/label ready-to-spec

@cla-bot cla-bot Bot added the cla-signed label May 8, 2026
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm starting a first review of this spec-only pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@github-actions github-actions Bot added the external-contributor Indicates that a PR has been opened by someone outside the Warp team. label May 8, 2026
Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This spec defines configurable word delimiter behavior for word deletion and word navigation across terminal, agent, and editor input contexts. The scope is appropriate for the linked feature and does not introduce a meaningful high-level security surface.

Concerns

  • The core deletion behavior contract contradicts the later examples and acceptance criteria, so implementers could build incompatible behavior for delimiter runs.
  • The delimiter setting representation is underspecified for escaped whitespace in config and the single-line settings UI.
  • The settings schema guidance conflicts with the fallback behavior for missing, empty, and whitespace-only values.
  • The test plan should explicitly cover Delete Word Right, not only left deletion plus navigation parity.

Verdict

Found: 1 critical, 2 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH10348/SPEC.md Outdated

### B2. Word-deletion semantics

For Delete Word Left and Delete Word Right, characters in the delimiter set are word boundaries. Delete Word Left removes characters from the cursor leftward until (and including) the next delimiter run, OR start-of-line, whichever comes first. Delete Word Right is the mirror: rightward until a delimiter run or end-of-line.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 [CRITICAL] B2 contradicts B7/A1/A2: until (and including) the next delimiter run would make Delete Word Left from foo//bar| remove //bar, but B7 requires removing bar first and the delimiter run on the next press. Rewrite the deletion contract to distinguish deleting a non-delimiter run from deleting an adjacent delimiter run.

Comment thread specs/GH10348/SPEC.md Outdated
A new setting `editor.word_delimiters` (string of characters) defines the global word-boundary set. Default value:

```
"/.-_:= \t\n\r\x0b\x0c"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The default is shown with escape sequences while the setting is defined as a literal character list and UI is a single-line input. Specify whether \t, \n, etc. are decoded escape sequences, raw backslash characters, or display-only notation, and how users enter/reset whitespace delimiters.

Comment thread specs/GH10348/SPEC.md Outdated
- **Resolution helper.** Add `resolve_word_delimiters(context: InputContext) -> String` in the same module, implementing the B4 fallback chain. `InputContext` enum: `Terminal`, `Agent`, `Editor`.
- **Key handler dispatch.** Update `app/src/editor/key_handler.rs` so the four shortcut handlers (Delete Word Left, Delete Word Right, Move Cursor Word Left, Move Cursor Word Right) call into `WordBoundaryClassifier` instead of the existing whitespace-only check.
- **Whitespace floor.** Inside `WordBoundaryClassifier`, OR the user's set with the fixed whitespace set so B5 holds without callsite branching.
- **Settings schema.** Add the three keys to `app/src/settings/editor.rs` (and matching terminal/agent settings modules). All three are optional `Option<String>` except `editor.word_delimiters`, which has the literal default.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This conflicts with B4/B6: editor.word_delimiters is described as defaulted here, but the behavior contract still requires missing/empty/whitespace-only editor values to fall back to the built-in default. Specify the loaded type and normalization path for each key so implementations do not diverge.

Comment thread specs/GH10348/SPEC.md Outdated

## Tests

- **T1.** Default behavior on `/var/www/example.com`: Delete Word Left from end pops segments per A1.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] Add explicit tests for Delete Word Right, including delimiter-run collapse and default path behavior. The feature covers both directions, but T1/T2/T6 only assert left deletion and T7 only covers navigation parity.

@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 8, 2026

Round-1 fixes for oz-for-oss review: (1) CRITICAL fixed — wrote ONE canonical Delete Word Left/Right algorithm (B2) with worked example for /var/www/example.com| and rewrote all examples + acceptance criteria + tests to match (no more contradictions); (2) added Delimiter Setting Representation subsection (B6) covering TOML escape sequences, Settings UI text-field with Show-Escapes toggle, validation rules; (3) resolved fallback semantics — empty string = missing (default); whitespace-only = invalid (default + warning); whitespace always forced as delimiter regardless; per-context override invalid → falls through to editor (B6.5); (4) Tests now cover Delete Word Right (T8a-d) and Move Cursor Word Right (T7a), plus fallback behaviors (T9a-g).

@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 8, 2026

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

Reviewed the spec-only PR for configurable word delimiters across deletion and navigation shortcuts.

Concerns

  • The Delete Word Right acceptance criteria contradict the canonical algorithm and would lead to conflicting implementation/tests.
  • The settings documentation has invalid TOML escape guidance and conflicting UI behavior for empty values.
  • The left-delete test sequence omits a required intermediate state.

Verdict

Found: 1 critical, 3 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH10348/SPEC.md Outdated
All criteria below match the canonical algorithm in B2.

- **A1.** Delete Word Left on `/var/www/example.com|` (cursor at end) with the default `D` produces, on successive presses: `/var/www/example.|` → `/var/www/|` → `/var/|` → `/|` → `|`. (5 presses to fully clear; matches B2 worked example.)
- **A2.** Delete Word Right is the mirror. On `|/var/www/example.com` (cursor at start) with the default `D`, successive presses produce: `|/var/www/example.com` (Rule 3 — char right=`/` ∈ D — consume `/` then `var`) → `|/www/example.com` → `|/example.com` → `|.com` → `|com` → `|`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 [CRITICAL] A2 contradicts B2's Delete Word Right rule: starting before / should consume /var, and |.com should go directly to |, not |com. Resolve the canonical behavior and update A2/T8a/T7a to match.

Comment thread specs/GH10348/SPEC.md Outdated

| Key | Type | Default | Notes |
| -------------------------- | -------- | ---------------------------- | ------------------------------------------------------ |
| `editor.word_delimiters` | `string` | `"/.-_:= \t\n\r\x0b\x0c"` | Global default. See B6 for value-string semantics, escape sequences, validation, and missing/empty/whitespace-only/invalid handling. Whitespace floor (B5) always applies. |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The documented default uses \x0b/\x0c, but B6.2 says users configure a TOML basic string and TOML has no \xNN escape. Document a TOML-safe representation, such as \u000B/\f, and make the examples/tests use the same syntax.

Comment thread specs/GH10348/SPEC.md Outdated
| `terminal.word_delimiters` | `string` | unset (falls back to editor) | Override for terminal input. Per B4 + B6.5, missing/empty/invalid falls through to `editor.word_delimiters` (NOT directly to the built-in default). |
| `agent.word_delimiters` | `string` | unset (falls back to editor) | Override for agent prompt input. Same fallthrough as terminal. |

UI placement: **Settings → Editor → "Word Boundary Characters"**. Single-line text input with a "Reset to default" button and a "Show escapes" toggle (per B6.3). Validation per B6.4 — disallowed control characters trigger an inline error and disable save. Whitespace-only or empty inline error per B6.5. Below the input, a live preview line:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This says empty values show an inline error, but B6.5 defines "" as equivalent to missing with an empty field and default placeholder. Make this line say only whitespace-only/invalid values error, while empty clears to default.

Comment thread specs/GH10348/SPEC.md Outdated

### Delete Word Left

- **T1.** Default behavior on `/var/www/example.com|`: 5 successive Delete Word Left presses produce `/var/www/example.|`, `/var/www/|`, `/var/|`, `/|`, `|` (matches A1).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] T1 says five presses and claims to match A1, but the expected sequence omits the /| state before the final |. Add /| so the test covers the delimiter-only final step.

@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 8, 2026

Round-2 fixes: (1) CRITICAL resolved — every Delete Word Right acceptance criterion now matches the canonical mirror algorithm (cursor-mid-non-delim consumes to end-of-run; otherwise consumes delim+non-delim); two worked examples added; (2) TOML escape guidance corrected — only standard escapes (\b\t\n\f\r"\ and \uXXXX/\UXXXXXXXX) — non-standard \xXX removed; default value now uses / ; (3) empty value behavior reconciled — empty string in TOML or UI -> fall back to default; UI shows default in field with reset indicator; no error on empty; (4) left-delete test sequence now walks all 6 states for /var/www/example.com|.

@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 8, 2026

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This spec defines configurable word delimiters across deletion and navigation shortcuts, including context overrides, fallback behavior, UI expectations, and tests.

Concerns

  • The TOML escape guidance for VT/FF needs to be unambiguous and copy-pasteable before implementation.
  • The acceptance criteria currently flatten whitespace-only fallback to the built-in default, contradicting the per-context fallback rule.
  • The implementation pointers make editor.word_delimiters a literal-default schema value, which conflicts with the spec's absent/empty-as-default behavior and UI state.

Verdict

Found: 0 critical, 3 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH10348/SPEC.md Outdated

**Non-standard `\xXX` is NOT a valid TOML escape** and will fail TOML parsing if used. Earlier drafts of this spec referenced `\x0b` and `\x0c` for vertical tab and form feed; those references were incorrect. The valid encodings are:

- Vertical tab (U+000B) — encode as ` `. (TOML has no single-character escape for VT.)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The VT/FF encoding examples must be visible, valid TOML escapes rather than rendered control characters; spell them out as \u000B and \f/\u000C so the default can be implemented and documented consistently.

Comment thread specs/GH10348/SPEC.md Outdated
- **A3.** With default set, on `/path/to/file|` the first Delete Word Left removes only `file`, leaving `/path/to/|`. (Rule 3: char left=`e` ∉ D, consume just the non-delimiter run `file`.)
- **A4.** Per-context override applies only in that input context; other contexts use the editor default or their own override (per B4).
- **A5.** Missing/empty setting value falls back to the built-in default from B1 (per B6.5).
- **A6.** Whitespace-only setting value is rejected as invalid; runtime falls back to the built-in default and logs a warning; Settings UI shows inline error (per B6.5).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This conflicts with B6.5 for context overrides, where invalid terminal.word_delimiters/agent.word_delimiters fall through to editor.word_delimiters instead of directly to the built-in default; scope A6 to the editor setting or split the acceptance criterion by context.

Comment thread specs/GH10348/SPEC.md Outdated
- **Resolution helper.** Add `resolve_word_delimiters(context: InputContext) -> String` in the same module, implementing the B4 fallback chain. `InputContext` enum: `Terminal`, `Agent`, `Editor`.
- **Key handler dispatch.** Update `app/src/editor/key_handler.rs` so the four shortcut handlers (Delete Word Left, Delete Word Right, Move Cursor Word Left, Move Cursor Word Right) call into `WordBoundaryClassifier` instead of the existing whitespace-only check.
- **Whitespace floor.** Inside `WordBoundaryClassifier`, OR the user's set with the fixed whitespace set so B5 holds without callsite branching.
- **Settings schema.** Add the three keys to `app/src/settings/editor.rs` (and matching terminal/agent settings modules). All three are optional `Option<String>` except `editor.word_delimiters`, which has the literal default.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Giving editor.word_delimiters a schema-level literal default conflicts with B6.5's absent/empty behavior and the UI requirement that the stored value remains empty/absent; specify an optional stored value with the B1 default applied only by the resolver.

@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 8, 2026

Addressed all 3 round-N concerns:

1. TOML escape for VT/FF is now unambiguous and copy-pasteable. Replaced the placeholder backticks with explicit text: VT (U+000B) is encoded as the 4-hex-digit Unicode escape \u000B (TOML has no single-character VT escape, so this is the only valid form); FF (U+000C) is encoded as \f (preferred) or \u000C. Added a copy-pasteable TOML literal with the full whitespace floor: editor.word_delimiters = "/.-_:= \t\n\r\u000B\f" and noted that the simpler "/.-_:= \t\n\r" also works because B5's whitespace floor adds VT/FF back at runtime.

2. A6 no longer flattens whitespace-only fallback to the built-in default. Split A6 into A6.a (global editor.word_delimiters whitespace-only → built-in default per B1) and A6.b (per-context terminal. / agent. whitespace-only → fall THROUGH to the resolved editor.word_delimiters, NOT directly to the B1 default). Aligns A6 with the B6.5 per-context fallthrough rule and with T9g.

3. Implementation pointers no longer make editor.word_delimiters a literal schema default. Reworded the Settings schema bullet: all three keys are Option<String> with schema default None. The B1 character set is applied at RESOLUTION TIME by resolve_word_delimiters(InputContext::Editor), not at schema-load time. Documented why a literal schema default would break the B6.5 contract that 'absent' and 'empty string' are indistinguishable, and would also corrupt the Settings UI hint/'Reset to default' indicator state. The B1 character set lives as a const DEFAULT_WORD_DELIMITERS next to WordBoundaryClassifier, not in the settings schema.

Commit: 6b439d5

@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 8, 2026

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

The spec defines configurable word delimiters for deletion/navigation shortcuts, settings resolution, UI behavior, and test coverage. The overall structure covers the feature, but several internal contradictions would send implementation and tests in different directions.

Concerns

  • The B1 TOML example still uses invalid \x escapes despite later sections declaring them invalid and providing a corrected form.
  • The missing/empty fallback rule is stated as direct built-in-default fallback, which conflicts with the per-context fallback chain to editor.word_delimiters.
  • The Move Cursor Word Right test contradicts the canonical Delete Word Right parity rule and would validate the wrong cursor stops.
  • The Settings UI text-field behavior should clarify how literal newlines work in a single-line input.

Verdict

Found: 1 critical, 2 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH10348/SPEC.md
Expressed as a valid TOML basic string (see B6.2 for the full escape rules — note that **`\xXX` is NOT a valid TOML escape**; vertical tab and form feed must be encoded with the Unicode `\u00XX` form):

```toml
editor.word_delimiters = "/.-_:= \t\n\r "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This example still uses \x0b/\x0c while the surrounding prose says those escapes are invalid TOML; replace it with the B6.2 form (\u000B\f) or remove this duplicate snippet.

Comment thread specs/GH10348/SPEC.md

The Settings text field accepts characters either literally (e.g., typing `/`) or as escape sequences (e.g., typing `\t`). Two display modes:

- **Literal mode (default).** Whitespace characters render as themselves. Tabs and newlines may be hard to see but are still present in the value.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] A single-line field cannot literally render or hold newline characters as visible line breaks; specify that newlines are entered/displayed via escapes in the UI while stored as codepoints, or define multiline behavior.

Comment thread specs/GH10348/SPEC.md

**Empty value behavior (single source of truth — no error on empty):**

The "missing" and "empty string" cases are treated **identically** as "use the built-in default from B1". This applies whether the value is absent from the TOML config OR the user clears the Settings UI text field. Specifically:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This global fallback rule conflicts with the per-context override rule below: missing/empty terminal or agent values should fall through to editor.word_delimiters, not directly to B1. Scope this sentence to editor.word_delimiters or restate the per-context behavior here.

Comment thread specs/GH10348/SPEC.md
### Move Cursor Word Left / Right

- **T7.** Move Cursor Word Left / Right land at the same boundaries Delete Word Left / Right would operate on, for identical input and setting (per A9 / B3). Verify on the A1 and A2 examples.
- **T7a.** Move Cursor Word Right on `foo|/bar/baz` with default set lands at `foo/|bar/baz` after the first press, `foo/bar/|baz` after the second, `foo/bar/baz|` after the third.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 [CRITICAL] T7a contradicts B3/A9: using the Delete Word Right parity rules from a cursor before /, the first move should land after bar (foo/bar|/baz), not after only the delimiter (foo/|bar/baz). Update these expected states or explicitly define navigation differently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed external-contributor Indicates that a PR has been opened by someone outside the Warp team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant