fix(mdxish): properly parse JSX comments in editor deserialization by maximilianfalco · Pull Request #1419 · readmeio/markdown

maximilianfalco · 2026-04-01T00:28:27Z

🎫 Resolve RM-15676

🎯 What does this PR do?

Warning

This PR should not affect rendering whatsoever. The mdxish() rendering pipeline still strips JSX comments at the string level via removeJSXComments before the tokenizer ever runs, so rendered output is byte-identical.

JSX comments ({/* ... */}) wrapping magic blocks (or any multiline content with blank lines) were being split into separate AST nodes during editor deserialization, causing commented-out content to render as if it were active.

escapeProblematicBraces in preprocessing escaped { to \{ whenever an expression spanned a blank line, which includes JSX comments because they are essentially expressions
Without the mdxExpression flow construct (removed to prevent breaking magic blocks), the parser had no way to recognize {/* ... */} as a single expression

Note

Technically, introducing the flow construct of the mdxExpression would partly fix this issue but we cannot do this because it would break magic block parsing. magic blocks usually contain { \n \n } JSON payloads, and the flow construct would eat them before the magic block tokenizer got a chance to run.

created a dedicated jsxComment micromark tokenizer for JSX comments. enabled for all branches instead of being gated by newEditorTypes.

Note

In the future as a follow up, we can work in integrating this new tokenizer and fully removing the removeJSXComments preprocessing step. similar to what we do in the stripComments

🧪 QA tips

Try pasting this in the editor (both rich and raw):

{/* Dropbox hosted. What was it?

[block:image]
{
  "images": [
    {
      "image": [
        "https://paper-attachments.dropboxusercontent.com/s_924615C008BE616D2ABE30441190A9401D739971D9AD94E8B5B46A426C5F05E8_1683123013834_Dev+Dash+-+hero+docs+shot.png",
        null,
        "The My Developers page in your ReadMe project dashboard"
      ],
      "align": "center",
      "border": true,
      "caption": "The My Developers page in your ReadMe project dashboard"
    }
  ]
}
[/block]

*/}

It should be treated as one big comment block instead of a rendered image.

📸 Screenshot or Loom

Screen.Recording.2026-04-02.at.01.32.19.mov

eaglethrost · 2026-04-01T07:04:32Z

Question, If the opening comment section is indented, should this tokenizer still work? Seems like it does with MDX, example:

{/* [block:image] { "images": [{"image": ["https://example.com/img.png", null, "Alt"]}] } [/block] */} Hi there!

yes it should work! are you seeing something different?

added a test case for this here 05e3622

okay this looks to be broken in the editor but it looks like its a monorepo issue not an engine issue

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

…o they dont diverge

maximilianfalco · 2026-04-10T06:46:43Z

  return mdastV6(doc, { rdmd });
 };
+
+export const parseMdxishMdast = (md: string, opts: MdxishOpts = {}): MdastRoot => {


might be useful for other test files so hoisted this up in the general helpers file

maximilianfalco · 2026-04-10T06:48:18Z

the ideal path is that sometime in the future, we can revisit this work and remove the removeJsxComment preprocessing so we dont have to keep this file and no longer worry about diverging regex rules. but this should act as a bandaid to keep them in check for the mean time

happy to revisit in the future to unify those two

kevinports · 2026-04-13T15:39:54Z

Ok I barely comprehend how what's happening in this PR fits into the greater pipeline. But why are we emitting custom namesjsxComment, jsxCommentLineEnding, jsxCommentMarkerEnd, etc. that you have defined on L7-L15?

Couldn't this tokenizer just emit mdxFlowExpression and mdxFlowExpressionChunk types? I think those are the expected types that'd normally come from micromark-extension-mdx-expression if we weren't excluding detection of flow constructs here:

markdown/lib/mdxish.ts

Lines 152 to 157 in f20d1d0

// Get mdxExpression extension and remove its flow construct to prevent

// `{...}` from interrupting paragraphs (which breaks multiline magic blocks)

const mdxExprExt = mdxExpression({ allowEmpty: true });

const mdxExprTextOnly: Extension = {

text: mdxExprExt.text,

};

That'd altogether remove the need to process these special types downstream with lib/mdast-util/jsx-comment/index.ts and clean this PR quite a bit I think.

Another question is have you considered what it'd look like to refactor the magic block tokenizer to be robust enough to coexist with the full mdxExpression extension from micromark-extension-mdx-expression instead of limiting it to non-flow constructs? I just wonder if we're gonna keep running into issues with multi-line expressions.

Couldn't this tokenizer just emit mdxFlowExpression and mdxFlowExpressionChunk types? I think those are the expected types that'd normally come from micromark-extension-mdx-expression if we weren't excluding detection of flow constructs here:

yea i think i was overthinking this, thanks for catching this. going to simplify it!

simplified in 4177147

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

maximilianfalco · 2026-04-14T02:24:54Z

Another question is have you considered what it'd look like to refactor the magic block tokenizer to be robust enough to coexist with the full mdxExpression extension from micromark-extension-mdx-expression instead of limiting it to non-flow constructs? I just wonder if we're gonna keep running into issues with multi-line expressions.

@kevinports i did try to consider this wayyy back when i introduced the magic block tokenizer but the main issue was that if say we have this:

[block:image]

{
  "images": [{"image": ["https://files.readme.io/6f52e22-man-eating-pizza-and-making-an-ok-gesture.jpg", null, "Alt"]}]
}

[/block]

this is technically valid magic block and it is rendered atm, but the main issue is that there is a gap between the [block] markers and the actual data (the one encased in {}). the issue is that since there is a new line, the opening and closing {} are considered standalone and hence being consumed by the flowExpression tokenizer when it should be consumed by the magic block tokenizer instead

ill keep trying to make it more robust atm but it might be worth experimenting on that in a follow up PR

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

## Version 14.0.0 ### ⚠ BREAKING CHANGES * **mdxish:** magic blocks), the parser had no way to recognize `{/* ... */}` as a single expression > [!NOTE] > Technically, introducing the flow construct of the `mdxExpression` would partly fix this issue but we cannot do this because it would break magic block parsing. magic blocks usually contain `{ \n \n }` JSON payloads, and the flow construct would eat them before the magic block tokenizer got a chance to run. created a dedicated `jsxComment` micromark tokenizer for JSX comments. enabled for all branches instead of being gated by `newEditorTypes`. > [!NOTE] In the future as a follow up, we can work in integrating this new tokenizer and fully removing the `removeJSXComments` preprocessing step. similar to what we do in the `stripComments` ## 🧪 QA tips Try pasting this in the editor (both rich and raw): ``` {/* Dropbox hosted. What was it? [block:image] { "images": [ { "image": [ "https://paper-attachments.dropboxusercontent.com/s_924615C008BE616D2ABE30441190A9401D739971D9AD94E8B5B46A426C5F05E8_1683123013834_Dev+Dash+-+hero+docs+shot.png", null, "The My Developers page in your ReadMe project dashboard" ], "align": "center", "border": true, "caption": "The My Developers page in your ReadMe project dashboard" } ] } [/block] */} ``` It should be treated as one big comment block instead of a rendered image. ## 📸 Screenshot or Loom https://github.com/user-attachments/assets/9b619f54-3d6e-456f-83be-27d4076a7a3a ### ✨ New & Improved * **mdxish:** add micromark tokenizer for MDX component parsing ([#1426](#1426)) ([a77707d](a77707d)) * move numbered lists for 1 -> a -> i ([#1438](#1438)) ([7e614d4](7e614d4)) ### 🛠 Fixes & Updates * **mdxish:** properly parse JSX comments in editor deserialization ([#1419](#1419)) ([95e1969](95e1969))

rafegoldberg · 2026-04-16T13:50:39Z

This PR was released!

🚀 Changes included in v14.0.0

maximilianfalco added 3 commits April 1, 2026 11:26

feat: add jsx comment micromark tokenizer

cac5ab9

fix: preserve jsx comment braces in preprocessing

e6cd6bc

bundle watch

17e96ce

maximilianfalco changed the title ~~fix(mdxish): properly parse jsx comments in editor deserialization~~ fix(mdxish): properly parse JSX comments in editor deserialization Apr 1, 2026

Jadenzzz reviewed Apr 1, 2026

View reviewed changes

Comment thread package.json Outdated

maximilianfalco requested a review from eaglethrost April 1, 2026 01:20

eaglethrost reviewed Apr 1, 2026

View reviewed changes

Comment thread processor/transform/mdxish/preprocess-jsx-expressions.ts Outdated

eaglethrost reviewed Apr 1, 2026

View reviewed changes

maximilianfalco added 3 commits April 2, 2026 00:54

fix(mdxish): add bounds check and flatten preserveJsxComments param

c6d4450

refactor(mdxish): deduplicate micromark extension array construction

eeb5cb2

chore: add test case

05e3622

maximilianfalco requested review from kevinports and rafegoldberg April 1, 2026 14:20

Merge branch 'next' of https://github.com/readmeio/markdown into falco/…

47929d0

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

kevinports reviewed Apr 9, 2026

View reviewed changes

Comment thread lib/mdxish.ts Outdated

maximilianfalco added 2 commits April 10, 2026 16:35

refactor: remove the gate behind the jsxComment tokenizer

3861176

refactor: unify the regex syntax for tokenizer and removeJsxComment s…

9619c05

…o they dont diverge

maximilianfalco commented Apr 10, 2026

View reviewed changes

maximilianfalco added 2 commits April 10, 2026 17:04

fix: only exempt pure JSX comments

8363b67

chore: add more edge case tests

92c68dd

maximilianfalco requested a review from kevinports April 11, 2026 15:56

kevinports reviewed Apr 13, 2026

View reviewed changes

maximilianfalco added 4 commits April 14, 2026 11:30

Merge branch 'next' of https://github.com/readmeio/markdown into falco/…

87665cc

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

Merge branch 'next' of https://github.com/readmeio/markdown into falco/…

b26c9bc

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

chore: remove duplicate helper functions

29373d0

refactor: emit standard mdxFlowExpression tokens

4177147

maximilianfalco requested a review from kevinports April 14, 2026 02:25

kevinports approved these changes Apr 14, 2026

View reviewed changes

Comment thread __tests__/transformers/preprocess-jsx-expressions.test.ts

Comment thread __tests__/helpers.ts

maximilianfalco added 3 commits April 15, 2026 10:22

Merge branch 'next' of https://github.com/readmeio/markdown into falco/…

42236fe

…rm-15676-does-not-deserialize-magic-block-within-jsx-comment

chore: restore jsdocs and changed assertion to toBeNull

d808dad

restore using toBeUndefined

669c7c9

maximilianfalco merged commit 95e1969 into next Apr 15, 2026
10 of 11 checks passed

maximilianfalco deleted the falco/rm-15676-does-not-deserialize-magic-block-within-jsx-comment branch April 15, 2026 01:15

rafegoldberg added the released label Apr 16, 2026

	// Get mdxExpression extension and remove its flow construct to prevent
	// `{...}` from interrupting paragraphs (which breaks multiline magic blocks)
	const mdxExprExt = mdxExpression({ allowEmpty: true });
	const mdxExprTextOnly: Extension = {
	text: mdxExprExt.text,
	};

Conversation

maximilianfalco commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🎯 What does this PR do?

🧪 QA tips

📸 Screenshot or Loom

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kevinports Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

maximilianfalco commented Apr 14, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rafegoldberg commented Apr 16, 2026

This PR was released!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

maximilianfalco commented Apr 1, 2026 •

edited

Loading

kevinports Apr 13, 2026 •

edited

Loading