[code-infra] Add HTML validation to broken links checker #1241

Merged
Janpot merged 15 commits into master from add-html-validate-to-broken-links-checker
Apr 15, 2026
Member

@Janpot Janpot commented Mar 24, 2026

Summary

  • Adds optional HTML validation to the broken links checker via a new `htmlValidate` option (`true`, `false`, or a custom config object)
  • Ships a `mui:recommended` preset that extends `html-validate:standard`, `html-validate:document`, and `html-validate:browser`
  • Moves the entire per-URL crawl pipeline (fetch, parse, link/target extraction, HTML validation) into a dedicated `crawlWorker.mjs` worker thread, with one worker per URL
  • Main thread now only handles queue management, deduplication, and post-crawl analysis
  • Adds typed `CrawlWorkerInput`/`CrawlWorkerOutput` contracts between parent and worker
  • Unifies all issues (broken links, broken targets, HTML validation) into a single `issues` array on `CrawlResult` using a discriminated union (`BrokenLinkIssue | HtmlValidateIssue`), removing the separate `htmlValidateResults` field
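
To illustrate the last point, here is a minimal sketch of how a consumer might handle a unified `issues` array. Only the names `CrawlResult`, `issues`, `BrokenLinkIssue`, and `HtmlValidateIssue` come from this PR; the `type` discriminant and every other field are assumptions made for the example and may not match the real definitions.

```typescript
// Hypothetical field names: only the type names and `issues` appear in the PR.
interface BrokenLinkIssue {
  type: 'broken-link';
  sourceUrl: string;
  href: string;
}

interface HtmlValidateIssue {
  type: 'html-validate';
  sourceUrl: string;
  ruleId: string;
  message: string;
}

type Issue = BrokenLinkIssue | HtmlValidateIssue;

interface CrawlResult {
  issues: Issue[];
}

// Narrowing on the discriminant lets one loop handle every issue kind.
function formatIssue(issue: Issue): string {
  switch (issue.type) {
    case 'broken-link':
      return `${issue.sourceUrl}: broken link to ${issue.href}`;
    case 'html-validate':
      return `${issue.sourceUrl}: ${issue.message} (${issue.ruleId})`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new kind is added.
      const unreachable: never = issue;
      throw new Error(`Unknown issue: ${JSON.stringify(unreachable)}`);
    }
  }
}

const result: CrawlResult = {
  issues: [
    { type: 'broken-link', sourceUrl: '/a', href: '/missing' },
    { type: 'html-validate', sourceUrl: '/a', ruleId: 'no-dup-id', message: 'Duplicate ID "x"' },
  ],
};

const report: string[] = result.issues.map(formatIssue);
console.log(report.join('\n'));
```

With a single array, consumers no longer need to merge two result fields before reporting.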

Janpot added a commit to Janpot/material-ui that referenced this pull request Mar 24, 2026
Update @mui/internal-code-infra to preview version from
mui/mui-public#1241 and enable htmlValidate: true.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Janpot Janpot force-pushed the add-html-validate-to-broken-links-checker branch from 4638e80 to 8d7f40e Compare March 25, 2026 10:30
@Janpot Janpot changed the title [code-infra] Add optional HTML validation to broken links checker [code-infra] Add HTML validation to broken links checker Mar 25, 2026
Janpot added a commit to Janpot/material-ui that referenced this pull request Mar 30, 2026
Update @mui/internal-code-infra to preview version from
mui/mui-public#1241 and enable htmlValidate: true.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Janpot and others added 10 commits March 30, 2026 16:41
Add an `htmlValidate` option to the crawl config that validates HTML
content of crawled pages using the html-validate library. The option
accepts `true` (uses recommended rules), or a config object supporting
`extends: ['mui:recommended']` for the default preset. Config is always
static (never loaded from disk). Reports are printed per page.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
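
As a rough sketch of the option handling described in this commit, the `true` / `false` / object forms could be normalized as below. The function name, the `HtmlValidateConfig` shape, and treating an omitted option as disabled are assumptions for illustration; the preset contents are taken from the PR summary.

```typescript
// Assumed, simplified config shape; the real html-validate config has more fields.
interface HtmlValidateConfig {
  extends?: string[];
  rules?: Record<string, 'off' | 'warn' | 'error'>;
}

type HtmlValidateOption = boolean | HtmlValidateConfig;

// Contents of the `mui:recommended` preset as listed in the PR summary.
const MUI_RECOMMENDED: HtmlValidateConfig = {
  extends: ['html-validate:standard', 'html-validate:document', 'html-validate:browser'],
};

// Hypothetical helper: returns null when validation is disabled.
function resolveHtmlValidateOption(
  option: HtmlValidateOption | undefined,
): HtmlValidateConfig | null {
  if (option === undefined || option === false) {
    return null; // assumption: validation is off unless explicitly enabled
  }
  if (option === true) {
    return { extends: ['mui:recommended'] }; // `true` selects the default preset
  }
  return option; // custom config objects pass through unchanged
}

console.log(resolveHtmlValidateOption(true), MUI_RECOMMENDED.extends);
```

The static object is returned as-is, which matches the note that config is never loaded from disk.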
Replace manual string replacement of mui:recommended in extends with
html-validate's staticResolver API, which properly registers the preset
so html-validate's own config resolution handles it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move fetch, parse, link/target extraction, and HTML validation into a
single crawlWorker per URL. The main thread now only handles queue
management, deduplication, and post-crawl analysis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
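
The division of labour in this commit can be sketched as follows. This is not the PR's implementation: the per-URL work is injected as a plain async function (`crawlPage`, a hypothetical stand-in for the worker), URLs are processed level by level instead of through a worker pool, and all names are made up for the example. It only shows the main thread's share: queueing and deduplication.

```typescript
interface PageResult {
  url: string;
  links: string[];
}

// Stand-in for the per-URL worker: fetch, parse, and extract links for one page.
type CrawlPage = (url: string) => Promise<PageResult>;

async function crawlSite(
  startUrl: string,
  crawlPage: CrawlPage,
): Promise<Map<string, PageResult>> {
  const seen = new Set<string>([startUrl]); // deduplication: each URL is queued once
  const results = new Map<string, PageResult>();
  let frontier: string[] = [startUrl];

  while (frontier.length > 0) {
    // Each URL in the current frontier gets its own unit of work.
    const pages = await Promise.all(frontier.map((url) => crawlPage(url)));
    const next: string[] = [];
    for (const page of pages) {
      results.set(page.url, page);
      for (const link of page.links) {
        if (!seen.has(link)) {
          seen.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return results;
}

// Tiny in-memory "site" with a cycle, to exercise deduplication.
const fakeSite: Record<string, string[]> = {
  '/a': ['/b', '/c'],
  '/b': ['/a', '/c'],
  '/c': [],
};
const crawled: string[] = [];
const fakeCrawlPage: CrawlPage = async (url) => {
  crawled.push(url);
  return { url, links: fakeSite[url] ?? [] };
};
```

Calling `crawlSite('/a', fakeCrawlPage)` visits each of the three pages once, even though `/a` and `/c` are linked from more than one place.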
Merge broken link and HTML validation issues into a single `issues` array
on CrawlResult, using a discriminated union (`BrokenLinkIssue | HtmlValidateIssue`).
This removes the separate `htmlValidateResults` field and simplifies consumers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zannager zannager added the scope: code-infra Involves the code-infra product (https://www.notion.so/mui-org/5562c14178aa42af97bc1fa5114000cd). label Apr 6, 2026
@Janpot Janpot marked this pull request as ready for review April 8, 2026 14:24
@Janpot Janpot requested a review from a team April 13, 2026 11:21
code-infra-dashboard Bot commented Apr 15, 2026

Deploy preview

https://deploy-preview-1241--mui-internal.netlify.app/

Bundle size

| Bundle | Parsed size | Gzip size |
|---|---|---|
| @base-ui/react | 0B (0.00%) | 0B (0.00%) |
| @mui/x-charts-pro | 0B (0.00%) | 0B (0.00%) |

Performance

Total duration: 24.40 ms 🔺+8.27 ms(+51.2%) | Renders: 4 (+0) | Paint: 109.92 ms 🔺+40.94 ms(+59.3%)

| Test | Duration | Renders |
|---|---|---|
| DataGrid mount with paint timing | 3.31 ms 🔺+1.40 ms (+73.2%) | 1 (+0) |
| HeavyList mount | 16.85 ms 🔺+6.91 ms (+69.4%) | 1 (+0) |
| Counter click | 4.24 ms ▼-0.04 ms (-0.9%) | 2 (+0) |

Check out the code infra dashboard for more information about this PR.

@mui mui deleted a comment from mui-bot Apr 15, 2026
Janpot added a commit to Janpot/material-ui that referenced this pull request Apr 15, 2026
Update @mui/internal-code-infra to preview version from
mui/mui-public#1241 and enable htmlValidate: true.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Janpot Janpot merged commit 78ca9de into master Apr 15, 2026
13 checks passed
@Janpot Janpot deleted the add-html-validate-to-broken-links-checker branch April 15, 2026 13:15
@oliviertassinari oliviertassinari added the type: new feature Expand the scope of the product to solve a new problem. label Apr 16, 2026
Member

@oliviertassinari oliviertassinari left a comment

Cool stuff, e.g. would catch mui/mui-x#12285 early.

I will be curious to see how https://html-validate.org/ performs compared to https://validator.w3.org/.

Member Author

Janpot commented Apr 16, 2026

> I will be curious to see how https://html-validate.org/ performs compared to https://validator.w3.org/.

Yeah, it just integrates very easily, but it has sharp edges. The maintainer is very responsive though. We can always swap it out for something else or drop it altogether.
