diff --git a/.npmignore b/.npmignore index dd3a92ca5..771f7db12 100644 --- a/.npmignore +++ b/.npmignore @@ -26,8 +26,11 @@ example # Exclude all TypeScript source files *.ts +*.mts !*.d.ts +!*.d.mts *test*.d.ts +*test*.d.mts # Exclude native fuzzer sources diff --git a/README.md b/README.md index 47879108d..85873303e 100644 --- a/README.md +++ b/README.md @@ -113,9 +113,13 @@ module.exports.fuzz = function (data /*: Buffer */) { }; ``` +ES modules are supported on Node.js >= 20.6 — use `export function fuzz` in a +`.js`/`.mjs` file with `"type": "module"` in your `package.json`. See +[docs/fuzz-targets.md](docs/fuzz-targets.md#esm-support) for details. + ## Documentation -Further documentation is available at [docs/readme.md](docs/README.md). +Further documentation is available at [docs/README.md](docs/README.md). ### Demo Video - Introduction to Jazzer.js diff --git a/docs/architecture.md b/docs/architecture.md index 7d57dcce3..7f0e95746 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -25,14 +25,16 @@ The _fuzzing library_ is the main entry point and is used to start fuzzing. It is invoked with the name of a _fuzz target_ module and possible fuzzer options. As a first step it loads a [native plugin](https://nodejs.org/api/addons.html#wrapping-c-objects) Node.js -addon, _fuzzer plugin_, to interact with libFuzzer and registers a `require` -hook, _interceptor_, to instrument subsequently loaded code. - -The _interceptor_ transforms code using [Babel](https://babeljs.io/) to provide -feedback to the fuzzer in multiple ways. It extends the loaded code to gather -coverage statistics so that the fuzzer can detect when new code paths are -reached. And it also intercepts comparison functions, like `==` or `!=`, to -detect the parts of the provided input that should be mutated and in what way. +addon, _fuzzer plugin_, to interact with libFuzzer and registers instrumentation +hooks to transform subsequently loaded code. 
+ +For CJS modules, a `require` hook (via `istanbul-lib-hook`) intercepts loading. +For ES modules (Node >= 20.6), a loader hook registered via `module.register()` +intercepts `import` statements in a dedicated loader thread. + +Both paths use [Babel](https://babeljs.io/) to transform source code, inserting +coverage counters so the fuzzer can detect new code paths, and comparison hooks +(for `==`, `!=`, etc.) so the fuzzer can mutate input toward specific values. Available feedback methods are defined in libFuzzer's [hook interface definitions](https://github.com/llvm/llvm-project/blob/main/compiler-rt/include/sanitizer/common_interface_defs.h). @@ -70,14 +72,14 @@ information to detect progress and comparison hints to improve mutations. These feedback mechanisms have to be added to the application code without user intervention. -**Decision**: Use the `istanbul-lib-hook` library to hook into dynamic code -loading and `babel` plugins to extend the loaded code with the required feedback -functionality. Instrumenting the code at such a high level leads to an -independence of the underlying JavaScript engine. Furthermore, it is easily -possible to decide if a loaded module should be instrumented or not. +**Decision**: Use `istanbul-lib-hook` to hook into CJS loading and +`module.register()` to hook into ESM loading, with Babel plugins to insert +feedback instrumentation. Instrumenting at the source level decouples from the +underlying JavaScript engine and allows fine-grained control over which modules +are instrumented. **Consequences**: Independence of JavaScript engine, fine-grained -instrumentation control +instrumentation control, ESM support on Node >= 20.6. ## Visualization diff --git a/docs/bug-detectors.md b/docs/bug-detectors.md index 9d0b270e3..4c9d5c9e0 100644 --- a/docs/bug-detectors.md +++ b/docs/bug-detectors.md @@ -91,7 +91,7 @@ prototype, it will be able also find a way to modify other properties of the prototype that are not functions. 
If you find a use case where this assumption does not hold, feel free to open an issue. -_Disable with:_ `--disableBugDetectors=prototype-pollution`in CLI mode; or when +_Disable with:_ `--disableBugDetectors=prototype-pollution` in CLI mode; or when using Jest in `.jazzerjsrc.json`: ```json @@ -104,7 +104,7 @@ Hooks the `eval` and `Function` functions and reports a finding if the fuzzer was able to pass a special string to `eval` and to the function body of `Function`. -_Disable with:_ `--disable_bug_detectors=remote-code-execution`in CLI mode; or +_Disable with:_ `--disableBugDetectors=remote-code-execution` in CLI mode; or when using Jest in `.jazzerjsrc.json`: ```json @@ -127,8 +127,8 @@ getBugDetectorConfiguration("ssrf") .addPermittedUDPConnection("localhost", 9090); ``` -_Disable with:_ `--disable_bug_detectors=ssrf` in CLI mode; or when using Jest -in `.jazzerjsrc.json`: +_Disable with:_ `--disableBugDetectors=ssrf` in CLI mode; or when using Jest in +`.jazzerjsrc.json`: ```json { "disableBugDetectors": ["ssrf"] } diff --git a/docs/fuzz-settings.md b/docs/fuzz-settings.md index b36bca69f..43f3489a3 100644 --- a/docs/fuzz-settings.md +++ b/docs/fuzz-settings.md @@ -297,7 +297,7 @@ configured for Jest. add the following environment variable to the command: ```bash -JAZZER_COVERAGE='["json","lcov"]' npx jazzer my-fuzz-file --coverage +JAZZER_COVERAGE_REPORTERS='["json","lcov"]' npx jazzer my-fuzz-file --coverage ``` _Note:_ Setting this environmental variable in Jest mode has no effect. 
@@ -366,11 +366,11 @@ provide them as an array of strings, `Uint8Arrays`, or `Int8Arrays` to the `dictionaryEntries` option of the `it.fuzz` function: ```javascript -const xmlDictionary = ["IDREF"," {...}, - {dictionaryEntries: xmlDictionary} + {dictionaryEntries: xmlDictionary}); ``` ### `disableBugDetectors` : [array\] @@ -814,8 +814,7 @@ In _regression_ mode on command line, Jazzer.js runs each input from the seed and regression corpus directories on the fuzz target once, and then stops. Under the hood, this option adds `-runs=0` to the option [`fuzzerOptions`](#fuzzeroptions--arraystring). Setting the fuzzer option to -`-runs=0` (run each input only once) or `-runs=-1` (run each input indefinitely) -can be used to achieve the same behavior. +`-runs=0` (run each input only once) can be used to achieve the same behavior. **Jest:** Default: `"regression"`. @@ -954,7 +953,7 @@ example how a timeout of 1 second can be set for the test "My test 1": ```javascript it.fuzz("My test 1", (data) => {...}, - 1000 + 1000); ``` _Two:_ by providing it as part of an object with options as the third argument diff --git a/docs/fuzz-targets.md b/docs/fuzz-targets.md index 8c89218e3..dee120037 100644 --- a/docs/fuzz-targets.md +++ b/docs/fuzz-targets.md @@ -107,42 +107,97 @@ supported! However, it is possible to use the [Jest integration](jest-integration.md) to execute Jest fuzz tests written in TypeScript. -### ⚠️ Using Jazzer.js on pure ESM projects ⚠️ +### ESM support -ESM brings a couple of challenges to the table, which are currently not fully -solved. Jazzer.js does have general ESM support as in your project should be -loaded properly. If your project internally still relies on calls to -`require()`, all of these dependencies will be hooked. However, _pure_ -ECMAScript projects will currently not be instrumented! +Jazzer.js instruments ES modules via a +[Node.js loader hook](https://nodejs.org/api/module.html#customization-hooks) +(`module.register`). 
Coverage counters, compare hooks, and function hooks all +work on ESM code — the fuzzer sees the same feedback it gets from CJS modules. -The Jest integration can improve on this and use Jest's ESM features to properly -transform external code and dependencies. However, -[ESM support](https://jestjs.io/docs/ecmascript-modules) in Jest is also only -experimental. +**Requirements:** Node.js >= 20.6. Function hooks (bug detectors) additionally +require Node.js >= 20.11 (for `transferList` support in `module.register`). On +older Node versions, ESM loading still works but modules are not instrumented. -One such example that Jazzer.js can handle just fine can be found at -[examples/protobufjs/fuzz.js](../examples/protobufjs/protobufjs.fuzz.js): +#### Minimal ESM fuzz target ```js -import proto from "protobufjs"; -import { temporaryWriteSync } from "tempy"; - -describe("protobufjs", () => { - test.fuzz("loadSync", (data) => { - const file = temporaryWriteSync(data); - proto.loadSync(file); - }); -}); -``` +// fuzz.js (or fuzz.mjs) +import { parseInput } from "my-library"; -You also have to adapt your `package.json` accordingly, by adding: +export function fuzz(data) { + parseInput(data.toString()); +} +``` ```json { - "type": "module" + "type": "module", + "main": "fuzz.js", + "scripts": { + "fuzz": "jazzer fuzz -i my-library corpus" + }, + "devDependencies": { + "@jazzer.js/core": "^3.0.0" + } } ``` +```shell +npm run fuzz +``` + +The `-i` flag tells Jazzer.js which packages to instrument. Without it, +everything outside `node_modules` is instrumented by default. + +#### Direct ESM vs. 
Jest ESM — when to use which + +There are two ways to fuzz ESM code with Jazzer.js: + +| | Direct (`npx jazzer`) | Jest (`@jazzer.js/jest-runner`) | +| ----------------------------- | ------------------------------------------------------ | ------------------------------------------------------------------------ | +| **How ESM is instrumented** | Loader hook (`module.register`) — native ESM stays ESM | Babel transform via `jest.config` — ESM is converted to CJS at test time | +| **Node.js requirement** | >= 20.6 (function hooks: >= 20.11) | Any supported Node.js | +| **Fuzz target format** | `export function fuzz(data)` in a `.js`/`.mjs` file | `it.fuzz(name, fn)` inside a `.fuzz.cjs` test file | +| **Async targets** | Works out of the box (default mode) | Works out of the box | +| **Regression testing** | `--mode=regression` replays the corpus | Default Jest mode replays corpus seeds automatically | +| **IDE integration** | None (CLI only) | VS Code / IntelliJ run individual inputs | +| **Multiple targets per file** | No — one exported `fuzz` function per file | Yes — multiple `it.fuzz()` blocks in one file | +| **Corpus management** | Manual directory, passed as positional arg | Automatic per-test directories | + +**Rule of thumb:** use the Jest integration when you want multiple fuzz tests in +one file, IDE debugging, or need to support Node < 20.6. Use direct ESM when you +want a minimal setup with no Babel/Jest indirection — just your `.mjs` target +and `npx jazzer`. 
+ +#### Jest ESM setup (Babel transform approach) + +When fuzzing an ESM library through Jest, the fuzz tests themselves must be CJS +(`.fuzz.cjs`), and Jest's Babel transform converts the library's ESM imports to +`require()` calls so that Jazzer.js's CJS instrumentation hooks can intercept +them: + +```js +// jest.config.cjs +module.exports = { + projects: [ + { + testRunner: "@jazzer.js/jest-runner", + testMatch: ["/fuzz/**/*.fuzz.cjs"], + transform: { + "\\.js$": [ + "babel-jest", + { plugins: ["@babel/plugin-transform-modules-commonjs"] }, + ], + }, + transformIgnorePatterns: ["/node_modules/"], + }, + ], +}; +``` + +This requires `@babel/core`, `babel-jest`, and +`@babel/plugin-transform-modules-commonjs` as dev dependencies. + ## Running the fuzz target After adding `@jazzer.js/core` as a `dev-dependency` to a project, the fuzzer @@ -179,7 +234,7 @@ Here we list some of the most important parameters: | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `` | Import path to the fuzz target module. | | `[corpus...]` | Paths to the corpus directories. If not given, no initial seeds are used nor interesting inputs saved. | -| `-f`, `--fuzzFunction` | Name of the fuzz test entry point. It must be an exported function with a single [Buffer](https://nodejs.org/api/buffer.html) parameter. Default is `fuzz`. | +| `-f`, `--fuzzEntryPoint` | Name of the fuzz test entry point. It must be an exported function with a single [Buffer](https://nodejs.org/api/buffer.html) parameter. Default is `fuzz`. | | `-i`, `--includes` / `-e`, `--excludes` | Part of filepath names to include/exclude in the instrumentation. 
A tailing `/` should be used to include directories and prevent confusion with filenames. `*` can be used to include all files. Can be specified multiple times. Default will include everything outside the `node_modules` directory. If either of these flags are set the default value for the other is ignored. | | `--sync` | Enables synchronous fuzzing. **May only be used for entirely synchronous code**. | | `-h`, `--customHooks` | Filenames with custom hooks. Several hooks per file are possible. See further details in [docs/fuzz-settings.md](fuzz-settings.md#customhooks--arraystring). | @@ -194,14 +249,14 @@ In the following example, the `--coverage` flag is combined with the mode flag any fuzzing. ```shell -npx jazzer --mode=regression --corpus --coverage -- +npx jazzer --mode=regression --coverage -- ``` Alternatively, you can add a new script to your `package.json`: ```json "scripts": { - "coverage": "jazzer --mode=regression --includes=fileToInstrument --includes=anotherFileToInstrument --corpus --coverage -- " + "coverage": "jazzer --mode=regression -i fileToInstrument -i anotherFileToInstrument --coverage " } ``` @@ -219,6 +274,6 @@ This default directory can be changed by setting the flag ### Coverage reporters The desired report format can be set by the flag `--coverageReporters`, which by -default is set to `--coverageReporters clover json lcov text`. See +default is set to `--coverageReporters json text lcov clover`. See [here](https://github.com/istanbuljs/istanbuljs/tree/master/packages/istanbul-reports/lib) for a list of supported coverage reporters. diff --git a/docs/jest-integration.md b/docs/jest-integration.md index 52044c075..57c618e40 100644 --- a/docs/jest-integration.md +++ b/docs/jest-integration.md @@ -53,8 +53,8 @@ runner. 
"coverage": "jest --coverage" }, "devDependencies": { - "@jazzer.js/jest-runner": "2.1.0", - "jest": "29.3.1" + "@jazzer.js/jest-runner": "^3.0.0", + "jest": "^29.0.0" }, "jest": { "projects": [ @@ -145,6 +145,12 @@ enable source map generation in the TypeScript compiler options: These settings should be enough to start writing Jest fuzz tests in TypeScript. +With source maps enabled, libFuzzer outputs such as `-print_pcs` and +`-print_funcs` are also mapped back to original TypeScript files and line +numbers whenever mappings are available. If a location cannot be mapped (for +example, downlevel helper code), Jazzer.js falls back to the generated +JavaScript location. + **Note**: Using custom hooks written in TypeScript is currently not supported, as those are not pre-processed by Jest. @@ -188,13 +194,15 @@ const target = require("./target"); const { FuzzedDataProvider } = require("@jazzer.js/core"); describe("My describe", () => { - it.fuzz("My fuzz test", (data) => { - const provider = new FuzzedDataProvider(data); - target.fuzzMeMore( - provider.consumeNumber(), - provider.consumeBoolean(), - provider.consumeRemainingAsString()); - }); + it.fuzz("My fuzz test", (data) => { + const provider = new FuzzedDataProvider(data); + target.fuzzMeMore( + provider.consumeNumber(), + provider.consumeBoolean(), + provider.consumeRemainingAsString(), + ); + }); +}); ``` For more information on how to use the `FuzzedDataProvider` class, please refer @@ -211,14 +219,14 @@ possible to use `async/await`, `Promise` and `done callback` based tests. 
```javascript describe("My describe", () => { - it.fuzz("My callback fuzz test", (data, done) => { - target.callbackFuzzMe(data, done); - }); - - it.fuzz("My async fuzz test", async (data) => { - await target.asyncFuzzMe(data); - }); -)}; + it.fuzz("My callback fuzz test", (data, done) => { + target.callbackFuzzMe(data, done); + }); + + it.fuzz("My async fuzz test", async (data) => { + await target.asyncFuzzMe(data); + }); +}); ``` ### TypeScript Jest fuzz tests @@ -372,7 +380,7 @@ Additional options for coverage report generation are described in the [fuzz targets documentation](./fuzz-targets.md#coverage-report-generation). The desired report format can be set by the flag `--coverageReporters`, which by -default is set to `--coverageReporters clover json lcov text`. See +default is set to `--coverageReporters json text lcov clover`. See [here](https://github.com/istanbuljs/istanbuljs/tree/master/packages/istanbul-reports/lib) for a list of supported coverage reporters. diff --git a/docs/release.md b/docs/release.md index 03e4ef34b..570a9d6af 100644 --- a/docs/release.md +++ b/docs/release.md @@ -21,7 +21,7 @@ To release a new version of Jazzer.js follow the described process: - An automatic changelog, based on the included merge requests, is added to the prerelease description - The prerelease is listed on the - [release page](https://github.com/CodeIntelligenceTesting/jazzer.js-commercial/releases) + [release page](https://github.com/CodeIntelligenceTesting/jazzer.js/releases) 8. 
Release the prerelease in GitHub - Adjust the prerelease description to include the highlights of the release - If you find some problems with the prerelease and want to start over: diff --git a/eslint.config.mjs b/eslint.config.mjs index e57a5326e..ceeee0cc3 100644 --- a/eslint.config.mjs +++ b/eslint.config.mjs @@ -128,6 +128,7 @@ export default tseslint.config( // Markdown files { files: ["**/*.md"], + ignores: ["AGENTS.md"], plugins: { markdownlint, }, diff --git a/examples/jest_typescript_integration/README.md b/examples/jest_typescript_integration/README.md index f7ae0fb69..f1d2adc3c 100644 --- a/examples/jest_typescript_integration/README.md +++ b/examples/jest_typescript_integration/README.md @@ -1,4 +1,4 @@ -# Jest Typscript Integration Example +# Jest TypeScript Integration Example Detailed documentation on the Jest integration is available in the main [Jazzer.js](https://github.com/CodeIntelligenceTesting/jazzer.js/blob/main/docs/jest-integration.md) @@ -14,26 +14,28 @@ The example below shows how to configure the Jazzer.js Jest integration in combination with the normal Jest runner. 
```json - "jest": { - "projects": [ - { - "displayName": "Jest", - "preset": "ts-jest", - }, - { - "displayName": { - "name": "Jazzer.js", - "color": "cyan", - }, - "preset": "ts-jest", - "runner": "@jazzer.js/jest-runner", - "testEnvironment": "node", - "testMatch": ["/*.fuzz.[jt]s"], - }, - ], - "coveragePathIgnorePatterns": ["/node_modules/", "/dist/"], - "modulePathIgnorePatterns": ["/node_modules", "/dist/"], - } +{ + "jest": { + "projects": [ + { + "displayName": "Jest", + "preset": "ts-jest" + }, + { + "displayName": { + "name": "Jazzer.js", + "color": "cyan" + }, + "preset": "ts-jest", + "testRunner": "@jazzer.js/jest-runner", + "testEnvironment": "node", + "testMatch": ["/*.fuzz.[jt]s"] + } + ], + "coveragePathIgnorePatterns": ["/node_modules/", "/dist/"], + "modulePathIgnorePatterns": ["/node_modules", "/dist/"] + } +} ``` Further configuration can be specified in `.jazzerjsrc`, like in any other @@ -51,7 +53,7 @@ Write a Jest fuzz test like: ```typescript // file: jazzerjs.fuzz.ts -import "@jazzer.js/jest-runner/jest-extension"; +import "@jazzer.js/jest-runner"; describe("My describe", () => { it.fuzz("My fuzz test", (data: Buffer) => { target.fuzzMe(data); diff --git a/jest.config.js b/jest.config.js index dbb47871e..92fd7ad47 100644 --- a/jest.config.js +++ b/jest.config.js @@ -18,12 +18,22 @@ module.exports = { preset: "ts-jest", testEnvironment: "node", - modulePathIgnorePatterns: [ - "dist", - "packages/fuzzer/build", - "tests/code_coverage", - ], + modulePathIgnorePatterns: ["packages/fuzzer/build", "tests/code_coverage"], + testPathIgnorePatterns: ["/dist/", "/node_modules/"], testMatch: ["/packages/**/*.test.[jt]s"], collectCoverageFrom: ["packages/**/*.ts"], coveragePathIgnorePatterns: ["/node_modules/", "/dist/"], + transform: { + "^.+\\.tsx?$": [ + "ts-jest", + { + // ts-jest does not support composite project references. + // It compiles workspace .ts sources in one flat program, + // which breaks cross-package type resolution. 
Disabling + // diagnostics lets tsc -b (which does understand project + // refs) be the single source of truth for type checking. + diagnostics: false, + }, + ], + }, }; diff --git a/package-lock.json b/package-lock.json index 2d0ac1cd2..89879fbcd 100644 --- a/package-lock.json +++ b/package-lock.json @@ -37,7 +37,7 @@ "typescript-eslint": "^8.57.2" }, "engines": { - "node": ">= 20.0.0", + "node": ">= 20.*", "npm": ">= 7.0.0" } }, @@ -7469,7 +7469,7 @@ }, "devDependencies": {}, "engines": { - "node": ">= 20.0.0", + "node": ">= 14.0.0", "npm": ">= 7.0.0" } }, @@ -7485,17 +7485,17 @@ "istanbul-lib-coverage": "^3.2.2", "istanbul-lib-report": "^3.0.1", "istanbul-reports": "^3.1.7", - "tmp": "^0.2.3", - "yargs": "^18.0.0" + "tmp": "^0.2.5", + "yargs": "^17.7.2" }, "bin": { "jazzer": "dist/cli.js" }, "devDependencies": { - "@types/yargs": "^17.0.33" + "@types/yargs": "^17.0.35" }, "engines": { - "node": ">= 20.0.0", + "node": ">= 14.0.0", "npm": ">= 7.0.0" } }, @@ -7581,7 +7581,7 @@ "clang-format": "^1.8.0" }, "engines": { - "node": ">= 20.0.0", + "node": ">= 14.0.0", "npm": ">= 7.0.0" } }, @@ -7590,7 +7590,7 @@ "version": "3.1.0", "license": "Apache-2.0", "engines": { - "node": ">= 20.0.0", + "node": ">= 14.0.0", "npm": ">= 7.0.0" } }, @@ -7618,7 +7618,7 @@ "typescript": "^5.6.2" }, "engines": { - "node": ">= 20.0.0", + "node": ">= 14.0.0", "npm": ">= 7.0.0" } }, @@ -7646,7 +7646,7 @@ "tmp": "^0.2.3" }, "engines": { - "node": ">= 20.0.0", + "node": ">= 14.0.0", "npm": ">= 7.0.0" }, "peerDependencies": { diff --git a/package.json b/package.json index cc1893666..b377c9891 100644 --- a/package.json +++ b/package.json @@ -73,7 +73,7 @@ "**/!(compile_commands.json)*": "prettier --write --ignore-unknown --allow-empty --log-level debug" }, "engines": { - "node": ">= 20.0.0", + "node": ">= 20.*", "npm": ">= 7.0.0" } } diff --git a/packages/bug-detectors/README.md b/packages/bug-detectors/README.md index 064bdd1d8..f70922a6c 100644 --- a/packages/bug-detectors/README.md +++ 
b/packages/bug-detectors/README.md @@ -1,8 +1,8 @@ # @jazzer.js/bug-detectors The `@jazzer.js/bug-detectors` module is used by -[Jazzer.js](https://github.com/CodeIntelligenceTesting/jazzer.js-commercial#readme) -to detect and report bugs in JavaScript code. +[Jazzer.js](https://github.com/CodeIntelligenceTesting/jazzer.js#readme) to +detect and report bugs in JavaScript code. ## Install @@ -15,5 +15,5 @@ npm install --save-dev @jazzer.js/bug-detectors ## Documentation - Up-to-date - [information](https://github.com/CodeIntelligenceTesting/jazzer.js-commercial/blob/main/docs/fuzz-settings.md#bug-detectors) + [information](https://github.com/CodeIntelligenceTesting/jazzer.js/blob/main/docs/bug-detectors.md) about currently available bug detectors diff --git a/packages/core/README.md b/packages/core/README.md index 75fb83538..65f4ff05c 100644 --- a/packages/core/README.md +++ b/packages/core/README.md @@ -3,7 +3,7 @@ This is the main entry point and all most users have to install as a dev-dependency, so that they can fuzz their projects. -The `@jazzer.js/core` module provide a CLI interface via the `jazzer` command. +The `@jazzer.js/core` module provides a CLI interface via the `jazzer` command. It can be used by `npx` or node script command. To display a command documentation use the `--help` flag. @@ -25,5 +25,5 @@ npm install --save-dev @jazzer.js/core ## Documentation See -[Jazzer.js README](https://github.com/CodeIntelligenceTesting/jazzer.js-commercial#readme) +[Jazzer.js README](https://github.com/CodeIntelligenceTesting/jazzer.js#readme) for more information. diff --git a/packages/core/core.ts b/packages/core/core.ts index 7dc2bc99f..46b53b5c7 100644 --- a/packages/core/core.ts +++ b/packages/core/core.ts @@ -125,6 +125,12 @@ export async function initFuzzing( getJazzerJsGlobal("vmContext") ?? globalThis, ); + // Send the finalized hook definitions to the ESM loader thread + // so it can apply function-hook transforms to user modules. 
+ // This must happen after finalizeHooks (hooks are complete) and + // before loadFuzzFunction (user modules are imported). + instrumentor.sendHooksToLoader(); + return instrumentor; } @@ -369,20 +375,38 @@ export function asFindingAwareFuzzFn( try { callbacks.runBeforeEachCallbacks(); result = (originalFuzzFn as fuzzer.FuzzTargetAsyncOrValue)(data); - // Explicitly set promise handlers to process findings, but still return - // the fuzz target result directly, so that sync execution is still - // possible. if (isPromiseLike(result)) { - result = result.then( - (result) => { - callbacks.runAfterEachCallbacks(); - return throwIfError() ?? result; - }, - (reason) => { - callbacks.runAfterEachCallbacks(); - return throwIfError(reason); - }, - ); + // Check if a finding was already detected synchronously + // (e.g., a before-hook threw inside an async function body, + // which stores the finding and returns a rejected Promise). + // If so, handle it synchronously and do NOT attach .then() + // handlers, as that would cause BOTH the synchronous throw + // (caught by the C++ catch block) AND the .then() rejection + // handler to resolve the C++ promise and deferred -- which + // is undefined behavior (double set_value on std::promise, + // double napi_reject_deferred) that can hang forked child + // processes. + const syncFinding = clearFirstFinding(); + if (syncFinding) { + // Suppress the unhandled rejection from the abandoned + // rejected Promise returned by the async fuzz target. + result.catch(() => {}); + callbacks.runAfterEachCallbacks(); + fuzzTargetError = syncFinding; + } else { + // No synchronous finding -- let the async chain handle + // findings that occur during promise resolution. + result = result.then( + (result) => { + callbacks.runAfterEachCallbacks(); + return throwIfError() ?? 
result; + }, + (reason) => { + callbacks.runAfterEachCallbacks(); + return throwIfError(reason); + }, + ); + } } else { callbacks.runAfterEachCallbacks(); } diff --git a/packages/core/finding.test.ts b/packages/core/finding.test.ts index 18ca91313..7e2de3652 100644 --- a/packages/core/finding.test.ts +++ b/packages/core/finding.test.ts @@ -81,6 +81,53 @@ describe("Finding", () => { ); expect(lines[3]).toEqual(""); }); + + it("print error with inherited/stale stack (e.g. pdf.js BaseException)", () => { + const printer = mockPrinter(); + // Simulate pdf.js's BaseException pattern: .name and .message are own + // properties, but .stack is inherited from a prototype Error instance. + const proto = new Error(); + const error = Object.create(proto) as Error; + Object.defineProperty(error, "message", { + value: "Command token too long: 128", + }); + Object.defineProperty(error, "name", { + value: "UnknownErrorException", + }); + // error.stack is inherited from proto — stale "Error" header + + printFinding(error, printer); + + const output = printer.printed(); + expect(output).toContain("Uncaught Exception:"); + expect(output).toContain( + "UnknownErrorException: Command token too long: 128", + ); + expect(output).not.toMatch(/\nError[:\s]*\n/); + }); + + it("print error without stack shows name: message", () => { + const printer = mockPrinter(); + const error = { name: "CustomError", message: "something broke" }; + + printFinding(error as Error, printer); + + const output = printer.printed(); + expect(output).toContain("Uncaught Exception:"); + expect(output).toContain("CustomError: something broke"); + }); + + it("print duck-typed error without .name falls back to Error", () => { + const printer = mockPrinter(); + const error = { message: "oops" }; + + printFinding(error as Error, printer); + + const output = printer.printed(); + expect(output).toContain("Uncaught Exception:"); + expect(output).toContain("Error: oops"); + expect(output).not.toContain("undefined"); + }); 
}); function mockPrinter() { diff --git a/packages/core/finding.ts b/packages/core/finding.ts index 52c80244e..baff60f16 100644 --- a/packages/core/finding.ts +++ b/packages/core/finding.ts @@ -108,9 +108,17 @@ export function printFinding( if (isError(error)) { if (error.stack) { cleanErrorStack(error); - print(error.stack); + if (error instanceof Finding) { + print(error.stack); + } else { + print(repairStackHeader(error)); + } } else { - print(error.message); + if (error instanceof Finding) { + print(error.message); + } else { + print(`${error.name || "Error"}: ${error.message}`); + } } } else if (typeof error === "string" || error instanceof String) { print(error.toString()); @@ -167,6 +175,25 @@ export function cleanErrorStack(error: unknown): void { .join("\n"); } +/** + * Fix the first line of error.stack when it doesn't reflect the actual + * .name/.message. This happens with legacy constructor patterns (e.g. pdf.js + * BaseException) where .stack is inherited from a prototype Error and shows + * a stale header from construction time. + */ +function repairStackHeader(error: Error): string { + const stack = error.stack!; + const name = error.name || "Error"; + const expectedPrefix = error.message ? `${name}: ${error.message}` : name; + const firstNewline = stack.indexOf("\n"); + const firstLine = firstNewline === -1 ? stack : stack.slice(0, firstNewline); + if (firstLine === expectedPrefix) { + return stack; + } + const rest = firstNewline === -1 ? 
"" : stack.slice(firstNewline); + return expectedPrefix + rest; +} + export function errorName(error: unknown): string { if (error instanceof Error) { // error objects diff --git a/packages/core/package.json b/packages/core/package.json index a786c6183..a0f098174 100644 --- a/packages/core/package.json +++ b/packages/core/package.json @@ -23,14 +23,14 @@ "@jazzer.js/fuzzer": "3.1.0", "@jazzer.js/hooking": "3.1.0", "@jazzer.js/instrumentor": "3.1.0", - "tmp": "^0.2.3", "istanbul-lib-coverage": "^3.2.2", "istanbul-lib-report": "^3.0.1", "istanbul-reports": "^3.1.7", - "yargs": "^18.0.0" + "tmp": "^0.2.5", + "yargs": "^17.7.2" }, "devDependencies": { - "@types/yargs": "^17.0.33" + "@types/yargs": "^17.0.35" }, "engines": { "node": ">= 14.0.0", diff --git a/packages/fuzzer/README.md b/packages/fuzzer/README.md index 41d34bcb8..9333484c9 100644 --- a/packages/fuzzer/README.md +++ b/packages/fuzzer/README.md @@ -6,10 +6,11 @@ shared object from GitHub but falls back to compilation on the user's machine if there is no suitable binary. Loading the addon initializes libFuzzer and the sanitizer runtime. Users can -then start the fuzzer with the exported `startFuzzing` function; see -[the test](fuzzer.test.ts) for an example. For the time being, the fuzzer runs -on the main thread and therefore blocks Node's event loop; this is most likely -what users want, so that their JS fuzz target can run in its normal environment. +then start the fuzzer with the exported `startFuzzing` or `startFuzzingAsync` +functions; see [the test](fuzzer.test.ts) for an example. In sync mode +(`--sync`), the fuzzer runs on the main thread and blocks the event loop. In the +default async mode, libFuzzer runs on a separate native thread and communicates +with the JS event loop via a thread-safe function. 
## Development diff --git a/packages/fuzzer/addon.ts b/packages/fuzzer/addon.ts index fc939bc1c..b89495f18 100644 --- a/packages/fuzzer/addon.ts +++ b/packages/fuzzer/addon.ts @@ -40,6 +40,13 @@ export type StartFuzzingAsyncFn = ( type NativeAddon = { registerCoverageMap: (buffer: Buffer) => void; registerNewCounters: (oldNumCounters: number, newNumCounters: number) => void; + registerModuleCounters: (buffer: Buffer) => number; + registerPCLocations: ( + filename: string, + funcNames: string[], + entries: Int32Array, + pcBase: number, + ) => void; traceUnequalStrings: ( hookId: number, diff --git a/packages/fuzzer/coverage.ts b/packages/fuzzer/coverage.ts index 0f0cedded..c49e58109 100644 --- a/packages/fuzzer/coverage.ts +++ b/packages/fuzzer/coverage.ts @@ -22,6 +22,11 @@ export class CoverageTracker { private readonly coverageMap: Buffer; private currentNumCounters: number; + // Per-module counter buffers registered independently with libFuzzer. + // We must prevent GC from reclaiming these while libFuzzer still + // monitors the underlying memory. + private readonly moduleCounters: Buffer[] = []; + constructor() { this.coverageMap = Buffer.alloc(CoverageTracker.MAX_NUM_COUNTERS, 0); this.currentNumCounters = CoverageTracker.INITIAL_NUM_COUNTERS; @@ -65,6 +70,44 @@ export class CoverageTracker { readCounter(edgeId: number): number { return this.coverageMap.readUint8(edgeId); } + + /** + * Allocate an independent counter buffer for a single module and + * register it with libFuzzer as a new coverage region. This lets + * each ESM module own its own counters without sharing global IDs. + */ + /** + * Allocate an independent counter buffer for a single ES module and + * register it with libFuzzer as a new coverage region. + * + * Returns `{ counters, pcBase }` — the counter buffer for the module + * body and the base PC to pass to `registerPCLocations`. 
+ */ + createModuleCounters(size: number): { counters: Buffer; pcBase: number } { + const buf = Buffer.alloc(size, 0); + this.moduleCounters.push(buf); + const pcBase = addon.registerModuleCounters(buf); + return { counters: buf, pcBase }; + } + + /** + * Register edge-to-source mappings for PC symbolization. + * + * @param filename Source file path + * @param funcNames Deduplicated function name table + * @param entries Flat Int32Array: + * [edgeId, line, col, funcIdx, isFuncEntry, ...] + * @param pcBase For ESM: the pcBase from createModuleCounters. + * For CJS: pass 0 (edge IDs are already global PCs). + */ + registerPCLocations( + filename: string, + funcNames: string[], + entries: Int32Array, + pcBase: number, + ): void { + addon.registerPCLocations(filename, funcNames, entries, pcBase); + } } export const coverageTracker = new CoverageTracker(); diff --git a/packages/fuzzer/shared/callbacks.cpp b/packages/fuzzer/shared/callbacks.cpp index 65bd06c92..d0217ad8d 100644 --- a/packages/fuzzer/shared/callbacks.cpp +++ b/packages/fuzzer/shared/callbacks.cpp @@ -21,6 +21,10 @@ void RegisterCallbackExports(Napi::Env env, Napi::Object exports) { Napi::Function::New<RegisterCoverageMap>(env); exports["registerNewCounters"] = Napi::Function::New<RegisterNewCounters>(env); + exports["registerModuleCounters"] = + Napi::Function::New<RegisterModuleCounters>(env); + exports["registerPCLocations"] = + Napi::Function::New<RegisterPCLocations>(env); exports["traceUnequalStrings"] = Napi::Function::New<TraceUnequalStrings>(env); exports["traceStringContainment"] = diff --git a/packages/fuzzer/shared/coverage.cpp b/packages/fuzzer/shared/coverage.cpp index 137bb3671..678537fbf 100644 --- a/packages/fuzzer/shared/coverage.cpp +++ b/packages/fuzzer/shared/coverage.cpp @@ -13,7 +13,10 @@ // limitations under the License. #include "coverage.h" -#include +#include +#include +#include +#include extern "C" { void __sanitizer_cov_8bit_counters_init(uint8_t *start, uint8_t *end); @@ -25,15 +28,24 @@ namespace { // We register an array of 8-bit coverage counters with libFuzzer.
The array is // populated from JavaScript using Buffer. uint8_t *gCoverageCounters = nullptr; +size_t gCoverageCountersSize = 0; // PC-Table is used by libfuzzer to keep track of program addresses // corresponding to coverage counters. The flags determine whether the -// corresponding counter is the beginning of a function; we don't currently use -// it. +// corresponding counter is the beginning of a function. struct PCTableEntry { uintptr_t PC, PCFlags; }; +struct ModulePCTable { + uintptr_t basePC; + size_t numEntries; + PCTableEntry *entries; +}; + +std::vector<ModulePCTable> gModulePCTables; +std::unordered_map<uintptr_t, size_t> gModulePCTableIndex; + // The array of supplementary information for coverage counters. Each entry // corresponds to an entry in gCoverageCounters; since we don't know the actual // addresses of our counters in JS land, we fill this table with fake @@ -52,6 +64,7 @@ void RegisterCoverageMap(const Napi::CallbackInfo &info) { auto buf = info[0].As<Napi::Buffer<uint8_t>>(); gCoverageCounters = reinterpret_cast<uint8_t *>(buf.Data()); + gCoverageCountersSize = buf.Length(); // Fill the PC table with fake entries. The only requirement is that the fake // addresses must not collide with the locations of real counters (e.g., from // instrumented C++ code). Therefore, we just use the address of the counter @@ -89,3 +102,261 @@ void RegisterNewCounters(const Napi::CallbackInfo &info) { __sanitizer_cov_pcs_init((uintptr_t *)(gPCEntries + old_num_counters), (uintptr_t *)(gPCEntries + new_num_counters)); } + +// Monotonically increasing fake PC so that each module's counters get +// unique program-counter entries that don't collide with the shared +// coverage map or with each other. +static uintptr_t gNextModulePC = 0x10000000; + +// Register an independent coverage counter region for a single ES module. +// libFuzzer supports multiple disjoint counter regions; each call here +// hands it a fresh one. Returns the base PC assigned to this module +// so the caller can pass it to RegisterPCLocations.
+Napi::Value RegisterModuleCounters(const Napi::CallbackInfo &info) { + if (info.Length() != 1 || !info[0].IsBuffer()) { + throw Napi::Error::New(info.Env(), + "Need one argument: a Buffer of 8-bit counters"); + } + + auto buf = info[0].As<Napi::Buffer<uint8_t>>(); + auto size = buf.Length(); + if (size == 0) { + return Napi::Number::New(info.Env(), 0); + } + + auto basePC = gNextModulePC; + auto *pcEntries = new PCTableEntry[size]; + for (std::size_t i = 0; i < size; ++i) { + pcEntries[i] = {gNextModulePC++, 0}; + } + + __sanitizer_cov_8bit_counters_init(buf.Data(), buf.Data() + size); + __sanitizer_cov_pcs_init(reinterpret_cast<uintptr_t *>(pcEntries), + reinterpret_cast<uintptr_t *>(pcEntries + size)); + gModulePCTableIndex[basePC] = gModulePCTables.size(); + gModulePCTables.push_back({basePC, size, pcEntries}); + + return Napi::Number::New(info.Env(), static_cast<double>(basePC)); +} + +// ── PC-to-source symbolization ─────────────────────────────────── +// +// Thread safety +// ~~~~~~~~~~~~~ +// These data structures are written by RegisterPCLocations (called from +// the JS event-loop thread via N-API) and read by SymbolizePC (called by +// libFuzzer via __sanitizer_symbolize_pc). +// +// In sync mode both paths share the same thread, so there is no race. +// +// In async mode libFuzzer runs on a dedicated native thread.
There is +// no explicit lock protecting gStringTable / gCjsLocations / +// gEsmLocations, yet the access is still safe: +// +// JS thread libFuzzer thread +// ───────── ──────────────── +// CallJsFuzzCallback() FuzzCallbackAsync() +// fuzz target runs TSFN.BlockingCall(…) +// (may load modules → future.get() ← BLOCKS +// RegisterPCLocations writes) +// promise->set_value(…) ← unblocks +// returns to Fuzzer::RunOne +// TPC.UpdateObservedPCs() +// → PrintPC → SymbolizePC (reads) +// +// std::promise::set_value happens-before std::future::get returns +// (C++ [futures.state] §33.10.5), so every write made by the JS +// thread during a fuzzer iteration is visible to the native thread +// when it resumes and calls the symbolizer. +// +// This guarantee is implicit. It would break if module registration +// ever happened outside the synchronous scope of a TSFN callback +// (e.g. from a Node.js worker thread or a detached timer). +// +// Async-signal safety: libFuzzer installs signal handlers for SIGBUS, +// SIGABRT, etc., whose crash path calls PrintFinalStats → +// PrintCoverage → DescribePC → __sanitizer_symbolize_pc. DescribePC +// uses a try_to_lock mutex that returns "<can not symbolize>" on +// contention, but std::mutex::try_lock is itself not async-signal-safe. +// This is a pre-existing libFuzzer limitation, not specific to +// jazzer.js. jazzer.js overrides SIGINT and SIGSEGV with its own +// handlers that do not call the symbolizer. + +namespace { + +struct PCLocation { + uint32_t fileIdx; + uint32_t funcIdx; + uint32_t line; + uint32_t col; +}; + +// Deduplicated string table shared across all modules. The vector +// provides O(1) indexed access in SymbolizePC; the map provides O(1) +// amortized deduplication in internString. +std::vector<std::string> gStringTable; +std::unordered_map<std::string, uint32_t> gStringIndex; +// CJS location entries indexed directly by edge ID (PC = edge ID). +std::vector<PCLocation> gCjsLocations; +// ESM location entries indexed by (pc - ESM_BASE).
+std::vector gEsmLocations; +constexpr uintptr_t ESM_BASE = 0x10000000; + +uint32_t internString(const std::string &s) { + auto it = gStringIndex.find(s); + if (it != gStringIndex.end()) return it->second; + auto idx = static_cast(gStringTable.size()); + gStringTable.push_back(s); + gStringIndex.emplace(s, idx); + return idx; +} + +ModulePCTable *findModulePCTable(uintptr_t basePC) { + auto it = gModulePCTableIndex.find(basePC); + if (it == gModulePCTableIndex.end()) return nullptr; + return &gModulePCTables[it->second]; +} + +// Undo libFuzzer's GetNextInstructionPc before lookup. +uintptr_t toPCTablePC(uintptr_t symbolizerPC) { +#if defined(__aarch64__) || defined(__arm__) + return symbolizerPC - 4; +#else + return symbolizerPC - 1; +#endif +} + +} // namespace + +// Called from JS: registerPCLocations(filename, funcNames[], entries[], pcBase) +// entries is a flat Int32Array: +// [edgeId, line, col, funcIdx, isFuncEntry, ...] +// pcBase: for ESM pass the value returned by registerModuleCounters; +// for CJS pass 0 (edge IDs are already global PCs). +void RegisterPCLocations(const Napi::CallbackInfo &info) { + auto env = info.Env(); + if (info.Length() != 4) { + throw Napi::Error::New(env, "Expected 4 arguments: filename, " + "funcNames[], entries (Int32Array), pcBase"); + } + + auto filename = info[0].As().Utf8Value(); + auto funcArray = info[1].As(); + auto entries = info[2].As(); + auto pcBase = + static_cast(info[3].As().Int64Value()); + + uint32_t fileIdx = internString(filename); + + // Intern function names. + std::vector funcIndices(funcArray.Length()); + for (uint32_t i = 0; i < funcArray.Length(); ++i) { + auto name = funcArray.Get(i).As().Utf8Value(); + funcIndices[i] = internString(name); + } + + auto *data = static_cast( + entries.As().Data()); + auto length = entries.ElementLength(); + + bool isEsm = pcBase >= ESM_BASE; + auto baseOffset = isEsm ? pcBase - ESM_BASE : pcBase; + auto &locations = isEsm ? 
gEsmLocations : gCjsLocations; + auto *modulePCTable = isEsm ? findModulePCTable(pcBase) : nullptr; + + for (size_t i = 0; i + 4 < length; i += 5) { + auto edgeId = static_cast(data[i]); + auto line = static_cast(data[i + 1]); + auto col = static_cast(data[i + 2]); + auto localFuncIdx = static_cast(data[i + 3]); + bool isFuncEntry = data[i + 4] != 0; + + auto idx = baseOffset + edgeId; + if (idx >= locations.size()) { + locations.resize(idx + 1); + } + + uint32_t globalFuncIdx = + localFuncIdx < funcIndices.size() ? funcIndices[localFuncIdx] : 0; + locations[idx] = {fileIdx, globalFuncIdx, line, col}; + + if (!isFuncEntry) continue; + + if (isEsm) { + if (modulePCTable != nullptr && edgeId < modulePCTable->numEntries) { + modulePCTable->entries[edgeId].PCFlags |= 1; + } + } else if (gPCEntries != nullptr && edgeId < gCoverageCountersSize) { + gPCEntries[edgeId].PCFlags |= 1; + } + } +} + +void SymbolizePC(uintptr_t pc, const char *fmt, char *out_buf, + size_t out_buf_size) { + if (out_buf_size == 0) return; + + auto origPC = toPCTablePC(pc); + + const char *file = ""; + const char *func = ""; + uint32_t line = 0, col = 0; + + const PCLocation *loc = nullptr; + if (origPC >= ESM_BASE && origPC - ESM_BASE < gEsmLocations.size()) { + loc = &gEsmLocations[origPC - ESM_BASE]; + } else if (origPC < ESM_BASE && origPC < gCjsLocations.size()) { + loc = &gCjsLocations[origPC]; + } + if (loc && loc->line != 0) { + file = gStringTable[loc->fileIdx].c_str(); + func = gStringTable[loc->funcIdx].c_str(); + line = loc->line; + col = loc->col; + } + + size_t pos = 0; + // remaining() reserves one byte for the null terminator, so snprintf + // calls pass remaining()+1 as the buffer size (snprintf counts the + // null in its size parameter but we exclude it from remaining()). 
+ auto remaining = [&]() { return out_buf_size - pos - 1; }; + auto advance = [&](int n) { if (n > 0) pos += std::min(static_cast<size_t>(n), remaining()); }; + + for (const char *f = fmt; *f && remaining() > 0; ++f) { + if (*f == '%' && *(f + 1)) { + ++f; + switch (*f) { + case 'p': + // Virtual PCs are meaningless and %L already prints the file path. + // Eat the trailing space so the output doesn't start with " in". + if (*(f + 1) == ' ') ++f; + break; + case 'F': + advance(snprintf(out_buf + pos, remaining() + 1, "in %s", func)); + break; + case 'L': + advance(snprintf(out_buf + pos, remaining() + 1, "%s:%u:%u", + file, line, col)); + break; + case 's': + advance(snprintf(out_buf + pos, remaining() + 1, "%s", file)); + break; + case 'l': + advance(snprintf(out_buf + pos, remaining() + 1, "%u", line)); + break; + case 'c': + advance(snprintf(out_buf + pos, remaining() + 1, "%u", col)); + break; + default: + if (remaining() >= 2) { + out_buf[pos++] = '%'; + out_buf[pos++] = *f; + } + break; + } + } else { + out_buf[pos++] = *f; + } + } + out_buf[pos] = '\0'; +} diff --git a/packages/fuzzer/shared/coverage.h b/packages/fuzzer/shared/coverage.h index a95c8ec9a..3fab023a5 100644 --- a/packages/fuzzer/shared/coverage.h +++ b/packages/fuzzer/shared/coverage.h @@ -13,7 +13,17 @@ // limitations under the License. #pragma once +#include +#include + #include <napi.h> void RegisterCoverageMap(const Napi::CallbackInfo &info); void RegisterNewCounters(const Napi::CallbackInfo &info); +Napi::Value RegisterModuleCounters(const Napi::CallbackInfo &info); +void RegisterPCLocations(const Napi::CallbackInfo &info); + +// Resolve a fake PC to a human-readable description. Called by the +// __sanitizer_symbolize_pc override in sanitizer_symbols.cpp.
+void SymbolizePC(uintptr_t pc, const char *fmt, char *out_buf, + size_t out_buf_size); diff --git a/packages/fuzzer/shared/sanitizer_symbols.cpp b/packages/fuzzer/shared/sanitizer_symbols.cpp index 77aab4e4e..eaf019b27 100644 --- a/packages/fuzzer/shared/sanitizer_symbols.cpp +++ b/packages/fuzzer/shared/sanitizer_symbols.cpp @@ -12,6 +12,8 @@ // See the License for the specific language governing permissions and // limitations under the License. +#include "coverage.h" + namespace libfuzzer { void (*PrintCrashingInput)() = nullptr; } @@ -27,3 +29,11 @@ __jazzer_set_death_callback(void (*callback)()) { // Suppress libFuzzer warnings about missing sanitizer methods extern "C" [[maybe_unused]] int __sanitizer_acquire_crash_state() { return 1; } extern "C" [[maybe_unused]] void __sanitizer_print_stack_trace() {} + +// Override libFuzzer's weak __sanitizer_symbolize_pc so that +// -print_pcs=1 and -print_coverage=1 show JS source locations. +extern "C" [[maybe_unused]] void +__sanitizer_symbolize_pc(void *pc, const char *fmt, char *out_buf, + size_t out_buf_size) { + SymbolizePC(reinterpret_cast<uintptr_t>(pc), fmt, out_buf, out_buf_size); +} diff --git a/packages/instrumentor/README.md b/packages/instrumentor/README.md index e4a3ba275..b7e202e4c 100644 --- a/packages/instrumentor/README.md +++ b/packages/instrumentor/README.md @@ -8,9 +8,10 @@ coverage statistics, so that the fuzzer can detect when new code paths are reached, and comparison feedback, to enable the fuzzer to mutate it's input in a meaningful way. -Code loading is intercepted using -[istanbul-lib-hook](https://github.com/istanbuljs/istanbuljs/tree/master/packages/istanbul-lib-hook) -, which also enables fine-grained control of when to apply the instrumentatino. +CJS modules are intercepted using +[istanbul-lib-hook](https://github.com/istanbuljs/istanbuljs/tree/master/packages/istanbul-lib-hook). +ES modules are intercepted via a Node.js loader hook (`module.register`, +requires Node >= 20.6).
## Install @@ -23,5 +24,5 @@ npm install --save-dev @jazzer.js/instrumentor ## Documentation See -[Jazzer.js README](https://github.com/CodeIntelligenceTesting/jazzer.js-commercial#readme) +[Jazzer.js README](https://github.com/CodeIntelligenceTesting/jazzer.js#readme) for more information. diff --git a/packages/instrumentor/SourceMapRegistry.ts b/packages/instrumentor/SourceMapRegistry.ts index c8ee4e2eb..fe581cf35 100644 --- a/packages/instrumentor/SourceMapRegistry.ts +++ b/packages/instrumentor/SourceMapRegistry.ts @@ -14,6 +14,10 @@ * limitations under the License. */ +import * as fs from "fs"; +import * as path from "path"; +import { fileURLToPath } from "url"; + import { RawSourceMap } from "source-map"; import sms from "source-map-support"; @@ -39,6 +43,8 @@ const regex = RegExp( "mg", ); +const URL_PREFIX = /^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//; + /** * Extracts the inline source map from a code string. * @@ -54,14 +60,128 @@ export function extractInlineSourceMap(code: string): SourceMap | undefined { } } +/** + * Extracts a source map from code, preferring inline data URLs and + * falling back to file-based sourceMappingURL comments. + */ +export function extractSourceMap( + code: string, + filename: string, +): SourceMap | undefined { + return ( + extractInlineSourceMap(code) ?? 
extractExternalSourceMap(code, filename) + ); +} + +function extractExternalSourceMap( + code: string, + filename: string, +): SourceMap | undefined { + const sourceMapUrl = extractSourceMapUrl(code); + if (!sourceMapUrl || sourceMapUrl.startsWith("data:")) { + return; + } + + const sanitizedUrl = sourceMapUrl.split("#", 1)[0].split("?", 1)[0]; + const mapPath = resolveSourceMapPath(filename, sanitizedUrl); + if (!mapPath) { + return; + } + + try { + const mapContent = fs.readFileSync(mapPath, "utf8"); + return JSON.parse(mapContent); + } catch { + return; + } +} + +function extractSourceMapUrl(code: string): string | undefined { + let lineEnd = code.length; + while (lineEnd > 0) { + let lineStart = code.lastIndexOf("\n", lineEnd - 1); + lineStart = lineStart === -1 ? 0 : lineStart + 1; + + const sourceMapUrl = parseSourceMapDirective( + code.slice(lineStart, lineEnd).trim(), + ); + if (sourceMapUrl) { + return sourceMapUrl; + } + + if (lineStart === 0) { + break; + } + + lineEnd = lineStart - 1; + if (lineEnd > 0 && code[lineEnd - 1] === "\r") { + lineEnd--; + } + } +} + +function parseSourceMapDirective(line: string): string | undefined { + if (!line) { + return; + } + + let body: string; + if ((line.startsWith("//#") || line.startsWith("//@")) && line.length >= 3) { + body = line.slice(3); + } else if ( + (line.startsWith("/*#") || line.startsWith("/*@")) && + line.length >= 3 + ) { + body = line.endsWith("*/") ? 
line.slice(3, -2) : line.slice(3); + } else { + return; + } + + body = body.trimStart(); + const directive = "sourceMappingURL="; + if (!body.startsWith(directive)) { + return; + } + + const sourceMapUrl = body.slice(directive.length).trim(); + return sourceMapUrl || undefined; +} + +function resolveSourceMapPath( + filename: string, + sourceMapUrl: string, +): string | undefined { + if (!sourceMapUrl) { + return; + } + + if (sourceMapUrl.startsWith("file://")) { + return fileURLToPath(sourceMapUrl); + } + if (URL_PREFIX.test(sourceMapUrl)) { + return; + } + + let decodedUrl = sourceMapUrl; + try { + decodedUrl = decodeURIComponent(sourceMapUrl); + } catch { + // Keep undecoded value if it contains invalid escapes. + } + + return path.resolve(path.dirname(filename), decodedUrl); +} + export function toRawSourceMap( sourceMap?: SourceMap, ): RawSourceMap | undefined { if (sourceMap) { return { version: sourceMap.version.toString(), + file: sourceMap.file, sources: sourceMap.sources ?? [], names: sourceMap.names, + sourceRoot: sourceMap.sourceRoot, sourcesContent: sourceMap.sourcesContent, mappings: sourceMap.mappings, }; diff --git a/packages/instrumentor/edgeIdStrategy.ts b/packages/instrumentor/edgeIdStrategy.ts index dfc0c25e3..18903c2da 100644 --- a/packages/instrumentor/edgeIdStrategy.ts +++ b/packages/instrumentor/edgeIdStrategy.ts @@ -40,6 +40,8 @@ if (process.listeners) { export interface EdgeIdStrategy { nextEdgeId(): number; + /** Return the next edge ID that will be allocated, without consuming it. 
*/ + peekNextEdgeId(): number; startForSourceFile(filename: string): void; commitIdCount(filename: string): void; } @@ -52,6 +54,10 @@ export abstract class IncrementingEdgeIdStrategy implements EdgeIdStrategy { return this._nextEdgeId++; } + peekNextEdgeId(): number { + return this._nextEdgeId; + } + abstract startForSourceFile(filename: string): void; abstract commitIdCount(filename: string): void; } @@ -241,6 +247,10 @@ export class ZeroEdgeIdStrategy implements EdgeIdStrategy { return 0; } + peekNextEdgeId(): number { + return 0; + } + startForSourceFile(filename: string): void { // Nothing to do here } diff --git a/packages/instrumentor/esm-loader.mts b/packages/instrumentor/esm-loader.mts new file mode 100644 index 000000000..b244e1819 --- /dev/null +++ b/packages/instrumentor/esm-loader.mts @@ -0,0 +1,314 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * Node.js module-loader hook for ESM instrumentation. + * + * Registered via module.register() from registerInstrumentor(). + * Runs in a dedicated loader thread — it has no access to the + * native fuzzer addon or to globalThis.Fuzzer. All it does is + * transform source code and hand it back. The transformed code + * executes in the main thread, where the Fuzzer global exists. 
+ */ + +import type { PluginItem } from "@babel/core"; +import { createRequire } from "node:module"; +import * as path from "node:path"; +import { fileURLToPath } from "node:url"; +import { receiveMessageOnPort, type MessagePort } from "node:worker_threads"; + +// Load CJS-compiled Babel plugins via createRequire so we don't +// depend on Node.js CJS-named-export detection (varies by version). +const require = createRequire(import.meta.url); +const { transformSync } = + require("@babel/core") as typeof import("@babel/core"); +const { esmCodeCoverage } = + require("./plugins/esmCodeCoverage.js") as typeof import("./plugins/esmCodeCoverage.js"); +const { compareHooks } = + require("./plugins/compareHooks.js") as typeof import("./plugins/compareHooks.js"); +const { sourceCodeCoverage } = + require("./plugins/sourceCodeCoverage.js") as typeof import("./plugins/sourceCodeCoverage.js"); +const { functionHooks } = + require("./plugins/functionHooks.js") as typeof import("./plugins/functionHooks.js"); +const { buildPCLocationBatches } = + require("./pcLocationBatches.js") as typeof import("./pcLocationBatches.js"); +const { extractSourceMap, toRawSourceMap } = + require("./SourceMapRegistry.js") as typeof import("./SourceMapRegistry.js"); + +// The loader thread has its own CJS module cache, so this is a +// separate HookManager instance from the main thread's. We populate +// it with stub hooks from the serialized data we receive via the port. +const { hookManager: loaderHookManager } = + require("@jazzer.js/hooking") as typeof import("@jazzer.js/hooking"); + +// Already-instrumented code contains this marker. +const INSTRUMENTATION_MARKER = "Fuzzer.coverageTracker.incrementCounter"; + +// Counter buffer variable injected into each instrumented module. +const COUNTER_ARRAY = "__jazzer_cov"; + +const PROJECT_ROOT_PREFIX = (() => { + const cwd = path.resolve(process.cwd()); + return cwd.endsWith(path.sep) ? 
cwd : `${cwd}${path.sep}`; +})(); + +function stripProjectRootPrefix(filename: string): string { + return filename.startsWith(PROJECT_ROOT_PREFIX) + ? filename.slice(PROJECT_ROOT_PREFIX.length) + : filename; +} + +interface LoaderConfig { + includes: string[]; + excludes: string[]; + coverage: boolean; + port?: MessagePort; +} + +let config: LoaderConfig; +let loaderPort: MessagePort | null = null; + +export function initialize(data: LoaderConfig): void { + config = data; + if (data.port) { + loaderPort = data.port; + } +} + +interface LoadResult { + format?: string; + source?: string | ArrayBuffer | SharedArrayBuffer | Uint8Array; + shortCircuit?: boolean; +} + +type BabelInputSourceMap = { + version: number; + sources: string[]; + names: string[]; + sourceRoot?: string; + sourcesContent?: string[]; + mappings: string; + file: string; +}; + +type LoadFn = ( + url: string, + context: { format?: string | null }, + nextLoad: ( + url: string, + context: { format?: string | null }, + ) => Promise<LoadResult>, +) => Promise<LoadResult>; + +export const load: LoadFn = async function load(url, context, nextLoad) { + const result = await nextLoad(url, context); + + if (result.format !== "module" || !result.source) { + return result; + } + + // Only instrument file:// URLs (skip builtins, data:, https:, etc.) + if (!url.startsWith("file://")) { + return result; + } + + const filename = fileURLToPath(url); + if (!shouldInstrument(filename)) { + return result; + } + + const code = result.source.toString(); + + // Avoid double-instrumenting code already processed by the CJS path + // or by the Jest transformer.
+ if (code.includes(INSTRUMENTATION_MARKER)) { + return result; + } + + const instrumented = instrumentModule(code, filename); + if (!instrumented) { + return result; + } + + return { ...result, source: instrumented }; +}; + +// ── Instrumentation ────────────────────────────────────────────── + +function instrumentModule(code: string, filename: string): string | null { + drainHookUpdates(); + const inputSourceMap = extractSourceMap(code, filename); + + const fuzzerCoverage = esmCodeCoverage(); + + const plugins: PluginItem[] = [fuzzerCoverage.plugin, compareHooks]; + + // When --coverage is active, also apply Istanbul instrumentation so + // that ESM modules appear in the human-readable coverage report. + // The plugin writes to globalThis.__coverage__ at runtime (on the + // main thread), just like the CJS path does. + if (config.coverage) { + plugins.push(sourceCodeCoverage(filename)); + } + + // Apply function hooks if the main thread has sent hook definitions + // and any of them target functions in this file. The instrumented + // code calls HookManager.callHook(id, ...) at runtime, which + // resolves to the real hook function on the main thread. + if (loaderHookManager.hasFunctionsToHook(filename)) { + plugins.push(functionHooks(filename)); + } + + let transformed: ReturnType<typeof transformSync>; + try { + const rawInputSourceMap = toRawSourceMap(inputSourceMap); + const babelInputSourceMap: BabelInputSourceMap | undefined = + rawInputSourceMap + ? { + version: Number(rawInputSourceMap.version), + sources: rawInputSourceMap.sources, + names: rawInputSourceMap.names, + sourceRoot: rawInputSourceMap.sourceRoot, + sourcesContent: rawInputSourceMap.sourcesContent, + mappings: rawInputSourceMap.mappings, + file: rawInputSourceMap.file ??
filename, + } + : undefined; + transformed = transformSync(code, { + filename, + sourceFileName: filename, + sourceMaps: true, + inputSourceMap: babelInputSourceMap, + plugins, + sourceType: "module", + }); + } catch { + // Babel parse failures on non-JS assets should not crash the + // loader — fall through and return the original source. + return null; + } + + const edges = fuzzerCoverage.edgeCount(); + if (edges === 0 || !transformed?.code) { + return null; + } + // Build a preamble that runs on the main thread before the module + // body. It allocates the per-module coverage counter buffer and, + // when a source map is available, registers it with the main-thread + // SourceMapRegistry so that source-map-support can remap stack + // traces back to the original source. + const preambleLines = [ + `const {counters: ${COUNTER_ARRAY}, pcBase: __jazzer_pcBase} = Fuzzer.coverageTracker.createModuleCounters(${edges});`, + ]; + + // Register edge-to-source mappings for PC symbolization. + // Serialized as a flat array: + // [id, line, col, funcIdx, isFuncEntry, ...] + const edgeEntries = fuzzerCoverage.edgeEntries(); + if (edgeEntries.length > 0) { + const funcNames = fuzzerCoverage.funcNames(); + const batches = buildPCLocationBatches( + edgeEntries, + filename, + inputSourceMap, + stripProjectRootPrefix, + ); + for (const batch of batches) { + preambleLines.push( + `Fuzzer.coverageTracker.registerPCLocations(` + + `${JSON.stringify(batch.filename)},` + + `${JSON.stringify(funcNames)},` + + `new Int32Array(${JSON.stringify(Array.from(batch.entries))}),` + + `__jazzer_pcBase);`, + ); + } + } + + if (transformed.map) { + // Shift the source map to account for the preamble lines we are + // about to prepend. In VLQ-encoded mappings each semicolon + // represents one generated line; prepending them pushes all real + // mappings down by the right amount. 
+ const preambleOffset = preambleLines.length + 1; // +1 for the source map line itself + const shifted = { + ...transformed.map, + mappings: ";".repeat(preambleOffset) + transformed.map.mappings, + }; + preambleLines.push( + `__jazzer_registerSourceMap(${JSON.stringify(filename)}, ${JSON.stringify(shifted)});`, + ); + } + + return preambleLines.join("\n") + "\n" + transformed.code; +} + +// ── Function hooks from the main thread ────────────────────────── + +interface SerializedHook { + id: number; + type: number; + target: string; + pkg: string; + async: boolean; +} + +const noop = () => {}; + +/** + * Synchronously drain any hook-definition messages from the main + * thread. Uses receiveMessageOnPort — a non-blocking, synchronous + * read — so we never have to await or restructure the load() flow. + * + * The main thread sends hook data after finalizeHooks() and before + * user modules are loaded, so the message is always available by the + * time we process user code. + */ +function drainHookUpdates(): void { + if (!loaderPort) return; + + let msg; + while ((msg = receiveMessageOnPort(loaderPort))) { + const hooks = msg.message.hooks as SerializedHook[]; + for (const h of hooks) { + const stub = loaderHookManager.registerHook( + h.type, + h.target, + h.pkg, + h.async, + noop, + ); + // Sanity check: the stub's index in the loader must match the + // main thread's index so that runtime HookManager.callHook(id) + // invokes the correct hook function. 
+ const actualId = loaderHookManager.hookIndex(stub); + if (actualId !== h.id) { + throw new Error( + `ESM hook ID mismatch: expected ${h.id}, got ${actualId} ` + + `for ${h.target} in ${h.pkg}`, + ); + } + } + } +} + +// ── Include / exclude filtering ────────────────────────────────── + +function shouldInstrument(filepath: string): boolean { + const { includes, excludes } = config; + const included = includes.some((p) => filepath.includes(p)); + const excluded = excludes.some((p) => filepath.includes(p)); + return included && !excluded; +} diff --git a/packages/instrumentor/esmFunctionHooks.test.ts b/packages/instrumentor/esmFunctionHooks.test.ts new file mode 100644 index 000000000..a347aea8c --- /dev/null +++ b/packages/instrumentor/esmFunctionHooks.test.ts @@ -0,0 +1,207 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { transformSync } from "@babel/core"; + +import { hookManager, HookType } from "@jazzer.js/hooking"; + +import { Instrumentor, SerializedHook } from "./instrument"; +import { functionHooks } from "./plugins/functionHooks"; + +/** + * These tests verify the ESM function-hook wiring: serialization of + * hooks on the main thread, registration of stubs in a simulated + * loader hookManager, and correct Babel output with matching IDs. 
+ * + * We cannot spawn a real loader thread in a unit test, so we exercise + * the same logic inline: register stub hooks in the global hookManager + * (which the functionHooks plugin reads from) and verify the output. + */ + +afterEach(() => { + hookManager.clearHooks(); +}); + +describe("ESM function hook serialization", () => { + it("should serialize hooks with explicit IDs", () => { + hookManager.registerHook( + HookType.Before, + "execSync", + "child_process", + false, + () => {}, + ); + hookManager.registerHook( + HookType.Replace, + "fetch", + "node-fetch", + true, + () => {}, + ); + + const serialized: SerializedHook[] = hookManager.hooks.map( + (hook, index) => ({ + id: index, + type: hook.type, + target: hook.target, + pkg: hook.pkg, + async: hook.async, + }), + ); + + expect(serialized).toEqual([ + { + id: 0, + type: HookType.Before, + target: "execSync", + pkg: "child_process", + async: false, + }, + { + id: 1, + type: HookType.Replace, + target: "fetch", + pkg: "node-fetch", + async: true, + }, + ]); + }); + + it("should round-trip through JSON (MessagePort serialization)", () => { + hookManager.registerHook(HookType.After, "readFile", "fs", false, () => {}); + + const serialized: SerializedHook[] = hookManager.hooks.map( + (hook, index) => ({ + id: index, + type: hook.type, + target: hook.target, + pkg: hook.pkg, + async: hook.async, + }), + ); + + // structuredClone simulates what MessagePort does + const received = structuredClone(serialized); + expect(received).toEqual(serialized); + expect(received[0].type).toBe(HookType.After); + }); +}); + +describe("ESM function hook stub registration", () => { + it("should produce matching IDs when stubs are registered in order", () => { + // Simulate the main thread registering real hooks + const realHook1 = hookManager.registerHook( + HookType.Before, + "execSync", + "child_process", + false, + () => {}, + ); + const realHook2 = hookManager.registerHook( + HookType.Replace, + "connect", + "net", + false, + 
() => {}, + ); + + const mainId1 = hookManager.hookIndex(realHook1); + const mainId2 = hookManager.hookIndex(realHook2); + + // Serialize + const serialized: SerializedHook[] = hookManager.hooks.map( + (hook, index) => ({ + id: index, + type: hook.type, + target: hook.target, + pkg: hook.pkg, + async: hook.async, + }), + ); + + // Clear and re-register as the loader thread would + hookManager.clearHooks(); + for (const h of serialized) { + const stub = hookManager.registerHook( + h.type, + h.target, + h.pkg, + h.async, + () => {}, + ); + expect(hookManager.hookIndex(stub)).toBe(h.id); + } + + // IDs in the loader match the original main-thread IDs + expect(hookManager.hookIndex(hookManager.hooks[0])).toBe(mainId1); + expect(hookManager.hookIndex(hookManager.hooks[1])).toBe(mainId2); + }); +}); + +describe("ESM function hook Babel output", () => { + it("should insert HookManager.callHook with the correct hook ID", () => { + hookManager.registerHook( + HookType.Before, + "processInput", + "target-pkg", + false, + () => {}, + ); + + const result = transformSync( + "function processInput(data) { return data.trim(); }", + { + filename: "/app/node_modules/target-pkg/index.js", + plugins: [functionHooks("/app/node_modules/target-pkg/index.js")], + }, + ); + + expect(result?.code).toContain("HookManager.callHook(0,"); + expect(result?.code).toContain("this, [data]"); + }); + + it("should not hook functions in non-matching files", () => { + hookManager.registerHook( + HookType.Before, + "dangerous", + "target-pkg", + false, + () => {}, + ); + + const result = transformSync("function dangerous(x) { return x; }", { + filename: "/app/node_modules/other-pkg/lib.js", + plugins: [functionHooks("/app/node_modules/other-pkg/lib.js")], + }); + + expect(result?.code).not.toContain("HookManager.callHook"); + }); + + it("should use sendHooksToLoader to serialize from Instrumentor", () => { + hookManager.registerHook( + HookType.Before, + "exec", + "child_process", + false, + () => {}, 
+ ); + + const instrumentor = new Instrumentor(); + + // Without a port, sendHooksToLoader is a no-op (no crash) + expect(() => instrumentor.sendHooksToLoader()).not.toThrow(); + }); +}); diff --git a/packages/instrumentor/esmSourceMaps.test.ts b/packages/instrumentor/esmSourceMaps.test.ts new file mode 100644 index 000000000..1758b3b80 --- /dev/null +++ b/packages/instrumentor/esmSourceMaps.test.ts @@ -0,0 +1,176 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { PluginItem, transformSync } from "@babel/core"; + +import { compareHooks } from "./plugins/compareHooks"; +import { esmCodeCoverage } from "./plugins/esmCodeCoverage"; +import { SourceMap, SourceMapRegistry } from "./SourceMapRegistry"; + +const COUNTER_ARRAY = "__jazzer_cov"; + +/** + * Replicate the ESM loader's instrumentModule logic so we can test + * the source map handling without running a real loader thread. 
+ */ +function instrumentModule( + code: string, + filename: string, + extraPlugins: PluginItem[] = [], +): { source: string; map: SourceMap | null } | null { + const fuzzerCoverage = esmCodeCoverage(); + const plugins: PluginItem[] = [ + fuzzerCoverage.plugin, + compareHooks, + ...extraPlugins, + ]; + + const transformed = transformSync(code, { + filename, + sourceFileName: filename, + sourceMaps: true, + plugins, + sourceType: "module", + }); + + const edges = fuzzerCoverage.edgeCount(); + if (edges === 0 || !transformed?.code) { + return null; + } + + const preambleLines = [ + `const {counters: ${COUNTER_ARRAY}, pcBase: __jazzer_pcBase} = Fuzzer.coverageTracker.createModuleCounters(${edges});`, + ]; + + let shiftedMap: SourceMap | null = null; + if (transformed.map) { + const preambleOffset = preambleLines.length + 1; + shiftedMap = { + ...transformed.map, + mappings: ";".repeat(preambleOffset) + transformed.map.mappings, + } as SourceMap; + preambleLines.push( + `__jazzer_registerSourceMap(${JSON.stringify(filename)}, ${JSON.stringify(shiftedMap)});`, + ); + } + + return { + source: preambleLines.join("\n") + "\n" + transformed.code, + map: shiftedMap, + }; +} + +describe("ESM source map handling", () => { + it("should produce a separate source map, not an inline one", () => { + const result = instrumentModule( + "export function greet() { return 'hi'; }", + "/app/greet.mjs", + ); + + expect(result).not.toBeNull(); + expect(result!.source).not.toContain("sourceMappingURL=data:"); + expect(result!.map).not.toBeNull(); + expect(result!.map!.version).toBe(3); + }); + + it("should shift mappings by the number of preamble lines", () => { + const result = instrumentModule( + "export function greet() { return 'hi'; }", + "/app/greet.mjs", + ); + + expect(result!.map).not.toBeNull(); + const mappings = result!.map!.mappings; + + // The preamble has 2 lines (counter allocation + source map registration). + // Each prepended ";" represents an unmapped generated line. 
+ expect(mappings.startsWith(";;")).toBe(true); + + // The real mappings follow; they should not be empty. + const realMappings = mappings.replace(/^;+/, ""); + expect(realMappings.length).toBeGreaterThan(0); + }); + + it("should embed a registration call in the preamble", () => { + const filename = "/app/target.mjs"; + const result = instrumentModule( + "export function check(s) { if (s === 'x') throw new Error(); }", + filename, + ); + + const lines = result!.source.split("\n"); + + // Line 1: counter allocation + expect(lines[0]).toContain("Fuzzer.coverageTracker.createModuleCounters"); + + // Line 2: source map registration with the correct filename + expect(lines[1]).toContain("__jazzer_registerSourceMap"); + expect(lines[1]).toContain(JSON.stringify(filename)); + + // The registration call should contain valid JSON for the source map + const match = lines[1].match(/__jazzer_registerSourceMap\([^,]+, (.+)\);$/); + expect(match).not.toBeNull(); + const embeddedMap = JSON.parse(match![1]); + expect(embeddedMap.version).toBe(3); + expect(embeddedMap.sources).toContain(filename); + }); + + it("should register maps with SourceMapRegistry via the global", () => { + const registry = new SourceMapRegistry(); + const filename = "/app/module.mjs"; + const fakeMap: SourceMap = { + version: 3, + sources: [filename], + names: [], + mappings: "AAAA", + file: filename, + }; + + // Simulate what Instrumentor.init() installs + (globalThis as Record<string, unknown>).__jazzer_registerSourceMap = ( + f: string, + m: SourceMap, + ) => registry.registerSourceMap(f, m); + + // Simulate what the preamble does at module evaluation time + const register = (globalThis as Record<string, unknown>) + .__jazzer_registerSourceMap as (f: string, m: SourceMap) => void; + register(filename, fakeMap); + + expect(registry.getSourceMap(filename)).toEqual(fakeMap); + + // Cleanup + delete (globalThis as Record<string, unknown>).__jazzer_registerSourceMap; + }); + + it("should preserve original source 
file in the map", () => { + const filename = 
"/project/src/lib.mjs"; + const result = instrumentModule( + [ + "export function add(a, b) {", + " return a + b;", + "}", + "export function sub(a, b) {", + " return a - b;", + "}", + ].join("\n"), + filename, + ); + + expect(result!.map!.sources).toContain(filename); + expect(result!.map!.mappings.split(";").length).toBeGreaterThan(2); + }); +}); diff --git a/packages/instrumentor/instrument.ts b/packages/instrumentor/instrument.ts index 0f07c5784..f7587093b 100644 --- a/packages/instrumentor/instrument.ts +++ b/packages/instrumentor/instrument.ts @@ -14,6 +14,10 @@ * limitations under the License. */ +import * as path from "path"; +import { pathToFileURL } from "url"; +import { MessageChannel, type MessagePort } from "worker_threads"; + import { BabelFileResult, PluginItem, @@ -22,16 +26,18 @@ import { } from "@babel/core"; import { hookRequire, TransformerOptions } from "istanbul-lib-hook"; -import { hookManager } from "@jazzer.js/hooking"; +import { fuzzer } from "@jazzer.js/fuzzer"; +import { hookManager, HookType } from "@jazzer.js/hooking"; import { EdgeIdStrategy, MemorySyncIdStrategy } from "./edgeIdStrategy"; +import { buildPCLocationBatches } from "./pcLocationBatches"; import { instrumentationPlugins } from "./plugin"; -import { codeCoverage } from "./plugins/codeCoverage"; +import { cjsCoverage, CjsCoverageResult } from "./plugins/codeCoverage"; import { compareHooks } from "./plugins/compareHooks"; import { functionHooks } from "./plugins/functionHooks"; import { sourceCodeCoverage } from "./plugins/sourceCodeCoverage"; import { - extractInlineSourceMap, + extractSourceMap, SourceMap, SourceMapRegistry, toRawSourceMap, @@ -46,7 +52,34 @@ export { } from "./edgeIdStrategy"; export { SourceMap } from "./SourceMapRegistry"; +/** + * Serializable hook descriptor sent from the main thread to the ESM + * loader thread. The hook function itself stays on the main thread; + * only the metadata needed for the Babel transform crosses the boundary. 
+ */ +export interface SerializedHook { + id: number; + type: HookType; + target: string; + pkg: string; + async: boolean; +} + +const PROJECT_ROOT_PREFIX = (() => { + const cwd = path.resolve(process.cwd()); + return cwd.endsWith(path.sep) ? cwd : `${cwd}${path.sep}`; +})(); + +function stripProjectRootPrefix(filename: string): string { + return filename.startsWith(PROJECT_ROOT_PREFIX) + ? filename.slice(PROJECT_ROOT_PREFIX.length) + : filename; +} + export class Instrumentor { + private loaderPort: MessagePort | null = null; + private readonly cjsCoverage: CjsCoverageResult; + constructor( private readonly includes: string[] = [], private readonly excludes: string[] = [], @@ -63,26 +96,41 @@ export class Instrumentor { } this.includes = Instrumentor.cleanup(includes); this.excludes = Instrumentor.cleanup(excludes); + this.cjsCoverage = cjsCoverage(this.idStrategy); } init(): () => void { if (this.includes.includes("jazzer.js")) { this.unloadInternalModules(); } + + // Expose a registration function so ESM modules can feed their + // source maps back to the main-thread registry. The ESM loader + // thread cannot access this registry directly, but the preamble + // code it emits runs on the main thread during module evaluation, + // before the module body, and therefore before any error could + // need the map for stack-trace rewriting. + const registry = this.sourceMapRegistry; + (globalThis as Record<string, unknown>).__jazzer_registerSourceMap = ( + filename: string, + map: SourceMap, + ) => registry.registerSourceMap(filename, map); + return this.sourceMapRegistry.installSourceMapSupport(); } instrument(code: string, filename: string, sourceMap?: SourceMap) { - // Extract inline source map from code string and use it as input source map - // in further transformations. - const inputSourceMap = sourceMap ?? 
extractInlineSourceMap(code); + // Extract source maps from the transformed code (inline or external) + // and use them as input source maps in further transformations. 
+ const inputSourceMap = sourceMap ?? extractSourceMap(code, filename); const transformations: PluginItem[] = []; const shouldInstrumentFile = this.shouldInstrumentForFuzzing(filename); if (shouldInstrumentFile) { + this.cjsCoverage.clear(); transformations.push( ...instrumentationPlugins.plugins, - codeCoverage(this.idStrategy), + this.cjsCoverage.plugin, compareHooks, ); } @@ -122,11 +170,36 @@ export class Instrumentor { } } if (shouldInstrumentFile) { + this.registerCjsPCLocations(filename, inputSourceMap); this.idStrategy.commitIdCount(filename); } return result; } + private registerCjsPCLocations( + filename: string, + sourceMap?: SourceMap, + ): void { + const entries = this.cjsCoverage.edgeEntries(); + if (entries.length === 0) return; + + const funcNames = this.cjsCoverage.funcNames(); + const batches = buildPCLocationBatches( + entries, + filename, + sourceMap, + stripProjectRootPrefix, + ); + for (const batch of batches) { + fuzzer.coverageTracker.registerPCLocations( + batch.filename, + funcNames, + batch.entries, + 0, + ); + } + } + // eslint-disable-next-line @typescript-eslint/no-explicit-any private asInputSourceOption(inputSourceMap: any): any { // Empty input source maps mess up the coverage report. @@ -183,6 +256,47 @@ export class Instrumentor { ); } + get dryRun(): boolean { + return this.isDryRun; + } + + get includePatterns(): string[] { + return this.includes; + } + + get excludePatterns(): string[] { + return this.excludes; + } + + get coverageEnabled(): boolean { + return this.shouldCollectSourceCodeCoverage; + } + + /** Connect the main-thread side of the loader MessagePort. */ + setLoaderPort(port: MessagePort): void { + this.loaderPort = port; + } + + /** + * Send the current hook definitions to the ESM loader thread so it + * can apply function-hook transformations. Must be called after all + * hooks are registered and finalized, but before user modules are + * loaded. 
+ */ + sendHooksToLoader(): void { + if (!this.loaderPort) return; + + const hooks: SerializedHook[] = hookManager.hooks.map((hook, index) => ({ + id: index, + type: hook.type, + target: hook.target, + pkg: hook.pkg, + async: hook.async, + })); + + this.loaderPort.postMessage({ hooks }); + } + private shouldCollectCodeCoverage(filepath: string): boolean { return ( this.shouldCollectSourceCodeCoverage && @@ -223,4 +337,73 @@ export function registerInstrumentor(instrumentor: Instrumentor) { // instrumentor but the filename will still have a .ts extension { extensions: [".js", ".mjs", ".cjs", ".ts", ".mts", ".cts"] }, ); + + registerEsmHooks(instrumentor); +} + +/** + * On Node.js >= 20.6 register an ESM loader hook so that + * import() and static imports are instrumented too. + * + * On Node >= 20.11 (where module.register supports transferList) we + * also establish a MessagePort to the loader thread. This lets us + * send function-hook definitions after bug detectors are loaded — + * well before user modules are imported. + */ +function registerEsmHooks(instrumentor: Instrumentor): void { + if (instrumentor.dryRun) { + return; + } + + const [major, minor] = process.versions.node.split(".").map(Number); + if (major < 20 || (major === 20 && minor < 6)) { + return; + } + + // transferList (needed for MessagePort) requires Node >= 20.11. + // On older 20.x builds, ESM gets coverage and compare-hooks but + // not function hooks — a MessagePort in `data` without transferList + // would throw DataCloneError, so we simply omit it. 
+ const supportsTransferList = major > 20 || (major === 20 && minor >= 11); + + try { + const { register } = require("node:module") as { + register: ( + specifier: string, + options: { + parentURL: string; + data: unknown; + transferList?: unknown[]; + }, + ) => void; + }; + + const loaderUrl = pathToFileURL( + path.join(__dirname, "esm-loader.mjs"), + ).href; + + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const data: Record<string, any> = { + includes: instrumentor.includePatterns, + excludes: instrumentor.excludePatterns, + coverage: instrumentor.coverageEnabled, + }; + + const options: { + parentURL: string; + data: unknown; + transferList?: unknown[]; + } = { parentURL: pathToFileURL(__filename).href, data }; + + if (supportsTransferList) { + const { port1, port2 } = new MessageChannel(); + data.port = port2; + options.transferList = [port2]; + instrumentor.setLoaderPort(port1); + } + + register(loaderUrl, options); + } catch { + // Silently fall back to CJS-only instrumentation. + } } diff --git a/packages/instrumentor/instrumentPcLocations.test.ts b/packages/instrumentor/instrumentPcLocations.test.ts new file mode 100644 index 000000000..512ba4746 --- /dev/null +++ b/packages/instrumentor/instrumentPcLocations.test.ts @@ -0,0 +1,224 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import * as fs from "fs"; +import * as path from "path"; + +import * as tmp from "tmp"; +import ts from "typescript"; + +import { fuzzer } from "@jazzer.js/fuzzer"; + +import { Instrumentor } from "./instrument"; +import { SourceMap } from "./SourceMapRegistry"; + +jest.mock("@jazzer.js/fuzzer"); + +tmp.setGracefulCleanup(); + +describe("PC location source map remapping", () => { + const registerPCLocationsMock = jest.mocked( + fuzzer.coverageTracker.registerPCLocations, + ); + + beforeEach(() => { + registerPCLocationsMock.mockClear(); + }); + + it("maps CJS edge locations to TypeScript files", () => { + const sourceFile = path.join(process.cwd(), "src", "target.ts"); + const generatedFile = path.join(process.cwd(), "dist", "target.js"); + + const transpiled = transpile( + [ + "interface Marker {", + " value: number;", + "}", + "", + "class Parser {", + " makeFilter(stream: string, maybeLength: number) {", + " if (maybeLength === 0) return stream;", + " return stream + 'x';", + " }", + "}", + "", + "export function run(x: number) {", + " if (x > 1) return x;", + " return 0;", + "}", + ].join("\n"), + sourceFile, + path.join(process.cwd(), "src"), + path.join(process.cwd(), "dist"), + ); + + const instrumentor = new Instrumentor(); + instrumentor.instrument(transpiled.code, generatedFile, transpiled.map); + + expect(registerPCLocationsMock).toHaveBeenCalled(); + + const tsCall = registerPCLocationsMock.mock.calls.find(([filename]) => + String(filename).endsWith(path.join("src", "target.ts")), + ); + expect(tsCall).toBeDefined(); + + if (!tsCall) { + return; + } + + const [, funcNames, entries, pcBase] = tsCall; + expect(pcBase).toBe(0); + expect(entries.length % 5).toBe(0); + + const tuples = toTuples(entries); + expect( + tuples.some( + ([, line, , funcIdx, isFuncEntry]) => + isFuncEntry === 1 && + funcNames[funcIdx] === "Parser.makeFilter" && + line === 6, + ), + ).toBe(true); + }); + + it("loads external source map files for CJS symbolization", () => { + 
const dir = tmp.dirSync({ unsafeCleanup: true }); + const srcDir = path.join(dir.name, "src"); + const distDir = path.join(dir.name, "dist"); + fs.mkdirSync(srcDir, { recursive: true }); + fs.mkdirSync(distDir, { recursive: true }); + + const sourceFile = path.join(srcDir, "target.ts"); + const generatedFile = path.join(distDir, "target.js"); + + const transpiled = transpile( + [ + "export class Parser {", + " makeFilter(stream: string, maybeLength: number) {", + " if (maybeLength === 0) return stream;", + " return stream + 'x';", + " }", + "}", + ].join("\n"), + sourceFile, + srcDir, + distDir, + ); + + const sourceMapPath = path.join(distDir, "target.js.map"); + fs.writeFileSync(sourceMapPath, JSON.stringify(transpiled.map)); + + const instrumentor = new Instrumentor(); + instrumentor.instrument(transpiled.code, generatedFile); + + expect(registerPCLocationsMock).toHaveBeenCalled(); + expect( + registerPCLocationsMock.mock.calls.some(([filename]) => + String(filename).endsWith(path.join("src", "target.ts")), + ), + ).toBe(true); + }); + + it("keeps generated JS locations when TypeScript mappings are missing", () => { + const sourceFile = path.join(process.cwd(), "src", "downlevel.ts"); + const generatedFile = path.join(process.cwd(), "dist", "downlevel.js"); + + const transpiled = transpile( + [ + "class Base { value = 1; }", + "class Child extends Base {", + " method(x: number) {", + " if (x > 0) return this.value + x;", + " return x;", + " }", + "}", + "export const run = (n: number) => new Child().method(n);", + ].join("\n"), + sourceFile, + path.join(process.cwd(), "src"), + path.join(process.cwd(), "dist"), + ts.ScriptTarget.ES5, + ); + + const instrumentor = new Instrumentor(); + instrumentor.instrument(transpiled.code, generatedFile, transpiled.map); + + const filenames = registerPCLocationsMock.mock.calls.map(([filename]) => + String(filename), + ); + expect( + filenames.some((filename) => + filename.endsWith(path.join("src", "downlevel.ts")), + ), + 
).toBe(true); + expect( + filenames.some((filename) => + filename.endsWith(path.join("dist", "downlevel.js")), + ), + ).toBe(true); + + const totalRegisteredEntries = registerPCLocationsMock.mock.calls.reduce( + (total, [, , entries]) => total + entries.length / 5, + 0, + ); + expect(totalRegisteredEntries).toBeGreaterThan(0); + }); +}); + +function transpile( + code: string, + sourceFile: string, + rootDir: string, + outDir: string, + target: ts.ScriptTarget = ts.ScriptTarget.ES2018, +): { code: string; map: SourceMap } { + const transpiled = ts.transpileModule(code, { + compilerOptions: { + target, + module: ts.ModuleKind.CommonJS, + sourceMap: true, + inlineSources: true, + rootDir, + outDir, + }, + fileName: sourceFile, + }); + + if (!transpiled.sourceMapText) { + throw new Error( + "Expected TypeScript transpilation to produce a source map", + ); + } + + return { + code: transpiled.outputText, + map: JSON.parse(transpiled.sourceMapText), + }; +} + +function toTuples(entries: Int32Array): number[][] { + const tuples: number[][] = []; + for (let i = 0; i + 4 < entries.length; i += 5) { + tuples.push([ + entries[i], + entries[i + 1], + entries[i + 2], + entries[i + 3], + entries[i + 4], + ]); + } + return tuples; +} diff --git a/packages/instrumentor/pcLocationBatches.ts b/packages/instrumentor/pcLocationBatches.ts new file mode 100644 index 000000000..c7461a8e1 --- /dev/null +++ b/packages/instrumentor/pcLocationBatches.ts @@ -0,0 +1,178 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import * as path from "path"; +import { fileURLToPath } from "url"; + +import { SourceMapConsumer } from "source-map"; + +import type { EdgeEntry } from "./plugins/esmCodeCoverage"; +import { SourceMap, toRawSourceMap } from "./SourceMapRegistry"; + +export interface PCLocationBatch { + filename: string; + entries: Int32Array; +} + +interface RemappedPosition { + filename: string; + line: number; + col: number; +} + +const URL_PREFIX = /^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//; + +export function buildPCLocationBatches( + edgeEntries: EdgeEntry[], + generatedFilename: string, + sourceMap: SourceMap | undefined, + normalizeFilename: (filename: string) => string = (filename) => filename, +): PCLocationBatch[] { + if (edgeEntries.length === 0) { + return []; + } + + if (!sourceMap) { + return [ + { + filename: normalizeFilename(generatedFilename), + entries: flattenEntries(edgeEntries), + }, + ]; + } + + const rawSourceMap = toRawSourceMap(sourceMap); + if (!rawSourceMap) { + return [ + { + filename: normalizeFilename(generatedFilename), + entries: flattenEntries(edgeEntries), + }, + ]; + } + + let consumer: SourceMapConsumer; + try { + consumer = new SourceMapConsumer(rawSourceMap); + } catch { + return [ + { + filename: normalizeFilename(generatedFilename), + entries: flattenEntries(edgeEntries), + }, + ]; + } + + const grouped = new Map(); + const remapCache = new Map(); + + for (const [edgeId, line, col, funcIdx, isFuncEntry] of edgeEntries) { + let targetFilename = normalizeFilename(generatedFilename); + let targetLine = line; + let targetCol = col; + + if (line > 0) { + const cacheKey = `${line}:${col}`; + let remapped = remapCache.get(cacheKey); + if (remapped === undefined) { + const original = consumer.originalPositionFor({ line, column: col }); + if ( + original.source && + original.line !== null && + original.column !== null + ) { + remapped = { + 
filename: normalizeFilename( + resolveOriginalSourcePath( + original.source, + sourceMap, + generatedFilename, + ), + ), + line: original.line, + col: original.column, + }; + } else { + remapped = null; + } + remapCache.set(cacheKey, remapped); + } + + if (remapped) { + targetFilename = remapped.filename; + targetLine = remapped.line; + targetCol = remapped.col; + } + } + + const batch = grouped.get(targetFilename) ?? []; + batch.push(edgeId, targetLine, targetCol, funcIdx, isFuncEntry); + if (!grouped.has(targetFilename)) { + grouped.set(targetFilename, batch); + } + } + + return Array.from(grouped.entries()).map(([filename, flat]) => ({ + filename, + entries: Int32Array.from(flat), + })); +} + +function flattenEntries(edgeEntries: EdgeEntry[]): Int32Array { + const flat = new Int32Array(edgeEntries.length * 5); + for (let i = 0; i < edgeEntries.length; i++) { + const e = edgeEntries[i]; + flat[i * 5] = e[0]; + flat[i * 5 + 1] = e[1]; + flat[i * 5 + 2] = e[2]; + flat[i * 5 + 3] = e[3]; + flat[i * 5 + 4] = e[4]; + } + return flat; +} + +function resolveOriginalSourcePath( + source: string, + sourceMap: SourceMap, + generatedFilename: string, +): string { + if (source.startsWith("file://")) { + return fileURLToPath(source); + } + if (path.isAbsolute(source) || path.win32.isAbsolute(source)) { + return source; + } + if (URL_PREFIX.test(source)) { + return source; + } + + const sourceRoot = sourceMap.sourceRoot; + if (!sourceRoot) { + return path.resolve(path.dirname(generatedFilename), source); + } + + if (sourceRoot.startsWith("file://")) { + return path.resolve(fileURLToPath(sourceRoot), source); + } + if (path.isAbsolute(sourceRoot) || path.win32.isAbsolute(sourceRoot)) { + return path.resolve(sourceRoot, source); + } + if (URL_PREFIX.test(sourceRoot)) { + return source; + } + + return path.resolve(path.dirname(generatedFilename), sourceRoot, source); +} diff --git a/packages/instrumentor/plugins/codeCoverage.test.ts 
b/packages/instrumentor/plugins/codeCoverage.test.ts index 51aa3fe0a..38182f646 100644 --- a/packages/instrumentor/plugins/codeCoverage.test.ts +++ b/packages/instrumentor/plugins/codeCoverage.test.ts @@ -19,10 +19,14 @@ import * as os from "os"; import * as tmp from "tmp"; -import { FileSyncIdStrategy, ZeroEdgeIdStrategy } from "../edgeIdStrategy"; +import { + FileSyncIdStrategy, + MemorySyncIdStrategy, + ZeroEdgeIdStrategy, +} from "../edgeIdStrategy"; import { Instrumentor } from "../instrument"; -import { codeCoverage } from "./codeCoverage"; +import { cjsCoverage, codeCoverage } from "./codeCoverage"; import { instrumentWith } from "./testhelpers"; tmp.setGracefulCleanup(); @@ -151,6 +155,37 @@ describe("code coverage instrumentation", () => { |};`; expectInstrumentation(input, output); }); + + it("should mark function-entry edges for print_funcs", () => { + const coverage = cjsCoverage(new MemorySyncIdStrategy()); + const instrumentor = new Instrumentor(); + instrumentor.transform( + "test.js", + "function f(x) { if (x) return 1; return 2; }", + [coverage.plugin], + ); + + const entries = coverage.edgeEntries(); + expect(entries.length).toBeGreaterThan(0); + expect(entries[0][4]).toBe(1); + expect(entries.slice(1).every((entry) => entry[4] === 0)).toBe(true); + }); + + it("should infer class method names", () => { + const coverage = cjsCoverage(new MemorySyncIdStrategy()); + const instrumentor = new Instrumentor(); + instrumentor.transform( + "test.js", + "class Parser { makeFilter(stream, name, maybeLength, params) { if (maybeLength === 0) return null; return stream; } }", + [coverage.plugin], + ); + + const names = coverage.funcNames(); + const hasMethodName = coverage + .edgeEntries() + .some((entry) => names[entry[3]] === "Parser.makeFilter"); + expect(hasMethodName).toBe(true); + }); }); describe("LogicalExpression", () => { diff --git a/packages/instrumentor/plugins/codeCoverage.ts b/packages/instrumentor/plugins/codeCoverage.ts index 
2afcf153f..b7df4f19a 100644 --- a/packages/instrumentor/plugins/codeCoverage.ts +++ b/packages/instrumentor/plugins/codeCoverage.ts @@ -14,111 +14,62 @@ * limitations under the License. */ -import { NodePath, PluginTarget, types } from "@babel/core"; -import { - BlockStatement, - ConditionalExpression, - Expression, - ExpressionStatement, - Function, - IfStatement, - isBlockStatement, - isLogicalExpression, - LogicalExpression, - Loop, - Statement, - SwitchStatement, - TryStatement, -} from "@babel/types"; +import { PluginTarget, types } from "@babel/core"; import { EdgeIdStrategy } from "../edgeIdStrategy"; -export function codeCoverage(idStrategy: EdgeIdStrategy): () => PluginTarget { - function addCounterToStmt(stmt: Statement): BlockStatement { - const counterStmt = makeCounterIncStmt(); - if (isBlockStatement(stmt)) { - const br = stmt as BlockStatement; - br.body.unshift(counterStmt); - return br; - } else { - return types.blockStatement([counterStmt, stmt]); - } - } +import { + EdgeLocation, + makeCoverageVisitor, + StringInterner, +} from "./coverageVisitor"; +import type { EdgeEntry } from "./esmCodeCoverage"; - function makeCounterIncStmt(): ExpressionStatement { - return types.expressionStatement(makeCounterIncExpr()); - } +export interface CjsCoverageResult { + plugin: () => PluginTarget; + /** Deduplicated function name table accumulated so far. */ + funcNames: () => string[]; + /** Edge entries accumulated since the last clear(). */ + edgeEntries: () => EdgeEntry[]; + /** Reset accumulated entries — call after registering each file's locations. 
*/ + clear: () => void; +} - function makeCounterIncExpr(): Expression { - return types.callExpression( - types.identifier("Fuzzer.coverageTracker.incrementCounter"), - [types.numericLiteral(idStrategy.nextEdgeId())], - ); - } +export function cjsCoverage(idStrategy: EdgeIdStrategy): CjsCoverageResult { + const funcNames = new StringInterner(); + const entries: EdgeEntry[] = []; - return () => { - return { - visitor: { - Function(path: NodePath) { - if (isBlockStatement(path.node.body)) { - const bodyStmt = path.node.body as BlockStatement; - if (bodyStmt) { - bodyStmt.body.unshift(makeCounterIncStmt()); - } - } - }, - IfStatement(path: NodePath) { - path.node.consequent = addCounterToStmt(path.node.consequent); - if (path.node.alternate) { - path.node.alternate = addCounterToStmt(path.node.alternate); - } - path.insertAfter(makeCounterIncStmt()); - }, - SwitchStatement(path: NodePath) { - path.node.cases.forEach((caseStmt) => - caseStmt.consequent.unshift(makeCounterIncStmt()), - ); - path.insertAfter(makeCounterIncStmt()); - }, - Loop(path: NodePath) { - path.node.body = addCounterToStmt(path.node.body); - path.insertAfter(makeCounterIncStmt()); - }, - TryStatement(path: NodePath) { - const catchStmt = path.node.handler; - if (catchStmt) { - catchStmt.body.body.unshift(makeCounterIncStmt()); - } - path.insertAfter(makeCounterIncStmt()); - }, - LogicalExpression(path: NodePath) { - if (!isLogicalExpression(path.node.left)) { - path.node.left = types.sequenceExpression([ - makeCounterIncExpr(), - path.node.left, - ]); - } - if (!isLogicalExpression(path.node.right)) { - path.node.right = types.sequenceExpression([ - makeCounterIncExpr(), - path.node.right, - ]); - } - }, - ConditionalExpression(path: NodePath) { - path.node.consequent = types.sequenceExpression([ - makeCounterIncExpr(), - path.node.consequent, - ]); - path.node.alternate = types.sequenceExpression([ - makeCounterIncExpr(), - path.node.alternate, - ]); - if (isBlockStatement(path.parent)) { - 
path.insertAfter(makeCounterIncStmt()); - } - }, - }, - }; + const onEdge = (loc: EdgeLocation): void => { + const id = idStrategy.peekNextEdgeId(); + entries.push([ + id, + loc.line, + loc.col, + funcNames.intern(loc.func), + loc.isFuncEntry ? 1 : 0, + ]); }; + + return { + plugin: () => ({ + visitor: makeCoverageVisitor( + () => + types.callExpression( + types.identifier("Fuzzer.coverageTracker.incrementCounter"), + [types.numericLiteral(idStrategy.nextEdgeId())], + ), + onEdge, + ), + }), + funcNames: () => funcNames.strings(), + edgeEntries: () => entries, + clear: () => { + entries.length = 0; + funcNames.clear(); + }, + }; +} + +export function codeCoverage(idStrategy: EdgeIdStrategy): () => PluginTarget { + return cjsCoverage(idStrategy).plugin; } diff --git a/packages/instrumentor/plugins/coverageVisitor.ts b/packages/instrumentor/plugins/coverageVisitor.ts new file mode 100644 index 000000000..8453f9477 --- /dev/null +++ b/packages/instrumentor/plugins/coverageVisitor.ts @@ -0,0 +1,261 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * Shared coverage instrumentation visitor. + * + * Both the CJS instrumentor (incrementCounter calls) and the ESM + * instrumentor (direct array writes) inject counters at the same + * AST locations. This module captures that shared visitor shape + * and lets each variant supply its own expression generator. 
+ */ + +import { NodePath, types, Visitor } from "@babel/core"; +import { + BlockStatement, + ConditionalExpression, + Expression, + ExpressionStatement, + Function, + IfStatement, + isBlockStatement, + isLogicalExpression, + LogicalExpression, + Loop, + Node, + Statement, + SwitchStatement, + TryStatement, +} from "@babel/types"; + +export interface EdgeLocation { + line: number; + col: number; + func: string; + isFuncEntry: boolean; +} + +/** Map-backed string interner for compact edge-location serialization. */ +export class StringInterner { + private readonly table: string[] = []; + private readonly index = new Map<string, number>(); + + intern(s: string): number { + let idx = this.index.get(s); + if (idx === undefined) { + idx = this.table.length; + this.table.push(s); + this.index.set(s, idx); + } + return idx; + } + + strings(): string[] { + return this.table; + } + + clear(): void { + this.table.length = 0; + this.index.clear(); + } +} + +type Loc = { line: number; column: number } | null | undefined; + +function propertyKeyName(key: Node, computed: boolean): string | null { + if (types.isIdentifier(key)) return key.name; + if (types.isStringLiteral(key)) return key.value; + if (types.isNumericLiteral(key)) return String(key.value); + if (types.isBigIntLiteral(key)) return key.value; + if (types.isPrivateName(key) && types.isIdentifier(key.id)) { + return `#${key.id.name}`; + } + return computed ?
"" : null; +} + +function assignmentTargetName(target: Node): string | null { + if (types.isIdentifier(target)) return target.name; + if (!types.isMemberExpression(target)) return null; + return propertyKeyName(target.property, target.computed); +} + +function enclosingClassName(fn: NodePath): string | null { + if (!(fn.isClassMethod() || fn.isClassPrivateMethod())) { + return null; + } + + const classPath = fn.findParent( + (parent) => parent.isClassDeclaration() || parent.isClassExpression(), + ); + if (!classPath) return null; + + if (classPath.isClassDeclaration() || classPath.isClassExpression()) { + if (classPath.node.id) { + return classPath.node.id.name; + } + } + + const parent = classPath.parentPath; + if (parent?.isVariableDeclarator() && types.isIdentifier(parent.node.id)) { + return parent.node.id.name; + } + if (parent?.isAssignmentExpression()) { + return assignmentTargetName(parent.node.left); + } + if (parent?.isObjectProperty() || parent?.isClassProperty()) { + return propertyKeyName(parent.node.key, !!parent.node.computed); + } + + return null; +} + +function enclosingFuncName(path: NodePath): string { + const fn = path.isFunction() ? path : path.getFunctionParent(); + if (!fn) return ""; + const node = fn.node; + if (types.isFunctionDeclaration(node) && node.id) return node.id.name; + if (types.isFunctionExpression(node) && node.id) return node.id.name; + if (fn.isClassMethod() || fn.isClassPrivateMethod()) { + const name = propertyKeyName(fn.node.key, !!fn.node.computed); + if (name) { + const className = enclosingClassName(fn); + return className ? 
`${className}.${name}` : name; + } + } + if (fn.isObjectMethod()) { + const name = propertyKeyName(fn.node.key, !!fn.node.computed); + if (name) return name; + } + + const parent = fn.parentPath; + if (parent?.isVariableDeclarator() && types.isIdentifier(parent.node.id)) + return parent.node.id.name; + if (parent?.isAssignmentExpression()) { + const name = assignmentTargetName(parent.node.left); + if (name) return name; + } + if (parent?.isObjectProperty() || parent?.isClassProperty()) { + const name = propertyKeyName(parent.node.key, !!parent.node.computed); + if (name) return name; + } + return ""; +} + +/** + * Build a Babel visitor that inserts a counter expression at every + * branch point. The caller decides what that expression looks like. + * + * When `onEdge` is provided it is called once per counter, receiving + * the source location and enclosing function name. This powers + * PC-to-source symbolization for libFuzzer's `-print_pcs` output. + */ +export function makeCoverageVisitor( + makeCounterExpr: () => Expression, + onEdge?: (loc: EdgeLocation) => void, +): Visitor { + /** @param locNode AST node whose `.loc` supplies the source position. */ + function emitCounter( + path: NodePath, + locNode: Node, + isFuncEntry = false, + ): Expression { + if (onEdge) { + const loc: Loc = locNode.loc?.start; + onEdge({ + line: loc?.line ?? 0, + col: loc?.column ?? 
0, + func: enclosingFuncName(path), + isFuncEntry, + }); + } + return makeCounterExpr(); + } + + function makeStmt( + path: NodePath, + locNode: Node, + isFuncEntry = false, + ): ExpressionStatement { + return types.expressionStatement(emitCounter(path, locNode, isFuncEntry)); + } + + function wrapWithCounter(path: NodePath, stmt: Statement): BlockStatement { + const counter = makeStmt(path, stmt); + if (isBlockStatement(stmt)) { + stmt.body.unshift(counter); + return stmt; + } + return types.blockStatement([counter, stmt]); + } + + return { + Function(path: NodePath<Function>) { + if (isBlockStatement(path.node.body)) { + path.node.body.body.unshift(makeStmt(path, path.node, true)); + } + }, + IfStatement(path: NodePath<IfStatement>) { + path.node.consequent = wrapWithCounter(path, path.node.consequent); + if (path.node.alternate) { + path.node.alternate = wrapWithCounter(path, path.node.alternate); + } + path.insertAfter(makeStmt(path, path.node)); + }, + SwitchStatement(path: NodePath<SwitchStatement>) { + for (const caseClause of path.node.cases) { + caseClause.consequent.unshift(makeStmt(path, caseClause)); + } + path.insertAfter(makeStmt(path, path.node)); + }, + Loop(path: NodePath<Loop>) { + path.node.body = wrapWithCounter(path, path.node.body); + path.insertAfter(makeStmt(path, path.node)); + }, + TryStatement(path: NodePath<TryStatement>) { + if (path.node.handler) { + path.node.handler.body.body.unshift(makeStmt(path, path.node.handler)); + } + path.insertAfter(makeStmt(path, path.node)); + }, + LogicalExpression(path: NodePath<LogicalExpression>) { + if (!isLogicalExpression(path.node.left)) { + path.node.left = types.sequenceExpression([ + emitCounter(path, path.node), + path.node.left, + ]); + } + if (!isLogicalExpression(path.node.right)) { + path.node.right = types.sequenceExpression([ + emitCounter(path, path.node), + path.node.right, + ]); + } + }, + ConditionalExpression(path: NodePath<ConditionalExpression>) { + path.node.consequent = types.sequenceExpression([ + emitCounter(path, path.node), + path.node.consequent, + ]); + path.node.alternate
= types.sequenceExpression([ + emitCounter(path, path.node), + path.node.alternate, + ]); + if (isBlockStatement(path.parent)) { + path.insertAfter(makeStmt(path, path.node)); + } + }, + }; +} diff --git a/packages/instrumentor/plugins/esmCodeCoverage.test.ts b/packages/instrumentor/plugins/esmCodeCoverage.test.ts new file mode 100644 index 000000000..da2b3c606 --- /dev/null +++ b/packages/instrumentor/plugins/esmCodeCoverage.test.ts @@ -0,0 +1,265 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import { PluginItem, transformSync } from "@babel/core"; + +import { compareHooks } from "./compareHooks"; +import { esmCodeCoverage } from "./esmCodeCoverage"; +import { removeIndentation } from "./testhelpers"; + +function transform( + code: string, + extraPlugins: PluginItem[] = [], +): { code: string; edgeCount: number } { + const coverage = esmCodeCoverage(); + const result = transformSync(removeIndentation(code), { + filename: "test-module.mjs", + plugins: [coverage.plugin, ...extraPlugins], + }); + return { + code: removeIndentation(result?.code), + edgeCount: coverage.edgeCount(), + }; +} + +describe("ESM code coverage instrumentation", () => { + it("should emit direct array writes, not incrementCounter", () => { + const { code, edgeCount } = transform(` + |function foo() { + | return 1; + |}`); + + expect(code).toContain("__jazzer_cov[0]"); + expect(code).not.toContain("incrementCounter"); + expect(edgeCount).toBe(1); + }); + + it("should implement NeverZero via % 255 + 1", () => { + const { code } = transform(` + |function foo() { + | return 1; + |}`); + + expect(code).toContain("% 255"); + expect(code).toContain("+ 1"); + // Must NOT contain || or ?: to avoid infinite visitor recursion. 
+ expect(code).not.toMatch(/\|\||[?:]/); + }); + + it("should assign sequential module-local IDs", () => { + const { code, edgeCount } = transform(` + |function foo() { + | if (a) { + | return 1; + | } else { + | return 2; + | } + |}`); + + // Function body, if-consequent, if-alternate, after-if + expect(edgeCount).toBe(4); + expect(code).toContain("__jazzer_cov[0]"); + expect(code).toContain("__jazzer_cov[1]"); + expect(code).toContain("__jazzer_cov[2]"); + expect(code).toContain("__jazzer_cov[3]"); + }); + + it("should instrument all branch types", () => { + const { edgeCount } = transform(` + |function foo(x) { + | if (x > 0) { return 1; } + | switch (x) { + | case -1: return -1; + | default: return 0; + | } + | for (let i = 0; i < x; i++) { sum += i; } + | try { bar(); } catch (e) { log(e); } + | const y = x > 0 ? 1 : 0; + | const z = a || b; + |}`); + + // This is a smoke test -- the exact count depends on how + // many edges each construct produces. We just verify the + // number is reasonable (> 10 for this code) and non-zero. + expect(edgeCount).toBeGreaterThan(10); + }); + + it("should start edge IDs at 0 for each new module", () => { + const first = transform(`|function a() { return 1; }`); + const second = transform(`|function b() { return 2; }`); + + // Both modules should use __jazzer_cov[0] since IDs are + // module-local, not global. 
+ expect(first.code).toContain("__jazzer_cov[0]"); + expect(second.code).toContain("__jazzer_cov[0]"); + expect(first.edgeCount).toBe(1); + expect(second.edgeCount).toBe(1); + }); + + it("should return 0 edges for code with no branches", () => { + const { edgeCount } = transform(`|const x = 42;`); + expect(edgeCount).toBe(0); + }); + + it("should mark function-entry edges for print_funcs", () => { + const coverage = esmCodeCoverage(); + transformSync( + removeIndentation(` + |function f(x) { + | if (x) return 1; + | return 2; + |}`), + { + filename: "test-module.mjs", + plugins: [coverage.plugin], + }, + ); + + const entries = coverage.edgeEntries(); + expect(entries.length).toBeGreaterThan(0); + expect(entries[0][4]).toBe(1); + expect(entries.slice(1).every((entry) => entry[4] === 0)).toBe(true); + }); + + it("should infer class method names", () => { + const coverage = esmCodeCoverage(); + transformSync( + removeIndentation(` + |class Parser { + | makeFilter(stream, name, maybeLength, params) { + | if (maybeLength === 0) return null; + | return stream; + | } + |}`), + { + filename: "test-module.mjs", + plugins: [coverage.plugin], + }, + ); + + const names = coverage.funcNames(); + const hasMethodName = coverage + .edgeEntries() + .some((entry) => names[entry[3]] === "Parser.makeFilter"); + expect(hasMethodName).toBe(true); + }); + + describe("combined with compareHooks", () => { + it("should replace string-literal === with traceStrCmp", () => { + const { code } = transform( + ` + |export function check(s) { + | return s === "secret"; + |}`, + [compareHooks], + ); + + // The === against a string literal must be replaced. + expect(code).toContain("Fuzzer.tracer.traceStrCmp"); + expect(code).toContain('"secret"'); + expect(code).toContain('"==="'); + // The raw === should be gone from the check expression. 
+ expect(code).not.toMatch(/s\s*===\s*"secret"/); + }); + + it("should replace number-literal === with traceNumberCmp", () => { + const { code } = transform( + ` + |export function classify(n) { + | if (n > 10) return "big"; + | if (n === 0) return "zero"; + | return "small"; + |}`, + [compareHooks], + ); + + expect(code).toContain("Fuzzer.tracer.traceNumberCmp"); + }); + + it("should NOT hook variable-to-variable comparisons", () => { + // compareHooks only fires when one operand is a literal. + // Comparing two identifiers is not hooked (same as CJS). + const { code } = transform( + ` + |const target = "something"; + |export function check(s) { + | return s === target; + |}`, + [compareHooks], + ); + + expect(code).not.toContain("Fuzzer.tracer.traceStrCmp"); + }); + + it("should hook slice-then-compare patterns", () => { + // This is the pattern used in the integration tests. + const { code } = transform( + ` + |export function verify(s) { + | if (s.slice(0, 16) === "a]3;d*F!pk29&bAc") { + | throw new Error("found it"); + | } + |}`, + [compareHooks], + ); + + expect(code).toContain("Fuzzer.tracer.traceStrCmp"); + expect(code).toContain("a]3;d*F!pk29&bAc"); + }); + + it("should produce both coverage and hooks together", () => { + const { code, edgeCount } = transform( + ` + |export function check(s) { + | if (s === "secret") { + | return true; + | } + | return false; + |}`, + [compareHooks], + ); + + // Coverage counters from esmCodeCoverage + expect(code).toContain("__jazzer_cov["); + expect(edgeCount).toBeGreaterThan(0); + // Compare hooks + expect(code).toContain("Fuzzer.tracer.traceStrCmp"); + }); + }); + + describe("logical expression handling", () => { + it("should instrument nested logical expressions", () => { + const { code, edgeCount } = transform(` + |const x = a || b && c;`); + + // Should have instrumented the leaves of the logical tree. 
+ expect(edgeCount).toBeGreaterThanOrEqual(2); + expect(code).toContain("__jazzer_cov["); + }); + + it("should not infinite-loop on complex logical chains", () => { + // This would cause infinite recursion if the counter + // expression contained || or &&. + const { code, edgeCount } = transform(` + |function f() { + | return a || b || c || d || e; + |}`); + + expect(edgeCount).toBeGreaterThan(0); + expect(code).toContain("__jazzer_cov["); + }); + }); +}); diff --git a/packages/instrumentor/plugins/esmCodeCoverage.ts b/packages/instrumentor/plugins/esmCodeCoverage.ts new file mode 100644 index 000000000..242f385b9 --- /dev/null +++ b/packages/instrumentor/plugins/esmCodeCoverage.ts @@ -0,0 +1,118 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * Coverage plugin for ES modules. + * + * Unlike the CJS variant (which calls Fuzzer.coverageTracker.incrementCounter), + * this plugin emits direct writes to a module-local Uint8Array: + * + * __jazzer_cov[id] = (__jazzer_cov[id] % 255) + 1 + * + * Each module gets its own small counter buffer, registered independently + * with libFuzzer. Edge IDs start at 0 per module -- no global counter + * coordination is needed. 
+ */ + +import { PluginTarget, types } from "@babel/core"; +import { Expression } from "@babel/types"; + +import { + EdgeLocation, + makeCoverageVisitor, + StringInterner, +} from "./coverageVisitor"; + +const COUNTER_ARRAY = "__jazzer_cov"; + +/** + * Build a NeverZero increment expression: + * + * __jazzer_cov[id] = (__jazzer_cov[id] % 255) + 1 + * + * Values cycle 0 → 1 → 2 → … → 255 → 1 → 2 → …, never landing + * on zero (which libFuzzer would interpret as "edge not hit"). + * + * We deliberately avoid `|| 1` because Babel would re-visit the + * generated LogicalExpression and trigger infinite recursion in + * the coverage visitor. The `% 255 + 1` form uses only binary + * arithmetic, which the visitor does not handle. + */ +function neverZeroIncrement(id: number): Expression { + const element = () => + types.memberExpression( + types.identifier(COUNTER_ARRAY), + types.numericLiteral(id), + true, // computed: __jazzer_cov[N] + ); + + return types.assignmentExpression( + "=", + element(), + types.binaryExpression( + "+", + types.binaryExpression("%", element(), types.numericLiteral(255)), + types.numericLiteral(1), + ), + ); +} + +/** + * Compact per-edge location: + * [localEdgeId, line, col, funcIndex, isFuncEntry]. + */ +export type EdgeEntry = [number, number, number, number, number]; + +export interface EsmCoverageResult { + plugin: () => PluginTarget; + edgeCount: () => number; + /** Deduplicated function name table for this module. */ + funcNames: () => string[]; + /** Flat edge-to-source entries for this module. */ + edgeEntries: () => EdgeEntry[]; +} + +/** + * Create a fresh ESM coverage plugin for one module. + * + * Call this once per module being instrumented. After the Babel + * transform finishes, `edgeCount()` returns the number of counters + * the module needs so the loader can emit the right preamble. 
+ */ +export function esmCodeCoverage(): EsmCoverageResult { + let count = 0; + const funcNames = new StringInterner(); + const entries: EdgeEntry[] = []; + + const onEdge = (loc: EdgeLocation): void => { + entries.push([ + count, + loc.line, + loc.col, + funcNames.intern(loc.func), + loc.isFuncEntry ? 1 : 0, + ]); + }; + + return { + plugin: () => ({ + visitor: makeCoverageVisitor(() => neverZeroIncrement(count++), onEdge), + }), + edgeCount: () => count, + funcNames: () => funcNames.strings(), + edgeEntries: () => entries, + }; +} diff --git a/packages/jest-runner/readme.md b/packages/jest-runner/readme.md index 73811c4eb..c1f474ca1 100644 --- a/packages/jest-runner/readme.md +++ b/packages/jest-runner/readme.md @@ -1,14 +1,14 @@ # Jest Fuzz Runner -Custom Jest runner to executes fuzz tests via Jazzer.js, detailed documentation +Custom Jest runner to execute fuzz tests via Jazzer.js. Detailed documentation can be found at the -[Jazzer.js GitHub page](https://github.com/CodeIntelligenceTesting/jazzer.js-commercial). +[Jazzer.js GitHub page](https://github.com/CodeIntelligenceTesting/jazzer.js). A fuzz test in Jest, in this case written in TypeScript, would look similar to the following example: ```typescript -// file: "Target.fuzz.ts +// file: "Target.fuzz.ts" // Import the fuzz testing extension to compile TS code. import "@jazzer.js/jest-runner"; import * as target from "./target"; diff --git a/tests/esm_cjs_mixed/cjs-check.cjs b/tests/esm_cjs_mixed/cjs-check.cjs new file mode 100644 index 000000000..dabc37d9c --- /dev/null +++ b/tests/esm_cjs_mixed/cjs-check.cjs @@ -0,0 +1,27 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * CJS module — instrumented via hookRequire (require.extensions). + * + * The 10-byte random string cannot be brute-forced; the fuzzer + * needs the CJS compare hooks to discover it. + */ +function checkCjs(s) { + return s === "r4Tp!mZ@8s"; +} + +module.exports = { checkCjs: checkCjs }; diff --git a/tests/esm_cjs_mixed/esm-check.mjs b/tests/esm_cjs_mixed/esm-check.mjs new file mode 100644 index 000000000..17f377dfa --- /dev/null +++ b/tests/esm_cjs_mixed/esm-check.mjs @@ -0,0 +1,25 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * ESM module — instrumented via the ESM loader hook (module.register). + * + * The 10-byte random string cannot be brute-forced; the fuzzer + * needs the ESM compare hooks to discover it. 
+ */ +export function checkEsm(s) { + return s === "Vj9!xR2#nP"; +} diff --git a/tests/esm_cjs_mixed/esm_cjs_mixed.test.js b/tests/esm_cjs_mixed/esm_cjs_mixed.test.js new file mode 100644 index 000000000..c0e7519fc --- /dev/null +++ b/tests/esm_cjs_mixed/esm_cjs_mixed.test.js @@ -0,0 +1,50 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +const { FuzzTestBuilder, cleanCrashFilesIn } = require("../helpers.js"); + +// module.register() is needed for ESM loader hooks. +const [major, minor] = process.versions.node.split(".").map(Number); +const supportsEsmHooks = major > 20 || (major === 20 && minor >= 6); + +const describeOrSkip = supportsEsmHooks ? describe : describe.skip; + +describeOrSkip("Mixed CJS + ESM instrumentation", () => { + afterAll(async () => { + await cleanCrashFilesIn(__dirname); + }); + + it("should find a secret split across a CJS and an ESM module", () => { + // The fuzz target imports checkCjs from cjs-check.cjs + // (instrumented via hookRequire) and checkEsm from + // esm-check.mjs (instrumented via the ESM loader hook). + // Both are 10-byte random string literals that can only + // be discovered through their respective compare hooks. 
+ const fuzzTest = new FuzzTestBuilder() + .fuzzEntryPoint("fuzz") + .fuzzFile("fuzz.mjs") + .dir(__dirname) + .sync(true) + .disableBugDetectors([".*"]) + .expectedErrors("Error") + .runs(5000000) + .seed(111994470) + .build(); + + fuzzTest.execute(); + expect(fuzzTest.stderr).toContain("Found the mixed CJS+ESM secret!"); + }); +}); diff --git a/tests/esm_cjs_mixed/fuzz.mjs b/tests/esm_cjs_mixed/fuzz.mjs new file mode 100644 index 000000000..a3b862d88 --- /dev/null +++ b/tests/esm_cjs_mixed/fuzz.mjs @@ -0,0 +1,40 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * ESM fuzz target that imports from BOTH a CJS module and an ESM + * module. Each function checks a 10-character random string literal: + * + * - cjs-check.cjs verifies bytes 0..9 (hookRequire path) + * - esm-check.mjs verifies bytes 10..19 (ESM loader path) + * + * Both functions are called unconditionally so that both compare + * hooks fire on every fuzzing iteration, feeding libFuzzer + * dictionary entries from both instrumentation paths.
+ */ + +import { checkCjs } from "./cjs-check.cjs"; +import { checkEsm } from "./esm-check.mjs"; +import { FuzzedDataProvider } from "@jazzer.js/core"; + +export function fuzz(data) { + const fdp = new FuzzedDataProvider(data); + const cjsOk = checkCjs(fdp.consumeString(10)); + const esmOk = checkEsm(fdp.consumeString(10)); + if (cjsOk && esmOk) { + throw new Error("Found the mixed CJS+ESM secret!"); + } +} diff --git a/tests/esm_cjs_mixed/package.json b/tests/esm_cjs_mixed/package.json new file mode 100644 index 000000000..be04fa19f --- /dev/null +++ b/tests/esm_cjs_mixed/package.json @@ -0,0 +1,12 @@ +{ + "name": "jazzerjs-esm-cjs-mixed-test", + "version": "1.0.0", + "description": "Integration test: fuzzer finds a secret split across CJS and ESM modules", + "scripts": { + "fuzz": "jest", + "dryRun": "echo \"Skipped: requires Node >= 20.6 for ESM loader hooks\"" + }, + "devDependencies": { + "@jazzer.js/core": "file:../../packages/core" + } +} diff --git a/tests/esm_instrumentation/esm_instrumentation.test.js b/tests/esm_instrumentation/esm_instrumentation.test.js new file mode 100644 index 000000000..f638bf1ea --- /dev/null +++ b/tests/esm_instrumentation/esm_instrumentation.test.js @@ -0,0 +1,48 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +const { FuzzTestBuilder, cleanCrashFilesIn } = require("../helpers.js"); + +// module.register() is needed for ESM loader hooks. 
+const [major, minor] = process.versions.node.split(".").map(Number); +const supportsEsmHooks = major > 20 || (major === 20 && minor >= 6); + +const describeOrSkip = supportsEsmHooks ? describe : describe.skip; + +describeOrSkip("ESM instrumentation", () => { + afterAll(async () => { + await cleanCrashFilesIn(__dirname); + }); + + it("should find a 16-byte string via compare hooks in an ES module", () => { + // target.mjs compares against the literal "a]3;d*F!pk29&bAc". + // Without the ESM compare hooks replacing === with traceStrCmp, + // libFuzzer cannot discover a 16-byte random string. + const fuzzTest = new FuzzTestBuilder() + .fuzzEntryPoint("fuzz") + .fuzzFile("fuzz.mjs") + .dir(__dirname) + .sync(true) + .disableBugDetectors([".*"]) + .expectedErrors("Error") + .runs(5000000) + .seed(111994470) + .build(); + + fuzzTest.execute(); + expect(fuzzTest.stderr).toContain("Found the ESM secret!"); + }); +}); diff --git a/tests/esm_instrumentation/fuzz.mjs b/tests/esm_instrumentation/fuzz.mjs new file mode 100644 index 000000000..1606be504 --- /dev/null +++ b/tests/esm_instrumentation/fuzz.mjs @@ -0,0 +1,24 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import { checkSecret } from "./target.mjs"; + +/** + * @param { Buffer } data + */ +export function fuzz(data) { + checkSecret(data.toString()); +} diff --git a/tests/esm_instrumentation/package.json b/tests/esm_instrumentation/package.json new file mode 100644 index 000000000..81c6b62ac --- /dev/null +++ b/tests/esm_instrumentation/package.json @@ -0,0 +1,12 @@ +{ + "name": "jazzerjs-esm-instrumentation-test", + "version": "1.0.0", + "description": "Integration test: coverage-guided fuzzing of a pure ES module", + "scripts": { + "fuzz": "jest", + "dryRun": "echo \"Skipped: requires Node >= 20.6 for ESM loader hooks\"" + }, + "devDependencies": { + "@jazzer.js/core": "file:../../packages/core" + } +} diff --git a/tests/esm_instrumentation/target.mjs b/tests/esm_instrumentation/target.mjs new file mode 100644 index 000000000..42efbd3ad --- /dev/null +++ b/tests/esm_instrumentation/target.mjs @@ -0,0 +1,27 @@ +/* + * Copyright 2026 Code Intelligence GmbH + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * A pure ES module with a string-literal comparison. The compare + * hooks replace the === with a traceStrCmp call that leaks the + * literal to libFuzzer's mutation engine. Without that feedback + * a 16-byte random string cannot be found by brute force. + */ +export function checkSecret(s) { + if (s === "a]3;d*F!pk29&bAc") { + throw new Error("Found the ESM secret!"); + } +}
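Reviewer note: the NeverZero update documented in `esmCodeCoverage.ts` (`__jazzer_cov[id] = (__jazzer_cov[id] % 255) + 1`) can be checked in isolation. The following is a minimal sketch, assuming a plain `Uint8Array` stands in for the module-local `__jazzer_cov` buffer; `bumpCounter`, `counters`, and `seen` are hypothetical names used only for illustration and are not part of this PR.

```typescript
// Sketch of the per-edge update the ESM plugin is described as emitting:
//   __jazzer_cov[id] = (__jazzer_cov[id] % 255) + 1
// `bumpCounter` is a hypothetical helper, not a Jazzer.js API.
function bumpCounter(counters: Uint8Array, id: number): void {
  counters[id] = (counters[id] % 255) + 1;
}

// One edge, initially 0, which libFuzzer reads as "edge not hit".
const counters = new Uint8Array(1);
const seen: number[] = [];

// Hit the edge 257 times and record the counter after each hit.
for (let hit = 0; hit < 257; hit++) {
  bumpCounter(counters, 0);
  seen.push(counters[0]);
}

console.log(seen[0]); // value after the first hit
console.log(seen[254]); // value after 255 hits (the maximum)
console.log(seen[255]); // value after the wrap-around
console.log(seen.includes(0)); // whether the counter ever returned to zero
```

Because the expression uses only `%` and `+`, it contains no `LogicalExpression` or `ConditionalExpression` node, which matches the PR's stated reason for avoiding `|| 1` in generated code.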