feat(voice): add answering machine detection helper #1215
Merged
toubatbrian merged 11 commits into main on Apr 15, 2026
🦋 Changeset detected. Latest commit: e262c05. The changes in this PR will be included in the next version bump. This PR includes changesets to release 24 packages.
Force-pushed from 960d8a4 to bab9d91
toubatbrian reviewed on Apr 14, 2026
Comment on lines +237 to +246
    const parsed = this.parseDetection(rawResponse);
    return {
      ...parsed,
      transcript,
      rawResponse,
      isMachine:
        parsed.category === AMDCategory.MACHINE_IVR ||
        parsed.category === AMDCategory.MACHINE_VM ||
        parsed.category === AMDCategory.MACHINE_UNAVAILABLE,
    };
Why don't we use a tool call here, as the Python implementation does?
toubatbrian reviewed on Apr 14, 2026
Comment on lines +249 to +270
    private parseDetection(rawResponse: string): Pick<AMDResult, 'category' | 'reason'> {
      const normalized = rawResponse.trim();
      const jsonStart = normalized.indexOf('{');
      const jsonEnd = normalized.lastIndexOf('}');
      const jsonChunk =
        jsonStart >= 0 && jsonEnd >= jsonStart
          ? normalized.slice(jsonStart, jsonEnd + 1)
          : normalized;

      try {
        const parsed = JSON.parse(jsonChunk) as { category?: string; reason?: string };
        return {
          category: this.normalizeCategory(parsed.category),
          reason: parsed.reason?.trim() || 'No reason provided.',
        };
      } catch {
        return {
          category: AMDCategory.UNCERTAIN,
          reason: normalized || 'Failed to parse AMD model response.',
        };
      }
    }
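For trying the extraction logic against sample model output, here is a standalone sketch of the same approach. The category string values and the `normalizeCategory` stand-in are assumptions made for illustration; the real `AMDCategory` enum and `normalizeCategory` are not shown in this excerpt and may differ.

```typescript
// Assumed string values; the real AMDCategory enum is not shown in this excerpt.
const AMD_CATEGORIES = [
  'human',
  'machine-ivr',
  'machine-vm',
  'machine-unavailable',
  'uncertain',
] as const;
type Category = (typeof AMD_CATEGORIES)[number];

const KNOWN_CATEGORIES = new Set<string>(AMD_CATEGORIES);

// Hypothetical stand-in for normalizeCategory(): anything unrecognized maps to 'uncertain'.
function normalizeCategory(value: unknown): Category {
  if (typeof value !== 'string') return 'uncertain';
  const candidate = value.trim().toLowerCase();
  return KNOWN_CATEGORIES.has(candidate) ? (candidate as Category) : 'uncertain';
}

function parseDetection(rawResponse: string): { category: Category; reason: string } {
  const normalized = rawResponse.trim();
  // Take the outermost {...} span so surrounding chatter from the model is ignored.
  const jsonStart = normalized.indexOf('{');
  const jsonEnd = normalized.lastIndexOf('}');
  const jsonChunk =
    jsonStart >= 0 && jsonEnd >= jsonStart
      ? normalized.slice(jsonStart, jsonEnd + 1)
      : normalized;

  try {
    const parsed = JSON.parse(jsonChunk) as { category?: unknown; reason?: unknown };
    const reason = typeof parsed.reason === 'string' ? parsed.reason.trim() : '';
    return {
      category: normalizeCategory(parsed.category),
      reason: reason || 'No reason provided.',
    };
  } catch {
    // Unparseable output falls back to 'uncertain' and keeps the raw text as the reason.
    return {
      category: 'uncertain',
      reason: normalized || 'Failed to parse AMD model response.',
    };
  }
}
```

With these assumptions, a response that wraps the JSON object in extra prose still parses, while plain text or an empty string falls through to the uncertain result.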
Refactor Answering Machine Detection to match Python's `_AMDClassifier`:

- Replace single 8s timeout with dual timers: no-speech (10s → MACHINE_UNAVAILABLE) and detection timeout (20s → UNCERTAIN), matching Python's NO_SPEECH_THRESHOLD and TIMEOUT constants
- Wire VAD via UserStateChanged events for speech start/end tracking
- Add short-greeting heuristic: speech ≤ 2.5s + 0.5s silence → HUMAN (skip LLM)
- Implement two-gate emit system: result only emits when both a verdict (LLM or heuristic) and silence gate (1.5s post-speech) are satisfied
- Add generation counter to discard stale LLM results when newer transcripts arrive
- Restructure from nested closures to class methods/fields for readability
- Extract isMachineCategory() helper and Set-based parseCategory() replacing identity switch statement
- Fix quadratic string concat in LLM stream (chunks[] + join)
- Fix aclose() to properly clean up timers and listeners

Made-with: Cursor
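The two-gate emit and generation counter described in this commit message can be sketched roughly as follows. This is not the PR's code: `TwoGateEmitter`, its method names, and the `Verdict` shape are hypothetical, and the timers (such as the 1.5s post-speech silence window) are left to the caller, who would invoke `onSilenceElapsed()` when the window expires. Clearing the pending verdict when a new transcript arrives is also an assumption.

```typescript
type Verdict = { category: string; reason: string };

// Hypothetical sketch: emits once, only when a current verdict exists AND the silence gate is open.
class TwoGateEmitter {
  private generation = 0;
  private verdict: Verdict | undefined = undefined;
  private silenceGateOpen = false;
  private emitted = false;
  private readonly onResult: (verdict: Verdict) => void;

  constructor(onResult: (verdict: Verdict) => void) {
    this.onResult = onResult;
  }

  // A newer transcript invalidates any classification still in flight.
  // Returns the generation token the caller passes back with its verdict.
  onTranscript(): number {
    this.generation += 1;
    this.verdict = undefined; // assumption: an older verdict no longer applies
    return this.generation;
  }

  // Speech resumed, so the silence gate closes again.
  onSpeechStart(): void {
    this.silenceGateOpen = false;
  }

  // The caller's post-speech silence timer fired.
  onSilenceElapsed(): void {
    this.silenceGateOpen = true;
    this.tryEmit();
  }

  // Returns false when the verdict belongs to an outdated generation and was dropped.
  onVerdict(generation: number, verdict: Verdict): boolean {
    if (generation !== this.generation) return false;
    this.verdict = verdict;
    this.tryEmit();
    return true;
  }

  private tryEmit(): void {
    if (this.emitted || this.verdict === undefined || !this.silenceGateOpen) return;
    this.emitted = true;
    this.onResult(this.verdict);
  }
}
```

In this sketch a verdict from either the LLM or a heuristic goes through `onVerdict()`, so both paths share the same staleness check and the same silence gate.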
@chenghao-mou I made some changes to your PR, mostly style improvements plus aligning the timeout handling with Python. Regarding tool calling vs. generating raw JSON: in my testing, raw JSON worked in 100 out of 100 runs, so it's probably fine to leave it as it is for now. Let me know your thoughts as well!
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…it/agents-js into codex/issue-1204-amd-draft
toubatbrian approved these changes on Apr 15, 2026
Summary
Adds a JS/TS answering machine detection helper for voice sessions, following the Python AMD work in livekit/agents#4906 and the follow-up OTEL + authorization parity from #5376.
Changes
Testing
Notes
This is the JS counterpart to the Python AMD helper, adapted to the current JS voice/session architecture, with category names aligned to the Python implementation.