workbench-jupyter-with-llm app with MCP features #409

Draft

pantherman594 wants to merge 94 commits into master from mcp-app

Conversation

@pantherman594
Contributor

No description provided.

NavidZ and others added 30 commits January 26, 2026 16:12
- New feature: llm-context - generates CLAUDE.md for Claude Code auto-discovery
- Includes generate-context.sh script with embedded skill files
- Auto-runs on container startup to provide workspace context
- Updated workbench-jupyter-with-llm app to include the new feature

The CLAUDE.md file includes:
- Workspace metadata (name, ID, cloud platform, role)
- Resource paths and environment variables
- Data exploration cheatsheet
- Data persistence guidance
- MCP vs CLI usage guide
- Custom app creation skill
- Updated generate-context.sh with latest improvements
- Fixed devcontainer-feature.json (removed problematic postStartCommand)
- Improved install.sh with:
  - Auto-install jq if missing
  - Better error handling
  - Auto-run via .bashrc (runs in background on first terminal)
  - Checks if workspace is set before generating
- Updated README with usage examples
- Version 1.1.0
- Added .devcontainer/features/ symlinks for all features
- Added startupscript symlink
- Removed autorun option (now handled by feature's .bashrc trigger)
- App now uses the llm-context devcontainer feature

To test: Deploy with folder src/workbench-jupyter-with-llm
- Updated install.sh to match workbench-tools/gemini/wb-mcp-server patterns:
  - Added WORKDIR with cleanup trap
  - Added apt_get_update() and check_packages() helpers (sketched below)
  - Consistent variable naming and structure
  - Banner-style output messages

- Updated devcontainer-feature.json:
  - Removed containerEnv (not used by other features)
  - Updated userHomeDir default to match pattern
  - Bumped version to 1.2.0

- Updated README.md:
  - Added Options table matching gemini feature format
  - Added MCP Integration section
  - Added File Locations table
  - Consistent structure with other features
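
A sketch of the helper pattern referenced above, as commonly used in devcontainer features; the script's exact version may differ:

```bash
apt_get_update() {
  # Only hit the network if the apt lists are empty.
  if [ "$(find /var/lib/apt/lists/* -maxdepth 0 2>/dev/null | wc -l)" = "0" ]; then
    apt-get update -y
  fi
}

check_packages() {
  # Install the given packages only if any are missing.
  if ! dpkg -s "$@" >/dev/null 2>&1; then
    apt_get_update
    apt-get -y install --no-install-recommends "$@"
  fi
}

check_packages jq curl   # e.g. auto-install jq if missing
```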
Features referenced as ./.devcontainer/features/xxx in .devcontainer.json
are resolved from repo root, not the app folder. This matches the original
NavidZ repo structure.

- Created .devcontainer/features/ at repo root with symlinks to features/src/
- Removed .devcontainer/features/ from app folder (was incorrect location)
Issues fixed:
1. Bucket mounting: Removed startupscript symlink from app folder.
   Paths like ./startupscript/ resolve from repo root, not app folder.

2. Context generation timing: Now runs via postStartCommand AFTER
   startup scripts complete (auth + workspace setup done first).
   Removed .bashrc auto-trigger which ran too early.

Changes:
- Removed src/workbench-jupyter-with-llm/startupscript symlink
- Updated .devcontainer.json postStartCommand to run context generation
- Simplified install.sh (aliases only, no bashrc auto-trigger)
- Updated README with correct integration instructions
Bug: postStartCommand runs as root, so $HOME = /root. The script was
creating /root/.workbench/CLAUDE.md instead of /home/jupyter/.workbench/CLAUDE.md.

Fix: generate-context.sh now accepts the home directory as its first argument.
Priority: 1) $LLM_CONTEXT_HOME, 2) first arg, 3) /home/jupyter fallback, 4) $HOME (sketched below).

Updated:
- generate-context.sh: Accept home dir arg, smart fallback to /home/jupyter
- .devcontainer.json: Pass /home/jupyter to generate-context.sh
- install.sh: Set LLM_CONTEXT_HOME env var, aliases pass home dir
- README.md: Document the home directory argument
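
A hypothetical sketch of the resolution order described above; only LLM_CONTEXT_HOME and the /home/jupyter fallback are named in the commit, the rest is illustrative:

```bash
resolve_home() {
  if [ -n "${LLM_CONTEXT_HOME:-}" ]; then
    echo "$LLM_CONTEXT_HOME"        # 1) explicit env var wins
  elif [ -n "${1:-}" ]; then
    echo "$1"                       # 2) first positional argument
  elif [ -d /home/jupyter ]; then
    echo /home/jupyter              # 3) default Workbench login-user home
  else
    echo "$HOME"                    # 4) last resort: caller's HOME
  fi
}

TARGET_HOME="$(resolve_home "${1:-}")"
```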
The remount-on-restart.sh script may return a non-zero exit code even
when successful, which breaks the && chain. Changed to (sketched below):
- Use ; instead of && (run regardless of the previous exit code)
- Added || true to prevent postStartCommand failure
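
A sketch of the change; the script paths are illustrative, the real command lives in .devcontainer.json:

```bash
# Before: a non-zero exit from the remount script kills the chain.
#   ./startupscript/remount-on-restart.sh && /opt/llm-context/generate-context.sh
# After: ';' runs the second command regardless, and '|| true' keeps the
# postStartCommand itself from being reported as failed.
./startupscript/remount-on-restart.sh; /opt/llm-context/generate-context.sh /home/jupyter || true
```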
Added:
- Detailed validation table with specific files and what to check
- Common mistakes section
- Validation commands to run before deploy
- LLM response template for consistent validation output
- Clear instruction for LLM to verify ALL items before suggesting deploy
The postStartCommand was running but context generation wasn't working.
Moving it to postCreateCommand ensures it runs AFTER post-startup.sh
completes authentication and workspace setup.

Also added echo statements to make it visible in logs.
Learning from wb-mcp-server: create the file during install.sh so Claude
finds it immediately. The stub (sketched below):
- Tells Claude it's in Workbench
- Instructs to run 'generate-llm-context' for full context
- Lists available MCP tools
- Provides basic CLI commands

postCreateCommand still tries to generate full context, but if it fails,
Claude has the stub to work with.

This ensures ~/CLAUDE.md exists as soon as the container starts.
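
A minimal sketch of the stub written by install.sh; the wording and the USER_HOME variable are illustrative, but the contents listed above are what it covers:

```bash
cat > "${USER_HOME}/CLAUDE.md" <<'EOF'
# Verily Workbench (stub context)
You are running inside a Verily Workbench app.
Run `generate-llm-context` to generate the full workspace context.
Until then: MCP tools are available via the wb-mcp-server, and basic
`wb` CLI commands (e.g. `wb status`, `wb resource list`) work in the terminal.
EOF
```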
Architecture (sketched below):
- llm-context feature installs script to /opt/llm-context/
- post-startup.sh checks if feature is installed
- If yes, runs generate-context.sh AFTER auth is complete
- Uses RUN_AS_LOGIN_USER for correct file ownership

This ensures context generation runs at the right time (after auth)
while still using the devcontainer feature for installation.

Changes:
- startupscript/post-startup.sh: Added LLM context generation section
- Removed stub CLAUDE.md from install.sh (not needed)
- Simplified postCreateCommand
- postStartCommand still runs context gen for app restarts
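
A sketch of the post-startup.sh hook. RUN_AS_LOGIN_USER is named in the commit; its exact definition (typically a run-as-login-user wrapper taking a command string) is assumed here:

```bash
# Run context generation only if the llm-context feature is installed,
# as the login user so generated files have correct ownership.
if [ -x /opt/llm-context/generate-context.sh ]; then
  ${RUN_AS_LOGIN_USER} "/opt/llm-context/generate-context.sh /home/jupyter" || true
fi
```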
Based on working llm-context-feature branch, adds:
- 4 app templates (flask-api, streamlit-dashboard, rshiny-dashboard, file-processor)
- APP_TEMPLATES.md skill for template selection guidance
- Updated generate-context.sh with both skills embedded
- Updated CLAUDE.md template with decision flow for app creation
When generating URLs for apps/proxies inside Workbench, you must use:
https://workbench.verily.com/app/[UUID]/proxy/[PORT]/[PATH]

Common mistake: using localhost or custom domain patterns, which fail
with a 'Bad Request' error.
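
For illustration, with placeholder UUID and port values:

```bash
APP_UUID="123e4567-e89b-12d3-a456-426614174000"   # placeholder
echo "https://workbench.verily.com/app/${APP_UUID}/proxy/8080/index.html"
# Wrong (fails with 'Bad Request' through the proxy):
#   http://localhost:8080/index.html
```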
New tool that answers: 'What data collections exist and what resources belong to them?'

Implementation:
1. Gets all resources and identifies their sourceWorkspaceId
2. Looks up each source workspace to get the actual data collection name
3. Groups resources by their source data collection
4. Shows resources created directly in this workspace (no source)

Returns structured JSON with:
- dataCollections: map of collection name -> {sourceWorkspaceId, resources[]}
- localResources: resources created in this workspace
- summary: statistics

This eliminates the need for LLMs to manually piece together
resource lineage information.
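
A toy, runnable demonstration of the grouping idea (not the Go implementation, and it skips the source-workspace name lookup by keying on sourceWorkspaceId directly):

```bash
printf '%s\n' \
  '{"name":"cohort_a","sourceWorkspaceId":"ws-1"}' \
  '{"name":"notes","sourceWorkspaceId":null}' |
jq -cs '{
  dataCollections: (map(select(.sourceWorkspaceId != null))
                    | group_by(.sourceWorkspaceId)
                    | map({(.[0].sourceWorkspaceId): map(.name)})
                    | add // {}),
  localResources: map(select(.sourceWorkspaceId == null) | .name)
}'
# -> {"dataCollections":{"ws-1":["cohort_a"]},"localResources":["notes"]}
```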
CLAUDE.md changes:
- Renamed section: 'Workbench URLs, Dashboards & Interactive Content'
- Explains why file:// URLs don't work (JS blocked by browser)
- Shows how to use Python HTTP server for HTML dashboards
- Quick recipe for building interactive visualizations
- Common ports table and pro tips

CUSTOM_APP.md changes:
- Added file:// to list of wrong URL formats
- Added reference to CLAUDE.md for dashboard guidance

This helps users who ask to 'build a dashboard' or 'visualize data'
understand that they need an HTTP server, not direct file access.
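
A minimal example of the HTTP-server approach (directory and port are illustrative):

```bash
# Serve generated HTML over HTTP instead of opening file:// URLs.
cd ~/dashboards && python3 -m http.server 8080 --bind 0.0.0.0 &
# Then open it through the Workbench proxy, not localhost:
#   https://workbench.verily.com/app/<APP_UUID>/proxy/8080/dashboard.html
```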
…tep API flow

- Updated CLAUDE.md template to use correct MCP tool names:
  - get_resource -> workspace_list_data_collections
  - list_resources -> workspace_list_resources
  - query_bigquery -> bq_execute
  - run_workflow -> workflow_job_run

- Enhanced workspace_list_data_collections to use two-step API approach:
  1. List all resources to get IDs
  2. Get detailed info for each resource (includes resourceLineage)
  3. Look up source workspace names for data collection grouping

- URL construction matches existing MCP tools pattern
The wb status output may return a userFacingId instead of a UUID. The tool
now uses resolveWorkspaceId(), like other working tools, to convert the
workspace ID to a UUID before making API calls.
…de metadata

Key fixes:
- resourceLineage is an ARRAY, not an object
- resourceLineage is inside metadata, not at the top level
- Simplified: removed two-step approach since list endpoint includes lineage
- Now matches the working workspace_list_resources pattern
- Added 'How to Get the App UUID' section with command to get running app
- Added ⚡ LLM INSTRUCTION to never ask user for UUID, always get it automatically
- Updated Python dashboard example to show automatic UUID retrieval
- Removed duplicate Pro Tip about UUID
Templates now use minimal devcontainer config without:
- postCreateCommand referencing non-existent startupscript/
- Features referencing non-existent .devcontainer/features/

This makes templates truly standalone and deployable from any repo.

Fixed templates:
- streamlit-dashboard
- rshiny-dashboard
- file-processor (created .devcontainer.json)
- flask-api (already minimal)
New skill includes:
- Critical proxy URL rules and common issues
- Flask server configuration (0.0.0.0, threaded, debug=False)
- Working templates with BigQuery integration
- Comprehensive troubleshooting guide
- Lessons learned from real debugging sessions

Also:
- Streamlined CLAUDE.md with prominent skill triggers
- Temporarily removed APP_TEMPLATES from active skills (kept for future)
- Updated skill selection guide for clarity
Key fixes:
- devcontainer.json MUST be in .devcontainer/ folder (not root)
- Added proxyTargetPort requirement in customizations.workbench
- Fixed dockerComposeFile path (../docker-compose.yaml)
- Added volume mount for live code updates
- Added reference to create-custom-app.sh quick start script
- Added Common Mistakes Checklist
- Simplified directory structure to match working pattern
aculotti-verily and others added 27 commits May 11, 2026 15:33
generate-context.sh now detects cloudPlatform (GCP or AWS) from the
workspace metadata and generates platform-specific content:

- main(): fetch workspace before install_skills so cloudPlatform is
  available early; pass it to both install_skills and generate_claude_md
- generate_embedded_json: adds S3_BUCKET → s3:// mapping (no-op on GCP)
- generate_bucket_list: AWS branch for S3_BUCKET resources
- generate_claude_md: conditional vars for resources table rows and data
  persistence commands (gsutil vs aws s3 cp); GCP output is unchanged
- install_skills: for AWS, overwrites WORKFLOW_TROUBLESHOOT.md and
  DASHBOARD_BUILDER.md with AWS variants (aws s3 cp/aws batch, boto3/S3)

GCP path is completely unchanged — all existing GCP generated content
is byte-for-byte identical.

Co-authored-by: Cursor <cursoragent@cursor.com>
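
A sketch of the branching this enables. The describe command and jq path are assumptions; the gsutil-vs-aws choice comes from the commit:

```bash
# Detect the workspace cloud platform once, early in main()
# ('wb workspace describe --format=json' and '.cloudPlatform' are assumed
# here; the real script reads it from workspace metadata).
ws_cloud=$(wb workspace describe --format=json | jq -r '.cloudPlatform // "GCP"')

case "$ws_cloud" in
  AWS) copy_cmd="aws s3 cp" ;;   # AWS variant of persistence commands
  *)   copy_cmd="gsutil cp" ;;   # GCP output stays byte-for-byte identical
esac
```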
…r handling

- Fix S3 resource type: AWS_S3_STORAGE_FOLDER (not S3_BUCKET); update jq
  filters, bucket list, resource table, and CLI create command accordingly
- S3 path now includes prefix: s3://<bucket>/<prefix>
- Add AWS_AURORA_DATABASE to generate_embedded_json and resource table;
  add AWS_AURORA_DATABASE_REFERENCE to resource table
- AWS CLI commands confirmed from wb CLI docs: s3-storage-folder, aurora-database
- Fix Aurora WORKBENCH env var template in DASHBOARD_BUILDER skill to
  parse the actual "host:port/dbname" connection string format
- Harden generate_embedded_json: two-step local declarations + || '{}' fallbacks
  on each jq assignment + ${var:-{}} guards before final jq -n --argjson, so a
  failed resource fetch never prevents CLAUDE.md from being written
- Fix check_prerequisites auth hint to not be GCP-specific
- Fix stale "GCS/BQ path" comment in CLAUDE.md template
- Update header comment with AWS resource type names

Co-authored-by: Cursor <cursoragent@cursor.com>
- install.sh: replace the single wb check with an 8-retry loop (10s between
  attempts) so AWS apps that take longer to initialise IAM credentials
  still get CLAUDE.md generated on first startup (retry loop sketched below)
- generate-context.sh: validate resource_paths/env_vars JSON before
  passing to --argjson; log the actual bad value on failure so the root
  cause is visible rather than a cryptic jq error (guard sketched below)

Co-authored-by: Cursor <cursoragent@cursor.com>
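
Sketches of both changes. The readiness probe ('wb status') is an assumption; the variable names in the guard come from the commit:

```bash
# install.sh: wait up to 8 x 10s for wb credentials to be usable.
for attempt in $(seq 1 8); do
  if wb status >/dev/null 2>&1; then
    break                      # credentials ready; safe to generate CLAUDE.md
  fi
  echo "wb not ready (attempt ${attempt}/8); retrying in 10s..." >&2
  sleep 10
done

# generate-context.sh: validate before --argjson, logging the bad value.
if ! printf '%s' "$resource_paths" | jq -e . >/dev/null 2>&1; then
  echo "Invalid resource_paths JSON: ${resource_paths}" >&2
  resource_paths='{}'
fi
```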
All 5 sections that were hardcoded for GCP now branch on ws_cloud:
- MCP Data & Resources table: removes bq_execute/resource_mount, adds
  S3 equivalents (list_files→aws s3 ls, create→s3-storage-folder)
- Cloud CLIs section: replaces gcloud/gsutil/bq MCP tools with AWS CLI
  terminal guidance (aws s3, aws batch, psql)
- Cloud path hint: adds rwEndpoint+port+databaseName for Aurora
- Env var example: gs:// → s3:// prefix
- Preview Data + Query Data: BigQuery/GCS replaced with S3/boto3/psycopg2
- How to Create Resources: gcs-bucket/bq-dataset → s3-storage-folder/aurora-database

GCP output is byte-for-byte identical to before.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
The resource_paths and env_vars variables were occasionally containing
multiple jq output objects separated by embedded newlines (e.g. when
wb resource list returns non-array JSON). --argjson rejects multi-value
strings, causing the embedded JSON block in CLAUDE.md to be empty.

Fix: pipe first jq output through `jq -cs 'add // {}'` which slurps all
outputs into a single merged object regardless of how many the upstream
jq produced. Also deduplicate the path expression into a shared variable
to keep both maps consistent.

Co-authored-by: Cursor <cursoragent@cursor.com>
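
A standalone demonstration of why `jq -cs 'add // {}'` fixes the multi-object problem (the real pipeline feeds wb resource list output through a longer filter):

```bash
# Two objects on separate lines (the failure mode): -s slurps them into an
# array and `add` merges them, so --argjson always receives one object.
printf '%s\n' '{"a":"gs://x"}' '{"b":"gs://y"}' | jq -cs 'add // {}'
# -> {"a":"gs://x","b":"gs://y"}

# No upstream output at all: add yields null, and // {} recovers.
printf '' | jq -cs 'add // {}'
# -> {}
```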
The previous approach captured intermediate jq output into bash variables
then passed them via --argjson, which fails when the variables contain
embedded newlines or encoding edge cases on certain jq versions.

Rewrite as a single jq invocation that builds both resourcePaths and
envVars maps directly from the resource list. A jq `def` avoids repeating
the path expression. `head -1` guarantees one output line regardless of
what wb resource list returns. The bash fallback ensures a valid empty
JSON object is always returned even if jq fails completely.

Co-authored-by: Cursor <cursoragent@cursor.com>
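
The shape of the single-invocation rewrite. The jq filter here is illustrative (the real path expression and field names live in generate-context.sh); only the structure — the `def` helper, `head -1`, and the bash fallback — mirrors the commit:

```bash
resources_json=$(
  wb resource list --format=json 2>/dev/null |
    jq -c '
      def rpath: .metadata.name;      # stand-in for the shared path expression
      { resourcePaths: (map({(rpath): .id}) | add // {}),
        envVars:       (map({(rpath): .id}) | add // {}) }
    ' | head -1
)
# Guarantee valid JSON even if wb or jq failed entirely.
[ -n "$resources_json" ] || resources_json='{"resourcePaths":{},"envVars":{}}'
```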
Aurora in Workbench requires IAM database authentication — static passwords
are rejected with 'PAM authentication failed' or 'no encryption' errors.

Updates AWS DASHBOARD_BUILDER skill only (GCP skill unchanged):
- Template 2 rewritten with the correct 4-step IAM auth flow:
  wb resource credentials → boto3 IAM token → psycopg2 with sslmode=require
- Aurora troubleshooting section expanded with symptoms, step-by-step fix,
  and AWS CLI alternative using generate-db-auth-token
- Checklist and quick reference table include Aurora IAM and SSL items

Updates AWS data_preview_query_section in CLAUDE.md generation (AWS only):
- Aurora bash preview uses generate-db-auth-token plus PGSSLMODE=require
- Aurora Python example uses full wb credentials → boto3 → psycopg2 flow

Co-authored-by: Cursor <cursoragent@cursor.com>
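
A sketch of the AWS CLI alternative named above. `aws rds generate-db-auth-token` is the real AWS CLI command; host, user, database, and region values are placeholders, and the `wb resource credentials` step is omitted:

```bash
# Generate a short-lived IAM auth token instead of a static password.
TOKEN=$(aws rds generate-db-auth-token \
  --hostname "$AURORA_HOST" --port 5432 \
  --username "$DB_USER" --region "$AWS_REGION")

# Aurora rejects unencrypted connections, hence PGSSLMODE=require.
PGSSLMODE=require PGPASSWORD="$TOKEN" \
  psql -h "$AURORA_HOST" -p 5432 -U "$DB_USER" -d "$DB_NAME" -c '\dt'
```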
…e data discovery

New tool searches all data collections accessible to the user platform-wide,
not just those attached to the active workspace. Uses the same
/api/workspaces/v2/filtered endpoint as workspace_list_all but pre-filters
for terra-type=data-collection workspaces.

Features:
- Optional keyword filter (case-insensitive match on name and description)
- Returns id, name, description, underlayName, and workspace properties
- Includes scope label and attach command in response so Claude always
  communicates context to the user
- Consistent map-based response style with rest of codebase

Co-authored-by: Cursor <cursoragent@cursor.com>
…erty metadata

All data collection metadata is stored as terra-* workspace properties —
no additional API calls or workspace context switches needed.

Extracts and returns:
- shortDescription, organization, availability, isFree, isInstantlyAccessible
- patientCount, timeFrame, geographicCoverage, dataModel
- dataModalityTags, therapeuticTags
- underlayName, dataDictionary, usageExamples (incl. sample SQL queries)
- accessGroupName, supportEmail
- dataPublished, metadataLastUpdated, externalDocumentation

Broadens keyword search to match against modality tags, therapeutic tags,
and data model type in addition to name and description.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ons URLs

MCP tool (main.go):
- Extract userFacingId and construct workbenchUrl per collection
  (https://workbench.verily.com/data-collections/<userFacingId>)
- Return uuid separately for API use vs id for UI links

Skill (DATA_DISCOVERY.md):
- Triggers on workspace-scoped and platform-wide data discovery prompts
- Step 0: always asks user to clarify search scope before proceeding
- Step 1: clarifies search criteria (modality, disease, population, access)
- Step 2: uses platform_list_data_collections MCP tool first, CLI fallback
- Step 3: presents results with all rich metadata fields, offers refinement
- Step 4: provides workbenchUrl and instructions to add via Workbench UI

generate-context.sh:
- Copies DATA_DISCOVERY.md into skills directory at context generation time
- Registers skill in CLAUDE.md skills table and trigger guide

Co-authored-by: Cursor <cursoragent@cursor.com>
Removes workspace-scoped trigger phrases to avoid conflicting with
workspace_list_data_collections. Skill now only activates for
cross-workspace / platform-wide discovery. Step 0 explicitly
short-circuits to workspace_list_data_collections if user is asking
about their active workspace.

Co-authored-by: Cursor <cursoragent@cursor.com>
…only

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
LLM context system with AWS support, MCP enhancements, and data discovery skill
The cp approach relied on feature source files being present at runtime,
which they are not inside the devcontainer. All other skills use embedded
heredocs — align DATA_DISCOVERY.md with the same pattern so it is always
written correctly at app startup.

Co-authored-by: Cursor <cursoragent@cursor.com>
fix: embed DATA_DISCOVERY.md as heredoc so it writes correctly at app startup
…igger

- Step 3 now ranks each result 1-5 with a one-sentence justification,
  sorted highest first; all score labels are positively framed
- CLAUDE.md trigger updated to ALWAYS read skill before calling
  platform_list_data_collections, with broader phrase coverage
- Skill header reinforces that the MCP tool should not be called directly

Co-authored-by: Cursor <cursoragent@cursor.com>
DATA_DISCOVERY skill improvements — ranking
…_list_data_collections

The tool was making one sequential API call per data collection to resolve
its display name, causing timeouts on workspaces with 5+ collections.

Fix: one batch POST to /api/workspaces/v2/filtered builds a uuid->name map
upfront. Resources are then grouped by display name in memory. Falls back
to UUID as group key if the batch call fails.

Co-authored-by: Cursor <cursoragent@cursor.com>
fix(mcp): resolve workspace_list_data_collections timeout + skill refinements
- Add requireString/requireStrings helpers to MCP server to prevent
  panics from unchecked type assertions on missing tool arguments
- Extract ~2,150 lines of embedded skill heredocs from generate-context.sh
  into standalone files copied at install time, establishing skills/ as
  the single source of truth
- Create standalone AWS skill variants (WORKFLOW_TROUBLESHOOT, DASHBOARD_BUILDER)
  previously only available as heredocs
- Merge heredoc-unique content into standalone skills (Quick Start in
  CUSTOM_APP, "Be Proactive" behavior in WORKFLOW_TROUBLESHOOT)
- Replace personal repo reference in APP_TEMPLATES with org repo and
  add fork guidance
- Detect architecture dynamically in wb-mcp-server install (amd64/arm64)
- Bump Go from 1.21 to 1.25 to match rest of repo
- Fix path references in llm-context README (~/.workbench -> ~/.claude)
- Fix app README listing non-existent template options
- Add .bashrc idempotency guards to both install scripts (pattern sketched below)
- Fix cp -r reinstall nesting issue in llm-context install

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
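
A typical idempotency-guard pattern for the .bashrc appends; the marker text and USER_HOME variable are illustrative, while the alias name comes from earlier commits:

```bash
MARKER='# >>> llm-context >>>'
if ! grep -qF "$MARKER" "${USER_HOME}/.bashrc"; then
  {
    echo "$MARKER"
    echo "alias generate-llm-context='/opt/llm-context/generate-context.sh'"
  } >> "${USER_HOME}/.bashrc"
fi
```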