Architecture: move codegen backends toward SemanticSchema as the common backend contract #129

@hardbyte

Follow-up to #96.

Summary

#96 established the shared compiler-oriented foundations: stable symbol identity, normalization, and SemanticSchema.

#128 now makes the first structural split needed to support that direction cleanly:

  • reflectapi-schema is the raw/interchange schema crate
  • reflectapi-schema-codegen owns compiler-only concepts: symbol IDs, schema ID assignment, normalization, and semantic IR

That separation is an improvement, but backend consumption is still split:

  • Python consumes reflectapi-schema-codegen plus raw Schema
  • Rust mostly renders from raw Schema after backend-local preprocessing
  • TypeScript mostly renders from raw Schema after backend-local preprocessing
  • OpenAPI walks raw Schema directly

This issue is now specifically about the next step: moving backends toward a common backend contract derived from SemanticSchema, instead of each backend rediscovering or bypassing shared compiler meaning.

Why

The main benefits are to compiler and architecture quality:

  • one canonical backend-facing contract instead of a mix of raw-schema and semantic-schema paths
  • less backend drift in ordering, naming, dependency handling, and type resolution
  • fewer backend-local schema mutations and ad hoc preprocessing steps
  • stronger guarantees that shared compiler behavior applies consistently across languages
  • a better foundation for future lowering passes, validation, and backend-specific projections

What Is Already Settled

  • backend-local type mappings belong in the backend, not the shared schema crates
  • shared symbol identity, dependency analysis, normalization, and semantic ordering are compiler concerns
  • raw schema and compiler schema are different phases and should stay separate

That means this issue is not about inventing one universal final mapping layer for every language. It is about making shared meaning explicit once, then letting each backend render from that shared meaning.
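
To illustrate the split between shared meaning and backend-local mapping, here is a minimal sketch. Every name in it (`SemType`, `TypeMapper`, `PythonMapper`, `TypeScriptMapper`) is hypothetical and not taken from reflectapi; the target type strings are illustrative backend choices, not a statement of what the real backends emit.

```rust
// Hypothetical stand-ins for illustration; none of these names come from reflectapi.

/// Shared meaning, decided once by the compiler layer.
#[derive(Debug, Clone, PartialEq)]
enum SemType {
    Bool,
    Int64,
    Text,
    Optional(Box<SemType>),
    List(Box<SemType>),
    /// A reference that has already been resolved to a named type.
    Named(String),
}

/// Each backend owns its own mapping from shared meaning to target syntax.
trait TypeMapper {
    fn map(&self, ty: &SemType) -> String;
}

struct PythonMapper;
struct TypeScriptMapper;

impl TypeMapper for PythonMapper {
    fn map(&self, ty: &SemType) -> String {
        match ty {
            SemType::Bool => "bool".to_string(),
            SemType::Int64 => "int".to_string(),
            SemType::Text => "str".to_string(),
            SemType::Optional(inner) => format!("{} | None", self.map(inner)),
            SemType::List(inner) => format!("list[{}]", self.map(inner)),
            SemType::Named(name) => name.clone(),
        }
    }
}

impl TypeMapper for TypeScriptMapper {
    fn map(&self, ty: &SemType) -> String {
        match ty {
            SemType::Bool => "boolean".to_string(),
            // Illustrative choice only; a real backend might pick a different representation.
            SemType::Int64 => "number".to_string(),
            SemType::Text => "string".to_string(),
            SemType::Optional(inner) => format!("{} | null", self.map(inner)),
            SemType::List(inner) => format!("Array<{}>", self.map(inner)),
            SemType::Named(name) => name.clone(),
        }
    }
}
```

The point of the sketch is that `SemType` carries no target-language knowledge, while each mapper is free to make its own representation choices.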

Remaining Pain Points

  • Python still has to synchronize semantic ordering with raw schema lookups for rendering details
  • Rust and TypeScript still rely on raw schema traversal after local consolidation
  • OpenAPI still bypasses the semantic layer entirely
  • backend phase boundaries are still implicit in places because preprocessing is repeated inside backends
  • some backend requirements may still be missing from SemanticSchema, forcing raw-schema reads

Proposed Direction

  1. Define the intended common backend contract explicitly.
    Either:
      • all backends consume SemanticSchema, or
      • all backends consume a single backend-facing IR lowered from SemanticSchema
  2. Audit what each backend still reads from raw Schema.
    For each read:
      • is it genuinely backend-local rendering data?
      • should it instead be represented in SemanticSchema?
      • or should it belong in a shared post-semantic lowering stage?
  3. Move backend-independent meaning into the shared compiler layer.
    Examples:
      • resolved references
      • stable ordering
      • dependency information
      • naming/consolidation decisions
      • normalized container semantics that every backend would otherwise rediscover
  4. Keep backend-specific rendering local.
    Examples:
      • Python imports/runtime-provided types/type-hint mappings
      • TypeScript mappings and representation choices
      • Rust derive/ownership decisions
      • OpenAPI-specific schema projection
  5. Reduce direct raw-schema traversal in backends over time, one backend at a time.
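
A rough sketch of what "shared meaning decided once" could look like for resolved references, stable ordering, and dependency information. All types here (`RawType`, `SemanticType`, `SemanticIr`, `Backend`, `ListingBackend`) and the `lower` function are hypothetical placeholders, not the actual reflectapi-schema or reflectapi-schema-codegen API, and the ordering rule (alphabetical among types whose dependencies are already emitted) is just one possible choice.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins for illustration; not the reflectapi types.

/// Raw/interchange form: fields refer to other types by unresolved name.
struct RawType {
    name: String,
    field_refs: Vec<String>,
}

/// Semantic form: ordering and dependencies are decided once, up front.
#[derive(Debug)]
struct SemanticType {
    name: String,
    /// Names of other declared types this one depends on, deduplicated and sorted.
    deps: BTreeSet<String>,
}

#[derive(Debug)]
struct SemanticIr {
    /// Types in a stable, dependency-first order.
    types: Vec<SemanticType>,
}

/// Lower raw types to the semantic form. Unknown references and cycles are
/// reported here rather than left for each backend to rediscover.
fn lower(raw: &[RawType]) -> Result<SemanticIr, String> {
    let known: BTreeSet<&str> = raw.iter().map(|t| t.name.as_str()).collect();
    let mut pending: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for t in raw {
        for r in &t.field_refs {
            if !known.contains(r.as_str()) {
                return Err(format!("{}: unresolved reference to {}", t.name, r));
            }
        }
        // A self-reference is not treated as an ordering constraint.
        let deps: BTreeSet<String> =
            t.field_refs.iter().filter(|r| **r != t.name).cloned().collect();
        pending.insert(t.name.clone(), deps);
    }

    let mut emitted: BTreeSet<String> = BTreeSet::new();
    let mut types = Vec::new();
    while !pending.is_empty() {
        // Pick the alphabetically first type whose dependencies are all emitted.
        let next = pending
            .iter()
            .find(|(_, deps)| deps.iter().all(|d| emitted.contains(d)))
            .map(|(name, _)| name.clone())
            .ok_or_else(|| "dependency cycle between types".to_string())?;
        let deps = pending.remove(&next).expect("name was just found in the map");
        emitted.insert(next.clone());
        types.push(SemanticType { name: next, deps });
    }
    Ok(SemanticIr { types })
}

/// Backends only see the semantic form, never the raw types.
trait Backend {
    fn render(&self, ir: &SemanticIr) -> String;
}

/// Trivial backend that lists type names in the order the IR provides.
struct ListingBackend;

impl Backend for ListingBackend {
    fn render(&self, ir: &SemanticIr) -> String {
        ir.types.iter().map(|t| t.name.as_str()).collect::<Vec<_>>().join(",")
    }
}
```

Because the order is fixed inside `lower`, two backends rendering from the same `SemanticIr` cannot drift in ordering, which is the kind of guarantee the steps above are aiming for.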

Suggested First Step

Do a backend-by-backend audit of:

  • which raw Schema fields are still read directly
  • which of those reads are truly backend-specific
  • which should instead be represented in SemanticSchema or a shared lowering stage

That audit should produce a staged migration plan rather than a single large rewrite.
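
One possible way to record the audit so it turns directly into a per-backend plan is sketched below. The `Disposition`, `RawRead`, and `migration_plan` names are made up for this sketch, and any backend or field names fed into it would be placeholders until the real audit is done.

```rust
use std::collections::BTreeMap;

// Hypothetical audit model; nothing here is an existing reflectapi type.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Disposition {
    /// Genuinely backend-local rendering data; the raw read can stay.
    BackendLocal,
    /// Shared meaning that should be represented in the semantic schema.
    MoveToSemantic,
    /// Shared, but better placed in a post-semantic lowering stage.
    MoveToLowering,
}

/// One direct read of a raw schema field found during the audit.
#[derive(Debug, Clone)]
struct RawRead {
    backend: &'static str,
    field: &'static str,
    disposition: Disposition,
}

/// Group the reads that need migration by backend, so each backend can be
/// moved in its own stage. Backend-local reads are left out of the plan.
fn migration_plan(reads: &[RawRead]) -> BTreeMap<&'static str, Vec<&RawRead>> {
    let mut plan: BTreeMap<&'static str, Vec<&RawRead>> = BTreeMap::new();
    for read in reads {
        if read.disposition != Disposition::BackendLocal {
            plan.entry(read.backend).or_default().push(read);
        }
    }
    plan
}
```

A backend whose entry in the plan is empty or absent would already be rendering only backend-local data from raw Schema.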

Metadata

Labels: enhancement (New feature or request)
