Missing support for custom attributes on primitive types #368

@yshcz

Description

Avro allows attaching custom properties to any type, including primitives.

In avro-rs, primitive schemas such as Schema::Long and Schema::Int are modeled as bare enum variants, so there is no field in which to store custom properties. Schema::custom_attributes() also returns None for primitives.

As a result, parsing an object-form primitive schema silently drops all extra properties, and serializing the parsed schema emits just "long", so the metadata cannot be inspected or round-tripped.

Reproducer:

use apache_avro::Schema;
use apache_avro_test_helper::TestResult;
use serde_json::json;

#[test]
fn test1() -> TestResult {
    let input = json!({
        "type": "long",
        "custom-prop": "value"
    });

    let schema = Schema::parse(&input)?;
    assert!(matches!(schema, Schema::Long));

    let serialized = serde_json::to_string(&schema)?;
    assert!(
        serialized.contains("custom-prop"),
        "Expected serialized schema to include custom property key, but it was dropped. Serialized: {serialized}. Parsed schema: {schema:?}"
    );
    Ok(())
}

Output:

  Expected serialized schema to include custom property key, but it was dropped. Serialized: "long". Parsed schema: Long

Iceberg encodes types such as timestamptz in Avro as a primitive long with additional properties (e.g. {"type":"long","logicalType":"timestamp-micros","adjust-to-utc":true}, or a similar form depending on precision).

Since apache-avro cannot represent custom properties on primitive types, the adjust-to-utc flag is dropped during parsing and the original schema semantics cannot be preserved.
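
To make the expected round-trip behavior concrete, here is a minimal, standard-library-only sketch of one way a primitive type could carry custom properties. This is not the apache-avro API: the type PrimitiveWithAttrs and its methods are hypothetical names used only for illustration. Attribute values are stored as already-encoded JSON text so that the sketch needs no external crates.

```rust
use std::collections::BTreeMap;

/// Hypothetical model (NOT part of apache-avro): a primitive type name plus
/// its custom attributes. Values are stored as pre-encoded JSON text.
#[derive(Debug, Clone, PartialEq)]
pub struct PrimitiveWithAttrs {
    pub type_name: &'static str,
    pub attributes: BTreeMap<String, String>,
}

impl PrimitiveWithAttrs {
    /// Create a primitive with no custom attributes.
    pub fn new(type_name: &'static str) -> Self {
        Self {
            type_name,
            attributes: BTreeMap::new(),
        }
    }

    /// Attach one attribute. `json_value` must already be valid JSON text,
    /// for example `true` or `"value"` (including the quotes).
    pub fn with_attr(mut self, key: &str, json_value: &str) -> Self {
        self.attributes
            .insert(key.to_string(), json_value.to_string());
        self
    }

    /// Serialize to JSON text. Without attributes the short form (`"long"`)
    /// is emitted; with attributes the object form is emitted so that no
    /// property is lost. Keys are assumed to need no JSON escaping.
    pub fn to_json(&self) -> String {
        if self.attributes.is_empty() {
            return format!("\"{}\"", self.type_name);
        }
        let mut out = format!("{{\"type\":\"{}\"", self.type_name);
        // BTreeMap iterates in sorted key order, so output is deterministic.
        for (key, value) in &self.attributes {
            out.push_str(&format!(",\"{}\":{}", key, value));
        }
        out.push('}');
        out
    }
}
```

With this shape, PrimitiveWithAttrs::new("long") still serializes to the bare "long", while a value built with .with_attr("adjust-to-utc", "true") serializes to the object form and keeps the flag. Whether avro-rs should add a field to the existing variants, introduce new variants, or take a different approach is left open here.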
