2 changes: 1 addition & 1 deletion pydeequ/configs.py
@@ -41,4 +41,4 @@ def _get_deequ_maven_config():

SPARK_VERSION = _get_spark_version()
DEEQU_MAVEN_COORD = _get_deequ_maven_config()
-IS_DEEQU_V1 = re.search("com\.amazon\.deequ\:deequ\:1.*", DEEQU_MAVEN_COORD) is not None
+IS_DEEQU_V1 = "com.amazon.deequ:deequ:1" in DEEQU_MAVEN_COORD

NIT: The new substring check "com.amazon.deequ:deequ:1" in DEEQU_MAVEN_COORD is less precise than the original regex intent. It would match a hypothetical version like com.amazon.deequ:deequ:12.0.0-spark-3.5 (a future major version starting with '1' but not actually v1.x). A more robust fix for the deprecation warning would be to either use a raw string with the original regex (re.search(r"com\.amazon\.deequ:deequ:1\.", DEEQU_MAVEN_COORD)) or check for ":deequ:1." (with trailing dot) to ensure it's actually version 1.x.

Line 44: IS_DEEQU_V1 = "com.amazon.deequ:deequ:1" in DEEQU_MAVEN_COORD matches any coordinate containing that substring, including potential future versions such as deequ:10.x or deequ:12.x. Strictly speaking, the original regex com\.amazon\.deequ\:deequ\:1.* had the same gap: 1.* matches a '1' followed by any characters, so it also accepted 10.x and 12.x. The substring check is therefore behaviorally equivalent to the old regex rather than a regression, but neither one enforces 1.x. The values in SPARK_TO_DEEQU_COORD_MAPPING (lines 7-11) are currently all deequ:2.0.8-spark-*, so this is not a live bug, only a future-proofing concern.
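
To make the difference concrete, here is a minimal sketch comparing the checks discussed above. The coordinate strings and helper function names are illustrative assumptions, not values or functions taken from pydeequ/configs.py; the old regex is rewritten as a raw string, which matches the same inputs.

```python
import re

# Illustrative Maven coordinates (assumed for this sketch, not read from pydeequ).
V1_COORD = "com.amazon.deequ:deequ:1.2.2-spark-3.0"
V2_COORD = "com.amazon.deequ:deequ:2.0.8-spark-3.5"
V12_COORD = "com.amazon.deequ:deequ:12.0.0-spark-3.5"  # hypothetical future version


def is_v1_substring(coord: str) -> bool:
    """The check introduced by this PR: plain substring match."""
    return "com.amazon.deequ:deequ:1" in coord


def is_v1_old_regex(coord: str) -> bool:
    """The pre-PR regex, as a raw string; '1.*' accepts any text after the '1'."""
    return re.search(r"com\.amazon\.deequ:deequ:1.*", coord) is not None


def is_v1_strict_regex(coord: str) -> bool:
    """Suggested regex: requires a literal dot right after the major version."""
    return re.search(r"com\.amazon\.deequ:deequ:1\.", coord) is not None


def is_v1_strict_substring(coord: str) -> bool:
    """Suggested substring alternative with a trailing dot, no regex needed."""
    return ":deequ:1." in coord


if __name__ == "__main__":
    checks = (is_v1_substring, is_v1_old_regex,
              is_v1_strict_regex, is_v1_strict_substring)
    for coord in (V1_COORD, V2_COORD, V12_COORD):
        results = {check.__name__: check(coord) for check in checks}
        print(coord, results)
```

Only the two trailing-dot variants reject the hypothetical 12.0.0 coordinate. Both of them still accept 1.x and reject 2.x, so either would be a drop-in replacement for the line in this diff.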