Evaluating each compressor on multiple error bounds #15
Merged
23 commits
4e585fe Add directory to store error bounds (treigerm)
e07f66c Allow to configure each compressor with an error bound (treigerm)
1f43aed Pass dtype to build fn, add docstrings, create bitround helper fn (treigerm)
ce5a989 Enforce named arguments (treigerm)
7c733b3 Refactor compressors to allow multiple error bound conversions (treigerm)
5ebe08b Refactor compressor to have transformation logic in base class (treigerm)
a3530ba Add docstring (treigerm)
cd9ef48 Merge remote-tracking branch 'origin/main' into error_bounds (treigerm)
8ba9df2 Adjust JPEG2000 for absolute error bounds (treigerm)
9b0001c Refine docstring (treigerm)
1ec0fe3 Fix compressed datasets path (treigerm)
d4f16f3 Fix grammar in docstring (treigerm)
de6fd17 Fix relative error bound conversion bug (treigerm)
84c113a Improved comments and error handling (treigerm)
3e6777b Generate separate codec for each variable (treigerm)
6309efb Fix JPEG2000 maximum pixel value (treigerm)
be3805a Save full stacktrace when error occurs (treigerm)
bcf7f3f Clarify control flow, address PR review comments (treigerm)
388ac13 Simplify control flow further (treigerm)
445df64 Adjust JPEG2000 precision (treigerm)
1383d9a Clarifying comments (treigerm)
970f5ea Comment about input transformation (treigerm)
d2a864d Rename dataclasses with more detailed names (treigerm)
@@ -0,0 +1 @@

    /*/error_bounds.json
juntyr marked a review conversation on the `import json` line as resolved.

@@ -0,0 +1,43 @@

    import json
    from pathlib import Path

    import xarray as xr

    REPO = Path(__file__).parent.parent


    def main():
        datasets = REPO.parent / "data-loader" / "datasets"
        datasets_error_bounds = REPO / "datasets-error-bounds"

        for dataset in datasets.iterdir():
            if dataset.name == ".gitignore":
                continue

            print(dataset.name)
            ds = xr.open_dataset(
                dataset / "standardized.zarr",
                chunks=dict(),
                engine="zarr",
                decode_times=False,
            )

            # TODO: This is a temporary solution that should be replaced by a more
            # principled method to select the error bounds.
            low_error_bounds, mid_error_bounds, high_error_bounds = dict(), dict(), dict()
            for v in ds:
                data_range = (ds[v].max() - ds[v].min()).values.item()
                low_error_bounds[v] = {"abs_error": 0.0001 * data_range, "rel_error": None}
                mid_error_bounds[v] = {"abs_error": 0.001 * data_range, "rel_error": None}
                high_error_bounds[v] = {"abs_error": 0.01 * data_range, "rel_error": None}

            error_bounds = [low_error_bounds, mid_error_bounds, high_error_bounds]

            dataset_error_bounds = datasets_error_bounds / dataset.name
            dataset_error_bounds.mkdir(parents=True, exist_ok=True)
            with open(dataset_error_bounds / "error_bounds.json", "w") as f:
                json.dump(error_bounds, f)


    if __name__ == "__main__":
        main()
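To make the layout of `error_bounds.json` concrete, here is a minimal sketch of the same range-scaled computation on plain Python lists instead of an xarray dataset. The helper `range_scaled_error_bounds`, the constant `RANGE_FRACTIONS`, and the toy variable names and values are assumptions made for illustration and are not part of this PR. Only the three fractions and the `abs_error` / `rel_error` keys come from the script above.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical constant (not in the PR): the fractions of the data range that
# the script above uses for its low, mid, and high error-bound levels.
RANGE_FRACTIONS = (0.0001, 0.001, 0.01)


def range_scaled_error_bounds(variables):
    """Return one {variable: bound} mapping per entry in RANGE_FRACTIONS.

    `variables` maps a variable name to a non-empty sequence of numbers.
    """
    levels = []
    for fraction in RANGE_FRACTIONS:
        level = {}
        for name, values in variables.items():
            data_range = max(values) - min(values)
            # Only an absolute bound is set; the relative bound stays unset.
            level[name] = {"abs_error": fraction * data_range, "rel_error": None}
        levels.append(level)
    return levels


# Toy stand-in for a dataset; names and values are made up for the example.
toy_variables = {"t2m": [250.0, 275.0, 300.0], "msl": [98000.0, 103000.0]}
error_bounds = range_scaled_error_bounds(toy_variables)

# Round-trip through JSON in the same way the script writes error_bounds.json.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "error_bounds.json"
    with open(path, "w") as f:
        json.dump(error_bounds, f)
    with open(path) as f:
        loaded = json.load(f)
```

The file is a JSON list with one object per error-bound level, ordered from tightest to loosest, and each object maps a variable name to its bound. `None` is serialized as `null`, so a consumer has to treat a missing relative bound explicitly.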