ci: add swap during build, use tpchgen-cli #1443
kevinjqliu
left a comment
I was looking into fixing this and tested on my fork; adding swap helped:
https://github.com/kevinjqliu/datafusion-python/pull/4/files#diff-5c3fa597431eda03ac3339ae6bf7f05e1a50d6fc7333679ec38e21b337cb6721R244-R255
We can add it to give us more buffer.
.github/workflows/build.yml
Outdated
# temporarily comment out to verify it works in the PR
# if: inputs.build_mode == 'release'
env:
  CARGO_BUILD_JOBS: 2
This should help. We can even set CARGO_BUILD_JOBS=1, and also add it to the "debug" mode below on L241.
Cargo.toml
Outdated
[profile.release.package.substrait]
opt-level = 1
codegen-units = 16
Let's see if scoping it to substrait will help here.
I think we might need to bite the bullet and do this:
[profile.release]
lto = "thin"
codegen-units = 4
Or we can override the options using env vars just for that one job.
Note, though, that either approach affects the final artifact.
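For reference, the env-var route could look roughly like the step below. This is a sketch, not the actual step in build.yml: the step name and the `cargo build --release` command are placeholders. `CARGO_PROFILE_RELEASE_LTO` and `CARGO_PROFILE_RELEASE_CODEGEN_UNITS` are Cargo's environment overrides for the matching `[profile.release]` keys, so they change the artifact the same way the Cargo.toml edit would, but only for the job that sets them.

```yaml
# Sketch only: step name and build command are placeholders, not taken from build.yml.
- name: Build (release, reduced memory)
  env:
    CARGO_BUILD_JOBS: "2"                      # fewer parallel rustc processes
    CARGO_PROFILE_RELEASE_LTO: "thin"          # same effect as lto = "thin"
    CARGO_PROFILE_RELEASE_CODEGEN_UNITS: "4"   # same effect as codegen-units = 4
  run: cargo build --release
```

Keeping the overrides in the workflow leaves Cargo.toml, and therefore local release builds, unchanged.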
Claude on why The priority order that minimizes artifact impact:
Thanks @kevinjqliu! If the swap is good enough to get us around this OOM then I think that's the best option.
maybe need to add the new
kevinjqliu
left a comment
LGTM!
Looks like adding the swap worked.
probably needs
Super frustrating because these same tests pass locally for me every time.
I was able to repro the issue locally. Claude says it's a DataFusion regression; I'm running DataFusion 52 locally to check.
I take it back. Now I'm getting a failure. I must not have built recently. Investigating.
Found it: apache/datafusion#21011, and that TPC-H code depended on it.
On Mac M4, it looks like it's due to apache/datafusion#21011.
Here's a potential fix, according to Claude: kevinjqliu@1fb80a3
Looks like Claude Code and Tim Code came to the same conclusion.
Here goes nothing...
https://github.com/apache/datafusion-python/actions/runs/23614615854/job/68778937649 @timsaucer did you cancel it?
Nope, I don't think so. I'm going to let the rest finish and rerun that workflow.
Nice! It's all green now: https://github.com/apache/datafusion-python/actions/runs/23614615854
Which issue does this PR close?
Related to #1429 but we need to verify if it resolves the issue.
Rationale for this change
We are getting OOM errors during build. This adds swap space to the build stage to prevent them.
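As a rough illustration of the idea, a swap step on a Linux runner could look like the fragment below. This is a sketch under assumptions, not the step added by this PR: the 8G size and the `/extra-swapfile` path are hypothetical, and it assumes passwordless `sudo` is available on the runner.

```yaml
# Sketch only: size and path are assumptions, not the values used in this PR.
- name: Add swap space
  run: |
    sudo fallocate -l 8G /extra-swapfile   # reserve 8 GiB on disk
    sudo chmod 600 /extra-swapfile         # swap files must not be world-readable
    sudo mkswap /extra-swapfile            # format the file as swap
    sudo swapon /extra-swapfile            # enable it for the rest of the job
    free -h                                # log the resulting memory and swap totals
```

The step has to run before the build step so the extra swap is already active when the compiler's memory use peaks.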
What changes are included in this PR?
Add swap during the build stage.
Add tpchgen-cli to generate TPC-H data, and commit the corresponding answer files.
Are there any user-facing changes?
No