… to allow loose packing for total fluxes that don't fit evenly into the requested number of sub-batches.
…mplex fragmentation.
…lux method, which is renamed to make_photon_subbatches.
force-pushed from 04a30a2 to 02b52ba
This is ready for review, though CI failed earlier today during Conda setup.
rmjarvis left a comment:
I think I'm confused about some of this.
```python
# Calculate current sub-batch flux.
# It would be nicer to store rather than recalculate.
subbatch = subbatches[current_subbatch]
subbatch_flux = sum(sb_obj.phot_flux for sb_obj in subbatch)
```
As you say, it would be nicer to keep this as a running total, rather than O(N^2) recalculating it each time. Is that hard for some reason? I'm thinking the easiest way would be to keep an array with the flux in each batch so far.
subbatch_fluxes = [0. for _ in range(nsubbatch)]
Then update it each time you add something to subbatches.
There are also patterns that only keep the running sum for the current batch, but the advantage of keeping the full array is that you could add a nice debug statement at the end listing the realized flux per batch, which would be handy to be able to easily see when running.
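A minimal sketch of the running-total pattern suggested here (names are illustrative; the real loop lives in make_photon_subbatches):

```python
# Keep a running flux total per sub-batch instead of recalculating it.
nsubbatch = 4
subbatches = [[] for _ in range(nsubbatch)]
subbatch_fluxes = [0. for _ in range(nsubbatch)]  # running flux per sub-batch

def add_to_subbatch(i, obj, flux):
    """Append obj to sub-batch i, keeping its flux total up to date."""
    subbatches[i].append(obj)
    subbatch_fluxes[i] += flux

# At the end, subbatch_fluxes holds the realized flux per sub-batch,
# ready for a single debug log line.
```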
```python
if remaining_flux <= available_flux:
    # This sub-batch can contain the entirety of the remaining
    # object flux. Place it all in here, then continue
    # without advancing in case more can go in.
```
I expected the first test to be if remaining_flux <= loose_photons_per_subbatch - subbatch_flux, in which case put it in entirely. And if remaining_flux >= photons_per_subbatch - subbatch_flux then bump to the next batch.
I think this cleanly captures what you now have both here and at the start of the else branch.
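A sketch of the test ordering suggested above (illustrative only: the names mirror the review comment, not necessarily the PR's code):

```python
# Decide how much of an object's remaining flux goes in the current
# sub-batch, using the loose limit first and the nominal limit second.
def place_step(remaining_flux, subbatch_flux,
               photons_per_subbatch, loose_photons_per_subbatch):
    """Return (flux_placed_here, advance_to_next_subbatch)."""
    if remaining_flux <= loose_photons_per_subbatch - subbatch_flux:
        # Fits entirely within the loose limit: no fragmentation needed.
        return remaining_flux, False
    # Otherwise fill this sub-batch to its nominal level and bump on;
    # the caller carries the remainder into the next sub-batch.
    return photons_per_subbatch - subbatch_flux, True
```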
```python
# In general there are multiple ways to split the batch. Assert that each
# object appears with its original flux across however many sub-batches it
# appears in, that the total flux across all sub-batches equals the total
# batch flux, and that the most full batch contains <= 1.1 * the flux in the
```
1.05?
Oh, actually, with the rhs using the minimum, it won't be this. And in fact, I don't think the code makes any guarantee about that ratio. There can be arbitrarily many batches up to 5% bigger than nominal, so the minimum can be arbitrarily small (even 0 I believe). So your use of 1.1 here is I think just something that happens to work for this particular case.
If we want to avoid that "failure mode" (of having a very small batch), then I think we could lessen the likelihood of it by recalculating the nominal and loose levels for the rest of the batches any time one batch goes over the nominal level. Not sure how important that is, but it's something we could implement if we see that there is often a batch at the end that is very low flux.
```python
dataclasses.replace(obj, phot_flux=available_flux)
)
remaining_flux -= available_flux
current_subbatch = (1 + current_subbatch) % nsubbatch
```
If I understand this algorithm correctly, I don't think the % nsubbatch is ever necessary. Is this a relic from a previous implementation? Or am I missing something here?
```python
]
run_subbatch_test("fragmentation in first sub-batch", batch, 2)
```

```python
# Make sure the sub-batcher can go backwards (i.e. assign to subbatches
```
I didn't think the algorithm ever did go backwards? Again, am I missing something here?
Currently, the sub-batching in photon pooling is performed object-by-object. So, for example, if there are 100 objects in the batch, and 20 sub-batches, then the first five objects will go in the first sub-batch, the next five in the second, and so on. This means that if the objects vary a lot in brightness (and they do) then the flux per sub-batch will also vary a lot. In a typical run and setup provided by Jim, I found that most sub-batches contained a few thousand photons, and a handful contained several tens of millions. This leads to regular memory spikes when those sub-batches are processed in each new photon pool.
This PR smooths those memory spikes out by creating sub-batches with roughly equal fluxes. This is an implementation of the bin packing problem with fragmentation: our goal is to split the total flux in the pool across the sub-batches, and to achieve this we have to be able to fragment those extremely bright objects across however many sub-batches are needed.
At the same time, each object fragmentation means another object lookup, and we had previously determined that with photon pooling these can mount up and become quite costly. For this reason, I'm allowing the flux in each sub-batch to vary slightly, letting objects fill a sub-batch up to 105% of the expected per-sub-batch flux if that prevents a fragmentation.
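A minimal sketch of this loose bin packing with fragmentation (illustrative only; not the PR's make_photon_subbatches implementation):

```python
# Greedily pack per-object fluxes into sub-batches, fragmenting an object
# only when it would push the current sub-batch past the loose limit.
def split_fluxes(fluxes, nsubbatch, slack=1.05):
    """Split per-object fluxes into nsubbatch lists of (obj_index, flux),
    letting each sub-batch run up to slack * the nominal per-batch flux
    before fragmenting an object across a sub-batch boundary."""
    nominal = sum(fluxes) / nsubbatch
    loose = slack * nominal
    subbatches = [[] for _ in range(nsubbatch)]
    totals = [0.0] * nsubbatch
    cur = 0
    for i, remaining in enumerate(fluxes):
        while remaining > 0:
            if remaining <= loose - totals[cur] or cur == nsubbatch - 1:
                # Fits within the loose limit (or we're in the last
                # sub-batch): place the whole remainder, no fragmentation.
                subbatches[cur].append((i, remaining))
                totals[cur] += remaining
                remaining = 0.0
            else:
                # Fragment: fill this sub-batch to its nominal level and
                # carry the rest into the next one.
                take = max(nominal - totals[cur], 0.0)
                if take > 0:
                    subbatches[cur].append((i, take))
                    totals[cur] += take
                    remaining -= take
                cur += 1
    return subbatches
```

One very bright object simply spills across as many consecutive sub-batches as its flux requires, while the faint objects fill in around it.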
This is ready to go, but it needs #511 to be merged first, after which this should be rebased onto it - so for now I'm leaving this as a draft.