Commit 46f88b3
Use the max size of serialized examples to find a safe number of shards
If we know the max size of serialized examples, then we can account for the worst case scenario where one shard would get only examples of the max size. This hopefully should prevent users running into problems with having too big shards.
PiperOrigin-RevId: 7263777781 parent 281ce2d commit 46f88b3
4 files changed
Lines changed: 136 additions & 21 deletions
File tree
- tensorflow_datasets/core
- dataset_builders
- utils
Lines changed: 1 addition & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
406 | 406 | | |
407 | 407 | | |
408 | 408 | | |
| 409 | + | |
409 | 410 | | |
410 | 411 | | |
411 | 412 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
57 | 57 | | |
58 | 58 | | |
59 | 59 | | |
| 60 | + | |
60 | 61 | | |
61 | 62 | | |
62 | 63 | | |
63 | 64 | | |
64 | 65 | | |
65 | 66 | | |
66 | | - | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
67 | 70 | | |
68 | 71 | | |
69 | 72 | | |
70 | 73 | | |
71 | | - | |
72 | | - | |
| 74 | + | |
| 75 | + | |
73 | 76 | | |
74 | 77 | | |
75 | 78 | | |
76 | 79 | | |
77 | 80 | | |
78 | 81 | | |
79 | 82 | | |
80 | | - | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
81 | 93 | | |
82 | 94 | | |
83 | 95 | | |
| |||
96 | 108 | | |
97 | 109 | | |
98 | 110 | | |
| 111 | + | |
99 | 112 | | |
100 | 113 | | |
101 | 114 | | |
102 | 115 | | |
103 | 116 | | |
104 | 117 | | |
105 | | - | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
106 | 122 | | |
107 | 123 | | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
108 | 127 | | |
109 | 128 | | |
110 | 129 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
25 | | - | |
26 | | - | |
27 | | - | |
28 | | - | |
29 | | - | |
30 | | - | |
31 | | - | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
32 | 105 | | |
33 | 106 | | |
34 | | - | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
35 | 113 | | |
36 | 114 | | |
37 | 115 | | |
38 | 116 | | |
39 | 117 | | |
40 | 118 | | |
41 | 119 | | |
| 120 | + | |
42 | 121 | | |
43 | 122 | | |
44 | 123 | | |
| |||
48 | 127 | | |
49 | 128 | | |
50 | 129 | | |
51 | | - | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
52 | 134 | | |
53 | 135 | | |
54 | 136 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
116 | 116 | | |
117 | 117 | | |
118 | 118 | | |
| 119 | + | |
119 | 120 | | |
120 | 121 | | |
121 | 122 | | |
| |||
125 | 126 | | |
126 | 127 | | |
127 | 128 | | |
| 129 | + | |
128 | 130 | | |
129 | 131 | | |
130 | 132 | | |
131 | 133 | | |
132 | | - | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
133 | 137 | | |
134 | 138 | | |
135 | 139 | | |
| |||
372 | 376 | | |
373 | 377 | | |
374 | 378 | | |
| 379 | + | |
375 | 380 | | |
376 | 381 | | |
377 | 382 | | |
| |||
589 | 594 | | |
590 | 595 | | |
591 | 596 | | |
592 | | - | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
593 | 600 | | |
594 | 601 | | |
595 | 602 | | |
| 603 | + | |
596 | 604 | | |
597 | 605 | | |
598 | 606 | | |
| |||
658 | 666 | | |
659 | 667 | | |
660 | 668 | | |
| 669 | + | |
| 670 | + | |
| 671 | + | |
661 | 672 | | |
662 | | - | |
663 | | - | |
664 | | - | |
665 | | - | |
| 673 | + | |
| 674 | + | |
| 675 | + | |
| 676 | + | |
666 | 677 | | |
667 | 678 | | |
668 | 679 | | |
669 | 680 | | |
670 | | - | |
| 681 | + | |
| 682 | + | |
| 683 | + | |
671 | 684 | | |
672 | 685 | | |
673 | 686 | | |
| |||
0 commit comments