<h1 id="-roblox-data-stores-batch-processor-cli-technical-guide-">
<strong>Roblox Data Stores Batch Processor CLI: Technical Guide</strong>
</h1>
<p><strong>⚠️ Use At Your Own Risk</strong></p>
<p>
  This CLI is a powerful tool that helps creators manage data in data stores.
  It interacts directly with your live experience data in bulk.
</p>
<p>
Batch data processing is complex, and many things can go wrong.
<strong
>Improper use, incorrect configurations, or errors in custom callback
functions can result in permanent, irreversible data loss.</strong
>
It is strongly recommended to test thoroughly on a test experience first.
</p>
<p>
  <strong
    >Please read this entire guide carefully before using this tool.</strong
  >
  Be especially cautious if you are performing a data migration while your
  experience is still reading / writing those keys. You will need to implement
  handling for this case; this is called a live-path migration and is covered
  in <strong>Section 10</strong>.
</p>
<p>By using this tool, you acknowledge and accept all risks.</p>
<p>
Welcome to the official guide for the Roblox Data Stores Batch Processor
Command-Line Interface (CLI). This document provides technical guidance for
installing, configuring, and operating the tool.
</p>
<p>
  This tool is open source under the MIT License! Feel welcome to fork the
  repository and open a pull request with your contributions. Please see the
<strong><a href="./LICENSE">LICENSE</a></strong> for details. This tool uses
third-party modules; their licenses are listed in
<strong><a href="./ATTRIBUTIONS.md">ATTRIBUTIONS.md</a></strong
>.
</p>
<h2 id="-1-overview-"><strong>1. Overview</strong></h2>
<p>
The Batch Processor CLI is a powerful command-line tool built on the Lune
runtime. It performs large-scale, custom operations on your experience's
  data stores by leveraging the memory stores, data stores, and
<code>LuauExecutionSessionTasks</code> Open Cloud APIs.
</p>
<p>
The CLI orchestrates the entire workflow, from task creation to execution,
while providing you with the necessary tools to actively manage and monitor
your running batch processes.
</p>
<h3 id="-1-1-use-cases-"><strong>1.1. Use Cases</strong></h3>
<p>
The tool allows you to run custom Luau logic across all keys within a data
store or across all data stores within an experience. This enables a variety
of powerful data operations, including:
</p>
<ul>
<li>
<strong>Schema Migrations:</strong> Updating the data structure for
thousands or millions of user profiles after an update.
</li>
<li>
<strong>Bulk Data Deletion:</strong> Removing specific data fields from all
user accounts for data cleanup or compliance, or clearing obsolete data
stores.
</li>
</ul>
<p>
If you are a DataStore2 or Berezaa Method user, this tool will be especially
useful for data migration and cleanup!
</p>
<h3 id="-1-2-how-it-works-"><strong>1.2. How It Works</strong></h3>
<p>
At a high level, the tool operates by creating and managing
<code>LuauExecutionSessionTasks</code> to execute your custom Luau scripts.
The batch process is broken down into two stages:
</p>
<ul>
<li>
<strong>Stage 1: Scan and Queue</strong><br />The first stage scans your
experience to identify all target data stores or keys based on your provided
criteria. It then populates a memory store queue with these items, which are
    considered "jobs" for the next stage. Stage 1 also accumulates
    progress as jobs are completed, updating the overall batch process state.
</li>
<li>
<strong>Stage 2: Dequeue and Process</strong><br />The second stage dequeues
the jobs created in Stage 1 and executes your custom Luau logic for each
item within a job. This is where your data manipulation, migration, or
analysis takes place.
</li>
</ul>
<p>
  If the batch process runs to completion, at-least-once processing is
  guaranteed: every targeted item is processed at least once, but an item may
  occasionally be processed more than once. For this reason, your callback
  logic should be idempotent.
</p>
<h2 id="-2-getting-started-"><strong>2. Getting Started</strong></h2>
<p>Follow these steps to set up the CLI and prepare your environment.</p>
<h3 id="-2-1-prerequisites-"><strong>2.1. Prerequisites</strong></h3>
<ol>
<li>
<strong>Unzip Module:</strong> Unzip the provided Batch Processing module
and save it to your local machine.
</li>
<li>
<strong>Install Lune:</strong>
<a href="https://lune-org.github.io/docs/getting-started/1-installation"
>Install the Lune runtime</a
>
on your machine. This is required to execute the tool.
</li>
<li>
<strong>Configure Open Cloud API Key:</strong> The tool requires an Open
Cloud API Key with specific permissions.
<ul>
<li>
Navigate to the
<a href="https://create.roblox.com/dashboard/credentials"
>Creator Hub API Extensions</a
>.
</li>
<li>Create a new API Key or edit an existing one.</li>
<li>
Grant the key the necessary permissions for your target experience, as
detailed in <strong>Appendix A: API Key Permissions</strong>.
</li>
<li>Save the generated API key to a secure location.</li>
</ul>
</li>
</ol>
<h3 id="-2-2-running-a-batch-process-">
<strong>2.2. Running a Batch Process</strong>
</h3>
<ol>
<li>
<strong>Set API Key Environment Variable:</strong> Before running a command,
set your API Key as an environment variable. You can do this directly in
your terminal, or see <strong>Appendix F: Tips and Tricks</strong> for
further ways of securing your API Key.
<ul>
<li>
<strong>macOS/Linux (sh, bash, zsh):</strong><br /><code
>export API_KEY="<Your-API-Key>"</code
>
</li>
<li>
<strong>Windows (PowerShell):</strong><br /><code
>$env:API_KEY="<Your-API-Key>"</code
>
</li>
</ul>
</li>
<li>
<strong>Create a Callback Function:</strong> Write your custom processing
logic in a Luau file. See
<strong>Appendix C: Custom Callback Function</strong> for requirements and
    examples. This callback function will be called for
    <strong>every key / data store scanned by the batch processor</strong>. Each
item will be processed independently.
</li>
  <li>
    <strong>Execute Command:</strong> Navigate to the top-level directory of the
    unzipped module and run one of the available commands
    (<code>process-keys</code> or <code>process-data-stores</code>). For
    example:
<ul>
<li>
<code
>lune run batch-process process-keys mybatchprocess -c
myconfig.json</code
>
</li>
</ul>
</li>
</ol>
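<p>
  As a sketch of the steps above, the snippet below writes a minimal
  configuration file and shows the corresponding run command. The
  <code>universeId</code>, <code>placeId</code>, <code>numRetries</code>, and
  <code>outputDirectory</code> keys are named elsewhere in this guide;
  <code>dataStoreName</code> is an assumption based on the camelCase-to-flag
  mapping used by <code>numRetries</code> / <code>--num-retries</code>, and all
  ID values are hypothetical placeholders.
</p>

```shell
# Write a minimal config file (all values below are placeholders).
cat > example-config.json <<'EOF'
{
  "universeId": 1234567890,
  "placeId": 1234567890,
  "dataStoreName": "PlayerData",
  "numRetries": 3,
  "outputDirectory": "./batch-output"
}
EOF

# Optional sanity check: confirm the file is valid JSON.
python3 -m json.tool example-config.json > /dev/null && echo "config OK"

# Then start the run (requires Lune and the API_KEY environment variable):
# lune run batch-process process-keys mybatchprocess -c example-config.json
```

See <strong>Appendix B: Configuration Glossary</strong> for the authoritative
list of configuration keys.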
<h3 id="-2-3-foreground-process-"><strong>2.3 Foreground Process</strong></h3>
<p>
The batch process runs actively in the foreground, and you must
<strong>keep the terminal session open</strong> for the entire duration of the
job.
</p>
<p>
  This is because the CLI performs a critical "keep-alive" function.
  Session Tasks currently have a finite lifetime (a 5-minute time limit) and
  can terminate for various reasons. The CLI actively monitors these sessions
  and automatically restarts any that die, ensuring that long-running batch
  processes can continue until completion. Closing the terminal terminates this
  orchestration process. Batch process state is stored in memory stores,
  however, so if the terminal closes you can <strong>resume</strong> the batch
  process later (see <strong>Section 4.3</strong>).
</p>
<p>
  Note that if you exit the foreground process, the batch process will continue
  to execute until all active Session Tasks terminate on their own.
</p>
<h3 id="-2-4-cleaning-up-a-batch-process-">
<strong>2.4 Cleaning up a Batch Process</strong>
</h3>
<p>
  After a batch process finishes or fails, the process name remains reserved
  until the process is cleaned up. This allows you to review and keep a record
  of previously completed batch processes. To clean up a batch process, call
  the <code>cleanup</code> command like so:
</p>
<ul>
<li>
<code>lune run batch-process cleanup mybatchprocess -c myconfig.json</code>
</li>
</ul>
<p>
  “Cleaning up” a batch process means deleting the data store and memory store
  resources used for execution; see <strong>Appendix D</strong> for information
  on these resources.
</p>
<h2 id="-3-best-practices-"><strong>3. Best Practices</strong></h2>
<p>
Before executing a large-scale batch process on your production data, it is
crucial to follow these testing and verification steps to prevent unintended
consequences.
</p>
<ul>
<li>
<strong>Test on a Non-Production Experience First</strong><br />Always run
your batch process against a test data store with a smaller, representative
dataset. This allows you to validate your callback logic and tune
configurations in a safe environment without impacting live user data.
Verify that the batch process is running using the
<a
href="https://create.roblox.com/docs/cloud-services/data-stores/observability"
><strong>Data Stores Observability Dashboard</strong></a
>, and verify any changes to your data stores manually in
<a
href="https://create.roblox.com/docs/cloud-services/data-stores/data-stores-manager"
><strong>Data Stores Manager</strong></a
    >. Also ensure that your memory stores usage and error rate from the batch
    process are at acceptable levels with the
<a
href="https://create.roblox.com/docs/cloud-services/memory-stores/observability"
><strong>Memory Stores Observability Dashboard</strong></a
>.
</li>
<li>
<strong>Perform a Scope Test</strong><br />Perform a “scope test” by running
a batch process with an empty callback function on your keys / data stores.
    Given your <code>--key-prefix</code> or <code>--data-store-prefix</code>,
    observe how many items are captured by the batch process and ensure the
    count matches what you expect.
</li>
<li>
<strong>Perform a Spot Test</strong><br />Once you are confident in your
script and configurations, perform a limited "spot test" on your
production environment. Use the <code>--key-prefix</code> or
<code>--data-store-prefix</code> options to target a single, specific data
point (e.g., your own test account’s key). Verify that the operation
completes successfully and has the intended effect on that single item
before running it on the entire dataset.
</li>
<li>
<strong>Verify All Configurations</strong><br />Before initiating a full
run, double-check all provided configurations—either in your config file or
as command-line options.
<strong
>We have picked default configurations that work for most use cases, but
they may not work for yours!</strong
>
Ensure all values are within reasonable limits for your use case and that
you understand the performance implications of all configurations.
</li>
<li>
<strong>Monitor Batch Process Runs</strong><br />After starting a batch
process, monitor the run for several minutes before walking away. Follow the
same steps as discussed above in
<strong>Test on a Non-Production Experience First</strong>.
</li>
</ul>
<h2 id="-4-command-reference-"><strong>4. Command Reference</strong></h2>
<h3 id="-4-1-process-keys-"><strong>4.1. process-keys</strong></h3>
<p>
Starts a batch process that iterates over key names within a single standard
data store.
</p>
<p><strong>Usage Examples:</strong></p>
<pre><code class="lang-sh"><span class="hljs-comment"># Using direct command-line options</span>
lune run batch-<span class="hljs-built_in">process</span> <span class="hljs-built_in">process</span>-<span class="hljs-built_in">keys</span> <<span class="hljs-built_in">process</span>-name> <span class="hljs-comment">--data-store-name <name> [other-options...]</span>
<span class="hljs-comment"># Using a configuration file</span>
lune run batch-<span class="hljs-built_in">process</span> <span class="hljs-built_in">process</span>-<span class="hljs-built_in">keys</span> <<span class="hljs-built_in">process</span>-name> <span class="hljs-comment">--config <filepath></span>
</code></pre>
<p><strong>Arguments:</strong></p>
<ul>
<li>
<strong><code><process-name></code></strong
>: The unique name of the new batch process. The process name can be at most
18 characters long.
</li>
</ul>
<p><strong>Command-Specific Options:</strong></p>
<ul>
<li>
<strong><code>--data-store-name / -d</code> :</strong> The name of the data
store to scan.
</li>
<li>
<strong><code>--data-store-scope / -s</code> :</strong> The scope of the
data store.
</li>
<li>
<strong><code>--key-prefix / -P</code> :</strong> A prefix to filter which
keys are processed.
</li>
<li>
<strong><code>--exclude-deleted-keys</code> :</strong> A flag to exclude
keys that have been previously deleted.
</li>
</ul>
<p>
<strong>Shared Options:</strong><br />This command also accepts all required
options (<code>--universe-id</code>, etc.) and all optional processing
  configurations. See <strong>Appendix B: Configuration Glossary</strong> for
the full list.
</p>
<h3 id="-4-2-process-data-stores-">
<strong>4.2. process-data-stores</strong>
</h3>
<p>
Starts a batch process that iterates over standard data store names within an
experience.
</p>
<p><strong>Usage Examples:</strong></p>
<pre><code class="lang-sh"><span class="hljs-comment"># Using direct command-line options</span>
lune run batch-<span class="hljs-built_in">process</span> <span class="hljs-built_in">process</span>-data-stores <<span class="hljs-built_in">process</span>-name> <span class="hljs-comment">--data-store-prefix <prefix> [other-options...]</span>
<span class="hljs-comment"># Using a configuration file</span>
lune run batch-<span class="hljs-built_in">process</span> <span class="hljs-built_in">process</span>-data-stores <<span class="hljs-built_in">process</span>-name> <span class="hljs-comment">--config <filepath></span>
</code></pre>
<p><strong>Arguments:</strong></p>
<ul>
<li>
<strong><code><process-name></code></strong> : The unique name of the
new batch process. The process name can be at most 18 characters long.
</li>
</ul>
<p><strong>Command-Specific Options:</strong></p>
<ul>
<li>
<strong><code>--data-store-prefix / -P</code></strong> : A prefix to filter
which data stores are processed.
</li>
</ul>
<p>
<strong>Shared Options:</strong><br />This command also accepts all required
options (<code>--universe-id</code>, etc.) and all optional processing
configurations. See <strong>Appendix B: Configuration Glossary</strong> for
the full list.
</p>
<h3 id="-4-3-resume-"><strong>4.3. resume</strong></h3>
<p>
Resumes a previously running batch process that has not completed. A batch
process is considered incomplete if it fails, or if the terminal is closed and
the session tasks time out before all items can be processed.
</p>
<p><strong>Usage:</strong></p>
<pre><code class="lang-sh"><span class="hljs-comment"># Using direct command-line options</span>
lune <span class="hljs-built_in">run</span> batch-process resume <process-<span class="hljs-built_in">name</span>> <span class="hljs-comment">--universe-id <id></span>
<span class="hljs-comment"># Using a configuration file</span>
lune <span class="hljs-built_in">run</span> batch-process resume <process-<span class="hljs-built_in">name</span>> <span class="hljs-comment">--config <filepath></span>
</code></pre>
<p><strong>Arguments:</strong></p>
<ul>
<li>
<strong><code><process-name></code></strong> : The unique name of the
previously run batch process to resume.
</li>
</ul>
<p><strong>Option:</strong></p>
<ul>
<li>
<strong><code>--universe-id / -u</code></strong> : The universe to load /
resume the batch process on.
</li>
</ul>
<p>
<strong>Shared Options:</strong><br />When resuming a process, all original
configurations are loaded from the saved state. You can optionally override
any of the following configurations:
</p>
<ul>
<li><code>numProcessingInstances</code></li>
<li><code>outputDirectory</code></li>
<li><code>errorLogMaxLength</code></li>
<li><code>jobQueueMaxSize</code></li>
<li><code>maxTotalFailedItems</code></li>
<li><code>memoryStoresExpiration</code></li>
<li><code>memoryStoresStorageLimit</code></li>
<li><code>numRetries</code></li>
<li><code>retryTimeoutBase</code></li>
<li><code>retryExponentialBackoff</code></li>
<li><code>processItemRateLimit</code></li>
<li><code>progressRefreshTimeout</code></li>
</ul>
<p><strong>Important Notes</strong>:</p>
<ul>
<li>
Resuming will also
<strong>reload the callback function from the filepath</strong>. Please
ensure that the callback function and filepath are still correct before
resuming a batch process.
</li>
<li>
Resuming will <strong>reset the output directory</strong>. If you would like
to save the previous output, then please change the outputDirectory
configuration while resuming.
</li>
<li>
When resuming a previously failed batch process, it is
<strong>required</strong> that you set the
<code>maxTotalFailedItems</code> to a value greater than the current failed
item count.
</li>
<li>
The new configurations are only applied to
<strong>new session tasks</strong>. So, if you resume a batch process that
still has active Session Tasks running, the process will continue with the
old configurations until these sessions complete.
</li>
</ul>
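<p>
  Putting the notes above together, a sketch of a configuration override for
  resuming a previously failed run might look like the following. Both keys
  come from the overridable list above; the values are illustrative only.
</p>

```json
{
  "maxTotalFailedItems": 500,
  "outputDirectory": "./batch-output-resume-1"
}
```

<p>
  Here <code>maxTotalFailedItems</code> is raised above the current failed item
  count (as required), and a fresh <code>outputDirectory</code> preserves the
  previous run's output from being reset.
</p>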
<h3 id="-4-4-cleanup-"><strong>4.4. cleanup</strong></h3>
<p>Cleans up an old batch process.</p>
<p><strong>Usage:</strong></p>
<pre><code class="lang-sh"><span class="hljs-comment"># Using direct command-line options</span>
lune <span class="hljs-built_in">run</span> batch-process cleanup <process-<span class="hljs-built_in">name</span>> <span class="hljs-comment">--universe-id <id></span>
<span class="hljs-comment"># Using a configuration file</span>
lune <span class="hljs-built_in">run</span> batch-process cleanup <process-<span class="hljs-built_in">name</span>> <span class="hljs-comment">--config <filepath></span>
</code></pre>
<p><strong>Command-Specific Arguments:</strong></p>
<ul>
<li>
<strong><code><process-name></code></strong> : The unique name of the
    previously run batch process to clean up.
</li>
</ul>
<p><strong>Option:</strong></p>
<ul>
<li>
    <strong><code>--universe-id / -u</code></strong> : The universe the batch
    process was run on.
</li>
</ul>
<h3 id="-4-5-list-"><strong>4.5. list</strong></h3>
<p>Lists all batch processes.</p>
<p><strong>Usage:</strong></p>
<pre><code class="lang-sh"><span class="hljs-comment"># Using direct command-line options</span>
lune <span class="hljs-built_in">run</span> batch-process <span class="hljs-built_in">list</span> <span class="hljs-comment">--universe-id <id></span>
<span class="hljs-comment"># Using a configuration file</span>
lune <span class="hljs-built_in">run</span> batch-process <span class="hljs-built_in">list</span> <span class="hljs-comment">--config <filepath></span>
</code></pre>
<p><strong>Option:</strong></p>
<ul>
<li>
    <strong><code>--universe-id / -u</code></strong> : The universe to list
    batch processes for.
</li>
</ul>
<h2 id="-5-configuration-"><strong>5. Configuration</strong></h2>
<p>
You can configure a batch process using a combination of a configuration file,
command-line options, and interactive prompts. The tool uses a clear order of
precedence to determine the final configurations for a run.
</p>
<h3 id="-order-of-precedence-"><strong>Order of Precedence</strong></h3>
<p>
Configurations are applied in the following order, with later methods
overriding earlier ones:
</p>
<ol>
<li>
<strong>Default Values:</strong> Built-in defaults for optional
configurations.
</li>
<li>
<strong>Configuration File:</strong> Configurations loaded from a JSON file
specified with the <code>-c</code> / <code>--config</code> flag.
</li>
<li>
<strong>Command-Line Options:</strong> Flags passed directly in the command
(e.g., <code>--num-retries 5</code>). These will always override any values
set in a config file.
</li>
<li>
<strong>Interactive Prompts:</strong> For any
<em>required</em> configurations not provided by the methods above, the CLI
will prompt for input.
</li>
</ol>
<h3 id="-recommended-workflow-"><strong>Recommended Workflow</strong></h3>
<p>
A powerful way to use the CLI is to combine a configuration file with
command-line overrides. This allows you to maintain consistent base
configurations while retaining flexibility.
</p>
<ol>
<li>
<strong>Create a base config.json file</strong> with your common
configurations (like universeId, placeId, and default processing
parameters).
</li>
<li>
<strong>Use command-line options</strong> to override specific
configurations for a particular run.
</li>
</ol>
<p>
<strong><em>Example:</em></strong
><br />Imagine <code>my_config.json</code> sets <code>numRetries</code> to
<code>3</code>. You can override it for a single run:
</p>
<pre><code class="lang-sh"><span class="hljs-comment"># This run will use numRetries = 5, overriding the value from the file.</span>
<span class="hljs-comment"># All other settings will be loaded from my_config.json.</span>
lune run batch-<span class="hljs-built_in">process</span> <span class="hljs-built_in">process</span>-<span class="hljs-built_in">keys</span> my_process -c my_config.json <span class="hljs-comment">--num-retries 5</span>
</code></pre>
<p>
See <strong>Appendix B: Configuration Glossary</strong> for a full list of
available options.
</p>
<h2 id="-6-observability-and-debugging-">
<strong>6. Observability and Debugging</strong>
</h2>
<h3 id="-6-1-output-files-"><strong>6.1. Output Files</strong></h3>
<p>
The CLI generates an output directory for each batch process, providing tools
for monitoring and debugging. To inspect session details and logs, you must
  call the Open Cloud endpoints with your API key included in the
  <code>x-api-key</code> header.
</p>
<ul>
<li>
<strong><code>failed_items.output</code></strong
><br />Contains a line-separated record for each item that failed
processing.<br /><strong><em>Format:</em></strong>
<code
><item_name>|<error_time>|<truncated_error_log>|<session_path></code
    ><br />Metadata for failed items (such as error time, logs, and session
    path) is not guaranteed and is always lost upon resuming a batch process.
    For persistent error tracking, save this information to a separate location
    before resuming a batch process.
</li>
<li>
<strong><code>process.output</code></strong
><br />Displays the current status of the process along with its active
configurations.
</li>
<li>
<strong
><code
>stage1_session_paths_active.output &
stage2_session_paths_active.output</code
></strong
><br />Lists the session paths for currently active Stage 1 (scanning) and
Stage 2 (processing) tasks. Note that logs are only available after a
session completes.<br />Example Request:
<code>GET https://apis.roblox.com/cloud/v2/<session_path></code>
</li>
<li>
<strong
><code
>stage1_session_paths_complete.output &
stage2_session_paths_complete.output</code
></strong
><br />An archive of session paths that have completed execution.
</li>
<li>
    <strong><code>session_logs</code> directory</strong
    ><br />Contains a file with each completed session’s task logs. Each log
file’s name is <code><task ID>.output</code>, where the task ID is the
final component of the session path.
</li>
<li>
<strong><code>cli.output</code></strong
><br />A list of warnings and other debugging messages coming directly from
the CLI. You will likely not need to use this, but in cases of unexpected
behavior it may contain insights into the issue.
</li>
</ul>
<p>
Given a session path from <code>failed_items.output</code> or
<code>stage[1/2]_session_paths_[active/complete].output</code>, you can find
its logs in the <code>session_logs</code> directory.
</p>
<p>
Alternatively, you can manually query the session task status or logs by
requesting the following Open Cloud endpoints. Include your API key in the
<code>x-api-key</code> header before making the requests:
</p>
<ul>
<li>
<strong>Query Task Status:</strong><br /><code
>GET https://apis.roblox.com/cloud/v2/<session_path></code
>
</li>
<li>
<strong>Query Logs:</strong><br /><code
>GET https://apis.roblox.com/cloud/v2/<session_path>/logs</code
>
</li>
</ul>
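<p>
  For example, given a session path you can derive the local log file name and
  build the logs request URL. The session path below is a hypothetical
  placeholder; copy real paths from your
  <code>stage[1/2]_session_paths_[active/complete].output</code> files.
</p>

```shell
# Hypothetical session path; real ones come from your output files.
SESSION_PATH="universes/123/luau-execution-sessions/abc/tasks/task123"

# The local log file name is the final path component plus ".output".
TASK_ID="$(basename "$SESSION_PATH")"
echo "session_logs/${TASK_ID}.output"

# Build the Open Cloud logs URL; the curl line requires a valid key and path.
URL="https://apis.roblox.com/cloud/v2/${SESSION_PATH}/logs"
echo "$URL"
# curl -H "x-api-key: $API_KEY" "$URL"
```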
<h3 id="-6-2-troubleshooting-"><strong>6.2. Troubleshooting</strong></h3>
<p>
If a batch process appears to be stalled or not working as expected, follow
these diagnostic steps. Some circumstances can cause it to fail or to
<em>appear</em> unresponsive while it is still running in the background.
</p>
<ol>
<li>
<strong>Check Memory Stores Dashboard:</strong> The first step is to verify
that the process is actually running. The CLI's orchestration logic
heavily uses memory stores.
<ul>
<li>
Navigate to your <strong>Memory Stores Dashboard</strong> on the Creator
Hub.
</li>
<li>
Check for read and write activity. A running process will generate
continuous activity. This is the primary indicator that the tool is
operational.
</li>
<li>
Ensure that your experience has
<strong>available memory stores quota</strong>. If your experience is
out of available storage or Request Units, the batch process may stall.
</li>
</ul>
</li>
<li>
<strong>Inspect Session Logs for Completed Tasks:</strong> If memory stores
are active but you see no progress, a session may have completed or errored
out.
<ul>
<li>
          Wait for a session path to appear in one of the
          <code>..._complete.output</code> files in your output directory.
</li>
<li>
          Once a path appears, check for its logs in the
          <code>session_logs</code> directory, or query the Open Cloud endpoint
          directly.
</li>
<li>
Reading these logs can help diagnose issues within the Luau callback
script or reveal problems like exceeding memory store storage limits,
which might prevent the process from updating its state.
</li>
<li>
          Note that there may be warning messages implying that the process has
          hit its configured <code>memoryStoresStorageLimit</code>. These do not
          necessarily mean your experience is fully out of memory stores quota;
          to be certain, check your Memory Stores Dashboard.
</li>
</ul>
</li>
<li>
<strong>Review the cli.output File:</strong> In rare cases, the CLI itself
may encounter an unrecoverable error.
<ul>
        <li>Check the <code>cli.output</code> file in your output directory.</li>
<li>
This file contains critical errors related to the CLI's internal
operations. An error in this file indicates an issue with the tool
itself, rather than with the Luau scripts it is orchestrating. The CLI
has built-in retry logic for all essential operations; if it exhausts
its retries, it will terminate and log the final error here.
</li>
</ul>
</li>
</ol>
<h2 id="-7-performance-tuning-and-resource-management-">
<strong>7. Performance Tuning and Resource Management</strong>
</h2>
<p>
This section provides guidance on how to configure the CLI to balance
processing speed with resource consumption.
</p>
<h3 id="-7-1-understanding-and-managing-memory-stores-usage-">
<strong>7.1. Understanding and Managing Memory Stores Usage</strong>
</h3>
<p>
The CLI uses memory stores for job orchestration. Before starting a process,
the tool will provide you with an
<strong>estimated upper bound for memory stores usage</strong> (both for
access in Request Units/min and for storage in KB) based on your
configurations. This allows you to assess the potential impact on your
experience, especially if it has low concurrent users (CCU) or already uses
memory stores heavily.
</p>
<p>
You can further control resource consumption with the
<code>--memory-stores-storage-limit</code> option. This sets a safety cap (in
kilobytes) for the batch process. If the tool detects that storage usage is
approaching this limit, it will automatically slow down or pause the Stage 1
scanning process to allow the Stage 2 workers to clear the job queue. This
reduces memory pressure and prevents the process from failing due to storage
limits.
</p>
<p>
  Be especially wary of your storage quota. If the storage limit is ever
  reached, the batch processor may freeze indefinitely, and you may need to
  flush your memory stores or clean up your batch process.
</p>
<h3 id="-7-2-optimizing-for-processing-speed-vs-resource-usage-">
<strong>7.2. Optimizing for Processing Speed vs. Resource Usage</strong>
</h3>
<p>
Tuning your batch process involves a trade-off between speed and the
consumption of memory store resources. The configurations with the largest
impact are <code>numProcessingInstances</code>, <code>maxItemsPerJob</code>,
and <code>jobQueueMaxSize</code>.
</p>
<h4 id="-to-maximize-processing-speed-">
<strong>To Maximize Processing Speed:</strong>
</h4>
<ul>
<li>
<strong>Do:</strong> Increase <code>numProcessingInstances</code>. This is
the primary lever for speed, as it increases the number of concurrent
workers processing items.
</li>
<li>
<strong>Do:</strong> Increase <code>maxItemsPerJob</code>. This allows the
Stage 1 scanner to enqueue jobs more quickly by fetching more items per List
call, which helps keep your processing instances fed with work.
</li>
<li>
<strong>Be Aware:</strong> This strategy will significantly increase both
memory store access (RU/min) and storage usage. It is best suited for
experiences with high CCU and a large memory store quota.
</li>
</ul>
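<p>
As a sketch, a speed-oriented configuration file might look like the
following (the values are illustrative only, and assume your config file uses
these option names verbatim):
</p>
<pre><code>{
    "numProcessingInstances": 10,
    "maxItemsPerJob": 200,
    "jobQueueMaxSize": 40
}
</code></pre>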
<h4 id="-to-minimize-memory-stores-storage-access-impact-">
<strong>To Minimize Memory Stores Storage / Access Impact:</strong>
</h4>
<p>
This is recommended for experiences with low CCU or those that already use
memory stores heavily for live game features.
</p>
<ul>
<li>
<strong>Do:</strong> Use a lower <code>numProcessingInstances</code> (e.g.,
1-3). This is the most effective way to reduce memory store access usage.
</li>
<li>
<strong>Do:</strong> Keep <code>jobQueueMaxSize</code> at a low value (e.g.,
the default of 20). A large queue is unnecessary if the processing rate is
slower and only serves to consume storage.
</li>
<li>
<strong>Do:</strong> Keep <code>maxItemsPerJob</code> at a moderate value. A
smaller job size reduces the amount of data stored in the queue at any given
time.
</li>
<li>
<strong>Be Aware:</strong> This will result in a slower overall batch
process but ensures the tool has a minimal footprint on your
experience's resources.
</li>
</ul>
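<p>
As a sketch, a low-impact configuration file might look like the following
(the values are illustrative only, and assume your config file uses these
option names verbatim):
</p>
<pre><code>{
    "numProcessingInstances": 2,
    "maxItemsPerJob": 50,
    "jobQueueMaxSize": 20
}
</code></pre>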
<h4 id="-what-not-to-do-"><strong>What Not to Do:</strong></h4>
<ul>
<li>
<strong>Don't</strong> set <code>jobQueueMaxSize</code> to a very high
number unless you have a specific reason. The queue is unlikely to fill up
in most scenarios, and a large value primarily consumes unnecessary storage.
</li>
<li>
<strong>Don't</strong> set <code>maxItemsPerJob</code> so high that a
single job cannot be completed within the 5-minute
<code>LuauExecutionSessionTask</code> lifetime. A safe upper bound is
<code>maxItemsPerJob < processItemRateLimit * 4</code>.
</li>
</ul>
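<p>
The rule of thumb above can be expressed as a quick check. This is a sketch
in Python (assuming <code>processItemRateLimit</code> is a per-minute rate,
so the factor of 4 leaves roughly a one-minute margin inside the 5-minute
task lifetime):
</p>
<pre><code>def fits_task_lifetime(max_items_per_job: int, process_item_rate_limit: int) -> bool:
    """True if a job of this size should complete within the 5-minute
    LuauExecutionSessionTask lifetime, with roughly a one-minute margin."""
    return max_items_per_job < process_item_rate_limit * 4

# e.g. with a limit of 60 items/min, keep maxItemsPerJob below 240
</code></pre>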
<p>
The default optional configurations are tuned to suffice for most batch
processing scenarios. However, it is ultimately up to you to tune the
configurations to your particular use case.
</p>
<h2 id="-8-example-use-case-data-deletion-">
<strong>8. Example Use Case: Data Deletion</strong>
</h2>
<p>
Deleting all data in a single, obsolete data store can be done using the
Delete Data Store API directly from the Data Stores Manager on Creator Hub.
However, that path may take up to 30 days to reduce your storage footprint.
To delete data faster, you can use the batch processor. Exercise caution:
only delete data from a data store that you are
<strong>no longer using</strong>.
</p>
<ol>
<li>
<strong>Create a Deletion Callback:</strong> Use the following callback with
optional logging:
</li>
</ol>
<pre><code><span class="hljs-built_in">local</span> DataStoreService = game:GetService(<span class="hljs-string">"DataStoreService"</span>)
<span class="hljs-built_in">local</span> dataStoreToDelete = DataStoreService:GetDataStore(<span class="hljs-string">"<INSERT DATA STORE NAME>"</span>, <span class="hljs-string">"<INSERT DATA STORE SCOPE>"</span>)
<span class="hljs-literal">return</span> <span class="hljs-function"><span class="hljs-keyword">function</span>(<span class="hljs-title">item</span>)</span>
...
<span class="hljs-comment">-- <ADD OPTIONAL LOGGING HERE></span>
...
dataStoreToDelete:RemoveAsync(<span class="hljs-keyword">item</span>)
<span class="hljs-keyword">end</span>
</code></pre>
<ol start="2">
<li>
<strong>Create the Configuration File</strong> (or provide arguments on the
command line)
</li>
<li>
<strong
>Run the <code>process-keys</code> command to kick off the
deletion:</strong
>
e.g.<br /><code
>lune run batch-process process-keys DS_Deletion -c <config
filepath>.json</code
>
</li>
</ol>
<p>
Relatedly, you can delete data stores in bulk by integrating the Delete Data
Store API with the batch processor. Again, exercise caution: only delete data
stores that you are no longer using.
</p>
<ol>
<li>
<strong>Create a Deletion Callback:</strong> Use the following callback with
optional logging:
</li>
</ol>
<pre><code><span class="hljs-keyword">local</span> HttpService = game:GetService(<span class="hljs-string">"HttpService"</span>)
<span class="hljs-keyword">local</span> UNIVERSE_ID = <span class="hljs-string">"<INSERT YOUR UNIVERSE ID>"</span>
<span class="hljs-keyword">return</span> <span class="hljs-function"><span class="hljs-keyword">function</span>(<span class="hljs-title">item</span>)</span>
    ...
    <span class="hljs-comment">-- <ADD OPTIONAL LOGGING HERE></span>
    ...
    <span class="hljs-comment">-- `item` is the name of the data store to delete</span>
    <span class="hljs-keyword">local</span> encodedDataStoreName = HttpService:UrlEncode(item)
    <span class="hljs-keyword">local</span> apiKey = HttpService:GetSecret(<span class="hljs-string">"<INSERT YOUR API KEY SECRET NAME>"</span>)
    <span class="hljs-keyword">local</span> response = HttpService:RequestAsync({
        Url = <span class="hljs-string">"https://apis.roblox.com/cloud/v2/universes/"</span> .. UNIVERSE_ID
            .. <span class="hljs-string">"/data-stores/"</span> .. encodedDataStoreName,
        Method = <span class="hljs-string">"DELETE"</span>,
        Headers = {
            [<span class="hljs-string">"x-api-key"</span>] = apiKey,
            [<span class="hljs-string">"Content-Type"</span>] = <span class="hljs-string">"application/json"</span>,
        },
    })
    ...
    <span class="hljs-comment">-- <ADD OPTIONAL LOGGING HERE></span>
    ...
<span class="hljs-keyword">end</span>
</code></pre>
<ol start="2">
<li>
<strong>Create the Configuration File</strong> (or provide arguments on the
command line)
</li>
<li>
<strong
>Run the <code>process-data-stores</code> command to kick off the
deletion:</strong
>
e.g.<br /><code
>lune run batch-process process-data-stores DS_Deletion -c <config
filepath>.json</code
>
</li>
</ol>
<h2 id="-9-example-use-case-data-migrations-">
<strong>9. Example Use Case: Data Migrations</strong>
</h2>
<h3 id="-9-1-general-migration-strategy-">
<strong>9.1. General Migration Strategy</strong>
</h3>
<p>
When performing a large-scale data migration, it is critical to handle cases
where users might join your experience mid-migration. This requires a robust
strategy to ensure data integrity.
</p>
<ol>
<li>
<strong>Implement a Live-Path Migration:</strong> Before starting the batch
process, implement and deploy live-server code that handles migration for
any user who joins the experience. This "live path" ensures that
active users are migrated on-the-fly. This logic must use a
<strong>migration marker</strong> to indicate that data has been migrated.
</li>
<li>
<strong>Use a Migration Marker:</strong> A migration marker is a piece of
data or any signifier that indicates a specific key has already been
migrated. Your callback function for the batch process
<strong>must</strong> check for this marker before attempting to migrate
data. This prevents the batch process from overwriting data that was already
migrated by the live-path logic, making the process idempotent (safe to run
multiple times).
</li>
<li>
<strong>Recommended Marker Patterns:</strong>
<ul>
<li>
<strong>Writing to a New Data Store (Recommended):</strong> The safest
approach is to read data from the old data store and write the migrated
data to a <em>new</em> data store. The migration marker is simply the
existence of the key in the new data store. Your callback logic would
be: if the key exists in the new store, do nothing; otherwise, perform
the migration.
</li>
<li>
<strong>Adding a Marker Property (In-Place Migration):</strong> If you
are overwriting data in the <em>same</em> data store, you must add a
"marker" property to the data object itself, or to the data
store metadata (e.g., <code>isMigrated = true</code> or
<code>schemaVersion = 2</code>). Your callback logic must first check if
this property exists and has the correct value before proceeding.
</li>
</ul>
</li>
</ol>
<p>
By combining a live-path migration with a marker check in your batch callback,
you create an idempotent process that is safe to run on live experiences.
</p>
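<p>
As a sketch, a callback for the "new data store" marker pattern might look
like the following (assuming the same callback signature as the deletion
examples in this guide; <code>migrate</code> is a hypothetical function
implementing your data transform):
</p>
<pre><code>local DataStoreService = game:GetService("DataStoreService")
local oldStore = DataStoreService:GetDataStore("<INSERT OLD DATA STORE NAME>")
local newStore = DataStoreService:GetDataStore("<INSERT NEW DATA STORE NAME>")

return function(item)
    -- Marker check: the key's existence in the new store means it was
    -- already migrated (possibly by the live path), so do nothing.
    if newStore:GetAsync(item) ~= nil then
        return
    end
    local oldData = oldStore:GetAsync(item)
    if oldData ~= nil then
        newStore:SetAsync(item, migrate(oldData))
    end
end
</code></pre>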
<h3 id="-9-2-specific-example-datastore2-migration-deletion-">
<strong>9.2. Specific Example: DataStore2 Migration + Deletion</strong>
</h3>
<p>
The tool can be used to perform data migration and cleanup for the
<strong>DataStore2 module</strong>, which uses the
<strong>"Berezaa Method"</strong> of storing data. Example
configuration and callback files for this exact use case are provided in the
<code>examples</code> folder of the tool.
</p>
<p>
<strong>Provided Files in <code>examples/</code></strong>
</p>
<ul>
<li>
<strong><code>ds2-config.json</code></strong
>: A set of configurations specifically tuned for a DataStore2 migration and
deletion batch process.
</li>
<li>
<strong><code>ds2-migrate.luau</code></strong
>: A callback script that migrates the latest version from the DataStore2
data store to the migrated data store.
</li>
<li>
<strong><code>ds2-migrate-delete.luau</code></strong
>: A callback script that <strong>1)</strong> migrates the latest version
from the DataStore2 data store to the migrated data store, and
<strong>2)</strong> deletes <strong>all but the latest version</strong> from
the DataStore2 data store. We recommend using this script, since it will
enable rollbacks in emergencies.
</li>
<li>
<strong><code>ds2-migrate-delete-all.luau</code></strong
>: A callback script that <strong>1)</strong> migrates the latest version
from the DataStore2 data store to the migrated data store, and
<strong>2)</strong> deletes <strong>all data</strong> from the DataStore2
data store, including the Data Store resource. Use this script with caution.
</li>
</ul>
<p><strong>Prerequisites:</strong></p>
<ol>
<li>
Integrate the <strong>DataStore2 Migration Tool</strong> into your
experience.
</li>
<li>
Ensure the "Saving Method" is
<code>MigrateOrderedBackups</code> (and set it if not).
</li>
<li>
Follow the testing instructions provided in the
<a
href="https://devforum.roblox.com/t/datastores-migration-packages-for-the-berezaa-method-and-datastore2-module-public-beta/3631514"
><strong>Data Stores Migration Packages Dev Forum Post</strong></a
>.
</li>
<li>Publish the experience with these changes.</li>
</ol>
<p><strong>Execution Steps:</strong></p>
<ol>
<li>Complete the setup in the <strong>Getting Started</strong> section.</li>
<li>
If you are using the <code>ds2-migrate-delete-all.luau</code> callback, see