Conversation
added 3 commits
March 23, 2026 21:05
---
X-AI-Tool: Human
X-AI-Prompt: can you summerize this PR #5763 so I can add discription in the pr
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 74
X-AI-Prompt: can you look at this dockerfile sample https://github.com/aws/deep-learning-containers/pull/5808/changes#diff-aff16f8c535417fcf020bc2184ab09935e6c66cf46842f6ccee6d2022f4077ff to modify my dockerfile for oss setup /Volumes/workplace/kiro-workplace/AsimovBuilderCoreContext/src/AsimovBuilderCoreContext/workspace/2week/deep-learning-containers/docker/vllm/Dockerfile.amzn2023
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
---
X-AI-Tool: Human
X-AI-Prompt: can you look at this dockerfile sample https://github.com/aws/deep-learning-containers/pull/5808/changes#diff-aff16f8c535417fcf020bc2184ab09935e6c66cf46842f6ccee6d2022f4077ff to modify my dockerfile for oss setup /Volumes/workplace/kiro-workplace/AsimovBuilderCoreContext/src/AsimovBuilderCoreContext/workspace/2week/deep-learning-containers/docker/vllm/Dockerfile.amzn2023
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
added 8 commits
March 23, 2026 21:50
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 183
X-AI-Prompt: for my build vllm container, how can I add benchmark test with popular models
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 142
X-AI-Prompt: okay could you implement for me and could you find which s3 bucket sample pr is using, we can use the same one
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 28
X-AI-Prompt: how the cache will be saved bucket/hash/**.o?
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
feat(vllm): add benchmark tests and sccache build acceleration

Add vLLM benchmark test infrastructure with configurable throughput and latency thresholds per model, integrated into the PR workflow. Benchmark tests run against gpt-oss-20b, llama-3.3-70b, and qwen3-32b using both CodeBuild fleet and runner-scale-sets runners.

Add sccache with an S3 backend to the vLLM build stage to cache compiled object files across CI runs. This replaces the ineffective local ccache mount (lost on ephemeral CodeBuild runners) and enables incremental recompilation when upstream cherry-picks or patches change only a subset of source files. sccache is conditionally enabled via the SCCACHE_BUCKET build arg, reusing the existing WHEEL_CACHE_BUCKET repository variable from the PyTorch workflow.

ai-dev-branch commit IDs: 7da43b3 d49c6f8 23c5e1a 1147845 eb2f2f5 8ea8588 3535d41

The prompts used are captured in the footers of those commits. The initial prompt was: can you summerize this PR #5763 so I can add discription in the pr

---
X-AI-Handle-Time-Seconds: 353
X-AI-Line-Changes: New:414, Altered:1, Deleted:0
X-Human-Line-Changes: New:0, Altered:0, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:414, Altered:1, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 353
X-AI-Change-Count: 3
X-Human-Change-Count: 0
X-AI-Change-Count-Kiro-cli: 3
X-CR-Amendment: false
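In Dockerfile terms, the conditional sccache wiring this commit describes could look roughly like the sketch below. This is illustrative, not the PR's actual diff: the S3 prefix and region are taken from the build-log URI quoted later in this thread (s3.us-west-2.amazonaws.com/dlc-cicd-models/sccache/vllm), and the trailing `sccache --show-stats` guard matches the failed RUN command shown in that log; everything else is an assumption.

```dockerfile
# Sketch only — not the PR's actual Dockerfile. sccache is enabled purely by
# whether the SCCACHE_BUCKET build arg is non-empty.
ARG SCCACHE_BUCKET=""
RUN if [ -n "${SCCACHE_BUCKET}" ]; then \
        # SCCACHE_REGION / SCCACHE_S3_KEY_PREFIX mirror the S3 URI seen in the
        # build log; the CMAKE_*_COMPILER_LAUNCHER vars make CMake prefix every
        # compile with sccache so object files are cached in S3.
        export SCCACHE_REGION=us-west-2 \
               SCCACHE_S3_KEY_PREFIX=sccache/vllm \
               CMAKE_C_COMPILER_LAUNCHER=sccache \
               CMAKE_CXX_COMPILER_LAUNCHER=sccache \
               CMAKE_CUDA_COMPILER_LAUNCHER=sccache; \
    fi && \
    python3 setup.py bdist_wheel --dist-dir=dist && \
    if [ -n "${SCCACHE_BUCKET}" ]; then sccache --show-stats; fi
```

Because ARG values are visible as environment variables inside RUN steps of the same stage, leaving SCCACHE_BUCKET empty cleanly disables the launchers and the build behaves exactly as before.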
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 75
X-AI-Prompt: how my sample PR access s3 bucket, I think we do not need to do above things
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
fix(vllm): pass AWS credentials to sccache inside docker build

sccache cannot reach the EC2 instance metadata service (IMDS) from inside a docker build container, causing S3 cache lookups to fail. Fix by passing the CodeBuild runner's temporary AWS credentials as build args so sccache can authenticate via the standard env var credential chain. Credentials only exist in the discarded builder stage and never appear in the final multi-stage image.

ai-dev-branch commit IDs: 215775d

The prompts used are captured in the footers of those commits. The initial prompt was: how my sample PR access s3 bucket, I think we do not need to do above things

---
X-AI-Handle-Time-Seconds: 75
X-AI-Line-Changes: New:19, Altered:0, Deleted:0
X-Human-Line-Changes: New:0, Altered:0, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:19, Altered:0, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 75
X-AI-Change-Count: 1
X-Human-Change-Count: 0
X-AI-Change-Count-Kiro-cli: 1
X-CR-Amendment: false
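The credential plumbing this commit describes can be sketched in the builder stage as follows. The arg names are the standard AWS env-var credential chain; treat the fragment as an assumption about shape, not the exact diff.

```dockerfile
# Builder stage only — these ARGs are never copied into the final multi-stage
# image, so the published container carries no credentials.
ARG AWS_ACCESS_KEY_ID=""
ARG AWS_SECRET_ACCESS_KEY=""
ARG AWS_SESSION_TOKEN=""
# Docker exposes ARG values as environment variables to RUN steps in this
# stage, and sccache reads the standard env-var credential chain, so no
# further wiring is needed for it to authenticate to S3.
```

The trade-off named in the commit message is that build args are not secret within the build (they appear in `docker history` of intermediate layers), which is why confining them to a discarded builder stage matters.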
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 50
X-AI-Prompt:
#24 6.697 Using MAX_JOBS=32 as the number of jobs.
#24 6.699 Using NVCC_THREADS=16 as the number of nvcc threads.
#24 6.940 -- The CXX compiler identification is GNU 11.5.0
#24 6.951 -- Detecting CXX compiler ABI info
#24 7.024 -- Detecting CXX compiler ABI info - failed
#24 7.024 -- Check for working CXX compiler: /usr/bin/c++
#24 7.089 -- Check for working CXX compiler: /usr/bin/c++ - broken
#24 7.089 CMake Error at /opt/venv/lib/python3.12/site-packages/cmake/data/share/cmake-4.3/Modules/CMakeTestCXXCompiler.cmake:73 (message):
#24 7.089   The C++ compiler
#24 7.089
#24 7.089     "/usr/bin/c++"
#24 7.089
#24 7.089   is not able to compile a simple test program.
#24 7.089
#24 7.089   It fails with the following output:
#24 7.089
#24 7.089     Change Dir: '/workspace/vllm/build/temp.linux-x86_64-cpython-312/CMakeFiles/CMakeScratch/TryCompile-Y7AumQ'
#24 7.089
#24 7.089     Run Build Command(s): /opt/venv/bin/ninja -v cmTC_ff516
#24 7.089     [1/2] sccache /usr/bin/c++ -o CMakeFiles/cmTC_ff516.dir/testCXXCompiler.cxx.o -c /workspace/vllm/build/temp.linux-x86_64-cpython-312/CMakeFiles/CMakeScratch/TryCompile-Y7AumQ/testCXXCompiler.cxx
#24 7.089     FAILED: [code=2] CMakeFiles/cmTC_ff516.dir/testCXXCompiler.cxx.o
#24 7.089     sccache /usr/bin/c++ -o CMakeFiles/cmTC_ff516.dir/testCXXCompiler.cxx.o -c /workspace/vllm/build/temp.linux-x86_64-cpython-312/CMakeFiles/CMakeScratch/TryCompile-Y7AumQ/testCXXCompiler.cxx
#24 7.089     sccache: error: Server startup failed: cache storage failed to read: Unexpected (permanent) at read => S3Error { code: "AuthorizationHeaderMalformed", message: "The authorization header is malformed; a non-empty Access Key (AKID) must be provided in the credential.", resource: "", request_id: "9JNZ99SMVCR9235F" }
#24 7.089
#24 7.089     Context:
#24 7.089        uri: https://s3.us-west-2.amazonaws.com/dlc-cicd-models/sccache/vllm/.sccache_check
#24 7.089        response: Parts { status: 400, version: HTTP/1.1, headers: {"x-amz-request-id": "9JNZ99SMVCR9235F", "x-amz-id-2": "xP77wFtCDnopxg4jLe8wBmqfAYAk3v+fP16A7xtV1fsZueOgmrd/cCc7CZRjMMLMk+FfKnUhh5c=", "content-type": "application/xml", "transfer-encoding": "chunked", "date": "Tue, 24 Mar 2026 05:57:07 GMT", "connection": "close", "server": "AmazonS3"} }
#24 7.089        service: s3
#24 7.089        path: .sccache_check
#24 7.089        range: 0-
#24 7.089
#24 7.089     Backtrace:
#24 7.089        0: <unknown>
#24 7.089        1: <unknown>
#24 7.089        2: <unknown>
#24 7.089        3: <unknown>
#24 7.089        4: <unknown>
#24 7.089        5: <unknown>
#24 7.089        6: <unknown>
#24 7.089        7: <unknown>
#24 7.089        8: <unknown>
#24 7.089        9: <unknown>
#24 7.089       10: <unknown>
#24 7.089       11: <unknown>
#24 7.089       12: <unknown>
#24 7.089
#24 7.089
#24 7.089     Run with SCCACHE_LOG=debug SCCACHE_NO_DAEMON=1 to get more information
#24 7.089     ninja: build stopped: subcommand failed.
#24 7.089
#24 7.089
#24 7.089
#24 7.089
#24 7.089
#24 7.089
#24 7.089   CMake will not be able to correctly generate this project.
#24 7.089 Call Stack (most recent call first):
#24 7.089   CMakeLists.txt:14 (project)
#24 7.089
#24 7.089
#24 7.090 -- Configuring incomplete, errors occurred!
#24 7.093 Traceback (most recent call last):
#24 7.093   File "/workspace/vllm/setup.py", line 1044, in <module>
#24 7.093     setup(
#24 7.093   File "/opt/venv/lib64/python3.12/site-packages/setuptools/__init__.py", line 117, in setup
#24 7.093     return distutils.core.setup(**attrs)  # type: ignore[return-value]
#24 7.093            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#24 7.093   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/core.py", line 186, in setup
#24 7.093     return run_commands(dist)
#24 7.093            ^^^^^^^^^^^^^^^^^^
#24 7.093   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/core.py", line 202, in run_commands
#24 7.093     dist.run_commands()
#24 7.093   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/dist.py", line 1002, in run_commands
#24 7.093     self.run_command(cmd)
#24 7.093   File "/opt/venv/lib64/python3.12/site-packages/setuptools/dist.py", line 1107, in run_command
#24 7.094     super().run_command(command)
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
#24 7.094     cmd_obj.run()
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/command/bdist_wheel.py", line 370, in run
#24 7.094     self.run_command("build")
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
#24 7.094     self.distribution.run_command(command)
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/dist.py", line 1107, in run_command
#24 7.094     super().run_command(command)
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
#24 7.094     cmd_obj.run()
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/command/build.py", line 135, in run
#24 7.094     self.run_command(cmd_name)
#24 7.094   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
#24 7.094     self.distribution.run_command(command)
#24 7.095   File "/opt/venv/lib64/python3.12/site-packages/setuptools/dist.py", line 1107, in run_command
#24 7.095     super().run_command(command)
#24 7.095   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
#24 7.095     cmd_obj.run()
#24 7.095   File "/workspace/vllm/setup.py", line 360, in run
#24 7.095     super().run()
#24 7.095   File "/opt/venv/lib64/python3.12/site-packages/setuptools/command/build_ext.py", line 97, in run
#24 7.095     _build_ext.run(self)
#24 7.095   File "/opt/venv/lib64/python3.12/site-packages/setuptools/_distutils/command/build_ext.py", line 368, in run
#24 7.095     self.build_extensions()
#24 7.095   File "/workspace/vllm/setup.py", line 317, in build_extensions
#24 7.095     self.configure(ext)
#24 7.095   File "/workspace/vllm/setup.py", line 294, in configure
#24 7.095     subprocess.check_call(
#24 7.095   File "/usr/lib64/python3.12/subprocess.py", line 413, in check_call
#24 7.095     raise CalledProcessError(retcode, cmd)
#24 7.095 subprocess.CalledProcessError: Command '['cmake', '/workspace/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DVLLM_TARGET_DEVICE=cuda', '-DCMAKE_C_COMPILER_LAUNCHER=sccache', '-DCMAKE_CXX_COMPILER_LAUNCHER=sccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=sccache', '-DCMAKE_HIP_COMPILER_LAUNCHER=sccache', '-DVLLM_PYTHON_EXECUTABLE=/opt/venv/bin/python3', '-DVLLM_PYTHON_PATH=/workspace/vllm:/usr/lib64/python312.zip:/usr/lib64/python3.12:/usr/lib64/python3.12/lib-dynload:/opt/venv/lib64/python3.12/site-packages:/opt/venv/lib64/python3.12/site-packages/nvidia_cutlass_dsl/python_packages:/opt/venv/lib/python3.12/site-packages:/opt/venv/lib/python3.12/site-packages/nvidia_cutlass_dsl/python_packages:/opt/venv/lib64/python3.12/site-packages/setuptools/_vendor:/opt/venv/lib/python3.12/site-packages/grpc_tools/_proto', '-DFETCHCONTENT_BASE_DIR=/workspace/vllm/.deps', '-DNVCC_THREADS=16', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=2', '-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc']' returned non-zero exit status 1.
#24 ERROR: process "/bin/sh -c python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38 && if [ -n \"${SCCACHE_BUCKET}\" ]; then sccache --show-stats; fi" did not complete successfully: exit code: 1
Signed-off-by: Yadan Wei <yadanwei@amazon.com>
fix(vllm): resolve AWS credentials for sccache inside docker build

The CodeBuild runner's IAM credentials are not exposed as environment variables by default. Use `aws configure export-credentials` to resolve them from the SDK chain before passing as --build-arg to docker build, so sccache can authenticate to S3.

ai-dev-branch commit IDs: c8835eb

The prompts used are captured in the footers of those commits. The initial prompt was: (build error log showing sccache S3 AuthorizationHeaderMalformed failure)

---
X-AI-Handle-Time-Seconds: 50
X-AI-Line-Changes: New:4, Altered:0, Deleted:0
X-Human-Line-Changes: New:0, Altered:0, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:4, Altered:0, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 50
X-AI-Change-Count: 1
X-Human-Change-Count: 0
X-AI-Change-Count-Kiro-cli: 1
X-CR-Amendment: false
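On the runner side, the fix described here could be sketched as below. The `aws configure export-credentials --format env` command and valueless `--build-arg NAME` forwarding are standard AWS CLI v2 and docker behavior; the surrounding buildspec wiring, build context, and SCCACHE_BUCKET variable are assumptions, not the exact CI script.

```sh
# Sketch of the runner-side fix (assumed CI wiring): resolve temporary IAM
# credentials from the SDK credential chain into env vars, then forward them
# to the build. A valueless --build-arg passes the matching env var through.
eval "$(aws configure export-credentials --format env)"

docker build \
  --build-arg AWS_ACCESS_KEY_ID \
  --build-arg AWS_SECRET_ACCESS_KEY \
  --build-arg AWS_SESSION_TOKEN \
  --build-arg SCCACHE_BUCKET="${SCCACHE_BUCKET}" \
  -f docker/vllm/Dockerfile.amzn2023 .
```

This sidesteps the IMDS problem entirely: the credentials are resolved on the host, where IMDS is reachable, and enter the builder stage through the standard env-var chain that sccache already understands.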
Purpose
Test Plan
Test Result
Toggle if you are merging into the master branch
By default, docker image builds and tests are disabled. Two ways to run builds and tests:
How to use the helper utility for updating dlc_developer_config.toml
Assuming your remote is called origin (you can find out more with git remote -v):

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin
python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin
python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:
sagemaker_remote_tests = true
sagemaker_efa_tests = true
sagemaker_rc_tests = true
sagemaker_local_tests = true

How to use PR description
Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:

# /buildspec <buildspec_path>
# /buildspec pytorch/training/buildspec.yml
# /tests <test_list>
# /tests sanity security ec2

Valid test types: sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.

Toggle if you are merging into the main branch
PR Checklist
Run pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details.)