
Feat(tests): build test infrastructure#144

Open
chen2021673 wants to merge 7 commits into master from CTest-clean

Conversation


chen2021673 (Contributor) commented Apr 14, 2026

Summary

This PR refactors InfiniTrain’s test infrastructure around CTest and GoogleTest.

It consolidates the old test/ and tests/ layout into a single tests/ directory, introduces shared CMake utilities for test registration, and migrates applicable tests to device-parameterized TEST_P so CPU/CUDA cases can share the same test logic where appropriate.

Closes #120.

Changes

  • merge the old test/ directory into tests/
  • add shared CMake/GTest utilities under tests/common/
  • reduce repeated test registration boilerplate in per-suite CMakeLists.txt
  • migrate applicable tests from fixed-device TEST_F to device-parameterized TEST_P
  • replace hardcoded device selection with shared helpers such as GetDevice()
  • improve label-based selection for CPU/CUDA-related tests
  • refactor registration for all tests
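
For readers unfamiliar with the pattern, a minimal sketch of what a migrated device-parameterized suite could look like. The `GetDevice()` body matches the helper quoted in the review thread; the class and macro names follow the commit messages, but the exact signatures are assumptions, not InfiniTrain's verbatim API:

```cpp
// Sketch only: depends on gtest and the project's headers.
class TensorTestBaseP : public ::testing::TestWithParam<infini_train::Device::DeviceType> {
protected:
    // Resolve the device under test from the suite's parameter.
    Device GetDevice() const { return Device(GetParam(), 0); }
};

TEST_P(TensorTestBaseP, SharedAcrossDevices) {
    auto device = GetDevice();  // kCPU or kCUDA, depending on the instantiation
    // ... build tensors on `device` and run the shared assertions ...
}

// Instantiates the CPU and CUDA parameterizations in one line.
INFINI_TRAIN_REGISTER_TEST(TensorTestBaseP);
```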

How to run

ctest --output-on-failure
ctest -L cpu --output-on-failure
ctest -L cuda --output-on-failure
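
For `ctest -L` selection to work, each suite has to be registered with a LABELS property. A plausible sketch of the shared macro (the macro name and the use of `gtest_discover_tests` come from the commit log; the argument list is an assumption):

```cmake
# Hypothetical shape of the macro in tests/common/test_macros.cmake;
# the real signature in this PR may differ.
include(GoogleTest)

macro(infini_train_add_test target label)
    add_executable(${target} ${ARGN})
    target_link_libraries(${target} PRIVATE gtest gtest_main)
    # Discover individual TEST/TEST_P cases and tag them for `ctest -L`.
    gtest_discover_tests(${target} PROPERTIES LABELS "${label}")
endmacro()

# Usage in a per-suite CMakeLists.txt:
# infini_train_add_test(autograd_test cpu autograd_test.cc)
```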

Impact

This is mainly a test infrastructure refactor. It is not intended to change training/runtime behavior, but it does change how tests are organized and registered.

Result

ctest --output-on-failure -j1 (parallel runs may contend for the device, so run serially first)

[screenshot of the ctest run]

luoyueyuguang and others added 5 commits April 28, 2026 08:28
- Add infini_train_add_test CMake macro for simplified test registration
- Integrate gtest_discover_tests for automatic test case discovery
- Refactor all test directories to use unified macro (autograd, optimizer, hook, slow, lora)
- Reduce test CMakeLists.txt code by 68%
- Add LoRA tests (12 test cases)
- Delete TEST_REPORT.md
- Test labels: cpu/cuda/distributed/slow for flexible test execution
- Add shared test_macros.cmake in tests/common/

BREAKING CHANGE: Test registration now uses macro instead of manual add_test()

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace TEST_F with TEST_P across all test suites so each suite runs on
both CPU and CUDA without duplicating test logic. Adds InfiniTrainTestP,
TensorTestBaseP, AutogradTestBaseP, and DistributedInfiniTrainTestP base
classes with automatic CUDA/NCCL skip guards. Introduces
INFINI_TRAIN_REGISTER_TEST* C++ macros and infini_train_add_test_suite
CMake macro to eliminate repetitive INSTANTIATE_TEST_SUITE_P /
infini_train_add_test boilerplate. Removes deprecated test/, slow/, and
split optimizer test files; consolidates optimizer tests into a single
binary with creation + step suites.
- Simplify CMakeLists: single CTest target per suite, remove label splitting
- Migrate old test/ directory into tests/ and delete test/
Comment thread CMakeLists.txt
Comment thread CMakeLists.txt
Comment thread CMakeLists.txt Outdated
- Add docs/test_usage_guide.md with build/run/write instructions
- Rename hook_mechanism.md → hook_mechanism_design.md
- Rename lora_usage.md → lora_usage_guide.md
- Add googletest as submodule in .gitmodules
- Add infini_run tool target in CMakeLists.txt, remove stale comments
Comment thread tests/common/test_utils.h Outdated
Comment thread tests/dtype/CMakeLists.txt
Add IsInitialized() to GlobalEnv and guard SetUpTestSuite so a second
test class in the same process skips re-initialization instead of
hitting CHECK(!initialized_). Also print try_compile output on
compile-fail test to surface header-not-found vs real type errors.
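
The re-initialization guard described above can be sketched with a minimal stand-in (`GlobalEnv` here is a hypothetical simplification; the real class takes parallelism parameters in `Init()`):

```cpp
#include <cassert>

// Hypothetical stand-in for the guarded GlobalEnv described above.
class GlobalEnv {
public:
    static GlobalEnv &Instance() {
        static GlobalEnv env;
        return env;
    }
    // Second and later callers (e.g. another test class's SetUpTestSuite)
    // skip re-initialization instead of failing a CHECK(!initialized_).
    void Init() {
        if (initialized_) {
            return;
        }
        initialized_ = true;
    }
    bool IsInitialized() const { return initialized_; }

private:
    bool initialized_ = false;
};
```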

- All autograd tests need `requires_grad=true`
- All autograd tests need their input tensors filled with data
- Forward/backward propagation tests must have input data to verify results. `AutogradTestBase` builds `FillSequentialTensor` in, so each test doesn't have to call it manually
Collaborator:

Just use an Arrange init directly; adding a requires_grad parameter to the Arrange function should be enough, matching torch:
https://docs.pytorch.org/docs/2.11/generated/torch.arange.html
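
As a concrete reading of the torch.arange suggestion, an arange-style init would subsume the sequential fill. A minimal sketch, with `std::vector` standing in for InfiniTrain's Tensor and `requires_grad` shown only to mirror the suggested signature:

```cpp
#include <cstddef>
#include <vector>

// Sketch of torch.arange-style initialization; std::vector stands in for
// the project's Tensor type, and requires_grad is unused in this stand-in
// (it would be forwarded to the Tensor constructor).
std::vector<float> Arange(std::size_t size, float start = 0.0f,
                          bool requires_grad = false) {
    (void)requires_grad;
    std::vector<float> data(size);
    for (std::size_t i = 0; i < size; ++i) {
        data[i] = start + static_cast<float>(i);  // same loop as FillSequentialTensor
    }
    return data;
}
```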

Comment thread tests/common/test_utils.h
for (size_t i = 0; i < size; ++i) { data[i] = start + static_cast<float>(i); }
}

inline void FillConstantTensor(const std::shared_ptr<Tensor> &tensor, float value) {
Collaborator:

Just call Tensor's Fill function directly; there's no real need to wrap it in this helper.

Contributor Author:

Then I'll just delete AutogradTestBase.

Collaborator:

Put this file under the existing InfiniTrain/cmake directory.

Contributor Author:

This file only serves the test-registration logic under tests/; it isn't general-purpose, so is the top-level directory really the right place for it?

Comment thread tests/common/test_utils.h
}
}
Device GetDevice() const { return Device(GetParam(), 0); }
std::shared_ptr<Tensor> createTensor(const std::vector<int64_t> &shape, DataType dtype = DataType::kFLOAT32,
Collaborator:

Just add a requires_grad parameter to the Tensor constructor; this also aligns with torch:
https://docs.pytorch.org/docs/2.11/generated/torch.tensor.html#torch.tensor

Contributor Author:

done

}

TEST_P(TensorCopyTest, CopiesCPUToCUDA) {
ONLY_CUDA();
Collaborator:

Semantically this test shouldn't have a CPU version at all, yet a CPU version still gets registered. It's skipped, but it still feels a bit off:

  1. when CUDA is not used, this function shouldn't even be compiled;
  2. even with CUDA, the CPU version shouldn't be registered (which also removes the need for TEST_P); the registration function may need changes to express this exception.

Contributor Author:

Right; the skip exists because at compile time we can't see inside the test case. Controlling this at compile time would require #ifdef USE_CUDA + TEST_F/TEST, and we also couldn't use infini_train_add_test_suite; a CUDA-only test registration path would be needed. If ONLY_CUDA/ONLY_CPU cases are confirmed to be a small minority, I'd suggest not going that route and keeping the redundant, late skip logic for the sake of registration clarity. Thoughts?

Comment thread tests/common/test_utils.h
#pragma once

#include <algorithm>
#include <gtest/gtest.h>
Collaborator:

Put the gtest include in the group below the cuda_xxx ones, and use double quotes:
https://gxtctab8no8.feishu.cn/docx/ARFVdldxPo87zHxIXe4c5LMwnNl#share-MwLDdV6xeoeEJqxkBc8cifylnfe

Same for the other files.

Comment thread tests/common/test_utils.h
#else
#include <cuda_runtime_api.h>
#endif
#endif
Collaborator:

There's no need to write it this way; if cuda_runtime_api.h can't be found, the compiler will report an error anyway. When USE_CUDA is set, just include it directly.

Comment thread tests/common/test_utils.h

#define INFINI_TRAIN_REGISTER_TEST(TestName) \
INSTANTIATE_TEST_SUITE_P(CPU, TestName, ::testing::Values(infini_train::Device::DeviceType::kCPU)); \
INSTANTIATE_TEST_SUITE_P(CUDA, TestName, ::testing::ValuesIn(infini_train::test::CudaDeviceTypes()))
Collaborator:

When USE_CUDA is off, no CUDA param case should be registered at all; when USE_CUDA is on but no GPU is present, the case can simply fail outright. There's no need to fool-proof to this degree.
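
A compile-time version of the registration macro along these lines might look like the following sketch (the INFINI_TRAIN_REGISTER_CUDA_ helper name is invented; the CPU/CUDA value expressions are taken from the macro quoted above):

```cpp
// Sketch: only emit the CUDA instantiation when USE_CUDA is defined.
#ifdef USE_CUDA
#define INFINI_TRAIN_REGISTER_CUDA_(TestName)                                            \
    INSTANTIATE_TEST_SUITE_P(CUDA, TestName,                                             \
                             ::testing::ValuesIn(infini_train::test::CudaDeviceTypes()))
#else
#define INFINI_TRAIN_REGISTER_CUDA_(TestName) static_assert(true, "")
#endif

#define INFINI_TRAIN_REGISTER_TEST(TestName)                                             \
    INSTANTIATE_TEST_SUITE_P(CPU, TestName,                                              \
                             ::testing::Values(infini_train::Device::DeviceType::kCPU)); \
    INFINI_TRAIN_REGISTER_CUDA_(TestName)
```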

EXPECT_FALSE(config.add_bias_linear);
EXPECT_FALSE(config.tie_weights);
EXPECT_TRUE(config.UseGQA());
}
Collaborator:

These two cases don't seem necessary; referencing example code from a test is a bit odd. If this kind of validation is worth having, add a sanitize method to these config classes in the example instead, similar to:
https://github.com/NVIDIA/Megatron-LM/blob/8de8238844bb7824d3e245efae89d7c8c4211bc7/megatron/core/transformer/transformer_config.py#L2374

void Init(int threads_per_process, int tensor_parallel_size, bool sequence_parallel_enabled,
int pipeline_parallel_size, int virtual_pipeline_parallel_size);

bool IsInitialized() const;
Collaborator:

This is mainly to prevent gtest's RunAllTests from initializing the global env once per thread when it is started multi-threaded. Instead of using the stock gtest_main, you could write your own main that initializes the global env before calling RunAllTests; then this interface wouldn't be needed here.
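
A sketch of the suggested custom main (the commented GlobalEnv call is a placeholder; the real Init arguments are the parallelism parameters listed above):

```cpp
#include <gtest/gtest.h>

// Custom main replacing gtest_main: initialize the global env exactly once,
// before any test runs, instead of guarding SetUpTestSuite.
int main(int argc, char **argv) {
    ::testing::InitGoogleTest(&argc, argv);
    // infini_train::GlobalEnv::Instance().Init(/* threads, tp, sp, pp, vpp */);
    return RUN_ALL_TESTS();
}
```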

@kilinchange kilinchange changed the title [WIP]Feat(tests): build test infrastructure Feat(tests): build test infrastructure May 8, 2026