All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Multiple streams supported
- Support for counting total allocated memory via `Session::getInstance().totalAllocatedBytes()`
- Using alias instead of email in `setup.py`
- Compatibility between Python and C++ in how the data is stored in bt files
- The project now uses the GNU General Public License (GPL) v3; license file added
- Introduces new Python package for storing and loading numpy arrays; it can be installed with `pip install gputils-api`; unit tests and documentation
- When compiling with `cmake`, the unit tests will not be compiled by default unless the flag `GPUTILS_BUILD_TEST` is set
- Clang clippy recommendations applied
- Proper error handling when binary tensor file is not found
- Method `saveToFile` becomes `DTensor<T>::saveToFile(std::string pathToFile, Serialisation ser)`, i.e., the user can choose whether to save the file as a text (ASCII) file or a binary one
- Method `parseFromTextFile` renamed to `parseFromFile` (supports text and binary formats)
- Quick bug fix in `DTensor::parseFromTextFile` (passing storage mode to `vectorFromTextFile`)
- Set precision in `DTensor::saveToFile` properly
- `DTensor<T>::parseFromFile` throws `std::invalid_argument` if `T` is unsupported
- Created methods for serialising and deserialising `DTensor` objects
- (Breaking change) The methods `CholeskyFactoriser::factorise`, `CholeskyFactoriser::solve`, `QRFactoriser::factorise`, `QRFactoriser::leastSquares`, and `QRFactoriser::getQR` are now `void` and do not return a status code. Instead, a status code is returned by calling `statusCode`. This change leads to a reduction in data being downloaded from the GPU.
- In `Svd`, a status code (bool) is returned from `Svd<double>::factorise` only if `GPUTILS_DEBUG_MODE` is defined; otherwise, the method always returns `true`.
- New base class `IStatus` used for a universal implementation of `info()`
- When slicing a `DTensor` along `axis=2`, update the pointer to matrices
- Got rid of a warning in `DTensor<T>::createRandomTensor`
- Memory management improvements: we got rid of `pointerToMatrices`, which would unnecessarily allocate memory, and `addAB` does not allocate any new memory internally.
- Left/right Givens rotations: `GivensAnnihilator` implemented
- Patch initialisation of Q in QR decomposition.
- Add test for tall skinny matrices.
- Implementation and test of QR factorisation for tall or square matrices.
- Solve least-square problems with QR factorisation.
- Improve documentation.
- Implementation and test of methods `.maxAbs()` and `.minAbs()` for any tensor.
- Support for random tensors
- Implementation of `CholeskyMultiFactoriser`, which performs multiple Cholesky factorisations in parallel
- Using a function `numBlocks` instead of the macro `DIM2BLOCKS`
- Using `TEMPLATE_WITH_TYPE_T` and `TEMPLATE_CONSTRAINT_REQUIRES_FPX` for the code to run on both C++17 and C++20
- Implementation and test of `Nullspace(DTensor A)` and its method `.project(DTensor b)`; `project` will project in place each `bi` onto the nullspace of `Ai`
- Implementation of `DTensor<T>`, which is our basic entity for data storage and manipulation (supports basic linear algebra using `cublas` and `cusolver`); implementation of `+=`, `-=`, `*=` (for scalars and other tensors), `+`, `-`, `*` (scalars and tensors), printing (using `std::cout <<`), computation of norms (Frobenius and sum of absolute values of all elements); device vectors and matrices are tensors
- Singular value decomposition using `cublas`
- Least-squares on tensors
- Computation of nullspace matrices (on tensor objects)
- Cholesky factorisation
- Set up unit tests, CI, and CHANGELOG