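0. Introduction
Ceres is an optimization library that SLAM practitioners reach for all the time. For a long time it had no GPU support, which limited how far optimization performance could be pushed. Starting with Ceres 2.1, GPU acceleration is available, so this post works through the pitfalls of using the GPU to accelerate Ceres.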
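1. Remove the existing Ceres
Running find . -name "ceres*" from the filesystem root shows where the existing Ceres files live. We remove them with rm -rf so that no old headers are left behind.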
sudo rm -rf ./usr/local/include/ceres ./usr/include/ceres
2. Install Ceres 2.1
# CMake
sudo apt-get install cmake
# google-glog + gflags
sudo apt-get install libgoogle-glog-dev libgflags-dev
# Use ATLAS for BLAS & LAPACK
sudo apt-get install libatlas-base-dev
# Eigen3
sudo apt-get install libeigen3-dev
# SuiteSparse and CXSparse (optional)
sudo apt-get install libsuitesparse-dev
# Download Ceres. If this download fails, try https://ceres-solver.googlesource.com/ceres-solver/+archive/f68321e7de8929fbcdb95dd42877531e64f72f66.tar.gz, or clone https://ceres-solver.googlesource.com/ceres-solver and check out the 2.1.0 tag.
wget http://ceres-solver.org/ceres-solver-2.1.0.tar.gz
# Make sure the file is ceres-solver-2.1.0.tar.gz before unpacking
tar zxf ceres-solver-2.1.0.tar.gz
cd ceres-solver-2.1.0/
mkdir build
cd build
cmake -D CMAKE_INSTALL_PREFIX=/usr/local/ceres ..
make -j4
make test
sudo make install
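That completes the Ceres 2.1 installation. While make test runs, you can see CUDA being invoked.
3. Using the GPU version
Google currently states that three linear solver types are supported on the GPU: DENSE_QR, DENSE_NORMAL_CHOLESKY and DENSE_SCHUR. We use DENSE_NORMAL_CHOLESKY as the example here.
CMakeLists.txt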
set(Ceres_DIR /opt/third_party/ceres2.1/lib/cmake/Ceres)
set(CERES_INCLUDE_DIRS /opt/third_party/ceres2.1/include/)
# Switch for CUDA support; defined here so that if (CUDA) below is meaningful.
option(CUDA "Enable CUDA support" ON)
if (CUDA)
  find_package(CUDA QUIET)
  if (CUDA_FOUND)
    message("-- Found CUDA version ${CUDA_VERSION}: "
            "${CUDA_LIBRARIES};"
            "${CUDA_cusolver_LIBRARY};"
            "${CUDA_cusparse_LIBRARY}")
    include_directories(
      ${CUDA_INCLUDE_DIRS}
    )
    if (CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
      message("embed_platform on")
      include_directories(/usr/local/cuda/targets/aarch64-linux/include)
      link_directories(/usr/local/cuda/targets/aarch64-linux/lib)
    else()
      message("embed_platform off")
      link_directories(/usr/local/cuda-11.1/targets/x86_64-linux/lib/)
    endif()
  else (CUDA_FOUND)
    message("-- Did not find CUDA library, disabling CUDA support.")
    # update_cache_variable() is a macro private to Ceres's own build, so use plain CMake here.
    set(CUDA OFF CACHE BOOL "Enable CUDA support" FORCE)
    # Define CERES_NO_CUDA so the #ifdef checks in the C++ code take the CPU path.
    add_definitions(-DCERES_NO_CUDA)
  endif (CUDA_FOUND)
else (CUDA)
  message("-- Building without CUDA.")
  add_definitions(-DCERES_NO_CUDA)
endif (CUDA)
Let's look at the example from the official source:
cuda_dense_cholesky_test.cc
.......
// Tests the CUDA Cholesky solver with a simple 4x4 matrix.
TEST(CUDADenseCholesky, Cholesky4x4Matrix) {
  Eigen::Matrix4d A;
  A <<  4,  12, -16, 0,
       12,  37, -43, 0,
      -16, -43,  98, 0,
        0,   0,   0, 1;
  const Eigen::Vector4d b = Eigen::Vector4d::Ones();
  LinearSolver::Options options;
  ContextImpl context;
  options.context = &context;
  options.dense_linear_algebra_library_type = CUDA;
  auto dense_cuda_solver = CUDADenseCholesky::Create(options);
  ASSERT_NE(dense_cuda_solver, nullptr);
  std::string error_string;
  ASSERT_EQ(dense_cuda_solver->Factorize(A.cols(),
                                         A.data(),
                                         &error_string),
            LinearSolverTerminationType::LINEAR_SOLVER_SUCCESS);
  Eigen::Vector4d x = Eigen::Vector4d::Zero();
  ASSERT_EQ(dense_cuda_solver->Solve(b.data(), x.data(), &error_string),
            LinearSolverTerminationType::LINEAR_SOLVER_SUCCESS);
  EXPECT_NEAR(x(0), 113.75 / 3.0, std::numeric_limits<double>::epsilon() * 10);
  EXPECT_NEAR(x(1), -31.0 / 3.0, std::numeric_limits<double>::epsilon() * 10);
  EXPECT_NEAR(x(2), 5.0 / 3.0, std::numeric_limits<double>::epsilon() * 10);
  EXPECT_NEAR(x(3), 1.0000, std::numeric_limits<double>::epsilon() * 10);
}
The CUDA build of Ceres currently supports three linear solver types: DENSE_QR, DENSE_NORMAL_CHOLESKY and DENSE_SCHUR. Using it is simple: on top of your existing solver options, just set dense_linear_algebra_library_type.
ceres::Solver::Summary summary;
ceres::Solver::Options options;
options.minimizer_type = ceres::TRUST_REGION;
options.linear_solver_type = ceres::DENSE_QR;
#ifndef CERES_NO_CUDA
// With CUDA: run the dense linear algebra on the GPU.
// QINFO is not a Ceres macro; it is assumed to be the project's own logging macro.
QINFO << "USE GPU Acceleration";
options.dense_linear_algebra_library_type = ceres::CUDA;
#endif
// Without CUDA (CERES_NO_CUDA defined), the default dense backend is used.
ceres::Solve(options, &problem, &summary);