文章

pose free gs

cf3dgs提到了几篇文章, 这几篇文章证明了可以同时估计相机参数并优化NerF, 但是要引入多种regularization terms和几何先验。大多数现有的方法优先从不同的相机位置来优化光线投射过程, 而非直接优化相机位姿。这是NerF中的隐式表示和光线跟踪的实现的性质导致的。这种间接地优化方法也自然地存在问题。

1
2
3
4
5
Wenjing Bian, Zirui Wang, Kejie Li, Jia-Wang Bian, and Victor Adrian Prisacariu. Nope-nerf: Optimising neural radiance field with no pose prior. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4160–4169, 2023.

Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Anima Anandkumar, Minsu Cho, and Jaesik Park. Self-calibrating neural radiance fields. In ICCV, 2021.

Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey. Barf: Bundle-adjusting neural radiance fields. In ICCV, 2021.

作者提出了一个local 3dgs来估计相机的相对位姿。

作者给出了一个公式来揭示相机位姿与高斯点的3D刚体变换之间的关系。3D高斯的中心点为$\mu$, 相机位姿为$W$

\[\mu_{2D}=K(W_{\mu})/W(W_{\mu})_z\]

Code相关

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
        elif self.model_cfg.data_type == "custom":
            source_path = self.model_cfg.source_path
            cameras_intrinsic_file = os.path.join(source_path, "sparse/0", "cameras.bin")
            max_frames = 300
            # if os.path.exists(cameras_intrinsic_file):
            #     images = sorted(glob.glob(os.path.join(source_path, "images", "*.jpg")))
            #     if len(images)>max_frames:
            #         images = images[-max_frames:]
            #     cam_intrinsics = read_intrinsics_binary(cameras_intrinsic_file)
            #     intr = cam_intrinsics[1]
            #     focal_length_x = intr.params[0]
            #     focal_length_y = intr.params[1]
            #     height = intr.height
            #     width = intr.width
            #     intr_mat = np.array(
            #         [[focal_length_x, 0, width/2], [0, focal_length_y, height/2], [0, 0, 1]])
            #     self.intrinsic = intr_mat
            # else:
            images = sorted(glob.glob(os.path.join(source_path, "images/*.png")))  
            if len(images) > max_frames:
                interval = len(images) // max_frames
                images = images[::interval]
            print("Total images: ", len(images))
            width, height = Image.open(images[0]).size

在处理自定义数据时, 如果图像数量超过了最大图像数(默认为300), 则使用图像数量整除预设值, 并使用插值来采样图像。

默认的参数:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
    densification_interval = 100
    densify_from_iter = 301
    densify_grad_threshold = 0.0002
    densify_interval = 500
    densify_until_iter = 130000
    depth_loss_type = 'invariant'
    feature_lr = 0.0025
    iterations = 300
    lambda_depth = 0.0
    lambda_dist_2nd_loss = 0.0
    lambda_dssim = 0.2
    lambda_pc = 0.0
    lambda_rgb_s = 0.0
    match_method = 'dense'

输入数据:

观察代码readColmapSceneInfo可以发现, cf3dgs读取的数据也并非是原生的数据, 它其实是使用了colmap的一些信息

1
2
3
4
5
6
7
8
9
10
11
def readColmapSceneInfo(path, images, eval, llffhold=8):
    try:
        cameras_extrinsic_file = os.path.join(path, "sparse/0", "images.bin")
        cameras_intrinsic_file = os.path.join(path, "sparse/0", "cameras.bin")
        cam_extrinsics = read_extrinsics_binary(cameras_extrinsic_file)
        cam_intrinsics = read_intrinsics_binary(cameras_intrinsic_file)
    except:
        cameras_extrinsic_file = os.path.join(path, "sparse/0", "images.txt")
        cameras_intrinsic_file = os.path.join(path, "sparse/0", "cameras.txt")
        cam_extrinsics = read_extrinsics_text(cameras_extrinsic_file)
        cam_intrinsics = read_intrinsics_text(cameras_intrinsic_file)

它读取了colmap处理后的images和cameras, 其实变相地将没有参与colmap稀疏重建的视角给排除掉了。

对应函数readColmapSceneInfo中的变量观察下面的输出, colmap稀疏重建使用了150张图像, cf3dgs选择了其中的131张来训练, 19张来测试, 剩下其他的视角用来评估。

1
2
3
4
5
6
7
8
len(cam_extrinsics)
150
type(train_cam_infos)
<class 'list'>
len(train_cam_infos)
131
len(test_cam_infos)
19

COGS

custom数据集测试

训练时间记录

本文由作者按照 CC BY 4.0 进行授权