Pose-free GS
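CF-3DGS cites several papers showing that camera parameters can be estimated jointly with NeRF optimization, but only by introducing various regularization terms and geometric priors. Most existing methods prioritize optimizing the ray-casting process from different camera positions over optimizing the camera poses directly. This is a consequence of NeRF's implicit representation and its ray-tracing implementation. Such indirect optimization naturally has its own problems; the CF-3DGS paper argues that it makes optimization hard when camera motion is large.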
Wenjing Bian, Zirui Wang, Kejie Li, Jia-Wang Bian, and Victor Adrian Prisacariu. NoPe-NeRF: Optimising neural radiance field with no pose prior. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4160–4169, 2023.
Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Anima Anandkumar, Minsu Cho, and Jaesik Park. Self-calibrating neural radiance fields. In ICCV, 2021.
Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey. BARF: Bundle-adjusting neural radiance fields. In ICCV, 2021.
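The authors propose a local 3DGS to estimate the relative camera pose.
They give a formula that reveals the relationship between the camera pose and a 3D rigid transformation of the Gaussian points. Let $\mu$ be the center of a 3D Gaussian and $W$ the camera pose: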
\[\mu_{2D} = K\,\frac{W\mu}{(W\mu)_z}\]

Code-related
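Before turning to the repository code, the projection relation above can be checked numerically. The following is only a minimal sketch in dependency-free Python, not code from CF-3DGS. It assumes that $W$ is a 4x4 world-to-camera matrix and $K$ a 3x3 pinhole intrinsic matrix; the helper names `matvec`, `matmul`, and `project`, and all numeric values, are made up for illustration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def project(K, W, mu):
    """Return mu_2D = K (W mu) / (W mu)_z as pixel coordinates (u, v)."""
    cam = matvec(W, list(mu) + [1.0])[:3]    # Gaussian center in camera coordinates
    normalized = [c / cam[2] for c in cam]   # divide by the depth (W mu)_z
    u, v, _ = matvec(K, normalized)
    return u, v


# Illustrative values only (assumptions, not taken from the paper or the repo).
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 2.0],
     [0.0, 0.0, 0.0, 1.0]]
T = [[1.0, 0.0, 0.0, 0.1],   # rigid transform: shift by 0.1 along x
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
mu = [0.2, -0.1, 2.0]

pixel = project(K, W, mu)                               # approximately (345.0, 227.5)
moved_camera = project(K, matmul(W, T), mu)             # change the pose, keep the point
moved_point = project(K, W, matvec(T, mu + [1.0])[:3])  # keep the pose, move the point
```

Both `moved_camera` and `moved_point` give the same pixel, since $(WT)\mu = W(T\mu)$. In this sketch, changing the camera pose is therefore interchangeable with rigidly transforming the Gaussian centers.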
elif self.model_cfg.data_type == "custom":
    source_path = self.model_cfg.source_path
    cameras_intrinsic_file = os.path.join(source_path, "sparse/0", "cameras.bin")
    max_frames = 300
    # if os.path.exists(cameras_intrinsic_file):
    #     images = sorted(glob.glob(os.path.join(source_path, "images", "*.jpg")))
    #     if len(images) > max_frames:
    #         images = images[-max_frames:]
    #     cam_intrinsics = read_intrinsics_binary(cameras_intrinsic_file)
    #     intr = cam_intrinsics[1]
    #     focal_length_x = intr.params[0]
    #     focal_length_y = intr.params[1]
    #     height = intr.height
    #     width = intr.width
    #     intr_mat = np.array(
    #         [[focal_length_x, 0, width/2], [0, focal_length_y, height/2], [0, 0, 1]])
    #     self.intrinsic = intr_mat
    # else:
    images = sorted(glob.glob(os.path.join(source_path, "images/*.png")))
    if len(images) > max_frames:
        interval = len(images) // max_frames
        images = images[::interval]
    print("Total images: ", len(images))
    width, height = Image.open(images[0]).size
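When handling custom data, if the number of images exceeds the maximum (300 in this code), the image count is integer-divided by that maximum to obtain a stride, and the images are sampled at that stride. Because the stride is rounded down, the sampled set can still contain more than 300 images.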
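The stride sampling can be reproduced in isolation. The sketch below mirrors the slicing logic of the snippet above on made-up file names. The function names `subsample_by_stride` and `subsample_capped` are hypothetical, and the second function is only a possible variant, not something the repository does.

```python
def subsample_by_stride(paths, max_frames=300):
    """Same logic as the snippet above: floor-divide to get a stride, then slice."""
    paths = sorted(paths)
    if len(paths) > max_frames:
        interval = len(paths) // max_frames
        paths = paths[::interval]
    return paths


def subsample_capped(paths, max_frames=300):
    """Hypothetical variant: a rounded-up stride never returns more than max_frames."""
    paths = sorted(paths)
    if len(paths) > max_frames:
        interval = -(-len(paths) // max_frames)  # ceiling division
        paths = paths[::interval]
    return paths


frames = [f"{i:04d}.png" for i in range(1000)]   # made-up file names

kept = subsample_by_stride(frames)      # stride 1000 // 300 == 3
capped = subsample_capped(frames)       # stride ceil(1000 / 300) == 4
print(len(kept), len(capped))
```

With 1000 inputs the floor-based stride keeps 334 frames, more than the nominal cap. With 599 inputs the stride is 1 and nothing is dropped at all.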
Default parameters:
densification_interval = 100
densify_from_iter = 301
densify_grad_threshold = 0.0002
densify_interval = 500
densify_until_iter = 130000
depth_loss_type = 'invariant'
feature_lr = 0.0025
iterations = 300
lambda_depth = 0.0
lambda_dist_2nd_loss = 0.0
lambda_dssim = 0.2
lambda_pc = 0.0
lambda_rgb_s = 0.0
match_method = 'dense'
Input data:
Looking at `readColmapSceneInfo`, it turns out that the data CF-3DGS reads is not purely raw data either; it actually uses some information from COLMAP.
def readColmapSceneInfo(path, images, eval, llffhold=8):
    try:
        # Prefer the binary COLMAP model
        cameras_extrinsic_file = os.path.join(path, "sparse/0", "images.bin")
        cameras_intrinsic_file = os.path.join(path, "sparse/0", "cameras.bin")
        cam_extrinsics = read_extrinsics_binary(cameras_extrinsic_file)
        cam_intrinsics = read_intrinsics_binary(cameras_intrinsic_file)
    except:
        # Fall back to the text model if the binary files cannot be read
        cameras_extrinsic_file = os.path.join(path, "sparse/0", "images.txt")
        cameras_intrinsic_file = os.path.join(path, "sparse/0", "cameras.txt")
        cam_extrinsics = read_extrinsics_text(cameras_extrinsic_file)
        cam_intrinsics = read_intrinsics_text(cameras_intrinsic_file)
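It reads the images and cameras produced by COLMAP, which in effect excludes any view that did not take part in COLMAP's sparse reconstruction.
Inspecting the variables in `readColmapSceneInfo` gives the output below: COLMAP's sparse reconstruction used 150 images, of which CF-3DGS selected 131 for training and 19 for testing; the remaining views are used for evaluation.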
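One way to see which views were dropped is to compare the file names on disk with the names of the images COLMAP registered. This is a minimal sketch, not code from the repository. It assumes `cam_extrinsics` is a dict whose values carry a `name` attribute; the `Registered` tuple, the function `unregistered_images`, and the sample file names are all hypothetical stand-ins.

```python
from collections import namedtuple

# Hypothetical stand-in for one registered-image record in cam_extrinsics.
Registered = namedtuple("Registered", ["id", "name"])


def unregistered_images(files_on_disk, cam_extrinsics):
    """Return the file names that have no entry in the sparse reconstruction."""
    registered = {img.name for img in cam_extrinsics.values()}
    return sorted(set(files_on_disk) - registered)


# Made-up example: three frames on disk, two of them registered by COLMAP.
files_on_disk = ["0001.png", "0002.png", "0003.png"]
cam_extrinsics_demo = {
    1: Registered(1, "0001.png"),
    2: Registered(2, "0003.png"),
}

dropped = unregistered_images(files_on_disk, cam_extrinsics_demo)
print(dropped)
```

In this toy case the only dropped view is `0002.png`. Such a view would never reach training or testing, because the loader only sees what COLMAP registered.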
len(cam_extrinsics)
150
type(train_cam_infos)
<class 'list'>
len(train_cam_infos)
131
len(test_cam_infos)
19
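The 131/19 split above is what holding out every 8th of the 150 registered images would give, which matches the `llffhold=8` default in the function signature. The sketch below assumes the split follows the convention of the original 3DGS loader, where an index with `idx % llffhold == 0` goes to the test set. Whether CF-3DGS does exactly this is an assumption here, and `split_every_nth` is a hypothetical name.

```python
def split_every_nth(items, llffhold=8):
    """Hold out every llffhold-th item for testing, keep the rest for training."""
    train = [x for idx, x in enumerate(items) if idx % llffhold != 0]
    test = [x for idx, x in enumerate(items) if idx % llffhold == 0]
    return train, test


cam_infos = list(range(150))            # stand-in for 150 registered cameras
train_cam_infos, test_cam_infos = split_every_nth(cam_infos)
print(len(train_cam_infos), len(test_cam_infos))   # 131 19
```

The held-out indices are 0, 8, ..., 144, which is 19 cameras and leaves 131 for training.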
COGS
Custom dataset test
Training time log
This post is licensed under CC BY 4.0 by the author.