NoPoSplat Reproduction Notes

Published on 2024/11/15

Author: winka9587

3 min read

dataset

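The link in dataset.md points to pixelSplat, and the authors mention that a preprocessed version of the dataset is used. After emailing the pixelSplat authors, I received a download link for the dataset, which contains five files: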

acid.zip
acid_test_only.zip
point_cloud_figure.zip
re10k.zip
re10k_test_only.zip

Download and extract the archives. After extraction, the directory structure looks like this:

acid.zip

|-- test
|   |-- 000000.torch
|   |-- 000001.torch
    ...
|   |-- 000291.torch
|   |-- 000292.torch
|   `-- index.json
|-- train
|   |-- 000000.torch
|   |-- 000001.torch
    ...
|   |-- 001018.torch
|   |-- 001019.torch
|   `-- index.json
`-- validation
    |-- 000000.torch
    |-- 000001.torch
    |-- 000002.torch
    ...
    |-- 000218.torch
    |-- 000219.torch
    `-- index.json

3 directories, 1537 files

The data is interesting. Taking acid/train/000000.torch as an example, load the file with torch.load and inspect its structure:

List, length: 10
  [0]:
    Dict:
      Key: 'url' -> Type: <class 'str'>
      Value:
        String: https://www.youtube.com/watch?v=-E1q-K738Hk
      Key: 'timestamps' -> Type: <class 'torch.Tensor'>
      Value:
        <class 'torch.Tensor'>: tensor([16080000, 16120000, 16160000, 16200000, 16240000, 16280000, 16320000,
        16360000, 16400000, 16440000, 16480000, 16520000, 16560000, 16600000,
        16640000, 16680000, 16720000, 16760000, 16800000, 16840000, 16880000,
        16920000, 16960000, 17000000, 17040000, 17080000, 17120000, 17160000,
        17200000, 17240000, 17280000, 17320000, 17360000, 17400000, 17440000,
        17480000, 17520000, 17560000, 17600000, 17640000, 17680000, 17720000,
        17760000, 17800000, 17840000, 17880000, 17920000, 17960000, 18000000,
        18040000, 18080000, 18120000, 18160000, 18200000, 18240000, 18280000,
        18320000, 18360000, 18400000, 18440000, 18480000, 18520000, 18560000,
        18600000, 18640000, 18680000, 18720000, 18760000, 18800000, 18840000,
        18880000, 18920000, 18960000, 19000000, 19040000, 19080000, 19120000,
        19160000, 19200000, 19240000, 19280000, 19320000, 19360000, 19400000,
        19440000, 19480000, 19520000, 19560000, 19600000, 19640000, 19680000,
        19720000, 19760000, 19800000, 19840000, 19880000, 19920000, 19960000,
        20000000, 20040000, 20080000, 20120000, 20160000, 20200000, 20240000,
        20280000, 20320000, 20360000, 20400000, 20440000, 20480000, 20520000,
        20560000, 20600000, 20640000, 20680000, 20720000, 20760000, 20800000,
        20840000, 20880000, 20920000, 20960000, 21000000, 21040000, 21080000,
        21120000, 21160000, 21200000, 21240000, 21280000, 21320000, 21360000,
        21400000, 21440000, 21480000, 21520000, 21560000, 21600000, 21640000,
        21680000, 21720000, 21760000, 21800000, 21840000, 21880000, 21920000,
        21960000, 22000000, 22040000, 22080000, 22120000, 22160000, 22200000,
        22240000, 22280000, 22320000, 22360000, 22400000, 22440000, 22480000,
        22520000, 22560000, 22600000, 22640000, 22680000, 22720000, 22760000,
        22800000, 22840000, 22880000, 22920000, 22960000, 23000000, 23040000,
        23080000, 23120000, 23160000, 23200000, 23240000, 23280000, 23320000,
        23360000, 23400000, 23440000, 23480000, 23520000, 23560000, 23600000,
        23640000, 23680000, 23720000, 23760000, 23800000, 23840000, 23880000,
        23920000, 23960000, 24000000, 24040000, 24080000, 24120000, 24160000,
        24200000, 24240000, 24280000, 24320000, 24360000, 24400000, 24440000,
        24480000, 24520000, 24560000, 24600000, 24640000, 24680000, 24720000,
        24760000, 24800000, 24840000, 24880000, 24920000, 24960000, 25000000,
        25040000, 25080000, 25120000, 25160000, 25200000, 25240000, 25280000,
        25320000, 25360000, 25400000, 25440000, 25480000, 25520000, 25560000,
        25600000, 25640000, 25680000, 25720000, 25760000, 25800000, 25840000,
        25880000, 25920000, 25960000, 26000000, 26040000, 26080000, 26120000,
        26160000, 26200000, 26240000, 26280000])
      Key: 'cameras' -> Type: <class 'torch.Tensor'>
      Value:
        <class 'torch.Tensor'>: tensor([[7.5774e-01, 1.3471e+00, 5.0000e-01,  ..., 1.7650e-02, 9.9982e-01,
         8.0739e-01],
        [7.5774e-01, 1.3471e+00, 5.0000e-01,  ..., 1.7898e-02, 9.9981e-01,
         8.4496e-01],
        [7.5774e-01, 1.3471e+00, 5.0000e-01,  ..., 1.8296e-02, 9.9980e-01,
         9.2223e-01],
        ...,
        [7.5774e-01, 1.3471e+00, 5.0000e-01,  ..., 1.0335e-01, 9.9112e-01,
         1.9037e+01],
        [7.5774e-01, 1.3471e+00, 5.0000e-01,  ..., 1.0360e-01, 9.9108e-01,
         1.9102e+01],
        [7.5774e-01, 1.3471e+00, 5.0000e-01,  ..., 1.0377e-01, 9.9104e-01,
         1.9232e+01]])
      Key: 'images' -> Type: <class 'list'>
      Value:
        List, length: 256
          [0]:
            <class 'torch.Tensor'>: tensor([255, 216, 255,  ...,  15, 255, 217], dtype=torch.uint8)
        ...
        Total elements: 256
      Key: 'key' -> Type: <class 'str'>
      Value:
        String: a459940b42a66c49
...
Total elements: 10
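
A dump like the one above can be produced with a small recursive helper. The following is only a minimal sketch: describe and load_chunk are hypothetical names introduced here, not part of NoPoSplat or pixelSplat, and the output format is simplified compared to the listing above. Tensors are detected by duck typing (anything with shape and dtype), so the helper itself does not need PyTorch.

```python
from typing import Any


def describe(obj: Any, indent: int = 0, max_items: int = 1) -> list[str]:
    """Return indented lines summarising a nested list/dict structure."""
    pad = " " * indent
    if isinstance(obj, dict):
        lines = [f"{pad}Dict, keys: {len(obj)}"]
        for key, value in obj.items():
            lines.append(f"{pad}  Key: {key!r} -> Type: {type(value).__name__}")
            lines.extend(describe(value, indent + 4, max_items))
        return lines
    if isinstance(obj, (list, tuple)):
        lines = [f"{pad}List, length: {len(obj)}"]
        # Only show the first max_items entries to keep the dump short.
        for i, item in enumerate(obj[:max_items]):
            lines.append(f"{pad}  [{i}]:")
            lines.extend(describe(item, indent + 4, max_items))
        if len(obj) > max_items:
            lines.append(f"{pad}  ...")
        return lines
    if hasattr(obj, "shape") and hasattr(obj, "dtype"):
        # Tensor-like object (e.g. torch.Tensor): report shape and dtype only.
        return [f"{pad}Tensor: shape={tuple(obj.shape)}, dtype={obj.dtype}"]
    return [f"{pad}{type(obj).__name__}: {obj!r}"]


def load_chunk(path: str):
    """Load one .torch chunk; PyTorch is imported lazily."""
    import torch
    return torch.load(path)
```

Usage, assuming the archive was extracted into the current directory: `print("\n".join(describe(load_chunk("acid/train/000000.torch"))))`. Each entry in images starts with the bytes 255, 216, 255 and ends with 255, 217, which are the JPEG start and end markers, so the images appear to be stored as raw encoded JPEG bytes rather than decoded pixel arrays.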

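The data comes from drone videos on YouTube. The timestamps are recorded so that the corresponding frames can be retrieved from the source video.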
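
To get a feel for the timestamps, here is a small sketch that converts them to seconds and estimates the frame rate. It assumes the values are microseconds from the start of the video, as in the RealEstate10K format that ACID follows; that unit is an assumption, not something stated in the dump above. The function names are made up for illustration.

```python
def timestamps_to_seconds(timestamps_us):
    """Convert timestamps from (assumed) microseconds to seconds."""
    return [t / 1_000_000 for t in timestamps_us]


def estimate_fps(timestamps_us):
    """Estimate frames per second from the mean spacing between timestamps."""
    steps = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    mean_step = sum(steps) / len(steps)
    return 1_000_000 / mean_step


# First four timestamps from the dump above.
sample = [16080000, 16120000, 16160000, 16200000]
print(timestamps_to_seconds(sample))
print(estimate_fps(sample))
```

Under that assumption, the constant step of 40000 between consecutive entries corresponds to 25 frames per second, and the clip starts roughly 16 seconds into the video.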

The cameras field holds the camera parameters (intrinsics and extrinsics), which can be extracted with convert_poses.
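
As an illustration of what convert_poses does, here is a dependency-free sketch for a single camera row. It assumes the 18-value layout used by pixelSplat-style loaders: fx, fy, cx, cy (normalized by image size), two unused values, then a row-major 3x4 world-to-camera matrix. Please verify this against the convert_poses in your own checkout; split_camera_row and invert_rigid are hypothetical names, and the real code operates on batched tensors.

```python
def split_camera_row(row):
    """Split one 18-value camera row into (3x3 intrinsics, 4x4 world-to-camera)."""
    if len(row) != 18:
        raise ValueError(f"expected 18 values, got {len(row)}")
    fx, fy, cx, cy = row[:4]
    intrinsics = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    # row[4:6] is skipped: unused in the assumed layout.
    flat = list(row[6:])
    w2c = [flat[r * 4:(r + 1) * 4] for r in range(3)]
    w2c.append([0.0, 0.0, 0.0, 1.0])
    return intrinsics, w2c


def invert_rigid(w2c):
    """Invert a rigid 4x4 transform: c2w = [R^T | -R^T t]."""
    rot = [r[:3] for r in w2c[:3]]
    t = [r[3] for r in w2c[:3]]
    rot_t = [[rot[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(rot_t[i][j] * t[j] for j in range(3)) for i in range(3)]
    c2w = [rot_t[i] + [t_inv[i]] for i in range(3)]
    c2w.append([0.0, 0.0, 0.0, 1.0])
    return c2w
```

The dump above is consistent with this layout: the first three values of every row are about 0.758, 1.347, and 0.5, which read naturally as normalized fx, fy, and cx. Multiplying fx and cx by the image width (and fy, cy by the height) would give pixel-unit intrinsics.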

key uniquely identifies a scene:

scene = example["key"]

Q: Is acid_test_only identical to the test split in acid?
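
One way to start checking this is to compare the scene keys listed in the two index.json files. The sketch below assumes index.json is a JSON object whose keys are scene keys (as in pixelSplat-style datasets); that structure and the paths in the usage note are assumptions, and the helper names are made up. Matching keys would not prove that the chunk contents are identical, only that the same scenes are listed.

```python
import json


def load_index(path):
    """Load an index.json file into a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def compare_keys(index_a, index_b):
    """Compare the scene keys of two index dicts."""
    a, b = set(index_a), set(index_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": len(a & b),
    }
```

Usage, with hypothetical extraction paths: `compare_keys(load_index("acid/test/index.json"), load_index("acid_test_only/test/index.json"))`. If both only_in_a and only_in_b come back empty, a follow-up step would be to compare the chunk files themselves, for example by file hash.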

Work in progress.

This post is licensed under CC BY 4.0 by the author.
