[NeRF] Generating point clouds, meshes, and free-viewpoint videos from a video

📅 2024/1/18 · ☕ 4 min read

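NeRF can produce point clouds, meshes, and free-viewpoint videos, so I gave it a try
- This time the subject is Red Bull, my drink of choice!
- Free-viewpoint video (GIF version)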

Free-viewpoint video (video version)

Point cloud

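A brief explanation of NeRF and nerfstudio

Volume rendering
- Used loosely here for the act of taking a point $x$ and a viewing direction $d$ as input and outputting $(c, σ)$. Strictly speaking, that mapping is the radiance field, and volume rendering is the step that integrates $c$ and $σ$ along each camera ray to obtain a pixel color.
- $c$ and $σ$ denote color and density, respectively.
NeRF
- Trains an NN model $F_{θ} (x, d)$ that performs volume rendering from a set of images
- The loss is the reconstruction loss against the original images.
- COLMAP is used to estimate the camera poses and the {intrinsic, extrinsic} camera parameters
  - COLMAP's SfM also yields a 3D point cloud, but NeRF does not use it.
A caveat about the word "training"
- Because what is trained is $F_{θ} (x, d)$ for volume rendering, think of it as building a model dedicated to a single object
  - Consequently, in the original NeRF there is no notion of pretraining or of generalizing to other objects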

We optimize a separate neural continuous volume representation network for each scene. This requires only a dataset of captured RGB images of the scene, the corresponding camera poses and intrinsic parameters, and scene bounds (we use ground truth camera poses, intrinsics, and bounds for synthetic data, and use the COLMAP structure-from-motion package to estimate these parameters for real data).
Source: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
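
To make the link between $F_{θ}$ and a rendered pixel concrete, here is a minimal sketch of the discrete compositing rule from the NeRF paper, $\hat{C} = \sum_i T_i (1 - e^{-σ_i δ_i}) c_i$ with $T_i = \exp(-\sum_{j<i} σ_j δ_j)$, applied to the samples of a single ray. The function name `composite_ray` and the plain-list inputs are assumptions made for illustration; nerfstudio's own implementation is batched and runs on the GPU.

```python
import math


def composite_ray(colors, sigmas, deltas):
    """Composite per-sample (color, density) pairs along one ray into a pixel color.

    colors: list of (r, g, b) tuples, one per sample along the ray
    sigmas: list of densities, one per sample
    deltas: list of distances between adjacent samples
    Returns (pixel_rgb, remaining_transmittance).
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # T_i: fraction of light that reaches sample i unoccluded
    for color, sigma, delta in zip(colors, sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this ray segment
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return pixel, transmittance
```

The reconstruction loss mentioned above would then be, for example, the squared error between this composited color and the pixel of the captured image that the ray passes through.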

nerfstudio
- A framework in which a wide range of NeRF papers can be implemented in a unified way
- There is also a paper (SIGGRAPH 2023)
- Nerfstudio: A Modular Framework for Neural Radiance Field Development

Experimental environment

Test environment
- GCP V100
- CUDA 11.8
- nerfstudio == 0.3.4
- pycolmap == 0.4.0
- hloc == 1.4

Preparation: installing COLMAP

None of this is needed if you use nerfstudio's Dockerfile.
- I wanted to run it without Docker, so I followed these steps

sudo apt-get install \
    git \
    cmake \
    ninja-build \
    build-essential \
    libboost-program-options-dev \
    libboost-filesystem-dev \
    libboost-graph-dev \
    libboost-system-dev \
    libeigen3-dev \
    libflann-dev \
    libfreeimage-dev \
    libmetis-dev \
    libgoogle-glog-dev \
    libgtest-dev \
    libsqlite3-dev \
    libglew-dev \
    qtbase5-dev \
    libqt5opengl5-dev \
    libcgal-dev \
    libceres-dev

git clone https://github.com/colmap/colmap.git
cd colmap
mkdir build
cd build
cmake .. -GNinja
ninja
sudo ninja install

Preparation: installing nerfstudio

git clone https://github.com/nerfstudio-project/nerfstudio.git
cd nerfstudio
pip install -e . 
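
Before moving on, it can be worth confirming that the command-line tools used in the following steps are actually on `PATH`. The sketch below is one way to do that; the helper name `missing_tools` is hypothetical, and including `ffmpeg` in the list is an assumption based on nerfstudio extracting frames from the video with it.

```python
import shutil


def missing_tools(tools=("colmap", "ffmpeg", "ns-process-data", "ns-train", "ns-export")):
    """Return the command-line tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("all tools found" if not missing else f"missing: {', '.join(missing)}")
```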

step1: Running SfM on the video

ns-process-data calls COLMAP internally and runs SfM for you

ns-process-data video --data data/redbull/test.mp4 --output-dir data/redbull

If SfM fails at this step, the cause is often the limits of what COLMAP can handle, so use hloc instead
- However, be sure to use the following versions
  - pycolmap == 0.4.0
  - hloc == 1.4
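
One way to judge whether SfM worked well enough is to compare the number of extracted frames with the number of frames that received a camera pose. The sketch below assumes the usual layout written by ns-process-data (a `transforms.json` with a `frames` list, plus an `images/` directory); the function name `registration_counts` is hypothetical.

```python
import json
from pathlib import Path


def registration_counts(dataset_dir):
    """Return (registered_frames, extracted_images) for a processed dataset directory."""
    dataset_dir = Path(dataset_dir)
    with open(dataset_dir / "transforms.json") as f:
        transforms = json.load(f)
    # Each entry in "frames" is an image for which a camera pose was estimated
    registered = len(transforms.get("frames", []))

    images_dir = dataset_dir / "images"
    extracted = 0
    if images_dir.is_dir():
        extracted = sum(
            1 for p in images_dir.iterdir() if p.suffix.lower() in {".png", ".jpg", ".jpeg"}
        )
    return registered, extracted
```

For example, `registration_counts("data/redbull")` returning far fewer registered frames than extracted images suggests that matching failed for much of the video. As far as I know, ns-process-data can then be pointed at hloc with its `--sfm-tool hloc` option, but check `ns-process-data video --help` for the version you have installed.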

step2: Building a 3D model from the SfM result with NeRF

Train NeRF on the SfM output

ns-train nerfacto --vis viewer --data data/redbull/

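VRAM usage is roughly this much (about 17 GB of 24 GB)

step3: Results

The viewer shows the camera images whose positions were estimated by SfM, together with the image volume-rendered from the selected viewpoint

step4: Extracting the point cloud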

ns-export pointcloud --load-config outputs/test2/nerfacto/2024-01-17_191700/config.yml --output-dir exports/mesh/ --normal-output-name rgb
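
As a quick sanity check on the export, the number of points can be read straight from the PLY header, which is plain text even when the body is binary. This is a minimal sketch; the function name `ply_vertex_count` is hypothetical, and the output file name (I would expect something like `point_cloud.ply` inside the output directory) is an assumption.

```python
def ply_vertex_count(ply_path):
    """Read the vertex count from a PLY header (works for ASCII and binary PLY)."""
    with open(ply_path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            tokens = raw.strip().split()
            if tokens[:2] == [b"element", b"vertex"]:
                return int(tokens[2])
            if tokens == [b"end_header"]:
                break
    raise ValueError("no vertex element in PLY header")
```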

step5: Creating the mesh

ns-export poisson --load-config outputs/test2/nerfacto/2024-01-17_191700/config.yml --output-dir exports/mesh/ --normal-output-name rgb

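Refresher
- An obj file normally references an mtl file that describes the surface properties (via `mtllib`), and the mtl file in turn references the png file holding the texture (via `map_Kd`)
- So as long as the mtl and the texture png sit in the same directory as the obj file, simply loading the obj file also loads the texture
- The rendering code is given later in this post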
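
If the mesh shows up untextured, the reference chain described above is the first thing to check. The following sketch walks `mtllib` and `map_Kd` entries and reports which referenced files are missing; the function name `find_missing_assets` is hypothetical, and it assumes file names without spaces, resolved relative to the obj file's directory.

```python
from pathlib import Path


def find_missing_assets(obj_path):
    """List files referenced by an obj (mtllib) and its mtl files (map_Kd) that do not exist."""
    obj_path = Path(obj_path)
    base = obj_path.parent
    missing = []
    for line in obj_path.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2 or parts[0] != "mtllib":
            continue
        mtl = base / parts[1].strip()
        if not mtl.is_file():
            missing.append(mtl.name)
            continue
        for mtl_line in mtl.read_text().splitlines():
            tokens = mtl_line.split()
            if len(tokens) >= 2 and tokens[0] == "map_Kd":
                # The texture file name is the last token; options may precede it
                texture = base / tokens[-1]
                if not texture.is_file():
                    missing.append(texture.name)
    return missing
```

An empty list means every referenced mtl and texture file was found next to the obj.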

step6: resume

The following command resumes training from a partially trained model

ns-train nerfacto --data data/redbull --load-dir outputs/redbull/nerfacto/2024-01-14_005908/nerfstudio_models
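
Since each run is stored under a timestamped directory, the path passed to `--load-dir` changes every time. A small helper like the one below can pick the most recent run; it is a sketch that assumes the `outputs/<experiment>/<method>/<timestamp>/nerfstudio_models` layout seen in the command above, and the function name `latest_checkpoint_dir` is hypothetical.

```python
from pathlib import Path


def latest_checkpoint_dir(outputs_root, experiment, method="nerfacto"):
    """Return the nerfstudio_models directory of the most recent run, or None if there is none."""
    runs_dir = Path(outputs_root) / experiment / method
    if not runs_dir.is_dir():
        return None
    # Timestamps such as 2024-01-14_005908 sort chronologically as plain strings
    runs = sorted(
        (p for p in runs_dir.iterdir() if (p / "nerfstudio_models").is_dir()),
        key=lambda p: p.name,
    )
    return runs[-1] / "nerfstudio_models" if runs else None
```

For example, `latest_checkpoint_dir("outputs", "redbull")` would give the directory to pass to `--load-dir`.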

(optional) An alternative to COLMAP: Hierarchical-Localization

hloc
https://github.com/cvg/Hierarchical-Localization
- An implementation of From Coarse to Fine: Robust Hierarchical Localization at Large Scale
- Built on more modern techniques than COLMAP.
- Uses an NN based on MobileNet.

git clone --recursive https://github.com/cvg/Hierarchical-Localization/
cd Hierarchical-Localization/
git checkout v1.4
python -m pip install -e .

(optional) Rendering code

import numpy as np
import open3d as o3d
from sklearn.cluster import KMeans


class GUIEditor:
    def __init__(self) -> None:
        self.coordinate_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.1, origin=[0, 0, 0])

    def run(self, obj_path):
        # Load the textured mesh; the mtl and texture png must sit next to the obj
        mesh = o3d.io.read_triangle_mesh(obj_path, enable_post_processing=True)
        assert mesh.textures, "no texture was loaded; check the mtl/png next to the obj"

        # Interactive picking: repeat until at least two vertices are selected
        picked_coords = []
        while len(picked_coords) < 2:
            picked_points = self._pick_points(mesh)
            picked_coords = [pp.coord for pp in picked_points]

        # Clustering: reduce the picked vertices to two representative points
        vertices = np.asarray(picked_coords)
        kmeans = KMeans(n_clusters=2, n_init=10)
        kmeans.fit(vertices)
        centroids = kmeans.cluster_centers_

        # Visualize the mesh with a thin red cylinder joining the two centroids
        vis = o3d.visualization.Visualizer()
        vis.create_window()
        cylinder = self._create_cylinder_between_points(centroids[0], centroids[1], color=[1, 0, 0])
        vis.add_geometry(cylinder)
        vis.add_geometry(mesh)
        vis.add_geometry(self.coordinate_frame)
        vis.run()
        vis.destroy_window()

    def _create_cylinder_between_points(self, p1, p2, color, radius=0.001):
        distance = np.linalg.norm(p1 - p2)
        mid_point = (p1 + p2) / 2

        # Open3D creates the cylinder along the z axis, centered at the origin
        cylinder = o3d.geometry.TriangleMesh.create_cylinder(radius=radius, height=distance)
        cylinder.paint_uniform_color(color)

        # Rotate the z axis onto the direction p1 -> p2 (axis-angle representation)
        vec = (p2 - p1) / distance
        z_axis = np.array([0.0, 0.0, 1.0])
        rotation_axis = np.cross(z_axis, vec)
        axis_norm = np.linalg.norm(rotation_axis)
        if axis_norm < 1e-8:
            # vec is (anti)parallel to z: the cross product vanishes, so use any perpendicular axis
            rotation_axis = np.array([1.0, 0.0, 0.0])
        else:
            rotation_axis /= axis_norm
        rotation_angle = np.arccos(np.clip(np.dot(z_axis, vec), -1.0, 1.0))
        rotation_matrix = o3d.geometry.get_rotation_matrix_from_axis_angle(rotation_axis * rotation_angle)

        cylinder.rotate(rotation_matrix, center=cylinder.get_center())
        cylinder.translate(mid_point - cylinder.get_center())

        return cylinder

    def _pick_points(self, mesh):
        print("Showing mesh. Please click on the mesh to select points...")
        vis = o3d.visualization.VisualizerWithVertexSelection()
        vis.create_window()
        vis.add_geometry(mesh)
        vis.add_geometry(self.coordinate_frame)
        vis.run()
        vis.destroy_window()
        return vis.get_picked_points()


def main():
    editor = GUIEditor()
    editor.run("mesh.obj")


if __name__ == "__main__":
    main()
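
The rotation step in `_create_cylinder_between_points` can be checked independently of Open3D. The sketch below builds the matrix that maps the +z axis onto a given direction using Rodrigues' formula, written in pure Python so that it runs anywhere; the function name `rotation_from_z` is hypothetical, and it is meant as an illustration of the same axis-angle idea rather than a replacement for the Open3D call.

```python
import math


def rotation_from_z(direction):
    """Return a 3x3 rotation matrix (nested lists) that maps the +z axis onto `direction`."""
    norm = math.sqrt(sum(v * v for v in direction))
    x, y, z = (v / norm for v in direction)
    # axis = z_axis x direction = (-y, x, 0); its length is sin(theta), and cos(theta) = z
    sin_t = math.hypot(x, y)
    if sin_t < 1e-8:
        # direction is (anti)parallel to z: identity, or a half turn about the x axis
        if z > 0:
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    kx, ky = -y / sin_t, x / sin_t  # unit rotation axis (its z component is 0)
    c, s = z, sin_t
    # Rodrigues' formula R = I + s*K + (1 - c)*K^2 for the axis k = (kx, ky, 0)
    return [
        [c + kx * kx * (1 - c), kx * ky * (1 - c), ky * s],
        [kx * ky * (1 - c), c + ky * ky * (1 - c), -kx * s],
        [-ky * s, kx * s, c],
    ]
```

The third column of the returned matrix is the image of the z axis, so it should equal the normalized input direction.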

Author

YuWd (Yuiga Wada)

Machine learning, competitive programming, iOS, Web
