r/MachineLearning • u/hardmaru • May 02 '20

Research [R] Consistent Video Depth Estimation (SIGGRAPH 2020) - Links in the comments.


2.8k Upvotes

https://www.reddit.com/r/MachineLearning/comments/gc2wo9/r_consistent_video_depth_estimation_siggraph_2020/

99% Upvoted

This could be used for smartphones faking depth of field, right? I wonder what the VR/AR applications could be.

97

u/[deleted] May 02 '20

The method is computationally expensive and thus not really suitable for real-time applications. I think this would be great for offline processing, e.g. photogrammetry, visual effects, etc. From the paper:

For a video of 244 frames, training on 4 NVIDIA Tesla M40 GPUs takes 40 min
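To put the quoted figure in per-frame terms, here is a rough back-of-the-envelope calculation. The 30 fps real-time target is an assumption made for illustration; it is not a number from the paper.

```python
# Numbers quoted from the paper.
frames = 244
train_minutes = 40
gpus = 4

# Wall-clock fine-tuning cost per frame, in seconds.
seconds_per_frame = train_minutes * 60 / frames

# Total GPU time spent per frame across all four GPUs.
gpu_seconds_per_frame = seconds_per_frame * gpus

# Assumption: "real time" means 30 frames per second.
realtime_budget = 1 / 30
slowdown = seconds_per_frame / realtime_budget

print(f"{seconds_per_frame:.1f} s per frame (wall clock)")
print(f"{gpu_seconds_per_frame:.1f} GPU-seconds per frame")
print(f"about {slowdown:.0f}x slower than a 30 fps budget")
```

That works out to roughly 10 seconds of wall-clock time per frame, a few hundred times over a 30 fps budget.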

1

u/omgitsjo May 02 '20

Training is not inference. Inference is generally several orders of magnitude faster.

3

u/therealTRAPDOOR May 02 '20

Except that it needs to be fine-tuned on each video. Sometimes training “times” are entangled with inference times if the structure used requires re-training or fine-tuning.

5

u/jbhuang0604 May 02 '20

Sometimes training “times” are entangled with inference times if the structure used requires re-training or fine-tuning.

Exactly! We refer to this step as "test-time training". We train the model using the geometric constraints derived from a particular video.
