Profiling¶
gsplat is developed with efficiency in mind, while also supports trade-offs between
memory consumption and speed. The rasterization()
function has a few arguments
that would not affect the numerical results but could significantly impact the runtime and
memory usage, including packed and sparse_grad:
packed: If True, the rasterization process will be performed in a memory-efficient way, in which the intermidate tensors are packed into the sparse tensor layout. This could greatly reduce the memory usage when the scene is large and each camera only sees a small portion of the scene. But this also introduces a small runtime overhead. It is recommended to set
packed=True
when the scene is large and setpacked=False
when the scene is small (relative to the camera frustum).sparse_grad: This argument is only effective when
packed=True
. If True, in addition to the intermidate tensors, the gradients will also be packed into a coo sparse tensor layout. This could further reduce the memory usage when training a large scene as the gradients of the Gaussian attributes are usually sparse. Note that in most cases, sparse gradients should be used together with a sparse optimizer, such as torch.optim.SparseAdam. And currently we only supports sparse gradients for part of the Gaussian attributes. Seerasterization()
for more details.
Here we provide careful profiling of the performance with the different rasterization backends, along with the impact of the above arguments in gsplat. “Mem” denotes for the amount of GPU memory allocated by the forward + backward rasterization process (excluding the input data), which is calculated using the diff of torch.cuda.max_memory_allocated() before and after the rasterization operation. Evaluations are conducted on a 24GB NVIDIA TITAN RTX GPU. (commit 8ea2ea3)
Render RGB Images¶
Batch size 1.
python profiling/main.py --backend gsplat gsplat-legacy inria \
--batch_size 1 --scene_grid 5 --channels 3
Backend |
Packed |
Sparse Grad |
Mem (GB) |
FPS[fwd] |
FPS[bwd] |
---|---|---|---|---|---|
gsplat v1.0.0 |
True |
True |
0.35 |
132.0 |
84.8 |
gsplat v1.0.0 |
True |
False |
0.35 |
160.8 |
88.4 |
gsplat v1.0.0 |
False |
False |
0.48 |
171.8 |
97.1 |
gsplat v0.1.11 |
n/a |
n/a |
0.62 |
129.9 |
91.1 |
diff-gaussian-rasterization |
n/a |
n/a |
1.00 |
164.5 |
41.5 |
Batch size 4.
python profiling/main.py --backend gsplat gsplat-legacy inria \
--batch_size 4 --scene_grid 5 --channels 3
Backend |
Packed |
Sparse Grad |
Mem (GB) |
FPS[fwd] |
FPS[bwd] |
---|---|---|---|---|---|
gsplat v1.0.0 |
True |
True |
1.41 |
42.9 x 4 |
24.1 x 4 |
gsplat v1.0.0 |
True |
False |
1.41 |
43.9 x 4 |
24.0 x 4 |
gsplat v1.0.0 |
False |
False |
2.05 |
46.1 x 4 |
25.5 x 4 |
gsplat v0.1.11 |
n/a |
n/a |
1.83 |
32.5 x 4 |
21.6 x 4 |
diff-gaussian-rasterization |
n/a |
n/a |
3.91 |
42.7 x 4 |
10.1 x 4 |
Render Feature Maps: 32 Channel¶
Batch size 1.
python profiling/main.py --backend gsplat gsplat-legacy inria \
--batch_size 1 --scene_grid 1 --channels 32
Backend |
Packed |
Sparse Grad |
Mem (GB) |
FPS[fwd] |
FPS[bwd] |
---|---|---|---|---|---|
gsplat v1.0.0 |
True |
True |
0.61 |
124.5 |
43.6 |
gsplat v1.0.0 |
True |
False |
0.61 |
157.0 |
44.3 |
gsplat v1.0.0 |
False |
False |
0.61 |
168.4 |
44.2 |
gsplat v0.1.11 |
n/a |
n/a |
0.83 |
18.3 |
6.9 |
diff-gaussian-rasterization |
n/a |
n/a |
3.66 |
28.9 |
5.0 |
Batch size 4.
python profiling/main.py --backend gsplat gsplat-legacy inria \
--batch_size 4 --scene_grid 1 --channels 32
Backend |
Packed |
Sparse Grad |
Mem (GB) |
FPS[fwd] |
FPS[bwd] |
---|---|---|---|---|---|
gsplat v1.0.0 |
True |
True |
2.45 |
36.8 x 4 |
10.9 x 4 |
gsplat v1.0.0 |
True |
False |
2.45 |
40.4 x 4 |
10.9 x 4 |
gsplat v1.0.0 |
False |
False |
2.48 |
42.1 x 4 |
10.9 x 4 |
gsplat v0.1.11 |
n/a |
n/a |
3.28 |
4.5 x 4 |
1.7 x 4 |
diff-gaussian-rasterization |
n/a |
n/a |
14.52 |
7.1 x 4 |
1.2 x 4 |
Render a Large Scene¶
49M Gaussians.
python profiling/main.py --backend gsplat gsplat-legacy inria \
--batch_size 1 --scene_grid 21 --channels 3
Backend |
Packed |
Sparse Grad |
Mem (GB) |
FPS[fwd] |
FPS[bwd] |
---|---|---|---|---|---|
gsplat v1.0.0 |
True |
True |
1.44 |
53.7 |
39.2 |
gsplat v1.0.0 |
True |
False |
3.08 |
62.1 |
34.6 |
gsplat v1.0.0 |
False |
False |
5.67 |
59.2 |
37.5 |
gsplat v0.1.11 |
n/a |
n/a |
9.86 |
23.8 |
21.1 |
diff-gaussian-rasterization |
n/a |
n/a |
10.84 |
38.3 |
18.8 |
107M Gaussians.
python profiling/main.py --backend gsplat gsplat-legacy inria \
--batch_size 1 --scene_grid 31 --channels 3
Backend |
Packed |
Sparse Grad |
Mem (GB) |
FPS[fwd] |
FPS[bwd] |
---|---|---|---|---|---|
gsplat v1.0.0 |
True |
True |
2.31 |
45.1 |
38.4 |
gsplat v1.0.0 |
True |
False |
6.11 |
47.3 |
28.9 |
gsplat v1.0.0 |
False |
False |
12.17 |
39.3 |
25.8 |
gsplat v0.1.11 |
n/a |
n/a |
OOM |
OOM |
OOM |
diff-gaussian-rasterization |
n/a |
n/a |
OOM |
OOM |
OOM |