Utils

Below are the basic functions that support rasterization.

3DGS

spherical_harmonics(degrees_to_use: int, dirs: Tensor, coeffs: Tensor, masks: Tensor | None = None) → Tensor

Computes spherical harmonics.

Parameters:
  • degrees_to_use – The degree to be used.

  • dirs – Directions. […, 3]

  • coeffs – Coefficients. […, K, 3]

  • masks – Optional boolean masks to skip some computation. […,] Default: None.

Returns:

Spherical harmonics. […, 3]
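As a rough illustration of how the shapes relate, a degree-d evaluation needs K = (d + 1)^2 coefficients per color channel. The sketch below evaluates degrees 0 and 1 for a single direction in plain Python. The helper names (num_sh_bases, eval_sh_deg1) are hypothetical, and the basis ordering and signs follow the convention commonly seen in 3DGS code, which is an assumption here rather than something this page specifies.

```python
import math

# Real spherical-harmonics constants for bands 0 and 1.
C0 = 0.5 / math.sqrt(math.pi)
C1 = math.sqrt(3.0 / (4.0 * math.pi))


def num_sh_bases(degree):
    """Number of SH coefficients K needed for a given degree."""
    return (degree + 1) ** 2


def eval_sh_deg1(degree, direction, coeffs):
    """Evaluate SH up to degree 1 for one direction.

    direction: (x, y, z); it is normalized here.
    coeffs: list of K RGB triples, one per basis function.
    Assumed basis order and signs: [C0, -C1*y, C1*z, -C1*x].
    """
    if degree > 1:
        raise ValueError("this sketch only covers degrees 0 and 1")
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    basis = [C0]
    if degree >= 1:
        basis += [-C1 * y, C1 * z, -C1 * x]
    # Weighted sum of the coefficients, per color channel.
    return tuple(
        sum(b * coeffs[k][ch] for k, b in enumerate(basis)) for ch in range(3)
    )
```

With degree 0 the result does not depend on the direction at all, which is why a single coefficient per channel is enough to represent a constant color.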

quat_scale_to_covar_preci(quats: Tensor, scales: Tensor, compute_covar: bool = True, compute_preci: bool = True, triu: bool = False) → Tuple[Tensor | None, Tensor | None]

Converts quaternions and scales to covariance and precision matrices.

Parameters:
  • quats – Quaternions (No need to be normalized). [N, 4]

  • scales – Scales. [N, 3]

  • compute_covar – Whether to compute covariance matrices. Default: True. If False, the returned covariance matrices will be None.

  • compute_preci – Whether to compute precision matrices. Default: True. If False, the returned precision matrices will be None.

  • triu – If True, the return matrices will be upper triangular. Default: False.

Returns:

  • Covariance matrices. If triu is True the returned shape is [N, 6], otherwise [N, 3, 3].

  • Precision matrices. If triu is True the returned shape is [N, 6], otherwise [N, 3, 3].

Return type:

A tuple
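The underlying math can be sketched for a single Gaussian: with rotation R from the quaternion and S = diag(scales), the covariance is R S S^T R^T and the precision is its inverse, R S^-2 R^T. The code below is a dependency-free illustration, not the library implementation. The (w, x, y, z) quaternion order and the row-major [xx, xy, xz, yy, yz, zz] ordering of the flattened upper triangle are assumptions, and all function names are hypothetical.

```python
import math


def quat_to_rotmat(q):
    """Rotation matrix from a quaternion, assumed to be in (w, x, y, z) order."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize first
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covar_and_preci(quat, scale):
    """Return (covariance, precision) as 3x3 nested lists for one Gaussian."""
    R = quat_to_rotmat(quat)

    def sandwich(diag):
        # Computes R @ diag(diag) @ R^T.
        return [
            [sum(R[i][k] * diag[k] * R[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)
        ]

    covar = sandwich([s * s for s in scale])
    preci = sandwich([1.0 / (s * s) for s in scale])
    return covar, preci


def upper_triangle(m):
    """Flatten a symmetric 3x3 matrix to [xx, xy, xz, yy, yz, zz]."""
    return [m[0][0], m[0][1], m[0][2], m[1][1], m[1][2], m[2][2]]
```

Because both matrices are symmetric, six values per Gaussian carry the same information as the full 3x3 matrix, which is what the triu option exploits.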

proj(means: Tensor, covars: Tensor, Ks: Tensor, width: int, height: int, camera_model: typing_extensions.Literal[pinhole, ortho, fisheye] = 'pinhole') → Tuple[Tensor, Tensor]

Projection of Gaussians (perspective or orthographic).

Parameters:
  • means – Gaussian means. [C, N, 3]

  • covars – Gaussian covariances. [C, N, 3, 3]

  • Ks – Camera intrinsics. [C, 3, 3]

  • width – Image width.

  • height – Image height.

Returns:

  • Projected means. [C, N, 2]

  • Projected covariances. [C, N, 2, 2]

Return type:

A tuple
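For the pinhole case, the projection of one camera-space Gaussian can be sketched as a perspective divide for the mean plus a first-order (Jacobian) approximation for the covariance. This is a minimal illustration under stated assumptions: the function name project_pinhole is hypothetical, the intrinsics are passed as separate scalars instead of a [3, 3] matrix, and any clamping to the image bounds that a real implementation may apply is omitted.

```python
def project_pinhole(mean, covar, fx, fy, cx, cy):
    """Project one camera-space Gaussian with a pinhole camera.

    mean: (x, y, z) in camera space with z > 0.
    covar: 3x3 nested list.
    Returns ((u, v), 2x2 covariance), both in pixel units.
    """
    x, y, z = mean
    u = fx * x / z + cx
    v = fy * y / z + cy
    # Jacobian of (u, v) with respect to (x, y, z).
    J = [
        [fx / z, 0.0, -fx * x / (z * z)],
        [0.0, fy / z, -fy * y / (z * z)],
    ]
    # cov2d = J @ covar @ J^T
    JC = [
        [sum(J[i][k] * covar[k][j] for k in range(3)) for j in range(3)]
        for i in range(2)
    ]
    cov2d = [
        [sum(JC[i][k] * J[j][k] for k in range(3)) for j in range(2)]
        for i in range(2)
    ]
    return (u, v), cov2d
```

For example, a unit-covariance Gaussian on the optical axis at depth 2 with fx = fy = 100 projects to a 2D covariance of 2500 on the diagonal, that is, a standard deviation of 50 pixels.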

fully_fused_projection(means: Tensor, covars: Tensor | None, quats: Tensor | None, scales: Tensor | None, viewmats: Tensor, Ks: Tensor, width: int, height: int, eps2d: float = 0.3, near_plane: float = 0.01, far_plane: float = 10000000000.0, radius_clip: float = 0.0, packed: bool = False, sparse_grad: bool = False, calc_compensations: bool = False, camera_model: typing_extensions.Literal[pinhole, ortho, fisheye] = 'pinhole') → Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]

Projects Gaussians to 2D.

This function fuses the process of computing covariances (quat_scale_to_covar_preci()), transforming to camera space (world_to_cam()), and projection (proj()).

Note

During projection, Gaussians outside the camera frustum are ignored, so not all elements in the output tensors are valid. The output radii serve as an indicator: a zero radius means the corresponding element is invalid and will be ignored in the subsequent rasterization step. If packed=True, the outputs are packed into flattened tensors in which all elements are valid. In this case, camera_ids and gaussian_ids tensors are also returned to indicate the row (camera) and column (Gaussian) index of each packed element, essentially following the COO sparse tensor format.

Note

This function supports projecting Gaussians given either covariances or {quaternions, scales}; the latter are converted to covariances internally in a fused CUDA kernel. Provide either covars or {quats, scales}.
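The relationship between the unpacked [C, N] layout and the packed [nnz] layout can be sketched in plain Python. This is only an illustration of the COO-style bookkeeping described in the first note, with nested lists standing in for tensors; pack_by_radii is a hypothetical helper, not a gsplat function.

```python
def pack_by_radii(radii, values):
    """Keep only entries with radius > 0 and record their COO indices.

    radii: C x N nested list of ints.
    values: C x N nested list of per-Gaussian outputs (any type).
    Returns (camera_ids, gaussian_ids, packed_values), each of length nnz.
    """
    camera_ids, gaussian_ids, packed = [], [], []
    for cam, row in enumerate(radii):
        for gid, radius in enumerate(row):
            if radius > 0:  # zero radius marks an invalid (culled) entry
                camera_ids.append(cam)
                gaussian_ids.append(gid)
                packed.append(values[cam][gid])
    return camera_ids, gaussian_ids, packed


# Two cameras, three Gaussians; three of the six projections are valid.
radii = [[3, 0, 2], [0, 0, 5]]
depths = [[1.0, 9.9, 2.0], [9.9, 9.9, 0.5]]
camera_ids, gaussian_ids, packed_depths = pack_by_radii(radii, depths)
```

Here nnz is 3, and entry i of every packed output belongs to camera camera_ids[i] and Gaussian gaussian_ids[i].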

Parameters:
  • means – Gaussian means. [N, 3]

  • covars – Gaussian covariances (flattened upper triangle). [N, 6] Optional.

  • quats – Quaternions (No need to be normalized). [N, 4] Optional.

  • scales – Scales. [N, 3] Optional.

  • viewmats – World-to-camera matrices. [C, 4, 4]

  • Ks – Camera intrinsics. [C, 3, 3]

  • width – Image width.

  • height – Image height.

  • eps2d – An epsilon added to the 2D covariance for numerical stability. Default: 0.3.

  • near_plane – Near plane distance. Default: 0.01.

  • far_plane – Far plane distance. Default: 1e10.

  • radius_clip – Gaussians with projected radii smaller than this value will be ignored. Default: 0.0.

  • packed – If True, the output tensors will be packed into a flattened tensor. Default: False.

  • sparse_grad – This is only effective when packed is True. If True, during backward the gradients of {means, covars, quats, scales} will be a sparse Tensor in COO layout. Default: False.

  • calc_compensations – If True, a view-dependent opacity compensation factor will be computed, which is useful for anti-aliasing. Default: False.

Returns:

If packed is True:

  • camera_ids. The row indices of the projected Gaussians. Int32 tensor of shape [nnz].

  • gaussian_ids. The column indices of the projected Gaussians. Int32 tensor of shape [nnz].

  • radii. The maximum radius of the projected Gaussians in pixel unit. Int32 tensor of shape [nnz].

  • means. Projected Gaussian means in 2D. [nnz, 2]

  • depths. The z-depth of the projected Gaussians. [nnz]

  • conics. Inverse of the projected covariances, returned as the flattened upper triangle. [nnz, 3]

  • compensations. The view-dependent opacity compensation factor. [nnz]

If packed is False:

  • radii. The maximum radius of the projected Gaussians in pixel unit. Int32 tensor of shape [C, N].

  • means. Projected Gaussian means in 2D. [C, N, 2]

  • depths. The z-depth of the projected Gaussians. [C, N]

  • conics. Inverse of the projected covariances, returned as the flattened upper triangle. [C, N, 3]

  • compensations. The view-dependent opacity compensation factor. [C, N]

Return type:

A tuple
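How eps2d, conics, and compensations relate can be sketched for one projected Gaussian. The sketch assumes that eps2d is added to the diagonal of the 2D covariance, that the conic is the inverse of that blurred covariance, and that the compensation factor is the square root of the ratio of determinants before and after blurring. The last point is an assumption about the formula, not something stated on this page, and conic_and_compensation is a hypothetical helper.

```python
import math


def conic_and_compensation(cov2d, eps2d=0.3):
    """cov2d: (a, b, c) for the symmetric matrix [[a, b], [b, c]].

    Returns (conic, compensation), where conic is the upper triangle of
    the inverse of the blurred covariance.
    """
    a, b, c = cov2d
    det_orig = a * c - b * b
    a_blur, c_blur = a + eps2d, c + eps2d  # add eps2d to the diagonal
    det_blur = a_blur * c_blur - b * b
    # Inverse of [[a_blur, b], [b, c_blur]], upper triangle only.
    conic = (c_blur / det_blur, -b / det_blur, a_blur / det_blur)
    # Assumed compensation: how much the blur inflated the footprint.
    compensation = math.sqrt(max(0.0, det_orig / det_blur))
    return conic, compensation
```

Under this assumption the compensation is always in [0, 1] and approaches 1 for Gaussians that are large compared to eps2d, so only small, heavily blurred splats have their opacity reduced.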

isect_tiles(means2d: Tensor, radii: Tensor, depths: Tensor, tile_size: int, tile_width: int, tile_height: int, sort: bool = True, packed: bool = False, n_cameras: int | None = None, camera_ids: Tensor | None = None, gaussian_ids: Tensor | None = None) → Tuple[Tensor, Tensor, Tensor]

Maps projected Gaussians to intersecting tiles.

Parameters:
  • means2d – Projected Gaussian means. [C, N, 2] if packed is False, [nnz, 2] if packed is True.

  • radii – Maximum radii of the projected Gaussians. [C, N] if packed is False, [nnz] if packed is True.

  • depths – Z-depth of the projected Gaussians. [C, N] if packed is False, [nnz] if packed is True.

  • tile_size – Tile size.

  • tile_width – Number of tiles along the image width.

  • tile_height – Number of tiles along the image height.

  • sort – If True, the returned intersections will be sorted by the intersection ids. Default: True.

  • packed – If True, the input tensors are packed. Default: False.

  • n_cameras – Number of cameras. Required if packed is True.

  • camera_ids – The row indices of the projected Gaussians. Required if packed is True.

  • gaussian_ids – The column indices of the projected Gaussians. Required if packed is True.

Returns:

  • Tiles per Gaussian. The number of tiles intersected by each Gaussian. Int32 [C, N] if packed is False, Int32 [nnz] if packed is True.

  • Intersection ids. Each id is a 64-bit integer with the following layout: camera_id (Xc bits) | tile_id (Xt bits) | depth (32 bits). Xc and Xt are the maximum number of bits required to represent the camera and tile ids, respectively. Int64 [n_isects]

  • Flatten ids. The global flatten indices in [C * N] or [nnz] (packed). [n_isects]

Return type:

A tuple
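The id layout camera_id | tile_id | depth can be illustrated with plain integer arithmetic. The sketch below reinterprets a float32 depth as a 32-bit unsigned integer (which preserves ordering for non-negative floats) and places the tile and camera ids in the higher bits, so that sorting the ids groups intersections by camera, then tile, then depth. The function names and the choice of n_tiles.bit_length() as the tile field width are assumptions for illustration.

```python
import struct


def depth_bits(depth):
    """Reinterpret a float32 depth as an unsigned 32-bit integer.

    For non-negative floats this preserves ordering.
    """
    return struct.unpack("<I", struct.pack("<f", depth))[0]


def encode_isect_id(camera_id, tile_id, depth, n_tiles):
    """Pack camera_id | tile_id | depth (32 bits) into one integer."""
    tile_bits = n_tiles.bit_length()  # assumed width of the tile field
    return (camera_id << (tile_bits + 32)) | (tile_id << 32) | depth_bits(depth)


def decode_isect_id(isect_id, n_tiles):
    """Recover (camera_id, tile_id) from a packed id."""
    tile_bits = n_tiles.bit_length()
    tile_id = (isect_id >> 32) & ((1 << tile_bits) - 1)
    camera_id = isect_id >> (tile_bits + 32)
    return camera_id, tile_id


# Sorting the packed ids orders by camera, then tile, then depth.
n_tiles = 12
ids = sorted(
    encode_isect_id(cam, tile, depth, n_tiles)
    for cam, tile, depth in [(1, 3, 2.0), (0, 5, 1.0), (0, 5, 0.5), (0, 2, 9.0)]
)
order = [decode_isect_id(i, n_tiles) for i in ids]
```

A single integer sort is therefore enough to obtain, for every tile of every camera, a contiguous near-to-far list of intersecting Gaussians.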

isect_offset_encode(isect_ids: Tensor, n_cameras: int, tile_width: int, tile_height: int) → Tensor

Encodes intersection ids to offsets.

Parameters:
  • isect_ids – Intersection ids. [n_isects]

  • n_cameras – Number of cameras.

  • tile_width – Number of tiles along the image width.

  • tile_height – Number of tiles along the image height.

Returns:

Offsets. [C, tile_height, tile_width]
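One way to picture the offsets is as a lookup table: for each (camera, tile), the index of the first entry in the sorted intersection list that belongs to it. The sketch below builds such a table with a binary search over (camera_id, tile_id) keys. It assumes a row-major tile id (tile_y * tile_width + tile_x) and that an empty tile receives the start offset of the next non-empty one; both are assumptions, and offsets_from_sorted_keys is a hypothetical helper.

```python
from bisect import bisect_left


def offsets_from_sorted_keys(keys, n_cameras, tile_width, tile_height):
    """Build a [C][tile_height][tile_width] table of start offsets.

    keys: sorted list of (camera_id, tile_id) pairs, one per intersection,
          with tile_id = tile_y * tile_width + tile_x (assumed row-major).
    Each table entry is the index of the first intersection belonging to
    that (camera, tile); empty tiles get the start of the next tile.
    """
    n_tiles = tile_width * tile_height
    flat = [cam * n_tiles + tile for cam, tile in keys]
    return [
        [
            [
                bisect_left(flat, cam * n_tiles + ty * tile_width + tx)
                for tx in range(tile_width)
            ]
            for ty in range(tile_height)
        ]
        for cam in range(n_cameras)
    ]


# Four intersections over two cameras with a 2x2 tile grid each.
keys = [(0, 0), (0, 0), (0, 2), (1, 1)]
offsets = offsets_from_sorted_keys(keys, n_cameras=2, tile_width=2, tile_height=2)
```

With this table, the intersections of a tile occupy the half-open range from its own offset to the offset of the next tile (or n_isects for the very last tile).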

world_to_cam(means: Tensor, covars: Tensor, viewmats: Tensor) → Tuple[Tensor, Tensor]

Transforms Gaussians from world to camera coordinate system.

Parameters:
  • means – Gaussian means. [N, 3]

  • covars – Gaussian covariances. [N, 3, 3]

  • viewmats – World-to-camera transformation matrices. [C, 4, 4]

Returns:

  • Gaussian means in camera coordinate system. [C, N, 3]

  • Gaussian covariances in camera coordinate system. [C, N, 3, 3]

Return type:

A tuple
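For a single Gaussian and a single camera the transform reduces to a rigid motion: the mean becomes R m + t and the covariance becomes R Σ R^T, where R and t are the rotation and translation parts of the world-to-camera matrix. The sketch below spells this out with nested lists; world_to_cam_single is a hypothetical name, and the assumption is that viewmat stores R in its upper-left 3x3 block and t in its last column.

```python
def world_to_cam_single(mean, covar, viewmat):
    """Transform one Gaussian with a 4x4 world-to-camera matrix.

    mean: (x, y, z); covar: 3x3 nested list; viewmat: 4x4 nested list
    whose upper-left 3x3 block is the rotation R and last column is t.
    Returns (R @ mean + t, R @ covar @ R^T).
    """
    R = [row[:3] for row in viewmat[:3]]
    t = [row[3] for row in viewmat[:3]]
    mean_c = [sum(R[i][k] * mean[k] for k in range(3)) + t[i] for i in range(3)]
    RC = [
        [sum(R[i][k] * covar[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
    covar_c = [
        [sum(RC[i][k] * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
    return mean_c, covar_c
```

The translation affects only the mean; the covariance depends on the rotation alone.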

rasterize_to_pixels(means2d: Tensor, conics: Tensor, colors: Tensor, opacities: Tensor, image_width: int, image_height: int, tile_size: int, isect_offsets: Tensor, flatten_ids: Tensor, backgrounds: Tensor | None = None, masks: Tensor | None = None, packed: bool = False, absgrad: bool = False) → Tuple[Tensor, Tensor]

Rasterizes Gaussians to pixels.

Parameters:
  • means2d – Projected Gaussian means. [C, N, 2] if packed is False, [nnz, 2] if packed is True.

  • conics – Inverse of the projected covariances with only upper triangle values. [C, N, 3] if packed is False, [nnz, 3] if packed is True.

  • colors – Gaussian colors or ND features. [C, N, channels] if packed is False, [nnz, channels] if packed is True.

  • opacities – Gaussian opacities that support per-view values. [C, N] if packed is False, [nnz] if packed is True.

  • image_width – Image width.

  • image_height – Image height.

  • tile_size – Tile size.

  • isect_offsets – Intersection offsets outputs from isect_offset_encode(). [C, tile_height, tile_width]

  • flatten_ids – The global flatten indices in [C * N] or [nnz] from isect_tiles(). [n_isects]

  • backgrounds – Background colors. [C, channels]. Default: None.

  • masks – Optional tile mask to skip rendering GS to masked tiles. [C, tile_height, tile_width]. Default: None.

  • packed – If True, the input tensors are expected to be packed with shape [nnz, …]. Default: False.

  • absgrad – If True, the backward pass will compute a .absgrad attribute for means2d. Default: False.

Returns:

  • Rendered colors. [C, image_height, image_width, channels]

  • Rendered alphas. [C, image_height, image_width, 1]

Return type:

A tuple
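What happens at a single pixel can be sketched as front-to-back alpha compositing over a depth-sorted list of Gaussians, using the conic (a, b, c) as the upper triangle of the inverse 2D covariance. This is a simplified single-channel illustration, not the fused kernel: the dictionary keys, the function name, and the 0.999 alpha clamp are assumptions, and early termination and tiling are left out.

```python
import math


def composite_pixel(px, py, gaussians, background=0.0):
    """Front-to-back alpha compositing at one pixel, one color channel.

    gaussians: depth-sorted (near to far) list of dicts with keys
      "mean" (u, v), "conic" (a, b, c), "opacity", "color".
    Returns (color, alpha).
    """
    color, transmittance = 0.0, 1.0
    for g in gaussians:
        dx, dy = px - g["mean"][0], py - g["mean"][1]
        a, b, c = g["conic"]
        # 0.5 * d^T @ conic @ d for the symmetric matrix [[a, b], [b, c]].
        sigma = 0.5 * (a * dx * dx + c * dy * dy) + b * dx * dy
        if sigma < 0.0:
            continue  # not a valid Gaussian falloff, skip
        alpha = min(0.999, g["opacity"] * math.exp(-sigma))
        color += transmittance * alpha * g["color"]
        transmittance *= 1.0 - alpha
    alpha_total = 1.0 - transmittance
    # Whatever light is not absorbed shows the background.
    return color + transmittance * background, alpha_total
```

Two Gaussians centered on the pixel with opacity 0.5 each leave a transmittance of 0.25, so the rendered alpha is 0.75 and a quarter of the background shows through.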

rasterize_to_indices_in_range(range_start: int, range_end: int, transmittances: Tensor, means2d: Tensor, conics: Tensor, opacities: Tensor, image_width: int, image_height: int, tile_size: int, isect_offsets: Tensor, flatten_ids: Tensor) → Tuple[Tensor, Tensor, Tensor]

Rasterizes a batch of Gaussians to images but only returns the indices.

Note

This function supports iterative rasterization, in which each call rasterizes a batch of Gaussians from near to far, defined by [range_start, range_end). If a one-step full rasterization is desired, set range_start to 0 and range_end to a very large number, e.g., 1e10.

Parameters:
  • range_start – The start batch of Gaussians to be rasterized (inclusive).

  • range_end – The end batch of Gaussians to be rasterized (exclusive).

  • transmittances – Current transmittances. [C, image_height, image_width]

  • means2d – Projected Gaussian means. [C, N, 2]

  • conics – Inverse of the projected covariances with only upper triangle values. [C, N, 3]

  • opacities – Gaussian opacities that support per-view values. [C, N]

  • image_width – Image width.

  • image_height – Image height.

  • tile_size – Tile size.

  • isect_offsets – Intersection offsets outputs from isect_offset_encode(). [C, tile_height, tile_width]

  • flatten_ids – The global flatten indices in [C * N] from isect_tiles(). [n_isects]

Returns:

  • Gaussian ids. Gaussian ids for the pixel intersection. A flattened list of shape [M].

  • Pixel ids. Pixel indices (row-major). A flattened list of shape [M].

  • Camera ids. Camera indices. A flattened list of shape [M].

Return type:

A tuple

accumulate(means2d: Tensor, conics: Tensor, opacities: Tensor, colors: Tensor, gaussian_ids: Tensor, pixel_ids: Tensor, camera_ids: Tensor, image_width: int, image_height: int) → Tuple[Tensor, Tensor]

Alpha compositing of 2D Gaussians in pure PyTorch.

This function performs alpha compositing for Gaussians based on the index triplet {gaussian_ids, pixel_ids, camera_ids}, which annotates the intersections between pixels and Gaussians. These intersections can be acquired from gsplat.rasterize_to_indices_in_range.

Note

This function exposes the alpha compositing process in pure PyTorch, so it relies on PyTorch’s autograd for backpropagation. It is much slower than our fully fused rasterization implementation and consumes much more GPU memory. But it can serve as a playground for new ideas or debugging, as no backward implementation is needed.

Warning

This function requires the nerfacc package to be installed. Install it with: pip install nerfacc

Parameters:
  • means2d – Gaussian means in 2D. [C, N, 2]

  • conics – Inverse of the 2D Gaussian covariances, upper triangle values only. [C, N, 3]

  • opacities – Per-view Gaussian opacities (for example, when antialiasing is enabled, a Gaussian effectively has a different opacity in each view). [C, N]

  • colors – Per-view Gaussian colors. Supports N-D features. [C, N, channels]

  • gaussian_ids – Collection of Gaussian indices to be rasterized. A flattened list of shape [M].

  • pixel_ids – Collection of pixel indices (row-major) to be rasterized. A flattened list of shape [M].

  • camera_ids – Collection of camera indices to be rasterized. A flattened list of shape [M].

  • image_width – Image width.

  • image_height – Image height.

Returns:

  • renders: Accumulated colors. [C, image_height, image_width, channels]

  • alphas: Accumulated opacities. [C, image_height, image_width, 1]

Return type:

A tuple
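The index-driven form of compositing can be sketched without any tensor library. The example below takes per-intersection alphas as already computed (in the real function they would be derived from means2d, conics, and opacities via gaussian_ids) and scatters them into per-camera images using the row-major pixel_ids. It is a hypothetical single-channel illustration that assumes the intersections of each pixel are listed near to far; it does not use nerfacc.

```python
def accumulate_from_indices(alphas, colors, pixel_ids, camera_ids,
                            n_cameras, image_width, image_height):
    """Composite per-intersection alphas into images.

    alphas, colors, pixel_ids, camera_ids: flat lists of length M, ordered
    near to far within each (camera, pixel). colors holds one scalar each.
    Returns (renders, alphas_out) as [C][H][W] nested lists.
    """
    renders = [[[0.0] * image_width for _ in range(image_height)]
               for _ in range(n_cameras)]
    transmit = [[[1.0] * image_width for _ in range(image_height)]
                for _ in range(n_cameras)]
    for alpha, color, pid, cam in zip(alphas, colors, pixel_ids, camera_ids):
        row, col = divmod(pid, image_width)  # decode the row-major pixel index
        weight = transmit[cam][row][col] * alpha
        renders[cam][row][col] += weight * color
        transmit[cam][row][col] *= 1.0 - alpha
    alphas_out = [[[1.0 - t for t in r] for r in img] for img in transmit]
    return renders, alphas_out
```

Pixels that appear in no triplet keep a rendered value and alpha of zero, which mirrors how untouched pixels stay empty in the output images.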

rasterization_inria_wrapper(means: Tensor, quats: Tensor, scales: Tensor, opacities: Tensor, colors: Tensor, viewmats: Tensor, Ks: Tensor, width: int, height: int, near_plane: float = 0.01, far_plane: float = 100.0, eps2d: float = 0.3, sh_degree: int | None = None, backgrounds: Tensor | None = None, **kwargs) → Tuple[Tensor, Tensor, Dict]

Wrapper for Inria’s rasterization backend.

Warning

This function exists for comparison purposes only. Only the rendered image is returned.

Warning

Inria’s CUDA backend has its own LICENSE, so this function should be used in accordance with the original LICENSE at: https://github.com/graphdeco-inria/diff-gaussian-rasterization

2DGS

fully_fused_projection_2dgs(means: Tensor, quats: Tensor, scales: Tensor, viewmats: Tensor, Ks: Tensor, width: int, height: int, eps2d: float = 0.3, near_plane: float = 0.01, far_plane: float = 10000000000.0, radius_clip: float = 0.0, packed: bool = False, sparse_grad: bool = False) → Tuple[Tensor, Tensor, Tensor, Tensor]

Prepares Gaussians for rasterization.

This function prepares ray-splat intersection matrices and computes the per-splat bounding box and 2D means in image space.

Parameters:
  • means – Gaussian means. [N, 3]

  • quats – Quaternions (No need to be normalized). [N, 4].

  • scales – Scales. [N, 3].

  • viewmats – World-to-camera matrices. [C, 4, 4]

  • Ks – Camera intrinsics. [C, 3, 3]

  • width – Image width.

  • height – Image height.

  • near_plane – Near plane distance. Default: 0.01.

  • far_plane – Far plane distance. Default: 1e10.

  • radius_clip – Gaussians with projected radii smaller than this value will be ignored. Default: 0.0.

  • packed – If True, the output tensors will be packed into a flattened tensor. Default: False.

  • sparse_grad (Experimental) – This is only effective when packed is True. If True, during backward the gradients of {means, covars, quats, scales} will be a sparse Tensor in COO layout. Default: False.

Returns:

If packed is True:

  • camera_ids. The row indices of the projected Gaussians. Int32 tensor of shape [nnz].

  • gaussian_ids. The column indices of the projected Gaussians. Int32 tensor of shape [nnz].

  • radii. The maximum radius of the projected Gaussians in pixel unit. Int32 tensor of shape [nnz].

  • means. Projected Gaussian means in 2D. [nnz, 2]

  • depths. The z-depth of the projected Gaussians. [nnz]

  • ray_transforms. Transformation matrices that transform xy-planes in pixel space into splat coordinates, (WH)^T in equation (9) of the paper. [nnz, 3, 3]

  • normals. The normals in camera space. [nnz, 3]

If packed is False:

  • radii. The maximum radius of the projected Gaussians in pixel unit. Int32 tensor of shape [C, N].

  • means. Projected Gaussian means in 2D. [C, N, 2]

  • depths. The z-depth of the projected Gaussians. [C, N]

  • ray_transforms. Transformation matrices that transform xy-planes in pixel space into splat coordinates. [C, N, 3, 3]

  • normals. The normals in camera space. [C, N, 3]

Return type:

A tuple
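The idea behind a ray-splat intersection matrix can be sketched as follows. Suppose a 3x3 homogeneous matrix A maps splat-local coordinates (u, v, 1) to homogeneous pixel coordinates. A pixel (px, py) is the intersection of the planes x = px and y = py; pulling both planes back into splat space with A^T and taking their cross product recovers the (u, v) hit by the pixel's ray. The matrix A, its layout, and the function names are assumptions made for this sketch; how gsplat stores ray_transforms internally is not specified here beyond the (WH)^T note above.

```python
import math


def splat_to_pixel(A, u, v):
    """Map splat-local (u, v) to a pixel using a 3x3 homogeneous matrix A."""
    xh = [A[i][0] * u + A[i][1] * v + A[i][2] for i in range(3)]
    return xh[0] / xh[2], xh[1] / xh[2]


def ray_splat_intersection(A, px, py):
    """Recover the splat-local (u, v) seen through pixel (px, py).

    The planes x = px and y = py are written as h_x = (-1, 0, px) and
    h_y = (0, -1, py). Both are pulled back into splat space with A^T,
    and their cross product gives (u, v, 1) up to scale.
    """
    h_u = [-A[0][j] + px * A[2][j] for j in range(3)]  # A^T @ h_x
    h_v = [-A[1][j] + py * A[2][j] for j in range(3)]  # A^T @ h_y
    zeta = [
        h_u[1] * h_v[2] - h_u[2] * h_v[1],
        h_u[2] * h_v[0] - h_u[0] * h_v[2],
        h_u[0] * h_v[1] - h_u[1] * h_v[0],
    ]
    return zeta[0] / zeta[2], zeta[1] / zeta[2]


def gaussian_weight(u, v):
    """Unnormalized 2D Gaussian falloff in splat coordinates."""
    return math.exp(-0.5 * (u * u + v * v))
```

Evaluating the Gaussian in splat coordinates, instead of approximating its projected footprint in screen space, is what distinguishes this formulation from the conic-based 3DGS path.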

rasterize_to_pixels_2dgs(means2d: Tensor, ray_transforms: Tensor, colors: Tensor, opacities: Tensor, normals: Tensor, densify: Tensor, image_width: int, image_height: int, tile_size: int, isect_offsets: Tensor, flatten_ids: Tensor, backgrounds: Tensor | None = None, masks: Tensor | None = None, packed: bool = False, absgrad: bool = False, distloss: bool = False) → Tuple[Tensor, Tensor]

Rasterize Gaussians to pixels.

Parameters:
  • means2d – Projected Gaussian means. [C, N, 2] if packed is False, [nnz, 2] if packed is True.

  • ray_transforms – Transformation matrices that transform xy-planes in pixel space into splat coordinates. [C, N, 3, 3] if packed is False, [nnz, 3, 3] if packed is True.

  • colors – Gaussian colors or ND features. [C, N, channels] if packed is False, [nnz, channels] if packed is True.

  • opacities – Gaussian opacities that support per-view values. [C, N] if packed is False, [nnz] if packed is True.

  • normals – The normals in camera space. [C, N, 3] if packed is False, [nnz, 3] if packed is True.

  • densify – Dummy variable to keep track of gradient for densification. [C, N, 2] if packed is False, [nnz, 3] if packed is True.

  • tile_size – Tile size.

  • isect_offsets – Intersection offsets outputs from isect_offset_encode(). [C, tile_height, tile_width]

  • flatten_ids – The global flatten indices in [C * N] or [nnz] from isect_tiles(). [n_isects]

  • backgrounds – Background colors. [C, channels]. Default: None.

  • masks – Optional tile mask to skip rendering GS to masked tiles. [C, tile_height, tile_width]. Default: None.

  • packed – If True, the input tensors are expected to be packed with shape [nnz, …]. Default: False.

  • absgrad – If True, the backward pass will compute a .absgrad attribute for means2d. Default: False.

Returns:

  • Rendered colors. [C, image_height, image_width, channels]

  • Rendered alphas. [C, image_height, image_width, 1]

  • Rendered normals. [C, image_height, image_width, 3]

  • Rendered distortion. [C, image_height, image_width, 1]

  • Rendered median depth. [C, image_height, image_width, 1]

Return type:

A tuple

rasterize_to_indices_in_range_2dgs(range_start: int, range_end: int, transmittances: Tensor, means2d: Tensor, ray_transforms: Tensor, opacities: Tensor, image_width: int, image_height: int, tile_size: int, isect_offsets: Tensor, flatten_ids: Tensor) → Tuple[Tensor, Tensor, Tensor]

Rasterizes a batch of Gaussians to images but only returns the indices.

Note

This function supports iterative rasterization, in which each call rasterizes a batch of Gaussians from near to far, defined by [range_start, range_end). If a one-step full rasterization is desired, set range_start to 0 and range_end to a very large number, e.g., 1e10.

Parameters:
  • range_start – The start batch of Gaussians to be rasterized (inclusive).

  • range_end – The end batch of Gaussians to be rasterized (exclusive).

  • transmittances – Current transmittances. [C, image_height, image_width]

  • means2d – Projected Gaussian means. [C, N, 2]

  • ray_transforms – Transformation matrices that transform xy-planes in pixel space into splat coordinates. [C, N, 3, 3]

  • opacities – Gaussian opacities that support per-view values. [C, N]

  • image_width – Image width.

  • image_height – Image height.

  • tile_size – Tile size.

  • isect_offsets – Intersection offsets outputs from isect_offset_encode(). [C, tile_height, tile_width]

  • flatten_ids – The global flatten indices in [C * N] from isect_tiles(). [n_isects]

Returns:

  • Gaussian ids. Gaussian ids for the pixel intersection. A flattened list of shape [M].

  • Pixel ids. Pixel indices (row-major). A flattened list of shape [M].

  • Camera ids. Camera indices. A flattened list of shape [M].

Return type:

A tuple

accumulate_2dgs(means2d: Tensor, ray_transforms: Tensor, opacities: Tensor, colors: Tensor, normals: Tensor, gaussian_ids: Tensor, pixel_ids: Tensor, camera_ids: Tensor, image_width: int, image_height: int) → Tuple[Tensor, Tensor, Tensor]

Alpha compositing for 2DGS.

Warning

This function requires the nerfacc package to be installed. Install it with: pip install nerfacc

Parameters:
  • means2d – Gaussian means in 2D. [C, N, 2]

  • ray_transforms – Transformation matrices that transform rays in pixel space into the splat’s local frame. [C, N, 3, 3]

  • opacities – Per-view Gaussian opacities (for example, when antialiasing is enabled, a Gaussian effectively has a different opacity in each view). [C, N]

  • colors – Per-view Gaussian colors. Supports N-D features. [C, N, channels]

  • normals – Per-view Gaussian normals. [C, N, 3]

  • gaussian_ids – Collection of Gaussian indices to be rasterized. A flattened list of shape [M].

  • pixel_ids – Collection of pixel indices (row-major) to be rasterized. A flattened list of shape [M].

  • camera_ids – Collection of camera indices to be rasterized. A flattened list of shape [M].

  • image_width – Image width.

  • image_height – Image height.

Returns:

  • renders: Accumulated colors. [C, image_height, image_width, channels]

  • alphas: Accumulated opacities. [C, image_height, image_width, 1]

  • normals: Accumulated normals. [C, image_height, image_width, 3]

Return type:

A tuple

rasterization_2dgs_inria_wrapper(means: Tensor, quats: Tensor, scales: Tensor, opacities: Tensor, colors: Tensor, viewmats: Tensor, Ks: Tensor, width: int, height: int, near_plane: float = 0.01, far_plane: float = 100.0, eps2d: float = 0.3, sh_degree: int | None = None, backgrounds: Tensor | None = None, depth_ratio: int = 0, **kwargs) → Tuple[Tuple, Dict]

Wrapper for 2DGS’s rasterization backend which is based on Inria’s backend.

Install the 2DGS rasterization backend from

https://github.com/hbb1/diff-surfel-rasterization

Credit to Jeffrey Hu https://github.com/jefequien