D-RayAvatar: Dynamic Ray-tracing based Relightable Gaussian Avatar from Monocular Videos

Abstract

The success of 3D Gaussian Splatting (3DGS) has inspired a series of Gaussian-based 4D avatars, some of which enable remarkable relighting. However, because the original 3DGS is limited in its ability to capture advanced illumination, the relighting quality of the latest 4D Gaussian avatars still needs improvement. The recent Gaussian ray tracer has shown impressive potential to capture secondary-ray-based illumination for static scenes, but it is too computationally expensive to be applied directly to dynamic avatars. In this paper, we propose a new Dynamic Ray-tracing based Gaussian Splatting (D-RayGS), tailored to relightable Gaussian avatar reconstruction. It simulates the inter-reflective illumination between Gaussian primitives for high-fidelity avatar inverse rendering while keeping Gaussian ray tracing fast. Based on D-RayGS, we introduce highly accurate joint learning of dynamic geometry, BRDF materials, and lighting, using a compact regularization from an extra signed distance field (SDF). Building on D-RayGS and this joint learning, we develop D-RayAvatar, a relightable Gaussian avatar reconstruction method for monocular videos that achieves high-fidelity relightable rendering while rendering dynamic avatars efficiently. Extensive evaluations on public datasets show that D-RayAvatar achieves better Gaussian avatar reconstruction in both geometry reconstruction quality and relightable rendering quality.

Methodology

Figure 2: The main pipeline of D-RayAvatar. Given a monocular video as input, we reconstruct a Gaussian avatar represented by relightable 2D Gaussian primitives, which enables efficient inverse rendering using dynamic ray-tracing based Gaussian splatting (D-RayGS). We deform the relightable canonical Gaussian attributes $\bar{g}$ into the deformed attributes $g$ using a motion deformation $\mathcal{D}^m$ and a shape deformation $\mathcal{D}^s$, then compute the Gaussian-level PBR color $c_{pbr}(\omega_0)$ by performing ray-tracing-embedded Physically Based Rendering (PBR) directly on each $g$, and finally aggregate the PBR color image by splatting all the Gaussians with PBR splatting. D-RayGS is learned to be compact through geometric regularization from an extra SDF field, yielding high-quality relightable rendering.
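The data flow in the caption (deform, shade per Gaussian, then splat) can be illustrated with a toy NumPy sketch. This is not the D-RayGS implementation: the function names, the order in which the shape and motion deformations are applied, the rigid form of the motion, the Lambertian shading, and the scalar `visibility` term standing in for a traced secondary ray are all assumptions made for illustration only.

```python
import numpy as np

def deform(canonical_xyz, shape_offset, rotation, translation):
    """Hypothetical deformation: a per-point shape offset followed by a
    rigid motion (assumed order and form; the caption does not specify them)."""
    return (canonical_xyz + shape_offset) @ rotation.T + translation

def shade(albedo, normals, light_dir, light_rgb, visibility):
    """Lambertian stand-in for the per-Gaussian PBR color.
    `visibility` in [0, 1] is a placeholder for the result of tracing a
    secondary (shadow) ray from each Gaussian toward the light."""
    cos_term = np.clip(normals @ light_dir, 0.0, None)            # shape (N,)
    return albedo * light_rgb * (cos_term * visibility)[:, None]  # shape (N, 3)

def composite(colors, alphas, depths):
    """Front-to-back alpha compositing of the Gaussians covering one pixel."""
    pixel = np.zeros(3)
    transmittance = 1.0
    for i in np.argsort(depths):          # nearest Gaussian first
        pixel += transmittance * alphas[i] * colors[i]
        transmittance *= 1.0 - alphas[i]
    return pixel

# Usage: one Gaussian facing the light, half shadowed by an occluder.
xyz = deform(np.zeros((1, 3)), np.array([[0.0, 0.0, 0.1]]),
             np.eye(3), np.array([0.0, 0.0, 1.0]))
rgb = shade(np.ones((1, 3)), np.array([[0.0, 0.0, 1.0]]),
            np.array([0.0, 0.0, 1.0]), np.ones(3), np.array([0.5]))
```

Shading each Gaussian before compositing, rather than shading each pixel afterwards, mirrors the caption's statement that PBR is performed "directly on each $g$"; everything else in the sketch is a simplification.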

Reconstruction Comparison

Reconstruction comparison against SOTA methods, both relightable and non-relightable.

Qualitative comparison

Qualitative comparison across five different identities.

Quantitative Comparison

Quantitative comparison of different approaches on the test set of our collected dataset.

Method	PSNR ↑	SSIM ↑	LPIPS ↓	MSE ↓	L1 ↓
FLARE	25.6670	0.9122	0.0592	0.05921	0.0123
INSTA	26.6980	0.9268	0.0912	0.0509	0.0151
SPLAvatar	25.6255	0.9223	0.0979	0.0533	0.0139
FlashAvatar	28.7674	0.9415	0.0563	0.0406	0.0110
GBS (Gaussian Blendshapes)	27.9699	0.9390	0.1051	0.0436	0.0126
HRAvatar	29.3708	0.9407	0.0776	0.0414	0.0121
Ours	29.9664	0.9431	0.0557	0.0350	0.0101
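For readers unfamiliar with the pixel-wise metrics in the table, the standard definitions of MSE, L1, and PSNR can be sketched as follows. The page does not state the image value range, any masking, or how scores are averaged over frames, so this sketch assumes unmasked images with values in [0, 1] and is not guaranteed to reproduce the reported numbers. SSIM and LPIPS are omitted because they need a windowed computation and a pretrained network, respectively.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error over all pixels and channels."""
    return float(np.mean((pred - target) ** 2))

def l1(pred, target):
    """Mean absolute error over all pixels and channels."""
    return float(np.mean(np.abs(pred - target)))

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB; max_val is the assumed peak value."""
    err = mse(pred, target)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / err))

# Usage: a uniform error of 0.1 on a [0, 1] image.
pred = np.zeros((4, 4, 3))
target = np.full((4, 4, 3), 0.1)
scores = (mse(pred, target), l1(pred, target), psnr(pred, target))
```

Higher is better for PSNR and lower is better for MSE and L1, which matches the arrows in the table header.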

Relighting Comparison

Relighting comparison against SOTA methods (FLARE, RGAvatar, HRAvatar) under different illumination. Our method predicts more faithful and detailed albedos and improves relighting quality, with superior specular highlights and realistic self-shadows.

Sample 1