Neural Volume Rendering and Surface Rendering
16-825 Learning for 3D: Assignment 3
A. Neural Volume Rendering (80 points)
0. Transmittance Calculation (10 points)

Transmittance Calculation
Note that we have:
\[ T(\textbf{x}, \textbf{x}_{t_i}) = T(\textbf{x}, \textbf{x}_{t_{i-1}}) \cdot e^{-\sigma_{t_{i-1}} \Delta t_{i-1}} \]
- T(y_1, y_2):
\[ T(y_1, y_2) = e^{-2} = 0.135 \]
- T(y_2, y_4):
\[ T(y_2, y_4) = T(y_2, y_3) \cdot T(y_3, y_4) = e^{-0.5} \cdot e^{-30} = 5.676 \times 10^{-14} \]
- T(x, y_4):
\[ T(x, y_4) = T(x, y_1) \cdot T(y_1, y_2) \cdot T(y_2, y_3) \cdot T(y_3, y_4) = e^{0} \cdot e^{-2} \cdot e^{-0.5} \cdot e^{-30} = 7.681 \times 10^{-15} \]
- T(x, y_3):
\[ T(x, y_3) = T(x, y_1) \cdot T(y_1, y_2) \cdot T(y_2, y_3) = e^{0} \cdot e^{-2} \cdot e^{-0.5} = 0.082 \]
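As a quick numerical check, the cumulative transmittances above can be reproduced with torch.cumprod, using the per-segment optical depths \(\sigma_i \Delta t_i\) implied by the exponents in the calculations (0, 2, 0.5, and 30):
import torch

# Per-segment optical depths sigma * delta_t for x->y1, y1->y2, y2->y3, y3->y4,
# read off from the exponents used in the calculations above.
optical_depth = torch.tensor([0.0, 2.0, 0.5, 30.0])

# T(x, y_i) is the running product of the per-segment transmittances e^{-sigma * delta_t}.
T = torch.cumprod(torch.exp(-optical_depth), dim=0)
print(T)  # ~ [1.0, 0.1353, 0.0821, 7.68e-15], matching T(x, y_1) ... T(x, y_4)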
1. Differentiable Volume Rendering
1.3. Ray sampling (5 points)
Visualization
Run for visualization:
# mkdir images (uncomment when running for the first time)
python volume_rendering_main.py --config-name=box
1.4. Point sampling (5 points)
Visualization
Run for visualization:
python volume_rendering_main.py --config-name=box
1.5. Volume rendering (20 points)
Visualization
2. Optimizing a basic implicit volume
2.1. Random ray sampling (5 points)
Implemented get_random_pixels_from_image in ray_utils.py.
xy_grid = get_random_pixels_from_image(cfg.training.batch_size, image_size, camera)
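A minimal sketch of what such a sampler can look like (the (H, W) ordering of image_size, the pixel-center offset, and the normalization to [-1, 1] are my assumptions; the convention in ray_utils.py may differ):
import torch

def get_random_pixels_from_image(n_pixels, image_size, camera):
    # `camera` is accepted to mirror the call above but is unused in this sketch.
    H, W = image_size[0], image_size[1]  # assuming (H, W) ordering
    # Sample pixel indices uniformly at random.
    x = torch.randint(0, W, (n_pixels,)).float()
    y = torch.randint(0, H, (n_pixels,)).float()
    # Map pixel centers to normalized [-1, 1] coordinates.
    xy_grid = torch.stack(
        [(x + 0.5) / W * 2.0 - 1.0, (y + 0.5) / H * 2.0 - 1.0], dim=-1
    )  # (n_pixels, 2)
    return xy_grid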
2.2. Loss and training (5 points)
Run the training with:
python volume_rendering_main.py --config-name=train_box
Below are the center and side lengths of the box after training, rounded to two decimal places.
Box center: (0.25, 0.25, -0.00)
Box side lengths: (2.01, 1.50, 1.50)
2.3. Visualization
3. Optimizing a Neural Radiance Field (NeRF) (20 points)
Implementation
Note:
- Use the ReLU activation for the density layer to ensure non-negative values.
- Use the Sigmoid activation for the color layer.
- Use HarmonicEmbedding for both xyz and direction inputs for positional encoding.
- Use the MLPWithInputSkips class as the backbone MLP for generating spatial features (a sketch of the resulting head structure follows below).
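A minimal sketch of how these pieces fit together (layer widths and the exact wiring are my assumptions; the actual NeuralRadianceField in implicit.py may differ):
import torch
import torch.nn as nn

class NeRFHeadsSketch(nn.Module):
    # Illustrative wiring only: `backbone`, `embed_xyz`, and `embed_dir` stand in
    # for the MLPWithInputSkips and HarmonicEmbedding modules from the starter code
    # (their exact constructor signatures are intentionally not reproduced here).
    def __init__(self, backbone, embed_xyz, embed_dir, feat_dim=128, dir_emb_dim=15):
        super().__init__()
        self.backbone = backbone
        self.embed_xyz = embed_xyz
        self.embed_dir = embed_dir
        # ReLU keeps the predicted density non-negative.
        self.density_head = nn.Sequential(nn.Linear(feat_dim, 1), nn.ReLU())
        # Sigmoid keeps the predicted color in [0, 1]; the embedded viewing
        # direction is concatenated with the spatial features.
        self.color_head = nn.Sequential(
            nn.Linear(feat_dim + dir_emb_dim, feat_dim // 2), nn.ReLU(),
            nn.Linear(feat_dim // 2, 3), nn.Sigmoid(),
        )

    def forward(self, points, directions):
        feats = self.backbone(self.embed_xyz(points))  # (N, feat_dim) spatial features
        density = self.density_head(feats)             # (N, 1), non-negative
        color = self.color_head(
            torch.cat([feats, self.embed_dir(directions)], dim=-1)
        )                                              # (N, 3) in [0, 1]
        return density, color
The only point of the sketch is where ReLU and Sigmoid sit and where the direction embedding enters; everything else follows the backbone and embedding modules listed above.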
Visualization
Run the training with:
python volume_rendering_main.py --config-name=nerf_lego
python volume_rendering_main.py --config-name=nerf_lego_highres
4. NeRF Extras (CHOOSE ONE! More than one is extra credit)
4.1 View Dependence (10 points)
I modified the color layer of the NeRF MLP to take both the 3D position and the viewing direction as input: in the forward function, the concatenation of the spatial features and the embedded viewing direction is fed to the color layer.
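A minimal sketch of that step in the forward pass (names such as embed_dir and color_layer are placeholders for the corresponding members of my NeuralRadianceField, not the exact identifiers):
import torch

def predict_color(feats, ray_directions, n_pts_per_ray, embed_dir, color_layer):
    # feats: (N_rays * n_pts_per_ray, D) spatial features from the backbone MLP.
    # ray_directions: (N_rays, 3). Each ray direction is shared by all of its
    # sample points, so it is repeated before being embedded and concatenated.
    dirs = torch.nn.functional.normalize(ray_directions, dim=-1)
    dirs = dirs.unsqueeze(1).expand(-1, n_pts_per_ray, -1).reshape(-1, 3)
    dir_emb = embed_dir(dirs)  # harmonic embedding of viewing directions
    return color_layer(torch.cat([feats, dir_emb], dim=-1))  # RGB in [0, 1]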
To run the training, first download the nerf materials dataset and store it in the data/ folder, and run:
python volume_rendering_main.py --config-name=nerf_materials_highres
Visualization
The result shows that the model is decently able to capture view-dependent effects such as specular highlights on the surface of the objects.
Trade-offs between increased view dependence and generalization quality
Incorporating view dependence into the NeRF model allows it to capture complex lighting effects such as specular highlights and reflections, which are essential for shiny or reflective surfaces. However, it also increases model complexity: the extra direction-dependent parameters can overfit to the observed viewing directions, hurting generalization to novel views, especially when training views are limited.
4.2 Coarse/Fine Sampling (10 points)
NeRF employs two networks: a coarse network and a fine network. During the coarse pass, it uses the coarse network to get an estimate of geometry, and during the fine pass it uses these geometry estimates for better point sampling for the fine network. Implement this strategy and discuss trade-offs (speed / quality).
Visualization
Not implemented.
trade-offs (speed / quality)
Not implemented.
B. Neural Surface Rendering (50 points)
5. Sphere Tracing (10 points)
Implementation
I implemented the sphere tracing algorithm following the lecture slides. It initializes a mask and iteratively marches the 3D points along each ray direction to find intersections with the surface. In each iteration, we check whether the signed distance is sufficiently small to count as an intersection and update the mask accordingly. The process continues until all rays have intersected or the maximum number of iterations is reached.
Below is my implementation of sphere tracing in renderer.py:
def sphere_tracing(self, implicit_fn, origins, directions):
    N_rays = origins.shape[0]
    points = origins.clone()  # (N_rays, 3)
    # Initialize mask of converged rays
    mask = torch.zeros((N_rays, 1), dtype=torch.bool, device=origins.device)
    for _ in range(self.max_iters):
        # Predict signed distance at each current point
        sdf = implicit_fn(points)  # (N_rays, 1)
        # March each point along its ray by the signed distance
        points = points + sdf * directions
        # Mark rays whose signed distance is small enough as intersected
        mask = mask | (sdf < 1e-6)
        if mask.all():
            break
    return points, mask
Visualization
Run the training with:
# mkdir images (uncomment when running for the first time)
python -m surface_rendering_main --config-name=torus_surface
6. Optimizing a Neural SDF (15 points)
Implementation
Note:
- Optimization Goal: \(SDF(x) = 0\) for all points \(x\) in the point cloud.
- Eikonal Regularization Constraint: \(||\nabla SDF(x)|| = 1\) for all \(x \in \mathbb{R}^3\) (a sketch of this term follows the list below).
- Use HarmonicEmbedding for positional encoding of input 3D points.
- Use MLPWithInputSkips to define the MLP architecture, similar to Part A.
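A minimal sketch of how the eikonal term can be computed with autograd (the helper name and the choice of sample points are illustrative; the loss in my code may be organized differently):
import torch

def eikonal_loss_sketch(implicit_fn, points):
    # points: (N, 3) samples in space (e.g. random points in the bounding volume).
    # Penalize deviations of the SDF gradient norm from 1.
    points = points.clone().requires_grad_(True)
    sdf = implicit_fn(points)  # (N, 1)
    (grad,) = torch.autograd.grad(
        outputs=sdf, inputs=points,
        grad_outputs=torch.ones_like(sdf), create_graph=True,
    )  # (N, 3) gradient of the SDF w.r.t. the input points
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()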
Final Loss:
Point Loss: 0.001139
Eikonal Loss: 0.049197
Visualization
Run this to train the NeuralSurface representation:
python -m surface_rendering_main --config-name=points_surface
7. VolSDF (15 points)
Implementation
Color Prediction
For color prediction, similar to the implementation in NeuralRadianceField, I added two fully connected layers that predict the RGB color from the features extracted by the SDF MLP. When get_color or get_distance_color is called, I first compute the features for the input points with the SDF MLP and then pass them through the color MLP to obtain the RGB color.
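A minimal sketch of this color head (layer sizes and the name feature_dim are illustrative, not the exact ones in implicit.py):
import torch.nn as nn

# Two fully connected layers mapping per-point SDF features to RGB in [0, 1];
# feature_dim is whatever width the SDF MLP produces (an assumed name here).
def make_color_head(feature_dim=128, hidden_dim=128):
    return nn.Sequential(
        nn.Linear(feature_dim, hidden_dim),
        nn.ReLU(),
        nn.Linear(hidden_dim, 3),
        nn.Sigmoid(),  # keep colors in [0, 1]
    )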
SDF to Density
From section 3.1 of the VolSDF Paper, the volumetric density is modeled as:
\[ \sigma(x) = \alpha \, \Psi_\beta(-d_\Omega(x)) \]
where \(d_\Omega(x)\) is the SDF, and the Laplace CDF is given by:
\[ \Psi_\beta(s) = \begin{cases} \dfrac{1}{2}\exp\!\left(\dfrac{s}{\beta}\right), & s \le 0 \\[8pt] 1 - \dfrac{1}{2}\exp\!\left(-\dfrac{s}{\beta}\right), & s > 0 \end{cases} \]
The implementation in the sdf_to_density function in renderer.py is as follows:
def sdf_to_density(signed_distance, alpha, beta):
    neg_sdf = -signed_distance
    density = torch.where(
        neg_sdf <= 0,
        0.5 * torch.exp(neg_sdf / beta),
        1.0 - 0.5 * torch.exp(-neg_sdf / beta),
    )
    density = alpha * density
    return density
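A quick sanity check of this mapping on a few hypothetical signed distances (printed values are approximate):
import torch

# Inside the surface (sdf < 0) the density approaches alpha, at the zero level set
# it equals alpha * Psi_beta(0) = alpha / 2, and outside it decays toward 0.
sdf = torch.tensor([[-0.1], [0.0], [0.1]])
print(sdf_to_density(sdf, alpha=10.0, beta=0.025))
# ~ [[9.908], [5.000], [0.092]]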
Visualization
Run this to train an SDF on the lego bulldozer model:
python -m surface_rendering_main --config-name=volsdf_surface
I chose alpha=10.0 and beta=0.025, which gave the best results in my runs.
Discussion
- What are the parameters \(\alpha\) and \(\beta\) doing here?
\(\alpha\) controls the overall scale of the density. Higher \(\alpha\) means higher opacity or thickness of the surface. \(\beta\) controls the sharpness of the transition from free space to the surface.
- How does high \(\beta\) bias your learned SDF? What about low \(\beta\)?
A high \(\beta\) makes the SDF-to-density mapping smoother, so the surface becomes thick and blurry. A low \(\beta\) makes the mapping sharper, concentrating density tightly around the zero-level set and producing sharp edges and surfaces.
- Would an SDF be easier to train with volume rendering and low \(\beta\) or high \(\beta\)? Why?
Training with high \(\beta\) is easier because the gradients are smoother and less likely to vanish or explode.
- Would you be more likely to learn an accurate surface with high \(\beta\) or low \(\beta\)? Why?
An accurate surface is more likely to be learned with low \(\beta\), since the density then better approximates the true SDF boundary, though it may be harder to train.
8. Neural Surface Extras (CHOOSE ONE! More than one is extra credit)
8.1. Render a Large Scene with Sphere Tracing (10 points)
I created a new implicit function that combines a box frame with several spheres placed along its edges.
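A minimal sketch of how such a composite SDF can be built by taking the pointwise minimum (union) of primitive SDFs; the sphere centers, radius, and the box-frame SDF itself are placeholders for my actual scene definition:
import torch

def sphere_sdf(points, center, radius):
    # Signed distance to a sphere: ||p - c|| - r
    return torch.norm(points - center, dim=-1, keepdim=True) - radius

def scene_sdf(points, boxframe_sdf, sphere_centers, radius=0.2):
    # Union of SDFs: the signed distance to the combined scene is the
    # pointwise minimum over all primitives.
    dists = [boxframe_sdf(points)]
    for c in sphere_centers:
        dists.append(sphere_sdf(points, c, radius))
    return torch.min(torch.cat(dists, dim=-1), dim=-1, keepdim=True).values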
Visualization
Run the training with:
python -m surface_rendering_main --config-name=complex_boxframe_surface
8.2 Fewer Training Views (10 points)
Implementation
I set up a parameter num_train_views in the yaml file to control the number of training views. I also pass that config value into the dataset loader to select a random subset of training views, using the following snippet in dataset.py:
# Q8.2: Subsample training views if num_train_views is specified
if num_train_views is not None and num_train_views < len(train_idx):
    print(f"Using {num_train_views} training views (out of {len(train_idx)} available)")
    # Randomly sample num_train_views indices from train_idx
    # Use a fixed random seed for reproducibility
    np.random.seed(42)
    sampled_indices = np.random.choice(len(train_idx), size=num_train_views, replace=False)
    train_idx = [train_idx[i] for i in sampled_indices]
else:
    print(f"Using all {len(train_idx)} training views")
Visualization
Commands:
python volume_rendering_main.py --config-name=nerf_lego_few_views # NeRF with 20 training views
python -m surface_rendering_main --config-name=volsdf_surface_few_views
Below are comparisons of VolSDF (geometry + color rendering) and NeRF models trained with the full set of views versus only 20 training views on the lego bulldozer scene.
20 Training Views
Full Training Views
Both VolSDF and NeRF produce noticeably blurrier results when trained with only 20 views.
8.3 Alternate SDF to Density Conversions (10 points)
Implementation
I implemented the SDF to density conversion from the NeuS paper in addition to the VolSDF method.
The NeuS paper proposes the following SDF to density conversion: \[ \phi_\beta(x) = \frac{1}{\beta} \cdot \text{sigmoid}(-\frac{1}{\beta} \cdot x) \cdot (1 - \text{sigmoid}(-\frac{1}{\beta} \cdot x)) \]
Python code implementation:
def sdf_to_density_NeuS(signed_distance, alpha, beta):
    sigmoid_term = torch.sigmoid(-signed_distance / beta)
    density = alpha * (1.0 / beta) * (sigmoid_term * (1.0 - sigmoid_term))
    return density
Note that I adjusted alpha and beta to 2.0 and 0.02, respectively, to get better results with NeuS.
Visualization
Commands:
python -m surface_rendering_main --config-name=neus_surface
The two methods produce results of similar quality, with NeuS giving slightly sharper edges.