=============================================
VR Exercise 01 - Depth Perception
HCI Human+Computer Interaction
Dr. Nico Feld
Author: Iftakharul Islam
Mtr. Nr: 1801193
Email: s2ifisla@uni-trier.de
Date: 03.05.2026
Unity Version: 6000.4.2f1
=============================================
=== TASK 1: DEPTH PERCEPTION IN 2D ===
---------------------------------------
The starting scene (2D.unity) presents two spheres, a red one and a blue
one, whose sizes and distances from the camera are chosen so that they
appear virtually identical in projected size on screen. The RedSphere
(scale 2, at z=10) and the BlueSphere (scale 10, at z=50) have equal
size-to-distance ratios (2/10 = 10/50), so both subtend roughly the same
visual angle, and their true depth difference is imperceptible without
further contextual information.
The following steps were undertaken to progressively introduce monocular
depth cues into the scene, thereby enabling the viewer to discern the
genuine spatial relationship between the two spheres. Crucially, neither
the sizes nor the z-positions of the spheres, nor the camera, were altered
at any point.
Step 1: Added a ground plane with a checkerboard texture.
A large plane (position: 0, -2, 25; scale: 10, 1, 10) was placed
beneath both spheres, with a tiled checkerboard material applied to its
surface. As the textured ground recedes into the distance, the checker
pattern becomes progressively denser and finer from the viewer's
perspective. This texture density gradient provides the brain with a
continuous scale reference across the scene's depth, making it far
easier to appreciate that the BlueSphere sits considerably farther
away than the RedSphere.
Depth Cue: Texture Gradient (Texture Density)
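   For reference, the same setup expressed as a script might look as
   follows. In the project the plane and its material were configured in
   the Inspector; the tiling factor of 20 is an assumed value, not the
   exact project setting.

       using UnityEngine;

       // Illustrative sketch of the ground-plane setup described above.
       public class GroundSetup : MonoBehaviour
       {
           void Start()
           {
               GameObject ground = GameObject.CreatePrimitive(PrimitiveType.Plane);
               ground.transform.position = new Vector3(0f, -2f, 25f);
               ground.transform.localScale = new Vector3(10f, 1f, 10f);

               // Tile the checker texture so the pattern repeats many times;
               // the increasingly dense repetitions towards the horizon form
               // the texture gradient. (Works with URP's Lit shader, whose
               // _BaseMap is flagged as the main texture.)
               Renderer r = ground.GetComponent<Renderer>();
               r.material.mainTextureScale = new Vector2(20f, 20f);
           }
       }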
Step 2: Added evenly-spaced vertical pillars along the scene.
Nine identical cylinders (scale: 0.2, 2, 0.2) were placed at regular
intervals along the z-axis, grouped under an empty "Pillars" parent
object. The pillars are positioned at z-values ranging from
approximately 15 through to 60, spaced roughly 5 units apart. Because
each pillar is physically the same height and width, the viewer
instinctively recognises that those appearing smaller on screen must be
farther away. Furthermore, the regularly-spaced row of pillars creates
converging sight lines that draw the eye towards a vanishing point,
reinforcing the sense of depth through linear perspective.
Depth Cues: Familiar/Known Size, Relative Size, Linear Perspective
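   A hypothetical script producing the same arrangement (the pillars were
   placed by hand in the editor; the x-offset of -5 and the exact spacing
   are assumed values):

       using UnityEngine;

       // Spawns nine identical cylinders at regular depth intervals.
       public class PillarSpawner : MonoBehaviour
       {
           void Start()
           {
               Transform parent = new GameObject("Pillars").transform;
               for (int i = 0; i < 9; i++)
               {
                   GameObject pillar = GameObject.CreatePrimitive(PrimitiveType.Cylinder);
                   pillar.transform.SetParent(parent);
                   // Every pillar has the same physical size, so any
                   // on-screen size difference must be due to distance.
                   pillar.transform.localScale = new Vector3(0.2f, 2f, 0.2f);
                   pillar.transform.position = new Vector3(-5f, 0f, 15f + i * 5f);
               }
           }
       }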
Step 3: Added a directional light with soft shadows.
A directional light (position: -1, 15, 0; rotation: 50, -30, 0;
intensity: 1.5) was introduced with soft shadows enabled
(LightShadows.Soft, serialised as shadow type 2 in the scene file). The
light casts distinct shadows from both spheres
onto the ground plane, anchoring each sphere to the surface and
revealing its position in three-dimensional space. The shadow of the
closer RedSphere appears sharper and more tightly coupled to its
source, whilst the BlueSphere's shadow is slightly more diffuse and
offset, further communicating the depth difference.
Depth Cue: Cast Shadows / Contact Shadows
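   The same light settings, written out as an illustrative script (in the
   project they were set in the Inspector):

       using UnityEngine;

       // Creates the directional light with the values listed above.
       public class LightSetup : MonoBehaviour
       {
           void Start()
           {
               GameObject go = new GameObject("Directional Light");
               go.transform.position = new Vector3(-1f, 15f, 0f);
               go.transform.rotation = Quaternion.Euler(50f, -30f, 0f);

               Light light = go.AddComponent<Light>();
               light.type = LightType.Directional;
               light.intensity = 1.5f;
               light.shadows = LightShadows.Soft; // serialised as shadow type 2
           }
       }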
Step 4: Enabled linear fog (aerial perspective).
Linear fog was activated in the scene's render settings (fog mode:
linear; start: 5; end: 80; colour: 0.75, 0.75, 0.8). This simulates
atmospheric scattering, whereby objects at greater distances appear
progressively more washed-out and desaturated. The BlueSphere, sitting
at z=50, is noticeably hazier than the RedSphere at z=10. This mimics
the natural phenomenon observed when viewing distant mountains or
buildings — they appear bluish and faded compared to nearer objects.
Depth Cue: Aerial Perspective (Atmospheric Perspective)
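   The corresponding render-settings calls, shown as a sketch (note that
   linear fog distances are measured from the camera, which sits near the
   origin, so they roughly coincide with z here):

       using UnityEngine;

       // Enables the linear fog described above.
       public class FogSetup : MonoBehaviour
       {
           void Start()
           {
               RenderSettings.fog = true;
               RenderSettings.fogMode = FogMode.Linear;
               RenderSettings.fogStartDistance = 5f;  // fully clear below this
               RenderSettings.fogEndDistance = 80f;   // fully fogged beyond this
               RenderSettings.fogColor = new Color(0.75f, 0.75f, 0.8f);
           }
       }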
Step 5: Added an occlusion object (thin pillar) between the spheres.
A narrow cube (position: 1, 0, 25; scale: 0.3, 3, 0.3) was placed at
the midpoint between the two spheres. From the camera's viewpoint, the
cube partially obscures the BlueSphere whilst leaving the RedSphere
entirely visible. This interposition immediately signals to the viewer
that the BlueSphere must lie behind the occluding object, and hence
farther from the camera. Occlusion is widely regarded as one of the
most powerful monocular depth cues, as the brain resolves it almost
instantaneously and unambiguously.
Depth Cue: Occlusion (Interposition)
Step 6: Added converging road lines along the ground plane.
Two elongated cubes were placed at ground level to simulate road
markings:
- RoadLine L: position (-3, -0.99, 25), scale (0.1, 0.01, 50)
- RoadLine R: position (6, -0.99, 25), scale (0.1, 0.01, 50)
These parallel lines, when viewed in perspective, converge towards a
vanishing point on the horizon. This is a classic instance of linear
perspective — the same principle that makes railway tracks appear to
meet in the distance. The convergence provides a strong and intuitive
spatial framework within which the viewer can judge the relative
positions of the spheres.
Depth Cue: Linear Perspective
Step 7: Added a lateral camera motion script (motion parallax).
A custom C# script (CameraMotion.cs) was attached to the Main Camera,
permitting the viewer to shift the camera laterally along the x-axis
using the A and D keys (or the left/right arrow keys). The camera's
movement is clamped to a range of +/- 3 units from its starting
position, at a speed of 5 units per second. When the camera translates
sideways, nearer objects (the RedSphere) appear to shift far more
dramatically across the screen than distant objects (the BlueSphere).
This differential motion — motion parallax — is a potent monocular
depth cue that the brain uses extensively in everyday perception.
Depth Cue: Motion Parallax
Controls: A / Left Arrow = move camera left
D / Right Arrow = move camera right
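   A minimal sketch of the behaviour CameraMotion.cs implements,
   reconstructed from the description above (the actual script may differ
   in detail; the legacy Input manager's Horizontal axis maps to A/D and
   the arrow keys by default):

       using UnityEngine;

       // Lateral camera sway for keyboard-driven motion parallax.
       public class CameraMotion : MonoBehaviour
       {
           public float speed = 5f; // units per second
           public float range = 3f; // max offset from the start position

           private float startX;

           void Start()
           {
               startX = transform.position.x;
           }

           void Update()
           {
               float input = Input.GetAxis("Horizontal"); // A/D, arrows
               Vector3 pos = transform.position;
               pos.x = Mathf.Clamp(pos.x + input * speed * Time.deltaTime,
                                   startX - range, startX + range);
               transform.position = pos;
           }
       }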
=== TASK 2: DEPTH PERCEPTION IN 3D (VR) ===
---------------------------------------------
A duplicate of the final Task 1 scene was created and saved as 3D.unity.
In this copy, the standard Main Camera was removed and replaced with a
Meta OVRCameraRig prefab (from the Meta XR Core SDK, version 201.0.0),
positioned at the origin (0, 0, 0). The CameraMotion.cs script was not
carried over to this scene, as head-tracked motion parallax in VR renders
it unnecessary.
Why viewing this scene in VR substantially improves depth perception:
1. Stereopsis (Binocular Disparity)
The VR headset renders two slightly offset images — one for each
eye — separated by the viewer's inter-pupillary distance (~63 mm).
The brain fuses these two images and extracts depth information from
the small differences (disparities) between them. Objects closer to
the viewer exhibit greater binocular disparity than distant ones.
Consequently, the RedSphere (at z=10) produces a markedly larger
disparity than the BlueSphere (at z=50), providing an immediate and
compelling sense of their true depth separation (rough figures are
worked through after this list). This is a depth cue that simply does
not exist on a conventional flat display.
2. Vergence
When focusing on a nearby object in VR, the eyes naturally rotate
inward (converge). The muscular effort required for this convergence
serves as an oculomotor depth cue. The closer RedSphere demands
greater vergence than the distant BlueSphere, giving the brain
yet another channel of depth information unavailable on a 2D screen.
3. Head-Tracked Motion Parallax
The Quest 3 provides six degrees of freedom (6DoF) head tracking.
When the viewer physically moves their head — even subtly — the
rendered viewpoint updates in real time. This produces natural
motion parallax: nearby objects shift substantially whilst far
objects remain relatively stationary. Unlike the scripted keyboard-
driven parallax in Task 1, this is continuous, intuitive, and
requires no conscious input from the user.
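As a rough illustration of points 1 and 2, assuming 1 Unity unit = 1 m
and an IPD of 63 mm, the vergence angle to an object at distance z is
approximately IPD / z:

    RedSphere  (z = 10): 0.063 / 10 ≈ 6.3 mrad (≈ 0.36 degrees)
    BlueSphere (z = 50): 0.063 / 50 ≈ 1.3 mrad (≈ 0.07 degrees)

The relative disparity between the spheres is therefore on the order of
5 mrad, orders of magnitude above the stereo acuity of the human visual
system.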
In summary, the VR headset supplements all of the monocular depth cues
established in Task 1 with powerful binocular cues (stereopsis and
vergence) and natural, head-tracked motion parallax. The combined
effect is a dramatically richer and more veridical perception of depth,
making the true size and distance difference between the two spheres
immediately apparent in a way that is simply unattainable on a flat
monitor.
=== CONTROLS ===
-----------------
2D Scene (2D.unity):
A / Left Arrow Move camera left
D / Right Arrow Move camera right
3D Scene (3D.unity):
Head movement Natural head-tracked 6DoF interaction
(No controller input required for this exercise)
=== EXTERNAL RESOURCES ===
---------------------------
1. Checkerboard Texture
The checkerboard texture applied to the ground plane material
(CheckerBoard Tex.png) was sourced from an Imgur album shared within
the following Reddit thread:
Thread: "Where can I find prototyping checkerboard-styled tiled
materials?"
Author: u/KabirTV
Date: 3 March 2019
Subreddit: r/Unity3D
URL: https://www.reddit.com/r/Unity3D/comments/awth13/
Texture: https://imgur.com/a/xJG4hjY
The texture was used solely as a visual aid for demonstrating the
texture gradient depth cue and does not constitute any part of the
assignment's solution.
=== PROJECT STRUCTURE ===
--------------------------
Assets/
CameraMotion.cs Lateral camera sway script (Task 1)
Materials/
Blue.mat BlueSphere material
Blue_Lit.mat BlueSphere lit variant (Not Used)
Red.mat RedSphere material
Red_Lit.mat RedSphere lit variant (Not Used)
CheckerBoard.mat Checkerboard material for ground plane
CheckerBoard Tex.png Checkerboard texture (see citation above)
Scenes/
2D.unity Task 1 scene (standard camera)
3D.unity Task 2 scene (OVRCameraRig)
Oculus/ Meta Quest configuration assets
XR/ XR Plugin Management settings
Settings/ URP render pipeline settings
=============================================
I acknowledge that AI assistance was used for some debugging help and for
understanding Unity's paradigms and app development; the whole project,
however, was developed by myself. A paraphrasing tool was also used
lightly so that the report reads fluently and without grammatical
mistakes.