Unity visionOS 2D Windows and Fully Immersive VR

Unity Vision Pro Development - 2D & Fully Immersive VR!

Over the last few days, I've been testing 2D windows in Unity using its built-in visionOS platform support, as well as fully immersive (VR) experiences. The 2D Window option is the default state of a Unity application targeting the visionOS platform with no other plugins enabled. This means that if you build your application for the visionOS platform on a Mac, the result is an Xcode project containing a single SwiftUI 2D window that renders your application. So yes, only a single 2D window is available in this mode. To clarify, if your game or app has multiple Canvases/UIs, this mode won’t create multiple native 2D windows; instead, you will see the results of your game in a single native window. (There is a way to use SwiftUI from within Unity through a bridge between Unity and Swift, as discussed here, but keep in mind that this is only available for applications targeting mixed reality experiences.)


As for Fully Immersive (VR), you still target the visionOS platform, as with 2D windows, but you also need to enable the visionOS plugin. This additionally gives you access to ARKit features such as world sensing (meshing), plane detection, hand tracking, etc. The visionOS ARKit features are well documented here.

Using The visionOS 2D Windowed App Mode In Unity

Fig 1.0 - visionOS 2D Window Input Actions In Unity

This is fairly simple. Just ensure that you have the visionOS modules installed with Unity 2022 LTS or greater, and you should be good to go. Create a new 3D Unity project with URP, switch the platform to visionOS, and enable the new Input System as the active input handler under Player Settings. These steps are all you need to get the rendering part working. If you need to capture input, you can use the built-in touch support that comes with the new Input System.

Here are the steps for capturing a pinch/tap gesture, or the pinch/tap position, when using the 2D window app mode with the new Input System:

  1. Install the Input System through the package manager.

  2. Create an Input Actions file.

  3. Create the following actions (Fig 1.0 shows how this looks in the Unity Editor):

    • Action Name: TouchTap

      • ActionType=Button Binding=<Touchscreen>/primaryTouch/tap

    • Action Name: TouchPosition

      • ActionType=Value ControlType=Vector2 Binding=<Touchscreen>/position

    • Action Name: TouchDelta

      • ActionType=Value ControlType=Vector2 Binding=<Touchscreen>/delta

  4. Click on the generated Input Actions file from your Unity project panel and select “Generate C# Class” within the Unity inspector window. This will ensure we can access the input mappings through C# code.

  5. Review the example MonoBehaviour class WindowedInputHandler below as a reference on how to bind to these new inputs.

  6. Attach it to a new or existing game object.

  7. To test it, you can use Unity’s Device Simulator, which mocks touch input when you hit Play in the Unity Editor.

using UnityEngine;
using UnityEngine.InputSystem;

public class WindowedInputHandler : MonoBehaviour
{
    private VisionOSInputActions visionOSInputActions;

    private void Awake()
    {
        // Create and enable the generated input actions, then subscribe to the 2D window actions.
        visionOSInputActions = new VisionOSInputActions();
        visionOSInputActions.Enable();
        visionOSInputActions.VisionPro2D.TouchTap.performed += OnTouchTap;
        visionOSInputActions.VisionPro2D.TouchPosition.performed += OnTouchPositionChanged;
        visionOSInputActions.VisionPro2D.TouchDelta.performed += OnTouchDeltaChanged;
    }

    private void OnTouchTap(InputAction.CallbackContext context)
        => Debug.Log("Touch tap was executed");

    private void OnTouchPositionChanged(InputAction.CallbackContext context)
    {
        Vector2 touchPosition = context.ReadValue<Vector2>();
        Debug.Log($"Touch position value changed: {touchPosition}");
    }

    private void OnTouchDeltaChanged(InputAction.CallbackContext context)
    {
        Vector2 touchDelta = context.ReadValue<Vector2>();
        Debug.Log($"Touch delta value changed: {touchDelta}");
    }

    private void OnDestroy()
    {
        // Unsubscribe from all three actions and disable the asset to avoid leaks.
        visionOSInputActions.VisionPro2D.TouchTap.performed -= OnTouchTap;
        visionOSInputActions.VisionPro2D.TouchPosition.performed -= OnTouchPositionChanged;
        visionOSInputActions.VisionPro2D.TouchDelta.performed -= OnTouchDeltaChanged;
        visionOSInputActions.Disable();
    }
}

Here’s a project I put together, ready to go and available on GitHub, which incorporates 2D Windows and Fully Immersive VR.

visionOS 2D Windowed App Mode Scaling Issues

Currently, when you scale a 2D window generated by Unity, you will notice that the UI/3D content is not scaled uniformly. I initially thought this was an issue with my UI settings, but it turns out that the window generated by the Unity build was the cause. This is when I was grateful to be familiar with Objective-C and to understand what Unity was doing with the generated Xcode project.

Here's an example of what is happening today, and also how it looks after applying a fix. I will cover the fix next.

Fig 1.3 - Unity visionOS Xcode generated project

To fix this scaling problem, open the UnityAppController.mm file in Xcode and add the code below, which should scale your window uniformly. For reference, Fig 1.3 shows which area of the generated project needs this code. (Keep in mind that this is a temporary fix: if you rebuild your project, you may lose these changes and the lines will need to be added again.)

UIWindowSceneGeometryPreferencesVision *preferences = [[UIWindowSceneGeometryPreferencesVision alloc] init];
preferences.resizingRestrictions = UIWindowSceneResizingRestrictionsUniform;

[_window.windowScene requestGeometryUpdateWithPreferences:preferences errorHandler:^(NSError * _Nonnull error) {
    // Handle error if any
    NSLog(@"Error occurred: %@", error);
}];

Using The Fully Immersive VR App Mode In Unity

Cool, so far we've briefly discussed the default 2D Window in Unity, input, and a fix for scaling. But what about VR experiences with Unity for visionOS? Well, let's get into a few things I've learned. Keep in mind, however, that much of this is highly experimental. While you can implement a lot of functionality through code, many features available in Unity with XRI are currently not functioning well. For instance, XRI interactions (near and far grab interactables) with hands have many problems, but this is something I know Unity is currently working on, based on a few discussions I had with their team.

Fig 1.4 - A very basic Full Immersive VR visionOS demo in Unity

Let's start with the basics. So how do we create a VR experience for visionOS in Unity?

  1. For VR/Fully Immersive, ensure Unity 2022 LTS is installed and the visionOS modules are added, just as we did for the 2D Window.

  2. Add the following packages by going to Window > Package Manager (Unity Registry)

    • Input System

    • XR Interaction Toolkit

      • Under the Samples tab, add XR Hands and Starter Assets

  3. Install the XR Plug-in Management package as well as the visionOS plugin by going to Player Settings > XR Plug-in Management

  4. Under Player Settings > XR Plug-in Management > Apple visionOS

    • Set the App Mode to Virtual Reality - Fully Immersive Space

    • Populate Hand Tracking Usage Description so visionOS can ask the user for hand tracking permission.

    • If your experience requires world sensing (such as planes, meshing, etc.), be sure to also populate World Sensing Usage Description.

  5. The Depth Texture on your camera needs to be set to On.

  6. That’s it. To test, I suggest adding a cube or another 3D shape at position X=0, Y=1, Z=5 and deploying. Optionally, you could drag in the XR Origin (VR) rig and populate the Left & Right Controller game objects with the Left & Right Hand Device Position(s) and Rotation(s). (A step-by-step tutorial of these steps is covered in this video, and Fig 1.4 shows the results; a scripted version of steps 5 and 6 follows right after this list.)
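
If you prefer to script steps 5 and 6 instead of configuring them by hand, here's a rough sketch of how that could look. The class name is mine, and it assumes a URP project with the main camera tagged MainCamera:

using UnityEngine;
using UnityEngine.Rendering.Universal;

public class FullyImmersiveSceneCheck : MonoBehaviour
{
    private void Start()
    {
        // Step 5: make sure the camera's Depth Texture setting is On.
        var cameraData = Camera.main.GetUniversalAdditionalCameraData();
        cameraData.requiresDepthTexture = true;

        // Step 6: a simple cube at X=0, Y=1, Z=5 is enough to confirm rendering on device.
        var testCube = GameObject.CreatePrimitive(PrimitiveType.Cube);
        testCube.transform.position = new Vector3(0f, 1f, 5f);
    }
}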

Fully Immersive VR Input In Unity

We covered input in the 2D Window section, utilizing touches from the Input System. Similarly, in VR, we can add input mappings to our Input Actions file to detect actions like pinching with gaze or pinching and dragging. In this scenario, we can use what Unity calls a visionOS Spatial Pointer to determine precisely where in 3D space those actions occurred.

The Unity visionOS platform provides two types of Spatial Pointers: SpatialPointerDevice and VisionOSSpatialPointerDevice. Both are very similar, but in general, I followed Unity’s recommendation to use VisionOSSpatialPointerDevice for Fully Immersive VR. This is also what Unity says about them in their docs: “The primary difference between the two is that the interaction doesn't require colliders. Thus, VisionOSSpatialPointerDevice is missing input controls related to the interaction (targetId, interactionPosition, etc.).”

But how do we leverage these, especially the VR recommended version VisionOSSpatialPointerDevice, for input in Fully Immersive VR applications?

Well, let me walk you through how I configured this in my latest prototype:

Fig 1.5 - visionOS Spatial Pointers (primary, spatialPointer0, & spatialPointer1)

  1. Open the Input Actions file created in the previous section, and add the following mappings:

    • Action Name: FirstSpatialPointer

      • ActionType=Value ControlType=Vision OS Spatial Pointer Binding=<VisionOSSpatialPointerDevice>/spatialPointer0

    • Action Name: SecondSpatialPointer

      • ActionType=Value ControlType=Vision OS Spatial Pointer Binding=<VisionOSSpatialPointerDevice>/spatialPointer1

    • Action Name: PrimarySpatialPointer

      • ActionType=Value ControlType=Vision OS Spatial Pointer Binding=<VisionOSSpatialPointerDevice>/primarySpatialPointer

By now, you should have the mappings shown in Fig 1.5. For simplicity, I didn't map any of the hand transforms since I want to focus specifically on the Spatial Pointers. However, you can do so; refer to my original GitHub examples if you'd like to add additional hand features.

Now, how do we use these inputs? Let me demonstrate with an example where we can create a ray (with a line renderer) and perform a raycast against objects that have a specific layer mask.

using UnityEngine;
using UnityEngine.InputSystem;
using UnityEngine.XR.VisionOS;
using UnityEngine.XR.VisionOS.InputDevices;

public class SpatialHandInteractionHandler : MonoBehaviour
{
    [SerializeField] private GameObject spatialPointerPrefab;
    [SerializeField] private float spatialPointerDistance = 1000.0f;
    [SerializeField] private LayerMask layersUsedForInteractions;

    [SerializeField] private InputActionProperty spatialPointerProperty;

    private GameObject spatialPointer;
    private LineRenderer spatialPointerLine;
    private Transform selectedObject;

    private void Awake()
    {
        if (spatialPointerPrefab != null)
        {
            // The prefab's first child holds the LineRenderer used to draw the ray.
            spatialPointer = Instantiate(spatialPointerPrefab, transform);
            spatialPointer.SetActive(false);
            var pointerLine = spatialPointer.transform.GetChild(0);
            spatialPointerLine = pointerLine.GetComponent<LineRenderer>();
        }

        // Make sure the action is enabled (a no-op if something else, such as an
        // Input Action Manager, already enables the actions asset).
        spatialPointerProperty.action.Enable();
    }

    private void Update()
    {
        // update pointer info
        var pointerState = spatialPointerProperty.action.ReadValue<VisionOSSpatialPointerState>();
        
        // update pointer and ray
        UpdatePointer(pointerState, spatialPointer);
    }

    private void UpdatePointer(VisionOSSpatialPointerState pointerState, GameObject pointer)
    {
        var isPointerActive = pointerState.phase == VisionOSSpatialPointerPhase.Began ||
                              pointerState.phase == VisionOSSpatialPointerPhase.Moved;

        var pointerDevicePosition = transform.InverseTransformPoint(pointerState.inputDevicePosition);
        var pointerDeviceRotation = pointerState.inputDeviceRotation;
    
        pointer.gameObject.SetActive(pointerState.isTracked);
        pointer.transform.SetLocalPositionAndRotation(pointerDevicePosition, pointerDeviceRotation);

        spatialPointerLine.enabled = isPointerActive;
        if (isPointerActive)
        {
            spatialPointerLine.SetPosition(1, new Vector3(0,0, spatialPointerDistance));
            spatialPointerLine.transform.rotation = pointerState.startRayRotation;

            if (Physics.Raycast(pointerState.startRayOrigin, pointerState.startRayDirection, out RaycastHit hit, 
                    Mathf.Infinity, layersUsedForInteractions))
            {
                selectedObject = hit.transform;
                Debug.Log($"(Pointer_{spatialPointerProperty.action.name}) ray collided with ({selectedObject.name})");
            }
        }
    }
}

Fig 1.6 - Using visionOS Spatial Pointer Device information to build a 3D pointer with a ray

The results of adding the script above to a game object should resemble what I demonstrate in Fig 1.6. The only requirement is to create a spatialPointerPrefab, which can be any game object with a child game object containing a LineRenderer. Keep the LineRenderer to a maximum of 2 points, with a line width of 0.01, and with Use World Space set to false.
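
If you'd rather not build that prefab by hand, here's a small sketch that creates the same hierarchy at runtime and configures the LineRenderer as described above. The factory name is mine; pass in whatever material you want the line to use:

using UnityEngine;

public static class SpatialPointerFactory
{
    // Builds a pointer root with a child LineRenderer, matching what
    // SpatialHandInteractionHandler expects (root + child holding the line).
    public static GameObject CreatePointer(Material lineMaterial)
    {
        var root = new GameObject("SpatialPointer");

        var lineChild = new GameObject("PointerLine");
        lineChild.transform.SetParent(root.transform, false);

        var line = lineChild.AddComponent<LineRenderer>();
        line.positionCount = 2;               // max of 2 points
        line.startWidth = 0.01f;              // line width of 0.01
        line.endWidth = 0.01f;
        line.useWorldSpace = false;           // World Space set to false
        line.material = lineMaterial;
        line.SetPosition(0, Vector3.zero);
        line.SetPosition(1, Vector3.forward); // overwritten each frame by the handler

        return root;
    }
}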

You could also extend this by adding hand visualizers or hand meshes, as shown in Fig 1.7 below. The general idea is that you get access to the 3D touches that natively on visionOS require gaze/eye tracking plus a pinch; all of that is abstracted away and exposed to developers in a very simple way.
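
If you want to experiment with the hand side of this, here's a minimal sketch of reading a joint pose with the XR Hands package to drive your own visualizer. The component name is mine, and it assumes the XR Hands package and the visionOS hand tracking provider are running:

using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.Hands;

public class IndexTipTracker : MonoBehaviour
{
    private XRHandSubsystem handSubsystem;

    private void Update()
    {
        // Lazily grab the running hand subsystem (it may start after this component).
        if (handSubsystem == null || !handSubsystem.running)
        {
            var subsystems = new List<XRHandSubsystem>();
            SubsystemManager.GetSubsystems(subsystems);
            if (subsystems.Count == 0)
                return;
            handSubsystem = subsystems[0];
        }

        var rightHand = handSubsystem.rightHand;
        if (!rightHand.isTracked)
            return;

        // Joint poses are reported in session space, so parent this object under the XR Origin.
        var indexTip = rightHand.GetJoint(XRHandJointID.IndexTip);
        if (indexTip.TryGetPose(out Pose pose))
            transform.SetLocalPositionAndRotation(pose.position, pose.rotation);
    }
}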

Well, that’s all for today’s post. Honestly, I wanted to keep writing because there’s just so much to cover about visionOS development in Unity, but let’s save that for the next post. I will begin looking into additional ARKit visionOS features with AR Foundation and visionOS for my next few videos and posts. However, if there’s something specific you’re looking to learn, please let me know in the comments below.

Thank you for your time and happy XR coding, everyone!

Fig 1.7

Using the visionOS Spatial Pointer Device with hand visualizers for both hands. It is hard to see, but some of the joint movements are delayed since visionOS doesn’t send this information in real time, unlike other platforms.

Recommended Resources

  • The step-by-step YouTube video that this post is based on is available here.

  • The GitHub project containing all images/GIFs shown in this post is here.

  • For details on 3D Touch input, refer to the Unity Docs.

  • The VisionOS Spatial Pointer State docs are also very helpful for discovering all the information available with this binding.
