Visu3d - Transform (go/v3d-transform)#
If you’re new to v3d, please look at the intro first.
Installation#
We use the same installation and imports as in the intro.
!pip install visu3d etils[ecolab] jax[cpu] tf-nightly tfds-nightly sunds
from __future__ import annotations
from etils.ecolab.lazy_imports import *
Transformations#
v3d makes it easy to project back and forth across coordinate frames.
3d <> 3d#
v3d.Transform stores the position, rotation and scale of an object. It is used to transform objects (e.g. from world to camera 3d coordinates).
v3d.Transform is composed of an R (rotation, scale) component and a t (translation) component:
tr = v3d.Transform(
    R=[  # Rotation with scale: columns are orthogonal, but not all unit-length
        [-1/3, -(1/3)**.5, (1/3)**.5],
        [1/3, -(1/3)**.5, -(1/3)**.5],
        [-2/3, 0, -(1/3)**.5],
    ],
    t=[2, 2, 2],
)
# `.fig` displays the (x, y, z) basis of the transformation
tr.fig
v3d.Transform can be composed with all types of objects:
xnp.Array
Your custom objects (see the Protocol section below)
The transformation is applied through the Python __matmul__ operator: tr @ <obj>
v3d.make_fig([
    tr,
    tr @ np.array([[0, 0, 0], [1, 1, 1]]),
    tr @ v3d.Point3d(p=[0, 0, 2], rgb=[255, 0, 0]),
    tr @ v3d.Ray(pos=[0, 0, 0], dir=[0, 1, 1]),
    tr @ v3d.Transform(R=np.eye(3), t=[0, 0, 3]),
])
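To make the `tr @ <obj>` operation concrete, here is a minimal NumPy sketch of what applying and composing such transforms amounts to. It assumes the usual convention `p' = R @ p + t`, which is consistent with the outputs shown in this tutorial but is not taken from the v3d source. The helpers `apply_to_points` and `compose` are hypothetical names for illustration, not part of the v3d API.

```python
import numpy as np

# Same R and t values as the `tr` example above.
s = (1 / 3) ** 0.5
R = np.array([
    [-1 / 3, -s, s],
    [1 / 3, -s, -s],
    [-2 / 3, 0.0, -s],
])
t = np.array([2.0, 2.0, 2.0])


def apply_to_points(R, t, points):
  """Maps `(..., 3)` points with `p' = R @ p + t` (assumed convention)."""
  return points @ R.T + t


def compose(R1, t1, R2, t2):
  """Returns `(R, t)` equivalent to applying transform 2 first, then 1."""
  return R1 @ R2, R1 @ t2 + t1


points = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
moved = apply_to_points(R, t, points)  # The origin lands on `t`

# Composition with a pure translation along z
R2, t2 = np.eye(3), np.array([0.0, 0.0, 3.0])
R12, t12 = compose(R, t, R2, t2)
```

Under this convention, applying the composed `(R12, t12)` gives the same result as applying the translation first and then `(R, t)`.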
Inverting a transformation is trivial:
tr.inv
Transform(
R=array([[-0.5 , 0.5 , -1. ],
[-0.86602545, -0.86602545, -0. ],
[ 0.57735026, -0.57735026, -0.57735026]], dtype=float32),
t=array([2. , 3.4641018, 1.1547005], dtype=float32),
)
tr.inv @ tr # `tr.inv @ tr` is identity
Transform(
R=array([[1. , 0. , 0. ],
[0. , 1. , 0. ],
[0. , 0. , 0.99999994]], dtype=float32),
t=array([0., 0., 0.], dtype=float32),
)
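The inverse above can be reproduced by hand. The sketch below assumes the convention `p' = R @ p + t`, so the inverse is `p = R_inv @ p' - R_inv @ t`. Since R also carries scale, the inverse is computed with `np.linalg.inv` rather than a transpose. This is an illustration in plain NumPy, not the v3d implementation.

```python
import numpy as np

# Same R and t values as the `tr` example above.
s = (1 / 3) ** 0.5
R = np.array([
    [-1 / 3, -s, s],
    [1 / 3, -s, -s],
    [-2 / 3, 0.0, -s],
])
t = np.array([2.0, 2.0, 2.0])

# Inverse of `p' = R @ p + t`
R_inv = np.linalg.inv(R)
t_inv = -R_inv @ t

# Composing the inverse with the original gives the identity transform
R_id = R_inv @ R
t_id = R_inv @ t + t_inv
```

Up to floating-point rounding, `R_inv` and `t_inv` agree with the `tr.inv` values printed above.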
See the API for all properties (.matrix4x4, .x_dir, .y_dir, .z_dir, …).
3d <> 2d (Camera pixel projections)#
Let’s create a camera looking at the center.
# Camera looking at the center
cam = v3d.Camera.from_look_at(
    spec=v3d.PinholeCamera.from_focal(
        resolution=(128, 170),
        focal_in_px=120,
    ),
    pos=[2, -0.5, 1.7],
    target=[0, 0, 0],  # < TODO(epot): Rename end -> look_at
)
# Point cloud of arbitrary `(..., 3)` shape
rng = np.random.default_rng(0)
point_cloud = v3d.Point3d(
    p=(rng.random((50, 50, 3)) - 0.5) * 3,
    rgb=rng.integers(255, size=(50, 50, 3)),
)
We can project 3d points into 2d pixel coordinates using px_from_world. It supports:
xnp.Array: (..., 3) -> (..., 2)
Your custom objects (see the Protocol section below)
# Convert (world 3d) -> (px 2d) coordinates
px_coord = cam.px_from_world @ point_cloud
Which is equivalent to:
# Convert (world 3d) -> (camera 3d) -> (px 2d) coordinates
px_coord = cam.spec.px_from_cam @ cam.cam_from_world @ point_cloud
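For intuition, the (camera 3d) -> (px 2d) step can be sketched in plain NumPy as an ideal pinhole projection. This is an assumption-laden illustration, not the v3d implementation: it assumes a camera looking along +z, no distortion, and a principal point at the image center. The function `px_from_cam` below is a hypothetical helper that only shares its name with the v3d property.

```python
import numpy as np


def px_from_cam(points_cam, focal_px, principal_point):
  """Projects `(..., 3)` camera-frame points to `(..., 2)` pixel coordinates.

  Assumes an ideal pinhole camera looking along +z, without distortion.
  Returns the pixel coordinates and the z value of each point.
  """
  xy = points_cam[..., :2]
  z = points_cam[..., 2:3]
  px = focal_px * xy / z + principal_point
  return px, z


# Hypothetical principal point: the center of a 170 x 128 (w, h) image
principal_point = np.array([170 / 2, 128 / 2])
points_cam = np.array([[0.0, 0.0, 2.0], [0.5, -0.25, 2.0]])
px, depth = px_from_cam(
    points_cam, focal_px=120.0, principal_point=principal_point
)
```

A point on the optical axis lands on the principal point, and the leading `(...)` shape of the input is kept in the output.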
v3d.Point2d can be visualized in the pixel space:
# Truncate coordinates outside the screen
# Use `(w, h)` as pixels are in `(i, j)` coordinates
px_coord = px_coord.clip(min=0, max=cam.wh)
px_coord.fig
The v3d.Point3d -> v3d.Point2d projection preserves the depth and rgb values, which allows projecting back to 3d without any information loss:
px_coord.flatten()[0]
Point2d(
p=array([ 63.00236, 118.86679], dtype=float32),
depth=array([3.1114159], dtype=float32),
rgb=array([ 48, 134, 92], dtype=uint8),
)
The transformation preserves the shape: (*shape, 3) -> (*shape, 2).
print(f'{point_cloud.p.shape} -> {px_coord.p.shape}')
(50, 50, 3) -> (50, 50, 2)
When the depth is missing, points are projected back at z=1 in camera coordinates:
px_coord = px_coord.replace(depth=None)
# Convert (px 2d) -> (world 3d) coordinates
projected_points = cam.world_from_px @ px_coord
v3d.make_fig([
    point_cloud,
    projected_points,
    cam,
])
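The role of the depth in the (px 2d) -> (camera 3d) direction can be sketched as follows. This is a plain NumPy illustration under stated assumptions, not the v3d implementation: an ideal pinhole camera, and `depth` interpreted as the z coordinate in the camera frame. The function `cam_from_px` is a hypothetical helper.

```python
import numpy as np


def cam_from_px(px, focal_px, principal_point, depth=None):
  """Back-projects `(..., 2)` pixels to `(..., 3)` camera-frame points.

  Assumes an ideal pinhole camera and that `depth` is the z coordinate in
  the camera frame. When `depth` is None, points lie on the z=1 plane.
  """
  xy = (px - principal_point) / focal_px  # Coordinates on the z=1 plane
  ones = np.ones_like(xy[..., :1])
  points = np.concatenate([xy, ones], axis=-1)
  if depth is not None:
    points = points * depth  # Scale along the ray to reach the given depth
  return points


principal_point = np.array([85.0, 64.0])  # Hypothetical image center
px = np.array([[85.0, 64.0], [115.0, 49.0]])

on_plane = cam_from_px(px, 120.0, principal_point)
restored = cam_from_px(
    px, 120.0, principal_point, depth=np.array([[2.0], [2.0]])
)
```

Without depth, every pixel maps to a point on the z=1 plane. With depth, the original 3d positions are recovered.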
Supporting the Transform protocol#
To support v3d.Transform, you only need to implement the apply_transform protocol.
from etils.array_types import f32

class MyRay(v3d.DataclassArray):
  pos: f32['*shape 3']
  dir: f32['*shape 3']

  def apply_transform(self, tr: v3d.Transform):
    """Supports `tr @ my_ray`."""
    return self.replace(
        pos=tr @ self.pos,
        # `tr.apply_to_dir` only applies the rotation (tr.R), but NOT the
        # translation (tr.t)
        dir=tr.apply_to_dir(self.dir),
    )

my_ray = MyRay(pos=[0, 0, 0], dir=[0, 0, 1])
cam.world_from_cam @ my_ray
MyRay(
pos=array([ 2. , -0.5, 1.7], dtype=float32),
dir=array([-0.74848115, 0.18712029, -0.636209 ], dtype=float32),
)
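To illustrate how such a protocol can work in general, here is a self-contained sketch of a `__matmul__` that dispatches to an `apply_transform` method. `SimpleTransform` and `SimpleRay` are hypothetical stand-ins written for this example. They are not the v3d classes, and v3d's actual dispatch logic may differ.

```python
import dataclasses

import numpy as np


@dataclasses.dataclass(frozen=True)
class SimpleTransform:
  """Hypothetical stand-in for a transform, NOT the v3d class."""

  R: np.ndarray  # (3, 3) rotation / scale
  t: np.ndarray  # (3,) translation

  def apply_to_pos(self, pos):
    return pos @ self.R.T + self.t

  def apply_to_dir(self, dir_):
    return dir_ @ self.R.T  # Directions ignore the translation

  def __matmul__(self, other):
    if isinstance(other, np.ndarray):
      return self.apply_to_pos(other)
    # Any object exposing `apply_transform` opts in to `tr @ obj`
    return other.apply_transform(self)


@dataclasses.dataclass(frozen=True)
class SimpleRay:
  """Hypothetical ray, mirroring the `MyRay` example above."""

  pos: np.ndarray
  dir: np.ndarray

  def apply_transform(self, tr):
    return dataclasses.replace(
        self,
        pos=tr @ self.pos,
        dir=tr.apply_to_dir(self.dir),
    )


tr = SimpleTransform(R=np.eye(3), t=np.array([2.0, -0.5, 1.7]))
ray = tr @ SimpleRay(pos=np.zeros(3), dir=np.array([0.0, 0.0, 1.0]))
```

The transform only needs to know that the object has an `apply_transform` method. The object itself decides which of its fields are positions and which are directions.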
Similarly, to support 3d <-> 2d pixel projection, you need to implement the apply_px_from_cam and apply_cam_from_px protocols. See v3d.Point3d for an implementation example.
For more info on how to create your custom v3d.DataclassArray primitives, have a look at the dataclass array tutorial.