diff --git a/documentation/README.md b/documentation/README.md index f3033f3..e544744 100644 --- a/documentation/README.md +++ b/documentation/README.md @@ -8,4 +8,6 @@ This folder contains documentation and other examples. * This Jupyter notebook describes the pinhole camera model, and the relationship between H-FOV and focal length. * [fisheye_camera_models.ipynb](./fisheye_camera_models.ipynb) * This Jupyter notebook describes typical fisheye camera models. +* [camera_pose_to_homography.ipynb](./camera_pose_to_homography.ipynb) + * This Jupyter notebook describes homography between a plane defined in a world coordinate frame and the camera. diff --git a/documentation/camera_pose_to_homography.ipynb b/documentation/camera_pose_to_homography.ipynb new file mode 100644 index 0000000..92fef5e --- /dev/null +++ b/documentation/camera_pose_to_homography.ipynb @@ -0,0 +1,686 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Camera Pose to Homography Conversion\n", + "\n", + "This Jupyter notebook demonstrates how to convert camera pose information, defined in a world coordinate system, into a homography mapping. The mapping relates a defined plane in the world coordinate system to the camera plane. We begin by explaining how to define a homography mapping for a plane aligned with the world coordinate system. Then, we address a more general case where the world coordinate frame and the plane of interest are not aligned. The motivation for the second case is as follows: imagine you need to map points between the camera plane and a plane defined in the world coordinate frame. However, this plane may not necessarily align with the world coordinate system. Such a mapping enables determining the 3D coordinates, within the plane, of a point observed in the camera." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Homography Between a Plane and the Camera\n", + "\n", + "In this case the plane $\\pi$ (the plane of interest) aligns with the XY-plane of the world coordinate system. Points in the image and the scene are related by a projective transformation. In Figure 1. $C$ is the camera centre, $x$ is the projection of the point $x_\\pi$ onto the camera plane. The point $x_\\pi$ lies on the XY-plane.\n", + "\n", + "
\n", + " \n", + "
Figure 1: Projective transformation from camera plane to a plane that aligns with the world coordinate frame.
\n", + "
\n", + "\n", + "The projective transformation $x=PX$ is a map from the world coordinate frame to a point in the image coordinate frame. We have placed the world coordinate frame so that the XY-plane aligns with the plane $\\pi$, meaning that the Z-coordinate is 0. Therefore we have\n", + "\n", + "$$\n", + "x=PX=\\begin{bmatrix}p_1 & p_2 & p_3 & p_4 \\end{bmatrix} \\begin{bmatrix}X \\\\ Y \\\\ 0 \\\\ 1 \\end{bmatrix} = \\begin{bmatrix}p_1 & p_2 & p_4 \\end{bmatrix} \\begin{bmatrix}X \\\\ Y \\\\ 1 \\end{bmatrix}\n", + "$$\n", + "\n", + ", and since the projection matrix $P$ of a calibrated camera is as follows\n", + "\n", + "$$\n", + "P = K\\begin{bmatrix}R & t\\end{bmatrix} = K\\begin{bmatrix}r_1 & r_2 & r_3 & t \\end{bmatrix}\n", + "$$\n", + "\n", + "Putting this information together, we have\n", + "\n", + "$$\n", + "x = K\\begin{bmatrix}r_1 & r_2 & r_3 & t \\end{bmatrix} \\begin{bmatrix}X \\\\ Y \\\\ 0 \\\\ 1 \\end{bmatrix} = K\\begin{bmatrix}r_1 & r_2 & t \\end{bmatrix} \\begin{bmatrix}X \\\\ Y \\\\ 1 \\end {bmatrix}\n", + "$$\n", + "\n", + "Since plane to plane mapping is a homography, we have the following\n", + "\n", + "$$\n", + "H = K\\begin{bmatrix}r_1 & r_2 & t \\end{bmatrix}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Homography Between a Plane Defined in the World Coordinate Frame and the Camera\n", + "\n", + "In this case we want to find the homography between any plane defined in the world coordinate frame and the camera plane. We have the following information:\n", + "\n", + "* Camera calibration matrix: $K$\n", + " * We have either calibrated the camera or obtained the camera calibration matrix using other means.\n", + "* Camera pose in the world coordinate system: $^cT_w$\n", + " * We know the camera pose in the world coordinate system.\n", + "* Plane parameters: $\\pi = \\begin{bmatrix}\\pi_1 & \\pi_2 & \\pi_3 & \\pi_4\\end{bmatrix}$\n", + " * We have a known plane defined in the world coordinate frame\n", + "\n", + "We expect that the direction vector $e_{w3}=\\begin{bmatrix}0 & 0 & 1\\end{bmatrix}$ of the world coordinate frame intersects with the plane $\\pi$. In Figure 2. we have the following, $O_c$ is the camera centre, $O_w$ is the centre of the world coordinate frame and $O_\\pi$ is the centre of the plane $\\pi$.\n", + "\n", + "
\n", + " \n", + "
Figure 2: Projective transformation from the camera to a plane that does not align with XY-plane of the world coordinate frame.\n", + "
\n", + "
\n", + "\n", + "We define the transformation $\\left(^cT_w\\right)^{-1} \\left({}^{\\pi}T_w\\right) \\left(^cT_{\\pi}\\right)$ so that it maps points defined in the camera coordinate frame back to the camera coordinate frame, or more formally\n", + "\n", + "$$\n", + "x_c^{\\prime} = \\left( \\left(^cT_w\\right)^{-1} \\left({}^{\\pi}T_w\\right) \\left(^cT_{\\pi}\\right) \\right) x_c = x_c\n", + "$$\n", + "\n", + "The transformations are as follows:\n", + "\n", + "* $^cT_w$ transformation matrix from the world coordinate frame to the camera coordinate frame\n", + "* ${}^{\\pi}T_w$ transformation matrix from the world coordinate frame to the coordinate frame defined in the plane $\\pi$\n", + "* $^cT_{\\pi}$ transformation matrix from the coordinate frame defined in the plane $\\pi$ to the camera coordinate frame\n", + "\n", + "The transformation ${}^{\\pi}T_w$ is defined as follows\n", + "\n", + "$$\n", + "{}^{\\pi}T_w = \n", + "\\begin{bmatrix}\n", + "{}^{\\pi}R_w & {}^{\\pi}t_w \\\\\n", + "0 & 1\n", + "\\end{bmatrix}\n", + "$$\n", + "\n", + "Next steps are defining both the location and the orientation of the plane coordinate frame in the world coordinate frame.\n", + "\n", + "### Origin of the Plane $\\pi$\n", + "\n", + "The plane $\\pi$ is defined if the world coordinate frame using the plane equation that is define later on. In order for us to define a coordinate system on the plane itself, we need to define where the origin of the plane is. In order to simplify things, we place origin of the plane coordinate frame at the location where the Z-axis of the world coordinate frame, defined by $e_{w3}=\\begin{bmatrix}0 & 0 & 1\\end{bmatrix}$, intersects with the plane. Therefore, we parameterize a line in the direction of the coordinate vector $e_z=\\begin{bmatrix}0 & 0 & 1\\end{bmatrix}^t$ with respect to $t$ as follows\n", + "\n", + "$$\n", + "\\begin{bmatrix}X \\\\ Y \\\\ Z\\end{bmatrix} = O_w + e_z t = \\begin{bmatrix}o_x \\\\ o_y \\\\ o_z\\end{bmatrix} + \\begin{bmatrix}0 \\\\ 0 \\\\ 1\\end{bmatrix} t = \\begin{bmatrix}0 \\\\ 0 \\\\ 1\\end{bmatrix} t\n", + "$$\n", + "\n", + ", where $O_w$ is the origin of the world coordinate system. The plane equation is as follows:\n", + "\n", + "$$\n", + "\\pi_1 X + \\pi_2 Y + \\pi_3 Z + \\pi_4 = 0\n", + "$$\n", + "\n", + "Next we plug in the coordinates from the parameterized line equation and we get\n", + "\n", + "$$\n", + "t = -\\dfrac{\\pi_4}{\\pi_3}\n", + "$$\n", + "\n", + "$t$ is the length of the vector in the direction of $e_z$, and that is where the we place the origin of the plane $\\pi$.\n", + "\n", + "### Orientation of the Coordinate System on the Plane $\\pi$\n", + "\n", + "We want to align orientation of the plane coordinate system so that the Z-axis is aligned with the normal $n$ of the plane, while keeping X- and Y- directions aligned with the world coordinate system. Normal $n$ of a plane\n", + "\n", + "$$\n", + "\\pi = \\begin{bmatrix}\\pi_1 & \\pi_2 & \\pi_3 & \\pi_4\\end{bmatrix} \\begin{bmatrix} X \\\\ Y \\\\ Z \\\\ 1 \\end{bmatrix}\n", + "$$\n", + "\n", + "is\n", + "\n", + "$$\n", + "n = \\begin{bmatrix}\\pi_1 & \\pi_2 & \\pi_3\\end{bmatrix}\n", + "$$\n", + "\n", + "Basis vectors of the world coordinate system are\n", + "\n", + "$$\n", + "e_{w1} = \\begin{bmatrix}1\\\\0\\\\0\\end{bmatrix},\\quad e_{w2} = \\begin{bmatrix}0\\\\1\\\\0\\end{bmatrix},\\quad e_{w3} = \\begin{bmatrix}0\\\\0\\\\1\\end{bmatrix}\n", + "$$\n", + "\n", + "We define the basis vectors $e_{p1}$, $e_{p2}$ and $e_{p3}$ of the plane coordinate system using the plane normal vector $n$ and the basis vector $e_{w2}$, so \n", + "that the basis vector $e_{p3}$ it aligns with $n$\n", + "\n", + "$$\n", + "e_{p3} = \\dfrac{n}{||n||}\n", + "$$\n", + "\n", + "Then we define the basis vector $e_{p1}$ to be perpendicular to $e_{w2}$ and $e_{p3}$\n", + "\n", + "$$\n", + "e_{p1} = \\dfrac{e_{w2} \\times n}{||e_{w2} \\times n||}\n", + "$$\n", + "\n", + ", and finally the basis vector $e_{p2}$ is perpendicular to both $ep_1$ and $e_{p3}$\n", + "\n", + "$$\n", + "e_{p2} = \\dfrac{e_{p3} \\times e_{p1}}{||e_{p3} \\times e_{p1}||}\n", + "$$\n", + "\n", + "Thus, orientation of the plane coordinate system is\n", + "\n", + "$$\n", + "{}^{\\pi}R_w = \\begin{bmatrix}e_{p1} && e_{p2} && e_{p3}\\end{bmatrix}\n", + "$$\n", + "\n", + "### Combined Transformations\n", + "\n", + "The transformation from the plane $\\pi$ to the camera coordinate frame is defined as follows\n", + "\n", + "$$\n", + "{}^{c}T_{\\pi} = \n", + "\\begin{bmatrix}\n", + "{}^{c}R_{\\pi} & {}^{c}t_{\\pi} \\\\\n", + "0 & 1\n", + "\\end{bmatrix}\n", + "$$\n", + "\n", + ", where the rotation matrix and the translation vector are defined as follows\n", + "\n", + "$$\n", + "\\begin{aligned}\n", + "{}^{c}R_{\\pi} &= {}^{c}R_{w} \\left({}^{\\pi}R_{w} \\right)^{-1} \\\\\n", + "{}^{c}t_{\\pi} &= {}^{c}t_{w} + {}^{c}R_{w} {}^{w}t_{\\pi}\n", + "\\end{aligned}\n", + "$$\n", + "\n", + ", and\n", + "\n", + "$$\n", + "{}^{w}t_{\\pi} = - \\left({}^{\\pi}R_{w} \\right)^{-1} {}^{\\pi}t_{w}\n", + "$$\n", + "\n", + "### Homography\n", + "\n", + "When we calculate the homography\n", + "\n", + "$$\n", + "H = K\\begin{bmatrix}r_1 & r_2 & t \\end{bmatrix}\n", + "$$\n", + "\n", + "$r_1$ and $r_2$ are from ${}^{c}R_{\\pi}$ and $t$ is ${}^{c}t_{\\pi}$." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Testing\n", + "\n", + "In the following we create a test case for the homography mapping. World coordinate system origin is at $\\left(0,0,0\\right)$." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Jupyter environment detected. Enabling Open3D WebVisualizer.\n", + "[Open3D INFO] WebRTC GUI backend enabled.\n", + "[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.\n" + ] + } + ], + "source": [ + "from python_camera_library.utils import rotation\n", + "import numpy as np\n", + "import math\n", + "import open3d as o3d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Plane\n", + "\n", + "The equation of a plane with normal vector $n=(a,b,c)$ through the point $x_0=(x_0, y_0, z_0)$ is $n(x-x_0) = 0$. This gives the equation of a plane $ax + by + cz + d = 0$, where $d=-ax_0-by_0-cz_0$" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Plane normal vector:\n", + "[[0.17364818]\n", + " [0. ]\n", + " [0.98480775]]\n", + "Plane origin:\n", + "[[ 0]\n", + " [ 0]\n", + " [-2]]\n", + "Plane d-parameter:\n", + "1.969615506024416\n", + "Plane equation:\n", + "[0.17364818 0. 0.98480775 1.96961551]\n" + ] + } + ], + "source": [ + "# Plane normal\n", + "n = np.array([0, 0, 1]).reshape(3, 1)\n", + "R = rotation(rot_x=0, rot_y=math.radians(10), rot_z=0)\n", + "n = R@n\n", + "\n", + "# Plane origin\n", + "O_p = np.array([0, 0, -2]).reshape(3,1)\n", + "d = np.sum(-n*O_p)\n", + "\n", + "# Plane equation\n", + "plane_equation = np.append(n, d)\n", + "\n", + "print(f\"Plane normal vector:\\n{n}\")\n", + "print(f\"Plane origin:\\n{O_p}\")\n", + "print(f\"Plane d-parameter:\\n{d}\")\n", + "print(f\"Plane equation:\\n{plane_equation}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Intersection of the world coordinate frame with the plane:\n", + "[[-0.]\n", + " [-0.]\n", + " [-2.]]\n", + "R_w2p:\n", + "[[ 0.98480775 0. -0.17364818]\n", + " [ 0. 0.98480775 0. ]\n", + " [ 0.17364818 0. 0.98480775]]\n", + "t_w2p:\n", + "[[0.34729636]\n", + " [0. ]\n", + " [1.96961551]]\n", + "R_p2w:\n", + "[[ 0.98480775 0. 0.17364818]\n", + " [ 0. 0.98480775 0. ]\n", + " [-0.17364818 0. 0.98480775]]\n", + "t_p2w:\n", + "[[-0.68404029]\n", + " [ 0. ]\n", + " [-1.87938524]]\n" + ] + } + ], + "source": [ + "# Intersection of the world coordinate system with the plane\n", + "t = -plane_equation[-1]/plane_equation[-2]\n", + "intersection = np.array([0., 0., 1.]).reshape(3,1) * t\n", + "print(f\"Intersection of the world coordinate frame with the plane:\\n{intersection}\")\n", + "\n", + "# Orientation of the plane coordinate system\n", + "ew_1 = np.array([1., 0., 0.])\n", + "ew_2 = np.array([0., 1., 0.])\n", + "\n", + "ep_3 = n/np.linalg.norm(n)\n", + "ep_1 = np.cross(ew_2.flatten(), ep_3.flatten())\n", + "ep_1 /= np.linalg.norm(ep_1)\n", + "ep_2 = np.cross(ep_3.flatten(), ew_1.flatten())\n", + "ep_2 /= np.linalg.norm(ep_1)\n", + "\n", + "# R_w2p and t_w2p transform points from the world to plane coordinate frame\n", + "R_w2p = np.hstack((ep_1.reshape(3,1), ep_2.reshape(3,1), ep_3.reshape(3,1))).T\n", + "t_w2p = -R_w2p.T @ intersection.copy()\n", + "\n", + "print(f\"R_w2p:\\n{R_w2p}\")\n", + "print(f\"t_w2p:\\n{t_w2p}\")\n", + "\n", + "# R_p2w and t_p2w transform points from the plane to world coordinate frame\n", + "t_p2w = -R_w2p.T @ t_w2p\n", + "R_p2w = R_w2p.T\n", + "print(f\"R_p2w:\\n{R_p2w}\")\n", + "print(f\"t_p2w:\\n{t_p2w}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Test Transformations\n", + "\n", + "Test the world to plane to camera transformations:\n", + "1. First transfer the point of interest from world to plane and then from plane to camera coordinate frame\n", + "2. Calculate the combined transformation world -> camera\n", + "\n", + "Both transformations should produce same results." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "poi_w:\n", + "[[0.]\n", + " [0.]\n", + " [0.]]\n", + "poi_p:\n", + "[[0.34729636]\n", + " [0. ]\n", + " [1.96961551]]\n", + "R_w2c:\n", + "[[ 0. 1. 0.]\n", + " [ 0. 0. -1.]\n", + " [-1. 0. 0.]]\n", + "t_w2c:\n", + "[[0.]\n", + " [0.]\n", + " [3.]]\n", + "poi_w -> poi_c1: [0. 0. 0.] -> [0. 0. 3.]\n", + "poi_p -> poi_c2: [0.34729636 0. 1.96961551] -> [0. 0. 3.]\n" + ] + } + ], + "source": [ + "# Point of interest in the world coordinate frame\n", + "poi_w = np.array([0., 0., 0.]).reshape(3, 1)\n", + "\n", + "# Point of interest in the plane coordinate frame\n", + "poi_p = R_w2p @ poi_w + t_w2p\n", + "\n", + "print(f\"poi_w:\\n{poi_w}\")\n", + "print(f\"poi_p:\\n{poi_p}\")\n", + "\n", + "# R_w2c and t_w2c transform coordinates from the world coordinate frame to the camera coordinate frame\n", + "R_w2c = np.array([[0., 1., 0.], [0., 0., -1.], [-1., 0., 0.]]).reshape(3,3)\n", + "t_w2c = np.array([0., 0., 3.]).reshape(3,1)\n", + "print(f\"R_w2c:\\n{R_w2c}\")\n", + "print(f\"t_w2c:\\n{t_w2c}\")\n", + "\n", + "# R_p2c and t_p2c transform coordinates from the plane coordinate frame to the camera coordinate frame\n", + "R_p2c = R_w2c @ R_w2p.T\n", + "t_p2c = t_w2c - R_w2c @ R_w2p.T @ t_w2p\n", + "\n", + "# Transfer the poi_w to the camera coordinate frame\n", + "poi_c1 = R_w2c @ poi_w + t_w2c\n", + "\n", + "# Transfer the poi_p to the camera coordinate frame (poi_p is the same as poi_w but in the plane doordinate system)\n", + "poi_c2 = R_p2c @ poi_p + t_p2c\n", + "\n", + "# poi_c1 and poi_c2 should have exactly the same values\n", + "print(f\"poi_w -> poi_c1: {poi_w.flatten()} -> {poi_c1.flatten()}\")\n", + "print(f\"poi_p -> poi_c2: {poi_p.flatten()} -> {poi_c2.flatten()}\")\n", + "\n", + "# assert that they actually have the same values using assert_allclose\n", + "np.testing.assert_allclose(poi_c1, poi_c2, rtol=1e-5, atol=1e-15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Test Projection Using Camera Calibration Matrix K and Homography\n", + "\n", + "We project the point of interest directly using the camera calibration matrix K and the homography matrix H. Both should produce the same results. Homography is defined as follows:\n", + "\n", + "$$\n", + "H = K \\begin{bmatrix}r_1 & r_2 & t_{p2c}\\end{bmatrix}\n", + "$$\n", + "\n", + ", where\n", + "\n", + "$$\n", + "R_{p2c} = \\begin{bmatrix}r_1 & r_2 & r_3\\end{bmatrix}\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Point of interest poi2_p in the plane coordinate frame:\n", + "[[1. ]\n", + " [0.5]\n", + " [0. ]]\n", + "poi2_c projected to the camera using camera matrix K, uv1:\n", + "[[18.24236595]\n", + " [76.05989457]\n", + " [ 1. ]]\n", + "poi2_p projected to the camera using homography H, uv2:\n", + "[[18.24236595]\n", + " [76.05989457]\n", + " [ 1. ]]\n", + "uv2 projected to the plane using inverse of homography H:\n", + "[[1. ]\n", + " [0.5]\n", + " [1. ]]\n" + ] + } + ], + "source": [ + "# Point of interest defined in the plane coordinate frame\n", + "poi2_p = np.array([1., 0.5, 0.]).reshape(3,1)\n", + "print(f\"Point of interest poi2_p in the plane coordinate frame:\\n{poi2_p}\")\n", + "\n", + "# Transform poi2_p to the camera coordinate frame\n", + "poi2_c = R_p2c @ poi2_p + t_p2c\n", + "\n", + "# Camera matrix\n", + "K = np.array([[100., 0., 0.], [0., 100., 0.], [0., 0., 1.]]).reshape(3,3)\n", + "\n", + "# Homography from plane to the camera\n", + "H = K @ np.hstack((R_p2c[:, :2].reshape(3, 2), t_p2c))\n", + "\n", + "# Project the point to the camera using camera calibration matrix K\n", + "uv1 = K @ poi2_c\n", + "uv1 /= uv1[-1, -1]\n", + "print(f\"poi2_c projected to the camera using camera matrix K, uv1:\\n{uv1}\")\n", + "\n", + "# Project the point to the camera using homography H\n", + "poi2_p_homog = poi2_p[:2].copy()\n", + "poi2_p_homog = np.append(poi2_p_homog, 1.).reshape(3,1)\n", + "uv2 = H @ poi2_p_homog\n", + "uv2 /= uv2[-1, -1]\n", + "print(f\"poi2_p projected to the camera using homography H, uv2:\\n{uv2}\")\n", + "\n", + "# Assert that uv1 and uv2 are equal (close)\n", + "np.testing.assert_allclose(uv1, uv2, rtol=1e-5, atol=1e-15)\n", + "\n", + "# Backproject the point to the plane, this should have the same XY-coordinates as poi2_p\n", + "uv_backprojected = np.linalg.inv(H)@uv2\n", + "uv_backprojected /= uv_backprojected[-1, -1]\n", + "print(f\"uv2 projected to the plane using inverse of homography H:\\n{uv_backprojected}\")\n", + "\n", + "# Assert that the XY-coordinates are close\n", + "np.testing.assert_allclose(poi2_p[:2], uv_backprojected[:2], rtol=1e-5, atol=1e-10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Intersection Between a Camera Unit Vector and the Plane\n", + "\n", + "Here we calculate where the back projection of the $uv1$, in the camera coordinate frame, intersects with the plane. Since $uv1$ is projection of $poi2$, these should both produce the same point." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Length of the vector: 3.4268446227929985\n", + "Intersection:\n", + "[[ 0.30076747]\n", + " [ 0.49240388]\n", + " [-2.05303342]]\n", + "poi2_p:\n", + "[[1. ]\n", + " [0.5]\n", + " [0. ]]\n", + "poi2_p in world coordinate frame, poi2_w:\n", + "[[ 0.30076747]\n", + " [ 0.49240388]\n", + " [-2.05303342]]\n" + ] + } + ], + "source": [ + "# Vector in the direction of uv1 in the camera coordinate frame\n", + "vect_c = np.linalg.inv(K) @ uv1\n", + "vect_c /= np.linalg.norm(vect_c)\n", + "\n", + "R_c2w = R_w2c.T\n", + "t_c2w = -R_w2c.T @ t_w2c\n", + "\n", + "# Vector in the direction of uv1 in the world coordinate frame\n", + "vect_w = R_c2w @ vect_c\n", + "vect_w /= np.linalg.norm(vect_w)\n", + "\n", + "# From the plane equation\n", + "d = plane_equation[-1]\n", + "n = plane_equation[:3]\n", + "length_vect = -(d+np.sum((n.flatten()*t_c2w.flatten()))) / (np.sum(n.flatten()*vect_w.flatten()))\n", + "plane_intersection = t_c2w + length_vect*vect_w\n", + "print(f\"Length of the vector: {length_vect}\")\n", + "print(f\"Intersection:\\n{t_c2w + length_vect*vect_w}\")\n", + "\n", + "# Convert poi2_p to world coordinates\n", + "poi2_w = R_p2w @ poi2_p + t_p2w\n", + "print(f\"poi2_p:\\n{poi2_p}\")\n", + "print(f\"poi2_p in world coordinate frame, poi2_w:\\n{poi2_w}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "# Function to create a plane mesh with rotation and translation, visible from both sides\n", + "def create_plane(center, normal, width, height, R, t):\n", + " # Calculate two orthogonal vectors in the plane\n", + " if np.allclose(normal, [0, 0, 1]):\n", + " v1 = np.array([1, 0, 0])\n", + " else:\n", + " v1 = np.cross(normal, [0, 0, 1])\n", + " v1 /= np.linalg.norm(v1)\n", + "\n", + " v2 = np.cross(normal, v1)\n", + " v2 /= np.linalg.norm(v2)\n", + "\n", + " # Calculate the four corners of the plane\n", + " half_width = width / 2\n", + " half_height = height / 2\n", + " corners = np.array([\n", + " center + half_width * v1 + half_height * v2,\n", + " center - half_width * v1 + half_height * v2,\n", + " center - half_width * v1 - half_height * v2,\n", + " center + half_width * v1 - half_height * v2,\n", + " ])\n", + "\n", + " # Apply rotation and translation to the corners\n", + " corners = (R @ corners.T).T + t.reshape(1, 3)\n", + "\n", + " # Create the mesh\n", + " vertices = o3d.utility.Vector3dVector(corners)\n", + " triangles = o3d.utility.Vector3iVector([[0, 1, 2], [2, 3, 0]])\n", + " mesh = o3d.geometry.TriangleMesh()\n", + " mesh.vertices = vertices\n", + " mesh.triangles = triangles\n", + " mesh.compute_vertex_normals()\n", + "\n", + " # Add colors to the mesh for better visualization\n", + " mesh.paint_uniform_color([0.7, 0.7, 0.7]) # Light gray color\n", + "\n", + " return mesh\n", + "\n", + "# Function to create a sphere\n", + "def create_sphere(position, radius, color):\n", + " sphere = o3d.geometry.TriangleMesh.create_sphere(radius=radius)\n", + " sphere.compute_vertex_normals()\n", + " sphere.paint_uniform_color(color)\n", + " sphere.translate(position)\n", + " return sphere" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Red is the X-axis\n", + "Green is the Y-axis\n", + "Blue is the Z-axis\n" + ] + } + ], + "source": [ + "# Create visualization for the world coordinate frame\n", + "world_coordinate_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.5, origin=[0, 0, 0])\n", + "\n", + "# Create visualization for the plane coordinate frame\n", + "plane_coordinate_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.5, origin=O_p)\n", + "plane_coordinate_frame.rotate(R, center=O_p)\n", + "\n", + "# Create visualization for the camera coordinate frame\n", + "camera_coordinate_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.5, origin=t_c2w)\n", + "camera_coordinate_frame.rotate(R_c2w, center=t_c2w)\n", + "\n", + "# Create the plane\n", + "plane = create_plane(center=O_p.flatten(), normal=n, width=2., height=2., R=np.eye(3), t=np.array([0,0,0]))\n", + "\n", + "# Shows POIs -> these should overlap, we should see only the red ball\n", + "poi = create_sphere(plane_intersection, 0.02, [0.0, 1.0, 0.0])\n", + "poi2 = create_sphere(poi2_w, 0.02, [1.0, 0.0, 0.0])\n", + "\n", + "# Draw everything\n", + "o3d.visualization.draw_geometries([world_coordinate_frame, plane, plane_coordinate_frame, camera_coordinate_frame, poi, poi2])\n", + "\n", + "print(\"Red is the X-axis\")\n", + "print(\"Green is the Y-axis\")\n", + "print(\"Blue is the Z-axis\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "3d-computer-vision", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/documentation/images/camera_pose_to_homography.png b/documentation/images/camera_pose_to_homography.png new file mode 100644 index 0000000..c545878 Binary files /dev/null and b/documentation/images/camera_pose_to_homography.png differ diff --git a/documentation/images/camera_pose_to_homography_generic.jpg b/documentation/images/camera_pose_to_homography_generic.jpg new file mode 100644 index 0000000..5c0e127 Binary files /dev/null and b/documentation/images/camera_pose_to_homography_generic.jpg differ