Algebraic Rotations and Stereography

Rotation can be rather challenging, even in two dimensional space. Typically the subject is approached from complicated-looking rotation matrices, with the occasional accompanying comment about complex numbers. The challenge only increases when examining three dimensional space, where everyone has different ideas about the best implementations. In what follows, I attempt to provide a derivation of 2D and 3D rotations based in intuition and without the use of trigonometry.

Complex Numbers: 2D Rotations

Multiplication between complex numbers is frequently analyzed as a combination of scaling (a ratio of two scales) and rotation (a point on the complex unit circle). However, the machinery used in these analyses usually expresses complex numbers in a polar form involving the complex exponential, concepts which leverage knowledge of trigonometry and some calculus. In fact, they obfuscate the algebraic properties of complex numbers when neither concept is truly necessary to understand points on the complex unit circle, or complex number multiplication in general.

A review of complex multiplication

A complex number $z = a + bi$ multiplies with another complex number $w = c + di$ in the obvious way: by distributing over addition.

zw = (a+bi)(c+di) = ac + bci + adi + bdi^2 = (ac -\ bd) + i(ad + bc)

z also has associated to it a conjugate $z^* = a -\ bi$. The product $zz^*$ is the norm $a^2 + b^2$, a positive real quantity whose square root is typically called the magnitude of the complex number. Taking the square root is frequently unnecessary, as many of the properties of this quantity do not depend on this normalization. One such property is that for two complex numbers z and w, the norm of the product is the product of the norm.

\begin{align*} (zw)(zw)^* &= (ac -\ bd)^2 + (ad + bc)^2 \\ &= a^2c^2 -\ 2abcd + b^2d^2 + a^2d^2 + 2abcd + b^2c^2 \\ &= a^2c^2 + b^2d^2 + a^2d^2 + b^2c^2 = a^2(c^2 + d^2) + b^2(c^2 + d^2) \\ &= (a^2 + b^2)(c^2 + d^2) \end{align*}

Conjugation also distributes over multiplication, and complex numbers possess multiplicative inverses. Also, the norm of the multiplicative inverse is the inverse of the norm.

\begin{gather*} z^*w^* = (a -\ bi)(c -\ di) = ac -\ bci -\ adi + bdi^2 \\ = (ac -\ bd) -\ i(ad + bc) = (zw)^* \\ {1 \over z} = {z^* \over zz^*} = {a -\ bi \over a^2 + b^2} \\ \left({1 \over z}\right) \left({1 \over z}\right)^* = {1 \over zz^*} = {1 \over a^2 + b^2} \\ \end{gather*}

A final interesting note about complex multiplication is the product $z^*w$

z^* w = (a -\ bi)(c + di) = ac -\ bci + adi -\ bdi^2 = (ac + bd) + i(ad -\ bc)

The real part is the same as the dot product $(a, c) \cdot (b, d)$ and the imaginary part looks like the determinant of $\pmatrix{a & b \\ c & d}$. This shows the inherent value of complex arithmetic, as both of these quantities have important interpretations in vector algebra.

Onto the Circle

If $zz^* = a^2 + b^2 = 1$, then z lies on the complex unit circle. Since the norm is 1, a general complex number w has the same norm as the product zw. Thus, multiplication by z results in the rotation that maps 1 to z. But how do you get such a point on the unit circle?

Simple. There are four distinct complex numbers directly related to any complex number w that are guaranteed to share its norm: $w, -w, w^*$, and $-w^*$. Therefore, dividing any of these numbers by any other of these numbers results in a complex number on the unit circle. Division by -w is boring and just results in -1. However, division by $w^*$ turns out to be important:

\begin{align*} z &= {w \over w^*} = \left( {a + bi \over a -\ bi} \right) \left( {a + bi \over a + bi} \right) = 1 \\[8pt] &= {(a^2 -\ b^2) + (2ab)i \over a^2 + b^2} \end{align*}

Dividing the numerator and denominator of this expression through by $a^2$ yields an expression in $t = b/a$, where t can be interpreted as the slope of the line joining 0 and w.

\begin{align*} {(a^2 -\ b^2) + (2ab)i \over a^2 + b^2} &= {1/a^2 \over 1/a^2} \cdot {(a^2 -\ b^2) + (2ab)i \over a^2 + b^2} \\ &= {(1 -\ b^2/a^2) + (2b/a)i \over 1 + b^2/a^2} = {(1 -\ t^2) + (2t)i \over 1 + t^2} \\ &= {1 -\ t^2 \over 1 + t^2} + i{2t \over 1 + t^2} = z(t) \end{align*}

Slopes range from $-\infty$ to $\infty$, which means that the variable t does as well. Moreover, if z is on the unit circle, then its reciprocal is:

{1 \over z} = {z^* \over zz^*} = z^*

In other words, for points already on the unit circle, $z / z^* = z^2$ describes the rotation from 1 to $z^2$. This demonstrates the decomposition of the rotation into two parts. For a general complex number w which may not lie on the unit circle, multiplication by this number dilates the complex plane as well rotating 1 toward the direction of w. Then, division by $w^*$ undoes the dilation and performs the rotation again. The proper description of the rotation described by $w/w^*$ is the combination of these half steps.

Two Halves, Twice Over

Plugging in some values for t reveals something else: $z(0) = 1 + 0i$, $z(1) = 0 + 1i$, and $z(-1) = 0 -\ 1i$. This shows that the semicircle in the right half-plane exists where $t \in (-1, 1)$. The other half of the circle comes from t with magnitude greater than 1. The point where $z(t) = -1$ does not properly exist unless it is allowed that $t = \pm \infty$.

Is it possible to get the other half into the nicer interval? No problem, just double the rotation.

\begin{align*} z(t)^2 &= (\Re[z]^2 -\ \Im[z]^2) + i(2\Re[z]\Im[z]) \\[8pt] &= {(1 -\ t^2)^2 -\ (2t)^2 \over (1 + t^2)^2} + i{2(1 -\ t^2)(2t) \over (1 + t^2)^2} \\[8pt] &= {1 -\ 6t^2 + t^4 \over 1 + 2t^2 + t^4} +i {4t -\ 4t^3 \over 1 + 2t^2 + t^4} \\ \end{align*}

Now the entire circle fits in the interval $t \in (-1, 1)$, and $z(-1)^2 = z(1)^2 = -1$. This action of squaring z means that as t ranges from $-\infty$ to $\infty$, the point $z^2$ loops around the circle twice, since the rotation z is applied twice.

I’d now like to go on a brief tangent and compare these expressions to their transcendental trigonometric counterparts. Graphing the real and imaginary parts separately shows how much they resemble $\cos(\pi t)$ and $\sin(\pi t)$ around 0.

Incidentally, an approximation of a function as the ratio of two polynomials is called a Padé approximant. Specifically, the real part (which approximates cosine) has order [4/4] and the imaginary part (which approximates sine) has order [3/4]. Ideally, the zeroes of the real part should occur at $t = \pm 1/2$, since $\cos(\pm \pi/2) = 0$. Instead, they occur at

0 = 1 -\ 6t^2 + t^4 = (t^2 -\ 2t -\ 1)(t^2 + 2t -\ 1) \implies t = \pm {1 \over \delta_s} = \pm(\sqrt 2 -\ 1) \approx \pm0.414

With a bit of polynomial interpolation, this can be rectified. A cubic interpolation is given by:

\begin{gather*} P(x) = ax^3 + bx^2 + cx + d \\ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1/2 & 1/4 & 1/8 \\ 1 & -1/2 & 1/4 & -1/8 \\ 1 & 1 & 1 & 1 \\ \end{pmatrix} \begin{pmatrix} d \\ c \\ b \\ a \end{pmatrix} = \begin{pmatrix} 0 \\ \sqrt 2 -\ 1 \\ -\sqrt 2 + 1 \\ 1 \end{pmatrix} \\ \implies \begin{pmatrix} d \\ c \\ b \\ a \end{pmatrix} = \begin{pmatrix} 0 \\ -3 + 8\sqrt 2/3 \\ 0 \\ 4 -\ 8\sqrt 2/3 \end{pmatrix} \implies P(x) \approx 0.229 x^3 + 0.771x \end{gather*}

The error over the interval [-1, 1] in these approximations is around 1% (RMS).

Notably, this interpolating polynomial is entirely odd, which gives it some symmetry about 0. Along with being a very good approximation, it has the feature that a rotation by an n^th of a turn is about $z(P(1/n))^2$. The approximation be improved further by taking z to higher powers and deriving another interpolating polynomial.

It is impossible to shrink this error to 0 because the derivative of the imaginary part at $t = 1$ would be $\pi$ with no error. But the imaginary part is a rational expression (technically with the interpolating polynomial, a ratio of two irrational numbers), so its derivative is also a rational expression. Since $\pi$ is transcendental, there must always be some error.

Even though it is arguably easier to interpret, there probably isn’t a definite benefit to this approximation. For example,

Padé approximants can be calculated directly from the power series for sine and cosine
There are no inverse functions like acos or asin to complement these approximations.
- This isn’t that bad, since I have refrained from describing rotations with an angle. Even so, the inverse functions have their use cases
Trigonometric functions can be hardware-accelerated (at least in some FPUs)
If no such hardware exists, approximations of sin and cos can be calculated by software beforehand and stored in a lookup table (which is also probably involved at some stage of the FPU).

Elaborating on the latter two points, using a lookup table means evaluation happens in roughly constant time. A best-case analysis of the above approximation, given some value t is

\begin{matrix} \text{Expression} & \text{Additions} & \text{Multiplications} & \text{Divisions} \\ \hline p = t \cdot (0.228763834 \cdot t \cdot t + 0.771236166) & 1 & 3 \\[4pt] q = p \cdot p & & 1 \\[4pt] r = 1 + q & 1 \\[4pt] c = {1 -\ q \over r} & 1 & & 1 \\[4pt] s = {2 \cdot p \over r} & 2 & 1 & 1 \\[4pt] real = c \cdot c -\ s \cdot s & 1 & 2 \\[4pt] imag = 2 \cdot c \cdot s & & 2 \\ \hline \Sigma & 6 & 9 & 2 \\ \end{matrix}

Or 17 FLOPs. On a more optimistic note, a Monte Carlo test of the above approximation on my computer yields promising results when compared with the GCC libmath implementations of sin and cos (also tested with cexp).

Timing for 100000 math.h sin and cos:   5409619ns
Timing for 100000 approximations:       2408645ns
Approximation faster, speedup: 3000974ns (2.25x)
Squared error in cosines:
        Average: 0.000051 (0.713743% error)
        Largest: 0.000174 (1.320551% error)
                Input:          0.729202
                Value:          -0.659428
                Approximation:  -0.672634
Squared error in sines:
        Average: 0.000070 (0.835334% error)
        Largest: 0.000288 (1.698413% error)
                Input:          0.729202
                Value:          0.475669
                Approximation:  0.458685

For the source which I used to generate this output, see the repository linked below.

Even though results can be inaccurate, this exercise in the algebraic manipulation of complex numbers is fairly interesting since it requires no calculus to define, unlike sine, cosine, and their Padé approximants (and to a degree, $\pi$).

From Circles to Spheres

The expression for complex points on the unit circle coincides with the stereographic projection of a circle. This method is achieved by selecting a point on the circle, fixing t along a line (such as the y-axis), and letting t range over all possible values.

The equations of the line and circle appear in the diagram above. Through much algebra, expressions for x and y in t can be formed.

\begin{align*} y &= t(x + 1) \\ 1 &= x^2 + y^2 = x^2 + (tx + t)^2 \\ &= x^2 + t^2x^2 + 2t^2 x + t^2 \\ {1 \over t^2 + 1} -\ {t^2 \over t^2 + 1} &= x^2 + {t^2 \over t^2 + 1} 2x \\ {1 -\ t^2 \over t^2 + 1} + \left( {t^2 \over t^2 + 1} \right)^2 &= \left(x + {t^2 \over t^2 + 1} \right)^2 + {t^2 \over t^2 + 1} \\ {1 -\ t^4 \over (t^2 + 1)^2} + {t^4 \over (t^2 + 1)^2} &= \left(x + {t^2 \over t^2 + 1} \right)^2 \\ {1 \over (t^2 + 1)^2} &= \left(x + {t^2 \over t^2 + 1} \right)^2 \\ {1 \over t^2 + 1} &= x + {t^2 \over t^2 + 1} \\ x &= {1 -\ t^2 \over 1 + t^2} \\ y &= t(x + 1) = t\left( {1 -\ t^2 \over 1 + t^2} + {1 + t^2 \over 1 + t^2} \right) \\ &= {2t \over 1 + t^2} \end{align*}

These are exactly the same expressions which appear in the real and imaginary components of the ratio between $1 + ti$ and its conjugate. Compared to the algebra using complex numbers, this is quite a bit more work. Therefore, one might also ask whether, given a proper arithmetic setting, the projection of a sphere might be obtained in the same manner as the circle with respect to complex numbers.

What we are **not** doing: intersecting the sphere with a line through a pole and a point in the xy plane

Algebraic Projection

The simplest thing to do is try it out. Let’s say an extended number $h = 1 + si + tj$ has a conjugate of $h^* = 1 -\ si -\ tj$. It’s a bit of an assumption to be able to take the reciprocal of the latter (we don’t know whether i and j create a field or division ring), but following our noses:

\begin{gather*} {h \over h^*} = {1 + si + tj \over 1 -\ si -\ tj} = \left({1 + si + tj \over 1 -\ si -\ tj}\right) \left({1 + si + tj \over 1 + si + tj}\right) = \\[8pt] { 1 + si + tj + si + s^2i^2 + stij + tj + stji + t^2j^2 \over 1 -\ si -\ tj + si -\ s^2 i^2 -\ stij + tj -\ stji -\ t^2j^2 } \end{gather*}

While there is some cancellation in the denominator, both products are rather messy. To get more cancellation, we apply some nice algebraic properties between i and j. For the $stij$ terms to cancel, we let i and j be anticommutative, meaning that $ij = -ji$. So that the denominator is totally real (and therefore can be guaranteed to divide), we also apply the constraint that $i^2$ and $j^2$ are both real. Then the expression becomes

{1 + si + tj \over 1 -\ si -\ tj} = {1 + s^2i^2 + t^2j^2 \over 1 -\ s^2 i^2 -\ t^2j^2} + i{2s \over 1 -\ s^2 i^2 -\ t^2j^2} + j{2t \over 1 -\ s^2 i^2 -\ t^2j^2}

If it is chosen that $i^2 = j^2 = -1$, this produces the correct equations (Wikipedia) parametrizing the unit sphere. One can also check that the squares of the real, i, and j components sum to 1.

If both i and j anticommute, then their product also anticommutes with both i and j.

\textcolor{red}{ij}j = -j\textcolor{red}{ij} ~,~ i\textcolor{red}{ij} = -\textcolor{red}{ij}i

Calling this product k and noticing that $k^2 = (ij)(ij) = -ijji = i^2 = -1$, this completely characterizes the quaternions.

Choosing different values for $i^2$ and $j^2$ yield different shapes than a sphere. If you know a little group theory, you might know there are only two nonabelian (noncommutative) groups of order 8: the quaternion group and the dihedral group of degree 4. In the latter group, j and k are both imaginary, but square to 1 (i still squares to -1). I’m being a bit careless with the meanings of “1” and “-1” here, but these objects come from the additive structure of the algebra, rather than the multiplicative.

Changing the sign of one (or both) of the imaginary squares in the expression $h/h^*$ above switches the multiplicative structure from quaternions to the dihedral group. In this group, picking $i^2 = -j^2 = -1$ parametrizes a hyperboloid of one sheet, and picking $i^2 = j^2 = 1$ parametrizes a hyperboloid of two sheets.

Quaternions and Rotation

Complex numbers are useful for describing 2D rotations and elegantly describe the stereographic projection of a circle. Consequently, since quaternions elegantly describe the stereographic projection of a sphere, they are useful for 3D rotations.

Since the imaginary units i, j, and k are anticommutative, a general quaternion does not commute with other quaternions like complex numbers do. This embodies a difficulty with 3D rotations in general: unlike 2D rotations, they do not commute.

Only three of the four components in a quaternion are necessary to describe a point in 3D space. The three imaginary axes are symmetric in some way (i.e., they all square to -1), so we use the ijk subspace. Quaternions in this subspace are called vectors. Another reason for using the ijk subspace comes from considering a point $u = ai + bj + ck$ on the unit sphere ($a^2 + b^2 + c^2 = 1$). Then the square of this point is:

\begin{align*} u^2 &= (ai + bj + ck)(ai + bj + ck) \\ &= a^2i^2 + abij + acik + abji + b^2j^2 + bcjk + acki + bckj + c^2k^2 \\ &= (a^2 + b^2 + c^2)(-1) + ab(ij + ji) + ac(ik + ki) + bc(jk + kj) \\ &= -1 + 0 + 0 + 0 \end{align*}

This property means that u behaves similarly to a typical imaginary unit. If we form pseudo-complex numbers of the form $a + b u$, then ${1 + tu \over 1 -\ tu} = \alpha + \beta u$ specifies some kind of rotation in terms of the parameter t. Identically as with complex numbers, the inverse rotation is the conjugate $1 -\ tu \over 1 + tu$.

Another useful feature of quaternion algebra deals with symmetry transformations of the imaginary units. If an imaginary unit (one of i, j, k) is left-multiplied by a quaternion $q$ and right-multiplied by its conjugate $q^*$, then the result is still imaginary. In other words, if p is a vector, then $qpq^*$ is also a vector since i, j, and k form a basis and multiplication distributes over addition. I will not demonstrate this fact, as it requires a large amount of algebra.

These two features combine to characterize a transformation for a vector quaternion. For a vector p, if $q_u(t)$ is a “rotation” as above for a point u on the unit sphere, then another vector (the image of p) can be described by

\begin{gather*} qpq^* = qpq^{-1} = \left({1 + tu \over 1 -\ tu}\right) p \left({1 -\ tu \over 1 + tu}\right) = (\alpha + \beta u)p(\alpha -\ \beta u) \\[4pt] = (\alpha + \beta u)(\alpha p -\ \beta pu) = \alpha^2 p -\ \alpha \beta pu + \alpha \beta up -\ \beta^2 upu \end{gather*}

Testing a Transformation

If this transformation is a rotation, then it should recreate the 2D case which can be described more simply by complex numbers. For example, if $u = k$ and $p = xi + yj$, then this can be expanded as:

\begin{align*} qpq^{-1} &= (\alpha + \beta k)(xi + yj)(\alpha -\ \beta k) \vphantom{\over} \\ &= \alpha^2(xi + yj) -\ \alpha \beta (xi + yj)k + \alpha \beta k(xi + yj) -\ \beta^2 k(xi + yj)k \vphantom{\over} \\ &= \alpha^2(xi + yj) -\ \alpha \beta(-xj + yi) + \alpha \beta(xj -\ yi) -\ \beta^2(xi + yj) \vphantom{\over} \\ &= (\alpha^2 -\ \beta^2)(xi + yj) + 2\alpha \beta(xj - yi) \vphantom{\over} \\ &= [(\alpha^2 -\ \beta^2)x -\ 2\alpha \beta y]i + [(\alpha^2 -\ \beta^2)y + 2\alpha \beta x]j \end{align*}

If p is rewritten as a vector (in the linear algebra sense), then this final expression can be rewritten as a linear transformation of p:

\begin{pmatrix} (\alpha^2 -\ \beta^2)x -\ 2 \alpha \beta y \\ (\alpha^2 -\ \beta^2)y + 2 \alpha \beta x \end{pmatrix} = \begin{pmatrix} \alpha^2 -\ \beta^2 & -2\alpha \beta \\ 2\alpha \beta & \alpha^2 -\ \beta^2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}

The matrix which appears above is the square of the matrix $\pmatrix{\alpha & -\beta \\ \beta & \alpha}$. This is isomorphic to the complex number $\alpha + \beta i$. Since $\alpha$ and $\beta$ were chosen so to lie on the (pseudo-) complex unit circle, this means that the vector (x, y) is rotated by $\alpha + \beta i$ twice. This also demonstrates that u specifies the axis of rotation, or equivalently, the plane of a great circle. This means that the “some kind of rotation” specified by $q = {1 + tu \over 1 -\ tu}$ is a rotation around the great circle normal to u.

The double rotation resembles the half-steps of rotation in the complex numbers, where $1 + ti$ performs a dilation, ${1 \over 1 -\ ti}$ reverses it, and together they describe a rotation. Since quaternions do not commute, sandwiching p between q and its conjugate makes the algebra simpler. The product of a vector with itself should be totally real, as shown above (and equal to the negative of the norm of p, the sum of the squares of its components). The rotated p shares the same norm as p, so it should equal $p^2$. Indeed, this is the case, as

(qpq^{-1})^2 = (qpq^{-1})(qpq^{-1}) = q p p q^{-1} = p^2 q q^{-1} = p^2

As a final remark, due to rotation through quaternions being doubled, the interpolating polynomial used above to smooth out double rotations can also be used here. That is, $q_u(P(t))pq_u^{-1}(P(t))$ rotates p by (approximately) the fraction of a turn specified by t through the great circle which intersects the plane normal to u.

Closing

It is difficult to find a treatment of rotations which does not hesitate to use the complex exponential $e^{ix} = \cos(x) + i \sin(x)$. The true criterion for rotations is simply that a point lies on the unit circle. Perhaps this contributes to why understanding the 3D situation with quaternions is challenging for many. Past the barrier to entry, I believe them to be rather intuitive. I have outlined some futher benefits of this approach in this post.

As a disclaimer, this post was inspired by this video by 3blue1brown on YouTube (at least subconsciously; I had not seen the video in years before checking that I wasn’t just plagiarizing it). It also uses stereographic projections of the circle and sphere to describe rotations in the complex plane and quaternion vector space. However, I feel like it fails to provide an algebraic motivation for quaternions, or even stereography in the first place. Hopefully, my remarks on the algebraic approach can be used to augment the information in the video.

Diagrams created with GeoGebra. Repository with approximations (as well as GeoGebra files) available here.