🧭

What if your GPS couldn't tell left from right?

Imagine giving someone directions, but every time you say "turn left," there's a 50-50 chance they hear "turn right." That's basically what happens when computers try to find the hidden directions inside data, and this paper figured out how to fix it.

⬇️ Scroll to begin the adventure ⬇️
🧱

But First, You Need to Know...

The building blocks that make this paper make sense

🎯 What Are Eigenvectors?

You know how when you squeeze a ball of Play-Doh, it stretches in one direction and squishes in another? Well, data can do the same thing! When you have a big table of numbers (like measurements from sensors, or prices of things over time), the data naturally "stretches" in certain hidden directions.

Eigenvectors are those special stretching directions. Think of them as invisible arrows inside your data that say: "THIS is the most important direction!"

And each eigenvector has a buddy called an eigenvalue: a number that tells you how much the data stretches in that direction. Bigger eigenvalue = more important direction.
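To make this concrete, here's a minimal sketch (plain numpy, not from the paper) that finds the stretch directions of a toy 2D covariance matrix:

```python
import numpy as np

# A toy covariance matrix whose data "stretches" along the (1, 1) diagonal.
cov = np.array([[2.0, 1.5],
                [1.5, 2.0]])

# For symmetric matrices, eigh returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

print(eigenvalues)         # [0.5, 3.5] -> the big stretch is 3.5
print(eigenvectors[:, 1])  # the most important direction (up to a sign flip!)
```

The "(up to a sign flip!)" in that last comment is exactly the ambiguity the rest of this page is about.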

🔄 What's a Rotation?

Imagine you're standing in a room pointing at a poster on the wall. Now spin yourself 45°. You're still standing in the same spot, but you're pointing in a new direction. That's a rotation: you change direction without flipping anything.

Key fact: A rotation never creates a mirror image. Everything stays "the same handedness." Your left hand is still on your left side!

🪞 What's a Reflection?

Now imagine you're looking in a mirror. Your reflection is flipped: your left hand appears to be on the right side. That's a reflection. In math, a reflection flips one direction while keeping the others the same.

The determinant of a matrix is like a handedness detector: if it equals +1, the matrix is a pure rotation (no mirror flip). If it equals -1, there's at least one hidden reflection inside.
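A quick numerical check of the handedness detector (an illustrative numpy sketch, not code from the paper):

```python
import numpy as np

theta = np.deg2rad(45)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])  # spin by 45 degrees
reflection = np.diag([1.0, -1.0])                       # mirror across the x-axis

print(round(np.linalg.det(rotation), 6))    # 1.0  -> pure rotation
print(round(np.linalg.det(reflection), 6))  # -1.0 -> hidden mirror flip
```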

๐Ÿ“ What's a Givens Rotation?

Imagine a lazy Susan on a dinner table. It only spins in one flat plane. A Givens rotation is exactly that โ€” a rotation that only acts in one 2D plane at a time, leaving everything else untouched. To rotate something in 3D or higher, you chain together multiple Givens rotations, one plane after another. Like adjusting a satellite dish: first tilt left-right, then tilt up-down.
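One way to sketch a Givens rotation in code (a generic construction with my own helper name, not the thucyd API):

```python
import numpy as np

def givens(n, i, j, theta):
    """An n x n identity, except for a 2x2 rotation in the (i, j) plane."""
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i], g[j, j] = c, c
    g[i, j], g[j, i] = -s, s
    return g

v = np.array([1.0, 0.0, 5.0, 7.0])
g = givens(4, 0, 1, np.deg2rad(90))     # spin the (0, 1) plane by 90 degrees
print(np.round(g @ v, 12))              # [0. 1. 5. 7.] -- components 2, 3 untouched
```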

๐Ÿ“ Arcsin vs. Arctan2: Two Ways to Measure Angles

The arcsin method is like a protractor that only shows you 0ยฐ to 90ยฐ (and their negatives). It covers a ฯ€ interval โ€” that's half a circle, or 180ยฐ.

The arctan2 method is like a full compass โ€” it shows you all 360ยฐ (a 2ฯ€ interval). It can tell the difference between pointing northeast vs. southwest, which arcsin cannot!

arcsin: 180ยฐ Half the picture ๐Ÿคท arctan2: 360ยฐ The FULL picture ๐ŸŽฏ

🧠 Quick Check!

An eigenvector with a BIG eigenvalue means the data stretches...

🤔

The Great Eigenvector Mix-Up

Why computers get confused about which way arrows point

🎲 The Sign Ambiguity Problem

Here's a wild fact: when you ask a computer to find eigenvectors (using functions called svd or eig), the arrows it returns can randomly point forward or backward. It's like getting compass directions where north might randomly show up as south! 😱

Mathematically, if v is an eigenvector, then -v (the exact opposite direction) is also a valid eigenvector. The computer just picks one arbitrarily, and the pick can change from run to run. This is called sign ambiguity.
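You can check the ambiguity directly: both v and -v pass the eigenvector test (a numpy illustration, not from the paper):

```python
import numpy as np

a = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eigh(a)
lam, v = eigenvalues[0], eigenvectors[:, 0]

# A v = lambda v holds for v AND for -v, so the solver's choice is arbitrary.
print(np.allclose(a @ v, lam * v))       # True
print(np.allclose(a @ (-v), lam * -v))   # True
```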

📈 Why This Matters: Watching Data Over Time

Imagine you're tracking how seven currency markets move every day (that's exactly what this paper does!). Each day, you compute eigenvectors to understand the hidden patterns. But if the arrows randomly flip day to day, your analysis looks like a seismograph during an earthquake, even when nothing interesting is happening!

The original algorithm (from an earlier paper by the same author) fixed this by making all eigenvectors consistently oriented. It used the arcsin method to compute the rotation angles. But there was a catch...

🌯 The Angular Wrap-Around Problem

Because the arcsin method only covers 180° (half a circle), some vectors that were actually close together in space appeared to be on opposite sides of the measurement window. It's like trying to track something on a globe but only being allowed to see one hemisphere: objects near the equator can seem to "jump" from one edge to the other!

Example: An eigenvector angle of 89° (just inside the window's edge) wobbles to 91° due to noise. But the arcsin convention can't represent 91°: the sign-fixing flip maps it to -89°! That's a jump of 178° even though the actual change was only 2°. YIKES!
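Here's a sketch of that jump. It assumes, as a stand-in for the original algorithm, that the arcsin convention flips any vector whose leading component goes negative before reading the angle:

```python
import numpy as np

def arcsin_reading(theta_deg):
    """Old-style reading: flip the vector if it points 'backward', then arcsin."""
    t = np.deg2rad(theta_deg)
    v = np.array([np.cos(t), np.sin(t)])
    if v[0] < 0:
        v = -v  # the sign-fixing reflection that causes the wrap
    return np.degrees(np.arcsin(v[1]))

def arctan2_reading(theta_deg):
    t = np.deg2rad(theta_deg)
    return np.degrees(np.arctan2(np.sin(t), np.cos(t)))

print(round(arcsin_reading(89), 6), round(arcsin_reading(91), 6))    # 89.0 -89.0
print(round(arctan2_reading(89), 6), round(arctan2_reading(91), 6))  # 89.0 91.0
```

The arcsin reading jumps by 178° while arctan2 reports the true 2° wobble.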

🟠 arcsin method

Range: -90° to +90°

Wrap-around: YES 😰

Angles: All "minor" (small)

🟢 modified arctan2

Range: -180° to +180°

Wrap-around: NO 🎉

First angles: "major" (full circle)

🧠 Checkpoint!

The main problem with the arcsin method is that it only covers...

💡

The Big Idea: A "Modified arctan2" Method

How a new perspective on handedness unlocked the full 360°

🧩 The Key Insight: Think Holistically!

The original algorithm was "opportunistic": at each step, if an eigenvector pointed the "wrong" way, it would immediately flip it with a reflection. The new approach is holistic: it looks at the whole matrix's handedness (like checking whether the whole coordinate system is left-handed or right-handed) before deciding what to do.

The breakthrough: Instead of reflecting an eigenvector when it points the wrong way, you can rotate it through a major angle (more than 90°) to get it where it needs to go. This turns a "reflection + small rotation" into a single "big rotation."

🔑 Reducible vs. Irreducible: Where the Magic Happens

Imagine you have 7 arrows to align (like in the paper's 7-currency example). You work through them one at a time:

The first 6 arrows are in reducible subspaces: there's still room to rotate them into place. Think of it like having a swivel joint: you CAN rotate. For these, the new algorithm uses the modified arctan2 (full 360° angles) for the first rotation in each subspace.

The 7th (last) arrow is in an irreducible subspace: there are no more axes to rotate around. It's like being stuck in a one-way corridor: you can only go forward or backward. If it's pointing the wrong way, the ONLY option is a reflection.

Result: The new reflection matrix is simply S_tan = diag(1, 1, ..., ±1). At most ONE reflection, and only in the last position! 🎯
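The determinant explains why one reflection is always enough: flipping two signs at once is secretly a rotation, so sign flips can be paired off until at most one remains (a numpy check, not from the paper):

```python
import numpy as np

# Flipping TWO axes has det = +1: it's just a 180-degree rotation,
# so the "holistic" view absorbs paired flips into rotations.
print(round(np.linalg.det(np.diag([-1.0, -1.0, 1.0])), 6))  # 1.0

# Flipping ONE axis has det = -1: a genuine reflection. This is the
# single +/-1 allowed in the last slot of S_tan = diag(1, ..., +/-1).
print(round(np.linalg.det(np.diag([1.0, 1.0, -1.0])), 6))   # -1.0
```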

๐Ÿ“ The Math (Made Friendly!)

The system of equations the algorithm solves looks like this (in 4D):

cos(θ₂)·cos(θ₃)·cos(θ₄) = a₁
sin(θ₂)·cos(θ₃)·cos(θ₄) = a₂
sin(θ₃)·cos(θ₄) = a₃
sin(θ₄) = a₄

The original algorithm solves this bottom-up using arcsin. The new one solves it with the modified arctan2:

θ₁,₂ = arctan2(a₂, a₁)   ← full 360°!
θ₁,₃ = arctan2(a₃, |a₂·csc(θ₂)|)   ← 180° only
θ₁,₄ = arctan2(a₄, |a₃·csc(θ₃)|)   ← 180° only

Only the first angle gets the full 360° treatment. The rest are guaranteed to have a positive projection on the main axis, so they're inherently limited to 180°. This is what makes the method work without breaking!
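To check the recipe end to end, here's a hypothetical 4D round trip (plain numpy; the variable names are mine, and it assumes the sines it divides by are nonzero):

```python
import numpy as np

# Forward: pick angles (t2 is deliberately "major"), build a from the system above.
t2, t3, t4 = np.deg2rad([200.0, 30.0, -40.0])
a = np.array([np.cos(t2) * np.cos(t3) * np.cos(t4),
              np.sin(t2) * np.cos(t3) * np.cos(t4),
              np.sin(t3) * np.cos(t4),
              np.sin(t4)])

# Inverse: the modified-arctan2 solve from the text.
r2 = np.arctan2(a[1], a[0])                    # full 360 degrees
r3 = np.arctan2(a[2], abs(a[1] / np.sin(r2)))  # 180 degrees only
r4 = np.arctan2(a[3], abs(a[2] / np.sin(r3)))  # 180 degrees only

# -160 is the same direction as 200 (mod 360): the major angle survives intact.
print(np.degrees([r2, r3, r4]).round(6))  # [-160.   30.  -40.]
```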

🔢 How Many Angles Are There?

For a system with N dimensions, the total number of embedded rotation angles is:

N × (N - 1) / 2
N = 7 → 21 angles (that's like 21 little compass settings!)
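That count is just "one angle per pair of axes," which is easy to sanity-check:

```python
def total_angles(n):
    # One Givens angle for every pair of axes in n dimensions.
    return n * (n - 1) // 2

print(total_angles(7))  # 21, the paper's 7-currency case
print(total_angles(3))  # 3: the familiar roll, pitch, and yaw of 3D
```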

The arcsin Method

✅ Checks each eigenvector one-by-one, flips if needed (s_k = -1)

✅ All angles are "minor" (within 180°)

✅ Best for regression: stabilizes the sign of the regression weights β

⚠️ Angular wrap-around on the π interval

⚠️ Can't distinguish directions on opposite sides of the hemisphere

📦 Reflection matrix: S = diag(±1, ±1, ..., ±1), many possible reflections

The Modified arctan2 Method

✅ Takes a holistic view of handedness

✅ First rotation in each subspace spans the full 360°

✅ Best for directional statistics: no angular wrap-around!

✅ Can reveal outliers hidden by wrap-around

📦 Reflection matrix: S_tan = diag(1, 1, ..., ±1), at most ONE reflection

⚠️ Major rotations can create different sign patterns than arcsin

When to Use Which?

Use arcsin when...

You need stable signs for regression weights (β)

Downstream calculations depend on eigenvector signs

Use arctan2 when...

You need to track pointing directions over time

You want full angular disambiguation

You need to spot outliers

🧠 Quick Quiz!

In the modified arctan2 method, the maximum number of reflections needed is...

📊

The Real-World Test: Currency Markets!

7 currencies, 23 days, hundreds of thousands of data points

💱 The Dataset

The author analyzed futures data from the Chicago Mercantile Exchange (CME), one of the biggest financial markets in the world! 🌎

FX pairs: 7
Business days: 23
Data columns: 14

The seven currency pairs were:

🇪🇺/🇺🇸 EURUSD · 🇺🇸/🇯🇵 USDJPY · 🇬🇧/🇺🇸 GBPUSD · 🇺🇸/🇨🇭 USDCHF · 🇦🇺/🇺🇸 AUDUSD · 🇳🇿/🇺🇸 NZDUSD · 🇨🇦/🇺🇸 CADUSD

Each day had 14 columns: 7 for price quotes (bid-offer mid-price) and 7 for signed trades. After careful filtering, centering, standardizing, and mapping through a copula to Gaussian distributions, the data was ready for eigenanalysis via SVD (Singular Value Decomposition, a way to find eigenvectors and eigenvalues).

๐Ÿ“ The Angle Matrix

For 7 dimensions, the N(N-1)/2 = 21 rotation angles are organized into a triangular matrix. Here's what it looks like:

0 θ₁₂ θ₁₃ θ₁₄ θ₁₅ θ₁₆ θ₁₇ ← mode 1 (6 angles)
   0 θ₂₃ θ₂₄ θ₂₅ θ₂₆ θ₂₇ ← mode 2 (5 angles)
      0 θ₃₄ θ₃₅ θ₃₆ θ₃₇ ← mode 3 (4 angles)
         0 θ₄₅ θ₄₆ θ₄₇ ← mode 4 (3 angles)
            0 θ₅₆ θ₅₇ ← mode 5 (2 angles)
               0 θ₆₇ ← mode 6 (1 angle)
                  0   ← mode 7 (irreducible!)

The first angle in each row (θ₁₂, θ₂₃, θ₃₄, etc.) can span the full 360° with the new method. All the other angles stay within 180°.

📊 The Participation Score

How do you measure whether an eigenvector represents a "team effort" across all currencies, or just one dominating player? Enter the participation score (PS)!

It's calculated from the Inverse Participation Ratio (IPR): IPR = Σᵢ vᵢ⁴, and then PS = 1/(N × IPR).

All equal: PS = 1.0
Good mix: PS ≈ 0.7
One dominates: PS = 1/7 ≈ 0.14

High PS = all currencies participate equally. Low PS = one currency dominates.
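A small sketch of the score (my own helper name, following the IPR formula above):

```python
import numpy as np

def participation_score(v):
    """PS = 1 / (N * IPR), with IPR = sum(v_i**4) for a unit-length v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)   # eigenvectors are unit length
    ipr = np.sum(v ** 4)
    return 1.0 / (v.size * ipr)

print(round(participation_score(np.ones(7)), 6))               # 1.0  -> everyone plays
print(round(participation_score([1, 0, 0, 0, 0, 0, 0]), 2))    # 0.14 -> one star carries
```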

🔬

The Results: This Is Where It Gets WILD 🔥

What happened when the new algorithm was unleashed on real data

🎯 Finding #1: Informative vs. Noisy Modes, Revealed!

The BIGGEST result: switching from arcsin to modified arctan2 let the researchers clearly see which eigenmodes carry real information and which are just noise.

📈 Quotes (Price Data)

Mode 1: Highly directed ✨
Mode 2: Somewhat directed
Mode 3: Somewhat directed
Modes 4–6: Random scatter 🎲

📊 Trades (Buy/Sell Data)

Mode 1: Highly directed ✨
Modes 2–6: All random 🎲

🎯 Finding #2: Random Matrix Theory Agrees!

The Marčenko–Pastur (MP) distribution is like a "noise detector" from a branch of math called Random Matrix Theory (RMT). It predicts exactly where eigenvalues would land if your data were pure random noise.

The MP distribution has edges at: λ± = (1 ± √q)², where q = N/T (features divided by records).
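The edge formula is one line of code. The sizes below are made up just to show the shape of the test; the point is only that an eigenvalue beyond the upper edge is too big to be noise:

```python
import numpy as np

def mp_edges(n_features, n_records):
    """Marchenko-Pastur bulk edges for q = N / T."""
    q = n_features / n_records
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

low, high = mp_edges(7, 1000)          # hypothetical N and T
print(round(low, 3), round(high, 3))   # the boundaries of the "just noise" zone
print(3.2 > high)                      # True: such an eigenvalue carries signal
```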

The match was excellent! Quote modes 1–3 and trade mode 1 had eigenvalues outside the MP distribution (= real information!). All other modes fell inside the distribution (= indistinguishable from noise). This perfectly matched what the pointing directions showed! 🎯

📈 Quotes: 3 modes outside MP (modes 1, 2, 3 = informative); 4 modes inside MP (modes 4–7 = noise)

📊 Trades: 1 mode outside MP (mode 1 = informative); 6 modes inside MP (modes 2–7 = noise)

๐Ÿ•ต๏ธ Finding #3: The CPI Outlier โ€” Caught!

On May 10, 2023, the US Consumer Price Index (CPI) was released โ€” a major economic event that markets were watching closely. The EURUSD currency pair reacted anomalously compared to the other six pairs.

With the old arcsin method, this outlier was hidden by angular wrap-around. With the new arctan2 method, it stuck out like a sore thumb! ๐Ÿ”ด The paper highlights it with a red circle in quote modes 2 and 3.

Real-world impact: Being able to spot outliers like this helps analysts separate "normal" market behavior from event-driven jumps. Removing EURUSD from the panel restored the outlier point to the normal cluster. That's powerful diagnostics!

🎯 Finding #4: The Haar Measure Connection

Random Matrix Theory also says that eigenvectors of purely random matrices should point uniformly in all directions on a hypersphere; this is called the Haar measure. Think of it like throwing darts at a globe: they should land everywhere equally.

The noisy modes (4–6 for quotes, 2–6 for trades) scattered uniformly across the full 360° circle with the new method, exactly what the Haar measure predicts! The informative modes, by contrast, clustered in specific directions.

🧠 Pop Quiz!

The May 10, 2023 CPI outlier was visible with...

⚖️

Stabilizing the Eigenvectors

Making the arrows stop wobbling: two clever methods

🔄 Dynamic Stabilization: Averaging the Wobble

You know how a camera stabilizer keeps your video smooth even when your hand shakes? Dynamic stabilization does the same for eigenvectors!

The trick is to stack eigenvectors from multiple time steps end-to-end (like laying arrows tip-to-tail) and measure the resultant direction. This is the correct way to average directions! (Simply averaging angles would give wrong answers.)
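Why simple angle-averaging fails, and why the tip-to-tail resultant works (a standalone numpy illustration):

```python
import numpy as np

# Two directions only 20 degrees apart, straddling the +/-180 seam.
angles = np.deg2rad([170.0, -170.0])

# Naive mean of the raw angle values points the OPPOSITE way:
print(np.degrees(angles.mean()))  # 0.0

# Tip-to-tail: add the unit vectors, then read the resultant's direction.
resultant = np.arctan2(np.sin(angles).sum(), np.cos(angles).sum())
print(round(float(np.degrees(resultant)), 6))  # 180.0 -- the correct average
```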

The paper used a 5-point causal filter with weights h[n] = [1, 2, 3, 2, 1]/9 (made by convolving a 3-point box filter with itself). This gives an average delay of just 2 days.
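The filter weights can be reproduced exactly by convolving the box filter with itself, mirroring the paper's description:

```python
import numpy as np

box = np.array([1.0, 1.0, 1.0])          # 3-point box filter
weights = np.convolve(box, box) / 9.0    # h[n] = [1, 2, 3, 2, 1] / 9

print(weights * 9)                # [1. 2. 3. 2. 1.]
print(round(weights.sum(), 6))    # 1.0 -> unit gain, so averages aren't rescaled
```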

Results: After filtering, the scatter of pointing directions was visibly reduced. Participation scores tightened and slightly increased. The wobble was tamed! 🎉

🔒 Static Stabilization: Locking Down the Noise

For the noisy modes (whose directions are random and meaningless), why bother tracking their directions at all? Static stabilization sets all their rotation angles to zero, which is like saying: "Stop pointing randomly and just align with the identity matrix."

For quotes, this means keeping angles for modes 1–3 and zeroing out modes 4–7. The result: V_modal = R₁R₂R₃ · I (modes 4, 5, 6 become identity rotations).

Important distinction from PCA: In PCA, you throw away noisy modes entirely. With static stabilization, all modes stay; the noisy ones just get a fixed, stable direction. The eigenvector matrix remains full rank!

📉 The Impact on Correlations

When you stabilize the eigenvectors and reconstruct the correlation matrix (using Corr(P) = V̄ Λ̄ V̄ᵀ, with normalization), the structure becomes cleaner and more persistent over time.

Original (lots of structure) → dynamically stabilized → dynamic + static (cleanest! ✨)

The pairwise correlation dispersion (scatter) decreased at each stage. Blue circles (original) had the widest spread. Gray circles (dynamically stabilized) were tighter. Orange circles (fully stabilized) were the tightest!

🔗 Connection to Ledoit–Wolf Shrinkage

Ledoit–Wolf shrinkage is a famous technique that blends a noisy correlation matrix with an identity matrix: Σ_shr = α·Σ̂ + (1-α)·I, where α controls how much to trust the data.

Static stabilization is like an even smarter version of this. Instead of shrinking the whole matrix, you can:

  1. Rotate away the informative modes (they're good โ€” keep them!)
  2. Apply Ledoitโ€“Wolf shrinkage to just the noisy modes
  3. Rotate the informative modes back

The effective shrinkage for static stabilization is α = 0: maximum shrinkage on the noisy part!
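As a sketch (generic numpy with a made-up 2x2 correlation matrix, not the paper's data), the blend looks like this:

```python
import numpy as np

def shrink(corr, alpha):
    """Ledoit-Wolf-style blend: alpha * corr + (1 - alpha) * I."""
    return alpha * corr + (1 - alpha) * np.eye(corr.shape[0])

noisy = np.array([[1.0, 0.6],
                  [0.6, 1.0]])

print(shrink(noisy, 0.5)[0, 1])  # 0.3 -> off-diagonal pulled halfway to zero
print(shrink(noisy, 0.0))        # alpha = 0: maximum shrinkage, pure identity
```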

🧠 Knowledge Check!

Static stabilization differs from PCA because...

🌍

Why This Matters (Even to You!)

🎮 The Big Picture

Any time a computer needs to find patterns in data, whether it's currencies, brain scans, climate data, or even video game physics, it often relies on eigenanalysis. And every single time, those pesky sign flips and angular ambiguities can mess things up.

This paper gave us two tools for two different jobs:

🔧 arcsin method

Stabilize signs

→ Better for regression, predictions, interpretability

🔧 modified arctan2

Full directional tracking

→ Better for spotting patterns, outliers, understanding how systems evolve

Plus, the eigenvector stabilization techniques (dynamic filtering + static locking) help clean up noisy correlation matrices, which is useful EVERYWHERE, from weather prediction to stock portfolios to music recommendation algorithms! 🎵

🐍 The Code Is Free!

Everything is implemented in the thucyd Python package (version 0.2.5+), available for free on PyPI and conda-forge. The source code is on GitLab under the Apache 2.0 license. Anyone can use it!

FREE and open source · 2 methods available · written in Python

🎮

Interactive Explorer

Worked numbers you can check yourself!

🌀 Angle Range Visualizer

Here's how the arcsin and arctan2 methods handle the same angle differently. At an angle of 100°:

arcsin sees: 80° (wrapped around! ⚠️)
arctan2 sees: 100° (correct! ✅)

๐Ÿ“ Dimension Explorer

See how the number of angles, modes, and the noise boundary change with dimension size:

21
Total Angles
6
Reducible Modes
6
Full-360ยฐ Angles
15
Half-Circle Angles
21 angles is like having 21 different compass settings to describe the orientation of your data!
📖

Big Words I Learned! 🏆

Your collectible trading cards of science terms

Eigenvector
A special direction in which data naturally stretches or compresses.
🎯 Like finding the strongest wind direction in a storm.
Eigenvalue
A number that says how much the data stretches along an eigenvector.
💨 How FAST the wind blows in that direction.
SVD (Singular Value Decomposition)
A method to break any data table into eigenvectors, eigenvalues, and projections.
🔬 Like splitting white light into a rainbow with a prism.
Givens Rotation
A rotation that only acts in one 2D plane at a time.
🎡 Like turning a single dial on a combination lock.
Determinant
A single number from a matrix: +1 means pure rotation, -1 means there's a hidden reflection.
🪞 A handedness detector: right-hand or left-hand?
Orthogonal / Orthonormal
Vectors at perfect right angles (orthogonal) with length 1 (orthonormal).
📐 Like the three axes of a perfectly aligned 3D printer.
Marčenko–Pastur Distribution
The expected shape of eigenvalues from a purely random matrix.
🎰 The "this is just noise" zone.
Haar Measure
The uniform distribution of random eigenvectors on a hypersphere.
🎯 Darts thrown evenly across the surface of a globe.
Participation Score
Measures how evenly the elements of an eigenvector contribute. PS=1 means all equal.
⚽ A team sport score: 1.0 = everyone plays, 0.14 = one star carries.
Reducible Subspace
A subspace where rotation can still be applied to align a vector.
🚪 A room with a spinning door; you CAN rotate.
Irreducible Subspace
The last 1D subspace where no rotation is possible, only reflection.
🚧 A dead-end hallway; you can only go forward or backward.
Ledoit–Wolf Shrinkage
A technique to denoise correlation matrices by blending them with the identity matrix.
🎚️ Like a volume knob between "trust the data" and "assume nothing."
Copula
A mathematical tool to transform data distributions while preserving their dependency structure.
🔄 Like converting temperatures from Fahrenheit to Celsius; the relationships stay the same.
SO(N) / O(N)
SO(N) = "special orthogonal group" = pure rotations. O(N) = orthogonal group = rotations + reflections.
🏠 SO(N) is the "no mirrors" club; O(N) lets mirrors in.