Imagine giving someone directions, but every time you say "turn left," there's a 50-50 chance they hear "turn right." That's basically what happens when computers try to find the hidden directions inside data, and this paper figured out how to fix it.
The building blocks you need to make sense of this paper
You know how when you squeeze a ball of Play-Doh, it stretches in one direction and squishes in another? Well, data can do the same thing! When you have a big table of numbers (like measurements from sensors, or prices of things over time), the data naturally "stretches" in certain hidden directions.
Eigenvectors are those special stretching directions. Think of them as invisible arrows inside your data that say: "THIS is the most important direction!"
And each eigenvector has a buddy called an eigenvalue: a number that tells you how much the data stretches in that direction. Bigger eigenvalue = more important direction.
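Here's a tiny NumPy sketch (my example, not from the paper) that makes this concrete: `np.linalg.eigh` returns both the stretching directions and their stretch factors for a symmetric matrix.

```python
import numpy as np

# A symmetric "stretch" matrix: it stretches along (1, 1) and squishes along (1, -1)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, vecs = np.linalg.eigh(A)   # eigenvalues come back in ascending order
# vals -> [1., 3.]: the (1, -1) direction squishes (1), the (1, 1) direction stretches (3)
```

The columns of `vecs` are the invisible arrows; `vals` are their eigenvalues.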
Imagine you're standing in a room pointing at a poster on the wall. Now spin yourself 45°. You're still standing in the same spot, but you're pointing in a new direction. That's a rotation: you change direction without flipping anything.
Now imagine you're looking in a mirror. Your reflection is flipped: your left hand appears to be on the right side. That's a reflection. In math, a reflection flips one direction while keeping the others the same.
The determinant of a matrix is like a handedness detector: if it equals +1, the matrix is a pure rotation (no mirror flip). If it equals -1, there's at least one hidden reflection inside.
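You can check the handedness detector yourself with NumPy (an illustrative sketch, not from the paper):

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a pure 45-degree rotation
F = np.diag([1.0, -1.0])                          # a mirror flip of the second axis

det_R = np.linalg.det(R)   # +1: pure rotation, no hidden flip
det_F = np.linalg.det(F)   # -1: there's a reflection inside
```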
Imagine a lazy Susan on a dinner table. It only spins in one flat plane. A Givens rotation is exactly that: a rotation that only acts in one 2D plane at a time, leaving everything else untouched. To rotate something in 3D or higher, you chain together multiple Givens rotations, one plane after another. Like adjusting a satellite dish: first tilt left-right, then tilt up-down.
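A minimal sketch of a Givens rotation (my own helper, not the paper's code): embed a 2x2 rotation into an identity matrix so it only mixes two chosen axes.

```python
import numpy as np

def givens(n, i, j, theta):
    """An n-by-n rotation that acts only in the (i, j) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i], G[j, j] = c, c
    G[i, j], G[j, i] = -s, s
    return G

# Rotate in the (0, 2) plane of 4D space: axes 1 and 3 are left untouched
G = givens(4, 0, 2, 0.7)
```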
The arcsin method is like a protractor that only shows you 0° to 90° (and their negatives). It covers a π interval: that's half a circle, or 180°.
The arctan2 method is like a full compass: it shows you all 360° (a 2π interval). It can tell the difference between pointing northeast vs. southwest, which arcsin cannot!
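A quick NumPy check (illustrative, not from the paper) shows the difference for a vector pointing at 135°:

```python
import numpy as np

x, y = -1.0, 1.0                       # a vector pointing "northwest", at 135 degrees
r = np.hypot(x, y)

a_sin = np.degrees(np.arcsin(y / r))   # 45.0 : arcsin only sees the half-circle
a_tan = np.degrees(np.arctan2(y, x))   # 135.0: arctan2 resolves the full circle
```

arcsin collapses 135° onto 45° because it can't see the sign of x; arctan2 keeps both signs and reports the true direction.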
Why computers get confused about which way arrows point
Here's a wild fact: when you ask a computer to find eigenvectors (using functions called svd or eig), the arrows it returns can randomly point forward or backward. It's like getting compass directions where north might randomly show up as south!
Mathematically, if v is an eigenvector, then -v (the exact opposite direction) is also a valid eigenvector. The computer just picks one randomly. This is called sign ambiguity.
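You can see the ambiguity directly (my demo, not the paper's): both v and -v pass the eigenvector equation, so the solver's choice of sign is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
C = np.cov(X, rowvar=False)

vals, vecs = np.linalg.eigh(C)
v, lam = vecs[:, -1], vals[-1]   # top eigenpair

# v and -v satisfy C v = lambda v equally well;
# which one eigh returns is an arbitrary implementation choice
flips_ok = np.allclose(C @ v, lam * v) and np.allclose(C @ -v, lam * -v)
```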
Imagine you're tracking how seven currency markets move every day (that's exactly what this paper does!). Each day, you compute eigenvectors to understand the hidden patterns. But if the arrows randomly flip day to day, your analysis looks like a seismograph during an earthquake, even when nothing interesting is happening!
The original algorithm (from an earlier paper by the same author) fixed this by making all eigenvectors consistently oriented. It used the arcsin method to compute the rotation angles. But there was a catch...
Because the arcsin method only covers 180° (half a circle), some vectors that were actually close together in space appeared to be on opposite sides of the measurement window. It's like trying to track something on a globe but only being allowed to see one hemisphere: objects near the equator can seem to "jump" from one edge to the other!
Original arcsin method:
Range: -90° to +90°
Wrap-around: YES
Angles: all "minor" (small)

Modified arctan2 method:
Range: -180° to +180°
Wrap-around: NO
First angles: "major" (full circle)
How a new perspective on handedness unlocked the full 360°
The original algorithm was "opportunistic": at each step, if an eigenvector pointed the "wrong" way, it would immediately flip it with a reflection. The new approach is holistic: it looks at the whole matrix's handedness (like checking whether the whole coordinate system is left-handed or right-handed) before deciding what to do.
Imagine you have 7 arrows to align (like in the paper's 7-currency example). You work through them one at a time:
The first 6 arrows are in reducible subspaces: there's still room to rotate them into place. Think of it like having a swivel joint: you CAN rotate. For these, the new algorithm uses the modified arctan2 (full 360° angles) for the first rotation in each subspace.
The 7th (last) arrow is in an irreducible subspace: there are no more axes to rotate around. It's like being stuck in a one-way corridor: you can only go forward or backward. If it's pointing the wrong way, the ONLY option is a reflection.
In 4D, the algorithm solves a chained system of three embedded rotation angles: each component of the eigenvector is a product of sines and cosines of those angles. The original method solves the system bottom-up using arcsin; the new method solves it using the modified arctan2.
Only the first angle gets the full 360° treatment. The rest are guaranteed to have a positive projection on the main axis, so they're inherently limited to 180°. This is what makes the method work without breaking!
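Here's a minimal sketch of that idea (my own illustration, not the paper's thucyd implementation): peel off the components of a unit vector one plane at a time with arctan2. After the first step the leading component is a non-negative hypotenuse, so every later angle automatically lands within ±90°.

```python
import numpy as np

def embedded_angles(v):
    """Angles of the plane rotations that carry the unit vector v onto e1.
    Only the first angle can span the full (-180, 180] degrees; later steps
    see a non-negative leading component, so they stay within [-90, 90]."""
    v = np.array(v, dtype=float)
    angles = []
    for k in range(1, len(v)):
        angles.append(np.arctan2(v[k], v[0]))
        v[0] = np.hypot(v[0], v[k])   # non-negative from here on
        v[k] = 0.0
    return angles
```

Chaining the inverse Givens rotations (in reverse order) onto e1 reconstructs the original vector exactly.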
For a system with N dimensions, the total number of embedded rotation angles is N(N-1)/2.
The original arcsin method:
✓ Checks each eigenvector one by one and flips it if needed (s_k = -1)
✓ All angles are "minor" (within 180°)
✓ Best for regression: stabilizes the sign of the regression weights β
⚠️ Angular wrap-around on the π interval
⚠️ Can't distinguish directions on opposite sides of the hemisphere
Reflection matrix: S = diag(±1, ±1, ..., ±1), so many possible reflections
The modified arctan2 method:
✓ Takes a holistic view of handedness
✓ First rotation in each subspace spans the full 360°
✓ Best for directional statistics: no angular wrap-around!
✓ Can reveal outliers hidden by wrap-around
Reflection matrix: S_tan = diag(1, 1, ..., ±1), at most ONE reflection
⚠️ Major rotations can create different sign patterns than arcsin
Use the original arcsin method when:
You need stable signs for regression weights (β)
Downstream calculations depend on eigenvector signs

Use the modified arctan2 method when:
You need to track pointing directions over time
You want full angular disambiguation
You need to spot outliers
7 currencies, 23 days, hundreds of thousands of data points
The author analyzed futures data from the Chicago Mercantile Exchange (CME), one of the biggest financial markets in the world!
The dataset covered seven currency pairs.
Each day had 14 columns: 7 for price quotes (bid-offer mid-price) and 7 for signed trades. After careful filtering, centering, standardizing, and mapping through a copula to Gaussian distributions, the data was ready for eigenanalysis via SVD (Singular Value Decomposition, a way to find eigenvectors and eigenvalues).
For 7 dimensions, the N(N-1)/2 = 21 rotation angles are organized into a triangular matrix, one colored stripe per subspace. The first angle of each stripe can span the full 360° with the new method; all the other angles stay within 180°.
How do you measure whether an eigenvector represents a "team effort" across all currencies, or just one dominating player? Enter the participation score (PS)!
It's calculated from the Inverse Participation Ratio (IPR): IPR = Σ vᵢ⁴, and then PS = 1/(N × IPR).
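A short sketch of the formula (my helper, not the paper's code), assuming v is normalized to unit length:

```python
import numpy as np

def participation_score(v):
    """PS = 1 / (N * IPR), with IPR = sum(v_i**4) for a unit-length v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)   # ensure unit length
    ipr = np.sum(v ** 4)        # Inverse Participation Ratio
    return 1.0 / (len(v) * ipr)

# Team effort (all entries equal)  -> PS = 1
# One dominating player (one entry) -> PS = 1/N
```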
High PS = all currencies participate equally. Low PS = one currency dominates.
What happened when the new algorithm was unleashed on real data
The BIGGEST result: switching from arcsin to modified arctan2 let the researchers clearly see which eigenmodes carry real information and which are just noise.
The Marčenko–Pastur (MP) distribution is like a "noise detector" from a branch of math called Random Matrix Theory (RMT). It predicts exactly where eigenvalues would land if your data were pure random noise.
The MP distribution has edges at λ± = (1 ± √q)², where q = N/T (features divided by records).
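The edge formula is easy to compute (a sketch of the textbook formula, not code from the paper):

```python
import numpy as np

def mp_edges(n_features, n_records):
    """Marchenko-Pastur bulk edges for q = N / T."""
    q = n_features / n_records
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

# e.g. q = 1/4: eigenvalues of pure noise should fall between 0.25 and 2.25
lo, hi = mp_edges(1, 4)
```

Any eigenvalue landing outside [λ−, λ+] is a candidate for real signal; anything inside is indistinguishable from noise.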
Quotes:
3 modes outside MP (modes 1, 2, 3 = informative)
4 modes inside MP (modes 4–7 = noise)

Trades:
1 mode outside MP (mode 1 = informative)
6 modes inside MP (modes 2–7 = noise)
On May 10, 2023, the US Consumer Price Index (CPI) was released โ a major economic event that markets were watching closely. The EURUSD currency pair reacted anomalously compared to the other six pairs.
With the old arcsin method, this outlier was hidden by angular wrap-around. With the new arctan2 method, it stuck out like a sore thumb! The paper highlights it with a red circle in quote modes 2 and 3.
Random Matrix Theory also says that eigenvectors of purely random matrices should point uniformly in all directions on a hypersphere; this is called the Haar measure. Think of it like throwing darts at a globe: they should land everywhere equally.
The noisy modes (4–6 for quotes, 2–6 for trades) scattered uniformly across the full 360° circle with the new method, exactly what the Haar measure predicts! The informative modes, by contrast, clustered in specific directions.
Making the arrows stop wobbling: two clever methods
You know how a camera stabilizer keeps your video smooth even when your hand shakes? Dynamic stabilization does the same for eigenvectors!
The trick is to stack eigenvectors from multiple time steps end-to-end (like laying arrows tip-to-tail) and measure the resultant direction. This is the correct way to average directions! (Simply averaging angles would give wrong answers.)
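A two-angle toy example (my illustration) shows why the resultant is the right way to average directions:

```python
import numpy as np

angles = np.deg2rad([170.0, -170.0])   # two directions only 20 degrees apart

naive = np.degrees(np.mean(angles))    # 0: points the OPPOSITE way -- wrong!
resultant = np.degrees(np.arctan2(np.sin(angles).sum(),
                                  np.cos(angles).sum()))   # +/-180: correct
```

Laying the unit arrows tip-to-tail (summing sines and cosines) and reading off the resultant direction handles the wrap-around that a plain mean of angles cannot.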
The paper used a 5-point causal filter with weights h[n] = [1, 2, 3, 2, 1]/9 (made by convolving a 3-point box filter with itself). This gives an average delay of just 2 days.
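You can rebuild that filter in one line and check its delay (a sketch of the construction described above):

```python
import numpy as np

box = np.ones(3) / 3                   # 3-point box filter
h = np.convolve(box, box)              # -> [1, 2, 3, 2, 1] / 9

delay = np.sum(np.arange(len(h)) * h)  # weighted average delay: 2 samples
```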
For the noisy modes (whose directions are random and meaningless), why bother tracking their directions at all? Static stabilization sets all their rotation angles to zero, which is like saying: "Stop pointing randomly and just align with the identity matrix."
For quotes, this means keeping angles for modes 1–3 and zeroing out modes 4–7. The result: V_modal = R1·R2·R3 · I (modes 4, 5, 6 become identity rotations).
When you stabilize the eigenvectors and reconstruct the correlation matrix (using Corr(P) = Ṽ Λ̃ Ṽᵀ, with normalization), the structure becomes cleaner and more persistent over time.
The pairwise correlation dispersion (scatter) decreased at each stage. Blue circles (original) had the widest spread. Gray circles (dynamically stabilized) were tighter. Orange circles (fully stabilized) were the tightest!
Ledoit–Wolf shrinkage is a famous technique that blends a noisy correlation matrix with an identity matrix: Σ_shr = α·Σ̂ + (1-α)·I, where α controls how much to trust the data.
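The blend itself is one line (a sketch of the formula above; the function name is mine):

```python
import numpy as np

def ledoit_wolf_blend(corr, alpha):
    """Blend a noisy correlation matrix toward the identity.
    alpha = 1 trusts the data fully; alpha = 0 is maximum shrinkage."""
    return alpha * corr + (1.0 - alpha) * np.eye(corr.shape[0])
```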
Static stabilization is like an even smarter version of this: instead of shrinking the whole matrix, it shrinks only the noisy eigenmodes while leaving the informative ones untouched.
The effective shrinkage for static stabilization is α = 0: maximum shrinkage on the noisy part!
Any time a computer needs to find patterns in data, whether it's currencies, brain scans, climate data, or even video game physics, it uses eigenanalysis. And every single time, those pesky sign flips and angular ambiguities can mess things up.
This paper gave us two tools for two different jobs:
Original arcsin method: stabilize signs → better for regression, predictions, interpretability
Modified arctan2 method: full directional tracking → better for spotting patterns, outliers, and understanding how systems evolve
Plus, the eigenvector stabilization techniques (dynamic filtering + static locking) help clean up noisy correlation matrices, which is useful EVERYWHERE, from weather prediction to stock portfolios to music recommendation algorithms!
Everything is implemented in the thucyd Python package (version 0.2.5+), available for free on PyPI and Conda-Forge. The source code is on GitLab under the Apache 2.0 license. Anyone can use it!