The Technology
Making the invisible
measurable.
EchoDepth analyses text, image, voice and video — adapting to your interaction format, whether phone calls, video interviews, written correspondence or document review. It translates involuntary emotional signals into structured, quantified data in real time, at scale, with no specialist hardware required.
Facial Action Coding System
44 Action Units.
One universal language.
The Facial Action Coding System (FACS) is the scientific gold standard for facial expression analysis — a taxonomy developed by Paul Ekman and Wallace Friesen that describes every visible movement of the human face through discrete, numbered Action Units (AUs).
EchoDepth tracks all 44 observable Action Units per frame, per person. Because many AU activations are involuntary and cannot be consciously controlled, they provide a reliable signal that is independent of self-report.
- AU1 & AU4: Inner brow raiser / brow lowerer (core stress markers)
- AU6 & AU12: Cheek raiser / lip corner puller (genuine vs masked emotion)
- AU17 & AU24: Chin raiser / lip pressor (suppression and withholding signals)
- Temporal coherence: scored across the full session, not single frames
Example AU activation pattern
AU pattern consistent with suppressed stress and cognitive load. Session flag: elevated.
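As a rough illustration of scoring AU activity across a session rather than on single frames, the sketch below computes the fraction of frames in which two AUs are active together. The data layout (one dict per frame mapping AU number to intensity), the 0.5 threshold and the name `coactivation_ratio` are assumptions made for this example, not EchoDepth's API or scoring method.

```python
from typing import Dict, List

# Hypothetical layout: one dict per frame, mapping AU number -> intensity in [0, 1].
Frame = Dict[int, float]


def coactivation_ratio(frames: List[Frame], au_a: int, au_b: int,
                       threshold: float = 0.5) -> float:
    """Fraction of frames in which both AUs meet or exceed the threshold."""
    if not frames:
        return 0.0
    both_active = sum(
        1 for f in frames
        if f.get(au_a, 0.0) >= threshold and f.get(au_b, 0.0) >= threshold
    )
    return both_active / len(frames)


# Four illustrative frames (made-up values, not real output).
session = [
    {6: 0.8, 12: 0.9},   # cheek raiser and lip corner puller both active
    {6: 0.1, 12: 0.9},   # lip corner puller only
    {6: 0.7, 12: 0.6},   # both active
    {17: 0.8, 24: 0.7},  # chin raiser and lip pressor both active
]

print(coactivation_ratio(session, 6, 12))   # 0.5
print(coactivation_ratio(session, 17, 24))  # 0.25
```

A session-level ratio like this is only one possible summary; a production system would presumably also weigh timing and duration.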
Output Model
Three dimensions.
One complete picture.
EchoDepth outputs a continuous VAD score — Valence, Arousal, Dominance — the three-dimensional model of emotional state that underpins modern affective computing.
Valence
The positive-to-negative dimension. High valence indicates a positive, comfortable emotional state. Low valence indicates distress, displeasure or anxiety.
In fintech: A valence drop during a specific question in a claims interview may indicate discomfort with that topic.
Arousal
The calm-to-excited dimension. Elevated arousal indicates heightened physiological activation, which may reflect stress, urgency, fear or excitement.
In fintech: Sustained high arousal in a mortgage interview correlates with elevated cognitive load — a potential indicator of rehearsed or constructed responses.
Dominance
The submissive-to-in-control dimension. High dominance indicates confidence and control. Low dominance indicates a sense of vulnerability or powerlessness.
In fintech: A sudden dominance drop mid-session can indicate the subject has encountered a question they were not prepared for.
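To make the idea of a continuous VAD score concrete, here is a minimal sketch of a VAD time series and a check for a sudden drop in one dimension. The `VadSample` type, the assumed -1.0 to 1.0 value range, the 0.4 drop threshold and the function name `sudden_drops` are all hypothetical choices for illustration; the actual output schema may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VadSample:
    t: float          # seconds from session start
    valence: float    # assumed range -1.0 .. 1.0 for all three dimensions
    arousal: float
    dominance: float


def sudden_drops(samples: List[VadSample], dimension: str,
                 min_drop: float = 0.4) -> List[float]:
    """Timestamps where `dimension` falls by at least `min_drop`
    relative to the previous sample."""
    times = []
    for prev, cur in zip(samples, samples[1:]):
        if getattr(prev, dimension) - getattr(cur, dimension) >= min_drop:
            times.append(cur.t)
    return times


# Made-up session values for illustration only.
session = [
    VadSample(0.0, 0.3, 0.2, 0.6),
    VadSample(1.0, 0.3, 0.3, 0.5),
    VadSample(2.0, 0.1, 0.6, 0.0),   # dominance falls by 0.5 here
    VadSample(3.0, 0.1, 0.6, 0.1),
]

print(sudden_drops(session, "dominance"))  # [2.0]
print(sudden_drops(session, "valence"))    # []
```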
Multimodal Analysis
Four signal streams.
One emotional picture.
EchoDepth combines four input modalities to build the most complete picture of emotional state — adapting to the interaction format, whether video call, phone call, or recorded session:
- Video: 44-AU FACS analysis at up to 30 fps. Temporal coherence scoring across the full session window. Primary modality for video calls and recorded interviews.
- Voice: Pitch, rate, energy and micro-pause analysis. Detects vocal stress markers independent of content. Primary modality for phone-based vulnerability detection, where most collections and complaint interactions occur.
- Image: Single-frame AU analysis for non-video contexts: document verification, ID check photos, or post-session review of captured stills.
- Text: Sentiment, hedging, temporal inconsistency and confidence markers in transcribed speech or written correspondence.
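One simple way to combine whichever modalities a given interaction provides is a weighted mean over the streams that are present. The sketch below shows that approach only as an illustration: the weights, the modality keys and the function name `fuse` are assumptions, and the fusion method EchoDepth actually uses is not described here.

```python
from typing import Dict, Tuple

Vad = Tuple[float, float, float]  # (valence, arousal, dominance)

# Hypothetical per-modality weights; real weights would need calibration.
WEIGHTS = {"video": 0.4, "voice": 0.3, "image": 0.1, "text": 0.2}


def fuse(estimates: Dict[str, Vad]) -> Vad:
    """Weighted mean over whichever modalities are present."""
    total = sum(WEIGHTS[m] for m in estimates)
    if total == 0:
        raise ValueError("no modalities supplied")
    return tuple(
        sum(WEIGHTS[m] * vad[i] for m, vad in estimates.items()) / total
        for i in range(3)
    )


# A phone call supplies voice and text only (made-up estimates).
phone_call = {
    "voice": (0.0, 0.5, 0.0),
    "text": (0.5, 0.0, 0.5),
}

fused = fuse(phone_call)
print([round(x, 3) for x in fused])
```

Renormalising by the total weight of the available streams keeps the result on the same scale whether one modality or all four are present.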
Bias Reduction
Trained across
cultures, not just data.
Many emotion AI systems fail outside the demographic of their training data. EchoDepth was deliberately built to avoid this.
- Training data collected across 6 countries
- 14 cultural cohorts represented in the model
- Active bias auditing — cultural expression variance is modelled, not averaged
- No reliance on posed expression datasets
- Validated on spontaneous, naturalistic video — not lab conditions
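A bias audit of the kind listed above could, in its simplest form, compare prediction error cohort by cohort instead of reporting one pooled figure. The sketch below assumes records of the form (cohort, predicted score, reference score); the record layout, cohort labels and function names are hypothetical and the values are made up.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def error_by_cohort(
    records: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """Mean absolute error per cohort from (cohort, predicted, reference)."""
    errors = defaultdict(list)
    for cohort, predicted, reference in records:
        errors[cohort].append(abs(predicted - reference))
    return {cohort: mean(vals) for cohort, vals in errors.items()}


def max_gap(per_cohort: Dict[str, float]) -> float:
    """Largest difference in error between any two cohorts."""
    return max(per_cohort.values()) - min(per_cohort.values())


# Illustrative records only; not real validation data.
records = [
    ("cohort_a", 0.5, 0.5),
    ("cohort_a", 0.25, 0.5),
    ("cohort_b", 0.0, 0.5),
    ("cohort_b", 1.0, 0.5),
]

per_cohort = error_by_cohort(records)
print(per_cohort)           # {'cohort_a': 0.125, 'cohort_b': 0.5}
print(max_gap(per_cohort))  # 0.375
```

A large gap between cohorts is the kind of signal a pooled average would hide.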
EchoDepth will not be deployed in a context where cultural expression bias would produce discriminatory outcomes.
Privacy by Design
No biometric data stored.
No exceptions.
No raw video retained
Video is processed in memory. Frames are never stored. Only VAD scores and AU activations are output.
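The in-memory pattern described here can be sketched as a generator that hands each frame to an analyser and yields only the resulting scores. Everything named below (`score_stream`, `fake_analyse`, `fake_camera`) is a stand-in invented for this example, not part of EchoDepth's API.

```python
from typing import Callable, Dict, Iterable, Iterator

Scores = Dict[str, float]


def score_stream(frames: Iterable[bytes],
                 analyse: Callable[[bytes], Scores]) -> Iterator[Scores]:
    """Yield scores frame by frame without keeping any frame."""
    for frame in frames:
        # The frame is only referenced inside this loop iteration; nothing
        # is written to disk or appended to a collection.
        yield analyse(frame)


def fake_analyse(frame: bytes) -> Scores:
    """Stand-in analyser (hypothetical): returns fixed neutral scores."""
    return {"valence": 0.0, "arousal": 0.0, "dominance": 0.0}


def fake_camera(n: int) -> Iterator[bytes]:
    """Stand-in frame source producing n dummy frames."""
    for _ in range(n):
        yield bytes(16)


results = list(score_stream(fake_camera(3), fake_analyse))
print(len(results))  # 3
```

Because the frame source is consumed lazily, only one frame needs to exist in memory at a time, and the caller receives numeric scores only.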
GDPR compliant
Designed for UK and EU regulatory environments. FCA Regulatory Sandbox participant. Data residency options available.
On-device processing
Edge deployment option for organisations where video data cannot leave the premises. Full API feature parity.
ISO 9001 infrastructure
Built on ISO 9001 and Cyber Essentials certified infrastructure designed for regulated environments.
Ready to go deeper?
Explore our methodology, the API documentation, or talk to the team about a proof of concept.