Behavioral Biometrics and How to Counter It
Fingerprint spoofing is a solved problem. You can make every browser profile look like a unique device with the right anti-detect browser. But the next generation of detection doesn't care what your browser reports — it cares how you use it.
Behavioral biometrics analyzes the patterns of human interaction: how you move the mouse, how you type, how you scroll, how you hold your phone. These patterns are as unique as a fingerprint and far harder to fake. This is the detection frontier, and it's the reason accounts with perfect technical setups still get caught.
What Behavioral Biometrics Measures
Every major platform collects behavioral data through JavaScript event listeners. The data is sent to server-side models that build a behavioral profile per session and per account.
Mouse Movement Analysis
Detection scripts track mousemove, mousedown, mouseup, and click events with timestamps. The collected features include:
Movement trajectory. Human mouse movements follow Fitts's Law — the time to move to a target is logarithmically related to distance and inversely related to target size. The path curves toward the target with characteristic acceleration/deceleration profiles.
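The Fitts's law relationship above can be sketched numerically. This uses the common Shannon formulation, T = a + b · log2(D/W + 1); the a and b constants below are illustrative, since in practice they are fitted per user and per device:

```python
import math

def fitts_time(distance_px: float, target_width_px: float,
               a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time (seconds) under Fitts's law.

    Shannon formulation: T = a + b * log2(D/W + 1).
    a and b are per-user constants; the values here are illustrative.
    """
    index_of_difficulty = math.log2(distance_px / target_width_px + 1)
    return a + b * index_of_difficulty

# A distant, small target takes longer to reach than a near, large one.
t_far = fitts_time(distance_px=800, target_width_px=20)
t_near = fitts_time(distance_px=100, target_width_px=60)
```

Detection models use this the other way around: if observed movement times don't scale with target distance and size the way the law predicts, the input probably wasn't produced by a hand.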
Velocity distribution. Human mouse velocity follows a roughly bell-shaped distribution, with peak speeds typically in the 400-800 pixels/second range. Bots either move at constant velocity or use randomization that doesn't match the natural distribution.
Micro-corrections. Humans overshoot targets and make small corrective movements. The frequency and magnitude of corrections are consistent per user but different between users. Bots either don't overshoot or use uniform random corrections.
Jitter. Human hands introduce micro-tremors of 1-3 pixels even during "still" periods. The jitter frequency (8-12 Hz) matches physiological hand tremor. Bots either have zero jitter or add noise at the wrong frequency.
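To make the jitter point concrete, here is a minimal sketch of what physiologically plausible idle noise looks like: a sinusoid in the 8-12 Hz tremor band with small Gaussian perturbation. The 125 Hz sample rate and amplitude values are assumptions chosen to match the ranges described above, not measured constants:

```python
import math
import random

def hand_tremor(duration_s=1.0, sample_hz=125, tremor_hz=10.0, amp_px=2.0):
    """Synthesize idle-cursor jitter as a noisy sinusoid in the
    physiological tremor band (8-12 Hz). amp_px keeps displacement
    within the 1-3 px range described above. Illustrative values."""
    samples = []
    n = int(duration_s * sample_hz)
    for i in range(n):
        t = i / sample_hz
        # sinusoid at tremor frequency plus amplitude noise,
        # so the signal isn't a detectable pure tone
        dx = amp_px * math.sin(2 * math.pi * tremor_hz * t) + random.gauss(0, 0.3)
        samples.append(dx)
    return samples
```

A frequency-domain check on collected cursor data (e.g. an FFT peak far outside 8-12 Hz, or no peak at all) is one way a detector distinguishes this kind of noise from `Math.random()` offsets.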
Typing Cadence
Keystroke dynamics — the timing between key presses — is one of the most reliable behavioral biometrics:
Hold time. How long each key is held down (typically 80-120ms for most keys).
Flight time. The interval between releasing one key and pressing the next (50-200ms depending on the key pair).
Digraph latency. Specific two-key combinations have characteristic timing. "th" is faster than "tz" because the keys are closer and the combination is more practiced. Each person has a unique digraph timing profile.
```javascript
// Platform-side collection of keystroke timing
const events = [];
document.addEventListener('keydown', (e) => {
  events.push({ key: e.key, time: performance.now(), type: 'down' });
});
document.addEventListener('keyup', (e) => {
  events.push({ key: e.key, time: performance.now(), type: 'up' });
});
```
Error patterns. Humans make typos and correct them. The pattern of errors, backspaces, and corrections is individual. Bots that type perfectly or use element.value = text bypass keystroke events entirely — both are detectable.
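On the server side, an event stream in the shape collected above is enough to recover hold and flight times. A minimal sketch (assuming, for brevity, no key rollover or dropped events):

```python
def keystroke_features(events):
    """Derive hold and flight times from a keydown/keyup event stream.
    Each event is {'key', 'time' (ms), 'type': 'down'|'up'}, matching
    the collection snippet above. Assumes no overlapping key presses."""
    holds, flights = [], []
    last_down = {}      # key -> its keydown timestamp
    last_up_time = None
    for ev in events:
        if ev['type'] == 'down':
            last_down[ev['key']] = ev['time']
            if last_up_time is not None:
                flights.append(ev['time'] - last_up_time)  # flight time
        else:
            if ev['key'] in last_down:
                holds.append(ev['time'] - last_down.pop(ev['key']))  # hold time
            last_up_time = ev['time']
    return holds, flights

# Typing "th": t held 95 ms, 65 ms flight, h held 90 ms
events = [
    {'key': 't', 'time': 0,   'type': 'down'},
    {'key': 't', 'time': 95,  'type': 'up'},
    {'key': 'h', 'time': 160, 'type': 'down'},
    {'key': 'h', 'time': 250, 'type': 'up'},
]
holds, flights = keystroke_features(events)
```

Distributions of these two values per digraph are what the "unique timing profile" above is built from.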
Scroll Behavior
Scroll patterns reveal whether content is being consumed by a human or processed by a bot:
Scroll velocity. Humans scroll in variable-speed bursts. Fast through uninteresting content, slow through interesting content, with pauses. Bots scroll at constant velocity or in fixed increments.
Scroll direction changes. Humans frequently scroll back up to re-read content. The frequency and distance of reverse scrolls varies by individual but follows consistent patterns. Bots rarely scroll backwards.
Scroll-to-click correlation. Humans scroll to a section, pause to read, then click. The time between scroll stop and click is variable (500ms-5s depending on content density). Bots either click immediately after scrolling or wait a fixed interval.
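The scroll heuristics above reduce to simple statistics over the wheel-event stream. This sketch computes two of them, the reverse-scroll ratio and the coefficient of variation of scroll magnitude; the scoring thresholds a real platform uses are not public, so treat the metrics as illustrative:

```python
from statistics import mean, stdev

def scroll_features(deltas):
    """Summarize a sequence of wheel deltaY values: fraction of reverse
    (upward) scrolls, and coefficient of variation of scroll magnitude.
    Per the heuristics above, near-zero values on either metric are a
    bot telltale. A sketch, not a platform's actual scoring."""
    reverse_ratio = sum(1 for d in deltas if d < 0) / len(deltas)
    magnitudes = [abs(d) for d in deltas]
    cv = stdev(magnitudes) / mean(magnitudes)  # 0 for fixed increments
    return reverse_ratio, cv

bot_like = [120, 120, 120, 120, 120]        # fixed increments, no reversals
human_like = [180, 90, 240, -60, 130, 310]  # variable bursts, one re-read scroll
```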
Device Orientation (Mobile)
On mobile devices, how the user holds the phone is a biometric:
Tilt angle. Most users hold phones at 60-80 degrees from horizontal. The specific angle is habitual and consistent for each user.
Grip pressure. Touch pressure patterns (available on some devices) indicate how the user holds the phone — light grip vs firm grip, thumb vs index finger.
Orientation changes. How frequently and rapidly the user shifts phone orientation during use. This correlates with body position (sitting vs standing vs walking).
How Machine Learning Identifies Bots
Behavioral biometric detection uses supervised machine learning trained on labeled datasets of human and bot behavior.
Feature Engineering
Raw events (mousemove timestamps, coordinates) are transformed into features:
| Feature | Description | Human range | Bot telltale |
|---|---|---|---|
| Mean mouse velocity | Average pixels/second | 300-900 | >1500 or constant |
| Velocity variance | How much speed varies | High (σ > 200) | Low (σ < 50) |
| Path curvature | Deviation from straight line | 0.1-0.5 | <0.05 or >0.8 |
| Click precision | Distance from target center | 5-25px | <3px |
| Dwell time variance | Variability of time between actions | High (CV > 0.5) | Low (CV < 0.2) |
| Scroll jerk | Rate of scroll speed change | Irregular | Smooth |
| Keystroke entropy | Variability of typing speed | High | Low |
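Several of the table's mouse features can be computed directly from raw (timestamp, x, y) samples. A self-contained sketch, using path-length excess over the straight line as the curvature measure (one of several reasonable definitions, not necessarily any vendor's):

```python
import math
from statistics import mean, pstdev

def mouse_features(trace):
    """Turn raw (t_ms, x, y) mouse samples into three of the table's
    features: mean velocity (px/s), velocity sigma, and path curvature
    (extra path length relative to the straight line). Illustrative."""
    speeds, path_len = [], 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(trace, trace[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        path_len += step
        speeds.append(step / ((t1 - t0) / 1000))  # px per second
    straight = math.hypot(trace[-1][1] - trace[0][1],
                          trace[-1][2] - trace[0][2])
    curvature = path_len / straight - 1  # 0 for a perfectly straight path
    return mean(speeds), pstdev(speeds), curvature

# A perfectly straight, constant-speed trace trips two bot telltales:
# velocity sigma == 0 and curvature == 0.
robot = [(0, 0, 0), (100, 50, 0), (200, 100, 0), (300, 150, 0)]
```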
Model Architecture
Most production systems use ensemble models:
- Random Forest for initial classification (fast, interpretable)
- LSTM/GRU neural network for sequence analysis (captures temporal patterns in event streams)
- Anomaly detection (Isolation Forest or Autoencoders) for identifying behavior that doesn't match any known pattern
The ensemble approach catches both bots that fail individual feature checks (caught by Random Forest) and bots that pass individual checks but have unnatural temporal patterns (caught by LSTM).
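The combination step can be sketched as a weighted blend of per-model scores with the anomaly detector acting as a floor. The weights and threshold below are illustrative assumptions, not values from any production system:

```python
def ensemble_bot_score(rf_prob, lstm_prob, anomaly_score,
                       weights=(0.4, 0.4, 0.2), anomaly_floor=0.9):
    """Combine per-model bot probabilities in the spirit of the ensemble
    above: weighted average of the Random Forest and sequence-model
    outputs, with a strong anomaly score overriding the blend.
    Weights and threshold are illustrative only."""
    w_rf, w_lstm, w_anom = weights
    blended = w_rf * rf_prob + w_lstm * lstm_prob + w_anom * anomaly_score
    # behavior unlike any known human pattern forces a high score,
    # even if both classifiers individually pass it
    if anomaly_score >= anomaly_floor:
        blended = max(blended, anomaly_score)
    return blended

# Passes both classifiers but matches no known human pattern:
# the anomaly floor dominates and the session is still flagged.
score = ensemble_bot_score(rf_prob=0.2, lstm_prob=0.3, anomaly_score=0.95)
```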
Accuracy
Production behavioral biometric systems report:
- True positive rate (bot detection): 85-95%
- False positive rate (humans flagged as bots): 2-5%
- Time to classification: 15-30 seconds of interaction data
The 2-5% false positive rate is why platforms use behavioral biometrics as one signal rather than a sole determinant. A behavioral score feeds into a broader risk model alongside fingerprinting, IP analysis, and account history.
Defeating Behavioral Analysis
Software-Based Approaches
Human-like mouse movement libraries. Libraries like ghost-cursor generate Bézier curve trajectories with human-like acceleration profiles:
```javascript
const { createCursor } = require('ghost-cursor');
const cursor = createCursor(page);

// Moves the mouse along a human-like path:
// curved trajectory, overshoot, micro-correction, variable speed
await cursor.click('#submit-button');
```
The problem: these libraries approximate average human behavior. They don't produce individual behavioral patterns. If the platform's model is trained to recognize specific user profiles (returning user verification), a different behavioral pattern from the same account is suspicious.
Typing simulation with digraph statistics. Use real typing data to model realistic keystroke timing:
```python
import random
import time

# Flight-time ranges (ms) per digraph, loosely modeling
# practiced vs unpracticed key pairs
DIGRAPH_DELAYS = {
    'th': (45, 75),    # fast - common combination
    'er': (50, 80),
    'in': (55, 85),
    'an': (60, 90),
    'qz': (120, 200),  # slow - uncommon, distant keys
    'xp': (100, 170),
}

def type_text(page, selector, text):
    page.click(selector)
    for i, char in enumerate(text):
        if i > 0:
            digraph = text[i - 1] + char
            lo, hi = DIGRAPH_DELAYS.get(digraph, (70, 130))
            # sample flight time from a Gaussian centered on the range
            delay = random.gauss((lo + hi) / 2, (hi - lo) / 4)
            time.sleep(max(0.03, delay / 1000))
        # hold the key down for a realistic duration, then release
        page.keyboard.down(char)
        time.sleep(max(0.05, random.gauss(0.1, 0.02)))
        page.keyboard.up(char)
```
Scroll behavior randomization. Variable scroll speeds, pauses at content sections, occasional reverse scrolls:
```javascript
async function humanScroll(page, targetY) {
  let currentY = 0;
  while (currentY < targetY) {
    const scrollAmount = Math.random() * 200 + 100;
    const speed = Math.random() * 300 + 200; // px/s
    await page.mouse.wheel({ deltaY: scrollAmount });
    await page.waitForTimeout((scrollAmount / speed) * 1000);
    currentY += scrollAmount;
    if (Math.random() < 0.15) {
      // 15% chance to pause and "read"
      await page.waitForTimeout(Math.random() * 3000 + 1000);
    }
    if (Math.random() < 0.05) {
      // 5% chance to scroll back up slightly
      await page.mouse.wheel({ deltaY: -(Math.random() * 100 + 50) });
      await page.waitForTimeout(Math.random() * 1000 + 500);
    }
  }
}
```
The Hardware Approach
When software simulation isn't convincing enough, physical input devices provide genuine human-like input signals.
Arduino/Raspberry Pi input injection. A microcontroller connected via USB acts as a HID (Human Interface Device) — the computer sees it as a real mouse and keyboard. Input signals generated by the microcontroller include the analog noise characteristics of real hardware that software emulation can't reproduce.
The Arduino generates mouse movements with:
- Genuine analog sensor noise
- Hardware interrupt-driven timing (more irregular than software timers)
- Physical movement simulation with realistic acceleration curves
Pre-recorded human input. Record actual human interactions (mouse movements, keystrokes) and replay them through a HID device. This produces genuine human behavioral patterns because they are genuine human behavioral patterns.
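The replay side reduces to timing fidelity: the gaps between recorded events carry the behavioral signature, so the replayer must reproduce them exactly rather than send reports as fast as possible. A minimal sketch, where `send` stands in for whatever writes one HID report to the microcontroller (hypothetical, e.g. a serial write):

```python
import time

def replay(events, send):
    """Replay a recorded human input trace with its original timing.
    `events` is a list of (timestamp_s, payload); `send` writes one
    HID report to the device (hypothetical placeholder). Sleeping to
    each event's absolute offset avoids accumulating drift."""
    start = time.monotonic()
    t0 = events[0][0]
    for stamp, payload in events:
        # wait until this event's offset from the recording start
        delay = (stamp - t0) - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
        send(payload)

# Usage sketch: two reports 80 ms apart, captured into a list
sent = []
replay([(0.00, 'report-1'), (0.08, 'report-2')], sent.append)
```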
Disadvantages: Hardware solutions are expensive ($50-200 per device), require physical setup, and don't scale to hundreds of concurrent sessions. They're reserved for the highest-value operations where behavioral detection is the confirmed blocking point.
The Human Approach
The ultimate countermeasure: use actual humans. For operations where behavioral biometrics is the primary detection vector and the value per account justifies the cost, human operators provide undetectable behavioral patterns because they are real behavioral patterns.
This is common in high-stakes operations: human operators perform the sensitive interactions (login, verification, trust-building activities) while automation handles the routine work (data extraction, monitoring).
FAQ
How accurate are behavioral biometrics really? In controlled lab conditions, behavioral biometric systems achieve 95-99% accuracy distinguishing humans from bots. In production, accuracy drops to 85-95% due to the diversity of human behavior, mobile vs desktop differences, and the challenge of maintaining model accuracy across different device types and user populations.
Do all platforms use behavioral biometrics? No. It's primarily used by platforms with sophisticated anti-fraud teams: Meta (Facebook/Instagram), Google, Amazon, major banks, and platforms protected by commercial bot detection services (Cloudflare Bot Management, DataDome, PerimeterX). Smaller platforms typically rely on simpler detection (IP, fingerprint, rate limiting).
Can behavioral biometrics identify the same person across accounts? Theoretically yes — behavioral patterns are individual. In practice, it requires significant data (multiple minutes of interaction per session) and the platform must be specifically looking for cross-account behavioral correlation. This is an emerging threat but not yet widely deployed outside of fraud detection in banking.
What about the privacy implications of behavioral tracking? Behavioral biometrics collects intimate data about physical characteristics and habits. Under GDPR, this may constitute biometric data processing, which requires explicit consent. The privacy implications are significant — platforms collect this data silently, without user awareness or consent, and use it for purposes the user didn't agree to. This is an active area of regulatory debate.