Cogs

The Affect-Aware AI

Bridging the gap between artificial intelligence and human emotional cognition through local processing and persistent memory

Research Platform • AI Cognition • Emotional Intelligence • 23 Microservices • Memory Systems • Relationship Learning • Privacy-First • Open Research

Research Hypothesis

Can AI develop true cognition and emotional awareness through relationship-building rather than just responding to emotional prompts?

50,000+ lines of code • 40+ docs • Active Research 2024-2025

Note: Live demo access requires credentials. Contact for research access.

Four Pillars of AI Cognition

Cogs explores whether AI can develop genuine cognition through these fundamental principles

Develop Relationships Over Time

• Build familiarity through repeated interactions

• Remember past conversations and contexts

• Recognize behavioral patterns and preferences

Learn from Interactions

• Adapt responses based on feedback

• Build models of individual communication styles

• Understand personal contexts (location, time, activities)

Know WHEN to Respond with Emotion

• Move beyond "emotionally prompted" responses

• Understand appropriate emotional contexts

• Develop genuine emotional intelligence vs. emotional mimicry

Build Contextual Understanding Through Memory

• Store and retrieve relevant past experiences

• Connect current situations to historical patterns

• Use context to inform emotional responses

Multi-Modal Humanoid Interface

Cogs is designed as a modular humanoid face platform that can see, hear, speak, emote, recognize people, remember, and dream. The platform runs locally on a laptop or Jetson Nano and can be extended to real hardware (servos, mic arrays, depth cameras) without changing UIs.

23 microservices work together to create a cohesive AI system that builds relationships, learns boundaries, and develops contextual emotional awareness over time.

System Architecture
Modular microservices architecture with Docker Compose

Front End (HDMI-1)

Face UI with Canvas/WebGL display, person bubbles, viseme animations, and status toasts

Back End (HDMI-2)

Control Panel with status dashboard, relationship cards, dream reports, and system metrics

Vision System
  • Face recognition & tracking
  • RealSense depth camera support
  • Person detection & identification
  • Familiarity scoring
Perception System
  • Sound pressure level monitoring
  • Sound source localization
  • ReSpeaker mic array support
  • Real-time audio processing
Dialog & TTS
  • Natural language conversation
  • Viseme timing generation
  • Context-aware responses
  • ElevenLabs voice integration
Memory & Relations
  • PostgreSQL with pgvector
  • Semantic & hybrid search
  • Relationship card system
  • Interaction history tracking
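To make the "relationship card" and "familiarity scoring" ideas concrete, here is a minimal sketch in Python. The class name, fields, saturation constant, and half-life are illustrative assumptions, not the actual Relations-PG schema: familiarity grows with repeated interactions and decays with time apart.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class RelationshipCard:
    # Hypothetical card shape; the real service stores these in PostgreSQL.
    name: str
    interaction_count: int = 0
    last_seen: datetime = field(default_factory=datetime.now)

    def familiarity(self, now: datetime, half_life_days: float = 30.0) -> float:
        """Score in [0, 1): grows with interactions, decays with absence."""
        base = self.interaction_count / (self.interaction_count + 10)  # saturating growth
        idle_days = (now - self.last_seen).total_seconds() / 86400
        recency = 0.5 ** (idle_days / half_life_days)                  # exponential decay
        return base * recency

now = datetime(2025, 1, 1)
card = RelationshipCard("Alice", interaction_count=40, last_seen=now - timedelta(days=30))
print(round(card.familiarity(now), 2))  # 0.8 base * 0.5 recency -> 0.4
```

A person seen daily therefore outscores a stranger met once months ago, which is what drives the "requests introductions for new faces" behavior described later.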

CESS - Cogs Emotional State System

Cogs maintains its own internal emotional state using four "needs buckets" that decay over time, creating authentic emotional responses based on internal state rather than just mirroring user emotions.

Connection
Social need, loneliness

Decay: -1 per hour

Impact: Initiates conversation when low

Competence
Achievement, confidence

Decay: -0.5 per day

Impact: Seeks validation when low

Curiosity
Growth, learning

Decay: -0.5 per day

Impact: Asks speculative questions when low

Safety
Stability, calmness

Decay: No decay

Impact: More cautious when low

Derived Moods

Rather than mirroring user emotions, the system derives moods from Cogs' internal state: Flourishing, Satisfied, Anxious, Lonely, Restless, Drained, and more.
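The bucket mechanics above can be sketched in a few lines. The decay rates come from the text; the 0-100 scale, the thresholds, and the mood rules are illustrative assumptions, not the actual CESS implementation:

```python
# Decay rates from the text, normalized to per-hour:
# Connection -1/hour, Competence and Curiosity -0.5/day, Safety no decay.
DECAY_PER_HOUR = {"connection": 1.0, "competence": 0.5 / 24,
                  "curiosity": 0.5 / 24, "safety": 0.0}

def decay(buckets: dict, hours: float) -> dict:
    """Apply time-based decay, clamping each bucket at 0."""
    return {k: max(0.0, v - DECAY_PER_HOUR[k] * hours) for k, v in buckets.items()}

def derived_mood(buckets: dict) -> str:
    """Map internal state to one of the moods named above (simplified rules)."""
    if all(v >= 70 for v in buckets.values()):
        return "Flourishing"
    if buckets["connection"] < 30:
        return "Lonely"
    if buckets["curiosity"] < 30:
        return "Restless"
    if buckets["safety"] < 30:
        return "Anxious"
    return "Satisfied"

state = {"connection": 80.0, "competence": 80.0, "curiosity": 80.0, "safety": 80.0}
state = decay(state, hours=72)   # three idle days
print(derived_mood(state))       # Connection fell by 72 -> "Lonely"
```

Because the mood is computed from Cogs' own state rather than from sentiment in the user's text, three days of silence produce "Lonely" even if the next message is cheerful.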

Personality Learning System

Cogs learns boundaries and preferences through interaction, developing genuine understanding of what's appropriate rather than following scripted responses.

Boundary Detection

• Recognizes when questions are inappropriate

• Stores boundary violations (severity 1-5 scale)

• Never asks about sensitive topics again

Genuine Apologies

• Understands why something was wrong

• Not scripted responses

• Learns from mistakes

Topic Sensitivity Tracking

• Remembers what topics to avoid

• Tracks sensitivity levels (1-5 scale)

• Adapts communication style

Clarification Queue

• Asks permission before clarifying

• One question at a time

• Finds appropriate moments

Example Learning Flow

User: "That's none of your business"

→ Cogs stores boundary (severity=4)

→ Apologizes genuinely

→ Never asks about that topic again

Personal Memory Lake (PML)

Privacy-first system for importing decades of personal data—encrypted at rest, never leaves device

60-230 GB raw data • ~15 GB embeddings • 100% local storage

Data Sources

Google Takeout: Gmail, Calendar, Photos, Location, Drive

Amazon: Orders, Kindle, Alexa

Facebook: Posts, Messages, Photos

Privacy Architecture

• Encrypted at rest (LUKS volumes)

• Never leaves device

• Multi-layer filtering

• Presence-aware access

• Voice authentication required

Dream Mode

Nightly Consolidation Process
Autonomous learning system that processes memories during idle time

The Dream service schedules re-embedding of relationship cards and updates preferences from conversation transcripts. This allows the system to consolidate memories and improve its understanding of people over time.

• Re-embed relationship cards for semantic search

• Discover patterns across conversations

• Update personality preferences

• Consolidate learnings

• Boost Curiosity emotional bucket

• Optimize memory retrieval
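A nightly pass over the steps above might look like the following sketch. The `embed()` function is a deterministic stand-in for a real embedding call (e.g. to OpenAI), and the card/CESS shapes are assumptions so the example runs offline:

```python
import hashlib

def embed(text: str) -> list[float]:
    """Toy deterministic embedding so the sketch runs without an API key."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def dream_pass(cards: list[dict], cess: dict) -> int:
    """Re-embed stale relationship cards and boost the Curiosity bucket."""
    re_embedded = 0
    for card in cards:
        if card.get("stale", True):          # transcript changed since last dream
            card["embedding"] = embed(card["summary"])
            card["stale"] = False
            re_embedded += 1
    cess["curiosity"] = min(100.0, cess["curiosity"] + 5.0)  # dreaming feeds Curiosity
    return re_embedded

cards = [{"summary": "Alice likes hiking"},
         {"summary": "Bob prefers mornings", "stale": False}]
cess = {"curiosity": 40.0}
print(dream_pass(cards, cess))  # 1 card re-embedded
print(cess["curiosity"])        # 45.0
```

Pattern discovery and preference updates would run as further steps in the same job, after re-embedding completes.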

Powered by OpenAI embeddings, pgvector as the vector database, and semantic search.
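The semantic and hybrid search mentioned here maps naturally onto a pgvector query that blends vector similarity with Postgres full-text rank. The table and column names (`memories`, `embedding`, `content`) and the 0.7/0.3 weighting are assumptions, not the platform's actual schema:

```python
# `<=>` is pgvector's cosine-distance operator, so 1 - distance is similarity.
HYBRID_SEARCH_SQL = """
SELECT id, content,
       1 - (embedding <=> %(query_vec)s) AS semantic_score,
       ts_rank(to_tsvector('english', content),
               plainto_tsquery('english', %(query_text)s)) AS keyword_score
FROM memories
ORDER BY 0.7 * (1 - (embedding <=> %(query_vec)s))
       + 0.3 * ts_rank(to_tsvector('english', content),
                       plainto_tsquery('english', %(query_text)s)) DESC
LIMIT 10;
"""
```

Executed via any Postgres driver (e.g. psycopg), with the query text embedded once and passed as `query_vec`, this returns the ten memories that best match both meaning and keywords.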

Hardware Evolution

From prototype to production-ready humanoid platform

Current Version (2024-2025)
Development platform with Logitech webcam and Jetson Nano
[Photo: current Cogs setup with Logitech webcam, display showing the animated face interface, and Jetson Nano hardware]

• Logitech C922 webcam for face recognition

• Jetson Nano 4GB development board

• Portable display with animated face UI

• 23 microservices running in Docker

• Full software stack operational

Future Vision (2025-2026)
Production humanoid with Luxonis OAK-D and servo-driven expressions
[Render: future Cogs design with Luxonis OAK-D camera, tablet display, Dynamixel servos, and Jetson AGX Orin in a custom enclosure]

• Luxonis OAK-D Pro depth camera with onboard AI

• Jetson AGX Orin 64GB for local LLM inference

• Tablet display with animated expressions

• Dynamixel smart servos for smooth motion

• Custom enclosure with pan/tilt mechanism

• ReSpeaker mic array for far-field audio

What Makes Cogs Different

Traditional AI Chatbots

❌ Stateless (no memory between sessions)

❌ Emotionally prompted (respond to sentiment in text)

❌ Context-free (no awareness of situation)

❌ Relationship-blind (treat everyone the same)

Cogs Research Platform

Stateful: Remembers all interactions

Emotionally Aware: Decides when emotion is appropriate

Context-Rich: Knows time, place, weather, activity

Relationship-Oriented: Builds individual profiles

Self-Aware: Has internal emotional state (CESS)

Learning: Adapts based on feedback and boundaries

Early Success Signals

524 Interactions Completed

Recognition

Successfully identifies family members and requests introductions for new faces

Baseline Shifts

Adjusts responses when users shift from "Happy" to "Sad" baselines during test scenarios

Loneliness Detection

"Connect" bucket drains over time, prompting Cogs to express loneliness if ignored

"In real life, when you perceive someone else as emotional, your brain combines signals from your eyes, ears, nose, mouth... An AI model would need much more of this information."

Lisa Feldman Barrett, Neuroscientist

This is why Cogs integrates multi-modal inputs: vision, voice, context (weather, time, location), and memory to build a richer understanding of emotional states.

Current Capabilities (2025)

What Cogs Can Do Now

✅ Recognize faces and remember people

✅ Maintain conversation history with semantic search

✅ Enrich conversations with weather, location, time context

✅ Track relationship development over time

✅ Maintain internal emotional state (CESS)

✅ Learn boundaries and adapt communication style

✅ Ask clarifying questions thoughtfully

✅ Apologize genuinely when crossing boundaries

✅ Consolidate memories during "dream mode"

✅ Import personal data from Google/Facebook/Amazon

✅ Handle phone calls with voice AI (Twilio)

✅ Generate speech with viseme animation

✅ Control physical servos for facial expressions

What's In Development

🚀 Hume AI emotion detection from voice

🚀 Facial expression emotion analysis

🚀 Proactive conversation initiation

🚀 "On this day" memory surfacing

🚀 Cross-source data correlation (PML)

🚀 Voice authentication for privacy

Hardware Development Plan

Two build configurations: fast prototype path and production-ready premium build

Prototype / Budget Build
Fastest path to working demo, fully upgradable

$1.5K - $1.8K

Jetson Orin Nano Super (8 GB)

$249 • Starter brain with JetPack 6

Luxonis OAK-D Pro (Wide)

$399 • Depth+RGB+IR, onboard AI

ReSpeaker Mic Array v2.0

$64 • Far-field + DoA/beamforming

1TB NVMe SSD + Micro Servos

8-12 MG90S servos for face/pan/tilt

✓ Upgradable to AGX Orin later
✓ Full software stack included
✓ 3D-printed head shell

Production / Premium Build
Rich awareness, smoother motion, 24/7 operation

$3.2K - $4.0K

Jetson AGX Orin 64 GB Dev Kit

$1,999 • ~275 TOPS, local RAG/Dream Mode

Luxonis OAK-D Pro + Smart Servos

Dynamixel XL-330/XW with feedback

60 GHz mmWave + VOC/CO₂

Human presence, air quality sensing

2TB NVMe + Production Shell

Shielding, serviceability, premium finish

✓ 360° situational awareness (opt. LiDAR)
✓ Advanced emotion detection
✓ Nightly dream consolidation

Prototype Build - Bill of Materials
Complete component list for budget build (no LiDAR/VOC)
| Subsystem | Part / Model | Qty | Est. $ | Notes |
|---|---|---|---|---|
| Compute | Jetson Orin Nano Super (8 GB) | 1 | 249 | Starter brain; JetPack 6 |
| Storage | NVMe SSD 1 TB (PCIe 4.0) | 1 | 120 | Transcripts, embeddings, logs |
| Vision | Luxonis OAK-D Pro (Wide) | 1 | 399 | Depth+RGB+IR, onboard AI |
| Audio In | ReSpeaker Mic Array v2.0 (USB) | 1 | 64 | Far-field + DoA/beamforming |
| Audio Out | Compact powered speakers (3.5 mm) | 1 | 30 | TTS output |
| Motion MCU | Teensy 4.1 | 1 | 30 | Real-time servo control |
| Servo Expander | PCA9685 16-ch (opt.) | 1 | 15 | More PWM channels |
| Actuators | Micro servos (MG90S class) | 8–12 | ~80 | Face + pan/tilt |
| Displays | Front LCD ~11.6″ HDMI IPS | 1 | 174 | Face UI |
| Displays | Rear status touch LCD ~7″ | 1 | 70 | Config/diagnostics |
| Power (servos) | 5 V 10–20 A regulated PSU | 1 | 75 | Isolated from Jetson PSU |
| USB / IO | Powered USB 3.0 hub (7-port) | 1 | 50 | Stable power for OAK-D + mics |
| Env sensors | BME280 + ambient light sensor | 1 | 15 | Comfort + auto-dim |
| Presence (opt.) | 60 GHz mmWave human-presence | 1 | 30–45 | Detect nearby in dark |
| Mechanical | Head shell + mounts (3D-print) | 1 | 250–500 | Brackets, trays, covers |
| Wiring/Misc | Cables, harness, standoffs, heat-shrink | 1 set | 100 | Build kit |
Production Build - Bill of Materials
Complete component list for premium build (no LiDAR)
| Subsystem | Part / Model | Qty | Est. $ | Notes |
|---|---|---|---|---|
| Compute | Jetson AGX Orin 64 GB Dev Kit | 1 | 1,999 | ~275 TOPS; local RAG/Dream Mode |
| Storage | NVMe SSD 2 TB (PCIe 4.0+) | 1 | 200 | Transcripts, embeddings, snapshots |
| Vision | Luxonis OAK-D Pro (Wide) | 1 | 399 | Low-light depth; offload inference |
| Audio In | ReSpeaker Mic Array v2.0 | 1 | 64 | Far-field + DoA |
| Audio Out | Compact powered speakers | 1 | 30 | TTS |
| Audio Fusion SW | Whisper + openSMILE + emotion model | SW | | Pipeline (direction+tone+text) |
| Motion MCU | Teensy 4.1 | 1 | 30 | Deterministic PWM + watchdog |
| Servo Control | PCA9685 or Dynamixel interface | 1 | 15–60 | Choose per actuator type |
| Actuators | Smart servos (Dynamixel XL-330/XW) | 8–12 | 400–1,200 | Smoother, feedback, safer |
| Displays | Front LCD ~11.6″ HDMI IPS | 1 | 174 | Face UI |
| Displays | Rear status touch LCD ~7″ | 1 | 70 | Relationship cards, logs |
| Presence | 60 GHz mmWave sensor | 1 | 30–45 | Human presence/breathing |
| Env sensors | BME280 + ambient light sensor | 1 | 15 | Comfort + auto-dim |
| Air quality | VOC + CO₂ module | 1 | 20–60 | Context + safety logging |
| Situational (opt.) | RPLIDAR A2 (2D 360°) | 1 | 230 | 360° approach awareness |
| Power (servos) | 5 V 20 A PSU (fused rail) | 1 | 90 | Isolated from Jetson |
| Networking | Powered USB 3.0 hub + Wi-Fi 6E dongle | 1 each | 50 + 60 | Bandwidth + fast backhaul |
| Mechanical | Production head shell/brackets | 1 | 400–800 | Shielding, serviceability |
| Dream Mode | (nightly jobs; included in SW stack) | | | Summarize/prune/re-index |
Config B Vision

OAK-D Pro (Wide) for robust low-light depth and onboard AI acceleration

Audio Fusion

ReSpeaker → VAD/DoA/SPL → Whisper ASR → openSMILE/emotion → fused event
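The fusion stage at the end of that pipeline can be sketched as a single event type that combines what each upstream stage produced. The upstream calls (VAD, Whisper, openSMILE) are mocked as plain dicts here; field names are assumptions chosen to match the stages named above:

```python
from dataclasses import dataclass

@dataclass
class FusedSpeechEvent:
    text: str           # transcript from Whisper ASR
    direction_deg: int  # DoA angle from the ReSpeaker mic array
    spl_db: float       # sound pressure level
    emotion: str        # label from the openSMILE/emotion stage

def fuse(asr: dict, doa: dict, emotion: dict) -> FusedSpeechEvent:
    """Combine the per-stage outputs into one event for the dialog service."""
    return FusedSpeechEvent(
        text=asr["text"],
        direction_deg=doa["angle"],
        spl_db=doa["spl"],
        emotion=emotion["label"],
    )

event = fuse({"text": "hello there"}, {"angle": 90, "spl": 62.5}, {"label": "neutral"})
print(event.text, event.direction_deg)  # hello there 90
```

Downstream consumers then see one message carrying direction, tone, and text, rather than three separate streams to correlate.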

Dream Mode

Nightly summarization, pruning, vector re-index for long-term relationship memory

Expansion Ready: Headers reserved for LiDAR and VOC/CO₂ sensors. Add them later without rewiring.

Microservices Architecture

• Face UI (:8070) - Front-facing display
• Control Panel (:8090) - Operator interface
• Vision (:8085) - Face recognition
• Perception (:8086) - Audio processing
• TTS (:8087) - Speech synthesis
• Anim (:8089) - Servo control
• Relations-PG (:8092) - Relationship DB
• Dialog (:8093) - Conversation AI
• Dream (:8096) - Memory consolidation
• Telemetry (:8095) - System metrics
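With each service on its own port, a quick liveness sweep is straightforward. The ports come from the list above; the `/health` path is an assumption about each service's API:

```python
import urllib.request

# Ports as listed above; names are informal labels for this sketch.
SERVICES = {"face-ui": 8070, "control-panel": 8090, "vision": 8085,
            "perception": 8086, "tts": 8087, "anim": 8089,
            "relations-pg": 8092, "dialog": 8093, "dream": 8096,
            "telemetry": 8095}

def check(host: str = "localhost", timeout: float = 2.0) -> dict[str, bool]:
    """Return {service: reachable} by probing an assumed /health endpoint."""
    status = {}
    for name, port in SERVICES.items():
        url = f"http://{host}:{port}/health"
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                status[name] = resp.status == 200
        except OSError:  # connection refused, timeout, DNS failure, ...
            status[name] = False
    return status
```

The Telemetry service presumably aggregates something similar into the Control Panel's status dashboard.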

Technology Stack

• FastAPI - Backend
• PostgreSQL - Database
• Docker - Containers
• Node.js - Frontend