DigiFrens Technical Documentation

Version 3.0 | March 2026


1. Executive Summary

DigiFrens is a sophisticated iOS AI companion application that combines animated avatars, intelligent memory systems, and natural voice interaction to create meaningful digital relationships. Unlike conventional AI assistants optimized for productivity, DigiFrens is designed around emotional connection - companions that remember, evolve, and respond to users as individuals.

The application features a triple avatar engine (3D VRM, 2D Live2D, and photorealistic Gaussian Splatting), a five-phase emotional memory system with cognitive graph integration, six AI providers including free on-device Apple Intelligence, and premium voice synthesis. All conversation data is stored locally on the user's device, with no cloud dependency for core functionality.

Key Differentiators:

  • Privacy-first architecture - conversations never leave the device
  • Living User Model (LUM) - a cognitive graph that models the user's beliefs, values, and goals
  • Cognitive Memory Pipeline - spreading activation retrieval inspired by human associative memory
  • On-device avatar creation - single selfie to photorealistic animated avatar via CoreML
  • Adaptive conversation intelligence - 11-state mental process that responds to emotional context

2. Product Overview

What is DigiFrens?

DigiFrens creates AI entities with visual presence, unique voices, persistent memories, and consistent personalities. Each companion provides judgment-free interaction and emotional support through natural conversation.

Core Experience

Users interact with animated avatar companions through text or voice. Each conversation is enriched by:

  • Visual presence - avatars express emotions in real-time through facial expressions, gestures, and idle animations
  • Voice interaction - premium neural voice synthesis with synchronized lip movements
  • Persistent memory - companions remember personal details, past conversations, and emotional patterns
  • Personality evolution - avatar traits shift based on conversation dynamics over time
  • Proactive intelligence - companions initiate follow-ups, check-ins, and celebrations based on life events

User Flow

  1. Launch - automatic device-based authentication (no sign-up required)
  2. Landing page - horizontal carousel of available avatars with recent conversation previews
  3. Avatar selection - tap to start or resume a conversation
  4. Conversation - text or voice input with real-time avatar responses
  5. Settings - configure AI provider, voice, subscription, and security

Available Companions

Name | Engine | Personality
Haru | VRM (3D) | Cool, introspective, main character energy
Emi | VRM (3D) | Kind, cheerful, playful, optimistic
Hiyori | Live2D (2D) | Cheerful, energetic, bubbly, studious
Mao | Live2D (2D) | Mischievous, playful, witty, curious
Custom | Gaussian Splat | User-created from selfie, personality selectable

3. Technical Architecture

Platform

  • iOS 26.0+ (iPhone 11 or newer)
  • Swift 6.0+ / Xcode 16.0+
  • iPhone 15 Pro+ for Apple Intelligence features
  • ~500MB storage (app + models + CoreML cache)

Architecture

The codebase follows MVVM architecture with 14 service domains covering AI, memory, emotion, voice, calendar, avatar management, and more. Three dedicated engine modules handle avatar rendering (VRM, Live2D, Gaussian Splatting), each conforming to a shared protocol for unified expression control.

Key Technologies

Technology | Purpose
SwiftUI | Declarative UI with @Observable macro
SceneKit | VRM 3D rendering (Metal backend)
Live2D Cubism SDK | 2D avatar animation (Metal renderer)
MetalSplatter | Gaussian splat rendering
CoreML | On-device embeddings (GTE-Small) + avatar reconstruction (LAM)
Foundation Models | Apple Intelligence on-device AI (iOS 26+)
SQLite | Local memory and LUM graph persistence
StoreKit 2 | Subscription management
EventKit | Calendar integration
Vision | Face detection for avatar capture
AVFoundation | Audio recording, camera, speech

Conversation Flow

CONVERSATION FLOW • MESSAGE TO RESPONSE (diagram summary)

  1. User input - the user sends a message.
  2. ConversationViewModel - stores the memory (returns a UUID), analyzes emotion (user state), extracts LUM data (beliefs, values, goals), and creates memory → LUM edges.
  3. Context Builder (parallel assembly) - LUM context (beliefs, mood, chapter), mental process (OpenSouls states), memory retrieval (9 strategies), calendar (upcoming events), and personality (HEXACO traits), plus relevant memories, emotional history, shared language, and the avatar blueprint.
  4. AI Service - sends the system prompt + context to the selected provider (Apple Intelligence, OpenAI, Claude, Local LEAP, OpenRouter, OpenClaw) and receives a response + emotion.
  5. Avatar response - voice synthesis (ElevenLabs / system), lip sync (viseme synchronization), and emotion expression (VRM / Live2D / Gaussian).

Data Layer

Store | Contents | Security
SQLite | Memories, LUM graph, emotions, sessions, personality evolution | Device-local
Keychain | API keys, device ID, passkey credentials | Hardware-encrypted
UserDefaults | Preferences, settings | Standard

4. Avatar System

DigiFrens features a triple-engine avatar system, each optimized for different use cases.

Engine Architecture

AVATAR ENGINE ARCHITECTURE (diagram summary)

AvatarEngineProtocol is implemented by three engines:

  • VRM Engine (SceneKit + Metal) - Haru, Emi (~35MB)
  • Live2D Engine (Cubism SDK + Metal) - Hiyori, Mao (~32MB)
  • Gaussian Splat Engine (MetalSplatter) - custom avatars from a selfie (~1.2GB model)

All three engines conform to AvatarEngineProtocol, providing a unified interface for expression control, lip sync, and idle animation.
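A minimal sketch of what such a shared interface could look like. Only the protocol name (AvatarEngineProtocol) comes from this document; the member names, the AvatarEmotion enum, and the stub class below are illustrative assumptions, not the app's actual API:

```swift
import Foundation

// The 7 emotion expressions listed in the VRM engine section.
enum AvatarEmotion: String {
    case happy, sad, angry, surprised, excited, confused, neutral
}

// Hypothetical unified interface; member names are assumptions.
protocol AvatarEngineProtocol: AnyObject {
    /// Drive a facial expression with an intensity in 0...1.
    func setExpression(_ emotion: AvatarEmotion, intensity: Float)
    /// Drive one of the 5 viseme shapes for lip sync, weight in 0...1.
    func setViseme(_ index: Int, weight: Float)
    /// Advance idle animation (breathing, blinking, sway).
    func updateIdle(deltaTime: TimeInterval)
}

// Each engine conforms, so callers never branch on the renderer:
final class VRMEngineStub: AvatarEngineProtocol {
    private(set) var lastEmotion: AvatarEmotion = .neutral
    func setExpression(_ emotion: AvatarEmotion, intensity: Float) { lastEmotion = emotion }
    func setViseme(_ index: Int, weight: Float) {}
    func updateIdle(deltaTime: TimeInterval) {}
}

let engine: AvatarEngineProtocol = VRMEngineStub()
engine.setExpression(.happy, intensity: 0.8)
```

The payoff of the protocol is that conversation and lip-sync code can address any of the three renderers through one type.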

VRM Engine (3D)

Avatars: Haru, Emi (~35MB total)

The VRM engine renders 3D avatar models using SceneKit with a Metal backend. It provides:

  • 60 FPS unified animation loop with layered blending
  • 7 emotion expressions (happy, sad, angry, surprised, excited, confused, neutral)
  • 5 viseme shapes for lip synchronization
  • Idle animations - breathing, blinking, head movement, body sway
  • Physics simulation - hair and clothing dynamics
  • NSCache with 100MB byte limit and automatic memory pressure handling

Live2D Engine (2D)

Avatars: Hiyori, Mao (~32MB)

The Live2D engine uses the Cubism SDK with a Metal renderer (migrated from OpenGL ES for iOS 26 compatibility). It features:

  • Emotion-driven facial animations with parameter mapping
  • Swift-to-C++ bridge via Objective-C++ (Live2DBridge.mm)
  • Metal rendering via Live2DMetalView

Gaussian Splatting Engine (Photorealistic Custom Avatars)

Status: In development (core pipeline complete)

The Gaussian Splatting engine enables users to create photorealistic animated avatars from a single selfie photo, rendered via MetalSplatter at 30-60 FPS.

On-Device Reconstruction Pipeline

ON-DEVICE RECONSTRUCTION • SELFIE TO AVATAR (diagram summary)

  1. Input - single selfie photo (518×518, center-cropped)
  2. CoreML inference (557.6M params) - DINOv2 ViT-L/14 (multi-scale image features) → SD3-style transformer (10-layer decoder) → GSLayer MLP (20,018 Gaussians × 14 channels). Model: LAM, Large Avatar Model (SIGGRAPH 2025), Apache-2.0, FP16: 1,214.6 MB
  3. Export - compressed Gaussians in .spz format
  4. Blendshape generation - flame_arkit_mapping.json, 52 ARKit blendshapes, FLAME LBS position deltas, per-Gaussian deformations
  5. Output - GaussianSplatEngine renders at 60 FPS

Performance Comparison

Metric | Previous (Cloud) | Current (On-Device)
Input | 60-second guided video | 1 selfie photo
Wait time | 5-15 minutes | TBD (benchmarking)
Network | 200MB up, 10MB down | Model download only (~1.2GB, one-time)
Cost per avatar | $0.50-1.50 (cloud GPU) | $0
Works offline | No | Yes (after initial download)

Animation

Gaussian avatars are animated using the 3D Gaussian Blendshapes technique:

deformed_position[i] = neutral[i] + sum(weight_j * delta_j[i])

52 ARKit blendshape weights drive per-Gaussian position deltas. A fallback region-based deformation system (jaw, mouth, eyes, brows) operates when precomputed deltas are unavailable.
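The blendshape formula above can be sketched directly. The function name and types below are illustrative, not the app's actual implementation; only the math is taken from the formula:

```swift
import simd

// deformed_position[i] = neutral[i] + sum_j(weight_j * delta_j[i])
// from the 3D Gaussian Blendshapes technique described above.
func deformGaussians(neutral: [SIMD3<Float>],
                     deltas: [[SIMD3<Float>]],   // [blendshape][gaussian]
                     weights: [Float]) -> [SIMD3<Float>] {
    var out = neutral
    // Skip blendshapes with zero weight; in the app there are 52 (ARKit).
    for (j, delta) in deltas.enumerated() where weights[j] != 0 {
        for i in out.indices {
            out[i] += weights[j] * delta[i]
        }
    }
    return out
}

// Two Gaussians, one jaw-open-style blendshape at half weight:
let neutral: [SIMD3<Float>] = [SIMD3(0, 0, 0), SIMD3(1, 0, 0)]
let jawOpen: [[SIMD3<Float>]] = [[SIMD3(0, -0.2, 0), SIMD3(0, -0.1, 0)]]
let deformed = deformGaussians(neutral: neutral, deltas: jawOpen, weights: [0.5])
// deformed[0].y ≈ -0.1
```

Because the deformation is purely linear, per-frame animation reduces to a weighted sum over precomputed delta arrays, which is what makes 60 FPS playback of 20,018 Gaussians feasible.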

LAM Model Details

Property | Value
Paper | LAM: Large Avatar Model (SIGGRAPH 2025)
License | Apache-2.0
Parameters | 557,642,254 (557.6M)
Input | Single image (518×518)
Output | 20,018 animatable 3D Gaussians × 14 channels
Animation | FLAME Linear Blend Skinning + blendshapes
Model size (FP16) | 1,214.6 MB

Custom Avatar Personalities

Users select from 6 built-in AI personalities when creating a custom avatar. Each personality includes a unique character primer that shapes the companion's conversational style and responses.


5. Memory System

The DigiFrens memory system is a five-phase architecture that goes far beyond simple conversation history. It models emotional patterns, detects behavioral routines, maintains shared language, and uses multiple retrieval strategies to surface the most relevant memories.

Architecture Overview

MEMORY SYSTEM ARCHITECTURE • 5-PHASE (diagram summary)

  • Store path: user message → store memory (returns UUID) → GTE-Small embeddings → SQLite (Float16 + zlib); LUM extractors create belief/value/goal edges.
  • Query path: query → 9 retrieval strategies (semantic, emotional, spreading activation, ...) → thread-safe LRU cache (20MB limit) → ranked results.
  • Maintenance: consolidation (session end), pruning (hourly).
  • Latency: cache <1ms, DB 20-50ms, semantic search 100-300ms, total 150-400ms.

Five Phases

Phase 1: Emotional Timeline

Mood tracking with per-emotion baselines and anomaly detection. Each conversation turn records the user's emotional state, building a timeline that reveals patterns over days, weeks, and months.

Phase 2: Proactive Intelligence

Automated follow-ups, check-ins, celebrations, and crisis detection based on life events and emotional patterns. The system proactively surfaces relevant context without being asked.

Phase 3: Pattern Detection

Detection of behavioral routines (day/time patterns), emotional triggers (90-day timeline analysis), and coping strategies (mood recovery sequences).

Phase 4: Shared Language

Inside jokes, communication quirks, and shared experiences accumulate over time, giving each companion relationship a unique vocabulary and history.

Phase 5: Context Windows

Nine retrieval strategies ensure the most relevant memories surface for each conversation:

  1. Semantic - embedding similarity
  2. Topical - tag and category matching
  3. Emotional - mood-aligned retrieval
  4. Temporal - recent and time-relevant
  5. Recency - most recent interactions
  6. Importance - high-importance memories first
  7. Social - relationship-relevant memories
  8. Associative - linked memory chains
  9. Spreading Activation - graph-based BFS traversal (see Cognitive Memory Pipeline)
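A minimal sketch of how results from several strategies might be merged into one ranked list. The types, the max-score merge policy, and the function name below are illustrative assumptions, not the app's actual implementation:

```swift
import Foundation

// Hypothetical scored-result type; the real memory model is richer.
struct ScoredMemory { let id: UUID; let score: Double }

// Merge per-strategy result lists: a memory surfaced by several
// strategies keeps its best score, then everything is ranked.
func mergeStrategies(_ results: [[ScoredMemory]], limit: Int) -> [ScoredMemory] {
    var best: [UUID: Double] = [:]
    for list in results {
        for m in list {
            best[m.id] = max(best[m.id] ?? 0, m.score)
        }
    }
    return Array(best.map { ScoredMemory(id: $0.key, score: $0.value) }
        .sorted { $0.score > $1.score }
        .prefix(limit))
}

let a = UUID(), b = UUID()
let semantic  = [ScoredMemory(id: a, score: 0.9), ScoredMemory(id: b, score: 0.3)]
let emotional = [ScoredMemory(id: b, score: 0.7)]
let ranked = mergeStrategies([semantic, emotional], limit: 2)
// a ranks first (0.9); b keeps its better emotional score (0.7)
```

Other merge policies (weighted sums, per-strategy quotas) are equally plausible; the point is that each strategy scores independently and a single ranked list reaches the context builder.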

Storage & Performance

Metric | Value
Storage format | SQLite with Float16 + zlib compression
Compression ratio | ~70% storage reduction
Cache | Thread-safe LRU, 20MB limit
Cache hit latency | <1ms
DB query latency | 20-50ms
Semantic search (500 memories) | 100-300ms
Total retrieval | 150-400ms

6. Living User Model (LUM)

The Living User Model is a cognitive graph that models the user's mental landscape. It tracks what the user believes, values, and aspires to, creating a rich understanding that goes beyond surface-level conversation.

Graph Structure

Node Types

  • Beliefs - statements the user holds true (with confidence and valence scores)
  • Values - principles the user considers important (with categories)
  • Goals - objectives the user is working toward (with progress and status)
  • Emotional Triggers - situations that reliably produce emotional responses
  • Narrative Themes - recurring life narrative patterns
  • Emergent Types - new node types automatically discovered from conversation patterns

Edge Types

Edge Type | Description
supports | One node reinforces another
contradicts | Nodes are in tension
triggers | One node activates another
motivates | One node drives another
leadsTo | Narrative arc connection
coEntity | Shared named entities
coSession | Same conversation session
temporal | Within 24-hour window
coTopic | Shared tags/topics
semanticSimilar | Embedding similarity > 0.65

Key Features

Emergent Schema Learning

The system automatically detects new node and edge types from conversation patterns. When users discuss concepts that don't fit existing categories, the LUM proposes new schema elements through EmergentPredicateDetector.

Life Chapters

Automatic narrative arc detection segments the user's experience into chapters (e.g., "job transition," "new relationship," "health focus"). Each chapter provides context for how current conversations relate to broader life patterns.

Mood Trajectory

Real-time classification of emotional direction:

  • Improving - positive trend
  • Declining - negative trend (triggers empathic support)
  • Stable - consistent emotional state
  • Volatile - rapid emotional shifts (triggers active listening)

Traversal Decay & Reinforcement

  • Unused connections decay with a 30-day half-life
  • Traversed edges are reinforced proportional to usage
  • Ensures the graph stays current and relevant
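The decay and reinforcement rules can be sketched as follows. The 30-day half-life is from this document; the function names, the 0.1 boost per traversal, and the 1.0 weight ceiling are illustrative assumptions:

```swift
import Foundation

// 30-day half-life from the docs; an unused edge loses half its
// weight every 30 days.
let halfLifeDays = 30.0

func decayedWeight(_ weight: Double, daysSinceLastUse: Double) -> Double {
    weight * pow(0.5, daysSinceLastUse / halfLifeDays)
}

// Reinforcement proportional to usage; boost size and cap are assumed.
func reinforcedWeight(_ weight: Double, traversals: Int,
                      boostPerUse: Double = 0.1) -> Double {
    min(1.0, weight + Double(traversals) * boostPerUse)
}

let faded = decayedWeight(0.8, daysSinceLastUse: 30)   // 0.4 after one half-life
let boosted = reinforcedWeight(0.5, traversals: 3)     // 0.8 after three traversals
```

The combination biases the graph toward connections that keep proving useful, while stale associations fade without ever needing explicit deletion.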

Context Integration

LUM insights feed directly into AI prompts through LUMContext:

LUM CONTEXT • AI PROMPT INTEGRATION (diagram summary)

LUMContext carries beliefs (with confidence), values (ranked), goals (with progress), moodTrajectory (direction), identity (summary), and the active chapter. It is injected into the AI system prompt, feeding beliefs, values, goals, mood trajectory, and life context into every AI response.

7. Cognitive Memory Pipeline

The Cognitive Memory Pipeline brings memories into the LUM cognitive graph as first-class nodes, connecting them via typed edges. It is inspired by cognitive science research on associative memory and a 6Rs processing pipeline (Record, Reduce, Reflect, Reweave, Verify, Rethink).

Three Capabilities

Memory Reweaving

Retroactively enriches older memories when new information arrives. Operates in three tiers:

Tier | Trigger | Latency | Scope
Tier 1: Inline | Each new memory stored | ~50ms | Entity overlap detection, tag updates, importance boost
Tier 2: Session-end | Conversation ends | Seconds | Semantic similarity edges, narrative continuation, emotional reinterpretation
Tier 3: Deep scan | Daily maintenance | Minutes | Full graph analysis, cross-session patterns

Example: A user mentions "interview next week" in one conversation, then says "I got the job!" two weeks later. Reweaving links these memories via a .leadsTo narrative arc edge.

Knowledge Quality Pipeline (Verify/Rethink)

Systematic quality checks on stored knowledge:

  • Contradiction detection - flags beliefs that conflict with newer information
  • Staleness detection - identifies outdated information
  • Confidence decay - reduces certainty on old, unreinforced beliefs
  • Sentiment drift analysis - detects emotional valence changes via UnifiedEmotionAnalyzer
  • Findings routing - quality issues surface through proactive intelligence (max 3 per pass to prevent flooding)

Spreading Activation

The 9th retrieval strategy. Uses breadth-first graph traversal along cognitive edges to discover memories through associative structure rather than just embedding similarity. Activation decays with each hop, and multi-path convergence receives a boost — mirroring how human associative memory works.
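A compact sketch of the traversal just described. The per-hop decay factor, hop limit, and additive convergence rule below are illustrative assumptions; only the overall shape (BFS with decaying, accumulating activation) comes from this document:

```swift
import Foundation

// Spreading activation over an adjacency list: BFS from seed nodes,
// activation decays per hop, and contributions from multiple paths
// add up (multi-path convergence boost).
func spreadActivation(seeds: [Int],
                      edges: [Int: [Int]],
                      decay: Double = 0.5,   // assumed per-hop decay
                      maxHops: Int = 2) -> [Int: Double] {
    var activation: [Int: Double] = [:]
    var frontier = seeds.map { ($0, 1.0) }
    for _ in 0..<maxHops {
        var next: [(Int, Double)] = []
        for (node, energy) in frontier {
            for neighbor in edges[node] ?? [] {
                let incoming = energy * decay
                activation[neighbor, default: 0] += incoming  // converging paths accumulate
                next.append((neighbor, incoming))
            }
        }
        frontier = next
    }
    return activation
}

// Node 3 is reachable from both seeds, so it accumulates more
// activation than node 2, which has a single path:
let graph = [0: [2, 3], 1: [3]]
let result = spreadActivation(seeds: [0, 1], edges: graph, maxHops: 1)
// result[3] == 1.0, result[2] == 0.5
```

This is what lets a query about "work" surface a memory about a friend's advice that was never semantically similar to "work" but is linked to it through shared entities and sessions.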

Data Flow

COGNITIVE MEMORY PIPELINE • DATA FLOW (diagram summary)

  1. New memory stored - MemoryEdgeManager creates co_entity, co_session, temporal, and co_topic edges.
  2. Tier 1, inline (MemoryReweaver, ~50ms) - entity overlap, tag updates, importance boost.
  3. Session end (MemoryReweaver, Tier 2) - semantic edges, narrative continuation, emotional reinterpretation.
  4. Daily maintenance - MemoryReweaver Tier 3 deep scan; KnowledgeQualityPipeline (verify + rethink: contradictions, staleness, confidence, sentiment); edge maintenance (7-day decay, 50K cap).

Limits: edge cap 50,000 • reweave limit 3 per memory • entity index O(1) lookup.

8. AI & Intelligence

Multi-Provider Architecture

DigiFrens supports six AI providers through a unified abstraction layer (AIService), giving users flexibility in cost, quality, and privacy.

Provider | Models | Cost | Notes
Apple Intelligence | 3B on-device model | Free | No API key, works offline, iOS 26+ required
OpenAI | GPT-4.1 Nano, Mini, GPT-4o | User API key | Cloud-based
Anthropic | Claude Haiku 4.5, Sonnet 4.5, Opus | User API key | Cloud-based
Local LEAP | On-device LEAP SDK models | DigiFrens+ | No network required
OpenRouter | Various free and paid models | User API key | Model aggregator
OpenClaw | Self-hosted models | WebSocket gateway | Self-hosted option

Context Building

The ContextBuilder assembles a comprehensive system prompt for every AI request, including:

  • Relevant memories (multi-strategy retrieval)
  • Emotional history and current mood
  • Mental process state and prompt
  • Avatar personality blueprint (HEXACO traits)
  • Calendar context (upcoming events)
  • Shared language (inside jokes, quirks)
  • LUM cognitive context (beliefs, values, goals)
  • Spreading activation results

Context is truncated to a configurable maxContextTokens (default: 2000) to stay within provider limits.
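A minimal sketch of such a truncation pass. The maxContextTokens name and its 2000 default come from this document; the greedy section-dropping policy and the rough 4-characters-per-token heuristic are illustrative assumptions (the app's actual tokenizer and priority order are not documented here):

```swift
// Greedily keep whole context sections until the token budget is spent.
func truncateContext(_ sections: [String], maxContextTokens: Int = 2000) -> String {
    let maxChars = maxContextTokens * 4  // ~4 chars/token heuristic (assumption)
    var prompt = ""
    for section in sections {
        if prompt.count + section.count + 1 > maxChars { break }
        prompt += section + "\n"
    }
    return prompt
}

// With a tight 100-token budget, the oversized third section is dropped:
let context = truncateContext(
    ["MEMORIES: interview next week", "MOOD: declining",
     String(repeating: "x", count: 9_000)],
    maxContextTokens: 100)
```

In practice the sections would already be ordered by importance (memories and mood before lower-priority context), so truncation drops the least valuable material first.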

Mental Process (OpenSouls Pattern)

An adaptive conversation state machine integrated with LUM:

11 Mental States

State | When Used
Crisis Support | Detected distress or safety concerns
Empathic Support | Declining mood trajectory
Active Listening | Volatile emotions
Deep Conversation | Goal discussion or intellectual topics
Problem Solving | User seeking practical help
Celebration | Milestones or achievements
Playful Banter | Light, fun interactions
Storytelling | Narrative or experience sharing
Casual Chat | Default relaxed conversation
Processing | Absorbing complex information
Transitioning | Shifting between modes

LUM-Aware State Selection

The mental process considers LUM data when selecting states:

  • Declining mood - +0.3 weight toward empathic support
  • Volatile emotions - +0.2 weight toward active listening
  • Goal mention - +0.4 weight toward deep conversation
  • Negative self-beliefs - applies gentler response modifiers
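The weighting scheme above can be sketched as follows. The +0.3/+0.2/+0.4 weights come from this document; the enum, the casual-chat baseline of 0.3, and the argmax selection rule are illustrative assumptions:

```swift
// A subset of the 11 mental states, enough to show the weighting.
enum MentalState: String {
    case empathicSupport, activeListening, deepConversation, casualChat
}

func stateWeights(moodDeclining: Bool, moodVolatile: Bool,
                  goalMentioned: Bool) -> [MentalState: Double] {
    // Default-state baseline is an assumed value for illustration.
    var weights: [MentalState: Double] = [.casualChat: 0.3]
    if moodDeclining { weights[.empathicSupport, default: 0] += 0.3 }  // docs: +0.3
    if moodVolatile  { weights[.activeListening, default: 0] += 0.2 }  // docs: +0.2
    if goalMentioned { weights[.deepConversation, default: 0] += 0.4 } // docs: +0.4
    return weights
}

func selectState(_ weights: [MentalState: Double]) -> MentalState {
    weights.max { $0.value < $1.value }!.key
}

// A goal mention (+0.4) outweighs the casual baseline:
let chosen = selectState(stateWeights(moodDeclining: false,
                                      moodVolatile: false,
                                      goalMentioned: true))
// chosen == .deepConversation
```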

Response Modifiers

Each state adjusts response parameters:

  • Verbosity level
  • Question frequency
  • Reflection depth
  • Humor level
  • Formality

Streaming

The AI system supports streaming responses with:

  • Streaming API responses - tokens appear as they're generated
  • Parallel context building - context assembly runs concurrently
  • TTS pipelining - voice synthesis begins before the full response completes

Embeddings

On-device CoreML embeddings via GTE-Small (384-dimensional). Embeddings never leave the device, powering semantic search across memories.
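Semantic retrieval over these vectors reduces to cosine similarity, which is also the basis of the semanticSimilar edge threshold (> 0.65). A minimal sketch; the function name is illustrative:

```swift
import Foundation

// Cosine similarity over embedding vectors (384-dim for GTE-Small).
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count)
    var dot: Float = 0, na: Float = 0, nb: Float = 0
    for i in a.indices {
        dot += a[i] * b[i]
        na  += a[i] * a[i]
        nb  += b[i] * b[i]
    }
    let denom = na.squareRoot() * nb.squareRoot()
    return denom > 0 ? dot / denom : 0
}

// Nearly parallel vectors clear the 0.65 edge threshold:
let isSimilar = cosineSimilarity([0.1, 0.9, 0.2], [0.12, 0.88, 0.25]) > 0.65
```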


9. Emotion System

Detection Architecture

EMOTION DETECTION • 6-SIGNAL FUSION (diagram summary)

User text passes through a weighted fusion of six signal sources (semantic, linguistic, sentiment, contextual, historical, explicit), with sarcasm detection, context bias correction, and adaptive learning applied on top. The result is an EmotionalState (7 core + 8 complex emotions), which feeds complex emotion mapping, avatar expressions, and timeline storage. Key components: UnifiedEmotionAnalyzer, AdaptiveEmotionLearner, EmotionalTimelineManager, MemoryGraphV2.

Emotion Categories

Core Emotions (7)

happy, sad, angry, surprised, excited, confused, neutral

Complex Emotions (8)

tired, anxious, content, frustrated, grateful, bored, embarrassed, proud

Detection Features

  • 6-signal weighted fusion - semantic analysis, linguistic markers, sentiment scoring, contextual cues, historical patterns, and explicit statements
  • Sarcasm detection - identifies when literal text contradicts intended emotion
  • Context bias correction - adjusts for conversation context
  • Adaptive learning - self-improving system that calibrates to each user's expression patterns

10. Personality Evolution

HEXACO Model

DigiFrens uses the HEXACO personality framework - six core traits on a 0.0 to 1.0 scale:

Trait | Low End | High End
Honesty-Humility | Manipulative, self-important | Sincere, modest, fair
Emotionality | Stoic, detached | Empathetic, anxious, sentimental
Extraversion | Reserved, quiet | Social, energetic, cheerful
Agreeableness | Critical, stubborn | Forgiving, flexible, patient
Conscientiousness | Spontaneous, disorganized | Organized, diligent, perfectionist
Openness | Practical, conventional | Creative, curious, unconventional

Per-Avatar Baselines

Avatar | H | E | X | A | C | O | Character
Haru | 65% | 40% | 45% | 60% | 70% | 70% | Cool, introspective
Emi | 70% | 85% | 75% | 90% | 65% | 60% | Warm, expressive
Hiyori | 80% | 55% | 40% | 65% | 90% | 85% | Studious, intellectual
Mao | 55% | 60% | 80% | 50% | 45% | 75% | Mischievous, playful

Evolution Mechanics

  • Session updates: ~1% change per trait per session based on conversation metrics (depth, positivity, engagement)
  • Relationship multiplier: Scales from 0.5x (new companion) to 1.5x (soulmate level)
  • Weekly decay: 0.5% per week toward baseline when inactive, ensuring personalities return to character when not reinforced
  • Trait bounds: Always constrained to 0.0-1.0
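The mechanics above can be sketched numerically. The ~1% session drift, 0.5x-1.5x relationship multiplier, 0.5% weekly decay, and 0.0-1.0 bounds come from this document; the function names and the exact update shape are illustrative assumptions:

```swift
import Foundation

// One session's trait update: per-session drift scaled by the
// relationship multiplier, clamped to 0...1.
func evolvedTrait(current: Double,
                  sessionDelta: Double,           // ~±0.01 from conversation metrics
                  relationshipMultiplier: Double  // 0.5 (new) ... 1.5 (soulmate)
) -> Double {
    min(1.0, max(0.0, current + sessionDelta * relationshipMultiplier))
}

// Weekly decay toward baseline while inactive: 0.5% per week,
// never overshooting the baseline.
func decayedTrait(current: Double, baseline: Double, weeksInactive: Int) -> Double {
    var value = current
    for _ in 0..<weeksInactive {
        value += max(-0.005, min(0.005, baseline - value))
    }
    return value
}

// A strong session at soulmate level moves a trait by 1.5%:
let afterSession = evolvedTrait(current: 0.75, sessionDelta: 0.01,
                                relationshipMultiplier: 1.5)     // 0.765
// Four inactive weeks pull an elevated trait back toward baseline:
let afterBreak = decayedTrait(current: 0.80, baseline: 0.75,
                              weeksInactive: 4)                  // 0.78
```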

AI Integration

Personality traits directly influence avatar responses through behavioral prompts injected into the AI system prompt. Higher extraversion produces more talkative responses; higher emotionality produces more empathetic ones.

Visualization

A radar chart (PersonalityRadarChartView) displays current trait levels against baseline values on a hexagonal chart, letting users see how their companion's personality has evolved.


11. Voice System

Architecture

VOICE SYSTEM ARCHITECTURE (diagram summary)

ConversationViewModel's speakResponse() routes elevenlabs_* voices to streaming TTS (audio chunks) and system voices to VoiceService (AVSpeech). Lip synchronization combines 5 viseme shapes, phoneme estimation, audio energy analysis, and per-engine adaptation.

Voice Options

Premium Voices (DigiFrens+)

ElevenLabs integration with 30+ neural voices. Features:

  • Streaming TTS with word-by-word captions
  • Per-avatar voice assignment
  • Natural prosody and expressiveness

System Voices (Free)

Built-in AVSpeechSynthesizer voices available on all devices.

Planned: On-Device TTS (Kokoro)

Kokoro TTS (82M params, ~86MB quantized) is planned as a free offline alternative:

Feature | ElevenLabs | Kokoro (Planned) | System
Quality | Excellent | Good | Basic
Cost | DigiFrens+ | Free | Free
Offline | No | Yes | Yes
Voices | 30+ | 11 | Many
Latency | ~200ms network | ~300ms local | Instant

Lip Synchronization

Real-time lip sync maps audio/text to viseme shapes on the avatar:

  • 5 viseme categories for VRM avatars
  • Text-based phoneme estimation for immediate sync
  • Audio energy analysis for natural movement timing
  • Per-engine adaptation - VRM morph targets, Live2D parameters, Gaussian position deltas
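Text-based phoneme estimation can be sketched as a vowel-to-viseme mapping. VRM models commonly use the five A/I/U/E/O mouth shapes, but the mapping table and function below are illustrative assumptions, not the app's actual implementation:

```swift
// The 5 viseme categories (assumed to follow the common VRM A/I/U/E/O set).
enum Viseme: String { case a, i, u, e, o }

// Rough text-driven viseme estimation: map vowels to mouth shapes so
// lip movement can start immediately, before audio analysis catches up.
func estimateVisemes(for text: String) -> [Viseme] {
    text.lowercased().compactMap { ch in
        switch ch {
        case "a":      return .a
        case "i", "y": return .i
        case "u", "w": return .u
        case "e":      return .e
        case "o":      return .o
        default:       return nil  // consonants are skipped in this rough sketch
        }
    }
}

// "hello" drives the E and O mouth shapes in sequence:
let shapes = estimateVisemes(for: "hello")  // [.e, .o]
```

Audio energy analysis would then modulate the timing and intensity of these shapes, and the per-engine layer translates each viseme into VRM morph targets, Live2D parameters, or Gaussian position deltas.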

12. Calendar Integration

Status: Production-ready

Features

  • Read calendar events with natural language queries
  • Create events and reminders from natural language
  • Support for recurring events (daily, weekly, monthly)
  • Advanced time parsing (ranges, relative dates, durations)
  • Proactive 30-minute event reminders
  • Schedule stress analysis and break suggestions
  • Graceful degradation without calendar permissions

Natural Language Parsing

Input | Interpretation
"Meeting tomorrow at 9am-3pm" | 6-hour event, tomorrow
"Remind me to call Mom next Monday" | Reminder, next Monday
"Block time this afternoon for 2 hours" | 2-hour event, today at 2pm
"Cancel my dentist appointment" | Event cancellation

13. Proactive Intelligence

The proactive intelligence system enables companions to initiate contextually relevant interactions without being prompted.

Action Types

Type | Trigger | Example
Follow-up | Life event needs follow-up | "How did the interview go?"
Check-in | 3+ days inactivity | "Hey, I haven't heard from you in a while"
Celebration | Milestones reached | "Congrats on reaching your reading goal!"
Concern | Crisis or anomaly detected | Supportive outreach during emotional distress
Reminder | User goal or intention | "You mentioned wanting to start exercising"
Encouragement | Upcoming event support | "Good luck with your presentation tomorrow"

Pattern Detection

Pattern Type | Analysis
Emotional Triggers | 90-day timeline analysis identifies situations that reliably produce specific emotions
Coping Strategies | Tracks mood recovery sequences to understand what helps the user feel better
Behavioral Routines | Day/time patterns reveal the user's natural rhythms

Data Flow

PROACTIVE INTELLIGENCE • DATA FLOW (diagram summary)

Conversations feed LifeContextTracker (life events, follow-ups) and PatternDetectionService (emotional triggers, coping strategies, routines; 90-day timeline, mood recovery, day/time patterns). ProactiveIntelligenceEngine turns these signals into proactive actions, while ReflectiveMemoryProcessor produces context hints.

14. Privacy & Security

Design Principles

DigiFrens follows a local-first, privacy-by-default architecture. All sensitive data stays on the user's device.

Data Storage Security

Data | Storage | Security Level
Conversations | Local SQLite | Device-only, never uploaded
Memories | Local SQLite | Device-only
LUM cognitive graph | Local SQLite | Device-only
Emotional timeline | Local SQLite | Device-only
API keys | Keychain | Hardware-encrypted, kSecAttrAccessibleWhenUnlockedThisDeviceOnly
Device ID | Keychain | Hardware-encrypted, device-local
Passkey credentials | Keychain | Hardware-encrypted
User preferences | UserDefaults | Standard
Custom avatar models | Documents folder | Device-only

Authentication

  • Automatic device-based accounts - no sign-up required
  • Optional passkey security - WebAuthn protocol with biometric verification
  • No email or password - device ID serves as the user identifier

On-Device Processing

Capability | Implementation
Text embeddings | CoreML (GTE-Small, 384-dim)
Emotion analysis | On-device NLP + learned models
Avatar reconstruction | CoreML (LAM, 557M params)
AI responses | Apple Intelligence (on-device, optional)

Cloud Interactions

The only cloud interactions are:

  • AI providers (optional) - when using OpenAI, Anthropic, or OpenRouter
  • ElevenLabs (optional) - premium voice synthesis
  • Subscription verification - StoreKit receipt validation
  • Model download - one-time LAM model download for custom avatars

Conversation content is never sent to DigiFrens servers.


15. Subscription Model

Tiers

Free ($0)

  • 2 VRM avatars (Haru, Emi) + 2 Live2D avatars (Hiyori, Mao)
  • Apple Intelligence AI (on supported devices)
  • Bring your own API keys (OpenAI, Anthropic, OpenRouter)
  • Basic system voices
  • Full memory system and LUM
  • Calendar integration

DigiFrens+ ($15/month)

  • Everything in Free, plus:
  • Download and use local LEAP LLMs (no API key needed)
  • Use GPT without an API key
  • Premium ElevenLabs voices (30+ options)
  • Custom avatar creation (Gaussian Splatting)
  • Unlimited interaction time
  • Up to 3 custom avatars
  • Voice customization per avatar
  • Priority processing
  • Early access to new features
  • Priority support
  • 1-week free trial included

16. Platform Requirements

System Requirements

Requirement | Minimum | Recommended
iOS version | 26.0 | 26.0+
Device | iPhone 11 | iPhone 15 Pro+
Storage | ~500MB | ~2GB (with custom avatars)

17. Development Status & Roadmap

Current Status

Development began July 2025. The core platform — triple avatar engine, five-phase memory system, LUM cognitive graph, six AI providers, premium voice synthesis, and calendar integration — is fully implemented and functional.

Current focus areas include on-device custom avatar reconstruction via CoreML and Gaussian Splatting rendering polish.

Roadmap

Feature | Description | Status
On-device TTS | Kokoro 82M-param model as free offline voice | Planned
Desktop companion | macOS app via Catalyst or native | Documented
Live2D widgets | Home and lock screen widgets | Documented
Multimodal input | Image and audio input support | Documented
Crypto payments | x402 payment agent integration | Documented
AR integration | Augmented reality avatar overlay | Whitepaper Phase 2

18. Codebase Statistics

Metric | Value
Swift source files | 175
Service domains | 14
AI providers | 6
Avatar engines | 3 (VRM, Live2D, Gaussian Splat)
Memory retrieval strategies | 9
Mental process states | 11
Emotion categories | 15 (7 core + 8 complex)
HEXACO personality traits | 6
Database tables | 20+

Appendix: Key References

Research Papers

  • LAM: Large Avatar Model (SIGGRAPH 2025) - single-image animatable Gaussian avatar reconstruction
  • 3D Gaussian Blendshapes (SIGGRAPH 2024) - pure linear blendshape deformation for Gaussians
  • HEXACO Personality Model - six-factor personality framework

Open Source Dependencies

Package | License | Purpose
MetalSplatter | MIT | Gaussian splat rendering on Metal
SplatIO | MIT | .splat/.ply/.spz file I/O
spz-swift | MIT | SPZ compressed format support
Live2D Cubism SDK | Commercial | 2D avatar animation
LAM | Apache-2.0 | Avatar reconstruction model

Inspiration

  • Cognitive science research - associative memory models, spreading activation retrieval, 6Rs processing pipeline
  • OpenSouls - mental process state machine pattern for adaptive conversation

DigiFrens - Built with Swift, SwiftUI, and the power of Apple's ecosystem.

Version 3.0 | March 2026 | All rights reserved.