WhatsApp Local Agent - System Documentation
Overview
WhatsApp Local Agent is a standalone ProjectOS module that automates WhatsApp Web locally, similar to scraper-local-agent but specialized for messaging.
Key features:
- ✅ Send messages (text, images, audio, locations)
- ✅ Receive messages: detects incoming messages and forwards them to ProjectOS
- ✅ Trigger workflows with the whatsapp_message trigger
- ✅ Human-like behavior (anti-detection)
- ✅ Session persistence
Architecture
┌──────────────────────────────────────────────────────────────────┐
│                         ProjectOS Cloud                          │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │                     Workflow Executor                      │  │
│  │    ┌──────────────────────────────────────────────────┐    │  │
│  │    │  WhatsApp Media Sender Node                      │    │  │
│  │    │  - whatsapp_media_sender (Business API)          │    │  │
│  │    │  - whatsapp_local_sender (Local Agent)           │    │  │
│  │    └──────────────────────────────────────────────────┘    │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                │                                 │
│                                │ WebSocket                       │
└────────────────────────────────┼─────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│               WhatsApp Local Agent (Local Machine)               │
│                                                                  │
│  ┌────────────────┐    ┌────────────────┐    ┌────────────────┐  │
│  │    REST API    │    │   WebSocket    │    │   Dashboard    │  │
│  │   (Fastify)    │    │     Server     │    │   (HTML/JS)    │  │
│  └───────┬────────┘    └───────┬────────┘    └────────────────┘  │
│          │                     │                                 │
│          ▼                     ▼                                 │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │                       WhatsApp Client                        │ │
│ │   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │ │
│ │   │ Human Delays │    │  Selectors   │    │   Message    │   │ │
│ │   │              │    │              │    │   Handlers   │   │ │
│ │   └──────────────┘    └──────────────┘    └──────────────┘   │ │
│ └──────────────────────────────────────────────────────────────┘ │
│                                │                                 │
│                                ▼                                 │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │                 Browser Manager (Playwright)                 │ │
│ │   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │ │
│ │   │ Stealth Mode │    │   Session    │    │ Anti-Detect  │   │ │
│ │   │              │    │ Persistence  │    │              │   │ │
│ │   └──────────────┘    └──────────────┘    └──────────────┘   │ │
│ └──────────────────────────────────────────────────────────────┘ │
│                                │                                 │
│ ┌──────────────────────────────┼───────────────────────────────┐ │
│ │                              ▼                               │ │
│ │                   ┌──────────────────────┐                   │ │
│ │                   │    Message Queue     │                   │ │
│ │                   │       (SQLite)       │                   │ │
│ │                   └──────────────────────┘                   │ │
│ └──────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
              ┌──────────────────────────────────────┐
              │             WhatsApp Web             │
              │          (web.whatsapp.com)          │
              └──────────────────────────────────────┘
Components
1. Server (src/server.js)
- Main Fastify server entry point
- Registers routes, middleware, and WebSocket
- Graceful shutdown handling
2. Browser Manager (src/browser/manager.js)
- Playwright browser lifecycle management
- Stealth configuration with anti-detection
- Session persistence using persistent context
- Screenshot and debugging utilities
3. WhatsApp Client (src/whatsapp/client.js)
- Core WhatsApp Web automation
- QR code scanning and authentication
- Message sending (text, image, audio, location)
- Chat reading and management
- Event emitter for real-time status
4. Human Delay (src/whatsapp/humanDelay.js)
- Random delay generation
- Human-like typing simulation
- Natural mouse movements
- Reading time calculation
- Distraction simulation
5. Selectors (src/whatsapp/selectors.js)
- WhatsApp Web DOM selectors
- Version-agnostic element targeting
- Helper functions for dynamic content
6. Message Queue (src/services/messageQueue.js)
- SQLite-based message persistence
- Retry logic with exponential backoff
- Queue statistics and monitoring
- Activity logging
7. Message Watcher (src/services/messageWatcher.js) ⭐ NEW
- Polls WhatsApp Web for new incoming messages
- Detects unread chats and extracts new messages
- Emits events for each new message received
- Forwards messages to ProjectOS Cloud via webhook/WebSocket
- Triggers workflows with the whatsapp_message trigger
8. Remote Worker (src/services/remoteWorker.js)
- WebSocket connection to ProjectOS Cloud
- Remote command execution
- Forwards incoming messages to cloud
- Heartbeat and reconnection logic
- Status reporting
9. API Routes (src/api/routes.js)
- REST endpoints for all operations
- WebSocket endpoint for real-time events
- Message watcher control endpoints
- Input validation and error handling
Human-Like Behavior
The agent implements several techniques to mimic human behavior:
Typing Delays
// Random delay between keystrokes (30-80ms)
for (const char of text) {
  await page.keyboard.type(char);
  await delay(randomDelay(30, 80));
  // 5% chance of longer pause (thinking)
  if (Math.random() < 0.05) {
    await delay(randomDelay(200, 500));
  }
}
Mouse Movements
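// Move to element with natural trajectory
const box = await element.boundingBox();
// boundingBox() resolves to null when the element is not visible
if (!box) throw new Error('Element is not visible');
// Aim near the center, offset by a few pixels so clicks do not always land on the same spot
const x = box.x + box.width / 2 + randomDelay(-3, 3);
const y = box.y + box.height / 2 + randomDelay(-3, 3);
await page.mouse.move(x, y, { steps: randomDelay(5, 15) });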
Action Delays
// Random delay between actions (500-2000ms)
await humanSleep(500, 2000);
// Occasional "distraction" (3-8 seconds), 10% of the time
if (Math.random() < 0.1) {
  await humanSleep(3000, 8000);
}
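The snippets above call `delay`, `randomDelay`, and `humanSleep` without showing them. A minimal sketch of what such helpers might look like follows. It assumes `randomDelay(min, max)` returns a random integer in the inclusive range, which the `randomDelay(-3, 3)` pixel offset suggests; the actual code in src/whatsapp/humanDelay.js may differ. `readingTime` and its 200 words-per-minute default are hypothetical, added only to illustrate the "reading time calculation" listed for that module.

```javascript
// Hypothetical helpers; the real src/whatsapp/humanDelay.js may differ.

// Random integer in the inclusive range [min, max].
function randomDelay(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Resolve after `ms` milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Sleep for a random duration between `min` and `max` milliseconds.
function humanSleep(min, max) {
  return delay(randomDelay(min, max));
}

// Rough reading time in milliseconds, assuming ~200 words per minute
// (an illustrative figure, not taken from the agent's code).
function readingTime(text, wordsPerMinute = 200) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.round((words / wordsPerMinute) * 60_000);
}
```

With helpers like these, a "read before replying" pause could be written as `await delay(readingTime(incomingText))` ahead of the typing loop.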
Security Considerations
Anti-Detection
- Stealth plugin removes automation markers
- Rotating user agents
- Natural navigator properties
- No headless detection flags
Session Security
- Sessions stored in the local sessions/ directory
- Automatic session restoration
- Secure logout with session cleanup
Rate Limiting
- Message delays prevent flood detection
- Queue system for controlled sending
- Automatic retry with backoff
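As an illustration of "retry with backoff", the wait before each retry can be derived from the attempt number. The sketch below is not taken from src/services/messageQueue.js: the function names, the 1-second base, the 60-second cap, the 20% jitter, and the limit of 5 attempts are all assumed values chosen for the example.

```javascript
// Hypothetical retry policy; names and defaults are assumptions.
function backoffDelay(attempt, { baseMs = 1000, factor = 2, maxMs = 60_000, jitter = 0.2 } = {}) {
  // `attempt` is 1-based: 1 -> baseMs, 2 -> baseMs * factor, and so on, capped at maxMs.
  const exponential = Math.min(maxMs, baseMs * factor ** (attempt - 1));
  // Spread retries by +/- `jitter` so queued messages do not all retry at once.
  const spread = exponential * jitter * (Math.random() * 2 - 1);
  return Math.max(0, Math.round(exponential + spread));
}

// Decide whether a failed message should be queued for another attempt.
function shouldRetry(attempt, maxAttempts = 5) {
  return attempt < maxAttempts;
}
```

With jitter disabled, the first three retries wait 1000, 2000, and 4000 ms. Randomizing the wait also fits the agent's goal of avoiding regular, machine-like timing.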
Integration with ProjectOS
Incoming Message Flow (Workflow Trigger)
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│   WhatsApp Web   │────▶│ Message Watcher  │────▶│  Remote Worker   │
│  (New Message)   │     │    (Detects)     │     │    (Forwards)    │
└──────────────────┘     └──────────────────┘     └────────┬─────────┘
                                                           │
                                  ┌────────────────────────┘
                                  ▼
                    ┌────────────────────────────┐
                    │  POST /api/agents/webhook  │
                    │     (ProjectOS Cloud)      │
                    └─────────────┬──────────────┘
                                  │
                    ┌─────────────▼──────────────┐
                    │ incoming_whatsapp_messages │
                    │         (Database)         │
                    └─────────────┬──────────────┘
                                  │
                    ┌─────────────▼──────────────┐
                    │     Workflow Executor      │
                    │ trigger: whatsapp_message  │
                    └────────────────────────────┘
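The detection step in the flow above can be pictured as a polling loop that remembers which message IDs it has already emitted. The following is an illustrative sketch, not the code in src/services/messageWatcher.js: the class name, the `fetchUnread` callback, the `{ id, from, text }` message shape, and the 5-second interval are all assumptions.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical watcher skeleton. `fetchUnread` stands in for whatever reads
// unread messages from the page and resolves to an array of
// { id, from, text } objects.
class MessageWatcherSketch extends EventEmitter {
  constructor(fetchUnread, intervalMs = 5000) {
    super();
    this.fetchUnread = fetchUnread;
    this.intervalMs = intervalMs;
    this.seen = new Set();
    this.timer = null;
  }

  // Emit a 'message' event for each message whose id has not been seen yet.
  handleBatch(messages) {
    let emitted = 0;
    for (const message of messages) {
      if (this.seen.has(message.id)) continue;
      this.seen.add(message.id);
      this.emit('message', message);
      emitted += 1;
    }
    return emitted;
  }

  // One polling pass: read unread messages, then emit the new ones.
  async pollOnce() {
    return this.handleBatch(await this.fetchUnread());
  }

  // Poll on a fixed interval. Callers should attach an 'error' listener,
  // because EventEmitter throws when 'error' is emitted with none attached.
  start() {
    if (this.timer) return;
    this.timer = setInterval(() => {
      this.pollOnce().catch((err) => this.emit('error', err));
    }, this.intervalMs);
  }

  stop() {
    clearInterval(this.timer);
    this.timer = null;
  }
}
```

A forwarding component would subscribe with `watcher.on('message', ...)` and pass each message on to the cloud. In a long-running process the `seen` set grows without bound, so a real implementation would likely expire old IDs or persist them.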
Workflow Example
// Workflow with whatsapp_message trigger
{
  nodes: [
    {
      type: 'trigger',
      data: {
        config: {
          triggerType: 'whatsapp_message',
          provider: 'local-agent:whatsapp-001'
        }
      }
    },
    {
      type: 'ai_chat',
      data: {
        config: {
          prompt: 'Responde al mensaje: {{trigger.text}}'
        }
      }
    },
    {
      type: 'whatsapp_local_sender',
      data: {
        config: {
          agentId: 'whatsapp-001',
          phone: '{{trigger.from}}',
          message: '{{ai_result}}'
        }
      }
    }
  ]
}
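The `{{trigger.text}}`, `{{trigger.from}}`, and `{{ai_result}}` placeholders in the example are filled in from the data of earlier nodes when the workflow runs. How ProjectOS resolves them is not described here; the sketch below shows one plausible way to do dotted-path substitution, with the function name and the "leave unknown placeholders untouched" rule being assumptions.

```javascript
// Hypothetical resolver; ProjectOS's real template engine may behave differently.
function resolveTemplate(template, context) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    // Walk the dotted path (for example "trigger.text") through the context.
    const value = path.split('.').reduce(
      (current, key) => (current == null ? undefined : current[key]),
      context,
    );
    // Leave unknown placeholders untouched so missing data is easy to spot.
    return value === undefined ? match : String(value);
  });
}

// Usage with values shaped like the example workflow:
const context = { trigger: { from: '+10000000000', text: 'Hola' }, ai_result: 'Hola, ¿en qué puedo ayudarte?' };
console.log(resolveTemplate('Responde al mensaje: {{trigger.text}}', context));
```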
Remote Commands
The agent listens for these command types via WebSocket:
- send_message - Send text message
- send_image - Send image with caption
- send_audio - Send audio file
- send_location - Send location
- get_chats - Get chat list
- get_messages - Get messages from chat
- status_request - Get agent status
Outgoing Commands (Cloud → Agent)
The agent receives commands from ProjectOS Cloud:
- start_watcher - Start watching for new messages
- stop_watcher - Stop watching
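One way to route these command types is a lookup table from type to handler. The sketch below is hypothetical: the `{ type, payload }` command shape, the `{ ok, result | error }` reply shape, and the function names are assumptions, not the protocol implemented in src/services/remoteWorker.js.

```javascript
// Command types listed in this document.
const KNOWN_COMMANDS = [
  'send_message', 'send_image', 'send_audio', 'send_location',
  'get_chats', 'get_messages', 'status_request',
  'start_watcher', 'stop_watcher',
];

// Hypothetical dispatcher; command and reply shapes are assumptions.
function createDispatcher(handlers) {
  // A Map avoids accidental matches on inherited keys such as "constructor".
  const table = new Map(Object.entries(handlers));
  return {
    has: (type) => table.has(type),
    // Documented command types that have no handler registered yet.
    missing: () => KNOWN_COMMANDS.filter((type) => !table.has(type)),
    async dispatch(command) {
      const handler = table.get(command.type);
      if (!handler) {
        return { ok: false, error: `Unknown command type: ${command.type}` };
      }
      try {
        return { ok: true, result: await handler(command.payload ?? {}) };
      } catch (err) {
        return { ok: false, error: err.message };
      }
    },
  };
}
```

Returning an error object instead of throwing keeps a bad command from taking down the WebSocket connection, and `missing()` gives a quick startup check that every documented command has a handler.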
File Structure
whatsapp-local-agent/
├── src/
│   ├── api/
│   │   └── routes.js           # REST API routes
│   ├── browser/
│   │   └── manager.js          # Playwright browser management
│   ├── services/
│   │   ├── messageQueue.js     # SQLite message queue
│   │   ├── messageWatcher.js   # Incoming message detection ⭐
│   │   └── remoteWorker.js     # ProjectOS cloud connection
│   ├── whatsapp/
│   │   ├── client.js           # Main WhatsApp client
│   │   ├── humanDelay.js       # Human-like timing utilities
│   │   └── selectors.js        # DOM selectors
│   └── server.js               # Main entry point
├── ui/
│   └── index.html              # Web dashboard
├── sessions/                   # Browser sessions (gitignored)
├── data/                       # Database and temp files (gitignored)
├── .env.example                # Environment template
├── .gitignore
├── package.json
└── README.md
Differences from WhatsApp Business API
| Feature | Business API | Local Agent |
|---|---|---|
| Account Type | Business | Personal |
| Setup | Meta verification | QR scan |
| Cost | Per-message pricing | Free |
| Rate Limits | Official limits | Self-managed |
| Templates | Required outside the 24h customer service window | Not needed |
| Media | Upload API | Direct send |
| Reliability | High (official) | Depends on DOM |
| Updates | Stable API | May need selector updates |
Troubleshooting
Common Issues
QR Code Not Showing
- Set HEADLESS=false in .env
- Run npm run install-browser
Session Expired
- Delete the sessions/ folder
- Re-scan the QR code
Elements Not Found
- WhatsApp Web may have updated
- Update selectors in selectors.js
Rate Limited
- Increase delays in .env
- Reduce message frequency
Debugging
# Enable debug logging
LOG_LEVEL=debug npm start
# Save screenshots
DEBUG_SCREENSHOTS=true npm start
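The variables mentioned in this section (`HEADLESS`, `LOG_LEVEL`, `DEBUG_SCREENSHOTS`) are read from the environment or from .env. As a rough sketch of how they might be turned into typed settings: the function name, the accepted truthy spellings, and the defaults (headless on, `info` logging, screenshots off) are assumptions, so check .env.example for the real names and values.

```javascript
// Hypothetical config loader; only HEADLESS, LOG_LEVEL and DEBUG_SCREENSHOTS
// appear in this document, and the defaults used here are assumptions.
function loadConfig(env = process.env) {
  // Treat "1", "true" and "yes" (any case) as true; anything else as false.
  const flag = (value, fallback) =>
    value === undefined ? fallback : ['1', 'true', 'yes'].includes(String(value).toLowerCase());
  return {
    headless: flag(env.HEADLESS, true),
    logLevel: env.LOG_LEVEL || 'info',
    debugScreenshots: flag(env.DEBUG_SCREENSHOTS, false),
  };
}
```

Passing the environment in as a parameter makes it easy to check how a given .env will be interpreted, for example `loadConfig({ HEADLESS: 'false' })` while diagnosing a missing QR code.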
Future Enhancements
- Group message support ✅
- Workflow trigger with filters ✅ (see WHATSAPP_TRIGGER_SYSTEM.md)
- STT - Audio transcription ✅ (Whisper, Groq, OpenAI)
- ITT - Image analysis ✅ (Gemini, OpenAI, LLaVA)
- Media download from chats
- Status/Story viewing
- Contact sync
- Multiple accounts
- Message templates
- Scheduled messages
Related Documentation
- WHATSAPP_TRIGGER_SYSTEM.md - Complete trigger system: filters, STT, ITT
- WHATSAPP_MEDIA_SENDER.md - Sending multimedia messages
Created as part of ProjectOS - Personal Automation Platform