Dockerized WhatsApp CRM Backend

A Dockerized WhatsApp-style CRM backend that ingests messages and media via webhooks, extracts text from images with OCR, and turns conversations into searchable business data, with crash-safe queued processing.

Node.js, Express, PostgreSQL, Redis, BullMQ, Docker Compose, Python, OCR

Highlights

  • Webhook ingestion pipeline storing messages/media with sender/group/timestamp metadata
  • OCR microservice (Python + pytesseract) extracts IDs/indexes from images
  • Auto-categorization from message context, with each item attached to the correct contact or group
  • BullMQ queueing with retries for heavy processing and crash resilience
  • Generates structured PDF reports and maintains audit trails for traceability

README

Dockerized WhatsApp CRM Backend

Overview

A self-hosted, Dockerized WhatsApp-style CRM backend that ingests incoming messages and media via webhooks and converts them into searchable business data.

Incoming images are stored with metadata (sender, group, timestamp) and passed through an OCR pipeline that extracts IDs/indexes. The results are then categorized using the surrounding message context, so search and reporting stay consistent.
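
The OCR step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the ID format (`ID_PATTERN`) and the function names `extract_ids` and `ocr_image` are assumptions made for the example. The only real library call is `pytesseract.image_to_string`, which needs Pillow, pytesseract, and the Tesseract binary installed.

```python
import re
from typing import List

# Hypothetical ID format: 2-4 letters, an optional dash, then 4-8 digits
# (for example "INV-20431"). The real project's ID shapes are not documented.
ID_PATTERN = re.compile(r"\b[A-Z]{2,4}-?\d{4,8}\b")


def extract_ids(ocr_text: str) -> List[str]:
    """Return unique ID-like tokens from OCR output, in order of first appearance."""
    seen = set()
    ids = []
    # Uppercase first so "inv-20431" and "INV-20431" count as the same ID.
    for match in ID_PATTERN.finditer(ocr_text.upper()):
        token = match.group(0)
        if token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


def ocr_image(path: str) -> str:
    """Run Tesseract on an image file and return the recognized text."""
    # Imported lazily so extract_ids() stays usable without the OCR dependencies.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(path))
```

In use, the two pieces would be chained as `extract_ids(ocr_image("photo.jpg"))`. Keeping the text parsing separate from the OCR call makes the ID rules easy to test without any image files.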

Key Features

  • Webhook ingestion for messages + media
  • Media storage with structured metadata (sender/group/timestamp)
  • OCR extraction (Python + pytesseract) for key text (indexes/IDs)
  • Context-based categorization and attachment to the correct group/contact
  • Crash-safe async processing via BullMQ:
    • retries
    • durability across restarts
  • Structured PDF report generation
  • Audit trails for traceability (what happened, when, why)
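
As a rough sketch of how context-based categorization and an audit entry might fit together, the example below scores a message against keyword rules and records what was decided, when, and why. The rule table `RULES`, the `AuditEntry` fields, and the `categorize` function are all hypothetical; the source does not describe the actual categorization logic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# Hypothetical rules: category name -> keywords that suggest it.
RULES: Dict[str, List[str]] = {
    "invoice": ["invoice", "payment", "amount due"],
    "delivery": ["delivered", "shipment", "tracking"],
}


@dataclass(frozen=True)
class AuditEntry:
    at: datetime   # when it happened
    action: str    # what happened
    reason: str    # why it happened


def categorize(context: str, rules: Dict[str, List[str]] = RULES) -> Tuple[str, AuditEntry]:
    """Pick the category with the most keyword hits in the message context."""
    text = context.lower()
    best, best_hits = "uncategorized", []
    for category, keywords in rules.items():
        hits = [kw for kw in keywords if kw in text]
        # Strictly greater: on a tie, the category listed first wins.
        if len(hits) > len(best_hits):
            best, best_hits = category, hits
    if best_hits:
        reason = "matched keywords: " + ", ".join(best_hits)
    else:
        reason = "no keyword matched"
    entry = AuditEntry(datetime.now(timezone.utc), f"categorized as {best}", reason)
    return best, entry
```

Returning the audit entry alongside the category means every decision carries its own explanation, which the caller can persist next to the message.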

Architecture

  • Node.js + Express backend (webhook ingestion + API)
  • PostgreSQL with migrations for durable data storage
  • Redis + BullMQ for reliable background processing
  • Python microservice for OCR workloads
  • Docker Compose multi-service deployment with persistent volumes (DB + stored media)
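
The retry behaviour that the queue provides can be illustrated with a small, library-free sketch. The project itself uses BullMQ (a Node.js library) backed by Redis; the Python below is only an assumed model of a retry policy with exponential backoff, and `backoff_delay` and `run_with_retries` are hypothetical names. It does not reproduce durability across restarts, which in the described setup would come from jobs being persisted in Redis.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base_seconds: float = 1.0) -> float:
    """Delay after failed attempt number `attempt` (1-based): base, 2*base, 4*base, ..."""
    return base_seconds * (2 ** (attempt - 1))


def run_with_retries(
    job: Callable[[], T],
    max_attempts: int = 3,
    base_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `job`, retrying on any exception until `max_attempts` is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(backoff_delay(attempt, base_seconds))
    # Only reached when the loop never ran.
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the policy testable without real waiting, for example by passing a list's `append` method to record the delays.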

Business Impact

Built to eliminate a real operational bottleneck: manual data transfer from chat attachments into spreadsheets. Automation reduced the workflow from hours to seconds, while keeping everything self-hosted for privacy and control.