Upload a WhatsApp, Telegram, or Facebook Messenger export and get instant, deep intelligence — 35+ analytics modules, question analysis, conversation starters, emoji sentiment, topic clusters, burst analysis, year-over-year comparison, and much more. Everything runs 100% in your browser. Your data never leaves your device.
Three simple steps from export to insight. The entire process takes under 2 minutes.
Export a chat from your preferred platform and save it to your device (WhatsApp: .txt file · Telegram: result.json · Facebook: message_1.json).
Select your platform tab, then upload your file. Click Run Full Analysis.
Navigate the 9 result tabs to explore different dimensions of your conversation.
Select your platform, upload your file. All parsing is instant and local.
WhatsApp: .txt · Telegram: result.json · Facebook: message_1.json
🔒 Zero upload. Zero server. Your file never leaves your device — ever.
35+ analytics modules computed from your export. Use the tabs below to navigate.
When is the group most active?
Which day drives the most conversation?
Full trend with 7-day rolling average overlay
Month-by-month message volume
Days with ≥ 2× the daily average
Consecutive days with at least one message
35+ analytics modules. Zero cost. Zero server. Zero AI API.
WhatsApp .txt (Android, iOS, US AM/PM), Telegram result.json, Facebook message_1.json. Auto-detects format. Fixes Facebook's UTF-8 encoding. Handles multi-line messages.
Hourly, weekday, daily trend with 7-day rolling average, monthly, surge detection, streak calendar.
Messages, share %, media, links, emoji-only, avg words, avg chars, first/last date. Sortable table.
Word frequency, word cloud, emoji cloud, message types, length distribution, domain intelligence.
Top 15 bigrams and trigrams — revealing actual phrases and expressions used in the group.
Type-Token Ratio per group and per sender — standard linguistic diversity metric.
Who responds first most often — the most responsive, most engaged participant.
Longest and current consecutive active-day streaks. 90-day streak calendar.
Spiral-placed, logarithmically scaled, theme-aware word cloud. No external library.
Enter any keyword — see what words appear most often in the same message.
Detects question messages (? or question words). Ranks per-sender to reveal who drives vs responds.
Who initiates new sessions (gaps ≥ 60 min)? Distinguishes initiators from responders.
Rapid back-and-forth bursts (3+ messages, < 3 min gaps). Average and longest burst length.
Annual message count, unique senders, avg per active day — for chats spanning 2+ years.
Morning / Afternoon / Evening / Night breakdown. Late-night (11 PM–5 AM) percentage.
Average unique senders per active day — measures day-to-day breadth of participation.
Automatically detected conversation themes: Work, Faith, Family, Money, Events, Tech, Sports.
Messages per active hour — intensity metric independent of date range.
Top 5 longest messages with preview — announcements, essays, important posts.
Classifies emojis as positive / negative / neutral using a curated emoji dictionary.
VADER-inspired lexicon, monthly trend, per-sender sentiment stacked bar chart.
0–100 composite from participation balance, response speed, sentiment, consistency.
10+ plain-language observations generated from your data. No AI. Rule-based templates.
Full-text search with sender, date, type filters. Keyword highlighting. Up to 200 results.
Theme toggle + Aa accessibility font-size toggle. Charts re-render on theme switch.
CSV, JSON, Text Report, Markdown Report, Print/PDF.
10 algorithm explanations with Key Concept callouts. Written for all skill levels.
Zero server. Zero API. FileReader API only. Verify in browser Network Inspector.
Every algorithm explained in plain language. Both NLP concepts and the data science pipeline. Written by a 15+ year STEM educator.
The first task is reading the exported file and understanding its structure. Different platforms export in completely different formats.
Telegram: The messages array is traversed. Each entry's text field may be a plain string or an array of entity objects; both are flattened by joining the entity text values.
Facebook: … timestamp_ms). Facebook double-encodes Unicode as Latin-1; ChatLens applies decodeURIComponent(escape(str)) to fix garbled text automatically.
WhatsApp: The regular expression /^\[(\d{1,2}\/\d{1,2}\/\d{2,4}),\s*(\d{1,2}:\d{2})\]\s+(.+?):\s(.+)$/ extracts date, time, sender, and message in one pass. The same technique is used in production log parsers, web scrapers, and ETL pipelines.
Raw parsed data needs cleaning and enrichment. Derived features are extracted from each timestamp (hour, dayOfWeek, month, year), equivalent to pandas' df['hour'] = df['timestamp'].dt.hour. System messages are tagged and excluded. Messages are classified as text, media, link, or emoji_only.
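As a rough illustration of the parsing and feature steps above, here is a minimal sketch in JavaScript. It reuses the regular expression and the decodeURIComponent(escape(str)) fix quoted in the text; the function names, the day/month/year date order, the two-digit-year rule, and the "<Media omitted>" marker are assumptions made for the example, not ChatLens's actual code.

```javascript
// Regular expression quoted in the text: [date, time] sender: message
const LINE_RE = /^\[(\d{1,2}\/\d{1,2}\/\d{2,4}),\s*(\d{1,2}:\d{2})\]\s+(.+?):\s(.+)$/;

// Hypothetical classifier: media, link, emoji_only, or text.
function classify(text) {
  if (/<Media omitted>/i.test(text)) return 'media'; // assumed placeholder string
  if (/https?:\/\/\S+/i.test(text)) return 'link';
  const hasEmoji = /\p{Extended_Pictographic}/u.test(text);
  const hasWordChars = /[\p{L}\p{N}]/u.test(text);
  if (hasEmoji && !hasWordChars) return 'emoji_only';
  return 'text';
}

// Parse one export line and derive hour, dayOfWeek, month, year.
// Assumption: dates are day/month/year; real exports vary by locale.
function parseLine(line) {
  const m = LINE_RE.exec(line);
  if (!m) return null; // not a message header (e.g. a continuation line)
  const [, date, time, sender, text] = m;
  const [day, month, yearRaw] = date.split('/').map(Number);
  const year = yearRaw < 100 ? 2000 + yearRaw : yearRaw; // assumed rule for 2-digit years
  const [hour, minute] = time.split(':').map(Number);
  const timestamp = new Date(year, month - 1, day, hour, minute);
  return {
    sender,
    text,
    timestamp,
    hour: timestamp.getHours(),
    dayOfWeek: timestamp.getDay(), // 0 = Sunday
    month: timestamp.getMonth() + 1,
    year: timestamp.getFullYear(),
    type: classify(text),
  };
}

// The Facebook fix named in the text: reinterpret Latin-1 code points as UTF-8 bytes.
function fixMojibake(str) {
  try {
    return decodeURIComponent(escape(str));
  } catch {
    return str; // leave the string alone if it was not double-encoded
  }
}
```

A line that does not match returns null, which a caller could treat as the continuation of the previous multi-line message.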
EDA summarises dataset characteristics before modelling. Messages are grouped by hour, weekday, month, and sender using hash-map accumulation — equivalent to df.groupby('hour')['msg'].count(). Rolling averages use a 7-day centred window. Surge detection flags days where count ≥ 2× daily average.
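The grouping, rolling-average, and surge steps described above might be sketched as follows. The function names are hypothetical, and truncating the window at the start and end of the series is an assumption; the text does not say how edges are handled.

```javascript
// Hash-map accumulation: count items per key (like groupby(...).count()).
function countBy(items, keyFn) {
  const counts = new Map();
  for (const item of items) {
    const key = keyFn(item);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Centred rolling mean; radius 3 gives a 7-day window.
// Assumption: near the edges the window is truncated to the available days.
function rollingMean(dailyCounts, radius = 3) {
  return dailyCounts.map((_, i) => {
    const start = Math.max(0, i - radius);
    const end = Math.min(dailyCounts.length, i + radius + 1);
    const window = dailyCounts.slice(start, end);
    return window.reduce((a, b) => a + b, 0) / window.length;
  });
}

// Indices of days whose count is at least twice the daily average.
function surgeDays(dailyCounts) {
  if (dailyCounts.length === 0) return [];
  const avg = dailyCounts.reduce((a, b) => a + b, 0) / dailyCounts.length;
  return dailyCounts
    .map((count, i) => (count >= 2 * avg ? i : -1))
    .filter((i) => i >= 0);
}
```

For example, countBy(messages, (m) => m.hour) would give the hourly distribution, assuming each message object carries an hour field.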
rolling[i] = mean(counts[i-3], …, counts[i+3]), the seven daily counts centred on day i.
N-grams are contiguous sequences of N words. Bigrams (N=2): "good morning", "thank you". Trigrams (N=3): "I don't know", "happy new year". For each message, adjacent word pairs and triplets are extracted and counted in frequency maps. Bigrams and trigrams that consist only of stopwords are filtered out.
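A small sketch of n-gram counting as described, assuming a simple Unicode-aware tokeniser and a tiny illustrative stopword list (the real list is not given in the text; all names here are hypothetical):

```javascript
// Assumption: a tiny stopword list for illustration only.
const STOPWORDS = new Set(['i', 'the', 'a', 'to', 'you', 'and', 'of', 'is', 'it']);

// Lowercase and split into word tokens (letters, digits, apostrophes).
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}']+/gu) || [];
}

// Count every run of n adjacent tokens, skipping runs made only of stopwords.
function ngramCounts(messages, n) {
  const counts = new Map();
  for (const text of messages) {
    const tokens = tokenize(text);
    for (let i = 0; i + n <= tokens.length; i++) {
      const gram = tokens.slice(i, i + n);
      if (gram.every((w) => STOPWORDS.has(w))) continue;
      const key = gram.join(' ');
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

// Most frequent entries first, e.g. the top 15 bigrams.
function topN(counts, k = 15) {
  return [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, k);
}
```

N-grams are extracted per message, so a phrase never spans the boundary between two messages.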
TTR = Unique Words ÷ Total Words. Measures linguistic diversity. Group chats typically score 0.1–0.4 because of repeated phrases. Higher per-sender TTR suggests more varied, expressive language. Computed independently for each of the top 10 senders.
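The ratio can be illustrated with a short sketch. The tokeniser and the { sender, text } message shape are assumptions for the example:

```javascript
// Lowercase and split into word tokens (letters, digits, apostrophes).
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}']+/gu) || [];
}

// Type-Token Ratio: unique words divided by total words.
function typeTokenRatio(texts) {
  const tokens = texts.flatMap(tokenize);
  if (tokens.length === 0) return 0;
  return new Set(tokens).size / tokens.length;
}

// TTR per sender; messages are assumed to look like { sender, text }.
function ttrBySender(messages) {
  const textsBySender = new Map();
  for (const { sender, text } of messages) {
    if (!textsBySender.has(sender)) textsBySender.set(sender, []);
    textsBySender.get(sender).push(text);
  }
  const result = new Map();
  for (const [sender, texts] of textsBySender) {
    result.set(sender, typeTokenRatio(texts));
  }
  return result;
}
```

For instance, "good morning" plus "good night" has 3 unique words out of 4, a TTR of 0.75.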
~400 signal words embedded in analytics.js. Each message is tokenised. Positive and negative hits are counted. Score > 0 → Positive; Score < 0 → Negative; Score = 0 → Neutral. A separate emoji sentiment module classifies emojis using a curated positive/negative emoji dictionary — providing a second, independent sentiment signal.
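A minimal sketch of the lexicon scoring described above. The two word sets below are a tiny hypothetical stand-in for the ~400-word lexicon, which the text does not list:

```javascript
// Assumption: illustrative word lists, not the actual lexicon.
const POSITIVE = new Set(['good', 'great', 'thanks', 'love', 'happy']);
const NEGATIVE = new Set(['bad', 'sad', 'hate', 'angry', 'sorry']);

// Score = positive hits minus negative hits.
function scoreMessage(text) {
  const tokens = text.toLowerCase().match(/[\p{L}']+/gu) || [];
  let score = 0;
  for (const token of tokens) {
    if (POSITIVE.has(token)) score += 1;
    if (NEGATIVE.has(token)) score -= 1;
  }
  return score;
}

// Map the score to the three labels used in the text.
function sentimentLabel(score) {
  if (score > 0) return 'Positive';
  if (score < 0) return 'Negative';
  return 'Neutral';
}
```

A message with one positive and one negative hit scores 0 and is labelled Neutral, the same as a message with no hits at all.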
Question detection: Messages containing "?" or starting with question words (who, what, when, where, why, how, is, are, can, could, etc.) are flagged. Ranked per sender to reveal who drives discussions.
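One way to sketch this rule, assuming only the question words listed above (the "etc." in the text means the real list is longer) and messages shaped like { sender, text }:

```javascript
// Assumption: only the question words named in the text.
const QUESTION_WORDS = new Set([
  'who', 'what', 'when', 'where', 'why', 'how', 'is', 'are', 'can', 'could',
]);

// A message is a question if it contains "?" or starts with a question word.
function isQuestion(text) {
  if (text.includes('?')) return true;
  const match = text.trim().toLowerCase().match(/^[a-z']+/);
  return match !== null && QUESTION_WORDS.has(match[0]);
}

// Question count per sender, highest first.
function questionsBySender(messages) {
  const counts = new Map();
  for (const { sender, text } of messages) {
    if (isQuestion(text)) counts.set(sender, (counts.get(sender) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```

Matching the whole first word avoids false positives such as "however" being read as "how".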
Conversation starters: A session begins when the gap since the last message is ≥ 60 minutes. The first sender in each new session is the "starter" — distinguishing initiators from responders.
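The session rule can be sketched like this, assuming messages are sorted by time, carry a millisecond timestamp, and that the very first message of the export also counts as a session start (the text does not say):

```javascript
const SESSION_GAP_MS = 60 * 60 * 1000; // 60 minutes, as stated in the text

// Count how many sessions each sender started.
// Assumption: messages look like { sender, timestamp } with timestamp in ms.
function conversationStarters(messages) {
  const starts = new Map();
  let previous = null;
  for (const { sender, timestamp } of messages) {
    const newSession = previous === null || timestamp - previous >= SESSION_GAP_MS;
    if (newSession) starts.set(sender, (starts.get(sender) || 0) + 1);
    previous = timestamp;
  }
  return starts;
}
```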
Bursts: Sequences of 3+ consecutive messages where each gap < 3 minutes. Total bursts, average length, and max length are computed. Distribution across 4 size buckets (3–5, 6–10, 11–20, 20+) is shown.
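A sketch of burst detection under the thresholds given above. It takes a sorted array of millisecond timestamps (an assumed input shape) and leaves out the size-bucket distribution:

```javascript
const BURST_GAP_MS = 3 * 60 * 1000; // gaps must be under 3 minutes
const MIN_BURST_LENGTH = 3;         // at least 3 messages

// Find runs of consecutive messages where every gap is below the threshold.
function findBursts(timestamps) {
  const lengths = [];
  let run = timestamps.length > 0 ? 1 : 0;
  for (let i = 1; i <= timestamps.length; i++) {
    const continues =
      i < timestamps.length && timestamps[i] - timestamps[i - 1] < BURST_GAP_MS;
    if (continues) {
      run += 1;
    } else {
      if (run >= MIN_BURST_LENGTH) lengths.push(run);
      run = 1; // the current message starts a new run
    }
  }
  const total = lengths.length;
  const sum = lengths.reduce((a, b) => a + b, 0);
  return {
    total,
    average: total > 0 ? sum / total : 0,
    longest: total > 0 ? Math.max(...lengths) : 0,
  };
}
```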
Pace score: Total messages ÷ unique hours with activity. Measures intensity independent of date range. A group exchanging 500 messages in 10 active hours (pace = 50) is more intense than one exchanging 1000 messages over 500 active hours (pace = 2).
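As a sketch, an "active hour" can be treated as a distinct clock hour on a distinct date that contains at least one message. Bucketing by whole hours since the Unix epoch is an assumption made here for simplicity:

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Pace = total messages / number of distinct hour buckets with activity.
// Assumption: timestamps are in milliseconds; buckets are epoch hours (UTC-aligned).
function paceScore(timestamps) {
  const activeHours = new Set(timestamps.map((t) => Math.floor(t / HOUR_MS)));
  return activeHours.size > 0 ? timestamps.length / activeHours.size : 0;
}
```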
Co-occurrence: For each message containing a keyword, all other non-stopword tokens are collected and counted — revealing semantic context.
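A minimal co-occurrence sketch, assuming whole-word keyword matching and a small illustrative stopword list (both are assumptions; the text does not specify either):

```javascript
// Assumption: a tiny stopword list for illustration only.
const DEFAULT_STOPWORDS = new Set(['the', 'a', 'to', 'and', 'is', 'at']);

// Count the words that share a message with the keyword, most frequent first.
function coOccurrence(messages, keyword, stopwords = DEFAULT_STOPWORDS) {
  const target = keyword.toLowerCase();
  const counts = new Map();
  for (const text of messages) {
    const tokens = text.toLowerCase().match(/[\p{L}\p{N}']+/gu) || [];
    if (!tokens.includes(target)) continue; // keyword not in this message
    for (const token of tokens) {
      if (token === target || stopwords.has(token)) continue;
      counts.set(token, (counts.get(token) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```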
Topic clusters: Seed word dictionaries define topics (Work, Faith, Family, Money, Tech, etc.). If 2+ seed words appear in the top 30 most-used words, the topic is detected. Score = total frequency of matched seed words.
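The seed-word rule could look roughly like this. The seed lists are hypothetical examples, and the input is assumed to be a Map of word to frequency with stopwords already removed:

```javascript
// Assumption: illustrative seed words; the real dictionaries are not given.
const TOPIC_SEEDS = {
  Work: ['meeting', 'deadline', 'office', 'project'],
  Money: ['pay', 'transfer', 'bank', 'price'],
};

// A topic is detected when at least minSeeds of its seed words appear
// among the topN most-used words; score = summed frequency of those seeds.
function detectTopics(wordCounts, topN = 30, minSeeds = 2) {
  const top = new Map(
    [...wordCounts.entries()].sort((a, b) => b[1] - a[1]).slice(0, topN)
  );
  const detected = [];
  for (const [topic, seeds] of Object.entries(TOPIC_SEEDS)) {
    const matched = seeds.filter((word) => top.has(word));
    if (matched.length >= minSeeds) {
      const score = matched.reduce((sum, word) => sum + top.get(word), 0);
      detected.push({ topic, score });
    }
  }
  return detected.sort((a, b) => b.score - a.score);
}
```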
Composite 0–100 from 4 factors: Participation Balance (Gini-based, 25pts), Response Speed (% < 5 min, 25pts), Sentiment Ratio (pos/total, 25pts), Engagement Consistency (% days with messages, 25pts).
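To make the composite concrete, here is a hedged sketch. The Gini computation is the standard formula for a sorted sample; scaling each of the other three factors linearly from a 0 to 1 share up to 25 points is an assumption, as are all function and parameter names.

```javascript
// Gini coefficient of non-negative values (0 = perfectly even).
function gini(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const n = sorted.length;
  const total = sorted.reduce((a, b) => a + b, 0);
  if (n === 0 || total === 0) return 0;
  let weighted = 0;
  sorted.forEach((value, i) => {
    weighted += (i + 1) * value;
  });
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

// Composite 0-100 score from four 25-point factors.
// Assumption: the three share inputs are fractions between 0 and 1.
function healthScore({ messagesPerSender, fastReplyShare, positiveShare, activeDayShare }) {
  const balance = (1 - gini(messagesPerSender)) * 25; // participation balance
  const speed = fastReplyShare * 25;                   // share of replies under 5 min
  const sentiment = positiveShare * 25;                // positive / total
  const consistency = activeDayShare * 25;             // share of days with messages
  return Math.round(balance + speed + sentiment + consistency);
}
```

With four equally active senders, 60% fast replies, 40% positive messages, and activity on 80% of days, this sketch gives 25 + 15 + 10 + 20 = 70.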
Participation Balance = (1 − Gini) × 25, so a perfectly even chat (Gini = 0) earns the full 25 points.
Data Scientist · ML Engineer · EdTech Builder · FaithTech Builder
Virtual Tutor · AI-Augmented Solutions Developer
Lagos-based Data Scientist and ML Engineer with 15+ years as a STEM educator — teaching Mathematics, Further Mathematics, Physics, Chemistry, and Computer Science in Nigeria.
Founder and Director of HMG Concepts, operating HMG Academy — a full-service virtual learning institution supporting WAEC, NECO, BECE, UTME, Post-UTME, IGCSE, IELTS, JUPEB, and SAT.
Holds a B.Sc/Ed in Computer Science Education from Lagos State University (LASU, 2023). Builds data-driven tools combining analytical rigour with pedagogical clarity.
ChatLens reflects this philosophy: every algorithm is documented, every chart is annotated, and the entire pipeline is explained — because data without explanation is just noise.
All projects are live and free.