ChatLens v4 — Deep Chat Analytics by Adewale Samson Adeagbo

Getting Started

How to Use ChatLens

Three simple steps from export to insight. The entire process takes under 2 minutes.

Export Your Chat

Export a chat from your preferred platform and save it to your device.

📱 WhatsApp

Open any chat or group
Tap ⋮ → More → Export Chat
Choose Without Media
Save the .txt file

✈ Telegram

Open Telegram Desktop (PC/Mac)
Click ⋮ → Export Chat History
Set Format: JSON
Save result.json

👤 Facebook

Go to Settings → Your Facebook Information
Click Download Your Information
Select Messages · Format: JSON
Extract ZIP → upload message_1.json

Upload & Analyse

Select your platform tab, then upload your file. Click Run Full Analysis.

💡 Select the correct platform tab (WhatsApp, Telegram, or Facebook) before uploading. This tells the parser which format to expect.

💡 Drag and drop works anywhere on the page; you don't need to click the upload button.

💡 File size limit is 50 MB. For very large chats, export a shorter date range using WhatsApp's "Export from specific date" option.

💡 Nothing is uploaded to any server. Your file is processed entirely in your browser memory and never leaves your device.

Explore Your Results

Navigate the 9 result tabs to explore different dimensions of your conversation.

📊 Activity: Hourly heat-strip, weekday distribution, daily trend, monthly volume, streak calendar, surge table

👥 Participants: Top senders, share distribution, sentiment per sender, first-reply champions, full stats table

💬 Content: Word frequency, word cloud, emojis, message types, length distribution, top domains

🔤 Language: Bigrams, trigrams, vocabulary richness (TTR), co-occurrence explorer, topic clusters

📅 Timeline: Milestones, response gap distribution, longest silent periods

😊 Sentiment: Overall tone, monthly trend, top signal words, emoji sentiment analysis

🔬 Deep Analysis (new in v4): Question analysis, conversation starters, burst analysis, year-over-year, time patterns, topic clusters

🔍 Search: Full-text search with sender, date, and type filters; results are keyword-highlighted

💡 Insights: 10+ auto-generated plain-language observations from your data

⬇ Export: Download your data as CSV, JSON, Text Report, Markdown, or PDF

💡 Each chart has an explanation box. Click the "How to read this" section under any chart to understand exactly what it shows and how to interpret it.

💡 The Learn section (in the nav) explains every algorithm used, from regex parsing to N-grams to the Gini coefficient. Written for all skill levels.

🔒

Your Privacy is Guaranteed by Architecture

ChatLens is designed so that privacy is enforced by the architecture rather than promised by policy. Your chat file is read into browser RAM using the FileReader API and processed locally by JavaScript. No file upload occurs. No network request is made during analysis. No server exists. No database exists. Closing the tab wipes all data immediately. You can verify this by opening your browser's Network Inspector (F12 → Network tab): you will see zero requests during analysis.

✅ No file upload

✅ No server

✅ No AI API

✅ No cookies

✅ No tracking

✅ Works offline

Analysis Complete

Chat Intelligence Report

35+ analytics modules computed from your export. Use the tabs below to navigate.

Messages by Hour of Day

When is the group most active?

How to read this: Each bar = total messages sent in that hour across all days. The highlighted bar is the peak hour. Use this to understand your group's active time zone and daily rhythm.

Messages by Day of Week

Which day drives the most conversation?

How to read this: Compare weekday vs weekend bars to understand whether your group is socially or professionally driven.

Daily Volume Over Time

Full trend with 7-day rolling average overlay

How to read this: The faint area = raw daily counts. The bright line = 7-day rolling average (smoothed trend). Rising trend = growing engagement. Dips reveal holidays, conflicts, or life events.

Monthly Activity

Month-by-month message volume

Activity Surges

Days with ≥ 2× the daily average

What is a surge? Any day where messages ≥ 2× the overall daily average. These spikes usually correspond to real-world events. Cross-reference with the Timeline tab to investigate what happened.

Conversation Streaks

Consecutive days with at least one message

Platform Features

Everything You Need to Understand a Conversation

35+ analytics modules. Zero cost. Zero server. Zero AI API.

📂

Multi-Platform Parsing

WhatsApp .txt (Android, iOS, US AM/PM), Telegram result.json, Facebook message_1.json. Auto-detects format. Fixes Facebook's UTF-8 encoding. Handles multi-line messages.

📊

Activity Intelligence

Hourly, weekday, daily trend with 7-day rolling average, monthly, surge detection, streak calendar.

👥

Participant Profiles

Messages, share %, media, links, emoji-only, avg words, avg chars, first/last date. Sortable table.

💬

Content Intelligence

Word frequency, word cloud, emoji cloud, message types, length distribution, domain intelligence.

🔤

N-Gram Language Patterns

Top 15 bigrams and trigrams — revealing actual phrases and expressions used in the group.

📚

Vocabulary Richness (TTR)

Type-Token Ratio per group and per sender — standard linguistic diversity metric.

🏆

First-Reply Analysis

Who responds first most often — the most responsive, most engaged participant.

🔥

Conversation Streaks

Longest and current consecutive active-day streaks. 90-day streak calendar.

☁

Canvas Word Cloud

Spiral-placed, logarithmically scaled, theme-aware word cloud. No external library.

🕸

Co-occurrence Explorer

Enter any keyword — see what words appear most often in the same message.

❓

Question Analysis v4

Detects question messages (? or question words). Ranks per-sender to reveal who drives vs responds.

🚀

Conversation Starter Analysis v4

Who initiates new sessions (gaps ≥ 60 min)? Distinguishes initiators from responders.

⚡

Burst / Reply Depth v4

Rapid back-and-forth bursts (3+ messages, < 3 min gaps). Average and longest burst length.

📅

Year-over-Year Comparison v4

Annual message count, unique senders, avg per active day — for chats spanning 2+ years.

🌙

Time of Day Patterns v4

Morning / Afternoon / Evening / Night breakdown. Late-night (11 PM–5 AM) percentage.

🧬

Message Diversity Score v4

Average unique senders per active day — measures day-to-day breadth of participation.

🎯

Topic Clusters v4

Automatically detected conversation themes: Work, Faith, Family, Money, Events, Tech, Sports.

💨

Conversation Pace Score v4

Messages per active hour — intensity metric independent of date range.

📜

Longest Messages v4

Top 5 longest messages with preview — announcements, essays, important posts.

😊

Emoji Sentiment v4

Classifies emojis as positive / negative / neutral using a curated emoji dictionary.

😊

Word-Based Sentiment

VADER-inspired lexicon, monthly trend, per-sender sentiment stacked bar chart.

📈

Conversation Health Score

0–100 composite from participation balance, response speed, sentiment, consistency.

💡

Auto-Insights

10+ plain-language observations generated from your data. No AI. Rule-based templates.

🔍

Message Search & Filter

Full-text search with sender, date, type filters. Keyword highlighting. Up to 200 results.

🌙

Dark / Light / Large Text

Theme toggle + Aa accessibility font-size toggle. Charts re-render on theme switch.

⬇

5 Export Formats

CSV, JSON, Text Report, Markdown Report, Print/PDF.

🎓

Educational Learn Section

10 algorithm explanations with Key Concept callouts. Written for all skill levels.

🔒

Fully Private Architecture

Zero server. Zero API. FileReader API only. Verify in browser Network Inspector.

Learn

How ChatLens Works — Step by Step

Every algorithm is explained in plain language, covering both the NLP concepts and the data science pipeline. Written by a STEM educator with 15+ years of teaching experience.

File Parsing & Format Detection

Foundation · Regex · File I/O

The first task is reading the exported file and understanding its structure. Different platforms export in completely different formats.

WhatsApp (.txt): Four regex patterns covering Android, iOS bracket format, and US AM/PM locale. Multi-line messages are handled by checking whether each new line starts with a valid timestamp — if not, it is appended to the previous message.
Telegram (.json): The messages array is traversed. Each entry's text field may be a plain string OR an array of entity objects — both are flattened by joining entity text values.
Facebook (.json): Timestamps are Unix milliseconds (timestamp_ms). Facebook's export writes each UTF-8 byte as a separate Latin-1 character, which garbles emoji and other non-ASCII text; ChatLens applies decodeURIComponent(escape(str)) to repair it automatically.
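The Telegram and Facebook steps above can be sketched as two small helpers. This is a minimal illustration, not the ChatLens source: the function names `flattenTelegramText` and `fixFacebookText` are hypothetical, and the sketch assumes a Telegram `text` array may mix plain strings with entity objects that carry a `text` property.

```javascript
// Hypothetical helper names; a sketch of the two fixes described above.

// Telegram: `text` is either a plain string or an array whose items are
// strings or entity objects such as { type: "bold", text: "hi" }.
function flattenTelegramText(text) {
  if (typeof text === "string") return text;
  if (!Array.isArray(text)) return "";
  return text
    .map((part) => (typeof part === "string" ? part : part.text ?? ""))
    .join("");
}

// Facebook: escape() turns each Latin-1 character back into a %XX byte,
// and decodeURIComponent() then reads that byte sequence as UTF-8.
function fixFacebookText(str) {
  try {
    return decodeURIComponent(escape(str));
  } catch {
    // Not a valid UTF-8 byte sequence (e.g. already clean text): keep as-is.
    return str;
  }
}
```

The `try`/`catch` matters because text that is already correctly decoded can make `decodeURIComponent` throw a `URIError`; returning the input unchanged keeps the repair safe to apply to every message.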

Key Concept — Regex: A pattern-matching language for strings. The pattern /^\[(\d{1,2}\/\d{1,2}\/\d{2,4}),\s*(\d{1,2}:\d{2})\]\s+(.+?):\s(.+)$/ extracts date, time, sender, and message in one pass. Same technique used in production log parsers, web scrapers, and ETL pipelines.
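As an illustration, the bracket-format pattern quoted above can drive a small line-by-line parser that also handles multi-line messages. This is a sketch under stated assumptions: `parseBracketExport` is a hypothetical name, and only this one pattern is covered, not the other WhatsApp variants ChatLens supports.

```javascript
// The pattern quoted in the text: [date, time] sender: message
const BRACKET_LINE =
  /^\[(\d{1,2}\/\d{1,2}\/\d{2,4}),\s*(\d{1,2}:\d{2})\]\s+(.+?):\s(.+)$/;

// Hypothetical parser: one message per timestamped line; any line without
// a timestamp is treated as a continuation of the previous message.
function parseBracketExport(rawText) {
  const messages = [];
  for (const line of rawText.split(/\r?\n/)) {
    const match = BRACKET_LINE.exec(line);
    if (match) {
      const [, date, time, sender, text] = match;
      messages.push({ date, time, sender, text });
    } else if (messages.length > 0) {
      messages[messages.length - 1].text += "\n" + line;
    }
  }
  return messages;
}

const sampleExport = [
  "[12/01/2024, 09:15] Ada: Good morning",
  "everyone!",
  "[12/01/2024, 09:17] Tunde: Morning",
].join("\n");

const parsedSample = parseBracketExport(sampleExport);
// parsedSample holds 2 messages; the first one's text spans two lines.
```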

Data Cleaning & Feature Engineering

Data Engineering · Derived Features

Raw parsed data needs cleaning and enrichment. Derived features are extracted from each timestamp: hour, dayOfWeek, month, year — equivalent to pandas' df['hour'] = df['timestamp'].dt.hour. System messages are tagged and excluded. Messages are classified as text, media, link, or emoji_only.
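A minimal sketch of this enrichment step follows. The names `classifyMessage` and `addFeatures`, the `<Media omitted>` placeholder check, and the emoji-only pattern are assumptions made for illustration; the actual ChatLens rules may differ.

```javascript
const URL_PATTERN = /https?:\/\/\S+/i;
// Assumption: media appears as WhatsApp's "<Media omitted>" placeholder.
const MEDIA_PLACEHOLDER = /<media omitted>/i;
// Only emoji, skin-tone modifiers, joiners, variation selectors, whitespace.
const EMOJI_ONLY =
  /^(?:\p{Extended_Pictographic}|\p{Emoji_Modifier}|\u200D|\uFE0F|\s)+$/u;

function classifyMessage(text) {
  const trimmed = text.trim();
  if (MEDIA_PLACEHOLDER.test(trimmed)) return "media";
  if (URL_PATTERN.test(trimmed)) return "link";
  if (trimmed !== "" && EMOJI_ONLY.test(trimmed)) return "emoji_only";
  return "text";
}

// Derive calendar features from a message whose `timestamp` is a Date.
function addFeatures(message) {
  const ts = message.timestamp;
  return {
    ...message,
    hour: ts.getHours(),       // 0-23, local time
    dayOfWeek: ts.getDay(),    // 0 = Sunday ... 6 = Saturday
    month: ts.getMonth() + 1,  // 1-12
    year: ts.getFullYear(),
    type: classifyMessage(message.text),
  };
}
```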

Key Concept — Feature Engineering: Creating new informative columns from existing data. Extracting hour from a timestamp unlocks "when is the group active?", a question the raw timestamp alone cannot answer. Widely regarded as one of the highest-value skills in competitive data science.

Exploratory Data Analysis (EDA)

EDA · Statistics · Aggregation

EDA summarises dataset characteristics before modelling. Messages are grouped by hour, weekday, month, and sender using hash-map accumulation — equivalent to df.groupby('hour')['msg'].count(). Rolling averages use a 7-day centred window. Surge detection flags days where count ≥ 2× daily average.
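The hash-map accumulation and the surge rule can be sketched as below. `countBy` and `findSurges` are hypothetical names, and the sketch assumes the daily average is taken over days that have at least one message; the text does not say whether silent days are included.

```javascript
// Group-by-and-count using a Map as the accumulator.
function countBy(messages, keyFn) {
  const counts = new Map();
  for (const message of messages) {
    const key = keyFn(message);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Flag days whose count is at least `factor` times the daily average.
// dailyCounts: Map of day label (e.g. "2024-01-15") to message count.
function findSurges(dailyCounts, factor = 2) {
  const values = [...dailyCounts.values()];
  if (values.length === 0) return [];
  const average = values.reduce((a, b) => a + b, 0) / values.length;
  return [...dailyCounts]
    .filter(([, count]) => count >= factor * average)
    .map(([day, count]) => ({ day, count, ratio: count / average }));
}
```

For example, `countBy(messages, (m) => m.hour)` yields the per-hour totals, and the same helper keyed on a day label produces the input for `findSurges`.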

Key Concept — Rolling Average: Replaces each data point with the average of itself and its neighbours, smoothing out noise to reveal the underlying trend. Used in finance (stock prices), epidemiology (COVID curves), and now ChatLens (conversation trends). Formula for the 7-day centred window: rolling[i] = mean(counts[i-3 .. i+3]), covering the three days before, the day itself, and the three days after.
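A centred rolling mean takes only a few lines. In this sketch the name `rollingMean` is hypothetical, and the choice to shrink the window at the start and end of the series (rather than leave gaps) is an assumption about edge handling.

```javascript
// Centred rolling mean: each point becomes the average of itself and up to
// `half` neighbours on each side. Near the edges the window shrinks to the
// values that actually exist.
function rollingMean(counts, windowSize = 7) {
  const half = Math.floor(windowSize / 2);
  return counts.map((_, i) => {
    const start = Math.max(0, i - half);
    const end = Math.min(counts.length, i + half + 1); // exclusive bound
    const slice = counts.slice(start, end);
    return slice.reduce((a, b) => a + b, 0) / slice.length;
  });
}
```

Note the `+ 1` on the exclusive upper bound: without it the window would cover six days instead of seven.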

N-Gram Language Analysis

NLP · N-grams · Text Mining

N-grams are contiguous sequences of N words. Bigrams (N=2): "good morning", "thank you". Trigrams (N=3): "I don't know", "happy new year". For each message, adjacent word pairs/triplets are extracted and counted in frequency maps. Bigrams and trigrams that consist only of stopwords are filtered out.
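The extraction and counting step might look like the following sketch. The tokenizer, the tiny stopword list, and the names `tokenize` and `countNgrams` are illustrative assumptions, not the ChatLens implementation.

```javascript
// Tiny illustrative stopword list; a real one would be much longer.
const NGRAM_STOPWORDS = new Set(["i", "the", "a", "to", "and", "you", "is", "of"]);

// Lowercase and split into runs of letters, digits, and apostrophes.
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}']+/gu) ?? [];
}

// Count every run of n adjacent words within each message, skipping runs
// made up entirely of stopwords. Returns [ngram, count] pairs, most
// frequent first.
function countNgrams(messages, n) {
  const counts = new Map();
  for (const text of messages) {
    const tokens = tokenize(text);
    for (let i = 0; i + n <= tokens.length; i++) {
      const gram = tokens.slice(i, i + n);
      if (gram.every((word) => NGRAM_STOPWORDS.has(word))) continue;
      const key = gram.join(" ");
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return [...counts].sort((a, b) => b[1] - a[1]);
}
```

N-grams are built per message, so the last word of one message is never paired with the first word of the next.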

Key Concept — Why N-grams? Single-word frequency shows "good" and "morning" are common separately. Only bigrams reveal they almost always appear together as "good morning", a greeting ritual. N-grams are foundational to autocomplete, statistical language models (the predecessors of neural models such as GPT), spam filters, and machine translation.

Vocabulary Richness — Type-Token Ratio

Linguistics · Stylometry

TTR = Unique Words ÷ Total Words. Measures linguistic diversity. Group chats typically score 0.1–0.4 because of repeated phrases. Higher per-sender TTR suggests more varied, expressive language. Computed independently for each top-10 sender.
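The ratio itself is simple to compute. In this sketch `typeTokenRatio` is a hypothetical name and the tokenizer (lowercased runs of letters, digits, and apostrophes) is an assumption.

```javascript
// Type-Token Ratio: distinct words divided by total words.
function typeTokenRatio(texts) {
  const tokens = texts.flatMap(
    (text) => text.toLowerCase().match(/[\p{L}\p{N}']+/gu) ?? []
  );
  if (tokens.length === 0) return 0;
  return new Set(tokens).size / tokens.length;
}

// Two messages, four words, three distinct ("good" repeats): TTR = 0.75.
const exampleTtr = typeTokenRatio(["good morning", "good night"]);
```

Because TTR falls as the total word count grows, comparing senders with very different message volumes should be done with some caution.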

Key Concept — Stylometry: The analysis of writing style, used in authorship attribution (who wrote an anonymous text?), plagiarism detection, and literary analysis. Stylometric evidence has been presented in court cases. ChatLens applies TTR, one simple stylometric measure, to reveal each participant's expressive diversity.

Sentiment Analysis — VADER-Inspired Lexicon

NLP · Sentiment · Lexicon

~400 signal words embedded in analytics.js. Each message is tokenised. Positive and negative hits are counted. Score > 0 → Positive; Score < 0 → Negative; Score = 0 → Neutral. A separate emoji sentiment module classifies emojis using a curated positive/negative emoji dictionary — providing a second, independent sentiment signal.
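The counting rule can be sketched with two tiny word sets standing in for the ~400-word lexicon. The words chosen here and the name `scoreMessage` are illustrative assumptions; the real lexicon lives in analytics.js.

```javascript
// Stand-in lexicons; the real list is far larger.
const POSITIVE_WORDS = new Set(["good", "great", "love", "thanks", "happy"]);
const NEGATIVE_WORDS = new Set(["bad", "sad", "hate", "angry", "sorry"]);

// Score = positive hits minus negative hits; the sign gives the label.
function scoreMessage(text) {
  const tokens = text.toLowerCase().match(/[\p{L}']+/gu) ?? [];
  let score = 0;
  for (const token of tokens) {
    if (POSITIVE_WORDS.has(token)) score += 1;
    else if (NEGATIVE_WORDS.has(token)) score -= 1;
  }
  const label = score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
  return { score, label };
}
```

A message with equal positive and negative hits scores 0 and is labelled neutral, the same as a message with no signal words at all.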

Key Concept — VADER vs ML: VADER (Valence Aware Dictionary and sEntiment Reasoner, Hutto & Gilbert, 2014) achieves ~85% accuracy on social media text. Neural models achieve ~92% but require a GPU or paid API. ChatLens uses VADER-inspired logic: transparent, zero cost, runs in-browser, and provides accurate directional insight at group level.

Question & Conversation Starter Analysis

Behavioural Analysis · New v4

Question detection: Messages containing "?" or starting with question words (who, what, when, where, why, how, is, are, can, could, etc.) are flagged. Ranked per sender to reveal who drives discussions.

Conversation starters: A session begins when the gap since the last message is ≥ 60 minutes. The first sender in each new session is the "starter" — distinguishing initiators from responders.
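Both rules can be sketched in a few lines. The names `isQuestion` and `countStarters` are hypothetical, the question-word list contains only the words quoted above (the text's "etc." is not filled in), and treating the very first message of the chat as a session start is an assumption.

```javascript
// Only the question words quoted in the text; the full list is longer.
const QUESTION_START = /^(who|what|when|where|why|how|is|are|can|could)\b/i;

function isQuestion(text) {
  const trimmed = text.trim();
  return trimmed.includes("?") || QUESTION_START.test(trimmed);
}

const SESSION_GAP_MS = 60 * 60 * 1000; // 60 minutes

// messages: sorted oldest first, each { sender, timestamp } with the
// timestamp in milliseconds. Returns Map of sender to sessions started.
function countStarters(messages) {
  const starters = new Map();
  let previous = null;
  for (const message of messages) {
    const startsSession =
      previous === null ||
      message.timestamp - previous.timestamp >= SESSION_GAP_MS;
    if (startsSession) {
      starters.set(message.sender, (starters.get(message.sender) ?? 0) + 1);
    }
    previous = message;
  }
  return starters;
}
```

The `\b` word boundary keeps "However" from matching the question word "how".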

Key Concept — Behavioural Network Analysis: Who asks questions and who starts conversations are measures of proactive engagement — distinct from total message count. A participant can send many messages but rarely initiate. This dimension is invisible in traditional frequency analysis but critical for understanding group dynamics.

Burst Analysis & Conversation Pace

Time Series · New v4

Bursts: Sequences of 3+ consecutive messages where each gap < 3 minutes. Total bursts, average length, and max length are computed. Distribution across 4 size buckets (3–5, 6–10, 11–20, 20+) is shown.
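Burst detection is a single pass over the sorted timestamps. This sketch uses hypothetical names (`findBursts`, `summarizeBursts`) and leaves out the size buckets.

```javascript
const BURST_GAP_MS = 3 * 60 * 1000; // gaps must be under 3 minutes
const MIN_BURST_LENGTH = 3;

// timestamps: milliseconds, sorted oldest first.
// Returns the length of every run of 3+ messages with all gaps < 3 min.
function findBursts(timestamps) {
  const bursts = [];
  let runLength = timestamps.length > 0 ? 1 : 0;
  for (let i = 1; i <= timestamps.length; i++) {
    const continues =
      i < timestamps.length &&
      timestamps[i] - timestamps[i - 1] < BURST_GAP_MS;
    if (continues) {
      runLength += 1;
    } else {
      // Run ended (or the data ended): record it if long enough.
      if (runLength >= MIN_BURST_LENGTH) bursts.push(runLength);
      runLength = 1;
    }
  }
  return bursts;
}

function summarizeBursts(bursts) {
  const total = bursts.length;
  const longest = total > 0 ? Math.max(...bursts) : 0;
  const average = total > 0 ? bursts.reduce((a, b) => a + b, 0) / total : 0;
  return { total, average, longest };
}
```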

Pace score: Total messages ÷ unique hours with activity. Measures intensity independent of date range. A group chatting 500 messages in 10 active hours (pace=50) is more intense than one chatting 1000 messages over 500 active hours (pace=2).
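As a sketch, the pace score needs only a set of distinct hour buckets. The name `paceScore` is hypothetical, and bucketing by whole hours since the Unix epoch (rather than by local clock hour) is an assumption.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// timestamps: milliseconds. An "active hour" is any distinct hour bucket
// containing at least one message.
function paceScore(timestamps) {
  const activeHours = new Set(timestamps.map((t) => Math.floor(t / HOUR_MS)));
  return activeHours.size === 0 ? 0 : timestamps.length / activeHours.size;
}
```

Four messages spread over two distinct hours give a pace of 2, however many days separate those hours.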

Key Concept — Temporal Granularity: Analysing behaviour at different time resolutions (day, hour, minute) reveals different patterns. Burst analysis works at minute resolution — capturing information invisible at hourly or daily scales. This is why meteorologists use both daily averages and hourly measurements.

Keyword Co-occurrence & Topic Clusters

Semantic Analysis · Word Embeddings

Co-occurrence: For each message containing a keyword, all other non-stopword tokens are collected and counted — revealing semantic context.
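A minimal co-occurrence counter might look like this. The name `cooccurrence` is hypothetical, and counting each companion word at most once per message is an assumption; the text does not say how repeats within a message are handled.

```javascript
// For every message containing `keyword`, count the other words in it.
// Returns [word, count] pairs, most frequent first.
function cooccurrence(messages, keyword, stopwords = new Set()) {
  const target = keyword.toLowerCase();
  const counts = new Map();
  for (const text of messages) {
    const tokens = text.toLowerCase().match(/[\p{L}\p{N}']+/gu) ?? [];
    if (!tokens.includes(target)) continue;
    // A Set ensures each companion word counts once per message.
    for (const token of new Set(tokens)) {
      if (token === target || stopwords.has(token)) continue;
      counts.set(token, (counts.get(token) ?? 0) + 1);
    }
  }
  return [...counts].sort((a, b) => b[1] - a[1]);
}
```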

Topic clusters: Seed word dictionaries define topics (Work, Faith, Family, Money, Tech, etc.). If 2+ seed words appear in the top 30 most-used words, the topic is detected. Score = total frequency of matched seed words.
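The detection rule can be sketched as follows. The seed words shown are invented examples, not the ChatLens dictionaries, and `detectTopics` is a hypothetical name.

```javascript
// Invented seed lists for two of the topics named above.
const TOPIC_SEEDS = {
  Work: ["meeting", "office", "deadline", "project"],
  Money: ["pay", "money", "transfer", "bank"],
};

// wordCounts: Map of word to frequency across the whole chat.
// A topic is detected when at least `minSeeds` of its seed words appear
// among the `topN` most-used words; its score is their summed frequency.
function detectTopics(wordCounts, topN = 30, minSeeds = 2) {
  const top = new Map(
    [...wordCounts].sort((a, b) => b[1] - a[1]).slice(0, topN)
  );
  const topics = [];
  for (const [topic, seeds] of Object.entries(TOPIC_SEEDS)) {
    const matched = seeds.filter((word) => top.has(word));
    if (matched.length >= minSeeds) {
      const score = matched.reduce((sum, word) => sum + top.get(word), 0);
      topics.push({ topic, matched, score });
    }
  }
  return topics.sort((a, b) => b.score - a.score);
}
```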

Key Concept — Word2Vec Connection: Google's Word2Vec (2013) learns word meanings from co-occurrence — words appearing in similar contexts have similar meanings ("king − man + woman = queen"). ChatLens's co-occurrence tool is the interpretable, human-readable version of the same idea: context reveals meaning.

Conversation Health Score & Gini Coefficient

Composite Metric · Economics → Data Science

Composite 0–100 from 4 factors: Participation Balance (Gini-based, 25pts), Response Speed (% < 5 min, 25pts), Sentiment Ratio (pos/total, 25pts), Engagement Consistency (% days with messages, 25pts).

Key Concept — Gini Coefficient: Originally an economics measure of income inequality (0 = perfect equality, 1 = one person has everything). ChatLens applies it to message distribution: Gini = 0 means everyone sends equally; Gini = 1 means one person sent all messages. Participation Balance score = (1 − Gini) × 25.
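The Gini coefficient and the composite can be sketched as below. The names `gini` and `healthScore` are hypothetical. Only the Participation Balance formula comes from the text; scaling each of the other three factors as fraction × 25 is an assumption. With this standard discrete formula, a group of n senders where one person sent everything scores (n − 1)/n, which approaches 1 as the group grows.

```javascript
// Gini coefficient of a list of non-negative counts, using the
// sorted-rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n.
function gini(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const n = sorted.length;
  const total = sorted.reduce((a, b) => a + b, 0);
  if (n === 0 || total === 0) return 0;
  const weighted = sorted.reduce((sum, x, i) => sum + (i + 1) * x, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

// Each share is a fraction between 0 and 1. Only the balance term is
// stated in the text; the other three scalings are assumed.
function healthScore({ messagesPerSender, fastReplyShare, positiveShare, activeDayShare }) {
  const balance = (1 - gini(messagesPerSender)) * 25;
  const speed = fastReplyShare * 25;
  const sentiment = positiveShare * 25;
  const consistency = activeDayShare * 25;
  return Math.round(balance + speed + sentiment + consistency);
}
```

For instance, four senders with equal message counts, 60% of replies under 5 minutes, 40% positive messages, and activity on every day would score 25 + 15 + 10 + 25 = 75 under these assumptions.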

About the Builder

Built by Adewale Samson Adeagbo

Data Scientist · ML Engineer · EdTech Builder · FaithTech Builder · Virtual Tutor · AI-Augmented Solutions Developer

Lagos-based Data Scientist and ML Engineer with 15+ years as a STEM educator — teaching Mathematics, Further Mathematics, Physics, Chemistry, and Computer Science in Nigeria.

Founder and Director of HMG Concepts, operating HMG Academy — a full-service virtual learning institution supporting WAEC, NECO, BECE, UTME, Post-UTME, IGCSE, IELTS, JUPEB, and SAT.

Holds a B.Sc/Ed in Computer Science Education from Lagos State University (LASU, 2023). Builds data-driven tools combining analytical rigour with pedagogical clarity.

ChatLens reflects this philosophy: every algorithm is documented, every chart is annotated, and the entire pipeline is explained — because data without explanation is just noise.

Currently Enrolled (2026)

DSN × 3MTT × Google.org DeepTech_Ready: DSML Track · Cohort 3

DSN × Microsoft Elevate AI Developers: AI-900 Track · Cohort 1

SkillBuild Hub Ambassador: Cohort 4 · Apr–Jun 2026

Contact

📞 +234 810 086 6322

📞 +234 809 448 1488

✉ buildingmyictcareer@gmail.com

✉ adeagboadewalesamson@gmail.com

Personal Links

🌐 Portfolio · GitHub · LinkedIn · X · Instagram · Facebook · YouTube · HMG Academy

HMG Concepts Brand

Instagram · X · Facebook · LinkedIn

Education

B.Sc/Ed Computer Science Education: Lagos State University (LASU) · 2023

Bammy College: 2010

Smartland Ideal High School: 2004–2009

Divine Touch Primary School: 1995–2003

Certifications

Data Science: 3MTT Nigeria · Dec 2025

Data Analysis: Digital Skillup Africa · Nov 2025

Data Science: Youthrive / Access CareerEx · Jul 2025

Digital Marketing: Afritech · Feb 2025

System Data Mgmt / Desktop Publishing / Computer Engineering: Quality Computer Training Center · 2014

Piano & Music Rudiments: Rotop Music Academy · 2013

Portfolio Projects

All projects are live and free.

Yakub Staff Promotion Prediction: ML · Random Forest · Streamlit

Insurance Claim Prediction: Classification · Logistic Regression · Streamlit

Bank Customer Churn Prediction: ML · XGBoost · Streamlit

Income Level Prediction: Classification · Streamlit

TruthLens — Fake News Detector: NLP · TF-IDF · Streamlit

SwiftChain Delivery Delay Prediction: ML · Regression · Streamlit

NeuroWell Employee Burnout Predictor: ML · Analytics · Streamlit

Student Performance Tracker: EdTech · Analytics · Streamlit

Student At-Risk Predictor: ML · Classification · Streamlit

CBT.ng — Free Computer-Based Testing: EdTech · Vanilla JS · GitHub Pages · Supabase

ChatLens v4 — Deep Chat Analytics: NLP · Vanilla JS · GitHub Pages (this project)

ChatLens v4 — Tech Stack

HTML5 · CSS3 + Custom Properties · Vanilla JavaScript ES6+ · Chart.js 4.4 · PapaParse 5 · Regex NLP · Canvas API (Word Cloud) · GitHub Pages · GitHub Actions CI/CD · Zero Backend · Zero AI API · Zero Cost
