Document Intelligence, Semantic Search
What do fifteen years of journal entries say?
Conversing with 15 Years of Journal Entries
Using natural language processing to tag and semantically analyse unstructured content, then making it queryable through LLMs.
Personal build
The problem
Fifteen years of writing, drafts, entries, and notes, locked in formats that can't be queried. It is the same problem every organization has with its customer feedback, meeting notes, and research archives.
The approach
An ingestion pipeline pulls the material and groups it into semantic buckets, an LLM tags each entry against an evolving theme taxonomy, and embeddings make the whole archive semantically searchable. The archive is then accessible through a dashboard and conversational AI assistants.
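The store-embed-search part of that pipeline can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: the sample entries, the table layout, and the function names are all hypothetical, and `embed` is a toy bag-of-words stand-in for the sentence-transformers model so the sketch runs on the standard library alone.

```python
import math
import re
import sqlite3
from collections import Counter

# Hypothetical sample entries; the real archive is not reproduced here.
ENTRIES = [
    ("2010-03-02", "Started a new job today and felt nervous about the team."),
    ("2015-07-19", "Long hike in the mountains, the quiet helped me think."),
    ("2021-11-05", "Shipped the project at work, the team celebrated."),
]


def embed(text):
    """Toy word-count vector. Assumption: stands in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(entries):
    """Load (date, body) pairs into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (id INTEGER PRIMARY KEY, written_on TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO entries (written_on, body) VALUES (?, ?)", entries
    )
    conn.commit()
    return conn


def search(conn, query, top_k=2):
    """Rank stored entries by similarity to the query, best match first."""
    query_vec = embed(query)
    rows = conn.execute("SELECT written_on, body FROM entries").fetchall()
    scored = [(cosine(query_vec, embed(body)), day, body) for day, body in rows]
    scored.sort(reverse=True)
    # Drop entries that share nothing with the query.
    return [(day, body) for score, day, body in scored[:top_k] if score > 0]
```

Calling `search(build_index(ENTRIES), "work team")` ranks the two work-related entries ahead of the hiking one. A real embedding model would additionally match entries that share meaning but no words with the query, which this toy vector cannot do.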
What it revealed
Themes and sentiment trace arcs that are invisible at the level of a single entry. With this architecture, any unstructured archive of text, notes, charts, or visuals can become a meaningful, queryable asset.
The same architecture applies to research repositories, presentation deck archives, customer feedback, meeting notes, or field survey notes, reanimating years of unstructured material that would otherwise slip into forgotten archives.
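The theme arcs mentioned above appear once per-entry tags are aggregated over time. The sketch below shows one way that roll-up could look. It is an assumption-laden illustration rather than the project's implementation: the taxonomy, the sample entries, and the keyword-based `tag_entry` are hypothetical, with `tag_entry` standing in for the LLM tagging step.

```python
import re
import sqlite3

# Hypothetical taxonomy; the real one evolves as the LLM tags more entries.
TAXONOMY = {
    "work": {"job", "project", "team"},
    "nature": {"hike", "mountains", "garden"},
}

# Hypothetical sample entries used only to exercise the roll-up.
SAMPLE_ENTRIES = [
    ("2010-03-02", "Started a new job today."),
    ("2010-09-14", "The team shipped the project."),
    ("2015-07-19", "Long hike in the mountains."),
]


def tag_entry(body):
    """Keyword matcher. Assumption: stands in for the LLM tagging call."""
    words = set(re.findall(r"[a-z]+", body.lower()))
    return sorted(theme for theme, keys in TAXONOMY.items() if words & keys)


def themes_by_year(entries):
    """Return (year, theme, entry_count) rows, ordered by year then theme."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tags (written_on TEXT, theme TEXT)")
    for written_on, body in entries:
        conn.executemany(
            "INSERT INTO tags (written_on, theme) VALUES (?, ?)",
            [(written_on, theme) for theme in tag_entry(body)],
        )
    rows = conn.execute(
        "SELECT substr(written_on, 1, 4) AS year, theme, COUNT(*) "
        "FROM tags GROUP BY year, theme ORDER BY year, theme"
    ).fetchall()
    conn.close()
    return rows
```

Each row gives how many entries in a year carried a theme, which is the shape of data a dashboard would plot as a trend line. Sentiment could be rolled up the same way if the tagger also stored a score per entry.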
Under the hood
Python · Gmail API · SQLite · LLM tagging · sentence-transformers · Natural Language Processing