Tutorial: Analyzing Chat Data with Kura

Learn how to analyze RAG system chat data through a three-part tutorial series. Work with 560 real user queries to discover patterns and build production-ready classifiers.

Prerequisites

  • Install dependencies from pyproject.toml
  • Set GOOGLE_API_KEY for Gemini
  • Download the tutorial dataset
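
Before opening the notebooks, it can help to confirm that the API key is visible to Python. The snippet below is a minimal sketch and not part of the tutorial code: the helper name `missing_env_vars` is hypothetical, and it only checks that `GOOGLE_API_KEY` is set, not that the key is valid.

```python
import os


def missing_env_vars(required=("GOOGLE_API_KEY",), env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the real process environment; passing a dict
    makes the check easy to try out without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```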

Download Dataset

Tutorial Series

1. Cluster Conversations

Discover user query patterns through topic modeling and clustering. In this dataset, three major topics account for 67% of queries, and artifact management appears in 61% of conversations.
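
A figure such as "the top three topics cover a given share of queries" comes from a simple count over cluster assignments. The sketch below shows that arithmetic only; it does not use Kura, the function name `top_k_share` is hypothetical, and the labels are toy placeholders rather than the tutorial dataset.

```python
from collections import Counter


def top_k_share(labels, k=3):
    """Fraction of items that fall into the k most common labels."""
    counts = Counter(labels)
    if not counts:
        return 0.0
    covered = sum(n for _, n in counts.most_common(k))
    return covered / sum(counts.values())


# Toy cluster assignments, one label per query (made up for illustration).
toy_labels = ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"] * 1
share = top_k_share(toy_labels, k=3)
```

Here the three largest toy clusters hold 9 of the 10 items.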

2. Better Summaries

Transform generic summaries into domain-specific insights. Build custom summarization models that turn seven vague clusters into three actionable categories: Access Controls, Deployment, and Experiment Management.

3. Building Classifiers

Convert clustering insights into production classifiers. Build real-time systems that automatically categorize new queries and scale your insights.
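
To make the idea of classifying new queries concrete, here is a deliberately simple keyword baseline. It is an illustrative stand-in, not the classifier built in the notebook: the category names come from part 2 of the series, while the keyword lists, the `classify` function, and the `Other` fallback are assumptions made for this sketch.

```python
# Hypothetical keyword lists; a real system would learn these from labeled data.
KEYWORDS = {
    "Access Controls": ("permission", "role", "access"),
    "Deployment": ("deploy", "endpoint", "serve"),
    "Experiment Management": ("experiment", "run", "sweep"),
}


def classify(query, keywords=KEYWORDS, default="Other"):
    """Pick the category whose keywords occur most often in the query."""
    text = query.lower()
    # Substring matching: crude, e.g. "run" also matches "runtime".
    scores = {
        label: sum(term in text for term in terms)
        for label, terms in keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```

A baseline like this is mainly useful as a sanity check to compare a stronger classifier against.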

What You'll Learn

  • Systematically analyze large volumes of user queries
  • Build custom models for your specific domain
  • Create production systems for automatic query classification
  • Make data-driven decisions about system improvements

Ready to start? Begin with the first notebook below.