Overview
This dataset contains 1,250,000 Q&A pairs collected from Reddit. It is intended for training conversational AI models and chatbots, and may also be useful for sentiment analysis tools.
Key Features
- High Quality: Filtered to retain only high-karma comments.
- Diverse Topics: Covers technology, science, finance, and more.
- Clean Format: JSONL (one JSON object per line), ready for training.
Statistics
- Total Pairs: 1,250,000
- Size: 4.5 GB (Compressed)
- Format: JSONL
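
Because the data is JSONL, it can be streamed one record at a time instead of being loaded into memory all at once. The following is a minimal sketch, not an official loader. The names iter_pairs and peek are hypothetical helpers, and the file path is whatever the download is called locally. It assumes the file is either plain text or gzip-compressed (detected by a ".gz" suffix). The compression format and the field names inside each record are not documented here, so the sketch makes no assumption about the schema.

```python
import gzip
import json
from itertools import islice
from pathlib import Path
from typing import Iterator


def iter_pairs(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    p = Path(path)
    # Assumption: a ".gz" suffix means gzip; anything else is read as plain text.
    opener = gzip.open if p.suffix == ".gz" else open
    with opener(p, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def peek(path: str, n: int = 3) -> list[dict]:
    """Return the first n records, e.g. to inspect the field names."""
    return list(islice(iter_pairs(path), n))
```

Calling peek on the downloaded file and printing the keys of the first record is a quick way to confirm the field names before writing a training pipeline around them.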