This project aims to provide a **neutral, real-time text feed** of parliamentary discussions and legislative sessions. The foundation is simple: capture **verbatim speech** without commentary, filtering, or external influence.
Project Roadmap
Step 1: Establish a Live Text Stream ✅ (Completed)
- Set up `/help/gov` with core project information.
- Integrate `_nav-pills.php` for structured external sources.
- Ensure accessible links to official government debate records.
Step 2: Collect & Store Text Data
- Identify reliable sources (Hansard, GovTrack, European Parliament, etc.).
- Ensure legal and ethical data retrieval.
- Implement **simple file-based logging** (`.txt` files per session).
- Use **modification timestamps** for tracking updates.
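The file-based logging and timestamp tracking described in this step could look roughly like the sketch below. It is only an illustration: the `sessions` directory, the session ID format, and all function names are assumptions, not part of the project.

```python
from datetime import datetime, timezone
from pathlib import Path

LOG_DIR = Path("sessions")  # hypothetical storage directory


def append_utterance(session_id: str, text: str, log_dir: Path = LOG_DIR) -> Path:
    """Append one verbatim line to the session's .txt file and return its path."""
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{session_id}.txt"
    with path.open("a", encoding="utf-8") as fh:
        # Store exactly one line per utterance, without altering the wording.
        fh.write(text.rstrip("\n") + "\n")
    return path


def last_updated(path: Path) -> datetime:
    """Return the file's modification time as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)


def sessions_changed_since(since: datetime, log_dir: Path = LOG_DIR) -> list[Path]:
    """List session files whose modification time is later than `since`."""
    return sorted(p for p in log_dir.glob("*.txt") if last_updated(p) > since)
```

Relying on the filesystem's modification time keeps the setup free of any database, at the cost of losing that information if files are copied without preserving timestamps.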
Step 3: Structure & Standardize Text Feeds
- Define **format conventions** (timestamps, speaker identifiers, session metadata).
- Build lightweight retrieval methods for easy **searchability and navigation**.
- Index stored sessions with **minimal complexity**.
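As a sketch of what a format convention and a lightweight search might look like, the example below assumes a hypothetical layout: metadata lines of the form `# key: value`, followed by utterance lines of the form `[HH:MM:SS] Speaker: text`. The project has not fixed this layout; it is used here only to make the idea concrete.

```python
import re
from dataclasses import dataclass

# Assumed conventions (hypothetical, not defined by the project):
#   "# key: value"              -> session metadata
#   "[HH:MM:SS] Speaker: text"  -> one utterance
META_RE = re.compile(r"^#\s*([\w-]+):\s*(.*)$")
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+([^:]+):\s*(.*)$")


@dataclass
class Utterance:
    timestamp: str
    speaker: str
    text: str


def parse_session(raw: str) -> tuple[dict[str, str], list[Utterance]]:
    """Split a session file into metadata and a list of utterances."""
    meta: dict[str, str] = {}
    utterances: list[Utterance] = []
    for line in raw.splitlines():
        if m := META_RE.match(line):
            meta[m.group(1)] = m.group(2).strip()
        elif m := LINE_RE.match(line):
            utterances.append(Utterance(m.group(1), m.group(2).strip(), m.group(3)))
        # Lines matching neither pattern are skipped in this sketch.
    return meta, utterances


def search(utterances: list[Utterance], term: str) -> list[Utterance]:
    """Case-insensitive substring search over the spoken text only."""
    needle = term.casefold()
    return [u for u in utterances if needle in u.text.casefold()]
```

A linear scan like this is enough for individual sessions; a dedicated index only becomes worthwhile once the archive grows large.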
Step 4: Introduce Rule-Based Annotation
- Develop structured annotations (e.g., Point of Order, Contradiction).
- Establish a **timestamped reference system**.
- Begin **manual annotation experiments** before automation.
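One possible shape for a manual annotation record with a timestamped reference is sketched below. The label set reuses the two examples named in this step; the field names, the `session@timestamp` reference format, and the JSON Lines export are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass

# Labels taken from the examples in this step; the closed set is an assumption.
ALLOWED_LABELS = {"point_of_order", "contradiction"}


@dataclass(frozen=True)
class Annotation:
    session_id: str   # hypothetical session identifier
    timestamp: str    # timestamp of the annotated utterance, e.g. "14:02:10"
    label: str
    annotator: str    # who made the manual annotation
    note: str = ""

    def __post_init__(self) -> None:
        # Reject free-form labels so annotations stay rule-based.
        if self.label not in ALLOWED_LABELS:
            raise ValueError(f"unknown label: {self.label!r}")

    @property
    def ref(self) -> str:
        """Stable reference pointing back into the raw transcript."""
        return f"{self.session_id}@{self.timestamp}"


def to_jsonl(annotations: list[Annotation]) -> str:
    """Serialize annotations as one JSON object per line."""
    return "\n".join(json.dumps(asdict(a), sort_keys=True) for a in annotations)
```

Keeping annotations in a separate record, rather than editing the transcript file, leaves the verbatim text untouched.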
Step 5: Automate Annotations & Cross-Checks
- Implement **basic AI-assisted tagging** using predefined procedural rules.
- Validate annotation accuracy against **primary sources**.
- Ensure **neutrality and consistency** in procedural detection.
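The sketch below illustrates only the rule-based part of this step, using plain regular expressions instead of any AI model, plus two simple checks: whether a quoted passage appears verbatim in the primary source, and how far automatic tags agree with manual ones. The single rule, the function names, and the agreement measure are all hypothetical.

```python
import re

# One illustrative procedural rule; a real rule set would be defined by the project.
RULES = {
    "point_of_order": re.compile(r"\bpoint of order\b", re.IGNORECASE),
}


def tag_line(text: str) -> list[str]:
    """Return every label whose rule matches the utterance text."""
    return sorted(label for label, pattern in RULES.items() if pattern.search(text))


def _normalize(s: str) -> str:
    """Collapse runs of whitespace so line wrapping does not affect matching."""
    return " ".join(s.split())


def appears_in_primary(text: str, primary_source: str) -> bool:
    """Check that the text occurs verbatim (ignoring whitespace) in the primary source."""
    return _normalize(text) in _normalize(primary_source)


def agreement(auto: set[str], manual: set[str]) -> tuple[float, float]:
    """Return (precision, recall) of automatic references against manual ones."""
    if not auto or not manual:
        return (0.0, 0.0)
    hits = len(auto & manual)
    return (hits / len(auto), hits / len(manual))
```

Comparing automatic tags with the manual annotations from Step 4 gives a concrete way to measure consistency before any tagging is trusted.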
Current Approach
The project starts with raw transcription and prioritizes **verbatim accuracy** before introducing analysis tools. Safeguards against misinformation and procedural bias will be added gradually as the later steps are implemented.