EU Parliament Monitor - API Documentation - v0.8.0

    ๐Ÿ›๏ธ EU Parliament Monitor โ€” Architecture

    C4 Model Architecture for European Parliament Intelligence Platform
    ๐Ÿ“ System Context โ€ข ๐Ÿ“ฆ Container View โ€ข ๐Ÿ”ง Component Design

    Owner Version Effective Date Review Cycle OpenSSF Best Practices

    ๐Ÿ“‹ Document Owner: CEO | ๐Ÿ“„ Version: 1.1 | ๐Ÿ“… Last Updated: 2026-03-19 (UTC)
    ๐Ÿ”„ Review Cycle: Quarterly | โฐ Next Review: 2026-06-19


    This document serves as the primary entry point for the EU Parliament Monitor's architectural documentation. It provides a comprehensive view of the system's design using the C4 model approach, starting from a high-level system context and drilling down to component interactions.

    Document | Focus | Description | Documentation Link
    Architecture | 🏛️ Architecture | C4 model showing current system structure | View Source
    Future Architecture | 🏛️ Architecture | C4 model showing future system structure | View Source
    Mindmaps | 🧠 Concept | Current system component relationships | View Source
    Future Mindmaps | 🧠 Concept | Future capability evolution | View Source
    SWOT Analysis | 💼 Business | Current strategic assessment | View Source
    Future SWOT Analysis | 💼 Business | Future strategic opportunities | View Source
    Data Model | 📊 Data | Current data structures and relationships | View Source
    Future Data Model | 📊 Data | Enhanced European Parliament data architecture | View Source
    Flowcharts | 🔄 Process | Current data processing workflows | View Source
    Future Flowcharts | 🔄 Process | Enhanced AI-driven workflows | View Source
    State Diagrams | 🔄 Behavior | Current system state transitions | View Source
    Future State Diagrams | 🔄 Behavior | Enhanced adaptive state transitions | View Source
    Security Architecture | 🛡️ Security | Current security implementation | View Source
    Future Security Architecture | 🛡️ Security | Security enhancement roadmap | View Source
    Threat Model | 🎯 Security | STRIDE threat analysis | View Source
    Classification | 🏷️ Governance | CIA classification & BCP | View Source
    CRA Assessment | 🛡️ Compliance | Cyber Resilience Act | View Source
    Workflows | ⚙️ DevOps | CI/CD documentation | View Source
    Future Workflows | 🚀 DevOps | Planned CI/CD enhancements | View Source
    Business Continuity Plan | 🔄 Resilience | Recovery planning | View Source
    Financial Security Plan | 💰 Financial | Cost & security analysis | View Source
    End-of-Life Strategy | 📦 Lifecycle | Technology EOL planning | View Source
    Unit Test Plan | 🧪 Testing | Unit testing strategy | View Source
    E2E Test Plan | 🔍 Testing | End-to-end testing | View Source
    Performance Testing | ⚡ Performance | Performance benchmarks | View Source
    Security Policy | 🔒 Security | Vulnerability reporting & security policy | View Source

    EU Parliament Monitor is developed and maintained in accordance with Hack23 AB's Information Security Management System (ISMS), which is aligned with ISO 27001:2022, NIST CSF 2.0, and CIS Controls v8.1.

    Policy | Description | Relevance to EU Parliament Monitor
    Information Security Policy | Establishes organization-wide security governance and risk management framework | Defines overall security posture, risk assessment methodology, and management responsibilities for the project
    Secure Development Policy | Defines secure coding standards, code review requirements, and SDLC security gates | Mandates security-first development practices: input validation, dependency scanning, SAST/DAST integration, secure CI/CD pipelines
    Open Source Policy | Governs use, contribution, and licensing of open source software | Ensures compliance with Apache-2.0 License, dependency license compatibility, and transparent open source contribution practices
    Classification Policy | Defines data classification scheme (Public, Internal, Confidential, Restricted) and handling requirements | All project content classified as PUBLIC; establishes data handling controls for any future sensitive data integration
    AI Policy | Governs responsible AI usage, transparency, and human oversight requirements | Governs LLM usage for content generation: transparency requirements, human review workflows, bias mitigation, prompt injection protection
    Access Control Policy | Defines authentication, authorization, least privilege, and privileged access management | Controls GitHub repository access, branch protection rules, secret management, and deployment permissions
    Cryptography Policy | Establishes cryptographic standards for data protection (algorithms, key management, TLS) | Mandates HTTPS-only content delivery and TLS 1.2+ (TLS 1.3 where supported) for outbound HTTPS API communications; EP MCP integration uses a local stdio JSON-RPC channel (no TLS); ensures secure secret storage for LLM API keys

    ISO 27001:2022 Controls Implemented:

    • A.5.1 - Policies for Information Security (documented and reviewed quarterly)
    • A.8.3 - Secure Coding (ESLint security rules, CodeQL SAST scanning)
    • A.8.23 - Web Filtering (planned CSP headers via CloudFront, XSS prevention)
    • A.8.24 - Cryptography (HTTPS-only, TLS 1.2+ / TLS 1.3 where supported, site delivery via CloudFront)
    • A.8.28 - Secure Coding (input validation, dependency scanning)

    NIST CSF 2.0 Functions Addressed:

    • Identify (ID): Asset inventory, risk assessment, vulnerability management
    • Protect (PR): Access control, data security, secure development
    • Detect (DE): Security monitoring, vulnerability scanning, anomaly detection
    • Respond (RS): Incident response procedures, GitHub Security Advisories
    • Recover (RC): Business Continuity Plan, backup/restore procedures

    CIS Controls v8.1 Implemented:

    • Control 1: Inventory and Control of Enterprise Assets (documented in repo)
    • Control 4: Secure Configuration (branch protection, security policies)
    • Control 6: Access Control Management (GitHub RBAC, least privilege)
    • Control 8: Audit Log Management (GitHub audit logs, workflow logs)
    • Control 10: Malware Defenses (Dependabot, npm audit, CodeQL)
    • Control 16: Application Software Security (SAST, dependency scanning, secure coding)

    Evidence of ISMS compliance is maintained through:

    • Policy Documents: All policies stored in Hack23/ISMS-PUBLIC
    • Security Architecture: SECURITY_ARCHITECTURE.md maps controls to implementations
    • Threat Model: THREAT_MODEL.md documents STRIDE analysis and mitigations
    • Classification: CLASSIFICATION.md defines data classification and handling
    • Audit Trail: GitHub audit logs, workflow execution logs, dependency scan reports
    • Security Scanning: CodeQL results, Dependabot alerts, npm audit reports

    EU Parliament Monitor is a static site generator that creates multi-language news articles about European Parliament activities, leveraging the European Parliament MCP Server for data access and LLM-based content generation.

    Enable democratic transparency by providing automated, multilingual coverage of European Parliament activities through a secure, maintainable static site architecture.

    • Minimal Runtime Dependencies: Pure static HTML/CSS output with no server-side execution; single production dependency (european-parliament-mcp-server) used at build time
    • TypeScript Source: All source code in src/ written in TypeScript (strict mode), compiled via tsc to scripts/ (ES2024 target)
    • Multi-Language Support: Generates content in 14 languages
    • MCP Integration: Uses European Parliament MCP Server for data access
    • Security by Design: Minimal attack surface through static architecture
    • AWS Hosted: Leverages AWS S3 + CloudFront for zero-infrastructure static hosting

    👤 User Focus: Shows how different user types interact with the EU Parliament Monitor system and what external systems it depends on.

    🌐 Integration Focus: Illustrates the relationships with GitHub infrastructure, European Parliament APIs, and LLM services.

    C4Context
    title EU Parliament Monitor - System Context Diagram

    Person(citizen, "European Citizen", "Reads news about European Parliament activities in their native language")
    Person(journalist, "Journalist", "Uses site as research source for European political coverage")
    Person(researcher, "Political Researcher", "Analyzes EP activities and trends")
    Person(contributor, "Developer/Contributor", "Maintains and improves the news generation system")

    System(epmonitor, "EU Parliament Monitor", "Static site with multilingual news about European Parliament activities")

    System_Ext(github, "GitHub", "Hosts repository, runs CI/CD (GitHub Actions)")
    System_Ext(aws, "AWS (S3 + CloudFront)", "Serves static site globally via CDN")
    System_Ext(ep_mcp, "European Parliament MCP Server", "Provides structured access to EP data")
    System_Ext(ep_api, "European Parliament APIs", "Official EP data sources (plenary, committees, documents)")
    System_Ext(llm, "LLM Service", "Generates article content from structured EP data")

    Rel(citizen, epmonitor, "Reads news", "HTTPS")
    Rel(journalist, epmonitor, "Researches stories", "HTTPS")
    Rel(researcher, epmonitor, "Analyzes data", "HTTPS")
    Rel(contributor, github, "Contributes code", "Git/HTTPS")

    Rel(epmonitor, github, "Built and deployed via", "GitHub Actions")
    Rel(epmonitor, aws, "Hosted on", "S3 + CloudFront")
    Rel(github, epmonitor, "Generates site via", "GitHub Actions")
    Rel(epmonitor, ep_mcp, "Fetches EP data via", "MCP Protocol")
    Rel(ep_mcp, ep_api, "Queries EP data", "HTTPS/JSON")
    Rel(epmonitor, llm, "Generates content via", "API/SDK")

    UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")

    Element | Type | Description | Technology
    European Citizen | User | Primary audience seeking EP news in native language | Web Browser
    Journalist | User | Professional using site for research and story development | Web Browser
    Political Researcher | User | Academic or analyst studying EP activities | Web Browser
    Developer/Contributor | User | Maintainer improving system | Git, Node.js, VS Code
    EU Parliament Monitor | System | Core static site generator | Node.js, TypeScript
    GitHub | External System | Source control, CI/CD | GitHub Actions
    EP MCP Server | External System | Structured EP data access | MCP Protocol, TypeScript
    EP APIs | External System | Official data sources | REST APIs, JSON
    LLM Service | External System | Content generation | API (OpenAI/Anthropic/etc.)

    graph TB
    subgraph "Public Internet - Untrusted Zone"
    Users[Web Users<br/>Citizens, Journalists, Researchers]
    end

    subgraph "GitHub Infrastructure - Trusted Zone"
    subgraph "Build Environment"
    Actions[GitHub Actions Runner<br/>GitHub-hosted Ubuntu runner<br/>ubuntu-latest + Node.js 25]
    EPServer["European Parliament<br/>MCP Server<br/>(local process, stdio JSON-RPC)"]
    end

    subgraph "Source Control"
    Repo[Git Repository<br/>Version Control]
    end
    end

    subgraph "AWS Hosting - Cloud Infrastructure Zone"
    Pages[AWS S3 + CloudFront CDN<br/>HTTPS via ACM]
    end

    subgraph "External Services - Partially Trusted Zone"
    EPAPI[European Parliament<br/>Official APIs]
    LLM[LLM Service<br/>OpenAI/Anthropic]
    end

    Users -->|HTTPS GET<br/>Read-Only| Pages
    Actions -->|"Spawns locally<br/>(stdio JSON-RPC)"| EPServer
    EPServer -->|HTTPS/JSON<br/>Data Queries| EPAPI
    Actions -->|API Calls<br/>Content Gen| LLM
    Actions -->|Git Push<br/>Authenticated| Repo
    Actions -->|"S3 Sync + CF Invalidation<br/>Authenticated (OIDC)"| Pages

    style Users fill:#f9f,stroke:#333,stroke-width:2px
    style Pages fill:#9f9,stroke:#333,stroke-width:2px
    style Actions fill:#99f,stroke:#333,stroke-width:2px
    style EPServer fill:#ff9,stroke:#333,stroke-width:2px
    style EPAPI fill:#ff9,stroke:#333,stroke-width:2px
    style LLM fill:#ff9,stroke:#333,stroke-width:2px

    Trust Boundary Analysis:

    Zone | Trust Level | Security Controls | Threat Model
    Public Internet | Untrusted | HTTPS-only, planned CSP headers, static content only | DDoS, XSS attempts (mitigated by static architecture)
    GitHub Infrastructure | Trusted | GitHub authentication, branch protection, optional signed commits, secret scanning | Supply chain attacks (mitigated by Dependabot, CodeQL)
    AWS Hosting | Trusted | ACM certificate, HTTPS redirect, DDoS protection via CloudFront | Hosting infrastructure compromise (mitigated by AWS security controls, OIDC deploy auth)
    External Services | Partially Trusted | API authentication, basic input parsing/shape validation; planned systematic sanitization/escaping and rate limiting | Data poisoning, API compromise (mitigated by validation, monitoring, planned hardening)

    Key Security Boundaries:

    1. User → CloudFront: Read-only HTTPS access, no authentication required (public content)
    2. GitHub Actions → External APIs: Authenticated API calls, input validation, error handling
    3. GitHub Actions → AWS S3: Authenticated S3 sync + CloudFront invalidation, only static files deployed
    4. External Services → System: Data parsed and basic shape-validated before use; comprehensive sanitization/escaping and rate limiting are planned controls
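
The fourth boundary, parsing external data and checking its basic shape before use, could look roughly like the sketch below. The `PlenarySession` record and its fields (`id`, `date`, `title`) are hypothetical stand-ins chosen for illustration; the project's actual types and payloads may differ.

```typescript
// Hypothetical record shape; real EP MCP Server payloads may differ.
interface PlenarySession {
  id: string;
  date: string; // ISO 8601 date, e.g. "2026-03-19"
  title: string;
}

// Type guard: true only when every expected field is present and well-typed.
function isPlenarySession(value: unknown): value is PlenarySession {
  if (typeof value !== "object" || value === null) return false;
  const { id, date, title } = value as Record<string, unknown>;
  return (
    typeof id === "string" &&
    typeof title === "string" &&
    typeof date === "string" &&
    !Number.isNaN(Date.parse(date))
  );
}

// Parse untrusted JSON text and keep only the records that pass the shape check.
function parseSessions(raw: string): PlenarySession[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return []; // malformed JSON: nothing trustworthy to return
  }
  return Array.isArray(parsed) ? parsed.filter(isPlenarySession) : [];
}
```

Dropping records that fail the check, rather than throwing, is one possible design choice; a stricter build might instead fail the run so that malformed upstream data is noticed early.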

    📦 Container Focus: Shows the major containers (applications, data stores, microservices) that make up the system.

    🔄 Data Flow Focus: Illustrates how data flows between containers during news generation.

    C4Container
    title EU Parliament Monitor - Container Diagram

    Person(user, "User", "Reads multilingual EP news")
    Person(contributor, "Contributor", "Maintains system")

    Container_Boundary(epmonitor, "EU Parliament Monitor") {
    Container(news_generator, "News Generation Scripts", "Node.js/TypeScript", "Generates multilingual news articles from EP data")
    Container(index_generator, "Index Page Generator", "Node.js/TypeScript", "Creates language-specific index pages")
    Container(sitemap_generator, "Sitemap Generator", "Node.js/TypeScript", "Generates sitemap.xml for SEO")
    Container(mcp_client, "MCP Client", "TypeScript", "Communicates with EP MCP Server for data access")
    Container(template_engine, "Article Template Engine", "TypeScript", "Generates HTML from article data")
    ContainerDb(static_files, "Static Files", "HTML/CSS", "Generated news articles and indexes")
    }

    Container_Boundary(github_infra, "GitHub Infrastructure") {
    Container(actions, "GitHub Actions", "CI/CD", "Automated news generation workflow")
    ContainerDb(repo, "Git Repository", "Version Control", "Source code and generated content")
    }

    Container_Boundary(aws_infra, "AWS Infrastructure") {
    Container(pages, "Amazon CloudFront + S3", "CDN / Object Storage", "Serves static site globally via HTTPS")
    }

    System_Ext(ep_mcp, "European Parliament MCP Server", "Structured EP data access")
    System_Ext(llm, "LLM Service", "Article content generation")

    Rel(user, pages, "Reads news", "HTTPS")
    Rel(contributor, repo, "Commits code", "Git/HTTPS")
    Rel(actions, news_generator, "Triggers daily", "Node.js exec")
    Rel(news_generator, mcp_client, "Requests EP data", "MCP calls")
    Rel(news_generator, llm, "Generates content", "API calls")
    Rel(news_generator, template_engine, "Creates HTML", "Function calls")
    Rel(template_engine, static_files, "Writes files", "fs.writeFileSync")
    Rel(index_generator, static_files, "Generates indexes", "fs.writeFileSync")
    Rel(sitemap_generator, static_files, "Creates sitemap", "fs.writeFileSync")
    Rel(mcp_client, ep_mcp, "Queries data", "MCP Protocol")
    Rel(static_files, repo, "Committed by Actions", "Git commit/push")
    Rel(actions, pages, "Deploys via", "S3 sync + CloudFront invalidation")
    Rel(pages, static_files, "Serves from S3", "CloudFront edge")

    UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")

    Container | Technology | Purpose | Data Flow
    News Generation Scripts | Node.js/TypeScript | Core article generation logic | Orchestrates MCP data fetch and LLM generation
    Index Page Generator | Node.js/TypeScript | Creates language-specific index pages | Aggregates article metadata into navigation
    Sitemap Generator | Node.js/TypeScript | SEO sitemap creation | Lists all pages for search engine crawling
    MCP Client | TypeScript | EP data access | Communicates with MCP Server for structured data
    Article Template Engine | TypeScript | HTML generation | Converts article data to semantic HTML5
    Static Files | HTML/CSS | Generated output | Committed to repository, deployed to AWS S3 and served via CloudFront
    GitHub Actions | CI/CD | Automation | Daily workflow execution, build, deploy to S3/CloudFront
    Amazon CloudFront + S3 | CDN/Object Storage | Hosting | HTTPS delivery of static content globally
    Git Repository | Version Control | Source & Content | Stores code, generated articles, configuration

    Container | Security Responsibility | Implementation | Controls
    News Generation Scripts | Basic input validation, data handling, error handling (schema validation and sanitization planned) | Parses EP data with JSON.parse and basic shape checks, performs minimal input validation, handles API errors gracefully; comprehensive schema validation and HTML sanitization planned | A.8.3, A.8.28 (ISO 27001)
    Index Page Generator | XSS risk awareness, metadata validation (systematic escaping planned) | Performs basic article metadata structure checks and relies on static content; comprehensive XSS hardening and systematic escaping of user-facing strings planned | A.8.23 (ISO 27001)
    Sitemap Generator | URL validation, XML escaping | Validates all URLs before inclusion, escapes XML special characters | A.8.3 (ISO 27001)
    MCP Client | Local MCP process communication, timeout handling, basic validation | Spawns a local MCP server process over stdio JSON-RPC, applies connection retry backoff, per-request timeouts, and basic JSON parsing/validation (no TLS or API authentication at this local layer) | A.8.24 (ISO 27001), CIS Control 16
    Article Template Engine | HTML output generation, CSP-ready markup (systematic sanitization planned) | Generates semantic HTML5 and interpolates dynamic content; systematic HTML escaping/sanitization and CSP hardening planned for all dynamic content | A.8.23 (ISO 27001)
    Static Files | Integrity verification, no sensitive data | All files public, no secrets or PII, content integrity via Git | A.5.10 (ISO 27001)
    GitHub Actions | Secret management, least privilege, audit logging | GitHub Secrets for API keys, OIDC authentication, workflow audit logs | A.8.3, CIS Control 6
    Amazon CloudFront + S3 | HTTPS-only, CDN security, DDoS protection | Forces HTTPS redirect via ACM certificate, CloudFront with DDoS mitigation, HSTS headers (configured externally in CloudFront distribution) | A.8.24 (ISO 27001)
    Git Repository | Access control, branch protection, signed commits | RBAC with least privilege, protected main branch, optional signed commits | CIS Control 6, A.8.3
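
The table above lists XML escaping and URL validation for the Sitemap Generator, and systematic HTML escaping as a planned control for the template engine. A minimal sketch of what such helpers might look like follows. All function names here (`escapeMarkup`, `isHttpsUrl`, `sitemapEntry`, `renderTitle`) are hypothetical and not taken from the project source.

```typescript
// Characters that must be escaped in HTML text/attribute content and in XML.
const ENTITY_MAP: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;", // numeric form is valid in both HTML and XML
};

// Replace each special character with its entity; other characters pass through.
function escapeMarkup(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => ENTITY_MAP[ch] ?? ch);
}

// Accept only absolute https: URLs; anything unparseable is rejected.
function isHttpsUrl(candidate: string): boolean {
  try {
    return new URL(candidate).protocol === "https:";
  } catch {
    return false;
  }
}

// Build one sitemap <url> element, or null when the URL fails validation.
function sitemapEntry(loc: string): string | null {
  return isHttpsUrl(loc) ? `<url><loc>${escapeMarkup(loc)}</loc></url>` : null;
}

// Interpolate dynamic text into HTML only after escaping it.
function renderTitle(title: string): string {
  return `<h1>${escapeMarkup(title)}</h1>`;
}
```

Escaping at the point of interpolation, as in `renderTitle`, keeps the rule local to the template: any string that reaches the markup has passed through `escapeMarkup`, whatever its origin (EP data or LLM output).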
    graph TB
    subgraph "Generation Layer - Build Time Security"
    NewsGen["News Generator<br/>🛡️ Input Validation<br/>🛡️ Data Sanitization (planned)"]
    MCPClient[MCP Client<br/>🛡️ Local stdio JSON-RPC<br/>🛡️ Connection Retry<br/>🛡️ Request Timeout]
    Template["Template Engine<br/>🛡️ XSS Prevention<br/>🛡️ CSP Generation<br/>🛡️ HTML Sanitization (planned)"]
    end

    subgraph "Storage Layer - Version Control Security"
    GitRepo[Git Repository<br/>🛡️ Branch Protection<br/>🛡️ Code Review<br/>🛡️ Audit Logs]
    Secrets[GitHub Secrets<br/>🛡️ Encrypted Storage<br/>🛡️ Least Privilege]
    end

    subgraph "Delivery Layer - Runtime Security"
    Pages[Amazon CloudFront + S3<br/>🛡️ HTTPS-Only<br/>🛡️ HSTS Headers<br/>🛡️ DDoS Protection]
    CDN[CloudFront Edge<br/>🛡️ TLS Termination<br/>🛡️ Edge Caching<br/>🛡️ Geographic Distribution]
    end

    subgraph "External Layer - Third-Party Security"
    EPMCP[EP MCP Server<br/>🛡️ MCP Protocol<br/>🛡️ Data Validation]
    LLM[LLM Service<br/>🛡️ API Key Auth<br/>🛡️ Prompt Injection Prevention]
    end

    NewsGen -->|Validated Data| Template
    NewsGen -->|Requests EP data| MCPClient
    MCPClient -->|"Spawns locally (stdio JSON-RPC)"| EPMCP
    NewsGen -->|Secured API Calls| LLM
    Template -->|Safe HTML| GitRepo
    Secrets -->|Inject at Runtime| NewsGen
    GitRepo -->|Deploy to S3| Pages
    Pages -->|Cached Content| CDN

    style NewsGen fill:#9cf,stroke:#333,stroke-width:2px
    style Template fill:#9cf,stroke:#333,stroke-width:2px
    style MCPClient fill:#9cf,stroke:#333,stroke-width:2px
    style GitRepo fill:#cf9,stroke:#333,stroke-width:2px
    style Pages fill:#fc9,stroke:#333,stroke-width:2px
    style CDN fill:#fc9,stroke:#333,stroke-width:2px

    🔧 Component Focus: Detailed view of the news generation container's internal components.

    🎯 Responsibility Focus: Shows how different components collaborate to generate multilingual news articles.

    C4Component
    title EU Parliament Monitor - News Generation Components

    Container_Boundary(news_gen, "News Generation Container") {
    Component(cli, "CLI Interface", "TypeScript", "Parses command-line arguments for generation parameters")
    Component(article_gen, "Article Generator", "TypeScript", "Coordinates article creation for specified types/languages")
    Component(mcp_client, "MCP Client", "TypeScript", "Fetches EP data via MCP protocol")
    Component(llm_client, "LLM Client", "TypeScript", "Generates article content from EP data")
    Component(translator, "Translation Handler", "TypeScript", "Manages multi-language content generation")
    Component(template, "HTML Template Engine", "TypeScript", "Renders articles as semantic HTML5")
    Component(file_writer, "File System Writer", "TypeScript", "Writes generated articles to disk")
    Component(metadata, "Metadata Manager", "TypeScript", "Tracks generation metadata and timestamps")
    Component(validator, "Content Validator", "TypeScript", "Validates generated HTML and content")
    }

    System_Ext(ep_mcp, "EP MCP Server", "EP data access")
    System_Ext(llm, "LLM Service", "Content generation")
    ContainerDb(output, "Static Files", "HTML files")

    Rel(cli, article_gen, "Invokes with params", "Function call")
    Rel(article_gen, mcp_client, "Requests EP data", "Async calls")
    Rel(article_gen, llm_client, "Generates content", "Async calls")
    Rel(article_gen, translator, "Processes languages", "Function call")
    Rel(translator, llm_client, "Language-specific generation", "Async calls")
    Rel(article_gen, template, "Renders HTML", "Function call")
    Rel(template, validator, "Validates output", "Function call")
    Rel(article_gen, file_writer, "Writes articles", "Function call")
    Rel(article_gen, metadata, "Tracks generation", "Function call")
    Rel(mcp_client, ep_mcp, "Queries data", "MCP Protocol")
    Rel(llm_client, llm, "API calls", "HTTPS/JSON")
    Rel(file_writer, output, "Writes files", "fs.writeFileSync")
    Rel(metadata, output, "Writes JSON", "fs.writeFileSync")

    UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="1")

    Component | Responsibility | Dependencies | File Location
    CLI Interface | Parse command-line arguments | Node.js process.argv | src/generators/news-enhanced.ts
    Article Generator | Orchestrate article creation | MCP Client, LLM Client, Template | src/generators/news-enhanced.ts
    MCP Client | Fetch EP data via MCP | EP MCP Server | src/mcp/ep-mcp-client.ts
    LLM Client | Generate article text | LLM Service API | Integrated in article generator
    Translation Handler | Manage multi-language generation | LLM Client | src/generators/news-enhanced.ts
    HTML Template Engine | Render semantic HTML5 | Article data | src/templates/article-template.ts
    File System Writer | Write files to disk | Node.js fs module | src/generators/news-enhanced.ts
    Metadata Manager | Track generation metadata | Article data | src/generators/news-enhanced.ts
    Content Validator | Validate HTML output | HTML validator | Integrated in template
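
As an illustration of the CLI Interface row, the sketch below parses generation parameters with Node's built-in `parseArgs` from `node:util`. The option names (`--types`, `--languages`), the default values, and the `week-ahead` article type are assumptions made for this example; the actual flags accepted by `news-enhanced.ts` are not documented here.

```typescript
import { parseArgs } from "node:util";

// Hypothetical parameter shape for one generation run.
interface GenerationParams {
  types: string[];
  languages: string[];
}

// Split a comma-separated option value, falling back when the flag is absent.
const splitList = (value: string | undefined, fallback: string[]): string[] =>
  value === undefined
    ? fallback
    : value.split(",").map((item) => item.trim()).filter((item) => item.length > 0);

// Parse argv (without the node and script entries) into generation parameters.
function parseCliArgs(argv: string[]): GenerationParams {
  const { values } = parseArgs({
    args: argv,
    options: {
      types: { type: "string" },
      languages: { type: "string" },
    },
  });
  return {
    types: splitList(values.types, ["week-ahead"]), // assumed default
    languages: splitList(values.languages, ["en"]), // assumed default
  };
}
```

A script would typically call `parseCliArgs(process.argv.slice(2))`. Because `parseArgs` is strict by default, an unknown flag throws instead of being silently ignored.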
    sequenceDiagram
    autonumber
    participant CLI as CLI Interface
    participant Gen as Article Generator
    participant MCP as MCP Client
    participant EPMCP as EP MCP Server
    participant Tmpl as HTML Template
    participant Meta as Metadata Manager
    participant FS as File System Writer

    CLI->>Gen: generate(type, languages)
    Gen->>MCP: fetchEPData(type)
    MCP->>EPMCP: query(endpoint, params)
    EPMCP-->>MCP: return EP data
    MCP-->>Gen: return parsed EP data

    loop For each language (sequential)
    Gen->>Tmpl: renderHTML(epData, lang)
    Note over Gen,Tmpl: Current: placeholder English content<br/>Future (ADR-004): native LLM generation per language
    Tmpl-->>Gen: return HTML
    Gen->>FS: writeFile(path, html)
    Gen->>Meta: recordGeneration(article, lang)
    end

    Meta->>FS: writeMetadata(json)
    Gen-->>CLI: generation complete
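
The per-language loop in the sequence above can be sketched as follows. This is an illustrative outline, not the project's code: `generateForLanguages`, its callback parameters, and the `isolateFailures` switch are hypothetical names. With the switch off, the sketch mirrors the documented current behaviour (a failure in one language aborts the remaining languages); with it on, it mirrors the planned behaviour (failures are recorded and the other languages still generate).

```typescript
type LanguageCode = string; // simplified stand-in for the project's LanguageCode type

interface LanguageResult {
  lang: LanguageCode;
  ok: boolean;
  error?: string;
}

// Render and write one article per language, strictly in order.
function generateForLanguages(
  languages: readonly LanguageCode[],
  renderHtml: (lang: LanguageCode) => string,
  writeFile: (lang: LanguageCode, html: string) => void,
  isolateFailures = false,
): LanguageResult[] {
  const results: LanguageResult[] = [];
  for (const lang of languages) {
    try {
      writeFile(lang, renderHtml(lang));
      results.push({ lang, ok: true });
    } catch (error) {
      // Current behaviour: rethrow, which skips all remaining languages.
      if (!isolateFailures) throw error;
      // Planned behaviour: record the failure and continue with the next language.
      const message = error instanceof Error ? error.message : String(error);
      results.push({ lang, ok: false, error: message });
    }
  }
  return results;
}
```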

    Pattern | Components Involved | Purpose | Error Handling
    Cache-Aside (Planned) | MCP Client → LRU Cache → EP MCP Server | Reduce API calls, improve performance | Planned: cache miss triggers fresh fetch; current: direct calls to EP MCP Server
    MCP Connection Retry with Backoff (Current) | MCP Client → EP MCP Server | Handle transient MCP connection failures | Connection attempts retried with backoff; individual MCP requests use a fixed timeout and are not retried
    Validation Pipeline (Planned) | Content Validator → Article Generator | Ensure content quality | Planned: failed validation triggers regeneration (max 2 attempts); current: single-pass generation without regeneration loop
    Sequential Multi-Language | Article Generator → HTML Template (per language) | Content generation per language | Current: failure in one language aborts remaining languages; Planned: per-language failures logged while other languages still generate; parallel generation planned (ADR-004)
    Template Method | Article Generator → HTML Template → File System Writer | Consistent HTML generation | Template errors logged and propagated to prevent partial writes
    Metadata Aggregation | Metadata Manager → File System Writer | Track generation history | Current: metadata written synchronously via writeFileSync; failures throw and fail the run. Planned: non-blocking, best-effort writes
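
The connection-retry row can be illustrated with a small sketch: connection attempts are retried with backoff, while an individual request gets a fixed timeout and no retry. The helper names (`backoffDelayMs`, `connectWithBackoff`, `withTimeout`), the exponential schedule, and the default attempt count and delays are assumptions for this example; the source does not state the real values.

```typescript
// Delay before the next attempt: base, 2x base, 4x base, ... (assumed schedule).
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** attempt;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retry `connect` up to `maxAttempts` times; rethrow the last error if all fail.
async function connectWithBackoff<T>(
  connect: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (error) {
      lastError = error;
      // No point sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, baseDelayMs));
      }
    }
  }
  throw lastError;
}

// Reject if `request` has not settled within `timeoutMs`; the request is not retried.
function withTimeout<T>(request: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Request timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    request.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

Note that `withTimeout` only stops waiting; it does not cancel the underlying work. A real client would also need to discard or ignore a late response.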

    โ˜๏ธ Infrastructure Focus: Shows how the system is deployed on GitHub infrastructure.

    ๐Ÿš€ CI/CD Focus: Illustrates the automated deployment pipeline.

    C4Deployment
    title EU Parliament Monitor - Deployment Diagram

    Deployment_Node(github_cloud, "GitHub Cloud", "GitHub Infrastructure") {
    Deployment_Node(actions_runner, "GitHub Actions Runner", "Ubuntu 24.04") {
    Container(workflow, "News Generation Workflow", "GitHub Actions YAML", "Daily scheduled workflow")
    Container(node_runtime, "Node.js Runtime", "Node.js 25", "Executes generation scripts")
    }

    Deployment_Node(repo_storage, "GitHub Repository", "Git Storage") {
    ContainerDb(git_repo, "Git Repository", "Version Control", "Source code and generated content")
    }
    }

    Deployment_Node(pages_cdn, "AWS Infrastructure", "S3 + CloudFront") {
    Container(web_server, "Amazon CloudFront", "CDN / HTTPS", "Serves HTTPS content globally")
    ContainerDb(static_content, "Amazon S3 Bucket", "Object Storage", "Generated articles and pages")
    }

    Deployment_Node(user_device, "User Device", "Desktop/Mobile") {
    Container(browser, "Web Browser", "Chrome/Firefox/Safari", "Renders news articles")
    }

    Deployment_Node(external_services, "External Services", "Cloud") {
    System_Ext(ep_mcp, "EP MCP Server", "EP data access")
    System_Ext(llm, "LLM Service", "Content generation")
    }

    Rel(workflow, node_runtime, "Executes", "Process")
    Rel(node_runtime, ep_mcp, "Fetches data", "stdio/JSON-RPC")
    Rel(node_runtime, llm, "Generates content", "HTTPS/API")
    Rel(node_runtime, git_repo, "Commits files", "Git")
    Rel(git_repo, static_content, "Deploys via", "S3 sync + CloudFront invalidation")
    Rel(browser, web_server, "Requests pages", "HTTPS")
    Rel(web_server, static_content, "Serves", "HTTP/2")

    UpdateLayoutConfig($c4ShapeInRow="2", $c4BoundaryInRow="1")

    Infrastructure Component | Technology | Purpose | Configuration
    GitHub Actions Runner | ubuntu-latest, Node.js 25 | Execute generation workflow | .github/workflows/news-*.lock.yml
    Amazon CloudFront | AWS CDN | Serve static content globally | CloudFront distribution (deploy-s3.yml)
    Amazon S3 | AWS Object Storage | Host static site files | S3 bucket (deploy-s3.yml)
    Git Repository | GitHub Storage | Version control + content storage | public repository
    Web Browser | Modern browsers | Render news articles | HTML5, CSS3, ES6+
    EP MCP Server | Local Node process | EP data access | Spawned locally via stdio JSON-RPC
    LLM Service | External API | Content generation | API key authentication

    Layer | Technology | Version | Purpose | Rationale
    Runtime | Node.js | 25.x Current | JavaScript execution environment | Current release for latest features, performance improvements; upgrade to Node.js 26 LTS planned April 2026
    Language | TypeScript | 5.x | Primary development language | Strict type safety, compile-time error detection; compiles from src/ to scripts/ targeting ES2024
    Package Manager | npm | 10.x | Dependency management | Native Node.js package manager, security audit integration
    Testing | Vitest | 4.x | Unit and integration testing | Fast, modern, ESM-native test runner with great DX
    E2E Testing | Playwright | 1.58.x | End-to-end browser testing | Cross-browser support, reliable selectors, parallel execution
    Linting | ESLint | 9.x | Code quality and security | Flat config support, security plugins, TypeScript rules via @typescript-eslint
    Formatting | Prettier | 3.x | Code formatting | Zero-config, opinionated formatter, consistent code style

    Technology | Current Version | Minimum Version | End-of-Life | Update Policy
    Node.js | 25.x (current) | 25.0.0 | ~Apr 2026 (Current EOL; upgrading to Node.js 26 LTS) | Update to Node.js 26 LTS within days of release (~Apr 2026)
    npm | 10.x (latest) | 10.0.0 | Follows Node.js lifecycle | Auto-updated with Node.js
    TypeScript | 5.9.x | 5.0.0 | N/A | Update to latest minor within 14 days, major within 90 days
    Vitest | 4.0.18 | 4.0.0 | N/A | Update to latest minor within 14 days, major within 60 days
    Playwright | 1.58.2 | 1.50.0 | N/A | Update to latest minor within 14 days, major within 60 days
    ESLint | 9.39.2 | 9.0.0 | N/A | Update to latest minor within 14 days, major within 90 days
    Prettier | 3.8.1 | 3.0.0 | N/A | Update to latest minor within 14 days, major within 90 days

    Production Dependencies (1):

    • european-parliament-mcp-server (@latest) - Provides European Parliament data access at build time via MCP protocol (stdio JSON-RPC)
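
To make "stdio JSON-RPC" concrete: the MCP stdio transport exchanges JSON-RPC 2.0 messages as newline-delimited JSON on the child process's stdin and stdout. The sketch below shows only the framing (encoding one request, and splitting a buffered stdout chunk into complete messages). The helper names and the tool name `get_plenary_sessions` are hypothetical; spawning the server and the MCP initialization handshake are left out.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// Serialize one request as a single line; the newline terminates the message.
function encodeRequest(id: number, method: string, params?: unknown): string {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id, method, params };
  return JSON.stringify(request) + "\n";
}

// Split buffered stdout text into complete messages plus any trailing partial line.
function decodeChunk(buffer: string): { messages: JsonRpcResponse[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // incomplete line, kept for the next chunk
  const messages = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as JsonRpcResponse);
  return { messages, rest };
}

// Example: a tool invocation request (the tool name is an assumption).
const wireMessage = encodeRequest(1, "tools/call", {
  name: "get_plenary_sessions",
  arguments: {},
});
```

Keeping the undecoded `rest` between reads matters because a pipe may deliver a response split across several chunks; parsing only complete lines avoids `JSON.parse` errors on partial data.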

    Development Dependencies (28 total):

    Category | Dependencies | Purpose
    Testing | vitest, @vitest/ui, @vitest/coverage-v8, @playwright/test, @axe-core/playwright, happy-dom | Unit, integration, E2E testing, accessibility testing, coverage reporting
    TypeScript | typescript, @typescript-eslint/eslint-plugin, @typescript-eslint/parser, @types/node, ts-api-utils, tsx | TypeScript compiler, ESLint TypeScript rules, type definitions, dev runner
    Code Quality | eslint, @eslint/js, prettier, eslint-config-prettier, eslint-plugin-security, eslint-plugin-import, eslint-plugin-jsdoc, eslint-plugin-sonarjs, jscpd | Linting, formatting, security checks, duplicate detection
    Git Hooks | husky, lint-staged | Pre-commit hooks for quality gates
    HTML Validation | htmlhint | HTML5 validation
    Documentation | typedoc, jsdoc, docdash, jsdoc-to-markdown | API documentation generation

    Tool | Purpose | Integration | Configuration
    CodeQL | SAST scanning | GitHub Actions (weekly + PR) | .github/workflows/codeql.yml
    Dependabot | Dependency vulnerability scanning | GitHub native (daily) | .github/dependabot.yml
    npm audit | Dependency security check | Pre-commit + CI | package.json scripts
    ESLint Security | Security-focused linting | Pre-commit + CI | eslint.config.js (security plugin)
    HTMLHint | HTML validation | CI pipeline | .htmlhintrc
    Husky | Git hooks | Pre-commit, pre-push | .husky/ directory
    Playwright | Accessibility testing | E2E test suite | playwright.config.js (axe integration)
    Service Purpose Configuration Cost
    GitHub Actions CI/CD automation .github/workflows/ Free (public repo)
    AWS S3 Static site hosting S3 bucket + static website Pay-per-use (storage, requests)
    Amazon CloudFront Content delivery CloudFront distribution (S3) Pay-per-use (data transfer, requests)
    Git Version control Repository Free (public repo)
    Service Purpose Protocol Authentication Rate Limits Cost Model
    European Parliament MCP Server EP data access Local process (stdio JSON-RPC) None (local process) N/A (handled by MCP server / EP APIs) Free (EP open data via MCP server)
    LLM Service (OpenAI/Anthropic) Content generation HTTPS/JSON API key (required) Varies by provider Pay-per-token
    GitHub API Repository operations REST/GraphQL GitHub token 5000 req/hr Free (authenticated)
    Browser Minimum Version Features Required Testing Coverage
    Chrome/Edge 90+ ES2020, CSS Grid, Flexbox ✅ Playwright E2E (Chromium in CI)
    Firefox 88+ ES2020, CSS Grid, Flexbox 🧪 Manual regression (no Playwright CI)
    Safari 14+ ES2020, CSS Grid, Flexbox 🧪 Manual regression (no Playwright CI)
    Mobile Chrome 90+ ES2020, Responsive Design 🧪 Manual responsive testing
    Mobile Safari 14+ ES2020, Responsive Design 🧪 Manual responsive testing

    No support for:

    • Internet Explorer (EOL June 2022)
    • Legacy Edge (EdgeHTML); only Chromium-based Edge is supported

    TypeScript source in src/ is compiled to JavaScript in scripts/ via tsc. The generated JavaScript files are executed by Node.js during news generation.

    src/                          → scripts/             (tsc compilation)
    ├── types/index.ts → types/index.js Type definitions (ArticleCategory, LanguageCode, interfaces)
    ├── constants/config.ts → constants/config.js Project paths, BASE_URL, article filename pattern
    ├── constants/languages.ts → constants/languages.js 14-language translations, presets, flags
    ├── mcp/ep-mcp-client.ts → mcp/ep-mcp-client.js MCP client (JSON-RPC 2.0, retry, singleton)
    ├── templates/article-template.ts → templates/article-template.js HTML5 article generation (SEO, JSON-LD, Open Graph)
    ├── generators/news-enhanced.ts → generators/news-enhanced.js Core article generation engine
    ├── generators/news-indexes.ts → generators/news-indexes.js Multi-language index generator
    ├── generators/sitemap.ts → generators/sitemap.js XML sitemap generator
    ├── utils/file-utils.ts → utils/file-utils.js File operations, escapeHTML, isSafeURL
    ├── utils/news-metadata.ts → utils/news-metadata.js articles-metadata.json management
    ├── utils/copy-test-reports.ts → utils/copy-test-reports.js Test report archiver
    └── utils/generate-docs-index.ts → utils/generate-docs-index.js Docs hub generator
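    The tree lists escapeHTML and isSafeURL among the helpers in utils/file-utils.ts without showing them. The following is a minimal sketch of what helpers with those names might do; the function bodies are assumptions for illustration, not the project's actual implementation.

```typescript
// Hypothetical sketch: the real utils/file-utils.ts may differ.

/** Escape the five HTML-significant characters so text is safe in element and attribute context. */
function escapeHTML(input: string): string {
  return input
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

/** Accept only absolute http(s) URLs; rejects javascript:, data:, and malformed input. */
function isSafeURL(candidate: string): boolean {
  try {
    const url = new URL(candidate);
    return url.protocol === "https:" || url.protocol === "http:";
  } catch {
    return false; // new URL() throws on strings that are not absolute URLs
  }
}
```

    Escaping at the template boundary and allow-listing URL schemes keeps data returned by the MCP server from being interpreted as markup in generated pages.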

    Key build commands:

    • npm run build - Runs tsc (TypeScript compilation)
    • npm run lint - ESLint on src/ TypeScript files
    • npm run generate-news - Executes compiled scripts/generators/news-enhanced.js
    • npm run generate-news-indexes - Executes compiled scripts/generators/news-indexes.js

    TypeScript configuration (tsconfig.json):

    • target: ES2024 - Modern JavaScript output
    • module: ESNext - ES module syntax
    • strict: true - Full strict mode enabled
    • rootDir: ./src - TypeScript source root
    • outDir: ./scripts - Compiled JavaScript output

    sequenceDiagram
    participant GHA as GitHub Actions
    participant CLI as CLI Interface
    participant Gen as Article Generator
    participant MCP as MCP Client
    participant EP as EP MCP Server (local)
    participant TPL as Template Engine
    participant FS as File System

    GHA->>CLI: Trigger daily workflow
    CLI->>Gen: generate-news --types=week-ahead --languages=all
    Gen->>MCP: getPlenarySessions()
    Note over MCP,EP: MCP client spawns EP MCP Server as local process (stdio JSON-RPC)
    MCP->>EP: JSON-RPC request (stdio)
    EP-->>MCP: EP data (JSON-RPC response)
    MCP-->>Gen: Parsed EP data (basic shape checks)

    loop For each language (sequential)
    Gen->>TPL: Render HTML(EP data, language)
    Note over Gen,TPL: Placeholder English body content; native per-language LLM generation planned (ADR-004)
    TPL-->>Gen: HTML output
    Gen->>FS: Write article file
    end

    Gen->>FS: Write articles-metadata.json (writeFileSync - failure fails run)
    GHA->>GHA: Commit and push changes
    GHA->>GHA: Deploy to S3 + invalidate CloudFront
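
    The request/response exchange between the MCP client and the locally spawned EP MCP Server can be sketched as plain JSON-RPC 2.0 framing. This is an illustrative sketch only: the helper names (buildToolCall, encodeLine, decodeLine) are hypothetical, and the actual client in mcp/ep-mcp-client.ts may be structured differently. It assumes the MCP stdio convention of one JSON message per line and the MCP "tools/call" method.

```typescript
// JSON-RPC 2.0 message shapes.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

let nextId = 1;

/** Build an MCP tool invocation; toolName and args are supplied by the caller. */
function buildToolCall(toolName: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

/** The stdio transport carries one JSON message per line. */
function encodeLine(request: JsonRpcRequest): string {
  return JSON.stringify(request) + "\n";
}

/** Parse one response line and apply basic shape checks before returning the result. */
function decodeLine(line: string, expectedId: number): unknown {
  const msg = JSON.parse(line) as Partial<JsonRpcResponse>;
  if (msg.jsonrpc !== "2.0" || msg.id !== expectedId) {
    throw new Error("Unexpected JSON-RPC envelope");
  }
  if (msg.error) {
    throw new Error(`MCP error ${msg.error.code}: ${msg.error.message}`);
  }
  return msg.result;
}
```

    Matching the response id against the request id is what lets the client pair replies with pending calls when several requests share the same stdio stream.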
    sequenceDiagram
    participant User as User Browser
    participant CDN as CloudFront CDN
    participant S3 as Amazon S3
    participant Repo as Git Repository

    User->>CDN: GET /index.html
    CDN->>S3: Forward request (cache miss)
    S3-->>CDN: HTML response
    CDN-->>User: Cached HTML

    User->>CDN: GET /news/week-ahead-2026-02-17-en.html
    CDN-->>User: Cached article (or fetch from S3)

    Cross-cutting concerns are aspects of the system that affect multiple components and layers. These concerns are implemented consistently across the entire architecture.

    Logging Levels:

    Level Usage Output Retention
    ERROR Unrecoverable errors (API failures, file write errors) console.error(), GitHub Actions logs 90 days (GitHub)
    WARN Recoverable issues (MCP connection retry/backoff, MCP tool fallback, JSON.parse recovery) console.warn(), GitHub Actions logs 90 days (GitHub)
    INFO Normal operations (generation start/complete, article count) console.log(), GitHub Actions logs 90 days (GitHub)
    DEBUG Detailed diagnostics (API responses, intermediate data) Disabled in production Dev only

    Structured Logging Format:

    {
    timestamp: "2026-02-20T10:30:00.000Z",
    level: "INFO",
    component: "ArticleGenerator",
    action: "generate_article",
    language: "en",
    article_type: "week-ahead",
    duration_ms: 1234,
    status: "success"
    }
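
    A small helper that produces entries in this shape might look like the sketch below. The names LogEntry, buildLogEntry, and emit are hypothetical; only the field names and the level-to-console mapping come from this document.

```typescript
type LogLevel = "ERROR" | "WARN" | "INFO" | "DEBUG";

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  component: string;
  action: string;
  language?: string;
  article_type?: string;
  duration_ms?: number;
  status: "success" | "failure";
}

/** Fill in the timestamp so callers only supply the event-specific fields. */
function buildLogEntry(fields: Omit<LogEntry, "timestamp">, now: Date = new Date()): LogEntry {
  return { timestamp: now.toISOString(), ...fields };
}

/** Route each level to the console method named in the Logging Levels table. */
function emit(entry: LogEntry, debugEnabled = false): void {
  if (entry.level === "DEBUG" && !debugEnabled) return; // DEBUG is disabled in production
  const line = JSON.stringify(entry);
  if (entry.level === "ERROR") console.error(line);
  else if (entry.level === "WARN") console.warn(line);
  else console.log(line);
}
```

    Emitting one JSON object per line keeps the GitHub Actions logs searchable by field without a separate log pipeline.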

    Logging Implementation:

    • Build Logs: All GitHub Actions workflow logs (generation, deployment, tests)
    • Error Tracking: Errors logged to GitHub Actions workflow logs for visibility
    • Performance Metrics: Generation time per article, API call durations
    • Audit Trail: Git commit history serves as audit log for all content changes
    graph TB
    subgraph "Generation Monitoring"
    Workflow[GitHub Actions Workflow]
    GenMetrics[Generation Metrics<br/>Article count, Duration, Errors]
    TestResults[Test Results<br/>Unit, Integration, E2E]
    end

    subgraph "Application Monitoring"
    Pages[Amazon CloudFront + S3]
    Analytics[Web Analytics<br/>Visits, Bounce Rate, Countries]
    Uptime[Uptime Monitoring<br/>AWS Health Dashboard]
    end

    subgraph "Security Monitoring"
    Dependabot[Dependabot Alerts]
    CodeQL[CodeQL Security Scans]
    Audit[npm audit]
    end

    subgraph "Alerting"
    Email[Email Notifications]
    GitHubUI[GitHub UI Alerts]
    Status[Status Checks]
    end

    Workflow -->|Logs| GenMetrics
    Workflow -->|Results| TestResults
    Pages -->|Metrics| Analytics
    Pages -->|Health| Uptime
    Dependabot -->|Alerts| Email
    CodeQL -->|Findings| GitHubUI
    Audit -->|Vulnerabilities| Status

    GenMetrics -->|Failures| Email
    TestResults -->|Failures| Status

    style Dependabot fill:#f99,stroke:#333,stroke-width:2px
    style CodeQL fill:#f99,stroke:#333,stroke-width:2px

    Monitoring Tools:

    Metric Tool Threshold Alert
    Build Success Rate GitHub Actions <95% over 7 days Email to maintainers
    Generation Duration Workflow logs >15 minutes Warning annotation
    Test Pass Rate Vitest + Playwright <100% Block merge
    Security Vulnerabilities Dependabot + CodeQL Any high/critical Email + PR
    Site Availability AWS Health Dashboard <99.9% AWS Health event notification
    Page Load Time Lighthouse (manual runs) >3 seconds Warning annotation
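
    The thresholds in this table can be expressed as data and checked in a single pass. The sketch below is illustrative: the metric keys and the evaluate function are hypothetical names, and only the numeric thresholds and alert actions are taken from the table.

```typescript
// Hypothetical metric keys; thresholds and alert actions come from the Monitoring Tools table.
interface Threshold {
  metric: string;
  breached: (value: number) => boolean;
  alert: string;
}

const THRESHOLDS: Threshold[] = [
  { metric: "buildSuccessRate7d", breached: (v) => v < 0.95, alert: "Email to maintainers" },
  { metric: "generationMinutes", breached: (v) => v > 15, alert: "Warning annotation" },
  { metric: "testPassRate", breached: (v) => v < 1, alert: "Block merge" },
  { metric: "pageLoadSeconds", breached: (v) => v > 3, alert: "Warning annotation" },
];

/** Return the alert action for every metric whose sample crosses its threshold. */
function evaluate(samples: Record<string, number>): string[] {
  const alerts: string[] = [];
  for (const t of THRESHOLDS) {
    const value = samples[t.metric];
    if (value !== undefined && t.breached(value)) {
      alerts.push(`${t.metric}: ${t.alert}`);
    }
  }
  return alerts;
}
```

    Metrics with no sample are skipped rather than treated as breaches, so a missing measurement does not raise a false alert.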

    Error Handling Strategy:

    flowchart TD
    Start([API Call / Operation])
    Try{Try Operation}
    Success[✅ Success]
    Catch{Catch Error}
    Transient{Transient<br/>Error?}
    Retry[Retry with<br/>Exponential Backoff]
    MaxRetries{Max Retries<br/>Reached?}
    Fallback{Fallback<br/>Available?}
    UseFallback[Use Fallback Data]
    LogError[Log Error]
    PropagateError[Propagate Error]
    GracefulDegradation[Graceful Degradation]

    Start --> Try
    Try -->|Success| Success
    Try -->|Error| Catch
    Catch --> Transient
    Transient -->|Yes| Retry
    Transient -->|No| Fallback
    Retry --> MaxRetries
    MaxRetries -->|No| Try
    MaxRetries -->|Yes| Fallback
    Fallback -->|Yes| UseFallback
    Fallback -->|No| LogError
    UseFallback --> GracefulDegradation
    LogError --> PropagateError

    style Success fill:#9f9,stroke:#333,stroke-width:2px
    style LogError fill:#f99,stroke:#333,stroke-width:2px
    style PropagateError fill:#f99,stroke:#333,stroke-width:2px
    style GracefulDegradation fill:#ff9,stroke:#333,stroke-width:2px

    Error Categories and Handling:

    Error Category Examples Retry Strategy Fallback User Impact
    Transient Network Errors MCP connection failure during startup, LLM API rate limit Exponential backoff (1s, 2s, 4s), max 3 retries for MCP connection establishment and LLM calls; individual MCP requests use a single fixed timeout with no retry Use placeholder events or skip affected items (no cache) Missing or placeholder content for affected items
    Permanent API Errors Invalid API key, malformed request No retry Skip article generation for affected language Missing article for specific language
    Data Validation Errors Invalid EP data structure, missing required fields No automatic regeneration loop Skip invalid items (no cached-data fallback) Missing content for invalid items
    File System Errors Disk full, permission denied No retry Fail workflow Build failure (no deployment)
    Content Generation Errors LLM refusal, prompt injection detected Single generation attempt (no automatic regeneration loop) Insert placeholder events when content generation fails Reduced content quality or placeholder content
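
    The retry and fallback behaviour described for transient errors (exponential backoff of 1s, 2s, 4s with at most 3 retries, then a fallback if one exists) can be sketched as a generic wrapper. The names withRetry and RetryOptions are hypothetical, and the project's MCP client may implement this differently.

```typescript
interface RetryOptions {
  maxRetries: number; // retries after the first attempt (3 in the table above)
  baseDelayMs: number; // first delay; doubles on each retry (1000 gives 1s, 2s, 4s)
  isTransient: (error: unknown) => boolean;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
  fallback?: () => T
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (options.isTransient(error) && attempt < options.maxRetries) {
        await sleep(options.baseDelayMs * 2 ** attempt);
        continue; // try again after backing off
      }
      if (fallback) return fallback(); // graceful degradation (e.g. placeholder events)
      throw error; // no fallback available: propagate to the caller
    }
  }
}
```

    Permanent errors skip the backoff entirely and go straight to the fallback, which matches the "No retry" rows of the table.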

    Error Propagation:

    1. Component Level: Catch and log errors, attempt recovery
    2. Service Level: Propagate if unrecoverable, aggregate errors for reporting
    3. Workflow Level: Fail fast if critical (file system), continue if non-critical (single article failure)
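
    The three propagation levels can be illustrated with a loop that records non-critical failures and rethrows critical ones. This is a sketch under assumptions: CriticalError, RunReport, and generateAll are hypothetical names, not identifiers from the codebase.

```typescript
/** Marks failures that must stop the workflow, such as file system errors. */
class CriticalError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof reliable on older compile targets
    this.name = "CriticalError";
  }
}

interface RunReport {
  succeeded: string[];
  failed: { language: string; reason: string }[];
}

/** Generate one article per language, continuing past non-critical failures. */
function generateAll(languages: string[], generateOne: (language: string) => void): RunReport {
  const report: RunReport = { succeeded: [], failed: [] };
  for (const language of languages) {
    try {
      generateOne(language);
      report.succeeded.push(language);
    } catch (error) {
      if (error instanceof CriticalError) throw error; // workflow level: fail fast
      const reason = error instanceof Error ? error.message : String(error);
      report.failed.push({ language, reason }); // service level: aggregate for reporting
    }
  }
  return report;
}
```

    A single failed language then shows up in the report while the remaining languages still get their articles; a CriticalError aborts the run before anything is deployed.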

    14 Languages Supported:

    • 🇬🇧 English (en) - 67 million
    • 🇸🇪 Swedish (sv) - 10 million
    • 🇩🇰 Danish (da) - 6 million
    • 🇳🇴 Norwegian (no) - 5 million
    • 🇫🇮 Finnish (fi) - 5 million
    • 🇩🇪 German (de) - 95 million
    • 🇫🇷 French (fr) - 67 million
    • 🇪🇸 Spanish (es) - 47 million
    • 🇳🇱 Dutch (nl) - 24 million
    • 🇸🇦 Arabic (ar) - 420 million
    • 🇮🇱 Hebrew (he) - 9 million
    • 🇯🇵 Japanese (ja) - 125 million
    • 🇰🇷 Korean (ko) - 77 million
    • 🇨🇳 Chinese (zh) - 1.3 billion
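
    The language list above maps naturally onto a string-literal union. The sketch below is an assumption about how a LanguageCode type could be defined, not a copy of types/index.ts; the helper names textDirection and isLanguageCode are hypothetical. Arabic and Hebrew are the two right-to-left scripts in the list.

```typescript
const LANGUAGE_CODES = [
  "en", "sv", "da", "no", "fi", "de", "fr",
  "es", "nl", "ar", "he", "ja", "ko", "zh",
] as const;

/** Union of the 14 supported codes, derived from the array so the two cannot drift apart. */
type LanguageCode = (typeof LANGUAGE_CODES)[number];

const RTL_LANGUAGES: ReadonlySet<LanguageCode> = new Set<LanguageCode>(["ar", "he"]);

/** Value for the HTML dir attribute of a page in the given language. */
function textDirection(language: LanguageCode): "rtl" | "ltr" {
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

/** Runtime guard for untrusted input such as CLI arguments. */
function isLanguageCode(value: string): value is LanguageCode {
  return (LANGUAGE_CODES as readonly string[]).includes(value);
}
```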

    i18n Architecture:

    graph LR
    subgraph "Content Generation"
    EPData[EP Data<br/>Language-Neutral]
    LLM[LLM Service]
    Prompt[Language-Specific Prompt]
    end

    subgraph "14 Language Variants"
    EN[English Article]
    SV[Swedish Article]
    DA[Danish Article]
    NO[Norwegian Article]
    FI[Finnish Article]
    DE[German Article]
    FR[French Article]
    ES[Spanish Article]
    NL[Dutch Article]
    AR[Arabic Article]
    HE[Hebrew Article]
    JA[Japanese Article]
    KO[Korean Article]
    ZH[Chinese Article]
    end

    subgraph "Delivery"
    Index[Language-Specific<br/>Index Pages]
    Sitemap[Multilingual<br/>Sitemap.xml]
    end

    EPData --> LLM
    Prompt --> LLM
    LLM --> EN
    LLM --> SV
    LLM --> DA
    LLM --> NO
    LLM --> FI
    LLM --> DE
    LLM --> FR
    LLM --> ES
    LLM --> NL
    LLM --> AR
    LLM --> HE
    LLM --> JA
    LLM --> KO
    LLM --> ZH

    EN --> Index
    DE --> Index
    FR --> Index
    ES --> Index
    Index --> Sitemap

    style EPData fill:#9cf,stroke:#333,stroke-width:2px
    style LLM fill:#fc9,stroke:#333,stroke-width:2px

    i18n Implementation:

    Aspect Implementation Example
    Content Generation Placeholder English content for all languages (current); native LLM per-language generation planned (ADR-004) Current: shared English body with localized titles/subtitles; Future: each article written directly in target language
    File Naming Language suffix in filename week-ahead-2026-02-17-en.html, week-ahead-2026-02-17-de.html
    HTML lang Attribute Set per page <html lang="en">, <html lang="de">
    Navigation Language-specific index pages index.html, index-de.html
    SEO hreflang tags for alternate languages <link rel="alternate" hreflang="de" href="...">
    Date Formatting Locale-specific date formats EN: "February 17, 2026", DE: "17. Februar 2026"
    Character Encoding UTF-8 for all languages <meta charset="UTF-8">
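
    Several rows of this table (file naming, index pages, hreflang tags, date formatting) reduce to small pure functions. The sketch below is illustrative: the function names are hypothetical, the /news/ path segment is inferred from the delivery diagram, and the base URL is a parameter rather than the project's real BASE_URL.

```typescript
/** Build the article filename: <type>-<YYYY-MM-DD>-<lang>.html */
function articleFilename(type: string, isoDate: string, language: string): string {
  return `${type}-${isoDate}-${language}.html`;
}

/** English uses index.html; other languages get a suffixed index page. */
function indexFilename(language: string): string {
  return language === "en" ? "index.html" : `index-${language}.html`;
}

/** One <link rel="alternate"> tag per language variant of the same article. */
function hreflangTags(baseUrl: string, type: string, isoDate: string, languages: string[]): string[] {
  return languages.map(
    (lang) =>
      `<link rel="alternate" hreflang="${lang}" href="${baseUrl}/news/${articleFilename(type, isoDate, lang)}">`
  );
}

/** Locale-aware long date; UTC is pinned so the calendar day does not shift with the build machine's time zone. */
function formatDate(isoDate: string, locale: string): string {
  const formatter = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  });
  return formatter.format(new Date(`${isoDate}T00:00:00Z`));
}
```

    For example, formatDate("2026-02-17", "en-US") yields "February 17, 2026" and formatDate("2026-02-17", "de-DE") yields "17. Februar 2026", matching the table.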

    Language Quality Assurance:

    • Current State: Placeholder English body content with localized metadata (title, subtitle, HTML lang attribute, date formats) per language
    • Target State (ADR-004): LLM generates content natively in each language (not machine translation)
    • Cultural Adaptation: Planned; prompts will include cultural context for each language/region
    • Terminology Consistency: EP terminology to be used consistently per language
    • Quality Metrics: Human review of sample articles per language quarterly


    Architecture Decision Records document significant architectural decisions made during the design and development of EU Parliament Monitor. Each ADR captures the context, decision, and consequences of a specific architectural choice.

    Status: Accepted
    Date: 2025-12-01
    Decision Makers: CEO, Development Team

    Context:

    • Need to display European Parliament news to public audience
    • Security is paramount (public-facing system)
    • Limited development resources
    • GitHub Pages available as free hosting solution; AWS S3 + CloudFront chosen for production (see ADR-002)

    Decision: We will build EU Parliament Monitor as a static site generator rather than a dynamic web application with backend services.

    Rationale:

    1. Security: Static sites eliminate entire classes of vulnerabilities (SQL injection, XSS via server-side rendering, authentication bypass)
    2. Scalability: Static content scales infinitely via CDN with no server infrastructure
    3. Cost: Static hosting on AWS S3 + CloudFront is low-cost, no server infrastructure
    4. Maintainability: Simpler architecture with fewer moving parts
    5. Reliability: No database or server downtime risks

    Alternatives Considered:

    • WordPress: Rejected due to security vulnerabilities, plugin maintenance overhead
    • Node.js/Express backend: Rejected due to hosting costs, operational complexity
    • JAMstack with headless CMS: Rejected due to unnecessary complexity for simple content

    Consequences:

    • ✅ Positive: Minimal attack surface, low hosting costs, CDN-backed scalability
    • ✅ Positive: Fast page loads, excellent SEO, simple deployment
    • ⚠️ Negative: Content updates require regeneration (acceptable for daily news)
    • ⚠️ Negative: No real-time interactivity (not required for news consumption)

    Compliance: Aligns with ISO 27001 A.8.28 (Secure Coding), NIST CSF PR.DS-5 (Minimal Attack Surface)


    Status: Accepted
    Date: 2025-12-05
    Decision Makers: CEO, DevOps Team

    Context:

    • Static site architecture chosen (ADR-001)
    • Need reliable, secure hosting with global CDN
    • Budget constraints (low-cost solution preferred)
    • Already using GitHub for source control and CI/CD

    Decision: We will host EU Parliament Monitor on AWS S3 with Amazon CloudFront as the global CDN (see .github/workflows/deploy-s3.yml).

    Rationale:

    1. Cost: Low-cost static hosting within current traffic and budget constraints
    2. Integration: GitHub Actions CI/CD deploys to S3 and invalidates the CloudFront distribution
    3. Security: HTTPS via AWS Certificate Manager, TLS termination at CloudFront edge
    4. Reliability: AWS S3 and CloudFront SLAs provide high availability and durability
    5. Performance: CloudFront global edge network with caching for low-latency delivery

    Alternatives Considered:

    • GitHub Pages: Considered for simplicity and zero direct hosting cost; kept as a documented alternative but not chosen due to less flexible edge configuration
    • Netlify: Rejected due to build minute limits on free tier
    • Vercel: Rejected due to commercial focus, potential future costs
    • Self-hosted Nginx: Rejected due to operational burden, security maintenance

    Consequences:

    • ✅ Positive: Globally distributed static hosting with strong reliability and performance
    • ✅ Positive: Automated deployments from GitHub Actions to S3 with CloudFront cache invalidation
    • ✅ Positive: Integration with AWS security services (WAF, Shield, ACM)
    • ⚠️ Negative: Ongoing AWS hosting costs and need to manage AWS credentials securely
    • ⚠️ Negative: Increased operational complexity compared to GitHub Pages

    Compliance: Aligns with ISO 27001 A.8.24 (Cryptography - HTTPS), CIS Control 1 (Asset Management)


    Status: Accepted
    Date: 2025-12-10
    Decision Makers: CEO, Data Team

    Context:

    • Need structured access to European Parliament data (MEPs, plenary sessions, votes, documents)
    • Official EP APIs are fragmented, inconsistent, and poorly documented
    • Data schemas vary across endpoints
    • Need caching, validation, and error handling

    Decision: We will access European Parliament data via the European Parliament MCP Server using the Model Context Protocol (MCP) rather than calling official EP APIs directly.

    Rationale:

    1. Abstraction: MCP Server provides unified interface to fragmented EP APIs
    2. Data Normalization: Consistent data structures across EP data sources
    3. Error Handling: Connection retry logic and graceful degradation
    4. Maintainability: API changes isolated to MCP Server, not news generator
    5. Local Process: Spawned as stdio JSON-RPC process during build, no separate deployment needed

    Alternatives Considered:

    • Direct EP API calls: Rejected due to fragmentation, lack of validation, poor error handling
    • Custom wrapper library: Rejected due to development overhead, maintenance burden
    • Third-party EP data services: Rejected due to cost, data freshness concerns

    Consequences:

    • ✅ Positive: Clean separation of concerns, reusable data layer
    • ✅ Positive: Standardized data structures, no direct EP API fragmentation
    • ✅ Positive: MCP Server maintained separately, used by multiple clients
    • ⚠️ Negative: Additional dependency (mitigated by fallback data strategy)
    • ⚠️ Negative: Requires MCP Server process availability during build

    Compliance: Aligns with ISO 27001 A.8.3 (Input Validation), NIST CSF PR.DS-2 (Data in Transit Protection)


    Status: Accepted
    Date: 2025-12-15
    Decision Makers: CEO, Content Team

    Context:

    • Need to support 14 languages
    • Machine translation often produces unnatural, awkward phrasing
    • European Parliament terminology requires domain expertise
    • Budget available for LLM API costs

    Decision: We will generate content natively in each language using LLMs rather than translating from a base language.

    Rationale:

    1. Quality: Native generation produces natural, idiomatic language
    2. Cultural Adaptation: LLM can adapt content for cultural context per language
    3. Terminology: LLMs exposed to EP documents during training tend to use the correct institutional terminology
    4. Flexibility: Different article structures possible per language/culture
    5. Scalability: Languages are independent of each other, so generation can run in parallel (the current pipeline iterates sequentially)

    Alternatives Considered:

    • Machine Translation (Google Translate, DeepL): Rejected due to unnatural phrasing, terminology issues
    • Human Translation: Rejected due to cost (~€0.10/word × 14 languages), time delays
    • English-only: Rejected due to accessibility concerns, limited audience

    Consequences:

    • ✅ Positive: High-quality, natural language content in all 14 languages
    • ✅ Positive: Cultural adaptation, correct terminology
    • ⚠️ Negative: Higher LLM API costs (~$5-10/day) vs translation (~$1-2/day)
    • ⚠️ Negative: Content may vary slightly across languages (acceptable, even beneficial)

    Compliance: Aligns with Hack23 AI Policy (Transparency, Human Oversight), ISO 27001 A.5.10 (Information Processing)


    Status: Accepted
    Date: 2026-01-05
    Decision Makers: CEO, Development Team

    Context:

    • Building news generation scripts and static site generator
    • Need compile-time type safety for complex data structures from EP MCP Server
    • Multiple article categories, 14 languages, and complex data pipelines
    • Small development team (1-2 developers) benefits from IDE support

    Decision: We will use TypeScript (strict mode) as the primary development language, compiling from src/ to scripts/ targeting ES2024.

    Rationale:

    1. Type Safety: Strict mode catches errors at compile time, especially important for complex EP data structures and MCP client interfaces
    2. IDE Support: Full IntelliSense, refactoring, and navigation in VS Code
    3. Self-Documenting: TypeScript interfaces serve as living documentation for data models (ArticleCategory, LanguageCode, MCPToolResult, etc.)
    4. Build Pipeline: tsc compiles src/*.ts → scripts/*.js; rootDir: ./src, outDir: ./scripts, target: ES2024, module: ESNext
    5. Ecosystem: Full access to Node.js and npm ecosystem with type definitions

    Alternatives Considered:

    • JavaScript (ES2024) with JSDoc: Rejected due to weaker type guarantees, less comprehensive IDE support for complex interfaces
    • Flow: Rejected due to declining community support
    • JavaScript ES2015: Rejected due to lack of modern features (optional chaining, nullish coalescing)

    Consequences:

    • ✅ Positive: Compile-time error detection, comprehensive IDE support, self-documenting code
    • ✅ Positive: Strict null checks prevent runtime errors with optional EP data fields
    • ⚠️ Negative: Requires build step (npm run build / tsc) before execution
    • ⚠️ Negative: Slightly higher learning curve for contributors unfamiliar with TypeScript

    Compliance: Aligns with Hack23 Secure Development Policy (Type Safety Principle), ISO 27001 A.8.28 (Secure Coding)


    Non-functional requirements define system qualities that are not directly related to specific features but are critical to overall system success.

    Requirement Target Measurement Current Status
    Page Load Time (Desktop) <1 second (LCP) Lighthouse (manual runs) ✅ 0.6s average
    Page Load Time (Mobile) <2 seconds (LCP) Lighthouse (manual runs) ✅ 1.2s average
    Build Time (All Languages) <15 minutes GitHub Actions logs ✅ 8-12 minutes
    Article Generation (Single) <30 seconds Script logs ✅ 15-25 seconds
    MCP API Response Time <2 seconds (p95) Client logs ✅ 1.1s average
    CDN Cache Hit Rate >95% CloudFront metrics (planned) ⏳ TBD (instrumentation planned)

    Performance Optimization Strategies:

    • Static Content: All content pre-generated, no server-side processing
    • CDN Caching: Tiered caching strategy (1 hour for HTML, 1 day for metadata, 1 year for immutable assets)
    • Image Optimization: None required (no images in MVP)
    • Minification: HTML minification (future), CSS minification (future)
    • HTTP/2: Enabled by default on Amazon CloudFront
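
    The tiered caching strategy above (1 hour for HTML, 1 day for metadata, 1 year for immutable assets) can be sketched as a path-to-header mapping. The function name and the file-extension groupings are assumptions for illustration; the document does not specify which extensions fall into each tier or how the headers are applied during deployment.

```typescript
const HOUR = 3600;
const DAY = 86400;
const YEAR = 31536000; // 365 days in seconds

/** Pick a Cache-Control value for an object based on its file extension (assumed grouping). */
function cacheControlFor(path: string): string {
  if (/\.html?$/i.test(path)) return `public, max-age=${HOUR}`;
  if (/\.(json|xml)$/i.test(path)) return `public, max-age=${DAY}`;
  if (/\.(css|js|woff2?|svg|png|ico)$/i.test(path)) return `public, max-age=${YEAR}, immutable`;
  return `public, max-age=${HOUR}`; // conservative default for unknown types
}
```

    Short lifetimes on HTML let daily articles appear quickly, while long-lived immutable assets keep the cache hit rate high.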
    Dimension Current Capacity Target Capacity Scaling Strategy
    Concurrent Users Unlimited (static content) Unlimited CDN auto-scales
    Daily Visitors 10,000+ 100,000+ CDN bandwidth increase
    Articles per Day 14 (one per language) 140 (ten per language) Parallel generation, workflow optimization
    Supported Languages 14 24+ (expanded markets) Add language configs, LLM prompts
    Repository Size 150 MB 800 MB (GitHub limit) Archive old articles annually

    Scalability Constraints:

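    • AWS S3: No practical bucket size limit for static hosting; storage costs increase linearly with content volume
    • GitHub Actions: Unlimited minutes for public repositories on standard GitHub-hosted runners (the 2,000 minutes/month free quota applies to private repositories)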
    • LLM API: Rate limits vary by provider (typically 3000 RPM for tier 2)
    Requirement Target Measurement Consequence of Failure
    Site Availability 99.9% (AWS CloudFront/S3 SLA) GitHub Status + AWS Health Dashboard Users cannot access news
    Build Success Rate >98% GitHub Actions logs No new content deployed
    MCP API Availability >99% (best effort) Health checks Fallback to placeholder events (no cached/previous data)
    LLM API Availability >99.5% (provider SLA) API logs Generation fails, retry logic
    Recovery Time Objective (RTO) <15 minutes Manual testing Time to restore service after outage
    Recovery Point Objective (RPO) <24 hours Git history Maximum data loss acceptable

    High Availability Strategies:

    • Static Architecture: No single point of failure (SPOF) in runtime
    • CDN Redundancy: Amazon CloudFront with multiple edge locations globally
    • Fallback Data: Use placeholder events if EP MCP Server unavailable (no cache/previous-data reuse)
    • Retry Logic: Exponential backoff for transient failures
    • Monitoring: GitHub Status, Dependabot alerts, workflow notifications
    Requirement Implementation Verification Compliance
    HTTPS-Only CloudFront enforces HTTPS redirect via ACM certificate Manual testing ISO 27001 A.8.24
    Content Security Policy (CSP) Planned strict CSP via CloudFront response headers (no CSP meta tag in HTML templates currently) CSP Evaluator (staging/production) ISO 27001 A.8.23
    No Secrets in Repository GitHub Secrets for API keys Git history scan ISO 27001 A.8.3
    Dependency Vulnerability Scanning Dependabot daily scans GitHub Security tab CIS Control 10
    SAST (Static Application Security Testing) CodeQL weekly + PR GitHub Code Scanning ISO 27001 A.8.28
    Access Control GitHub RBAC, branch protection Repository settings CIS Control 6
    Audit Logging GitHub audit logs, workflow logs Logs API ISO 27001 A.8.15
    Data Classification All content PUBLIC CLASSIFICATION.md ISO 27001 A.5.10
    Incident Response SECURITY.md procedures Quarterly reviews NIST CSF RS.RP

    Security Testing:

    • SAST: CodeQL (weekly + PR) - JavaScript/TypeScript, HTML
    • Dependency Scanning: Dependabot (daily) + npm audit (pre-commit)
    • Manual Penetration Testing: Not required (static site, no user input)
    • Security Reviews: Quarterly architecture review
    Criterion Requirement Implementation Testing
    Perceivable Text alternatives, adaptable content, distinguishable Semantic HTML5, alt text, contrast ratios Playwright axe tests
    Operable Keyboard accessible, enough time, navigable, input modalities Focus management, skip links, ARIA labels Manual keyboard testing
    Understandable Readable, predictable, input assistance lang attributes, consistent navigation, form labels Lighthouse accessibility
    Robust Compatible with assistive technologies Valid HTML5, ARIA roles HTML validator

    Accessibility Targets:

    • WCAG 2.1 AA Compliance: 100% (mandatory)
    • Lighthouse Accessibility Score: >95% (target 100%)
    • Keyboard Navigation: All interactive elements accessible
    • Screen Reader Support: JAWS, NVDA, VoiceOver tested quarterly

    Accessibility Testing:

    • Automated: Playwright with axe-core (every PR)
    • Manual: Quarterly screen reader testing, keyboard navigation
    • Tools: Lighthouse (manual runs), axe DevTools, HTML validator
    Metric Target Current Tool
    Code Coverage >80% lines 82% Vitest
    Branch Coverage >80% branches 83% Vitest
    Cognitive Complexity <15 per function <10 average ESLint sonarjs cognitive-complexity rule
    Code Duplication <3% <2% Manual review
    Documentation Coverage 100% public APIs 95% JSDoc, manual review
    Build Time <5 minutes (tests only) 3-4 minutes GitHub Actions

    Maintainability Practices:

    • Code Review: All PRs require approval
    • Documentation: Architecture, security, process docs maintained
    • Testing: Unit (Vitest), E2E (Playwright), accessibility (automated axe-core checks plus manual review)
    • Linting: ESLint with security plugin, Prettier formatting
    • Dependencies: Minimal (1 production, 28 dev), regularly updated

    • Minimal Attack Surface: Static architecture eliminates server-side vulnerabilities
    • No Runtime Execution: Pure HTML/CSS with no backend processing
    • Content Security Policy: Strict CSP via CloudFront response headers (planned) to mitigate XSS
    • HTTPS Only: All content delivered over HTTPS
    • Generation: News generation scripts (TypeScript → Node.js)
    • Presentation: Static HTML/CSS
    • Data Access: MCP Client abstraction
    • Infrastructure: GitHub Actions CI/CD with AWS S3 + CloudFront hosting
    • 14 Languages Supported: Full multi-language coverage including RTL support
    • Language-Specific Indexes: Separate navigation for each language
    • SEO Per Language: Individual sitemaps and metadata
    • Minimal Dependencies: One production dependency (european-parliament-mcp-server for build-time data access), only dev dependencies otherwise
    • Standard Technologies: HTML5, CSS3, TypeScript (compiled to ES2024 JavaScript)
    • Comprehensive Testing: Unit, integration, and E2E tests
    • Documentation: Architecture, security, and process docs
    • Static Content: Infinite scalability via CDN
    • No Database: No scaling bottlenecks
    • Cacheable: All content highly cacheable
    • CDN Infrastructure: Leverages Amazon CloudFront's global edge network

    • Cold Start: N/A (static site, no cold starts)
    • Page Load: < 1s (static HTML, CDN cached)
    • Build Time: ~8-12 minutes (generation for all languages)
    • Deployment Time: ~1-2 minutes (S3 sync + CloudFront invalidation)
    • Target: 99.9% (AWS CloudFront/S3 SLA)
    • Redundancy: CloudFront with multiple edge locations globally
    • Failover: Automatic via AWS infrastructure
    • Monitoring: AWS Health Dashboard, GitHub Status page
    • Attack Surface: Minimal (static files only)
    • Vulnerability Scanning: Daily (Dependabot + npm audit)
    • SAST: Weekly (CodeQL)
    • Compliance: ISO 27001, GDPR, NIS2, EU CRA aligned
    • Code Complexity: Low (TypeScript scripts, no frameworks)
    • Test Coverage: 82%+ lines, 83%+ branches
    • Documentation: Comprehensive (10+ architecture docs)
    • Dependencies: 1 production (european-parliament-mcp-server), 28 dev dependencies