EU Parliament Monitor — API Documentation - v0.9.24

    Module Aggregator/Metadata/TextUtils

    Pure text / Markdown classification and label-stripping helpers used by the metadata resolver chain. Constants live in text-utils-constants.ts; byte-budget truncators and sentence extraction live in text-truncate.ts. This file re-exports the full public surface so existing call sites keep working.
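To illustrate what a "byte-budget" truncator means in practice, here is a minimal sketch that cuts a string to a UTF-8 byte limit without splitting a code point. The name `truncateToByteBudget` and its signature are hypothetical; the documented truncators (truncateDescription, truncateTitle, and so on) may apply different limits and additional rules, such as clause boundaries or stop-word trimming, that are not shown here.

```typescript
// Hypothetical helper for illustration only; not part of the documented module.
// Assumption: the budget is measured in UTF-8 bytes.
function truncateToByteBudget(text: string, maxBytes: number): string {
  const encoder = new TextEncoder();
  let used = 0;
  let out = "";
  // for...of iterates by Unicode code point, so surrogate pairs stay intact.
  for (const ch of text) {
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break; // the next character would exceed the budget
    used += size;
    out += ch;
  }
  return out;
}
```

Because the loop works per code point, a multi-byte character such as "é" is either kept whole or dropped, never cut in half.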

    Bounded-context rules:

    • No upward imports — pure helpers, no I/O, no globals.
    • Deterministic — same input always produces same output.
    • Locale-agnostic — every helper works on raw Markdown / prose in any of the 14 publishing languages. Banner-row detection is driven by structural shape (double-bold + pipe-separator), not by a hard-coded English vocabulary.
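The structural approach to banner-row detection described in the last rule can be sketched as follows. This is an assumption-laden illustration, not the module's actual implementation: the helper name `looksLikeBannerRow` is hypothetical, and "double-bold + pipe-separator" is interpreted here as "at least two `**bold**` spans plus at least one `|` on the same line".

```typescript
// Hypothetical helper for illustration only; not part of the documented module.
// Assumption: a banner row contains two or more **bold** spans and a pipe separator.
const BOLD_SPAN = /\*\*[^*]+\*\*/g;

function looksLikeBannerRow(line: string): boolean {
  // Count bold spans by shape alone; the words inside them are never inspected.
  const boldSpans = line.match(BOLD_SPAN) ?? [];
  return boldSpans.length >= 2 && line.includes("|");
}
```

Since the check never looks at the label text, a German row such as `**Datum:** 2024-05-01 | **Typ:** Briefing` is classified the same way as its English equivalent, while ordinary prose with a single bold word is not.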

    Functions

    shouldSkipDescriptionLine
    stripLeadingProseLabel
    stripLeadingBoldLabel
    stripInlineMarkdown
    extractFirstSentence → extractFirstSentence
    stripTrailingStopWordsAndPunctuation → stripTrailingStopWordsAndPunctuation
    truncateDescription → truncateDescription
    truncateExtendedDescription → truncateExtendedDescription
    truncateTitle → truncateTitle

    Variables

    ABBREVIATION_PREFIXES → ABBREVIATION_PREFIXES
    DESCRIPTION_MAX_LENGTH → DESCRIPTION_MAX_LENGTH
    DESCRIPTION_MIN_LENGTH → DESCRIPTION_MIN_LENGTH
    EMOJI_BANNER_CHARS → EMOJI_BANNER_CHARS
    ENRICHMENT_TRIGGER_LENGTH → ENRICHMENT_TRIGGER_LENGTH
    EXTENDED_DESCRIPTION_MAX_LENGTH → EXTENDED_DESCRIPTION_MAX_LENGTH
    EXTENDED_DESCRIPTION_MIN_LENGTH → EXTENDED_DESCRIPTION_MIN_LENGTH
    HEADLINE_CLAUSE_BOUNDARIES → HEADLINE_CLAUSE_BOUNDARIES
    HEADLINE_SOFT_MIN → HEADLINE_SOFT_MIN
    METADATA_LINE_PREFIXES → METADATA_LINE_PREFIXES
    TITLE_MAX_LENGTH → TITLE_MAX_LENGTH
    TRAILING_PUNCT → TRAILING_PUNCT
    TRAILING_STOP_WORDS → TRAILING_STOP_WORDS