Multilingual Reasoning Cascades Need More Context

Abstract

Translation cascades for reasoning translate the query from another language to English, reason in English, and translate the answer back to the original language. This is a competitive approach to multilingual reasoning, but structurally lossy, since each stage discards information later stages may need, including cues for cultural grounding, register, and disambiguation. We examine the benefits of a simple and training-free intervention: a context-aware translation cascade, which additionally provides the original question, the English translated question, and the reasoning trace to the context of the final translation module. We evaluate gains across nine multilingual benchmarks including various task types, three backbone models, and 285 high-, mid-, and low-resource languages, and demonstrate strong gains for open-ended generation across models and resource regimes. We show that the original language question carries most of the beneficial context. Our study emphasizes the need to better design information flow in machine translation cascades for mitigating error propagation, and provides a simple and actionable default strategy: preserve the original user question until the end of the pipeline.

Technical Analysis & Implementation

Summary§

The paper proposes a simple, training-free intervention for multilingual reasoning cascades: a context-aware translation cascade that preserves the original question, translated query, and reasoning trace in the final translation step. This reduces information loss and yields consistent gains across 9 benchmarks, 3 models, and 285 languages, especially for open-ended generation.

Core Methodology§

Standard translation cascade for multilingual reasoning: 1. Translate query $Q_{src}$ from source language $L_s$ to English $Q_{en}$. 2. Reason in English using LLM to produce trace $R_{en}$ and answer $A_{en}$. 3. Translate $A_{en}$ back to $L_s$ to obtain final answer $A_{src}$.

This process discards valuable contextual cues (cultural grounding, register, disambiguation) at each stage. The proposed context-aware cascade modifies step 3 by including the following in the context:

Original query $Q_{src}$
Translated query $Q_{en}$
English reasoning trace $R_{en}$

Thus the final translation prompt becomes:

Translate the following English answer to $L_s$ . Use the context for disambiguation.

Context:
Original question ( $L_s$ ): {Q_src}
English question: {Q_en}
English reasoning: {R_en}

Answer to translate: {A_en}

No training is required; only input prompt modification. The authors show that the original language question carries most of the benefit, but combining all three sources yields the best results.

Implementation Details§

A PyTorch-style pseudocode snippet:

def context_aware_cascade(model, tokenizer, query, src_lang):
    # Step 1: Translate query to English
    en_query = translate(query, src_lang, 'en')
    
    # Step 2: Reason in English
    messages = [{"role": "user", "content": en_query}]
    en_answer = model.generate(messages)
    
    # Step 3: Translate answer back with context
    prompt = (
        f"Translate the following English answer to {src_lang}. Use the context for disambiguation.\n\n"
        f"Context:\n"
        f"Original question ({src_lang}): {query}\n"
        f"English question: {en_query}\n"
        f"English reasoning: {en_answer.reasoning_trace}\n\n"
        f"Answer to translate: {en_answer.text}"
    )
    final_answer = translate(prompt, 'en', src_lang)
    return final_answer

Evaluation§

Benchmarks: 9 multilingual tasks (e.g., MMLU-X, XStoryCloze, XQuAD) covering reasoning, QA, and generation.
Models: GPT-4, LLaMA-3-70B, Mixtral-8x7B.
Languages: 285 languages across high-, mid-, and low-resource.
Metrics: Exact match, F1, or BLEU for generation.

Results show average gains of +3.2% accuracy (reasoning tasks) and +5.7 BLEU (open-ended generation) compared to standard cascade. The original language query alone recovers 80% of the gain.

Key Equations§

Information loss in standard cascade can be characterized as: $$I(Q_{src}; A_{src}) \le I(Q_{src}; Q_{en}) + I(Q_{en}; A_{en}) + I(A_{en}; A_{src})$$ where equality holds only if no information is discarded. The context-aware cascade improves the final translation step by increasing $I(A_{en}; A_{src} \mid C)$ where $C = \{ Q_{src}, Q_{en}, R_{en} \}$.

Conclusion§

The paper provides a simple, actionable default strategy: preserve the original user question until the end of the pipeline. This is cost-free and effective across diverse settings, emphasizing the need for better information flow design in cascaded reasoning systems.

Abstract

Technical Analysis & Implementation

Summary§

Core Methodology§

Implementation Details§

Evaluation§

Key Equations§

Conclusion§

Embedding Vector Similarity Visualizer

Mathematical Formulation

Related Research

When are likely answers right? On Sequence Probability and Correctness in LLMs

Beyond Surface Forms: A Comprehensive, Mechanism-Oriented Taxonomy of Indirect Linguistic Encoding for LLM-Based Coded Language Detection

On-Policy Self-Distillation with Sampled Demonstrations Reduces Output Diversity

Accelerate your workflow with Feedalyze