Project Overview
TRIM-QA is a table question-answering retrieval pipeline built to improve how language models handle structured data. The system combines BM25 retrieval, table pruning, and semantic reranking so models can focus on the most relevant rows and columns instead of processing noisy table context.
Challenges
Traditional retrieval pipelines often return large tables with too much irrelevant information, which can distract downstream models and weaken answer quality on structured-data questions.
Solution
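The project tackles this in four steps: stronger tokenization for BM25 retrieval, hierarchical chunking of tables into rows and columns, semantic pruning of those chunks with SBERT, and reranking. Together these steps cut irrelevant context before the table reaches the question-answering model.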
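The retrieval and pruning steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`tokenize`, `bm25_scores`, `prune_rows`, `overlap_similarity`), the BM25 parameters `k1=1.5` and `b=0.75`, and the sample table are all hypothetical. A simple token-overlap score stands in for SBERT cosine similarity so that the example runs without external libraries.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase, then split into runs of letters or digits (so 'Q3sales' -> q, 3, sales)."""
    return re.findall(r"[a-z]+|\d+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (non-negative IDF variant)."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def overlap_similarity(a, b):
    """Assumption: Jaccard token overlap as a stand-in for SBERT cosine similarity."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def prune_rows(question, header, rows, similarity=overlap_similarity, keep=2):
    """Keep the `keep` rows most similar to the question, in their original order."""
    scored = []
    for i, row in enumerate(rows):
        # Serialize each row as "header cell header cell ..." before scoring
        row_text = " ".join(f"{h} {c}" for h, c in zip(header, row))
        scored.append((similarity(question, row_text), i))
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:keep]
    return [rows[i] for i in sorted(i for _, i in top)]


# Hypothetical sample data
question = "Which city hosted the 2012 Olympics?"
header = ["Year", "City", "Country"]
rows = [
    ["2008", "Beijing", "China"],
    ["2012", "London", "United Kingdom"],
    ["2016", "Rio de Janeiro", "Brazil"],
]
print(prune_rows(question, header, rows, keep=1))
# [['2012', 'London', 'United Kingdom']]
```

In a real setup, `overlap_similarity` would be replaced by cosine similarity between sentence embeddings, and the same scoring function could be applied to columns as well as rows. The `similarity` argument is left pluggable for that reason.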
Impact & Results
The pipeline improved retrieval recall from 84.67% to 96.67% and strengthened top-ranked results on the NQ-Tables benchmark.
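For readers unfamiliar with the metric, here is a hedged sketch of how retrieval recall is commonly computed. The write-up does not state the cutoff or the exact definition used, so this assumes recall@k means the fraction of questions whose gold table appears among the top k retrieved tables. The function name `recall_at_k` and the toy rankings are invented for illustration. Only the two percentages in the gain calculation come from the reported results.

```python
def recall_at_k(ranked_lists, gold_ids, k):
    """Fraction of questions whose gold table id appears in the top-k retrieved ids."""
    if not gold_ids:
        return 0.0
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_lists, gold_ids))
    return hits / len(gold_ids)


# Toy data, invented for illustration only (not benchmark results)
ranked_lists = [
    ["t1", "t7", "t3"],  # retrieved table ids for question 1, best first
    ["t4", "t2", "t9"],  # question 2
    ["t5", "t6", "t8"],  # question 3
]
gold_ids = ["t7", "t9", "t0"]  # the correct table for each question

for k in (1, 2, 3):
    print(k, recall_at_k(ranked_lists, gold_ids, k))

# Gain implied by the reported recall figures
before, after = 84.67, 96.67
absolute_gain = round(after - before, 2)                   # percentage points
relative_gain = round((after - before) / before * 100, 2)  # percent of the baseline
print(absolute_gain, relative_gain)
# 12.0 14.17
```

The absolute gain is stated in percentage points and the relative gain as a percentage of the baseline. The two are easy to confuse when comparing retrieval systems.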