Plato's Cave: A Human-Centered Research Verification System
arXiv:2603.23526v1 Announce Type: new Abstract: The growing publication rate of research papers has created an urgent need for better ways to fact-check information, assess writing …
arXiv:2603.23525v1 Announce Type: new Abstract: The economics of prompt compression depend not only on reducing input tokens but on how compression changes output length, which …
arXiv:2603.23524v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enable mapping to human-interpretable concepts. The …
arXiv:2603.23523v1 Announce Type: new Abstract: Recent 3D Large-Language Models (3D-LLMs) claim to understand 3D worlds, especially spatial relationships among objects. Yet, we find that simply …
arXiv:2603.23522v1 Announce Type: new Abstract: Evaluating large language models (LLMs) on open-ended questions is difficult because response quality depends on the question's context. Binary scores …
arXiv:2603.23521v1 Announce Type: new Abstract: Multimodal research has predominantly focused on single-image reasoning, with limited exploration of multi-image scenarios. Recent models have sought to enhance …
arXiv:2603.23520v1 Announce Type: new Abstract: Medicine is an empirical discipline refined through long-term observation and the messy, high-variance reality of clinical practice. Physicians build diagnostic …
arXiv:2603.23519v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various specialist domains and have been integrated into high-stakes areas such …
arXiv:2603.23518v1 Announce Type: new Abstract: General-purpose embedding models excel at recognizing semantic similarities but fail to capture the characteristics of texts specified by user instructions. …
arXiv:2603.23516v1 Announce Type: new Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information remains a long-standing pursuit in the …
arXiv:2603.23515v1 Announce Type: new Abstract: Improving the accuracy and reliability of medical coding reduces clinician burnout and supports revenue cycle processes, freeing providers to focus …
arXiv:2603.23514v1 Announce Type: new Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain-specific details. No existing methodology …