2024/11/22
- Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
- OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs
- UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages
- MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
2024/11/21
- SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
- Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
- When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
- 𝕍𝕚𝔹𝕖: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
- ORID: Organ-Regional Information Driven Framework for Radiology Report Generation
2024/11/20
2024/11/19
- Generative World Explorer
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
- Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering
- SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers
- VeGaS: Video Gaussian Splatting

Note: This site is experimental and relies on large language models, so the accuracy of its content cannot be guaranteed.