Electronic health records (EHRs) contain rich longitudinal information essential for clinical decision-making, yet large language models (LLMs) struggle to reason across patient timelines. We introduce \textbf{TIMER} (\textbf{T}emporal \textbf{I}nstruction \textbf{M}odeling and \textbf{E}valuation for Longitudinal Clinical \textbf{R}ecords), a method that improves LLMs’ temporal reasoning over multi-visit EHRs through time-aware instruction tuning. TIMER grounds LLMs in patient-specific temporal context by linking each instruction-response pair to specific timestamps, preserving temporal fidelity throughout training. Evaluations show that TIMER-tuned models outperform conventional medical instruction-tuned models by 6.6% in completeness on clinician-curated benchmarks, and that distribution-matched training yields gains of up to 6.5% on temporal reasoning evaluation. Qualitative analyses reveal that TIMER improves temporal boundary adherence, trend detection, and chronological precision, capabilities necessary for applications such as disease trajectory modeling and treatment response monitoring. Overall, TIMER provides a methodological basis for developing LLMs that can effectively engage with the inherently longitudinal nature of data generated in patient care.