ICLR 2026 Reviewer Guide
Thank you for agreeing to serve as an ICLR 2026 reviewer. Your contribution as a reviewer is paramount to creating an exciting and high-quality program. We ask that:
- Your reviews are timely and substantive.
- You follow the reviewing guidelines below.
- You adhere to our Code of Ethics in your role as a reviewer. You must also adhere to our Code of Conduct.

This guide is intended to help you understand the ICLR 2026 decision process and your role within it. It contains:
- An outline of the main reviewer tasks
- Step-by-step reviewing instructions (especially relevant for reviewers who are new to ICLR)
- Review examples
- An FAQ

We're counting on you
As a reviewer, you are central to the program creation process for ICLR 2026. Your Area Chairs (ACs), Senior Area Chairs (SACs) and the Program Chairs (PCs) will rely greatly on your expertise and your diligent and thorough reviews to make decisions on each paper. Your role as a reviewer is therefore critical to ensuring a strong program for ICLR 2026. High-quality reviews are also very valuable for helping authors improve their work, whether or not it is eventually accepted by ICLR 2026. It is therefore important to treat each valid ICLR 2026 submission with equal care. As a token of our appreciation for your essential work, top reviewers will be acknowledged permanently on the ICLR 2026 website.
Main reviewer tasks
The main reviewer tasks and dates are as follows (subject to minor changes):
- Create or update your OpenReview profile (September 19, 2025)
- Bid on papers (September 28 - October 4, 2025)
- Write a constructive, thorough and timely review (October 10 - November 1, 2025)
- Initial paper reviews released (November 11, 2025)
- Discuss with authors and other reviewers to clarify and improve the paper (November 11 - December 3, 2025)
- Flag any potential CoE violations and/or concerns (by November 26, 2025)
- Provide a recommendation to the area chair assigned to the paper (by December 3, 2025)
- Reviewer/AC discussions and a virtual meeting, at the AC's discretion, if a paper you reviewed falls among the borderline papers (December 3 - December 10, 2025)
- Provide a final recommendation to the area chair assigned to the paper (after the virtual meeting)

Measures for excessively late or low-quality reviews
Timely, high-quality reviews are essential to the peer review process. We hope that all reviewers will adhere to this expectation. However, reviewers who submit late or poor-quality reviews will be subject to the following penalties. Following NeurIPS 2025, reviewers who are also authors (and their co-authors) will not see the reviews of their own submission(s) during the rebuttal period until they have completed all of their assigned reviews. If reviews are late, the reviewers (and their co-authors) will lose access to the reviews of their own papers until completion of their professional reviews (up to two days before the end of the author rebuttal period). Furthermore, reviewers who submit low-quality reviews and fail to improve them upon being warned by ACs may have their own papers desk rejected: low-quality reviews (e.g., placeholder reviews) will be flagged by ACs and SACs, and the flagged reviewers will be warned and urged to update the review. Reviewers who do not respond to these warnings will be liable to having their own papers desk rejected.
Code of Ethics
All ICLR participants, including reviewers, are required to adhere to the ICLR Code of Ethics (https://iclr.cc/public/CodeOfEthics). All reviewers are required to read the Code of Ethics and adhere to it. The Code of Ethics applies to all conference participation, including paper submission, reviewing, and paper discussion. As part of the review process, reviewers are asked to raise potential violations of the ICLR Code of Ethics. Note that authors are encouraged to discuss questions and potential issues regarding the Code of Ethics as part of their submission. This discussion is not counted against the maximum page limit of the paper and should be included as a separate section.

The Use of Large Language Models (LLMs)
The use of LLMs is allowed as a general-purpose writing assistance tool. However, reviewers should understand that they take full responsibility for the contents written under their name, including content generated by LLMs that could be construed as plagiarism, scientific misconduct, or low quality (e.g., fabrication of facts). Reviews that exhibit such issues may be flagged as low quality, thus putting the reviewers' papers at risk of desk rejection (see above). Note that, new this year, we are asking that authors disclose any significant usage of LLMs in research ideation/writing. If such LLM usage is uncovered during the discussion but was not disclosed in the paper, please notify the area chair. Just as for authors, new this year, we mandate that reviewers disclose the use of LLMs in their reviews. The review form will include a field to specify how you used LLMs, if at all. Failing to disclose this usage may put the reviewers' papers at risk of desk rejection as well.

Reviewing a submission: step-by-step
Summarized in one sentence, a review aims to determine whether a submission will bring sufficient value to the community and contribute new knowledge.
The process can be broken down into the following main reviewer tasks:

Read the paper: It is important to carefully read through the entire paper and to look up any related work and citations that will help you comprehensively evaluate it. Be sure to give yourself sufficient time for this step. While reading, consider the following:
- Objective of the work: What is the goal of the paper? Is it to better address a known application or problem, draw attention to a new application or problem, or to introduce and/or explain a new theoretical finding? A combination of these? Different objectives will require different considerations as to potential value and impact.
- Strong points: Is the submission clear, technically correct, experimentally rigorous, and reproducible? Does it present novel findings (e.g., theoretical, algorithmic, etc.)?
- Weak points: Is it weak in any of the aspects listed under strong points?
- Be mindful of potential biases and try to be open-minded about the value and interest a paper can hold for the entire ICLR community, even if it may not be very interesting to you.

Answer four key questions for yourself to make a recommendation to Accept or Reject:
- What is the specific question and/or problem tackled by the paper?
- Is the approach well motivated, including being well-placed in the literature?
- Does the paper support the claims? This includes determining whether the results, theoretical or empirical, are correct and scientifically rigorous.
- What is the significance of the work? Does it contribute new knowledge and sufficient value to the community? Note that this does not necessarily require state-of-the-art results. Submissions bring value to the ICLR community when they convincingly demonstrate new, relevant, impactful knowledge (empirical, theoretical, for practitioners, etc.).

Write and submit your initial review, organizing it as follows:
- Summarize what the paper claims to contribute. Be positive and constructive.
- List strong and weak points of the paper. Be as comprehensive as possible.
- Clearly state your initial recommendation (accept or reject) with one or two key reasons for this choice.
- Provide supporting arguments for your recommendation.
- Ask questions you would like answered by the authors to help you clarify your understanding of the paper and provide the additional evidence you need to be confident in your assessment.
- Provide additional feedback with the aim of improving the paper. Make it clear that these points are here to help, and not necessarily part of your decision assessment.

Complete the CoE report: ICLR has adopted the Code of Ethics (CoE) described above. When submitting your review, you will be asked to complete a CoE report for the paper. The report is a simple form with two questions. The first asks whether there is a potential violation of the CoE. The second is relevant only if there is a potential violation, and asks the reviewer to explain why there may be one. To answer these questions, it is therefore important that you read the CoE before starting your reviews.

Engage in discussion: During this phase, reviewers, authors and area chairs engage in asynchronous discussion, and authors are allowed to revise their submissions to address concerns that arise. It is crucial that you are actively engaged during this phase. Maintain a spirit of openness to changing your initial rating, whether to a more positive or a more negative one.

Borderline paper meeting: Similarly to last year, ACs are encouraged to (virtually) meet with reviewers to discuss borderline cases. This is to ensure active discussions among reviewers and well-thought-out decisions. ACs will reach out to schedule the meeting and will facilitate the discussion. For a productive discussion, it is important to familiarize yourself with the other reviewers' feedback prior to the meeting.
Please note that we will be keeping track of reviewers who fail to attend this meeting (excluding emergencies).

Provide final recommendation: Update your review, taking into account the new information collected during the discussion phase and any revisions to the submission. (Note that reviewers can change their reviews after the author response period.) State your reasoning and what did or did not change your recommendation throughout the discussion phase.

For in-depth resources on reviewing, see:
- Daniel Dennett, Criticising with Kindness
- Views from multiple reviewers: Last minute reviewing advice
- Perspective from instructions to Area Chairs: Dear ACs

Review Examples
Here are two sample reviews from previous conferences that exemplify what we consider a good review, one for a paper leaning to accept and one for a paper leaning to reject.

Review for a Paper Leaning to Accept
This paper proposes a method, Dual-AC, for optimizing the actor (policy) and critic (value function) simultaneously, which takes the form of a zero-sum game, resulting in a principled method for using the critic to optimize the actor. To achieve this, the authors take the linear programming approach to solving the Bellman optimality equations, outline the deficiencies of this approach, and propose solutions to mitigate those problems. The discussion of the deficiencies of the naive LP approach is mostly well done. Their main contribution is extending the single-step LP formulation to a multi-step dual form that reduces the bias and makes the connection between policy and value function optimization much clearer without losing convexity, by applying a regularization. They perform an empirical study in the Inverted Double Pendulum domain to conclude that their extended algorithm outperforms the naive linear programming approach without the improvements.
Lastly, there are empirical experiments done to conclude the superior performance of Dual-AC in contrast to other actor-critic algorithms. Overall, this paper could be a significant algorithmic contribution, with the caveat of some clarifications on the theory and experiments. Given these clarifications in an author response, I would be willing to increase the score.

For the theory, there are a few steps that need clarification, as well as further clarification on novelty. For novelty, it is unclear if Theorem 2 and Theorem 3 are both being stated as novel results. It looks like Theorem 2 has already been shown in "Randomized Linear Programming Solves the Discounted Markov Decision Problem in Nearly-Linear Running Time". There is a statement that "Chen & Wang (2016); Wang (2017) apply stochastic first-order algorithms (Nemirovski et al., 2009) for the one-step Lagrangian of the LP problem in reinforcement learning setting. However, as we discussed in Section 3, their algorithm is restricted to tabular parametrization". Is your Theorem 2 somehow an extension? Is Theorem 3 completely new? This is particularly called into question due to the lack of assumptions about the function class for value functions. It seems that the value function is required to be able to represent the true value function, which can be almost as restrictive as requiring tabular parameterizations (which can represent the true value function). This assumption seems to be used right at the bottom of Page 17, where U^{pi} = V^*. Further, eta_v must be chosen to ensure that it does not affect (constrain) the optimal solution, which implies it might need to be very small. More about the conditions on eta_v would be illuminating. There is also one step in the theorem that I cannot verify: on Page 18, how is the square removed for the difference between U and U^{pi}? The transition from the second line of the proof to the third line is not clear.
It would also be good to state more clearly on page 14 how you get the first inequality, for || V^* ||_{2,mu}^2.

For the experiments, the following should be addressed:
1. It would have been better to also show the performance graphs with and without the improvements for multiple domains.
2. The central contribution is extending the single-step LP to a multi-step formulation. It would be beneficial to empirically demonstrate how increasing k (the multi-step parameter) affects the performance gains.
3. Increasing k also comes at a computational cost. I would like to see some discussion of this, and of how long Dual-AC takes to converge in comparison to the other algorithms tested (PPO and TRPO).
4. The authors concluded the presence of local convexity based on Hessian inspection, due to the use of path regularization. It was also mentioned that increasing the regularization parameter size increases the convergence rate. Empirically, how does changing the regularization parameter affect the performance in terms of reward maximization? In the experimental section of the appendix, it is mentioned that multiple regularization settings were tried, but their performance is not reported. Also, for the regularization parameters that were tried, did they all result in local convexity based on Hessian inspection? A bit more discussion of these choices would be helpful.

Minor comments:
1. Page 2: In equation 5, there should not be a 'ds' in the dual variable constraint.

Review for a Paper Leaning to Reject
This paper introduces a variation on temporal difference learning for the function approximation case that attempts to resolve the issue of over-generalization across temporally-successive states. The new approach is applied to both linear and non-linear function approximation, and to prediction and control problems. The algorithmic contribution is demonstrated with a suite of experiments in classic benchmark control domains (Mountain Car and Acrobot).
Executive Summary
The ICLR 2026 Reviewer Guide outlines the reviewer's role in the program creation process, emphasizing the importance of timely and substantive reviews. Reviewers are expected to adhere to the Code of Ethics and Code of Conduct, and top reviewers will be acknowledged on the ICLR 2026 website for their contributions. The guide covers the main reviewer tasks, step-by-step reviewing instructions, and the consequences of late or low-quality reviews.
Key Points
- ▸ Reviewers play a critical role in creating a high-quality program for ICLR 2026
- ▸ Timely and substantive reviews are essential for the peer review process
- ▸ Reviewers must adhere to the Code of Ethics and Code of Conduct
- ▸ Late or low-quality reviews will result in penalties
Merits
Clarification of Reviewer Role
The guide clearly outlines the reviewer's responsibilities and expectations, ensuring a common understanding among reviewers.
Emphasis on Timeliness and Quality
The guide emphasizes the importance of timely and substantive reviews, which is crucial for the peer review process.
Acknowledgement of Reviewer Contributions
Top reviewers will be acknowledged on the ICLR 2026 website, recognizing their essential work in creating a strong program.
Demerits
Limited Reviewer Support
The guide assumes a certain level of reviewing expertise that new reviewers may not have.
Unclear Consequences of Late Reviews
The guide does not provide clear information on the consequences of late reviews beyond the penalties mentioned.
Expert Commentary
The ICLR 2026 Reviewer Guide provides a comprehensive outline of the reviewer's role in the program creation process. While the guide is well-structured and clear, it assumes a certain level of expertise among reviewers. The emphasis on timeliness and quality is crucial for the peer review process, and the acknowledgement of reviewer contributions is a welcome recognition of their essential work. However, the guide could benefit from more detailed information on the consequences of late reviews and clearer support for new reviewers. Overall, the guide is a valuable resource for reviewers and provides a framework for creating a high-quality program for ICLR 2026.
Recommendations
- ✓ Provide more detailed information on the consequences of late reviews.
- ✓ Offer clearer support and training for new reviewers.