In many resource-constrained project scheduling problems (RCPSP), the set of candidate projects is not fixed a priori but evolves over time. For example, while an initial set of projects is being executed under a given decision policy, a promising new project can emerge. To make appropriate resource allocation decisions in such settings, project cancellation and resource idling decisions should complement the conventional scheduling decisions. In this study, the stochastic RCPSP (sRCPSP) with dynamic project arrivals is addressed with the added flexibility of project cancellation and resource idling. To solve the problem, a Q-learning-based approach is adopted: the problem is formulated as a Markov decision process with appropriate definitions of state variables, including an information state, and action variables. The Q-learning approach enables empirical state transition rules to be derived from simulation data, so that analytical calculation of potentially intractable state transition rules can be circumvented. To maximize the advantage of the empirically learned state transition rules, special types of actions, namely project cancellation and resource idling, which are difficult to incorporate into heuristics, were added randomly in the simulation. These random actions are filtered during the Q-value iteration and properly exploited in online decision making to maximize the total expected reward. Copyright (C) 2007 John Wiley & Sons, Ltd.