We apply deep reinforcement learning to active closed-loop control of a two-dimensional flow over a cylinder oscillating around its axis, with the time-dependent angular velocity as the only control parameter. By experimenting with the angular velocity, the neural network devises a control strategy based on low-frequency harmonic oscillations with some additional modulations that stabilizes the Kármán vortex street at a low Reynolds number, Re = 100. We examine convergence for two reward functions and show that a later training epoch does not always guarantee a better result. The controller reduces drag by 14% or 16%, depending on the reward function employed. The additional control effort is very low: the maximum amplitude of the angular velocity equals 8% of the incoming flow in the first case, while the second reward function yields a rotation amplitude of only 0.8%, which is comparable with state-of-the-art adjoint optimization results. A detailed comparison with a flow controlled by harmonic oscillations of fixed amplitude and frequency is presented, highlighting the benefits of a feedback loop.