This paper addresses the problem of on-line gait learning in modular robots whose shape is not known in advance. To our knowledge, the best algorithm for this problem is a reinforcement learning method called RL PoWER. In this study we revisit the original RL PoWER algorithm and observe that it is, in essence, a specific evolutionary algorithm. Based on this insight, we propose modifications of its two main search operators and compare the quality of the evolved gaits when either or both of the modified operators are employed. The results show that 2-parent crossover and mutation with self-adaptive step sizes can each significantly improve the performance of the original algorithm.
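
To make the two operator types named above concrete, the following is a minimal Python sketch of 2-parent crossover followed by mutation with a self-adaptive step size. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the representation of an individual as a pair (list of real-valued gait parameters, single step size), the choice of uniform crossover, the averaging of parental step sizes, the learning rate `tau = 1/sqrt(n)`, and the lower bound `sigma_min` are all hypothetical choices taken from common evolution-strategy practice.

```python
import math
import random


def crossover(genes_a, genes_b, rng):
    """Uniform 2-parent crossover (assumed variant): each gene is copied
    from one of the two parents with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(genes_a, genes_b)]


def self_adaptive_mutation(genes, sigma, rng, sigma_min=1e-3):
    """Mutate the step size first, then use the new step size to perturb
    the genes. tau and sigma_min are illustrative assumptions."""
    tau = 1.0 / math.sqrt(len(genes))
    new_sigma = max(sigma_min, sigma * math.exp(tau * rng.gauss(0.0, 1.0)))
    new_genes = [g + new_sigma * rng.gauss(0.0, 1.0) for g in genes]
    return new_genes, new_sigma


def make_child(parent_a, parent_b, rng):
    """Create one offspring from two (genes, sigma) parents."""
    genes = crossover(parent_a[0], parent_b[0], rng)
    # Intermediate recombination of the step sizes (assumed choice).
    sigma = 0.5 * (parent_a[1] + parent_b[1])
    return self_adaptive_mutation(genes, sigma, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    parent_a = ([0.0, 0.0, 0.0, 0.0], 0.1)
    parent_b = ([1.0, 1.0, 1.0, 1.0], 0.2)
    child_genes, child_sigma = make_child(parent_a, parent_b, rng)
    print(child_genes, child_sigma)
```

Because the step size is stored in the individual and mutated before it is applied, step sizes that produce good offspring are inherited along with them, so the search can adjust its own mutation strength without a hand-tuned schedule.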