Lack of input validation for Rock, Paper, Scissors simulation

Hi there

I decided today that I want to start doing Kaggle competitions for real. While browsing the site, I found out about simulation competitions, which seem very exciting even though I only have basic knowledge of reinforcement learning.

Luckily, this specific competition is over; otherwise this could have been a real problem. If this package was used to evaluate user submissions, the game may have had a security vulnerability that let a user cheat and win every game played against another competitor.

While looking through the rock paper scissors game, I tried to see if there was a way to access variables from outside the agent function that the user supplies. I couldn't find one, though it might still be possible with things like exec. However, I did find a way to return a value that passes all the validation checks and lets the agent win.

The value returned by the user's agent function is validated here:

kaggle-environments/kaggle_environments/envs/rps/rps.py

Lines 15 to 20 in a5d9f75

    def is_valid_action(player, sign_count):
        return (
            player.action is not None and
            isinstance(player.action, int) and
            0 <= player.action < sign_count
        )

An instance of any subclass of int passes this check, because isinstance accepts subclasses.
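
To make that concrete, here is a minimal sketch that runs the quoted check against a few values. `Tagged` and `check` are hypothetical names made up for this illustration, and `SimpleNamespace` only stands in for the environment's player object, which I am assuming exposes an `action` attribute as in the snippet above.

```python
from types import SimpleNamespace


def is_valid_action(player, sign_count):
    # Copy of the check quoted above.
    return (
        player.action is not None and
        isinstance(player.action, int) and
        0 <= player.action < sign_count
    )


class Tagged(int):
    # Hypothetical int subclass with no extra behaviour, for illustration only.
    pass


def check(action, sign_count=3):
    # SimpleNamespace is a stand-in for the real player object (assumption).
    return is_valid_action(SimpleNamespace(action=action), sign_count)


print(check(1))          # plain int in range
print(check(Tagged(1)))  # subclass instance is accepted too
print(check(True))       # bool is also a subclass of int
print(check(1.0))        # float is rejected
print(check(3))          # out of range is rejected
```

The first three calls are accepted and the last two are rejected, so the check constrains the value and its range but not the exact type.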

The score is then evaluated here:

kaggle-environments/kaggle_environments/envs/rps/utils.py

Lines 4 to 11 in a5d9f75

    def get_score(left_move, right_move):
        # This method exists in this file so it can be consumed from rps.py and agents.py without a circular dependency
        delta = (
            right_move - left_move
            if (left_move + right_move) % 2 == 0
            else left_move - right_move
        )
        return 0 if delta == 0 else math.copysign(1, delta)

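If the return value is 1, the left agent wins; if it is -1, the right agent wins.

The subtraction is performed directly on the values the agents returned, so an object that overrides subtraction controls delta. Using this, we can make an agent that returns an instance of an int subclass that always makes itself win.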
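
As a quick sanity check of that sign convention, the sketch below copies get_score and prints the result for every pair of honest moves. The mapping 0 = rock, 1 = paper, 2 = scissors is my assumption; it is not stated in the quoted code, but it is consistent with the outcomes the function produces.

```python
import math


def get_score(left_move, right_move):
    # Copy of the function quoted above.
    delta = (
        right_move - left_move
        if (left_move + right_move) % 2 == 0
        else left_move - right_move
    )
    return 0 if delta == 0 else math.copysign(1, delta)


SIGNS = ["rock", "paper", "scissors"]  # assumed mapping for 0, 1, 2
OUTCOMES = {1: "left wins", -1: "right wins", 0: "tie"}

for left in range(3):
    for right in range(3):
        score = get_score(left, right)
        print(f"{SIGNS[left]:8} vs {SIGNS[right]:8} -> {OUTCOMES[score]}")
```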

from kaggle_environments import evaluate, make, utils

def random_agent(observation, configuration):
    import random
    return random.choice(range(3))

def malicious_agent(observation, configuration):
    class Fake(int):
        def __sub__(self, other):
            if (self + other) % 2 == 0:
                # it is calculating right_move - left_move and we are the right_move
                # we want the value to be as small as possible so we win
                return -1
            else:
                # it is calculating left_move - right_move and we are the left_move
                # we want the value to be as large as possible so we win
                return 1

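        def __rsub__(self, other):
            # called for `other - self`, i.e. when we are the second operand
            if (self + other) % 2 == 0:
                # it is calculating right_move - left_move and we are the left_move
                return 1
            else:
                # it is calculating left_move - right_move and we are the right_move
                return -1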

    return Fake(1)


env = make("rps", debug=True)
env.reset()
env.run([random_agent, malicious_agent])  # also works if the arguments are [malicious_agent, random_agent]
env.render(mode="ipython")
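
For anyone who wants to check the trick without installing the package, here is a minimal standalone sketch. It uses a copy of get_score and a condensed version of the Fake class from above, and it does not go through the real environment, so it only shows that the scoring function itself can be steered.

```python
import math


def get_score(left_move, right_move):
    # Copy of the function quoted earlier in this issue.
    delta = (
        right_move - left_move
        if (left_move + right_move) % 2 == 0
        else left_move - right_move
    )
    return 0 if delta == 0 else math.copysign(1, delta)


class Fake(int):
    # Condensed version of the class returned by malicious_agent.
    def __sub__(self, other):
        return -1 if (self + other) % 2 == 0 else 1

    def __rsub__(self, other):
        return 1 if (self + other) % 2 == 0 else -1


fake = Fake(1)
for honest in range(3):
    as_left = get_score(fake, honest)   # Fake plays as the left agent
    as_right = get_score(honest, fake)  # Fake plays as the right agent
    print(honest, as_left, as_right)
```

For every honest move, including 1 (which should be a tie), the score comes out as 1 when Fake is on the left and -1 when it is on the right. Python calls Fake.__rsub__ before int.__sub__ here because the right operand is an instance of a subclass of the left operand's type and overrides the reflected method.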

I think this can be fixed by replacing isinstance(player.action, int) with type(player.action) is int in the following:

kaggle-environments/kaggle_environments/envs/rps/rps.py

Lines 16 to 20 in a5d9f75

    return (
        player.action is not None and
        isinstance(player.action, int) and
        0 <= player.action < sign_count
    )

This forces the agent function to return an exact int rather than an instance of a class the user defined themselves.
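
A minimal sketch of what the stricter check could look like, under the same assumptions as before. `is_valid_action_strict` is a hypothetical name, not something in the package, and `SimpleNamespace` again stands in for the player object. The explicit None test is dropped because type(None) is never int.

```python
from types import SimpleNamespace


def is_valid_action_strict(player, sign_count):
    # Hypothetical replacement: require the exact built-in int type,
    # which rejects subclasses (including bool) and None.
    return type(player.action) is int and 0 <= player.action < sign_count


class Fake(int):
    # Stand-in for the malicious subclass; only its type matters here.
    def __sub__(self, other):
        return -1


def check(action, sign_count=3):
    return is_valid_action_strict(SimpleNamespace(action=action), sign_count)


print(check(1))        # exact int in range: accepted
print(check(Fake(1)))  # subclass instance: rejected
print(check(True))     # bool: rejected
print(check(None))     # missing action: rejected
print(check(3))        # out of range: rejected
```

One side effect to be aware of is that bool values, which the current check accepts, would also be rejected, which seems like the desired behaviour for a move index.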
