Greedy ‘Connect 4’ For Fun and Profit

Implementing a Simple Agent in Kaggle’s New Simulations Competition

Published in

Cotten.IO

12 min read · Jan 6, 2020

A review of the new ‘Simulations’ competition type in the popular Machine Learning community Kaggle, with a setup tutorial and sample code for a simplistic, greedy agent to play their first beta competition: “Connect X”

Kaggle, the popular community of researchers, data exploration, and automated Machine Learning contests, has just launched an entirely new class of competition called ‘Simulations.’

Where the original Kaggle contests focused on finding the best algorithms to fit sample problem datasets and generate low-error results for new data (scored by an error metric such as RMSE), the new contest requires submission of an actual program that will then be run against other users.

In this case: “Connect X” — a derivative of the popular Connect 4 game.

Getting Started

The submission requirement is a simple, singular Python file called “submission.py” — so that’s exactly what we’ll be creating. However, before we proceed, let’s review Kaggle’s own Getting Started guide.

tl;dr: no one who isn’t already a Kaggler is going to know how

Note: Make sure you’ve created a User Account and Joined the ‘Connect X’ competition before proceeding.

The following tutorial is based off Kaggle’s guide: https://www.kaggle.com/ajeffries/connectx-getting-started/notebook

Make a new Notebook

Before we do anything else, we need to use the Notebook feature and create a new “Notebook”, which will act as our online IDE for building our submission.

Click the Notebooks tab in the navigation menu, and then the blue ‘New Notebook’ button.

For your new notebook settings, leave the default language as Python, and select the type “Script” instead of “Notebook.”

Don’t worry about the advanced settings, as in this competition they’re not necessary.

Set Up the Kernel

Once the notebook creates itself you’ll be presented with a new “kernel” which contains a Python IDE and a command line interface (at the bottom) for your virtualized development environment.

Go ahead and update the random kernel name to something more memorable like “greedy” (5+ characters required).

Next, go to the Settings menu on the side panel to the right, set the Internet option to ON and accept the warning.

Go to the command line at the bottom of the screen and type !pip install 'kaggle-environments>=0.1.4' which will install the Kaggle environments for Python.

Installing the Python Kaggle Environments

Test the Python/ConnectX Environment

Replace the default Python code (with numpy and all that jazz) with the following lines:

from kaggle_environments import evaluate, make

env = make("connectx", debug=True)
env.render()

Then press “ » Run All” directly above the editor window.

If you see the 7x6 grid above as the output then you’re good to go! Let’s start coding our greedy little agent.

Preliminary Research

When we talk about “greed” we really mean “choose the most immediately rewarding path” amidst various choices. For example, it would be like taking a pawn with a knight in chess because you can, even though strategically moving a bishop and taking nothing might let you capture the queen a few moves from now.

In other words we’re making something very simple and easy to understand.

But before we get ahead of ourselves, let’s check the requirements for how an “agent” works. Thankfully, the example Kaggle provides has everything we need.

After reading their example we update ours to look like this:

from kaggle_environments import evaluate, make

def my_agent(observation, configuration):
    from random import choice
    return choice([c for c in range(configuration.columns) if observation.board[c] == 0])

env = make("connectx", debug=True)
env.run([my_agent, "random"])
env.render(mode="human", width=7, height=6)

This is very close to the Kaggle guide and all we’ve done is:

Add a new “my_agent” function to make a random choice
Set up the environment to have two players: our “my_agent” function and a “random” choice picker
Render the game result as a “human” readable 7x6 game board straight to the console

The Shape of the Board

What changes did we make? We followed the Kaggle guide EXCEPT for the env.render call, where I changed the mode from “ipython” with a 500x450 pixel display to “human.” We’re rendering to the console for now, so let’s stick with an easy-to-understand 7x6=42 unit grid.

When the board renders itself we can see the winning row (second from the bottom, left-hand side) filled with 2’s. So player 2 “random” won against our own random “my_agent.”

Obviously this disappoints us. We need a better strategy!

But wait, do we even understand our current strategy? How are we even picking a random position in the grid on our turn?

Let’s explore the underlying components of Connect X for a moment.

Kaggle Environment

A full explanation of the game environment can be found here: https://www.kaggle.com/c/connectx/overview/environment-rules

Let’s approach that information from a programming perspective. Looking back at our “my_agent” function we want to pay particular attention to the parameters being passed in, observation and configuration.

def my_agent(observation, configuration):
    from random import choice
    return choice([c for c in range(configuration.columns) if observation.board[c] == 0])

By looking at the file list under Data in the right-hand dock we can see the input files we’re using, which, as you remember from the initial setup, include the kaggle-environments package we installed.

The schemas.json payload has a very useful list of properties, including things like configuration and observation. These appear to be shared across all Kaggle Simulation projects, however, and further details about the implementations of those objects must be found elsewhere.

Underneath the envs folder we find connectx/connectx.json which holds the actual properties of those two schema definitions we care about for the Connect X project.

{
  "name": "connectx",
  "title": "ConnectX",
  "description": "Classic Connect in a row but configurable.",
  "version": "1.0.0",
  "agents": [
    2
  ],
  "configuration": {
    "columns": {
      "description": "The number of columns on the board",
      "type": "integer",
      "default": 7,
      "minimum": 1
    },
    "rows": {
      "description": "The number of rows on the board",
      "type": "integer",
      "default": 6,
      "minimum": 1
    },
    "inarow": {
      "description": "The number of checkers in a row required to win.",
      "type": "integer",
      "default": 4,
      "minimum": 1
    }
  },
  "reward": {
    "description": "0 = Lost, 0.5 = Draw, 1 = Won",
    "enum": [
      0,
      0.5,
      1
    ],
    "default": 0.5
  },
  "observation": {
    "board": {
      "description": "Serialized grid (rows x columns). 0 = Empty, 1 = P1, 2 = P2",
      "type": "array",
      "items": {
        "enum": [
          0,
          1,
          2
        ]
      },
      "default": []
    },
    "mark": {
      "default": 0,
      "description": "Which checkers are the agents.",
      "enum": [
        1,
        2
      ]
    }
  },
  "action": {
    "description": "Column to drop a checker onto the board.",
    "type": "integer",
    "minimum": 0,
    "default": 0
  },
  "reset": {
    "status": [
      "ACTIVE",
      "INACTIVE"
    ],
    "observation": [
      {
        "mark": 1
      },
      {
        "mark": 2
      }
    ]
  }
}

Well there it is! Everything we need to understand the challenge, and some very important things to explain.

Rules of the Game

Now that we can see how Connect X is defined we can explore the rules and limitations of the game, and what constraints we need to place on our greedy agent.

  "name": "connectx",
  "title": "ConnectX",
  "description": "Classic Connect in a row but configurable.",
  "version": "1.0.0",
  "agents": [
    2
  ],

The top-level variables are all straightforward — the only interesting constraint here is that the number of agents (or players) is set to 2.

At least this greatly simplifies our task since we don’t have to manage multiple enemy agents, just 1.

Object: configuration

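The configuration object holds the following properties:

columns
rows
inarow

The reward property sits beside configuration at the top level of the file rather than inside it, but it is covered here as well.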

The columns variable is arguably more important than the rows in the sense that the ability to actually play a game is determined by “dropping” a marker into a column.

If only one column exists, and the game is turn-based, then a “game” doesn’t exist and there can be no winner if inarow > 1. (If inarow is set to 1 then the first player always wins!)

A Very Boring 1-Column Game of Connect X

Note that the default column count is set to 7, with a minimum of at least 1, but — to reiterate — a column count of one simply means the players alternate placement until they fill up the board and tie. No strategy involved.

With rows, on the other hand, you can have a single row with many columns and still have a game. The default here is 6 with a minimum of 1.

The immediate, and important, takeaway from this is that in the actual competition the game board might be truly massive. Or tiny. Or one long row with no height. Or one super tall column.

The variability of the test space needs to be accounted for from Line 0.

No wonder they called it “Connect X” — the shape of the board could be all sorts of crazy rectangles.

The inarow variable allows the game to set how many marks a given player has to connect horizontally, vertically, or diagonally to win. The default is 4 and the minimum, for some reason, is 1 (which would give the win to the first player, every time).
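To make the win condition concrete, here is a minimal sketch of a checker for it. The function name has_won and the brute-force scan are my own illustration, not code from the Kaggle environment. It assumes the board is the flat, row-by-row list described in the schema.

```python
def has_won(board, mark, columns, rows, inarow):
    """True if `mark` has `inarow` consecutive cells in any direction."""
    # (row step, column step): right, down, down-right, down-left
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for row in range(rows):
        for col in range(columns):
            for d_row, d_col in directions:
                end_row = row + (inarow - 1) * d_row
                end_col = col + (inarow - 1) * d_col
                # Skip lines that would run off the board.
                if not (0 <= end_row < rows and 0 <= end_col < columns):
                    continue
                if all(
                    board[(row + k * d_row) * columns + (col + k * d_col)] == mark
                    for k in range(inarow)
                ):
                    return True
    return False


columns, rows, inarow = 7, 6, 4
board = [0] * (columns * rows)
for col in range(4):
    board[5 * columns + col] = 1  # four of player 1's checkers on the bottom row
print(has_won(board, 1, columns, rows, inarow))
print(has_won(board, 2, columns, rows, inarow))
```

Because columns, rows, and inarow are all parameters, the same check works for any board shape the configuration allows.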

Finally the reward offers a number between 0 and 1, where:

Loss: 0
Draw: 0.5
Win: 1

This will be used to judge the leaderboards for our submitted algorithms and can be also used for training any models we come up with in the future.

Object: observation

The observation object contains the current state of the game board and its definitions:

board: the serialized game board, containing values 0, 1, or 2
mark: player marker, either 1 or 2

The board contains a serialized array (in items) representing the game board. A 7x6 game board would be represented by 42 items lined up in a row, where the first row would be positions 0,1,2,3,4,5,6 and the second row would be 7,8,9,10,11,12,13 and so on.

A given position for Row y and Column x (zero-indexed) would be: (y * 7) + x on the default board, or (y * columns) + x in general. If you need the row and column from a given index i, it would be y = i // 7 (integer division) and x = i % 7.
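Here is a minimal sketch of that index arithmetic, assuming the row-by-row layout described above. Each row holds `columns` entries, so the multiplier is the column count (7 on the default board). The helper names to_index and to_row_col are my own, not part of kaggle-environments.

```python
def to_index(row, col, columns):
    """Flat board index for a zero-indexed (row, col) position."""
    return row * columns + col


def to_row_col(index, columns):
    """Inverse of to_index: recover (row, col) from a flat board index."""
    return divmod(index, columns)


columns = 7  # default ConnectX width
print(to_index(1, 0, columns))   # first cell of the second row
print(to_row_col(13, columns))   # last cell of the second row
```

Passing columns in as a parameter, rather than hard-coding 7, keeps the helpers correct on any board shape.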

A 7x6 Game Board in Connect X with Notated Serialized Positions

Keep in mind the stored array object is not a 2-dimensional array. As previously described it is a single row containing all the elements. It’s up to you, the agent writer, to do the math to make your agent treat it as a 2D board.

When a game board is initialized all the marker values are set to 0. When a player performs an action on a column, the environment finds the lowest spot in that column that is still a 0 and sets that position to the player’s ID, either 1 or 2.

Thus, at any given time the board is full of 0’s where you can move, or 1’s and 2’s where the players have already marked spots.
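The drop behaviour can be sketched in a few lines. This is my own illustration of the rule described above, not the environment's actual code, and drop_checker is a hypothetical helper name. It assumes row 0 is the top of the board, which matches the example agent checking board[c] to see whether a column is full.

```python
def drop_checker(board, column, mark, columns, rows):
    """Return a copy of the flat board with `mark` dropped into `column`.

    Row 0 is the top of the board, so the "lowest" empty spot is the
    empty cell with the largest row number in that column.
    """
    for row in range(rows - 1, -1, -1):  # scan from the bottom row upward
        index = row * columns + column
        if board[index] == 0:
            new_board = list(board)
            new_board[index] = mark
            return new_board
    raise ValueError(f"column {column} is full")


columns, rows = 7, 6
board = [0] * (columns * rows)  # empty board: all zeros
board = drop_checker(board, 3, 1, columns, rows)
board = drop_checker(board, 3, 2, columns, rows)
print(board[5 * columns + 3], board[4 * columns + 3])
```

Returning a copy instead of modifying the list in place makes it easy to try out a move without disturbing the real board.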

action: the column (an integer, 0 or greater) to drop a checker into; this is the value our agent returns

Understanding the Example Code

Now that we’re armed with the definitions of configuration and observation we can dissect the Python agent example and see what it’s doing.

def my_agent(observation, configuration):
    from random import choice
    return choice([c for c in range(configuration.columns) if observation.board[c] == 0])

First, we know that configuration.columns is a number greater than or equal to 1.

The choice(seq) function will return a random element from whatever sequence is passed to it.

The range(n) function will create a zero-indexed sequence of size n. For instance, range(7) will yield [0, 1, 2, 3, 4, 5, 6].

c for c in ... if observation.board[c] == 0 does the wonderful thing of iterating over the sequence provided by the range function, and building a new sequence where the output c is only assigned if that particular position in the board is currently unmarked (0).

Note that the expression is surrounded by brackets [...], which makes it a list comprehension: it builds an actual list rather than a generator object, and choice() needs a sequence it can index.

To make that clear: this algorithm does not check every position on the board. It just checks the very top row and excludes any positions that are full.

Why is this important? We don’t want a random choice picker to keep suggesting invalid positions that have already been marked. This guarantees one valid return each time it’s called until the board is filled or the game is won.

All of this is a very shorthand way of saying: randomly drop a marker in a column if that column isn’t already full.
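For readers who find the one-liner dense, here is the same logic unrolled into an explicit loop. The SimpleNamespace objects are hypothetical stand-ins for what Kaggle passes to an agent, used only so the sketch runs without the kaggle-environments package.

```python
from random import choice
from types import SimpleNamespace


def random_agent(observation, configuration):
    """Same behaviour as the one-liner, written as an explicit loop."""
    open_columns = []
    for c in range(configuration.columns):
        # board[c] is the top cell of column c; 0 means there is room.
        if observation.board[c] == 0:
            open_columns.append(c)
    return choice(open_columns)


# Hypothetical stand-ins for the objects Kaggle passes to an agent.
configuration = SimpleNamespace(columns=7, rows=6, inarow=4)
board = [0] * 42
board[0] = 1  # pretend column 0 is full: its top cell is taken
board[6] = 2  # and column 6 as well
observation = SimpleNamespace(board=board, mark=1)

picks = {random_agent(observation, configuration) for _ in range(200)}
print(sorted(picks))  # never contains 0 or 6
```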

Designing a Dumb, Greedy Solution

For the purposes of this tutorial we’re only trying to achieve something better than random choice. Almost any algorithm would do, so let’s just concentrate on some basic rules. Think of this as the pre-programmed behavior of an ant.

We’re not even going to try to stop the other player (But won’t we lose a lot? Sure!).

This allows us to make no consideration for the inarow variable. We don’t care what it takes to win. We’re just going to stick to one dumb plan until it works or we fail. It’s up to you to improve it!

Rules

Find the centermost column
IF column is not full, drop marker
IF full, find next nearest centermost column, repeat

That’s it. Easy peasy.

Finding the Centermost Available Column

The Python for this will look similar to that employed in the random choice example, since we only care about the top row.

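def my_agent(observation, configuration):
    # Midpoint of the column count (3.5 on the default 7-column board)
    middle = configuration.columns / 2.0
    # Start with a distance larger than any column can be from the middle
    dist = configuration.columns
    
    # Columns whose top cell is still empty, in left-to-right order
    available = [c for c in range(configuration.columns) if observation.board[c] == 0]
    centermost = available[0]
    
    for c in available:
        v = abs(c - middle)
        if v <= dist:
            # Closer (or tied): remember this column
            dist = v
            centermost = c
        elif v > dist:
            # Distances only grow from here on, so stop looking
            break
            
    return centermost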

That may seem more complex but it’s not, it’s just a longer form because we’re doing this:

Find the exact middle of the columns based on how many there are
Set an initial dist equal to the number of columns; we’re going to compare each column’s distance from the middle so this number will shrink
Get the available columns — some might be full!
For each available column, find out how far away from the middle it is. If it’s closer, set the dist to be the new closer value and mark that column as centermost. Keep going until the dist no longer shrinks (about halfway) and then stop so we don’t waste calculation time.

Now, of course, we could streamline this much more and even work outwards from the middle of the sequence and be more efficient, but this is our quick and easy agent: did it work?

We got lucky! Player 2 did not randomly drop a mark in Column 4 and we lined up an uninterrupted vertical row.
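The streamlining mentioned above, working outward from the middle, might look something like the sketch below. It is an alternative, not the agent we submit. It assumes the middle is (columns - 1) / 2, so on a 7-column board it starts at column index 3, and ties go to the left-hand column, which differs slightly from the agent above. The names center_out_order and center_out_agent are my own.

```python
from types import SimpleNamespace


def center_out_order(columns):
    """Column indexes sorted from the middle of the board outward."""
    middle = (columns - 1) / 2.0
    # sorted() is stable, so equally distant columns stay left-to-right.
    return sorted(range(columns), key=lambda c: abs(c - middle))


def center_out_agent(observation, configuration):
    """Play the first open column in center-out order."""
    for c in center_out_order(configuration.columns):
        if observation.board[c] == 0:  # top cell empty, so there is room
            return c
    return None  # no open column; the game would already be over


# Hypothetical stand-ins for the objects Kaggle passes to an agent.
configuration = SimpleNamespace(columns=7, rows=6, inarow=4)
observation = SimpleNamespace(board=[0] * 42, mark=1)
print(center_out_order(7))
print(center_out_agent(observation, configuration))
```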

Let’s Build and Deploy

Now we need to resume Kaggle’s guide so we can build and deploy this thing.

Following the guide to building our submission we copy/paste the following into our code and remove the env tests:

from kaggle_environments import evaluate, make
import math
import os
import inspect

def my_agent(observation, configuration):
    middle = configuration.columns / 2.0
    dist = configuration.columns
    
    available = [c for c in range(configuration.columns) if observation.board[c] == 0]
    centermost = available[0]
    
    for c in available:
        v = abs(c - middle)
        if v <= dist:
            dist = v
            centermost = c
        elif v > dist:
            break
            
    return centermost

def write_agent_to_file(function, file):
    with open(file, "a" if os.path.exists(file) else "w") as f:
        f.write(inspect.getsource(function))
        print(function, "written to", file)

write_agent_to_file(my_agent, "submission.py")

When we run this, it will build a new output file for us called submission.py.
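Before uploading, it can be worth confirming that the file loads on its own and returns a legal column. The sketch below shows one way to do that, under some assumptions. load_agent is a hypothetical helper, and the observation and configuration objects are simple stand-ins for what Kaggle passes in. It writes a tiny stand-in agent to a temporary directory so the example is self-contained; in practice you would point load_agent at your real submission.py.

```python
import os
import tempfile
from types import SimpleNamespace

# Stand-in for the real file: a tiny agent that always plays column 0.
STAND_IN_SOURCE = "def my_agent(observation, configuration):\n    return 0\n"


def load_agent(path, name="my_agent"):
    """Execute a submission file and return the agent function it defines."""
    namespace = {}
    with open(path) as f:
        exec(f.read(), namespace)
    return namespace[name]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "submission.py")
    with open(path, "w") as f:
        f.write(STAND_IN_SOURCE)
    agent = load_agent(path)

# Hypothetical stand-ins for the objects Kaggle passes to an agent.
observation = SimpleNamespace(board=[0] * 42, mark=1)
configuration = SimpleNamespace(columns=7, rows=6, inarow=4)
move = agent(observation, configuration)
print(move, 0 <= move < configuration.columns)
```

Running the file in a fresh namespace catches the most common mistake: an agent that depends on an import or variable that never made it into submission.py.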

You’ll find it on the right-hand dock under Data / output / /kaggle/working/ submission.py and once you download it you’re ready to Submit Agent on the Connect X contest page.