Greedy ‘Connect 4’ For Fun and Profit
Implementing a Simple Agent in Kaggle’s New Simulations Competition
A review of the new ‘Simulations’ competition type in the popular Machine Learning community Kaggle, with a setup tutorial and sample code for a simplistic, greedy agent to play their first beta competition: “Connect X”
Kaggle, the popular community for researchers, data exploration, and automated Machine Learning contests, has just launched an entirely new class of competition called ‘Simulations.’
Where the original Kaggle contests focused on finding the best algorithms to fit sample problem datasets and generate low-error results (scored with a metric such as RMSE) for new data, the new contest requires submission of an actual program that will then be run against other users’ programs.
In this case: “Connect X” — a derivative of the popular Connect 4 game.
Getting Started
The submission requirement is a single, simple Python file called “submission.py”, so that’s exactly what we’ll be creating. However, before we proceed, let’s review Kaggle’s own Getting Started guide.
Note: Make sure you’ve created a User Account and Joined the ‘Connect X’ competition before proceeding.
The following tutorial is based off Kaggle’s guide: https://www.kaggle.com/ajeffries/connectx-getting-started/notebook
Make a new Notebook
Before we do anything else, we need to use the Notebook feature and create a new “Notebook”, which will act as our online IDE for building our submission.
Click the Notebooks tab in the navigation menu, and then the blue ‘New Notebook’ button.
For your new notebook settings, leave the default language as Python, and select the type “Script” instead of “Notebook.”
Don’t worry about the advanced settings, as in this competition they’re not necessary.
Set Up the Kernel
Once the notebook is created you’ll be presented with a new “kernel” which contains a Python IDE and a command line interface (at the bottom) for your virtualized development environment.
Go ahead and update the random kernel name to something more memorable like “greedy.” (5+ characters required).
Next, go to the Settings menu on the side panel to the right, set the Internet option to ON and accept the warning.
Go to the command line at the bottom of the screen and type !pip install 'kaggle-environments>=0.1.4', which will install the Kaggle environments for Python.
Test the Python/ConnectX Environment
Replace the default Python code (with numpy and all that jazz) with the following lines:
from kaggle_environments import evaluate, make
env = make("connectx", debug=True)
env.render()
Then press “ » Run All” directly above the editor window.
If you see the 7x6 grid above as the output then you’re good to go! Let’s start coding our greedy little agent.
Preliminary Research
When we talk about “greed” we really mean “choose the most immediately rewarding path” among the available choices. It would be like taking a pawn with a knight in chess simply because you can, even though quietly moving a bishop and taking nothing might let you capture the queen a few moves later.
In short, we’re making something very simple and easy to understand.
But before we get ahead of ourselves, let’s check the requirements for how an “agent” works. Thankfully, the example Kaggle provides has everything we need.
After reading their example we update ours to look like this:
from kaggle_environments import evaluate, make

def my_agent(observation, configuration):
    from random import choice
    return choice([c for c in range(configuration.columns) if observation.board[c] == 0])

env = make("connectx", debug=True)
env.run([my_agent, "random"])
env.render(mode="human", width=7, height=6)
This is very close to the Kaggle guide and all we’ve done is:
- Add a new “my_agent” function to make a random choice
- Set up the environment to have two players: our “my_agent” function and a “random” choice picker
- Render the game result as a “human” readable 7x6 game board straight to the console
The Shape of the Board
What changes did we make? We followed the Kaggle guide except for the env.render call, where I changed the mode from “ipython” with a 500x450 pixel display to “human”. We’re rendering to the console for now, so let’s stick with an easy to understand 7x6=42 unit grid.
When the board renders itself we can see the winning row (second from the bottom, left-hand side) filled with 2’s. So player 2 “random” won against our own random “my_agent.”
Obviously this disappoints us. We need a better strategy!
But wait, do we even understand our current strategy? How are we even picking a random position in the grid on our turn?
Let’s explore the underlying components of Connect X for a moment.
Kaggle Environment
A full explanation of the game environment can be found here: https://www.kaggle.com/c/connectx/overview/environment-rules
Let’s approach that information from a programming perspective. Looking back at our “my_agent” function we want to pay particular attention to the parameters being passed in, observation and configuration.
def my_agent(observation, configuration):
    from random import choice
    return choice([c for c in range(configuration.columns) if observation.board[c] == 0])
By looking at the file list under Data in the right-hand dock we can see the input files we’re using, which, as you remember from the initial setup, include kaggle-environments.
The schemas.json payload has a very useful list of properties, including configuration and observation. These appear to be shared across all Kaggle Simulation projects, however, so further details about how those objects are implemented must be found elsewhere.
Underneath the envs folder we find connectx/connectx.json which holds the actual properties of those two schema definitions we care about for the Connect X project.
{
  "name": "connectx",
  "title": "ConnectX",
  "description": "Classic Connect in a row but configurable.",
  "version": "1.0.0",
  "agents": [2],
  "configuration": {
    "columns": {
      "description": "The number of columns on the board",
      "type": "integer",
      "default": 7,
      "minimum": 1
    },
    "rows": {
      "description": "The number of rows on the board",
      "type": "integer",
      "default": 6,
      "minimum": 1
    },
    "inarow": {
      "description": "The number of checkers in a row required to win.",
      "type": "integer",
      "default": 4,
      "minimum": 1
    }
  },
  "reward": {
    "description": "0 = Lost, 0.5 = Draw, 1 = Won",
    "enum": [0, 0.5, 1],
    "default": 0.5
  },
  "observation": {
    "board": {
      "description": "Serialized grid (rows x columns). 0 = Empty, 1 = P1, 2 = P2",
      "type": "array",
      "items": {
        "enum": [0, 1, 2]
      },
      "default": []
    },
    "mark": {
      "default": 0,
      "description": "Which checkers are the agents.",
      "enum": [1, 2]
    }
  },
  "action": {
    "description": "Column to drop a checker onto the board.",
    "type": "integer",
    "minimum": 0,
    "default": 0
  },
  "reset": {
    "status": ["ACTIVE", "INACTIVE"],
    "observation": [
      {"mark": 1},
      {"mark": 2}
    ]
  }
}
Well there it is! Everything we need to understand the challenge, and some very important things to explain.
Rules of the Game
Now that we can see how Connect X is defined we can explore the rules and limitations of the game, and what constraints we need to place on our greedy agent.
"name": "connectx",
"title": "ConnectX",
"description": "Classic Connect in a row but configurable.",
"version": "1.0.0",
"agents": [
2
],
The top-level variables are all straightforward — the only interesting constraint here is that the number of agents (or players) is set to 2.
At least this greatly simplifies our task since we don’t have to manage multiple enemy agents, just 1.
Object: configuration
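The configuration object holds the following properties:
columns
rows
inarow
The reward property is defined next to configuration at the top level of the JSON rather than inside it, but we’ll cover it here too.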
The columns variable is arguably more important than the rows in the sense that a move in this game consists of “dropping” a marker into a column.
If only one column exists, and the game is turn-based, then a “game” doesn’t really exist and there can be no winner if inarow > 1. (If inarow is set to 1 then the first player always wins!)
Note that the default column count is set to 7, with a minimum of at least 1, but — to reiterate — a column count of one simply means the players alternate placement until they fill up the board and tie. No strategy involved.
With rows, on the other hand, you can have a single row with many columns and still have a game. The default here is 6 with a minimum of 1.
The immediate, and important, takeaway from this is that in the actual competition the game board might be truly massive. Or tiny. Or one long row with no height. Or one super tall column.
The variability of the test space needs to be accounted for from Line 0.
No wonder they called it “Connect X” — the shape of the board could be all sorts of crazy rectangles.
The inarow variable allows the game to set how many marks a given player has to connect horizontally, vertically, or diagonally to win. The default is 4 and the minimum, for some reason, is 1 (which would give the win to the first player, every time).
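To make the “horizontally, vertically, or diagonally” part concrete, here is a minimal sketch of a win check. It is not Kaggle’s own implementation; has_won is a hypothetical helper name, and it assumes the board is stored row by row, so that a cell’s index is row * columns + column.

```python
def has_won(board, mark, columns, rows, inarow):
    """Return True if `mark` has `inarow` connected cells on a serialized board.

    Assumes the board is a flat list stored row by row (hypothetical helper,
    not part of kaggle-environments).
    """
    # Directions as (row step, column step): right, down, down-right, down-left.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for row in range(rows):
        for col in range(columns):
            for d_row, d_col in directions:
                end_row = row + (inarow - 1) * d_row
                end_col = col + (inarow - 1) * d_col
                # Skip lines that would run off the board.
                if not (0 <= end_row < rows and 0 <= end_col < columns):
                    continue
                if all(
                    board[(row + k * d_row) * columns + (col + k * d_col)] == mark
                    for k in range(inarow)
                ):
                    return True
    return False


# A 7x6 board where player 1 holds the first four cells of the bottom row.
board = [0] * 42
for col in range(4):
    board[5 * 7 + col] = 1

print(has_won(board, 1, columns=7, rows=6, inarow=4))  # True
print(has_won(board, 2, columns=7, rows=6, inarow=4))  # False
```

Because columns, rows, and inarow are all parameters, the same check works on whatever odd-shaped board the competition throws at us.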
Finally the reward offers a number between 0 and 1, where:
- Loss: 0
- Draw: 0.5
- Win: 1
This will be used to judge the leaderboards for our submitted algorithms and can also be used for training any models we come up with in the future.
Object: observation
The observation object contains the current state of the game board and its definitions:
- board: the serialized game board, containing values 0, 1, or 2
- mark: player marker, either 1 or 2
The board contains a serialized array (in items) representing the game board. A 7x6 game board would be represented by 42 items lined up in a row, where the first row would be positions 0,1,2,3,4,5,6 and the second row would be 7,8,9,10,11,12,13 and so on.
A given position for Row y and Column x (zero-indexed) would be: (y * 7) + x, or more generally (y * columns) + x. If you needed to know the row and column from a given index i, it would be y = i // 7; x = i % 7.
Keep in mind the stored array object is not a 2-dimensional array. As previously described it is a single row containing all the elements. It’s up to you, the agent writer, to do the math to make your agent treat it as a 2D board.
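Here is a small sketch of that math, assuming the row-by-row layout described above. The helper names to_index, to_row_col, and to_grid are made up for this tutorial and are not part of kaggle-environments.

```python
def to_index(row, col, columns):
    """Flat index of the cell at (row, col), both zero-indexed."""
    return row * columns + col


def to_row_col(index, columns):
    """Inverse of to_index: returns (row, col) for a flat index."""
    return divmod(index, columns)


def to_grid(board, columns):
    """Slice the flat board into a list of rows for easier reading."""
    return [board[i:i + columns] for i in range(0, len(board), columns)]


print(to_index(1, 0, columns=7))             # 7
print(to_row_col(13, columns=7))             # (1, 6)
print(to_grid(list(range(42)), columns=7)[1])  # [7, 8, 9, 10, 11, 12, 13]
```

Passing columns in as a parameter, rather than hard-coding 7, keeps these helpers usable on any board shape.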
When a game board is initialized all the marker values are set to 0. When a player performs an action on a column, the game seeks the lowest possible spot that is still a 0 and updates that position to that player’s ID, either 1 or 2.
Thus, at any given time the board is full of 0’s where you can move, or 1’s and 2’s where the players have already marked spots.
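If you want to experiment with moves locally, the dropping behavior can be sketched like this. It is only an illustration, not the environment’s actual code: drop is a hypothetical helper, and it assumes row 0 is the top of the board (which matches the example agent checking the first few positions to see whether a column is full).

```python
def drop(board, column, mark, columns, rows):
    """Return a copy of `board` after `mark` is dropped into `column`.

    Assumes a flat, row-by-row board with row 0 at the top
    (hypothetical helper for local experiments).
    """
    # Search from the bottom row upward for the first empty cell.
    for row in range(rows - 1, -1, -1):
        index = row * columns + column
        if board[index] == 0:
            new_board = list(board)
            new_board[index] = mark
            return new_board
    raise ValueError(f"column {column} is full")


board = [0] * 42
board = drop(board, 3, mark=1, columns=7, rows=6)
board = drop(board, 3, mark=2, columns=7, rows=6)
print(board[5 * 7 + 3], board[4 * 7 + 3])  # 1 2
```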
Understanding the Example Code
Now that we’re armed with the definitions of configuration and observation we can dissect the Python agent example and see what it’s doing.
def my_agent(observation, configuration):
    from random import choice
    return choice([c for c in range(configuration.columns) if observation.board[c] == 0])
First, we know that configuration.columns is a number greater than or equal to 1.
The choice(seq) function will return a random element of whatever sequence is passed to it.
The range(n) function will create a zero-indexed sequence of size n. For instance, range(7) will yield 0, 1, 2, 3, 4, 5, 6.
The expression c for c in ... if observation.board[c] == 0 does the wonderful thing of iterating over the sequence provided by the range function and building a new sequence that includes c only if that particular position in the board is currently unmarked (0).
Note that the expression is surrounded by brackets [...], which makes it a list comprehension: it builds a list instead of a generator object, and choice() needs a sequence it can index.
To make that clear: this algorithm does not check every position on the board. It just checks the very top row and excludes any positions that are full.
Why is this important? We don’t want a random choice picker to keep suggesting invalid positions that have already been marked. This guarantees one valid return each time it’s called until the board is filled or the game is won.
All of this is a very shorthand way of saying: randomly drop a marker in a column if that column isn’t already full.
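To see the filter in action without running a whole game, we can fake the two objects. In this sketch the SimpleNamespace instances are stand-ins for the real observation and configuration that Kaggle passes in, and available_columns is a hypothetical name for the list comprehension pulled out on its own.

```python
from types import SimpleNamespace


def available_columns(observation, configuration):
    """Columns whose top cell (the first `columns` entries) is still empty."""
    return [c for c in range(configuration.columns) if observation.board[c] == 0]


# Stand-in objects for local testing only.
configuration = SimpleNamespace(columns=7, rows=6, inarow=4)

board = [0] * 42
# Fill column 2 from top to bottom with alternating marks.
for row in range(6):
    board[row * 7 + 2] = 1 + row % 2
observation = SimpleNamespace(board=board, mark=1)

print(available_columns(observation, configuration))  # [0, 1, 3, 4, 5, 6]
```

Column 2 drops out of the list because its top cell is occupied, so choice() can never pick it.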
Designing a Dumb, Greedy Solution
For the purposes of this tutorial we’re only trying to achieve something better than random choice. Almost any algorithm would do, so let’s just concentrate on some basic rules. Think of this as the pre-programmed behavior of an ant.
We’re not even going to try to stop the other player (But won’t we lose a lot? Sure!).
This allows us to make no consideration for the inarow variable. We don’t care what it takes to win. We’re just going to stick to one dumb plan until it works or we fail. It’s up to you to improve it!
Rules
- Find the centermost column
- IF column is not full, drop marker
- IF full, find next nearest centermost column, repeat
That’s it. Easy peasy.
Finding the Centermost Available Column
The Python for this will look similar to that employed in the random choice example, since we only care about the top row.
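def my_agent(observation, configuration):
    # Columns are numbered 0..columns-1, so their midpoint is (columns - 1) / 2
    middle = (configuration.columns - 1) / 2.0
    # Start with a distance larger than any real one
    dist = configuration.columns
    # Columns whose top cell is still empty
    available = [c for c in range(configuration.columns) if observation.board[c] == 0]
    centermost = available[0]
    for c in available:
        v = abs(c - middle)
        if v <= dist:
            dist = v
            centermost = c
        else:
            # Distances only grow from here on, so stop looking
            break
    return centermost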
That may seem more complex but it’s not. It’s just a longer form because we’re doing this:
- Find the exact middle of the columns based on how many there are
- Set an initial dist equal to the number of columns; we’re going to compare each column’s distance from the middle so this number will shrink
- Get the available columns (some might be full!)
- For each available column, find out how far away from the middle it is. If it’s at least as close, set the dist to be the new closer value and mark that column as centermost. Keep going until the dist no longer shrinks (about halfway) and then stop so we don’t waste calculation time.
Now, of course, we could streamline this much more and even work outwards from the middle of the sequence and be more efficient, but this is our quick and easy agent: did it work?
We got lucky! Player 2 did not randomly drop a mark in Column 4 and we lined up an uninterrupted vertical row.
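As for the streamlining idea of working outwards from the middle, one possible sketch is to sort the column numbers by their distance from the center and take the first one that isn’t full. The names center_out_order and center_first_agent are invented here, the center is taken as (columns - 1) / 2 (the midpoint of the zero-indexed column numbers), and ties are broken toward the left, which is simply a choice made for this sketch.

```python
from types import SimpleNamespace


def center_out_order(columns):
    """Column numbers ordered from the center outwards (ties go left)."""
    middle = (columns - 1) / 2.0
    return sorted(range(columns), key=lambda c: (abs(c - middle), c))


def center_first_agent(observation, configuration):
    """Pick the first non-full column in center-out order."""
    for c in center_out_order(configuration.columns):
        if observation.board[c] == 0:
            return c
    return None  # No empty column: the board is full.


print(center_out_order(7))  # [3, 2, 4, 1, 5, 0, 6]
print(center_out_order(6))  # [2, 3, 1, 4, 0, 5]

# Stand-in objects for a local check: an empty 7x6 board.
configuration = SimpleNamespace(columns=7, rows=6, inarow=4)
observation = SimpleNamespace(board=[0] * 42, mark=1)
print(center_first_agent(observation, configuration))  # 3
```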
Let’s Build and Deploy
Now we need to resume Kaggle’s guide so we can build and deploy this thing.
Following the guide to building our submission we copy/paste the following into our code and remove the env tests:
from kaggle_environments import evaluate, make
import math
import os
import inspect

def my_agent(observation, configuration):
    # Columns are numbered 0..columns-1, so their midpoint is (columns - 1) / 2
    middle = (configuration.columns - 1) / 2.0
    dist = configuration.columns
    available = [c for c in range(configuration.columns) if observation.board[c] == 0]
    centermost = available[0]
    for c in available:
        v = abs(c - middle)
        if v <= dist:
            dist = v
            centermost = c
        else:
            break
    return centermost

def write_agent_to_file(function, file):
    # Appends if the file already exists, so delete an old submission.py before re-running
    with open(file, "a" if os.path.exists(file) else "w") as f:
        f.write(inspect.getsource(function))
    print(function, "written to", file)

write_agent_to_file(my_agent, "submission.py")
When we run this, it will build a new output file for us called submission.py.
You’ll find it on the right-hand dock under Data / output / /kaggle/working / submission.py and once you download it you’re ready to Submit Agent on the Connect X contest page.
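Before uploading, it can be worth a quick local sanity check that the file loads and returns a legal move. The sketch below is not part of Kaggle’s guide: load_agent and smoke_test are hypothetical helper names, it assumes the function in the file is called my_agent, and the SimpleNamespace objects are only stand-ins for what the real environment passes in.

```python
import importlib.util
import os
from types import SimpleNamespace


def load_agent(path, name="my_agent"):
    """Import the Python file at `path` and return the function called `name`."""
    spec = importlib.util.spec_from_file_location("submission", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, name)


def smoke_test(agent, columns=7, rows=6, inarow=4):
    """Call the agent once on an empty board and check the move is in range."""
    configuration = SimpleNamespace(columns=columns, rows=rows, inarow=inarow)
    observation = SimpleNamespace(board=[0] * (columns * rows), mark=1)
    move = agent(observation, configuration)
    assert isinstance(move, int) and 0 <= move < columns, f"bad move: {move!r}"
    return move


# Only runs if a submission.py sits next to this script.
if os.path.exists("submission.py"):
    print("first move:", smoke_test(load_agent("submission.py")))
```

Trying a few other board shapes, such as smoke_test(agent, columns=1) or smoke_test(agent, columns=20, rows=3), is a cheap way to catch anything hard-coded for 7x6.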
And once you’ve submitted it you’re ready to watch eagerly over the coming days as it enters into competition with everyone else’s agents.
Good luck!