game-arena: measuring LLM reasoning capabilities in game environments
Chess & Variants

Chess
Chess is played on a square board consisting of 64 squares arranged in an 8×8 grid. The players, referred to as White and Black, each control sixteen pieces and attempt to checkmate the other player. It is an abstract strategy game that involves no hidden information and no elements of chance.

Atomic Chess
A capture explodes the captured piece, the capturing piece, and all adjacent non-pawn pieces. You win by exploding the opponent's king. Captures that explode your own king are illegal. The game is played on a standard 8×8 chess board with the usual starting pieces.
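The explosion rule can be sketched in a few lines of Python. This is a minimal illustration, not the arena's implementation: the dictionary board representation and the names `neighbors` and `apply_capture` are assumptions made for the example, and en passant and the own-king legality check are left out.

```python
FILES = "abcdefgh"

def neighbors(square):
    """Return the up-to-eight squares adjacent to `square` (e.g. "e5")."""
    f, r = FILES.index(square[0]), int(square[1])
    result = []
    for df in (-1, 0, 1):
        for dr in (-1, 0, 1):
            if (df, dr) == (0, 0):
                continue
            nf, nr = f + df, r + dr
            if 0 <= nf < 8 and 1 <= nr <= 8:
                result.append(FILES[nf] + str(nr))
    return result

def apply_capture(board, src, dst):
    """Return a new board after the piece on `src` captures on `dst`.

    `board` maps square names to piece letters (uppercase = White,
    lowercase = Black). This layout is hypothetical.
    """
    after = dict(board)
    del after[src]  # the capturing piece is destroyed
    del after[dst]  # so is the captured piece
    for sq in neighbors(dst):
        piece = after.get(sq)
        # Adjacent pawns survive the blast; every other adjacent piece is removed.
        if piece is not None and piece.upper() != "P":
            del after[sq]
    return after
```

A full engine would additionally reject any capture whose blast removes the mover's own king.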

Crazyhouse Chess
Captured pieces return to the board under your control. On your turn, you can place a piece from your reserve onto an empty square instead of moving. The game is played on a standard 8×8 chess board with the usual starting pieces. Captured promoted pieces are dropped back as pawns, and pawns can't be dropped on the first or eighth rank.
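The drop restrictions described above can be expressed as a small check. This is a sketch under assumed data structures (a square-to-piece dictionary and a reserve counter); the function name `can_drop` is hypothetical, and the check ignores whether the drop would leave the player's own king in check.

```python
def can_drop(board, reserve, piece, square):
    """Return True if `piece` may be dropped on `square`.

    board:   dict mapping occupied squares (e.g. "e4") to piece letters
    reserve: dict mapping piece letters ("P", "N", ...) to counts in hand
    """
    if reserve.get(piece, 0) <= 0:
        return False  # nothing of this type in reserve
    if square in board:
        return False  # drops must land on an empty square
    if piece == "P" and square[1] in "18":
        return False  # pawns may not be dropped on the first or eighth rank
    return True
```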

Horde Chess
White has 36 pawns against Black's standard pieces. White wins by checkmating the Black king, while Black wins by capturing all of White's pieces, including any promoted pawns. The pawns start in a packed formation covering most of the board, and the game is played on a standard 8×8 chess board.

Racing Kings
Players race their kings to the eighth rank. Giving check is illegal. You may not move your king into check. If White reaches the eighth rank and Black does so immediately on the next move, the game is a draw. The pieces start on the first two ranks and are mirrored so both players face the same way.
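The finish rule, including Black's one-move reply, can be sketched as a result function. This is a simplified illustration: the name `racing_kings_result` and its arguments are assumptions, and it always waits for Black's reply rather than checking whether Black could actually reach the eighth rank.

```python
def racing_kings_result(white_king_rank, black_king_rank, side_to_move):
    """Return "white", "black", "draw", or None if the game continues.

    `side_to_move` is the player about to move ("white" or "black").
    """
    white_home = white_king_rank == 8
    black_home = black_king_rank == 8
    if white_home and black_home:
        return "draw"  # Black answered immediately
    if black_home:
        return "black"  # Black moves second, so no reply is owed
    if white_home:
        # White has just arrived; Black still gets one move to equalize.
        return None if side_to_move == "black" else "white"
    return None
```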

3 Check Chess
You win by checking your opponent's king three times. A double check still counts as one check. The move that gives the third check must itself be legal, so you cannot deliver it while leaving your own king in check. Checkmate, stalemate, and running out of time still end the game as usual.

King of the Hill
You win by moving your king to one of the four central squares: d4, e4, d5, or e5. The move that places your king on a central square must be legal. You cannot move into check to get there. Checkmate and stalemate still end the game as usual. All pieces move as in standard chess and the game uses the usual starting position on a standard 8×8 board.

Anti Chess
In this variant you win by losing all your pieces or by being stalemated. If you can capture, you must. When more than one capture is available, you choose which one to make. The king has no royal power and may be captured like any other piece. There is no check or checkmate, and castling is not allowed. Pawns may promote to a king. The game is played on a standard 8×8 chess board with the usual starting pieces.
Tic-Tac-Toe & Variants

Tic-Tac-Toe
Two players take turns placing X and O on a 3×3 grid. The first player to get three of their marks in a row horizontally, vertically, or diagonally wins. If all nine squares fill with no three-in-a-row, the game is a draw.
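The win and draw conditions reduce to checking eight fixed lines. The sketch below assumes a flat, row-major list of nine cells; the names `LINES` and `outcome` are illustrative only.

```python
# Indices of the eight winning lines on a row-major 3×3 board.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def outcome(cells):
    """Return "X" or "O" for a win, "draw" for a full board, else None.

    `cells` is a list of nine entries, each "X", "O", or None.
    """
    for a, b, c in LINES:
        if cells[a] is not None and cells[a] == cells[b] == cells[c]:
            return cells[a]
    if all(cell is not None for cell in cells):
        return "draw"
    return None
```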

Ultimate Tic-Tac-Toe
Players take turns placing X and O on nine 3×3 boards arranged as a 3×3 grid. The cell you play in sends your opponent to the corresponding small board for their next move. If that target board is already won or full, the next player may play in any unfinished small board. When you win a small board, you claim its square on the big board. The first player to claim three big-board squares in a row wins the game.
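The rule that routes the next player to a small board can be sketched as follows. The function name and the set-based bookkeeping are assumptions for illustration. Treating the first move as a free choice of board is also an assumption, since the description above does not say.

```python
def allowed_boards(last_cell, finished):
    """Return the small boards (indices 0-8) the next player may use.

    last_cell: index 0-8 of the cell just played within its small board,
               or None before the first move (assumed to be a free choice)
    finished:  set of small-board indices that are already won or full
    """
    open_boards = [b for b in range(9) if b not in finished]
    if last_cell is None or last_cell in finished:
        return open_boards  # free choice among unfinished boards
    return [last_cell]  # forced to the board matching the cell just played
```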

3D Tic-Tac-Toe
Players take turns placing X and O in a 3×3×3 cube. The first to make a straight line of three marks in any direction wins. Lines can run along rows, columns, pillars, or diagonals through the cube. If all 27 cells fill with no three-in-a-row, the game is a draw.
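To make "any direction" concrete, the winning lines can be enumerated by walking from each cell along one representative of every direction pair. This is a sketch; `winning_lines` is a hypothetical helper name. For a 3×3×3 cube it produces 49 lines.

```python
from itertools import product

def winning_lines(n=3):
    """List every straight line of n cells in an n×n×n cube."""
    # Keep one direction from each +/- pair so no line is counted twice.
    directions = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]
    lines = []
    for start in product(range(n), repeat=3):
        for d in directions:
            end = tuple(s + (n - 1) * k for s, k in zip(start, d))
            # A line fits only if its last cell is still inside the cube.
            if all(0 <= e < n for e in end):
                lines.append(
                    [tuple(s + i * k for s, k in zip(start, d)) for i in range(n)]
                )
    return lines
```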
Coming Soon

Dungeon (inspired by Cthulhu: Death May Die)
Work together as investigators to disrupt an occult ritual, then fight the awakened boss. Each episode gives a map, monsters, and steps to weaken the ritual. You win by killing the boss's final stage; you lose when all investigators die.

Kingdom Cards (inspired by Dominion)
Start with a small deck. On your turn you play actions and buy cards from a shared supply to make your deck stronger. The game ends when the key piles are empty, and the player with the most points from victory cards wins.

Maze (inspired by Quoridor)
On your turn, either move your pawn one space up, down, left, or right, or place a wall between squares. You may jump over an adjacent pawn, and may move diagonally only when that jump is blocked. A wall may not block a player's last remaining path to their goal. The first to reach the opposite side wins.
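The restriction on blocking all paths is usually enforced with a reachability search after a tentative wall placement. The sketch below uses breadth-first search; the edge-based wall model, the default board size of 9, and the names `has_path` and `wall_is_legal` are all assumptions for illustration, not details taken from this game.

```python
from collections import deque

def has_path(start, goal_row, blocked, size=9):
    """Breadth-first search: can a pawn at `start` still reach `goal_row`?

    `blocked` is a set of frozenset({cell_a, cell_b}) pairs, one per pair of
    adjacent cells separated by a wall. Other pawns are ignored.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if r == goal_row:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                continue
            if nxt in seen or frozenset(((r, c), nxt)) in blocked:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return False

def wall_is_legal(new_edges, blocked, pawns, size=9):
    """A wall is legal only if every pawn keeps some path to its goal row.

    `pawns` is a list of (position, goal_row) pairs.
    """
    trial = blocked | set(new_edges)
    return all(has_path(pos, goal, trial, size) for pos, goal in pawns)
```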

Estate Empire (inspired by Monopoly)
Buy properties and charge rent to drive rivals toward bankruptcy. Roll and move around the board. Collect a full color set to build houses and hotels that raise the rent. Make deals by trading with other players. Chance and Community cards can help or hurt. If you cannot pay, you are out. The last player remaining wins.

Checkers
Pieces move diagonally on the dark squares of an 8×8 board. Captures by jumping are mandatory. Regular pieces move forward; kings move both ways. You win by capturing all the opponent's pieces or leaving them with no legal move.

Backgammon
Roll two dice to move fifteen checkers around twenty-four points. Hit a single opposing checker to send it to the bar, and it must re-enter before other moves. When all your checkers are home, bear them off to win. Players use a doubling cube to raise the stakes.

Hanab Cards (inspired by Hanabi)
Your cards face outward so teammates see your hand, but you do not. Give limited hints about color or number and play cards in order from 1 to 5 to build five fireworks stacks. Mistakes cost fuse tokens; finish stacks or score as high as you can when the deck runs out.

Tiles (inspired by Azul)
Draft all tiles of one color from a display or the center and place them in a row on your board. At round end, one tile from each filled row moves to the wall and scores based on adjacent tiles. The game ends when any player completes a horizontal row on their wall.

Connect 4
Drop discs into a 7×6 vertical grid. Each disc falls to the lowest open space in its column. Be the first to make a line of four discs horizontally, vertically, or diagonally.
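The gravity rule and the four-in-a-line check can be sketched together. The grid layout (row 0 at the bottom) and the names `drop` and `wins` are assumptions made for this example.

```python
ROWS, COLS = 6, 7

def drop(grid, col, disc):
    """Place `disc` in `col`; return the row where it lands (row 0 = bottom)."""
    for row in range(ROWS):
        if grid[row][col] is None:
            grid[row][col] = disc
            return row
    raise ValueError("column is full")

def wins(grid, row, col):
    """Return True if the disc at (row, col) completes a line of four or more."""
    disc = grid[row][col]
    # Horizontal, vertical, and the two diagonal directions.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):  # walk both ways from the new disc
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and grid[r][c] == disc:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False
```

Checking only the lines through the most recently placed disc is enough, because an earlier move cannot become a win later without a new disc joining its line.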

Settlers (inspired by Catan)
Roll two dice to determine which map areas produce resources. Trade with players or the bank and spend resources to build roads, settlements, and cities or buy development cards. A roll of 7 moves the robber and forces discards. The first to 10 victory points wins.

House of Lies (inspired by Coup)
Bluff with two hidden role cards. On your turn take one action: gain coins, claim a role for a stronger action, or pay coins to launch a coup against another player. Anyone may challenge a claim or block certain actions. If you lose a challenge or get hit by a coup, you lose one card. When both your cards are revealed, you are out. The last player with an unrevealed card wins.

Wandering Merchant (inspired by Splendor)
Collect gem tokens to buy cards. Bought cards give lasting discounts and points, making later turns stronger. You may reserve a card and take a gold token as a wildcard. Meet noble requirements for bonus points. When a player reaches 15 points, finish the round and the highest score wins.
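How discounts and gold wildcards combine when buying a card can be sketched as an affordability check. This is only an illustration: the color names, the dictionary layout, and the function `can_afford` are hypothetical, and the rules of this upcoming game may differ.

```python
def can_afford(cost, tokens, bonuses, gold=0):
    """Return True if a card's cost can be paid.

    cost:    dict color -> gems required by the card
    tokens:  dict color -> gem tokens the player holds
    bonuses: dict color -> lasting discounts from cards already bought
    gold:    number of wildcard tokens held
    """
    shortfall = 0
    for color, price in cost.items():
        # Discounts reduce the price first, never below zero.
        need = max(0, price - bonuses.get(color, 0))
        # Whatever regular tokens cannot cover must come from gold.
        shortfall += max(0, need - tokens.get(color, 0))
    return shortfall <= gold
```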

Alien Wars (inspired by Cosmic Encounter)
The goal is to colonize other players' planets. Use a unique alien power and form alliances to place colonies on foreign planets. A draw chooses your target each turn. Both sides choose ships to send and secretly play an encounter card. Allies may join either side for shared rewards. The higher total wins the battle. The first to five foreign colonies wins.

Block Cards (inspired by Project L)
Complete puzzle cards by fitting shape tiles to the outlines. On your turn take actions to gain puzzles, upgrade tiles, and place tiles. Finishing a puzzle gives points and new tiles for future turns. Use upgrades to tackle harder puzzles. After the final round, the highest score wins.