Teaching an AI to Play Fruit Ninja
Introduction
This post marks the beginning of a series on developing AI for various games, starting with Fruit Ninja.
What is Fruit Ninja?
Fruit Ninja is a popular mobile game developed by Halfbrick Studios, where players use a finger swipe to simulate a blade slicing through fruit. The objective is to slice as many fruits as possible while avoiding obstacles like bombs.
Setting Up the Environment / How to Play the Game
Our goal is to play Fruit Ninja on a PC or Mac. While emulators like Bluestacks exist, they are unfortunately not compatible with M1 Macs.
As a result, I will be using an online version of Fruit Ninja, which is available here.
Development
To build an AI to solve the game, I will break the process down into several steps:
1. Capture images of the game and create a dataset.
2. Label the images with bounding boxes around fruits and bombs.
3. Train an initial AI model to detect fruits and bombs.
4. Utilize a Python API to control the cursor movement.
5. Create a backend to launch the model on a server.
6. Call the backend to make a prediction.
7. Combine everything and test our AI.
1 - Create a dataset
To build a dataset, I wrote a simple Python script to take screenshots of my screen.
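Something along these lines is all it takes; a minimal sketch assuming pyautogui (the output folder and frame count are arbitrary choices):

```python
import time
from pathlib import Path

import pyautogui

output_dir = Path("dataset/raw")  # arbitrary output folder
output_dir.mkdir(parents=True, exist_ok=True)

# Grab one frame per second while the game is running; stop with Ctrl+C.
for i in range(200):
    pyautogui.screenshot(str(output_dir / f"frame_{i:04d}.png"))
    time.sleep(1.0)
```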
Here are some images captured by the script. To build a strong dataset, it's crucial to include variety: images with only fruits, images with both fruits and bombs, overlapping objects, mid-slice frames, and even some images with nothing in them.
2 - Labeling
To label our images, I will be using Label Studio, a widely used and popular tool for image annotation.
Here are some examples of image labeling using Label Studio.
To accelerate the labeling process, I will use HixLoop, a tool developed by my company, Bionomeex, which integrates with Label Studio to perform AI-assisted labeling.
First Labeled Dataset
The first dataset consists of 92 images, featuring 106 bounding boxes for fruits and 22 for bombs.
Additionally, I created a test set with 28 images, which includes 32 bounding boxes for fruits and 7 for bombs.
3 - Train an AI model to detect fruits and bombs
I decided to use Detectron2, a library developed by Facebook Research that offers a variety of pre-trained object detection models for tasks like object detection, segmentation, and more.
I began with the Faster R-CNN R101-FPN baseline model, as it offers a good balance between training time, inference time, and box average precision, making it a solid starting point for object detection tasks.
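As a rough idea of what the fine-tuning setup looks like, here is a minimal sketch assuming the annotations were exported from Label Studio in COCO format; the dataset names and file paths are hypothetical:

```python
import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

# Hypothetical COCO-format exports of the Label Studio annotations.
register_coco_instances("fruit_ninja_train", {}, "annotations/train.json", "images/train")
register_coco_instances("fruit_ninja_test", {}, "annotations/test.json", "images/test")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("fruit_ninja_train",)
cfg.DATASETS.TEST = ("fruit_ninja_test",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2   # fruit and bomb
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 400             # note: Detectron2's solver counts iterations

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```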
After a first training run of 400 epochs, I got the following metrics on the test set.
4 - Utilize a Python API to control the cursor movement
We are now ready to implement the missing components to play the game.
We will use the pyautogui library, which we already used for taking screenshots. The cursor will slice from the top-right corner of each bounding box to the bottom-left corner. The code could be improved further to avoid slicing through bombs when objects overlap.
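A minimal sketch of the swipe, assuming boxes arrive as (x_min, y_min, x_max, y_max) screen coordinates:

```python
import pyautogui

def slice_box(box):
    """Swipe diagonally across a bounding box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    pyautogui.moveTo(x_max, y_min)                               # start at the top-right corner
    pyautogui.dragTo(x_min, y_max, duration=0.1, button="left")  # drag to the bottom-left corner
```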
5 - Create a backend to launch the model on a server
For the backend, I have chosen FastAPI. The code is designed to load the model only once using the singleton pattern, ensuring efficient resource usage. It provides a prediction endpoint that accepts an image as input and returns a list of dictionaries, each containing the bounding box coordinates and the predicted class.
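Here is a sketch of what that backend could look like; the config and weight paths are placeholders, and the endpoint name is my own choice:

```python
import io

import numpy as np
from fastapi import FastAPI, File, UploadFile
from PIL import Image

app = FastAPI()
_predictor = None  # module-level singleton, created on first use


def get_predictor():
    """Load the Detectron2 predictor once and reuse it for every request."""
    global _predictor
    if _predictor is None:
        from detectron2.config import get_cfg
        from detectron2.engine import DefaultPredictor

        cfg = get_cfg()
        cfg.merge_from_file("config.yaml")      # placeholder config path
        cfg.MODEL.WEIGHTS = "model_final.pth"   # placeholder weights path
        _predictor = DefaultPredictor(cfg)
    return _predictor


@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    image = Image.open(io.BytesIO(await file.read())).convert("RGB")
    # DefaultPredictor expects a BGR numpy array by default.
    outputs = get_predictor()(np.array(image)[:, :, ::-1])
    instances = outputs["instances"].to("cpu")
    return [
        {"box": box.tolist(), "class": int(cls)}
        for box, cls in zip(instances.pred_boxes.tensor, instances.pred_classes)
    ]
```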
6 - Call the backend to make a prediction
Interacting with the backend is straightforward: make a POST request with the requests library, sending an image to the prediction endpoint and getting the bounding box coordinates and class predictions back.
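For example, assuming the backend from the previous step is running locally on port 8000:

```python
import requests

def predict(image_path, url="http://localhost:8000/predict"):
    """Send a screenshot to the backend and return its list of detections."""
    with open(image_path, "rb") as f:
        response = requests.post(url, files={"file": f})
    response.raise_for_status()
    return response.json()  # e.g. [{"box": [x_min, y_min, x_max, y_max], "class": 0}, ...]
```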
7 - Combine everything and test our AI
Here is the first loop, which runs each action in the correct order:
- take a screenshot
- make a prediction
- iterate over the boxes and slice any fruit
I added a keybind to pause the AI's execution and a basic timer to track how long each loop iteration takes.
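Here is a sketch of that loop, reusing the predict() and slice_box() helpers from above; the keyboard package is one hypothetical way to implement the pause keybind (it needs elevated permissions on macOS), and the fruit class id is assumed:

```python
import time

import keyboard  # hypothetical choice for the keybind
import pyautogui

FRUIT_CLASS = 0  # assumed class id for fruits
paused = False

def toggle_pause():
    global paused
    paused = not paused

keyboard.add_hotkey("p", toggle_pause)  # press "p" to pause/resume the AI

while True:
    if paused:
        time.sleep(0.1)
        continue
    start = time.time()
    pyautogui.screenshot("frame.png")        # 1. take a screenshot
    detections = predict("frame.png")        # 2. ask the backend for boxes
    for det in detections:                   # 3. slice every detected fruit
        if det["class"] == FRUIT_CLASS:
            slice_box(det["box"])
    print(f"Process took {time.time() - start:.4f} seconds.")
```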
Here’s the time taken for the screenshot and prediction process. As you can see, it’s quite slow.
- Process took 2.3736 seconds.
- Process took 2.1963 seconds.
- Process took 2.2147 seconds.
- Process took 2.2771 seconds.
The slowness is due to several factors, with the two main reasons being:
- The screenshot takes too much time.
- The model doesn’t need a 2K image: by default, Detectron2 resizes inputs so that the shorter side is 800 pixels.
By using the mss Python library, the screenshot step now takes about 0.08 seconds instead of 2.
- Process took 0.1808 seconds.
- Process took 0.0809 seconds.
- Process took 0.0797 seconds.
- Process took 0.0819 seconds.
- Process took 0.0798 seconds.
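For reference, here is a minimal sketch of the mss-based capture that replaced pyautogui.screenshot():

```python
from mss import mss
from PIL import Image

with mss() as sct:
    # Grabbing the primary monitor is far faster than pyautogui.screenshot().
    raw = sct.grab(sct.monitors[1])
    frame = Image.frombytes("RGB", raw.size, raw.rgb)
    frame.save("frame.png")
```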