Teaching an AI to Play Fruit Ninja #1
Introduction
This post is the first in a series on developing AI for various games, starting with Fruit Ninja.
What is Fruit Ninja?
Fruit Ninja is a popular mobile game developed by Halfbrick Studios, where players use a finger swipe to simulate a blade slicing through fruit. The objective is to slice as many fruits as possible while avoiding obstacles like bombs.
Setting Up the Environment / How to Play the Game
Our goal is to play Fruit Ninja on a PC or Mac. While emulators like Bluestacks exist, they are unfortunately not compatible with M1 Macs.
As a result, I will be using an online version of Fruit Ninja, which is available here.
To make the process more challenging, I decided to deploy the backend on a server while running the client on my Mac. This setup introduces latency, further complicating the task.
Development
To build an AI to solve the game, I will break the process down into several steps:
- 1 Capture images of the game and create a dataset.
- 2 Label the images with bounding boxes around fruits and bombs.
- 3 Train an initial AI model to detect fruits and bombs.
- 4 Utilize a Python API to control the cursor movement.
- 5 Create a Backend to launch the model on a server.
- 6 Call the backend to make a prediction.
- 7 Combine everything and test our AI.
1 - Create a dataset
To build the dataset, I wrote a simple Python script that takes screenshots of my screen.
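A capture script along these lines could look like the sketch below. It is a reconstruction rather than the original script: the output folder, file naming, frame count, and interval are all assumptions. `pyautogui` is imported inside the function because it needs a display to load.

```python
import time
from pathlib import Path


def frame_path(out_dir: Path, index: int) -> Path:
    """Build a zero-padded file name so frames sort in capture order."""
    return out_dir / f"frame_{index:05d}.png"


def capture_dataset(out_dir: str = "dataset/raw", count: int = 100, interval: float = 0.5) -> None:
    """Save `count` full-screen screenshots, one every `interval` seconds.

    The folder name, count, and interval are illustrative defaults.
    """
    import pyautogui  # imported lazily: it needs a display to load

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for index in range(count):
        # pyautogui.screenshot() returns a PIL image of the whole screen.
        pyautogui.screenshot().save(frame_path(target, index))
        time.sleep(interval)
```

Calling `capture_dataset()` while the game is running fills the folder with frames that can then be filtered by hand for variety.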
Here are some images captured by the script. To build a strong dataset, it’s crucial to include variety. Therefore, it’s important to have images with only fruits, images with both fruits and bombs, instances of overlaps, slices, and even some images with nothing in them.
2 - Labeling
To label the images, I will use Label Studio, a popular tool for image annotation.
Here are some examples of image labeling using Label Studio.
To accelerate the labeling process, I will use HixLoop, a tool developed by my company, Bionomeex, which integrates with Label Studio to perform AI-assisted labeling.
First Labeled Dataset
The first dataset consists of 92 images, featuring 106 bounding boxes for fruits and 22 for bombs.
Additionally, I created a test set with 28 images, which includes 32 bounding boxes for fruits and 7 for bombs.
3 - Train an AI model to detect fruits and bombs
I decided to use Detectron2, a library developed by Facebook Research that offers pre-trained models for object detection, segmentation, and related tasks.
I began with the Faster R-CNN R101-FPN baseline model, as it offers a good balance between training time, inference time, and box average precision, making it a solid starting point for object detection tasks.
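As a rough sketch, fine-tuning that baseline with Detectron2 could look like the following. The dataset name, the COCO-format annotation paths, and the batch size are assumptions, not details from the original training run. Detectron2 schedules training in iterations rather than epochs, so the helper converts one into the other.

```python
import math


def epochs_to_iterations(epochs: int, num_images: int, batch_size: int) -> int:
    """Detectron2 counts optimizer steps, so convert epochs to iterations."""
    return math.ceil(epochs * num_images / batch_size)


def train(num_train_images: int, epochs: int, batch_size: int = 2) -> None:
    # Imported lazily so the helper above works without Detectron2 installed.
    import os

    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data.datasets import register_coco_instances
    from detectron2.engine import DefaultTrainer

    # Hypothetical paths: a COCO-format export of the annotations.
    register_coco_instances("fruit_ninja_train", {}, "dataset/train.json", "dataset/train")

    base = "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(base))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(base)  # start from COCO weights
    cfg.DATASETS.TRAIN = ("fruit_ninja_train",)
    cfg.DATASETS.TEST = ()
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2  # fruit, bomb
    cfg.SOLVER.IMS_PER_BATCH = batch_size  # assumed batch size
    cfg.SOLVER.STEPS = []  # no learning-rate decay for a short run
    cfg.SOLVER.MAX_ITER = epochs_to_iterations(epochs, num_train_images, batch_size)

    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
```

With the 92 training images and an assumed batch size of 2, one epoch corresponds to 46 iterations.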
After a first training run of 400 epochs, I get the following metrics on the test set.
4 - Utilize a Python API to control the cursor movement
We are now ready to implement the missing components to play the game.
We will use the pyautogui library, which is also used for taking screenshots. The cursor will slice from the top right corner of the bounding box to the bottom left corner. This simple strategy can cut through a bomb when objects overlap, so the code can later be improved to handle those cases.
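The swipe described above can be sketched as follows. The box format `(x1, y1, x2, y2)`, the drag duration, and the bomb check are assumptions. Skipping any fruit whose box touches a bomb is only one conservative way to handle overlaps, not necessarily the approach used in the project.

```python
def slice_points(box):
    """Return (start, end) for a top-right to bottom-left swipe.

    `box` is (x1, y1, x2, y2) in screen coordinates, with y growing downward.
    """
    x1, y1, x2, y2 = box
    return (x2, y1), (x1, y2)


def boxes_overlap(a, b) -> bool:
    """True when two (x1, y1, x2, y2) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def slice_fruit(box, bombs=(), duration: float = 0.1) -> bool:
    """Swipe across `box` unless it overlaps a bomb. Returns True if it sliced."""
    if any(boxes_overlap(box, bomb) for bomb in bombs):
        return False  # conservative choice: skip fruits touching a bomb
    import pyautogui  # imported lazily: it needs a display to load

    start, end = slice_points(box)
    pyautogui.moveTo(*start)
    # Hold the left button while moving to the end point to simulate a swipe.
    pyautogui.dragTo(*end, duration=duration, button="left")
    return True
```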
5 - Create a Backend to launch the model on a server.
For the backend, I have chosen FastAPI. The code is designed to load the model only once using the singleton pattern, ensuring efficient resource usage. It provides a prediction endpoint that accepts an image as input and returns a list of dictionaries, each containing the bounding box coordinates and the predicted class.
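A minimal sketch of such a backend is shown below. The route name `/predict`, the weights path, the class order, and the response keys are assumptions. `functools.lru_cache` stands in for the singleton: the predictor is built on the first request and reused afterwards.

```python
from functools import lru_cache

CLASS_NAMES = ("fruit", "bomb")  # assumed label order from training


def format_predictions(boxes, class_ids, scores):
    """Turn raw detector output into JSON-friendly dictionaries."""
    return [
        {"box": [float(v) for v in box], "label": CLASS_NAMES[int(cid)], "score": float(score)}
        for box, cid, score in zip(boxes, class_ids, scores)
    ]


@lru_cache(maxsize=1)
def get_predictor():
    """Build the Detectron2 predictor once; later calls reuse the cached one."""
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"))
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = len(CLASS_NAMES)
    cfg.MODEL.WEIGHTS = "output/model_final.pth"  # hypothetical path to the trained weights
    return DefaultPredictor(cfg)


def create_app():
    import io

    import numpy as np
    from fastapi import FastAPI, File, UploadFile
    from PIL import Image

    app = FastAPI()

    @app.post("/predict")  # hypothetical route name
    def predict(file: UploadFile = File(...)):
        image = Image.open(io.BytesIO(file.file.read())).convert("RGB")
        # Detectron2 expects BGR channel order by default.
        bgr = np.ascontiguousarray(np.asarray(image)[:, :, ::-1])
        instances = get_predictor()(bgr)["instances"].to("cpu")
        return format_predictions(
            instances.pred_boxes.tensor.tolist(),
            instances.pred_classes.tolist(),
            instances.scores.tolist(),
        )

    return app
```

The endpoint is a plain `def` rather than `async def` so that FastAPI runs the blocking inference in its thread pool instead of stalling the event loop. File uploads in FastAPI also require the `python-multipart` package.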
6 - Call the backend to make a prediction
Interacting with the backend is straightforward: make a POST request using the requests library. This sends an image to the prediction endpoint and returns the bounding box coordinates and class predictions.
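The client side might look like this sketch. The server URL, the route, and the form field name `file` are assumptions that must match whatever the backend actually exposes. The `post` parameter exists only so the function can be exercised without a running server.

```python
API_URL = "http://localhost:8000/predict"  # hypothetical server address and route


def request_predictions(image_bytes: bytes, url: str = API_URL, post=None, timeout: float = 5.0):
    """POST a PNG to the backend and return its list of detections."""
    if post is None:
        import requests  # imported lazily so a stub can replace it offline

        post = requests.post
    response = post(
        url,
        files={"file": ("frame.png", image_bytes, "image/png")},
        timeout=timeout,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()
```

Setting a timeout matters here because the server is remote: a stalled request would otherwise freeze the game loop.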
7 - Combine everything and test our AI
Here is the first version of the loop, which runs each action in the correct order:
- take a screenshot
- make a prediction
- iterate the boxes and slice any fruit.
I added a keybind to pause the AI’s execution and implemented a basic time calculation to track how long the process takes.
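The loop can be sketched as below. The three stages are passed in as callables, and the pause flag is a `threading.Event` that a hotkey handler could set and clear. Both choices are assumptions made to keep the sketch self-contained, as are the detection keys `label` and `box`.

```python
import threading
import time


def run_loop(capture, predict, slice_box, paused: threading.Event, max_frames: int) -> int:
    """Run screenshot -> prediction -> slicing for `max_frames` frames.

    Returns the number of fruits sliced. The three callables are injected
    so each stage can be swapped or stubbed independently.
    """
    sliced = 0
    for _ in range(max_frames):
        while paused.is_set():  # a hotkey handler can set or clear this event
            time.sleep(0.05)
        start = time.perf_counter()
        detections = predict(capture())
        # Time only the screenshot and prediction, not the cursor movement.
        print(f"Process took {time.perf_counter() - start:.4f} seconds.")
        for detection in detections:
            if detection["label"] == "fruit":  # never swipe across bombs
                slice_box(detection["box"])
                sliced += 1
    return sliced
```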
Here’s the time taken for the screenshot and prediction process. As you can see, it’s quite slow.
- Process took 2.3736 seconds.
- Process took 2.1963 seconds.
- Process took 2.2147 seconds.
- Process took 2.2771 seconds.
The slowness is due to several factors, with the two main reasons being:
- The screenshot takes too much time.
- The screenshot is sent at 2K resolution, which the model doesn’t need: Detectron2’s default configuration resizes inputs so that the shorter side is 800 pixels anyway.
Comparison of mss and pyautogui for Screenshot Capture
With pyautogui, the screenshot takes ~0.50 seconds.
- Process took 0.5300 seconds.
- Process took 0.5663 seconds.
- Process took 0.5252 seconds.
- Process took 0.5059 seconds.
- Process took 0.5111 seconds.
With mss, the screenshot takes about 0.08 seconds after the first call.
- Process took 0.1808 seconds.
- Process took 0.0809 seconds.
- Process took 0.0797 seconds.
- Process took 0.0819 seconds.
- Process took 0.0798 seconds.
The updated code takes the screenshot with mss and resizes the captured image to 800x800 pixels.
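A possible version of that capture step is sketched here. Grabbing `monitors[1]` (the primary screen) and encoding the result as PNG are assumptions. Because the image is stretched to 800x800, predicted boxes have to be mapped back to screen pixels before moving the cursor, which is what `rescale_box` does.

```python
MODEL_SIZE = (800, 800)  # target size named in the post


def rescale_box(box, src_size, dst_size):
    """Map an (x1, y1, x2, y2) box from the resized image back to screen pixels."""
    x1, y1, x2, y2 = box
    return (
        x1 * dst_size[0] / src_size[0],
        y1 * dst_size[1] / src_size[1],
        x2 * dst_size[0] / src_size[0],
        y2 * dst_size[1] / src_size[1],
    )


def grab_resized(size=MODEL_SIZE):
    """Capture the primary monitor with mss and return (PNG bytes, original size)."""
    import io

    import mss
    from PIL import Image

    with mss.mss() as sct:
        shot = sct.grab(sct.monitors[1])  # monitors[0] is the union of all screens
    # mss returns BGRA pixels; convert them to an RGB PIL image.
    image = Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")
    buffer = io.BytesIO()
    image.resize(size).save(buffer, format="PNG")
    return buffer.getvalue(), tuple(shot.size)
```

Resizing before upload shrinks the payload as well as the model input, which helps when the backend runs on a remote server.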
With the two main issues resolved, the screenshot capture and prediction process now takes less than 1 second.
- Process took 0.8128 seconds.
- Process took 0.7935 seconds.
- Process took 0.7922 seconds.
- Process took 0.8162 seconds.
- Process took 0.7968 seconds.
First result
The AI can detect fruits, but the frame rate is too low for it to be effective in real-time gameplay. Additionally, there are instances where multiple boxes are drawn around the same fruit. To address this, we will implement Non-Maximum Suppression (NMS) to eliminate the box with the lower confidence score in cases of overlap.
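For illustration, a plain-Python greedy NMS over the returned detections could look like this. The detection format (dictionaries with `box` and `score` keys) and the 0.5 IoU threshold are assumptions.

```python
def iou(a, b) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width <= 0 or height <= 0:
        return 0.0  # the boxes do not intersect
    inter = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(detections, iou_threshold: float = 0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    kept = []
    for candidate in sorted(detections, key=lambda d: d["score"], reverse=True):
        # Keep the candidate only if it does not overlap a better box too much.
        if all(iou(candidate["box"], k["box"]) <= iou_threshold for k in kept):
            kept.append(candidate)
    return kept
```

Detectron2 already applies per-class NMS at inference time, controlled by `cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST`. Lowering that threshold may be a simpler fix if the duplicate boxes share a class.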
More Optimizations
- Parallelize processing
- Slice multiple fruits at the same time
- Use a real-time alternative to Faster R-CNN
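As one possible reading of the first item, capture and prediction could be overlapped so that the next screenshot is taken while the previous frame is still being processed on the server. This is a sketch of that idea under assumed interfaces (`capture` and `predict` are the callables from the main loop), not an implemented or measured optimization.

```python
from concurrent.futures import ThreadPoolExecutor


def pipelined_predictions(capture, predict, num_frames: int):
    """Yield predictions while capturing the next frame during each prediction."""
    if num_frames <= 0:
        return
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(predict, capture())
        for _ in range(num_frames - 1):
            frame = capture()  # runs while the previous prediction is in flight
            result = pending.result()
            pending = pool.submit(predict, frame)
            yield result
        yield pending.result()
```

The trade-off is that each prediction is acted on one capture later, so the gain depends on how quickly fruits move relative to the round-trip time.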