How AI background image removers actually work + which one is the best.
I tested a bunch of tools, and we get a little technical
At a glance, AI can seem like magic.
How else do you explain the fact that you can enter a prompt like ‘Goddess of the night surrounded by fireflies, wide angle view, very detailed, detailed face, matte, tonemapping, perfection, 4 k’ and get this 👇🏻??
Sure, the eyes are wonky and the hands are weird, but a computer literally took a collection of words, translated them into an image, and made something that doesn’t look half bad. Obviously magic, right?
But by spending a few minutes understanding how AI works, we can not only use the tools better but also see that it is not magic. It is math, and something we can control.
I will explain how generative art AI works in a future letter. Today, let’s look at a simpler use case, background image removers.
How do AI background image removers actually work?
We have always been able to remove backgrounds with image editing tools. From the days of razor blades and mattes to Photoshop, we have had tools to remove a subject from the background. These tools usually involve a lot of manual work: cutting out the subject, deleting the background, and blurring and blending the edges.
Now with AI-enabled tools you can remove backgrounds instantly. I uploaded our generative image of the firefly woman to a background remover and it did a fantastic job. It correctly identified the woman, selected all of her, and preserved the details in the hair.
So how does AI background image removal work?
How does the computer know what the background is, what the foreground is, and what we want removed?
To understand that, with apologies to C-3PO, let’s get technical.
An image is a grid of numbers
Images on computers are grids of lots of little squares called pixels.
A pixel is a square that has a number in it. For black and white images, 0 is black (all the way off) and 255 is white (all the way on), with shades of gray in between.
Color pixels are made up of 3 smaller parts holding how much red, green, and blue to use. By combining red, green, and blue light, you can make a huge range of colors.
But to keep things simple, we will just focus on black and white pixels.
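To make the "grid of numbers" idea concrete, here is a minimal Python sketch. The 4x4 image and the helper names invert and to_gray are made up for illustration, and the plain average in to_gray is just one easy way to squash red, green, and blue into a single brightness number, not the only one.

```python
# A tiny 4x4 black and white "image": 0 is black, 255 is white.
# Left half dark, right half bright.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]


def invert(img):
    """Flip dark and light by subtracting every pixel from 255."""
    return [[255 - pixel for pixel in row] for row in img]


def to_gray(red, green, blue):
    """Average the three color values into one brightness value (an assumption;
    other weightings exist)."""
    return (red + green + blue) // 3


inverted = invert(image)
print(inverted[0])         # [255, 255, 0, 0]
print(to_gray(255, 0, 0))  # 85
```

Because the image is just nested lists of numbers, "editing" it is nothing more than arithmetic on each cell.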
To start removing the background, we need to find the edges of objects in our image, so we can cut along them.
Because an image is just a grid of numbers, we can do math on it. To identify the edge of an object in an image, we will use a technique called edge detection.
Edge Detection
Edge detection is a type of image processor. An image processor takes an image as input and then transforms or modifies that image in some way. The output of an image processor can be either an image or a bunch of numbers and parameters related to the image.
We put an image in, edge detection does some math, and we get an image out that shows us, and the computer, clearly where the edges are.
But how does that actually work? Edge detection uses what is called a kernel convolution. (Don’t get scared of the big name.)
Kernel Convolution
Convolution is the process of going through a grid of numbers, cell by cell, doing the math described by the kernel on the neighboring cells, and then putting the resulting number into a new grid.
This gif shows the process.
So with Kernel Convolution, we put a grid of numbers in, do some math, and get a grid of numbers out.
Remember an image is a grid of numbers, so we put an image in, and get an image out.
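Here is a rough sketch of that process in plain Python, with no image library. The function name convolve, the tiny test image, and the all-ones "box" kernel are assumptions for illustration. To stay short it skips the border pixels, and like many image libraries it does not flip the kernel, which makes no difference for a symmetric kernel like this one.

```python
def convolve(img, kernel):
    """Slide a 3x3 kernel over img and return a new, slightly smaller grid.

    For each interior pixel, multiply the pixel and its 8 neighbors by the
    matching kernel cell and add the results together.
    """
    height, width = len(img), len(img[0])
    out = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    total += img[y + ky - 1][x + kx - 1] * kernel[ky][kx]
            row.append(total)
        out.append(row)
    return out


# Left half black, right half a medium gray.
image = [
    [0, 0, 90, 90],
    [0, 0, 90, 90],
    [0, 0, 90, 90],
    [0, 0, 90, 90],
]

# An all-ones kernel adds up the 3x3 neighborhood; dividing by 9 averages it.
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

blurred = [[value // 9 for value in row] for row in convolve(image, box)]
print(blurred)  # [[30, 60], [30, 60]]
```

The hard jump from 0 to 90 becomes a softer 30 then 60, which is a blur. Swap in a different kernel and the same loop does a different job.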
Traditional image processing tools can already do this. When you use the ‘magnetic lasso’ in Photoshop it is just finding the edge of an object.
What edge detection alone can’t do well is figure out what the object is. Edge detection just finds edges; it doesn’t find objects. It relies on the human to tell it what to keep, and what to delete.
This leads to problems, which leads to lots of work for humans. Transparent areas, low-contrast areas, hair, fur, and fluff are all hard cases. Humans would spend a lot of time cleaning up and refining the edges to properly separate a subject from the background.
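To see what "just finds edges" means in practice, here is a small sketch using the Sobel kernels from the Resources section. I am assuming Sobel purely as an example; the background removers I tested may use something else entirely. The names apply_kernel and sobel_edges and the 5x5 test image are made up for illustration.

```python
import math

# Sobel kernels: one responds to left-right changes, one to up-down changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(img, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered on (y, x)."""
    return sum(
        img[y + ky - 1][x + kx - 1] * kernel[ky][kx]
        for ky in range(3)
        for kx in range(3)
    )


def sobel_edges(img):
    """Return an edge-strength grid (0 to 255) for the interior pixels."""
    height, width = len(img), len(img[0])
    edges = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = apply_kernel(img, SOBEL_X, y, x)
            gy = apply_kernel(img, SOBEL_Y, y, x)
            strength = int(math.hypot(gx, gy))  # combine both directions
            row.append(min(255, strength))      # clamp to the pixel range
        edges.append(row)
    return edges


# Black on the left, white on the right: one vertical edge.
image = [[0, 0, 0, 255, 255] for _ in range(5)]

edges = sobel_edges(image)
for row in edges:
    print(row)  # each row prints [0, 255, 255]
```

The output lights up where the brightness changes and stays dark where it doesn't. Nothing in it says which side of the edge is the subject, which is exactly the gap a human used to fill.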
But what if the computer could combine knowing where the edges of an image were, with knowing what the objects were? Then it could figure out what is hair, what is the subject and what is the background.
To do that, we need machine vision.
Machine Vision
Machine vision is the process of computers extracting useful information from digital images.
This information can be in the form of measurements, shapes, identification, locations, or anything that can be represented digitally.
Machine vision can let a computer not just find edges, but also figure out what an object is.
We will dive into machine vision in the future. For right now, we just need to know that machine vision allows the computer to guess what different objects in an image are.
For example, when I put the picture of our firefly woman into a machine vision algorithm, it returned the objects it found, as well as labels.
This is super useful to the background remover. It can now guess that the subject is a person, and what the background is. It also knows there is hair so it can deal with the detailed edges that entails.
And machine vision can get super detailed. The above results were a quick test online. Other machine vision algorithms can identify smaller and smaller areas, categorizing individual pixels as parts of different objects.
Combining everything we just learned, we take our input image and feed it to an Edge Detector image processor. The Edge Detector runs a kernel convolution on it, looking at each pixel, doing math on the surrounding pixels, and outputting the result to a new image. We then run a machine vision algorithm to identify objects in the image. Combining the objects with their edges, the computer decides what is the background and how to remove it.
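The last step, turning "this pixel is the person, that pixel is background" into an actual cutout, can be sketched like this. Everything here is a stand-in: the labels grid is hand-typed rather than produced by a real machine vision model, and the names make_alpha_mask and remove_background are hypothetical. Real tools most likely use soft, in-between transparency values around hair and other fine edges instead of the hard on/off mask shown here.

```python
# Hypothetical per-pixel labels, standing in for a machine vision model's output.
labels = [
    ["background", "person", "background"],
    ["person",     "person", "person"],
    ["background", "person", "background"],
]

# The matching 3x3 black and white image (0 is black, 255 is white).
image = [
    [200, 50, 210],
    [60,  40, 55],
    [190, 45, 205],
]


def make_alpha_mask(label_grid, subject="person"):
    """255 (fully visible) where the subject is, 0 (transparent) elsewhere."""
    return [[255 if label == subject else 0 for label in row] for row in label_grid]


def remove_background(img, mask):
    """Pair every pixel with its transparency value: (brightness, alpha)."""
    return [
        [(pixel, alpha) for pixel, alpha in zip(img_row, mask_row)]
        for img_row, mask_row in zip(img, mask)
    ]


mask = make_alpha_mask(labels)
cutout = remove_background(image, mask)
print(cutout[0])  # [(200, 0), (50, 255), (210, 0)]
```

The background pixels are still in the grid, but their alpha of 0 makes them invisible, which is how a transparent background is usually stored.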
AI Background Remover Tools you can use Today
And because this newsletter is not just about the theoretical, but how to actually use it, I reviewed 10 different AI background removers. Here is the list of all of them.
The top 3 were:
AI is not magic. It is clever math. What is amazing is that you can combine different math techniques to make tools that do in seconds what used to take minutes or hours.
Resources
Welch Labs How Computers See YouTube Video Series - amazing series of videos. Some of the best machine learning content on YouTube.
Sobel Edge Detection YouTube Video - shows on paper a kernel convolution.
https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution - source of the gif.
https://en.wikipedia.org/wiki/Sobel_operator - Sobel is a commonly used edge detection operator in image processing.
https://towardsdatascience.com/background-removal-with-deep-learning-c4f2104b3157 - How automatic background removal works. Good primer.
List of Best AI Background Image Removers
https://livecodestream.dev/post/remove-the-background-from-images-using-ai-and-python/ - code your own background remover.