AI Upscaling

By Zarxrax

Updated 7/1/2023


This is a (basic) tutorial on how to upscale an animated video using AI models.

You will need an NVIDIA GPU for best results (see the section at the end if you don’t have one).

Overview

Upscaling is done by using an AI “model” to process your video frames. Each model has been trained to modify an image in a particular way. There are a lot of different models to choose from. Some models don’t even do upscaling; they might do other tasks instead, like removing certain types of artifacts from the image. Models can be found at OpenModelDB. I will recommend a few models to start with later in this guide.


Along with models, you will need a piece of software that will actually let you USE the model. In this guide, we will use a tool called chaiNNer. It is relatively easy to set up and use. (You can alternatively use vs-mlrt for Vapoursynth. This is actually quite a bit faster than chaiNNer, but it is somewhat difficult to set up and use. For that reason, I will not be covering it in this guide. For most people, I recommend sticking with chaiNNer unless you are comfortable working with Python.)


The models and tools in this guide are designed for upscaling images. They work pretty well for animated video, because animation is simply a series of images. However, they don’t work as well for live-action video, or for certain types of CG video. For live-action, the best tool currently is Topaz Video AI (commercial software).

How to Set Up and Use chaiNNer

Download and install the latest version of chaiNNer: https://chainner.app

There is pretty good documentation for getting started on chaiNNer’s GitHub page, so I recommend spending a few minutes reading over it.

After installing and launching the program for the first time, you will need to let it download the dependencies. This may take a while as it needs to download several gigabytes of data. If it doesn’t happen automatically, you can access the dependency manager in the upper right corner of the application window.


You can also find some tutorials for chaiNNer on YouTube, such as this one: ChaiNNer – Upscale a single image & a directory

Note that the video might be slightly outdated, as ChaiNNer is under heavy development.


ChaiNNer is a node-based image processing toolkit. You basically create a flowchart telling it what to do: load an image, load a model, upscale the image using the model, and save the result as a new image. ChaiNNer also has a “video frame iterator”, which lets you load a video file directly and then run that processing on each individual frame.
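
If you’re curious what that chain boils down to outside of the GUI, here is a rough Python sketch of the same load image → load model → upscale → save idea, using the spandrel model-loading library (from the chaiNNer developers) and Pillow. Treat it as an illustration only: the file names are placeholders, and the tensor handling is my own guess at the details, not chaiNNer’s actual code.

    # Illustrative sketch only, not chaiNNer's internals.
    # Assumes torch, numpy, Pillow, and spandrel are installed and a CUDA GPU is available.
    import numpy as np
    import torch
    from PIL import Image
    from spandrel import ModelLoader

    model = ModelLoader().load_from_file("2x_Futsuu_Anime.pth")   # "Load Model" node
    model.cuda().eval()

    img = Image.open("frame.png").convert("RGB")                  # "Load Image" node
    x = torch.from_numpy(np.array(img)).float().div(255)          # HxWxC, values 0-1
    x = x.permute(2, 0, 1).unsqueeze(0).cuda()                    # 1xCxHxW on the GPU

    with torch.no_grad():                                         # "Upscale Image" node
        y = model(x).clamp(0, 1)

    out = (y.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255).astype(np.uint8)
    Image.fromarray(out).save("frame_2x.png")                     # "Save Image" node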


So to process a video file using a model, you would basically create a chain that looks like this:

A basic explanation of how to build this graph: on the left side panel you will see all of the nodes that you can bring in. First, select the “Video Frame Iterator” and drag it into the workspace. Inside the Video Frame Iterator there will already be two additional nodes, “Load Frame As Image” and “Write Output Frame”. At the top of the Video Frame Iterator is an area where you can select your input video file. You will also want to find the orange “Upscale Image” node and drag it inside the Video Frame Iterator.


Next, select the orange “Load Model” node and drag it into the workspace. (Note that it used to be necessary to place the Load Model node outside of the iterator, but newer versions of chaiNNer don’t care where you put it.) Finally, just hook up the nodes as depicted in the image. On the “Write Output Frame” node you can specify a different directory and filename for your output file, and you can also set the encoding settings here. I recommend outputting an MP4 at the best quality (quality 0) to create a lossless file, which you can then do further processing or encoding on.
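
Since that quality 0 file is only an intermediate, you will usually want to re-encode it to something smaller afterwards. If you do that step with ffmpeg, a typical re-encode (shown here through Python’s subprocess; the file names, CRF value, and preset are just examples to adjust to taste) might look like this:

    # Example only: shrink chaiNNer's lossless MP4 output down to a watchable file.
    # Requires ffmpeg to be installed and on your PATH.
    import subprocess

    subprocess.run([
        "ffmpeg", "-i", "upscaled_lossless.mp4",
        "-c:v", "libx264", "-crf", "16", "-preset", "slow",  # lower CRF = higher quality
        "-pix_fmt", "yuv420p",
        "upscaled_final.mp4",
    ], check=True)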

Selecting a Model

Processing a video may be somewhat slow, depending on your graphics card and the model selected. Before processing an entire video, you should first export a few individual frames from your video and test different models on them in order to find a model (or models) that make your video look good.
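
If you’re not sure how to export a few test frames, one option is ffmpeg (called here through Python’s subprocess; ffmpeg has to be installed separately, and the input file name and frame interval are just examples):

    # Example only: export every 500th frame of the source as PNGs for model testing.
    import os
    import subprocess

    os.makedirs("test_frames", exist_ok=True)
    subprocess.run([
        "ffmpeg", "-i", "input.mkv",
        "-vf", "select='not(mod(n,500))'",  # keep one frame out of every 500
        "-vsync", "vfr",                    # only output the selected frames
        "test_frames/frame_%03d.png",
    ], check=True)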


It is worth noting that you may see some models described as “lite” or “compact”. These models can be much faster, at the expense of being smaller (and thus not able to “learn” as much). Animation does not usually need a full-sized model, so I highly recommend using lite or compact models when possible. In general, the smaller the file size of a model, the faster it will run. A full-sized ESRGAN model is around 64MB, while compact models may come in at just a couple of MB.


You will also want to note that each model is trained to scale the image by a specific factor, typically 2x or 4x. To get your video to a specific resolution, you will typically upscale it with the model and then do a standard resize to reach the exact size you want; for example, a 640x480 source run through a 2x model comes out at 1280x960, which you could then resize to 1440x1080. 1x models are usually designed to fix a particular problem and are meant to be chained together with another model that does the actual upscaling.
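
To make the chaining idea concrete, here is a rough (and entirely hypothetical) sketch in the same style as the earlier one: a 1x cleanup model feeds into a 2x upscaler, and a standard resize then brings the frame to the exact target size. In chaiNNer you would build the same thing by connecting one Upscale Image node into another, followed by a resize node; the model and file names below are only placeholders.

    # Hypothetical chain: 1x fixer -> 2x upscaler -> exact-size resize. Illustrative only.
    import numpy as np
    import torch
    from PIL import Image
    from spandrel import ModelLoader

    fix_1x = ModelLoader().load_from_file("1x_Dotzilla_Compact.pth")   # cleans dot crawl/rainbows
    up_2x = ModelLoader().load_from_file("2x_LD-Anime_Compact.pth")    # does the actual upscale
    fix_1x.cuda().eval()
    up_2x.cuda().eval()

    img = Image.open("frame.png").convert("RGB")                       # e.g. a 640x480 frame
    x = torch.from_numpy(np.array(img)).float().div(255)
    x = x.permute(2, 0, 1).unsqueeze(0).cuda()

    with torch.no_grad():
        x = fix_1x(x)                 # 1x: same resolution, artifacts repaired
        x = up_2x(x).clamp(0, 1)      # 2x: 640x480 -> 1280x960

    out = (x.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255).astype(np.uint8)
    Image.fromarray(out).resize((1440, 1080), Image.Resampling.LANCZOS).save("frame_final.png")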


With that out of the way, here are a few models that I recommend you try, in no particular order. There are lots of other good ones, so feel free to try out others as well.

https://openmodeldb.info


Recommended Models


Model Name - Description

(2x) Futsuu_Anime - Upscales while doing some sharpening and line darkening. Can also clean up some minor artifacts of various types. Pretty good general purpose model.

(2x) AnimeClassics UltraLite - Handles rainbows, dot crawl, and MPEG/H.264 compression, and may even help remove halos and fix blurriness in certain cases. Best when used on old anime that is grainy.

(2x) LD-Anime_Compact - Upscales while fixing numerous video problems, including: noise/grain, compression artifacts, rainbows, dot crawl, halos and color bleed. Can over-smooth some textures though.

(2x) Digitoon Lite - Meant as a versatile model for upscaling high detail digital anime and cartoons. Has debanding, MPEG-2 correction, and halo reduction.

(1x) HurrDeblur SuperUltraCompact - Very fast 1x sharpening/deblurring model.

(1x) AnimeUndeint Compact - Corrects jagged lines on animation that has been deinterlaced.

(1x) BleedOut Compact - Helps repair color bleed and heavy chroma noise that may be present on some older footage.

(1x) Dotzilla Compact - Wipes out dot crawl and rainbows in animation.

Of course, if none of these seem to work well for your source, feel free to try out any other models in the model database to see if there are any that work better for you.

If you don’t have an NVIDIA GPU

Most models are distributed as .pth (PyTorch) files. These work best with NVIDIA GPUs.

If you have an AMD GPU, you will get best results by converting the model to NCNN format. You can do this as follows:


Select the orange Load Model node (orange indicates that you are loading a PyTorch model), then the orange Convert To NCNN node, then the pink Save Model node, and connect them in that order.

If you do not have any NCNN or ONNX nodes, go to the dependency manager in the top right of the application window, and make sure all of the dependencies are installed.

In the Convert To NCNN node, first try selecting fp16 as the data type. This will be faster than fp32. If for some reason the model doesn’t work, you can try again using the fp32 option.

Running this chain will convert the model into the NCNN format, which is optimized for AMD graphics cards. You can then follow the upscaling guide above, but wherever it tells you to use an orange node, use the pink nodes instead.


If you are on Mac, ChaiNNer currently only supports CPU processing mode, which is likely too slow to be usable. Better support for Mac should hopefully arrive in the future.