I trained an EfficientNet model in PyTorch to classify more than ten thousand categories of birds. To run the model on mobile devices, I first built an iOS app based on the PyTorch iOS demo and confirmed that it ran well, then built an Android app based on the PyTorch Android demo. But after I replaced the demo model with my own, the Android app printed every result as ‘NaN’.
To track down the cause step by step, I tested a few widely used deep learning models:

| Model Name   | Result on Android demo |
|--------------|------------------------|
| EfficientNet | NaN                    |
| ResNet-50    | Normal                 |
| ResNet-101   | Normal                 |
| RegNet+SE    | NaN                    |
| RegNet       | Normal                 |

It seems that the only difference between the ‘NaN’ models and the ‘Normal’ ones is the Squeeze-Excitation (SE) module. But the Squeeze-Excitation module is quite simple. It’s just:

        self.f_ex = nn.Sequential(
            nn.Conv2d(w_in, w_se, 1, bias=True),
            nn.ReLU(inplace=cfg.MEM.RELU_INPLACE),
            nn.Conv2d(w_se, w_in, 1, bias=True),
            nn.Sigmoid(),
        )

Aha, I spotted nn.Sigmoid. It’s the only layer here that ResNet doesn’t use.
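To see where that sigmoid sits in the data flow, here is a minimal pure-Python sketch of an SE gate. It assumes the standard SE design (global average pool, then the two 1x1 convolutions, then a channel-wise multiply); only the two convolutions and the activations come from the snippet above. The helpers `linear` and `se_gate` and the toy weights are made up for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def linear(weights, bias, vec):
    # A 1x1 convolution applied to a 1x1 feature map is a matrix-vector product.
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def se_gate(feature_map, w1, b1, w2, b2):
    """feature_map: list of channels, each a flat list of spatial values."""
    # Squeeze: global average pool reduces each channel to one number.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excite: reduce (w_in -> w_se), ReLU, expand (w_se -> w_in), sigmoid.
    hidden = [max(0.0, h) for h in linear(w1, b1, squeezed)]
    gates = [sigmoid(g) for g in linear(w2, b2, hidden)]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_map)]
    return gates, scaled

# Toy input: 2 channels, 2 spatial positions each (hypothetical numbers).
feature_map = [[1.0, 3.0], [2.0, 2.0]]
w1, b1 = [[1.0, 1.0]], [0.0]             # w_in=2 -> w_se=1
w2, b2 = [[0.0], [1.0]], [0.0, -4.0]     # w_se=1 -> w_in=2

gates, scaled = se_gate(feature_map, w1, b1, w2, b2)
print(gates)   # one gate per channel, each in (0, 1)
print(scaled)
```

Every activation in the block is multiplied by a sigmoid output, so a single NaN gate would turn a whole channel into NaN and spread from there to the final scores.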
Let’s dive into the sigmoid function:

$latex \sigma(x) = \frac{1}{1 + e^{-x}}$

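It looks like the $latex e^{-x}$ term is the problem: it overflows when $latex x$ is a large negative number (in 32-bit floats, below roughly -88) and underflows to zero when $latex x$ is large and positive. An implementation that mishandles those extremes could produce NaN.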
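The sketch below shows how a sigmoid can turn into NaN at extreme inputs. It illustrates the failure mode and is not a diagnosis of what the Android runtime actually does, which I haven’t verified. Python floats are 64-bit, so overflow starts near 709 rather than near 88, and `exp_ieee` is a hypothetical helper that returns infinity on overflow the way IEEE-754 hardware does, instead of raising as `math.exp` does.

```python
import math

INF = float("inf")

def exp_ieee(x):
    # math.exp raises OverflowError; mimic IEEE-754 by returning +inf instead.
    try:
        return math.exp(x)
    except OverflowError:
        return INF

def sigmoid_inverse_form(x):
    # 1 / (1 + e^-x): for very negative x, e^-x is inf and 1 / inf is 0.0.
    return 1.0 / (1.0 + exp_ieee(-x))

def sigmoid_ratio_form(x):
    # e^x / (1 + e^x): for very positive x this is inf / inf, which is NaN.
    e = exp_ieee(x)
    return e / (1.0 + e)

def sigmoid_stable(x):
    # Only exponentiate non-positive numbers, so exp stays within [0, 1].
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

for x in (-1000.0, 0.0, 1000.0):
    print(x, sigmoid_inverse_form(x), sigmoid_ratio_form(x), sigmoid_stable(x))
```

The two naive forms are mathematically identical, but only the ratio form yields NaN here, so whether a given backend misbehaves depends on which form (or which approximation) its sigmoid kernel uses.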

That still leaves one question: why does ‘NaN’ appear in the Android app but not in the iOS app? Perhaps it has to do with the JVM or the emulator that Android Studio uses.