How to compute FLOPs (floating-point operation counts) in tensorflow and pytorch with Python

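1. Introduction

FLOPs is short for floating point operations, i.e. the number of floating-point operations, and can be used to measure the computational complexity of a model or algorithm (it should not be confused with FLOPS, floating-point operations per second, which measures hardware throughput). This article shows how to compute a model's FLOPs with the available tooling in tensorflow 1.x, tensorflow 2.x, and pytorch.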

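2. Model structure

For ease of explanation, we first build a simple neural network. Its structure and main parameters are listed in Table 1.

Table 1. Model structure and main parameters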

Layer	Channels	Kernel	Strides	Units	Activation
Conv2D	32	(4,4)	(1,2)	\	relu
GRU	\	\	\	96	\
Dense	\	\	\	256	sigmoid

The tensorflow implementation of this model (in practice, using the keras module bundled with tensorflow) is:

from tensorflow.keras.layers import *
from tensorflow.keras.models import load_model, Model
def test_model_tf(Input_shape):
    # shape: [B, C, T, F]
    main_input = Input(batch_shape=Input_shape, name='main_inputs')
    conv = Conv2D(32, kernel_size=(4, 4), strides=(1, 2), activation='relu', data_format='channels_first', name='conv')(main_input)
    # shape: [B, T, FC]
    gru = Reshape((conv.shape[2], conv.shape[1] * conv.shape[3]))(conv)
    gru = GRU(units=96, reset_after=True, return_sequences=True, name='gru')(gru)
    output = Dense(256, activation='sigmoid', name='output')(gru)
    model = Model(inputs=[main_input], outputs=[output])
    return model

The pytorch implementation of this model is:

import torch
import torch.nn as nn
class test_model_torch(nn.Module):
    def __init__(self):
        super(test_model_torch, self).__init__()
        self.conv2d = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=(4,4), stride=(1,2))
        self.relu = nn.ReLU()
        # batch_first=True because forward() feeds the GRU a [B, T, FC] tensor
        self.gru = nn.GRU(input_size=4064, hidden_size=96, batch_first=True)
        self.fc = nn.Linear(96, 256)
        self.sigmoid = nn.Sigmoid()
    def forward(self, inputs):
        # shape: [B, C, T, F]
        out = self.conv2d(inputs)
        out = self.relu(out)
        # shape: [B, T, FC]
        batch, channel, frame, freq = out.size()
        out = torch.reshape(out, (batch, frame, freq*channel))
        out, _ = self.gru(out)
        out = self.fc(out)
        out = self.sigmoid(out)
        return out
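
The value input_size=4064 in the GRU above follows from the Conv2D output shape. As a minimal sketch, assuming no padding (the default in both code listings) and the input shape (1, 1, 100, 256) used later in this article, the sizes can be checked as follows. The helper name conv_output_size is hypothetical and is not part of either framework.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Output length of one convolution axis (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

# Input is [B, C, T, F] = [1, 1, 100, 256]; kernel (4, 4), strides (1, 2).
out_t = conv_output_size(100, kernel=4, stride=1)  # time axis
out_f = conv_output_size(256, kernel=4, stride=2)  # frequency axis
gru_input_size = 32 * out_f                        # channels * frequency bins

print(out_t, out_f, gru_input_size)
```

With these settings the GRU sees 97 time steps, each carrying 32 * 127 = 4064 features.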

3. Computing the model's FLOPs

The specific versions discussed in this section are tensorflow 1.12.0, tensorflow 2.3.1, and pytorch 1.10.1+cu102.

3.1. tensorflow 1.12.0

In a tensorflow 1.12.0 environment, the model's FLOPs can be computed with the following code:

import tensorflow as tf
import tensorflow.keras.backend as K
def get_flops(model):
    run_meta = tf.RunMetadata()
    opts = tf.profiler.ProfileOptionBuilder.float_operation()
    flops = tf.profiler.profile(graph=K.get_session().graph,
                                run_meta=run_meta, cmd='op', options=opts)
    return flops.total_float_ops
if __name__ == "__main__":
    x = K.random_normal(shape=(1, 1, 100, 256))
    model = test_model_tf(x.shape)
    print('FLOPs of tensorflow 1.12.0:', get_flops(model))

3.2. tensorflow 2.3.1

In a tensorflow 2.3.1 environment, the model's FLOPs can be computed with the following code:

import tensorflow.compat.v1 as tf
import tensorflow.compat.v1.keras.backend as K
tf.disable_eager_execution()
def get_flops(model):
    run_meta = tf.RunMetadata()
    opts = tf.profiler.ProfileOptionBuilder.float_operation()
    flops = tf.profiler.profile(graph=K.get_session().graph,
                                run_meta=run_meta, cmd='op', options=opts)
    return flops.total_float_ops
if __name__ == "__main__":
    x = K.random_normal(shape=(1, 1, 100, 256))
    model = test_model_tf(x.shape)
    print('FLOPs of tensorflow 2.3.1:', get_flops(model))

3.3. pytorch 1.10.1+cu102

In a pytorch 1.10.1+cu102 environment, the model's FLOPs can be computed with the following code (thop must be installed):

import thop
x = torch.randn(1, 1, 100, 256)
model = test_model_torch()
flops, _ = thop.profile(model, inputs=(x,))
print('FLOPs of pytorch 1.10.1:', flops * 2)

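Note that thop returns MACs (multiply-accumulate operations). One MAC is conventionally counted as two floating-point operations (one multiply and one add), so FLOPs ≈ 2 × MACs, which is why the code above multiplies by 2.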
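
To make the factor of 2 concrete, the sketch below counts the operations of a naive matrix-vector product by hand. It is only an illustration of the convention: the function matvec_op_counts is hypothetical, and the 256 x 96 size is chosen to match the Dense layer of the test model (bias ignored).

```python
def matvec_op_counts(rows, cols):
    """Operation counts for a naive (rows x cols) matrix times vector product."""
    macs = rows * cols               # one multiply-accumulate per weight
    multiplies = rows * cols         # one multiply per weight
    additions = rows * (cols - 1)    # summing cols products takes cols - 1 adds
    return macs, multiplies + additions

macs, flops = matvec_op_counts(256, 96)
print(macs, flops, 2 * macs)
```

The exact count comes out slightly below 2 × MACs because the first product of each row needs no addition; the factor of 2 is the usual approximation.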

3.4. Comparison of results

The FLOPs reported by the three setups are:

tensorflow 1.12.0: [result screenshot not preserved]

tensorflow 2.3.1: [result screenshot not preserved]

pytorch 1.10.1: [result screenshot not preserved]

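The tensorflow 1.12.0 and tensorflow 2.3.1 results are roughly of the same order of magnitude, but both differ greatly from the pytorch 1.10.1 result. However, if the model is reduced to just the first Conv2D layer, all three report the same FLOPs. I therefore infer that the difference comes mainly from how the GRU's FLOPs are counted. If any reader knows the details, I would appreciate hearing them.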
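
As a cross-check on the GRU term, here is a rough hand estimate. It is a sketch under stated assumptions, not the counting rule of either profiler: it counts only the matrix multiplications of the three GRU gates, ignores biases, elementwise operations, and activations, and assumes 97 time steps with 4064 input features (the Conv2D output for a (1, 1, 100, 256) input). The function name gru_macs_per_step is hypothetical.

```python
def gru_macs_per_step(input_size, hidden_size):
    """Matrix-multiply MACs for one GRU time step (3 gates, biases ignored)."""
    # Each gate multiplies the input by an (input_size x hidden_size) matrix
    # and the previous state by a (hidden_size x hidden_size) matrix.
    return 3 * hidden_size * (input_size + hidden_size)

time_steps = 97      # assumed: T=100, kernel 4, stride 1, no padding
input_size = 4064    # assumed: 32 channels * 127 frequency bins
hidden_size = 96

macs = time_steps * gru_macs_per_step(input_size, hidden_size)
flops = 2 * macs     # usual convention: 1 MAC ~ 2 FLOPs
print('estimated GRU MACs:', macs)
print('estimated GRU FLOPs:', flops)
```

Comparing this estimate with each tool's GRU figure may help locate the gap. One possible cause, which I have not verified for these versions, is that a static graph profiler counts the operations in the recurrent loop body once rather than once per time step.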

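4. Summary

This article showed how to compute a model's FLOPs with the available tooling in tensorflow 1.x, tensorflow 2.x, and pytorch. For the test model used here, however, the figures reported by tensorflow and pytorch differ greatly. Alternatively, the FLOPs of each layer can be derived by hand from the layer type and its parameters.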
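
As an illustration of such a hand derivation, the sketch below estimates the Conv2D and Dense layers of Table 1. It assumes a (1, 1, 100, 256) input, no padding, and a Dense layer applied at each of the 97 time steps; biases and activations are ignored, and the helper names conv2d_macs and dense_macs are hypothetical.

```python
def conv2d_macs(out_h, out_w, in_ch, out_ch, kernel_h, kernel_w):
    """MACs of a Conv2D layer: one kernel window per output element."""
    return out_h * out_w * out_ch * in_ch * kernel_h * kernel_w

def dense_macs(steps, in_features, out_features):
    """MACs of a Dense layer applied independently at every time step."""
    return steps * in_features * out_features

# Conv2D: 1 -> 32 channels, kernel (4, 4), output 97 x 127 (assumed shapes)
conv_macs = conv2d_macs(97, 127, in_ch=1, out_ch=32, kernel_h=4, kernel_w=4)
# Dense: 96 -> 256 units over 97 time steps
fc_macs = dense_macs(97, 96, 256)

print('Conv2D FLOPs (approx.):', 2 * conv_macs)
print('Dense FLOPs (approx.):', 2 * fc_macs)
```

These per-layer numbers can be compared against the per-op breakdown printed by each tool to see where the counts diverge.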
