How to intuitively understand 1D, 2D, and 3D convolutions in convolutional neural networks - Q&A

The convolution direction and the output shape are what matter!

↑↑↑↑↑ 1D Convolutions - Basic ↑↑↑↑↑

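The convolution is computed along only 1 direction (the time axis)
input = [W], filter = [k], output = [W]
Example) input = [1,1,1,1,1], filter = [0.25,0.5,0.25], output = [1,1,1,1,1] (with zero padding, as in 'SAME', the two edge values come out as 0.75 instead)
The output shape is a 1D array
Example) graph smoothing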
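The smoothing example above can be checked with plain NumPy. This is only a minimal sketch: np.convolve stands in for the convolution layer, and the spike signal is an illustrative input that is not part of the original answer.

```python
import numpy as np

signal = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
kernel = np.array([0.25, 0.5, 0.25])  # symmetric smoothing filter from the example

# 'same' keeps the output length equal to the input length (zero padding at the ends)
same = np.convolve(signal, kernel, mode="same")
# 'valid' only keeps positions where the filter fully overlaps the input
valid = np.convolve(signal, kernel, mode="valid")

# Illustrative input (an assumption, not from the original): a single spike gets spread out
spike = np.array([0.0, 0.0, 4.0, 0.0, 0.0])
smoothed_spike = np.convolve(spike, kernel, mode="same")

print(same)            # interior values are 1.0, the two edge values are 0.75
print(valid)           # three values, all 1.0
print(smoothed_spike)  # the spike of 4 becomes 1, 2, 1 around its position
```

The filter slides along a single axis only, which is what makes this a 1D convolution regardless of how the values were produced.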

tf.nn.conv1d code example

# Written for the TF 1.x graph/session API; the compat.v1 import keeps it runnable on TF 2.x
import tensorflow.compat.v1 as tf
import numpy as np

tf.disable_v2_behavior()
sess = tf.Session()

ones_1d = np.ones(5)
weight_1d = np.ones(3)
strides_1d = 1

in_1d = tf.constant(ones_1d, dtype=tf.float32)
filter_1d = tf.constant(weight_1d, dtype=tf.float32)

in_width = int(in_1d.shape[0])
filter_width = int(filter_1d.shape[0])

# conv1d expects input [batch, width, channels] and filter [width, in_channels, out_channels]
input_1d = tf.reshape(in_1d, [1, in_width, 1])
kernel_1d = tf.reshape(filter_1d, [filter_width, 1, 1])
output_1d = tf.squeeze(tf.nn.conv1d(input_1d, kernel_1d, strides_1d, padding='SAME'))
print(sess.run(output_1d))

↑↑↑↑↑ 2D Convolutions - Basic ↑↑↑↑↑

The convolution is computed along 2 directions (x, y)
The output shape is a 2D matrix
input = [W, H], filter = [k, k], output = [W, H]
Example) Sobel edge filter
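To make the Sobel example concrete, here is a minimal NumPy sketch. The helper correlate2d_valid and the test image are hypothetical names and data introduced for illustration. The helper uses no padding, so its output is smaller than the input, unlike the [W, H] output listed above, which assumes 'SAME' padding.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image along both axes (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Horizontal-gradient Sobel kernel: responds to vertical edges
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

# Illustrative image (an assumption): dark left half, bright right half
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = correlate2d_valid(image, sobel_x)
print(edges.shape)  # (3, 4)
print(edges)        # every row is 0, 4, 4, 0: the response is nonzero only at the edge
```

The kernel moves in two directions (rows and columns), so the result is a 2D matrix.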

tf.nn.conv2d code example

ones_2d = np.ones((5,5))
weight_2d = np.ones((3,3))
strides_2d = [1, 1, 1, 1]

in_2d = tf.constant(ones_2d, dtype=tf.float32)
filter_2d = tf.constant(weight_2d, dtype=tf.float32)

# Axis 0 holds rows (height), axis 1 holds columns (width)
in_height = int(in_2d.shape[0])
in_width = int(in_2d.shape[1])

filter_height = int(filter_2d.shape[0])
filter_width = int(filter_2d.shape[1])

# conv2d expects input [batch, height, width, channels]
# and filter [height, width, in_channels, out_channels]
input_2d = tf.reshape(in_2d, [1, in_height, in_width, 1])
kernel_2d = tf.reshape(filter_2d, [filter_height, filter_width, 1, 1])

output_2d = tf.squeeze(tf.nn.conv2d(input_2d, kernel_2d, strides=strides_2d, padding='SAME'))
print(sess.run(output_2d))

↑↑↑↑↑ 3D Convolutions - Basic ↑↑↑↑↑

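The convolution is computed along 3 directions (x, y, z)
The output shape is a 3D volume
input = [W, H, L], filter = [k, k, d], output = [W, H, M]
d < L is important! It is what makes the output a volume
Example) C3D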
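Why d < L matters can be seen from the output shape. The sketch below is a naive NumPy version under stated assumptions: no padding, stride 1, depth as the last axis to match the [W, H, L] notation, and a hypothetical helper named correlate3d_valid. Under those assumptions the output depth is M = L - d + 1, so a filter as deep as the input leaves a depth of 1.

```python
import numpy as np

def correlate3d_valid(volume, kernel):
    """Slide the kernel along all three axes (no padding, stride 1)."""
    out_shape = tuple(v - k + 1 for v, k in zip(volume.shape, kernel.shape))
    out = np.zeros(out_shape)
    for idx in np.ndindex(*out_shape):
        # Window of the same size as the kernel, starting at idx
        window = volume[tuple(slice(i, i + k) for i, k in zip(idx, kernel.shape))]
        out[idx] = np.sum(window * kernel)
    return out

volume = np.ones((5, 5, 5))  # [W, H, L] with L = 5

shallow = correlate3d_valid(volume, np.ones((3, 3, 3)))     # d = 3 < L
full_depth = correlate3d_valid(volume, np.ones((3, 3, 5)))  # d = 5 = L

print(shallow.shape)     # (3, 3, 3): the filter can still move along depth
print(full_depth.shape)  # (3, 3, 1): no room to move along depth, effectively a 2D map
```

With d = L the filter can only slide in x and y, which is the 2D case again; only d < L leaves a third direction to slide along.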

tf.nn.conv3d code example

ones_3d = np.ones((5,5,5))
weight_3d = np.ones((3,3,3))
strides_3d = [1, 1, 1, 1, 1]

in_3d = tf.constant(ones_3d, dtype=tf.float32)
filter_3d = tf.constant(weight_3d, dtype=tf.float32)

# Axis order matches the reshape below: depth, height, width
in_depth = int(in_3d.shape[0])
in_height = int(in_3d.shape[1])
in_width = int(in_3d.shape[2])

filter_depth = int(filter_3d.shape[0])
filter_height = int(filter_3d.shape[1])
filter_width = int(filter_3d.shape[2])

# conv3d expects input [batch, depth, height, width, channels]
# and filter [depth, height, width, in_channels, out_channels]
input_3d = tf.reshape(in_3d, [1, in_depth, in_height, in_width, 1])
kernel_3d = tf.reshape(filter_3d, [filter_depth, filter_height, filter_width, 1, 1])

output_3d = tf.squeeze(tf.nn.conv3d(input_3d, kernel_3d, strides=strides_3d, padding='SAME'))
print(sess.run(output_3d))
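For readers without TensorFlow at hand, the three toy examples can be cross-checked with a small NumPy sketch. The helper same_correlate is a hypothetical name; it assumes zero padding, stride 1, and odd filter sizes, which is how 'SAME' padding behaves for these inputs. It is not how TensorFlow implements the op.

```python
import itertools
import numpy as np

def same_correlate(x, k):
    """Zero-padded, stride-1 sliding-window sum for odd-sized kernels in any dimension."""
    pads = [(s // 2, s // 2) for s in k.shape]
    xp = np.pad(x, pads, mode="constant")
    out = np.zeros(x.shape, dtype=float)
    # Accumulate one shifted copy of the padded input per kernel element
    for offset in itertools.product(*(range(s) for s in k.shape)):
        window = tuple(slice(o, o + n) for o, n in zip(offset, x.shape))
        out += k[offset] * xp[window]
    return out

out_1d = same_correlate(np.ones(5), np.ones(3))
out_2d = same_correlate(np.ones((5, 5)), np.ones((3, 3)))
out_3d = same_correlate(np.ones((5, 5, 5)), np.ones((3, 3, 3)))

print(out_1d)           # 2 at the ends, 3 in the interior
print(out_2d)           # 4 at corners, 6 on edges, 9 in the interior
print(out_3d[2, 2, 2])  # 27.0 at the center, where the filter overlaps 3*3*3 ones
print(out_3d[0, 0, 0])  # 8.0 at a corner, where only 2*2*2 ones are covered
```

Each output value simply counts how many input cells the all-ones filter covers at that position, so the values drop near the borders where the filter hangs over the zero padding.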

Input & Output in Tensorflow

Reposted from: https://stackoverflow.com/a/44628011/15018571

Updated 2021-01-18 17:20

阿托 • 17013
