February 17, 2017
TensorFlow Debugger (tfdbg) is a tool that makes debugging
of machine learning (ML) models in TensorFlow easier.
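In a TensorFlow program, training and inference on the graph run through the
Session.run() method. To a standard Python debugger, a Session.run()
call is effectively a single statement and does not
expose the running graph's internal structure (nodes and their connections) and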
state (output arrays or tensors of the nodes). Lower-level debuggers
such as gdb cannot organize stack
frames and variable values in a way relevant to TensorFlow graph operations. A
specialized runtime debugger has been among the most frequently raised feature
requests from TensorFlow users.
tfdbg addresses this runtime debugging need. Let's see tfdbg in action with a
short snippet of code that sets up and runs a simple TensorFlow graph to fit a
simple linear equation through gradient descent.
    import numpy as np
    import tensorflow as tf
    import tensorflow.python.debug as tf_debug

    # Training inputs: 100 evenly spaced points.
    xs = np.linspace(-0.5, 0.49, 100)

    x = tf.placeholder(tf.float32, shape=[None], name="x")
    y = tf.placeholder(tf.float32, shape=[None], name="y")
    k = tf.Variable([0.0], name="k")  # slope to be learned
    y_hat = tf.multiply(k, x, name="y_hat")
    # Sum of squared errors between targets and predictions.
    sse = tf.reduce_sum((y - y_hat) * (y - y_hat), name="sse")
    train_op = tf.train.GradientDescentOptimizer(learning_rate=0.02).minimize(sse)

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    # Wrap the session so that each run() call launches the tfdbg CLI.
    sess = tf_debug.LocalCLIDebugWrapperSession(sess)
    for _ in range(10):
        sess.run(train_op, feed_dict={x: xs, y: 42 * xs})
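For readers who want to see what the graph computes without TensorFlow, here is a rough pure-Python sketch of the same optimization. It assumes plain gradient descent on the sum of squared errors with the same learning rate, inputs, and step count as the snippet above. The helper names `linspace` and `fit_slope` are made up for this illustration and are not part of TensorFlow or tfdbg.

```python
# Pure-Python sketch (assumption: plain gradient descent on
# sse = sum((y - k * x) ** 2), mirroring the TensorFlow snippet).

def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + step * i for i in range(num)]


def fit_slope(xs, ys, learning_rate=0.02, steps=10):
    """Fit y = k * x by gradient descent; return final k and the (k, sse) history."""
    k = 0.0
    history = []
    for _ in range(steps):
        residuals = [y - k * x for x, y in zip(xs, ys)]
        sse = sum(r * r for r in residuals)
        history.append((k, sse))
        # d(sse)/dk = -2 * sum(x * (y - k * x))
        grad = -2.0 * sum(x * r for x, r in zip(xs, residuals))
        k -= learning_rate * grad
    return k, history


xs = linspace(-0.5, 0.49, 100)
ys = [42 * x for x in xs]
k, history = fit_slope(xs, ys)
print("learned slope:", k)
```

After ten steps the learned slope has moved from 0 most of the way toward the true value of 42, and the recorded loss shrinks at every step. These are the kinds of intermediate values that tfdbg lets you inspect inside the real graph.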
The session object is wrapped in a debugging class
(LocalCLIDebugWrapperSession), so calling the run() method launches the
command-line interface (CLI) of tfdbg. Using mouse clicks or commands,
you can proceed through the successive run calls, inspect the graph's nodes and
their attributes, and view the complete execution history of all
relevant nodes in the graph through the list of intermediate tensors. With
the invoke_stepper command, you can let the Session.run() call execute in
"stepper mode", in which you step to nodes of your choice, observe and modify
their outputs, and then continue stepping, much as you would when debugging a
procedural language (e.g., in gdb or pdb).
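To make the idea of stepping through a graph concrete, here is a toy sketch in plain Python. It is not how tfdbg is implemented: `ToyStepper`, its methods, and the scalar "nodes" are all hypothetical names invented for this illustration. It only mimics the workflow described above of executing one node at a time, looking at its output, and optionally overriding that output before continuing.

```python
class ToyStepper:
    """Hypothetical stepper over a tiny graph of scalar-valued nodes.

    `nodes` is a list of (name, function, input_names) in execution order.
    """

    def __init__(self, nodes, feeds):
        self._nodes = nodes
        self.values = dict(feeds)  # every value computed or fed so far
        self._next = 0             # index of the next node to execute

    def step(self):
        """Execute the next node and return (name, output)."""
        name, fn, inputs = self._nodes[self._next]
        self.values[name] = fn(*(self.values[i] for i in inputs))
        self._next += 1
        return name, self.values[name]

    def override(self, name, value):
        """Replace a node's output; later nodes will see the new value."""
        self.values[name] = value


# A scalar version of the y_hat -> squared-error computation.
nodes = [
    ("y_hat", lambda k, x: k * x, ("k", "x")),
    ("diff", lambda y, y_hat: y - y_hat, ("y", "y_hat")),
    ("sq", lambda diff: diff * diff, ("diff",)),
]
stepper = ToyStepper(nodes, {"k": 2.0, "x": 3.0, "y": 10.0})

first = stepper.step()            # runs y_hat = k * x
stepper.override("y_hat", 10.0)   # pretend the prediction was perfect
second = stepper.step()           # runs diff using the overridden y_hat
third = stepper.step()            # runs sq
print(first, second, third)
```

Because the override happens between steps, the downstream nodes compute from the modified value. This is the property that makes stepping useful for testing hypotheses about where a computation goes wrong.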
With the tfdbg CLI and its conditional breakpoint support,
you can quickly identify the culprit node when bad numerical values appear in a
graph. The video below demonstrates how to
debug infinity/NaN issues in a neural network with tfdbg.
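The logic behind hunting for a culprit node can be sketched in a few lines of plain Python. This is an illustrative analogue only: `contains_inf_or_nan` and `first_bad_node` are hypothetical helpers, and the node names and values are made up. The idea is to scan node outputs in execution order and stop at the first one that contains an infinity or NaN.

```python
import math


def contains_inf_or_nan(values):
    """Return True if any float in `values` is infinite or NaN."""
    return any(math.isinf(v) or math.isnan(v) for v in values)


def first_bad_node(dumps):
    """Return the name of the first node whose output has inf/NaN, else None.

    `dumps` is a list of (node_name, list_of_floats) in execution order.
    """
    for name, values in dumps:
        if contains_inf_or_nan(values):
            return name
    return None


# Made-up outputs: taking the log of a zero probability yields -inf,
# which then contaminates the loss downstream.
dumps = [
    ("logits", [2.0, 0.0, -1.0]),
    ("probs", [1.0, 0.0, 0.0]),
    ("log_probs", [0.0, float("-inf"), float("-inf")]),
    ("loss", [float("nan")]),
]
culprit = first_bad_node(dumps)
print("first node with inf/NaN:", culprit)
```

The loss node is where the problem becomes visible, but the scan points to the earlier node where the bad value first appears. That earlier node is usually the one worth fixing.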
Compared with other debugging approaches, tfdbg requires fewer code
changes, provides more comprehensive coverage of the graph, and offers a more
interactive debugging experience. It will speed up your model development and
debugging workflows. It also offers features such as offline
debugging of dumped tensors from server environments and integration with tf.contrib.learn.
To get started, please visit this documentation.
This research paper lays out the design of tfdbg in greater detail.
The minimum required TensorFlow version for tfdbg is 0.12.1. To report bugs, please open issues on TensorFlow's GitHub
Issues page. For general usage help, please post questions on Stack Overflow
using the tag tensorflow.