.. _sec_ndarray:

Data Manipulation
=================

In order to get anything done, we need some way to store and manipulate data. Generally, there are two important things we need to do with data: (i) acquire them; and (ii) process them once they are inside the computer. There is no point in acquiring data without some way to store them, so let us get our hands dirty first by playing with synthetic data. To start, we introduce the :math:`n`-dimensional array, which is also called the *tensor*.

If you have worked with NumPy, the most widely used scientific computing package in Python, then you will find this section familiar. No matter which framework you use, its *tensor class* (``ndarray`` in MXNet, ``Tensor`` in both PyTorch and TensorFlow) is similar to NumPy's ``ndarray`` with a few killer features. First, the tensor class supports GPUs to accelerate computation, whereas NumPy only supports CPU computation. Second, the tensor class supports automatic differentiation. These properties make the tensor class suitable for deep learning. Throughout the book, when we say tensors, we are referring to instances of the tensor class unless otherwise stated.

Getting Started
---------------

In this section, we aim to get you up and running, equipping you with the basic math and numerical computing tools that you will build on as you progress through the book. Do not worry if you struggle to grok some of the mathematical concepts or library functions. The following sections will revisit this material in the context of practical examples and it will sink in. On the other hand, if you already have some background and want to go deeper into the mathematical content, you can skip this section.


In MXNet, we start by importing the ``np`` (``numpy``) and ``npx`` (``numpy_extension``) modules. Here, the ``np`` module includes functions supported by NumPy, while the ``npx`` module contains a set of extensions developed to empower deep learning within a NumPy-like environment. When using tensors, we almost always invoke the ``set_np`` function, which keeps tensor processing compatible with the other components of MXNet.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

   from mxnet import np, npx
   npx.set_np()


In PyTorch, we start by importing ``torch``. Note that although the framework is called PyTorch, the package to import is ``torch``, not ``pytorch``.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

   import torch


In TensorFlow, we start by importing ``tensorflow``. As the name is a little long, we often import it under the short alias ``tf``.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

   import tensorflow as tf


A tensor represents a (possibly multi-dimensional) array of numerical values. With one axis, a tensor is called a *vector*. With two axes, a tensor is called a *matrix*. With :math:`k > 2` axes, we drop the specialized names and just refer to the object as a :math:`k^\mathrm{th}` *order tensor*.
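As a minimal sketch of this terminology, the snippet below builds arrays with one, two, and three axes and prints the number of axes and the shape of each. It uses plain NumPy as a stand-in, relying on the earlier statement that each framework's tensor class is similar to NumPy's ``ndarray``; the variable names are illustrative only and are not part of any framework.

.. code:: python

   import numpy as np

   # One axis: a vector with 4 elements.
   vector = np.arange(4)

   # Two axes: a matrix with 3 rows and 4 columns.
   matrix = np.arange(12).reshape(3, 4)

   # Three axes: a 3rd-order tensor (no specialized name).
   tensor3 = np.arange(24).reshape(2, 3, 4)

   # `ndim` reports the number of axes; `shape` gives the length along each axis.
   for name, x in [("vector", vector), ("matrix", matrix), ("tensor3", tensor3)]:
       print(name, "axes:", x.ndim, "shape:", x.shape)

The same idea should carry over to the MXNet, PyTorch, and TensorFlow tensor classes, although the exact attribute and function names may differ slightly between frameworks.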