In MXNet:

.. code:: python

   loss = gluon.loss.SoftmaxCrossEntropyLoss()
In PyTorch:

.. code:: python

   loss = nn.CrossEntropyLoss(reduction='none')
In TensorFlow:

.. code:: python

   loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
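Both the PyTorch and TensorFlow losses above consume unnormalized logits
rather than probabilities, and the PyTorch version returns one loss per
example because of ``reduction='none'``. As a quick sanity check, here is a
minimal PyTorch sketch; the tensors are made up for illustration:

.. code:: python

   import torch
   from torch import nn

   loss = nn.CrossEntropyLoss(reduction='none')

   # Two made-up examples with three classes; rows are unnormalized logits
   y_hat = torch.tensor([[2.0, 0.5, -1.0],
                         [0.1, 0.2, 3.0]])
   y = torch.tensor([0, 2])  # true class indices

   l = loss(y_hat, y)
   print(l.shape)   # torch.Size([2]): one loss value per example
   print(l.mean())  # averaging recovers the usual scalar minibatch loss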
Optimization Algorithm
----------------------
Here we use minibatch stochastic gradient descent with a learning rate
of 0.1 as the optimization algorithm. This is the same optimizer that we
applied in the linear regression example, and it illustrates the general
applicability of the optimizers.
In MXNet:

.. code:: python

   trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
In PyTorch:

.. code:: python

   trainer = torch.optim.SGD(net.parameters(), lr=0.1)
In TensorFlow:

.. code:: python

   trainer = tf.keras.optimizers.SGD(learning_rate=0.1)
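Because the training loop treats the optimizer as a black box, swapping in
a different one touches only this single line. A minimal PyTorch sketch;
Adam and its learning rate here are illustrative choices, not what this
section uses:

.. code:: python

   # Minibatch SGD with a learning rate of 0.1, as in this section
   trainer = torch.optim.SGD(net.parameters(), lr=0.1)

   # Swapping in another optimizer changes only this line;
   # the rest of the training code stays identical
   trainer = torch.optim.Adam(net.parameters(), lr=0.001)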
Training
--------
Next we call the training function defined in
:numref:`sec_softmax_scratch` to train the model.
In MXNet:

.. code:: python

   num_epochs = 10
   d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)

.. figure:: output_softmax-regression-concise_75d138_51_0.svg
In PyTorch:

.. code:: python

   num_epochs = 10
   d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)

.. figure:: output_softmax-regression-concise_75d138_54_0.svg
In TensorFlow:

.. code:: python

   num_epochs = 10
   d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)

.. figure:: output_softmax-regression-concise_75d138_57_0.svg
As before, this algorithm converges to a solution that achieves a decent
accuracy, albeit this time with far fewer lines of code.
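With the model trained, we can check it the same way as in the from-scratch
version. A short sketch, assuming the helpers ``evaluate_accuracy`` and
``predict_ch3`` from :numref:`sec_softmax_scratch` are available in the
``d2l`` package:

.. code:: python

   # Accuracy over the whole test set
   print(f'test accuracy: {d2l.evaluate_accuracy(net, test_iter):.3f}')

   # Compare predicted and true labels for a few test images
   d2l.predict_ch3(net, test_iter)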
Summary
-------
- Using high-level APIs, we can implement softmax regression much more
concisely.
- From a computational perspective, implementing softmax regression has
  intricacies, chiefly around numerical stability. In many cases, a deep
  learning framework takes additional precautions beyond the most
  well-known tricks to ensure numerical stability, saving us from even
  more pitfalls than we would encounter if we coded all of our models
  from scratch in practice; the sketch after this list makes this concrete.
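To make the numerical-stability point concrete, here is a minimal sketch in
plain PyTorch with made-up extreme logits, contrasting a naive softmax with
the log-sum-exp formulation that stable implementations rely on:

.. code:: python

   import torch

   logits = torch.tensor([[1000.0, -1000.0, 0.0]])  # extreme but legal logits
   y = 0  # true class index

   # Naive route: exponentiate, normalize, then take the log.
   # exp(1000) overflows to inf, so the resulting loss is nan.
   probs = torch.exp(logits) / torch.exp(logits).sum(dim=1, keepdim=True)
   print(-torch.log(probs[0, y]))  # nan

   # Stable route: subtract the row maximum before exponentiating
   # (the log-sum-exp trick), so the largest exponent is exp(0) = 1.
   m = logits.max(dim=1, keepdim=True).values
   log_probs = logits - m - (logits - m).exp().sum(dim=1, keepdim=True).log()
   print(-log_probs[0, y])  # finite (0 here, since class 0 dominates)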
Exercises
---------
1. Try adjusting the hyperparameters, such as the batch size, number of
epochs, and learning rate, to see what the results are.
2. Increase the number of epochs for training. Why might the test
accuracy decrease after a while? How could we fix this?