Exploring the analog MAC operation
==================================
This example presents the non-spiking mode of the BrainScaleS-2 ASIC and
some of its characteristics. The operation of this so-called hagen mode
is explained in more detail in the matrix multiplication introduction.
In order to use the microscheduler we have to set some environment variables first:
.. include:: common_quiggeldy_setup.rst
First, we import the modules needed later:
.. code:: ipython3
%matplotlib inline
import numpy as np
import torch
import hxtorch
import matplotlib as mpl
import matplotlib.pyplot as plt
from contextlib import suppress
with suppress(IOError):
    plt.style.use("_static/matplotlibrc")
from _static.common.helpers import save_nightly_calibration
import ipywidgets as w
from functools import partial
IntSlider = partial(w.IntSlider, continuous_update=False)
The ``hxtorch`` API
-------------------
The hagen mode provides an analog multiply-accumulate (MAC) operation
that is performed on the ASIC.
**hxtorch** provides a high-level API for this operation mode that
integrates this functionality into `PyTorch <https://pytorch.org/>`__.
Several operations mirror the API of their PyTorch counterparts,
e.g. ``matmul`` for the multiplication of two matrices:
.. code:: ipython3
print(hxtorch.perceptron.matmul.__doc__)
.. parsed-literal::
:class: solution
matmul(input: at::Tensor, other: at::Tensor, num_sends: int = 1,
wait_between_events: int = 5, mock: bool = False) -> at::Tensor
Drop-in replacement for :meth:`torch.matmul` that uses HICANN-X.
The current implementation only supports ``other`` to be 1D or 2D.
:param input: First input tensor, allowed range [0, 31]
:param other: Second input tensor, allowed range: [-63, 63]
:param num_sends: How often to send the (same) input vector
:param wait_between_events: How long to wait (in FPGA cycles) between events
:returns: Resulting tensor
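The documented ranges imply that real-valued data has to be scaled and
rounded before it is passed to ``matmul``. The following is a minimal,
hardware-free sketch of such a preprocessing step. The helper
``to_hagen_range`` and its scaling strategy are hypothetical and not part
of ``hxtorch``; only the range limits are taken from the docstring above.

```python
def to_hagen_range(values, lo, hi):
    """Scale values so the largest magnitude hits the range limit,
    then round to integers and clip to [lo, hi].

    Hypothetical helper for illustration, not part of hxtorch.
    """
    limit = max(abs(lo), abs(hi))
    peak = max((abs(v) for v in values), default=0.0)
    scale = limit / peak if peak > 0 else 1.0
    quantized = [min(max(round(v * scale), lo), hi) for v in values]
    return quantized, scale


# Inputs must lie in [0, 31], weights in [-63, 63].
inputs, input_scale = to_hagen_range([0.0, 0.25, 0.75, 1.0], lo=0, hi=31)
weights, weight_scale = to_hagen_range([-0.5, 0.1, 0.5], lo=-63, hi=63)
print(inputs, weights)
```

The returned scale factors would be needed to map a result back to the
original units. Negative inputs are simply clipped to zero in this sketch.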
Before the hardware can be used, we have to allocate a connection and
load a calibration. This can be achieved using ``hxtorch.init_hardware``:
.. code:: ipython3
# download calibration and initialize hardware configuration
save_nightly_calibration('hagen_cocolist.pbin')
hxtorch.init_hardware(hxtorch.CalibrationPath('hagen_cocolist.pbin'))
This already enables us to multiply matrices using the BSS-2 accelerator:
.. code:: ipython3
M1 = torch.full((100,), 15.)
M2 = torch.full((100, 10), 21.)
hxtorch.perceptron.matmul(M1, M2)
.. parsed-literal::
:class: solution
tensor([55., 60., 59., 56., 60., 63., 57., 58., 56., 62.])
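Note that the returned values are not the integer product of the operands.
As a worked example, the effective gain implied by the output above can be
estimated as follows. The numbers are copied from the printed tensor; the
resulting gain belongs to this particular run and calibration and should
not be read as a constant of the API.

```python
# Values copied from the example output above.
outputs = [55., 60., 59., 56., 60., 63., 57., 58., 56., 62.]

# Ideal integer result of the same product: 100 rows, input 15, weight 21.
ideal = 100 * 15 * 21

mean_output = sum(outputs) / len(outputs)
gain = mean_output / ideal
spread = max(outputs) - min(outputs)

print(f"ideal result: {ideal}")
print(f"mean output:  {mean_output:.1f}")
print(f"gain:         {gain:.5f}")
print(f"spread:       {spread:.0f}")
```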
``hxtorch`` integrates the MAC operation into PyTorch on a per-operation
basis (but also supports the combination of multiple operations). Each
operation is executed just-in-time on the BrainScaleS-2 hardware.
.. image:: _static/tutorial/hxtorch_matmul.png
:width: 80%
:align: center
A decisive advantage of the matrix multiplication mode is the possibility
to decompose large operations into smaller parts and either multiplex them
in time or even distribute them among several BrainScaleS-2 ASICs:
.. image:: _static/tutorial/hxtorch_partitioning.png
:width: 80%
:align: center
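The decomposition can be illustrated without hardware. The sketch below
splits a vector-matrix product into tiles, adds partial results along the
input dimension and concatenates them along the output dimension. The tile
limits of 128 inputs and 256 outputs are assumptions chosen to match the
slider ranges used later in this tutorial, and ``small_matmul`` is a
hypothetical, exact stand-in for a single hardware run.

```python
MAX_ROWS = 128   # assumed number of inputs per run
MAX_COLS = 256   # assumed number of outputs per run


def small_matmul(vector, matrix):
    """Stand-in for one hardware run: exact vector-matrix product."""
    assert len(vector) <= MAX_ROWS and len(matrix[0]) <= MAX_COLS
    return [sum(v * row[c] for v, row in zip(vector, matrix))
            for c in range(len(matrix[0]))]


def partitioned_matmul(vector, matrix):
    """Split a large product into tiles and combine the tile results."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    result = [0] * n_cols
    for r in range(0, n_rows, MAX_ROWS):        # row tiles: partial sums
        for c in range(0, n_cols, MAX_COLS):    # column tiles: concatenation
            tile = [row[c:c + MAX_COLS] for row in matrix[r:r + MAX_ROWS]]
            partial = small_matmul(vector[r:r + MAX_ROWS], tile)
            for offset, value in enumerate(partial):
                result[c + offset] += value
    return result


vector = [i % 32 for i in range(300)]
matrix = [[(i + j) % 127 - 63 for j in range(600)] for i in range(300)]
reference = [sum(v * row[c] for v, row in zip(vector, matrix))
             for c in range(600)]
print(partitioned_matmul(vector, matrix) == reference)
```

The exact equality holds only for this idealized stand-in. On the analog
substrate, each tile result would carry its own noise and may saturate
before the partial sums are added.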
Noise and fixed-pattern deviations
----------------------------------
Despite calibration, and even with identical inputs and weights, the
outputs of the different neurons are not identical. On the one hand,
each output is subject to statistical noise due to the analog nature of
the neuron. On the other hand, fixed-pattern deviations show up between
the individual neurons. Especially for small inputs, a spatial
correlation may also become apparent, resulting from the different
distances to the synapse drivers.
.. code:: ipython3
# prepare output figure
neurons = torch.arange(0, 256)
slices = [slice(0, 128), slice(128, 256)]
fig, axes = plt.subplots(1, 2, sharey=True)
for ax, s in zip(axes, slices):
    ax.plot(neurons[s], torch.zeros_like(neurons[s]), ".", c="C0")
    ax.set_xlim(s.start, s.stop); ax.set_ylim(-130, 130)
    ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(32))
    ax.set_xlabel("neuron #"); ax.set_ylabel("output"); ax.label_outer()
# mirror the left panel so that both halves are drawn facing each other
axes[0].invert_xaxis()
plt.close()
output = w.Output()
@w.interact(
    num_sends=IntSlider(100, 1, 256, description="num sends"),
    input_value=IntSlider(12, 0, 31, description="input value"),
    weight_value=IntSlider(21, -63, 63, description="weight value"),
    row_number=IntSlider(0, 0, 127, description="row number"),
)
def experiment(num_sends, input_value, weight_value, row_number):
    """ Updates the plot with the outputs from the hardware """
    result = hxtorch.perceptron.matmul(
        torch.tensor([0.] * row_number + [input_value], dtype=torch.float),
        torch.full((row_number + 1, 256), weight_value, dtype=torch.float),
        num_sends=num_sends)
    for ax, s in zip(axes, slices):
        ax.lines[0].set_ydata(result[s])
    output.clear_output(wait=True)
    with output:
        display(fig)
experiment(100, 12, 21, 0)  # needed for testing
display(output)
.. image:: _static/tutorial/hagen_properties_fig1.png
:width: 90%
:align: center
:class: solution
.. image:: _static/tutorial/hagen_properties_sliders1.png
:width: 300px
:class: solution
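The two effects described in this section can be told apart by repeating
the same measurement: noise varies from trial to trial, whereas
fixed-pattern deviations persist in the per-neuron mean. The sketch below
demonstrates this separation on simulated data. The toy model (per-neuron
gain and offset plus Gaussian noise) and all of its parameters are
assumptions for illustration, not measured chip properties.

```python
import random
import statistics

random.seed(0)
N_NEURONS, N_TRIALS = 8, 200

# Toy model (an assumption, not measured data): each neuron has a fixed
# gain and offset, plus independent Gaussian noise on every trial.
gains = [random.gauss(1.0, 0.05) for _ in range(N_NEURONS)]
offsets = [random.gauss(0.0, 2.0) for _ in range(N_NEURONS)]
NOISE_STD = 1.5
IDEAL = 50.0

trials = [[g * IDEAL + o + random.gauss(0.0, NOISE_STD)
           for g, o in zip(gains, offsets)]
          for _ in range(N_TRIALS)]

# Per-neuron statistics over the repeated trials.
per_neuron = list(zip(*trials))
means = [statistics.fmean(values) for values in per_neuron]
temporal_noise = [statistics.stdev(values) for values in per_neuron]

# The spread of the per-neuron means estimates the fixed-pattern deviation.
fixed_pattern = statistics.stdev(means)

print("temporal noise per neuron:", [round(s, 2) for s in temporal_noise])
print("fixed-pattern deviation:  ", round(fixed_pattern, 2))
```

On hardware, ``trials`` would instead be filled by repeated calls to
``hxtorch.perceptron.matmul`` with identical arguments.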
Linearity of the MAC operation
------------------------------
The next plot shows the linear relationship between input, weight and
output. For this purpose, a constant input is multiplied by a linearly
increasing weight vector.
.. code:: ipython3
weight = torch.arange(-63, 64.).repeat_interleave(2)
# prepare output figure
fig, ax = plt.subplots(1, 1)
ax.plot(weight, torch.zeros_like(weight), ".", c="C0")
ax.set_xlim(-64, 64); ax.set_ylim(-130, 130)
ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(16))
ax.set_xlabel("weight"); ax.set_ylabel("output")
plt.close()
output = w.Output()
@w.interact(
    num_sends=IntSlider(100, 1, 256, description="num sends"),
    input_value=IntSlider(12, 0, 31, description="input value"),
    row_number=IntSlider(0, 0, 127, description="row number"),
)
def experiment(num_sends, input_value, row_number):
    """ Updates the plot with the outputs from the hardware """
    result = hxtorch.perceptron.matmul(
        torch.tensor([0.] * row_number + [input_value], dtype=torch.float),
        weight.unsqueeze(0).expand(row_number + 1, -1),
        num_sends=num_sends)
    ax.lines[0].set_ydata(result)
    output.clear_output(wait=True)
    with output:
        display(fig)
experiment(100, 12, 0)  # needed for testing
display(output)
.. image:: _static/tutorial/hagen_properties_fig2.png
:width: 90%
:align: center
:class: solution
.. image:: _static/tutorial/hagen_properties_sliders2.png
:width: 300px
:class: solution
For output values between about -80 and 80, a good linear relationship
can be observed. Beyond this range, the ADC saturates; this happens
earlier for some neurons and later for others.
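One way to make this observation quantitative is to fit a straight line
to the central region and flag the points that deviate from it. The
following sketch does so on synthetic data. The response model (slope 2,
offset 3, clipping at -100 and +100) and the deviation threshold are
assumptions for illustration; measured outputs would also contain noise,
which calls for a more tolerant threshold.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Toy data (assumed, not measured): linear response clipped at -100 and +100.
weights = list(range(-63, 64))
outputs = [max(-100.0, min(100.0, 2.0 * w + 3.0)) for w in weights]

# Fit only where the output lies well inside the assumed linear range.
central = [(w, y) for w, y in zip(weights, outputs) if abs(y) <= 80]
slope, intercept = fit_line(*zip(*central))

# Points that deviate clearly from the fit indicate saturation.
saturated = [w for w, y in zip(weights, outputs)
             if abs(y - (slope * w + intercept)) > 1.5]
lower_edge = max(w for w in saturated if w < 0)
upper_edge = min(w for w in saturated if w > 0)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"saturation for weights <= {lower_edge} and >= {upper_edge}")
```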
Possible questions:
~~~~~~~~~~~~~~~~~~~
How does the result change with several successive calls to ``hxtorch.perceptron.matmul``?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Due to its analog nature, the BrainScaleS-2 ASIC provides slightly
different values for each call. Quantify the noise on each neuron!
What is the relationship between input and output? Is it linear?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We have seen that the relationship between weight and output is quite
linear at intermediate values. How, on the other hand, does the output
change with changing inputs and constant weight? Is the relationship
linear?
Negative inputs?
^^^^^^^^^^^^^^^^
The inputs to the multiply-accumulate operation correspond to the time a
current flows onto the neuron membranes, which means they cannot be
negative. How would it still be possible to allow negative inputs in a
calculation?
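One possible approach, among others, is to split the input into its
positive and negative parts, run one pass for each and subtract the
results. The sketch below shows the idea with an exact software stand-in.
``matmul_nonneg`` and ``matmul_signed`` are hypothetical names, and the
stand-in ignores noise, gain and saturation of the real hardware.

```python
def matmul_nonneg(vector, matrix):
    """Stand-in for a MAC that only accepts non-negative inputs."""
    assert all(v >= 0 for v in vector), "inputs must be non-negative"
    return [sum(v * row[c] for v, row in zip(vector, matrix))
            for c in range(len(matrix[0]))]


def matmul_signed(vector, matrix):
    """Signed product from two non-negative passes: x = x_pos - x_neg."""
    x_pos = [max(v, 0) for v in vector]
    x_neg = [max(-v, 0) for v in vector]
    y_pos = matmul_nonneg(x_pos, matrix)
    y_neg = matmul_nonneg(x_neg, matrix)
    return [p - n for p, n in zip(y_pos, y_neg)]


x = [3, -2, 0, 5]
W = [[1, -1], [2, 0], [4, 4], [-3, 2]]
print(matmul_signed(x, W))
```

An equivalent single-pass variant would stack the positive and negative
parts into one input vector and pair them with the weights and their
negated copy, at the cost of twice as many rows.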
.. jupyter::
:cell-break:
The integration with PyTorch allows the MAC to be used very easily for
conventional machine learning. For this, the forward pass is computed on
the ASIC and the backward pass on the host computer. The example for
training DNNs shows such a usage.