Tutorial 1 - Introduction
Submission process:
- Submission deadline is November 2, 14:00 CET (before the lecture).
- Commit and push your solution as a single notebook file via git as ./tutorial/tutorial1/tutorial1.ipynb. Please make sure the subfolder and filename are correct, since the submission is rejected otherwise.
- During the first lecture after the deadline we will discuss a sample solution in class.
- Afterwards, you have time until November 9, 14:00 CET (before the lecture) to submit a corrected version of your submission:
- Rework your solution according to our discussion in class.
- Commit and push the corrected version as a single file via git as ./tutorial/tutorial1/tutorial1.ipynb. Please make sure the filename is correct, since the submission is rejected otherwise.
Remarks:
- Grading is done based on both versions of your submission.
- If the first submission is missing, your submission will not be graded.
- If the second submission still contains major flaws after revision, no more than half of the credits for this tutorial can be achieved.
- A sample solution is provided after November 9, 14:00 CET eventually.
- Please use acn@net.in.tum.de for questions regarding lecture, tutorial, and project of ACN.
Problem 1 GIT Access [1.5 credits]
This exercise should make you familiar with the git repositories.
Git is a distributed version control system, originally developed by Linus Torvalds for maintaining the Linux kernel. It became popular in professional software development over the past few years due to its features, stability, and efficiency. We use git in this course as a version control system for both the exercise submissions and the project, i.e., we provide you with a git repository structure hosted on the LRZ Gitlab. You can clone your repository and push changes to the server using SSH or HTTPS with or without a deploy token.
a) [0.5 credits] Explain the differences between the git commands add, commit, and push.
According to git man:
git-add (1) Add file contents to the index.
git-commit (1) Record changes to the repository.
git-push (1) Update remote refs along with associated objects.
b) [0.5 credits] Save your current changes to this Jupyter notebook. Add the file to a new commit and push to remote, then pull again. Execute the command git tag and paste the output here.
Explain the meaning of the output.
The output looks like this:
submission/1476987868
submission/1476989062
The git server responded with a list of tags for every successful commit pushed to the server. Each of those tags points to a specific commit in the history. These tags contain a timestamp of the commit (server time, not local machine time). We use this timestamp to check if the hand-in of the solution was done before the deadline.
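The deadline check described above can be sketched in a few lines of Python. This is only an illustration: the helper name tag_to_datetime and the deadline value are assumptions made for this example, not the course's actual grading script.

```python
from datetime import datetime, timezone

def tag_to_datetime(tag):
    """Parse a tag of the form 'submission/<unix timestamp>' into a UTC datetime."""
    prefix, _, stamp = tag.partition('/')
    if prefix != 'submission' or not stamp.isdigit():
        raise ValueError('unexpected tag format: {!r}'.format(tag))
    return datetime.fromtimestamp(int(stamp), tz=timezone.utc)

# Hypothetical deadline, chosen only for this example
deadline = datetime(2016, 11, 2, 13, 0, tzinfo=timezone.utc)

for tag in ['submission/1476987868', 'submission/1476989062']:
    pushed = tag_to_datetime(tag)
    status = 'on time' if pushed <= deadline else 'late'
    print(tag, pushed.isoformat(), status)
```

Because the timestamp is set by the server, changing the clock on the local machine does not affect the result of such a check.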
c) [0.5 credits] Create and push a new branch called grades. Paste the commands used to do so into your answer. Explain what happens.
A remote branch grades already exists. The remote branch is a protected branch, which means that only authorized users may push to it.
Problem 2 SSH and Virtual Machine (VM) Access [3.0 credits]
If you can see this notebook you probably followed directions from the infrastructure introduction. To hand in your answers, please write them down in the prepared cells.
a) [0.5 credits] What is SSH and what is it used for?
Secure shell, used for secure remote shell connections, uses TCP (default: port 22).
b) [0.5 credits] What is the difference between public-key and password authentication as offered by SSH?
- password authentication: uses a shared secret (password) between ssh-server and ssh-client for authentication
- public-key authentication: ssh-server and ssh-client each have their own key pair; knowing the public key of such a key pair allows authenticating the respective private key (and therefore its owner)
c) [1.0 credits] SSH can be used to connect to your personal VM and to clone your personal git repository.
You can connect to your personal VM using the command: ssh -L localhost:1337:localhost:1337 root@svmNNNN.net.in.tum.de, where NNNN is your UID. Explain in detail what this command does.
ssh -L localhost:1337:localhost:1337 root@svmNNNN.net.in.tum.de
- ssh -L: creates an SSH connection with local port forwarding
- localhost:1337: listens on port 1337 on localhost of the local machine and forwards connections made to it
- :localhost:1337: to port 1337 on localhost as seen from the remote host
- root@: log in on the remote host as root
- svmNNNN.net.in.tum.de: FQDN of the remote host
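To make the structure of the forwarding argument explicit, the following sketch splits a value of the form [bind_address:]port:host:hostport into its parts. The function name parse_local_forward and the dictionary keys are made up for this illustration, and the sketch deliberately ignores bracketed IPv6 literals.

```python
def parse_local_forward(spec):
    """Split an 'ssh -L' argument of the form [bind_address:]port:host:hostport."""
    parts = spec.split(':')
    if len(parts) == 3:
        # No bind address given; None stands for the default chosen by ssh
        parts = [None] + parts
    if len(parts) != 4:
        raise ValueError('unsupported forwarding specification: {!r}'.format(spec))
    bind_address, port, host, hostport = parts
    return {
        'bind_address': bind_address,   # where the local ssh client listens
        'local_port': int(port),        # port opened on the local machine
        'remote_host': host,            # target, resolved on the remote side
        'remote_port': int(hostport),   # target port on that host
    }

print(parse_local_forward('localhost:1337:localhost:1337'))
```

Note that the second localhost is resolved on the remote host, so it refers to the VM itself and not to the local machine.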
d) [1.0 credits] Connect to your virtual machine using SSH and execute the following three commands:
- whoami
- uname -a
- pwd
Paste the output of each of the three commands into your answer and explain what each command does.
whoami: display effective user id
uname -a: display information about the system
pwd: return working directory name
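For comparison, roughly the same information can be obtained from within Python using the standard library. This is a sketch for Unix-like systems; the function name system_summary is an assumption for this example.

```python
import os
import platform

def system_summary():
    """Collect information similar to what whoami, uname -a, and pwd report."""
    return {
        'effective_uid': os.geteuid(),  # whoami prints the user name belonging to this id
        'system': platform.uname(),     # comparable to the fields shown by uname -a
        'cwd': os.getcwd(),             # same directory that pwd prints
    }

info = system_summary()
print(info['effective_uid'])
print(info['system'].system, info['system'].release, info['system'].machine)
print(info['cwd'])
```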
Problem 3 Jupyter Introduction (0 credits)
We will shortly introduce Jupyter and its functionality: Jupyter is an interactive programming interface running in your browser. We use the interactive Python interface.
This problem does not give any bonus credits and its purpose is to familiarize yourself with jupyter.
a) [0 credits] Jupyter notebooks consist of cells with different types, e.g., code and markdown. Explain the differences between those two types and what they can be used for.
To run cells (execute Python code or show rendered markdown), either press the "Play/Run" button at the top of the page or press Shift+Enter.
Code type cells fail to execute when the content is not valid Python code. We use automated tests to validate your exercises. If your solution fails to execute due to syntax errors, the automated test will also fail. Such hand-ins will be graded with 0 credits. Therefore, always execute your solution at least once on the provided virtual machine. This way you can make sure the provided solution works in our environment.
b) [0 credits] The following cell contains the typical format for programming exercises. This function is used to test your solution. Never change the name and only insert code in between the following comments:
# begin insert code
... your code goes here ...
# end insert code
The print command will show your result in the cell output.
Errors should never occur in your handed-in notebook. Fix this code by defining the hello_world variable, assigning it a value, and returning it.
def hello_world_text():
    # begin insert code
    hello_world = 'Hello World!'
    return hello_world
    # end insert code
    return None

print(hello_world_text())
Hello World!
d) [0 credits] Code cells can also execute shell commands. These are executed as the user who started the jupyter server. On your virtual machine this is root. Shell commands can be executed by prefixing them with the '!' character.
# just execute this cell
!pwd # the path where jupyter has been started
!echo This user is executing the commands: $USER
!ping -c 1 net.in.tum.de
/home/zirngibl/i8/teaching/acn/exercise/2023/jupyter
This user is executing the commands: zirngibl
PING net.in.tum.de(unicorn.net.in.tum.de (2001:4ca0:2001:13:250:56ff:feba:37ac)) 56 data bytes
64 bytes from unicorn.net.in.tum.de (2001:4ca0:2001:13:250:56ff:feba:37ac): icmp_seq=1 ttl=63 time=0.651 ms

--- net.in.tum.de ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.651/0.651/0.651/0.000 ms
e) [0 credits] In some exercise sheets we will use such shell commands to install missing Python modules.
Note that Python 3.11 (as it runs on your VM) expects you to install packages in a virtual environment (venv) so that they do not interfere with the system site-packages. You will be shown an error if you try to install a package without a venv. If you do not know how to set up a venv, you can follow this tutorial. Make sure that your jupyter process is executed within the venv.
Otherwise, you can also override the check by adding the --break-system-packages option to your pip install command, or disable it permanently by removing this file on your VM: /usr/lib/python3.11/EXTERNALLY-MANAGED.
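Whether the jupyter process actually runs inside a venv can be checked from a code cell. A minimal sketch, assuming a standard venv created with python3 -m venv; the helper name in_virtualenv is made up for this example.

```python
import sys

def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points to the venv directory, while
    # sys.base_prefix still points to the base Python installation.
    return sys.prefix != sys.base_prefix

print('Interpreter:', sys.executable)
print('Running inside a venv:', in_virtualenv())
```

If this prints False, packages installed via pip from the notebook end up outside the venv.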
!pip3 install matplotlib
Requirement already satisfied: matplotlib in /home/zirngibl/.local/lib/python3.9/site-packages (3.5.1)
Requirement already satisfied: numpy>=1.17 in /home/zirngibl/.local/lib/python3.9/site-packages (from matplotlib) (1.24.1)
Requirement already satisfied: pyparsing>=2.2.1 in /usr/lib/python3/dist-packages (from matplotlib) (2.4.7)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/zirngibl/.local/lib/python3.9/site-packages (from matplotlib) (1.3.2)
Requirement already satisfied: python-dateutil>=2.7 in /home/zirngibl/.local/lib/python3.9/site-packages (from matplotlib) (2.8.2)
Requirement already satisfied: pillow>=6.2.0 in /usr/lib/python3/dist-packages (from matplotlib) (8.1.2)
Requirement already satisfied: packaging>=20.0 in /usr/lib/python3/dist-packages (from matplotlib) (20.9)
Requirement already satisfied: cycler>=0.10 in /home/zirngibl/.local/lib/python3.9/site-packages (from matplotlib) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /home/zirngibl/.local/lib/python3.9/site-packages (from matplotlib) (4.29.1)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7->matplotlib) (1.16.0)
In order to visualize measurement data tools such as matplotlib are very useful. Jupyter notebook supports inline presentation of matplotlib plots. This is a convenient way to visualize data.
In this exercise you can see how matplotlib is used. Just execute the first cell. It should show you a histogram of datapoints following a normal distribution.
# This enables inline plots
%matplotlib inline
# Example taken from https://matplotlib.org/3.1.1/gallery/statistics/histogram_features.html
import numpy as np
import matplotlib.pyplot as plt
# example data
mu = 0 # mean of distribution
sigma = 1 # standard deviation of distribution
x = mu + sigma * np.random.randn(10000) # 10k data samples
num_bins = 50
fig, ax = plt.subplots()
# the histogram of the data
counts, bins, patches = ax.hist(x, num_bins)
ax.set_xlabel('Value')
ax.set_ylabel('Count')
ax.set_title(r'Histogram of a normal distribution: $\mu=0$, $\sigma=1$')
plt.show()
fig, ax = plt.subplots()
# begin insert code
ax.hist(x, num_bins, cumulative=True, density=True)
# end insert code
ax.set_xlabel('Value')
ax.set_ylabel('Cumulated Probability')
ax.set_title(r'CDF of normal distribution: $\mu=0$, $\sigma=1$')
plt.show()
Problem 4 IPv6 (3.5 credits)
IPv6 is the successor of IPv4. Instead of 32 bits, 128 bits are used for each address. This offers enough space for many different address types. However, the text representation of the addresses becomes more complex as well.
a) [1 credits] Write a function convert_ipv6.
The function receives a bytearray containing a valid IPv6 address and should return the shortest possible string representation of the address. Make sure the following requirements are fulfilled.
- remove leading zeros of each byte pair (do not remove trailing zeros)
- the longest series of consecutive zero byte pairs is replaced by ::
- your function implementation should be able to correctly handle any given IPv6 address
Remark: For this problem you are not allowed to use any modules like ipaddress.
def convert_ipv6(address):
    # begin insert code
    out = []
    zero_len = 0
    zero_start = None
    max_zero_len = 0
    max_zero_start = None
    for i, b in enumerate(address[::2]):
        part = '{:02x}{:02x}'.format(address[i*2], address[i*2+1])
        part = part.lstrip('0') or '0'
        if part == '0':
            if zero_start is None:
                zero_start = i
                zero_len = 0
            zero_len += 1
        if part != '0' or i == 7:
            if zero_len > max_zero_len:
                max_zero_len = zero_len
                max_zero_start = zero_start
            zero_start = None
        out.append(part)
    if max_zero_start is not None:
        return '::'.join([
            ':'.join(out[:max_zero_start]),
            ':'.join(out[max_zero_start+max_zero_len:])
        ])
    return ':'.join(out)
    # end insert code
    return str(address)

ipv6 = bytearray(b'\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\xd7\x6d\xa0')
# should convert to ff02::1:ffd7:6da0
convert_ipv6(ipv6)
'ff02::1:ffd7:6da0'
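Although the ipaddress module must not be used in the solution itself, it is handy for cross-checking results while testing. The following sketch uses the standard library only for comparison; the helper name reference_ipv6 is an assumption for this example.

```python
import ipaddress

def reference_ipv6(address):
    """Return the compressed text form of a 16-byte IPv6 address via the standard library."""
    return ipaddress.IPv6Address(bytes(address)).compressed

ipv6 = bytearray(b'\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\xd7\x6d\xa0')
print(reference_ipv6(ipv6))  # ff02::1:ffd7:6da0
```

Be aware that the two results may legitimately differ for addresses containing only a single all-zero byte pair: RFC 5952 recommends not to shorten a lone zero field with ::, whereas this exercise asks for the shortest possible string.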
Stateless Address Auto Configuration (SLAAC) is a protocol to automatically set up Link-Local addresses. RFC 4291 describes the format of Link-Local addresses in Section 2.5.6 and, in Appendix A, how the interface identifier is derived from the device's MAC address.
b) [0.5 credits] Write a function generate_link_local.
The function receives a bytearray containing a MAC address and should return a bytearray with the corresponding Link-Local address.
Remark: For this problem you are not allowed to use any modules like ipaddress.
def generate_link_local(mac):
    # begin insert code
    addr = bytearray(16)        # create a zeroed bytearray of size 16
    addr[:2] = b'\xfe\x80'      # set Link-Local prefix
    addr[8] = mac[0] ^ 0x02     # first octet of the MAC address with the 7th bit (universal/local) flipped
    addr[9:11] = mac[1:3]       # second and third octet of the MAC address
    addr[11:13] = b'\xff\xfe'   # insert 0xfffe in the middle
    addr[13:16] = mac[3:]       # last 3 octets of the MAC address
    return addr
    # end insert code
    return bytearray(16)

mac = bytearray(b'\x01\x02\x03\x04\x05\x06')
convert_ipv6(generate_link_local(mac))
'fe80::302:3ff:fe04:506'
IPv6 Multicast Address spaces are defined in RFC 4291 in Section 2.7. For Neighbor Discovery the Solicited-Node Address is used.
c) [0.5 credits] Write two functions:
- compute_solicited_node_multicast, which gets as input a bytearray containing an IPv6 address and returns a bytearray containing the Solicited-Node Multicast address of the IPv6 address.
- compute_multicast_mac, which gets as input a bytearray containing an IPv6 address and returns a bytearray containing the multicast MAC address as specified in RFC 2464 in Section 7.
def compute_solicited_node_multicast(ipv6):
    # begin insert code
    # From RFC4291: FF02:0:0:0:0:1:FFXX:XXXX
    # With XX:XXXX being the last 3 bytes of the IPv6 address
    addr = bytearray(16)
    addr[:2] = b'\xff\x02'
    addr[11] = 0x01
    addr[12] = 0xff
    addr[13:16] = bytearray(ipv6[-3:])
    return addr
    # end insert code
    return ipv6

def compute_multicast_mac(ipv6):
    # begin insert code
    # From RFC2464: 33:33 followed by the last 4 bytes of the multicast address
    solicited = compute_solicited_node_multicast(ipv6)
    addr = bytearray(6)
    addr[0:2] = b'\x33\x33'
    addr[2:6] = solicited[-4:]
    return addr
    # end insert code
    return bytearray(6)

ipv6 = bytearray(b'\x20\x01\x4c\xa0\x20\x01\x00\x40\xe1\x14\x90\xfe\x38\x62\x55\x4f')
print(convert_ipv6(compute_solicited_node_multicast(ipv6)))
mac = compute_multicast_mac(ipv6)
print('{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}'.format(*mac))
ff02::1:ff62:554f
33:33:ff:62:55:4f
In the next subproblem you will analyze how IPv6 addresses are distributed. First you will download a file from the ACN website which contains two datasets with different IPv6 addresses.
Run the following cell to download the file and print some sample values from each dataset.
# Download IPv6 address lists from ACN website
!wget -N https://acn.net.in.tum.de/exercise/ipv6_dataset.npz
# Load file into Python
import numpy as np
data = np.load('ipv6_dataset.npz')
# Print sample data
print('Dataset 1:')
print('\n'.join(map(convert_ipv6, data['dataset1'][:3])))
print('\nDataset 2:')
print('\n'.join(map(convert_ipv6, data['dataset2'][:3])))
--2023-11-30 16:25:45-- https://acn.net.in.tum.de/exercise/ipv6_dataset.npz
Resolving acn.net.in.tum.de (acn.net.in.tum.de)... 2a00:4700:0:9:f::1, 188.95.232.11
Connecting to acn.net.in.tum.de (acn.net.in.tum.de)|2a00:4700:0:9:f::1|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘ipv6_dataset.npz’ not modified on server. Omitting download.

Dataset 1:
8961:9391:9596:a6c8:bebf:4624:befc:d2e1
b08d:a07b:88a2:5cce:e06a:ca30:84a3:ae78
6fe6:f3d0:bd90:f2b3:8cf2:e222:2a59:d4b6

Dataset 2:
19b0:3e8d:50a7:9ac4:20::1
efe4:a1b1:8bba:ca58::1
8241:fa7b:6b37:5a0d::200:0:1
d) [0.5 credits] Write a function count_ones.
The function receives a bytearray containing an IPv6 address and should return the number of bits set to 1 in the last 64 bits of the IPv6 address.
def count_ones(ipv6):
    # begin insert code
    ipv6 = ''.join(map(lambda i: "{0:08b}".format(i), ipv6[8:])) # convert address to bitstring
    return ipv6.count('1')
    # end insert code
    return 0
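As an alternative to building a bit string byte by byte, the last 64 bits can be interpreted as a single integer. The following sketch shows this variant; the name count_ones_int is chosen for this example only, and the result should match count_ones for any 16-byte input.

```python
def count_ones_int(ipv6):
    """Count the bits set to 1 in the last 64 bits of a 16-byte IPv6 address."""
    # Interpret the last 8 bytes as one big-endian integer
    iid = int.from_bytes(bytes(ipv6[8:]), 'big')
    # bin() yields a string such as '0b1011'; the prefix contains no '1'
    return bin(iid).count('1')

ipv6 = bytearray(b'\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\xd7\x6d\xa0')
print(count_ones_int(ipv6))  # 22
```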
Once you finished the previous subproblem you can run the following cell which plots the distributions of bits set to one for each dataset.
%matplotlib inline
import matplotlib.pyplot as plt
def plot_histogram(dataset1, dataset2):
    fig, axis = plt.subplots(1, 2, figsize=(15, 4))
    axis[0].hist(list(map(count_ones, dataset1)), bins=64, range=(0,64), density=True)
    axis[1].hist(list(map(count_ones, dataset2)), bins=64, range=(0,64), density=True)
    axis[0].set(xlabel='# of 1', ylabel='Density', title='Dataset 1')
    axis[1].set(xlabel='# of 1', ylabel='Density', title='Dataset 2')
    fig.show()

plot_histogram(data['dataset1'], data['dataset2'])
/tmp/ipykernel_2903842/3498065665.py:11: UserWarning: Matplotlib is currently using module://matplotlib_inline.backend_inline, which is a non-GUI backend, so cannot show the figure. fig.show()
e) [1 credits] Explain how the addresses in the two datasets differ. Give a reason for the differences and what kind of addresses are most likely contained in each dataset.
The number of ones per address in dataset1 approximately follows a normal distribution with a mean of 32. This is what one expects if each of the 64 bits is set to 1 independently with a probability of 0.5. The reason for this is that the addresses were generated automatically and randomly, e.g., using the IPv6 privacy extensions.
The addresses in dataset2 follow a different distribution. There, only a few bits are set to 1, which results from these addresses being assigned manually (e.g., by network admins), so they tend towards a more human-readable form.
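The expected shape for randomly generated interface identifiers can be reproduced with a small simulation. The sketch below draws random 64-bit values instead of using the course dataset; the function name, sample size, and seed are arbitrary choices for this illustration. For 64 independent bits with probability 0.5 each, the number of ones follows a binomial distribution with mean 32 and standard deviation 4.

```python
import random
import statistics

def simulate_random_iids(samples=10000, seed=1):
    """Count the ones in randomly drawn 64-bit interface identifiers."""
    rng = random.Random(seed)
    return [bin(rng.getrandbits(64)).count('1') for _ in range(samples)]

counts = simulate_random_iids()
print('mean  =', statistics.mean(counts))    # expected to be close to 32
print('stdev =', statistics.pstdev(counts))  # expected to be close to 4

# A manually assigned identifier such as ::1 has very few bits set
manual_iid = 0x0000000000000001
print('ones in manual identifier:', bin(manual_iid).count('1'))
```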
Advanced Computer Networking by Prof. Dr.-Ing. Georg Carle
Teaching assistants: Sebastian Gallenmüller, Benedikt Jaeger, Max Helm, Patrick Sattler, Johannes Zirngibl, Marcel Kempf