Tutorial 6 - Network Calculus and Content Delivery Networks
Submission process:
- Submission deadline is February 1, 14:00 CET (before the lecture) .
- Commit and push your solution as single notebook file via git as ./tutorial/tutorial6/tutorial6.ipynb. Please take care of the correct subfolder/filename since submission is denied otherwise.
- During the first lecture after the deadline we will discuss a sample solution in class.
- Afterwards, you have time until February 8, 14:00 CET (before the lecture) to submit a corrected version of your submission:
- Rework your solution according to our discussion in class.
- Commit and push the corrected version as single file via git as ./tutorial/tutorial6/tutorial6.ipynb. Please take care of the correct filename since submission is denied otherwise.
Remarks:
- Grading is done based on both versions of your submission.
- If the first submission is missing your submission will not be graded.
- If the second submission contains major flaws after revision not more than half of the credits for this tutorial can be achieved.
- A sample solution is provided after February 8, 14:00 CET eventually.
- Please use acn@net.in.tum.de for questions regarding lecture, tutorial, and project of ACN.
Problem 1: Network Calculus (6 credits)
We will apply Network Calculus to the small network represented below:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
plt.figure(figsize=(5, 5))
plt.xlabel("Time")
plt.ylabel("Data")
plt.axis((-1,20,0,100))
# Use the following template:
# plt.plot([...], [...], label="Arrival curve")
# plt.plot([...], [...], label="Service curve")
# begin insert code
plt.plot([0, 0, 15], [0, 25, 100], label="Arrival curve")
plt.plot([0, 10, 20], [0, 0, 100], label="Service curve")
# end insert code
plt.legend()
plt.show()
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
plt.figure(figsize=(5, 5))
plt.xlabel("Time")
plt.ylabel("Data")
plt.axis((-1,20,0,100))
# Use the following template:
# plt.plot([...], [...], label="Arrival curve")
# plt.plot([...], [...], label="Service curve")
# plt.plot([...], [...], label="Delay bound")
# plt.plot([...], [...], label="Buffer bound")
# begin insert code
plt.plot([0, 0, 15], [0, 25, 100], label="Arrival curve")
plt.plot([0, 10, 20], [0, 0, 100], label="Service curve")
plt.plot([0, 12.5], [25, 25], label="Delay bound")
plt.plot([10, 10], [0, 75], label="Buffer bound")
# end insert code
plt.legend()
plt.show()
c) [0.5 credits] What is the latency/delay bound of the flow after S1?
- Can be determined graphically (see the previous plot)
- Or also mathematically for the general case (any curves):
$$ \begin{align*} & \sup_{t \geq 0} \left\{ \inf_{s \geq 0} \{ \gamma_{5,25}(t) \leq \beta_{10,10}(t+s) \} \right\} \\ = & \sup_{t \geq 0} \left\{ \inf_{s \geq 0} \{ 5 t + 25 \leq 10 (t + s - 10) \} \right\} \\ = & \sup_{t \geq 0} \left\{ \inf_{s \geq 0} \{ 25 \leq 2 (t + s) \} \right\} \\ = & 12.5 \end{align*} $$
- Or for the simplified case (token-bucket and rate-latency):
\begin{align*} d = & T + \frac{b}{R} \\ = & 10 + \frac{25}{10} \\ = & 12.5 \end{align*}
d) [0.5 credits] What is the arrival curve of the flow after it has traversed the server S1?
- Use the definition of the output envelope: $$\alpha^*(t) = (\alpha \oslash \beta)(t)$$
- In this case, we have: $$(\gamma_{5,25} \oslash \beta_{10,10})(t) = \gamma_{5,25 + 5 \cdot 10}(t) = \gamma_{5, 75}(t)$$
e) [0.25 credits] Using the two previous results, what is the end-to-end latency/delay bound of the flow?
- Use the result of the previous question to compute the latency for traversing S2
- The delay for S2 is:
\begin{align*} d_{S2} = & T_{S2} + \frac{b^{\star}}{R_{S2}} \\ = & 10 + \frac{75}{10} \\ = & 17.5 \end{align*}
- The end-to-end delay (S1 and S2) is: $$12.5 + 17.5 = 30$$
f) [0.5 credits] Apply the concatenation theorem to compute the end-to-end latency/delay bound of the flow and interpret the result.
The concatenation of S1 and S2 gives: $$ \begin{align*} (\beta_{10,10} \otimes \beta_{10,10})(t) & = \beta_{\min(10, 10), 10 + 10}(t) \\ & = \beta_{10,20}(t) \end{align*} $$
The end-to-end delay is then computed using $\gamma_{5,25}$ and $\beta_{10,20}$
\begin{align*} d = & 20 + \frac{25}{10} \\ = & 22.5 \end{align*}
- The result is $22.5$, which is less than the $30$ previously calculated, due to Pay-Bursts-Only-Once
Consider the following larger network with multiple flows. We are interested in computing the latency/delay bound for flow $f_1$.
Assume all servers use strict priority queuing as scheduling algorithm. Assume that $f_1$ has the lowest priority while $f_2$ and $f_3$ have the highest priority. Furthermore, assume that the scheduling is preemptive.
g) [0.25 credits] Expain the difference between preemptive and non-preemptive scheduling.
- Preemptive scheduling interrupts packet transmissions if the scheduler decides to serve a different packet.
- Non-preemptive scheduling does not interrupt the transmission of packets once started.
h) [2.5 credits] Use the three steps of the Separate Flow Analysis to calculate the latency/delay bound for $f_1$. Clearly differentiate between the three steps in your answer.
Left-over service curves:
$f_1$ at $s_1$: $\beta_{10,10}$
$f_1$ at $s_2$: $\beta_{10,10} - \gamma_{3,10} : R=7, T=\frac{110}{7}$
$f_1$ at $s_3$: $\beta_{10,10} - \gamma^{\star}_{3,10} -\gamma_{1,5} : R=6, T=\frac{145}{6}$ with $\gamma^{\star}_{3,10} = \gamma_{3,10+3\cdot10}$
Concatenation:
$$ \begin{align*} & \beta_{e2e}^{f_1} = \beta_{10,10} \otimes \beta_{7,\frac{110}{7}} \otimes \beta_{6,\frac{145}{6}} \\ & \beta_{e2e}^{f_1} \approx \beta_{6,49.88} \\ \end{align*} $$
- Bounds:
$$ \begin{align*} & d = 49.88 + \frac{25}{6} \approx 54.05 \\ \end{align*} $$
i) [0.5 credits] Assume that the priorities of the flows are switched (i.e. $f_1$ is the highest priority and $f_2$ and $f_3$ are the lowest priority). Use the Separate Flow Analysis to calculate the latency/delay bound of $f_1$ traversing the network.
$$ \begin{align*} & d = 30 + \frac{25}{10} = 32.5 \\ \end{align*} $$
j) [0.5 credits] Assume a flow with an arrival curve $\gamma_{10,5}$ traverses a server with a service curve $\beta_{7,3}$. Give an explanation why the latency/delay bound is infinity.
The arrival curve rate is larger than service curve rate, therefore the two curves diverge and the maximum horizontal distance between them is infinity.
Problem 2 Content Delivery Networks (4.0 credits)
For the following exercise an access log of a webserver is evaluated. The access logs are published by NASA for their web infrastructure back in 1995. To conserve resources on our VM servers, the log was reduced to 25k entries and can be found on the ACN homepage.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
# Download RTT CSV from ACN website
!wget -N https://acn.net.in.tum.de/exercise/tutorial6_accesslog.npz
# Load file into Python
data = np.load('tutorial6_accesslog.npz')['data']
#print sample
print(data[:5])
--2024-02-12 13:33:47-- https://acn.net.in.tum.de/exercise/tutorial6_accesslog.npz Resolving acn.net.in.tum.de (acn.net.in.tum.de)...
2a00:4700:0:9:f::1, 188.95.232.11 Connecting to acn.net.in.tum.de (acn.net.in.tum.de)|2a00:4700:0:9:f::1|:443... connected. HTTP request sent, awaiting response...
200 OK Length: 16300262 (16M) [application/octet-stream] Saving to: ‘tutorial6_accesslog.npz’ tutorial6_accesslog 0%[ ] 0 --.-KB/s
tutorial6_accesslog 2%[ ] 351,68K 1,57MB/s
tutorial6_accesslog 4%[ ] 655,68K 1,41MB/s
tutorial6_accesslog 5%[> ] 911,68K 1,33MB/s
tutorial6_accesslog 7%[> ] 1,19M 1,32MB/s
tutorial6_accesslog 9%[> ] 1,45M 1,31MB/s
tutorial6_accesslog 11%[=> ] 1,83M 1,39MB/s
tutorial6_accesslog 14%[=> ] 2,23M 1,47MB/s
tutorial6_accesslog 16%[==> ] 2,56M 1,49MB/s
tutorial6_accesslog 19%[==> ] 2,97M 1,54MB/s
tutorial6_accesslog 21%[===> ] 3,31M 1,56MB/s
tutorial6_accesslog 23%[===> ] 3,66M 1,56MB/s
tutorial6_accesslog 23%[===> ] 3,69M 1,45MB/s
tutorial6_accesslog 27%[====> ] 4,23M 1,54MB/s
tutorial6_accesslog 29%[====> ] 4,58M 1,55MB/s
tutorial6_accesslog 31%[=====> ] 4,86M 1,54MB/s eta 7s
tutorial6_accesslog 33%[=====> ] 5,14M 1,53MB/s eta 7s
tutorial6_accesslog 35%[======> ] 5,45M 1,53MB/s eta 7s
tutorial6_accesslog 37%[======> ] 5,78M 1,53MB/s eta 7s
tutorial6_accesslog 38%[======> ] 5,97M 1,48MB/s eta 7s
tutorial6_accesslog 41%[=======> ] 6,41M 1,53MB/s eta 6s
tutorial6_accesslog 43%[=======> ] 6,75M 1,56MB/s eta 6s
tutorial6_accesslog 45%[========> ] 7,11M 1,56MB/s eta 6s
tutorial6_accesslog 47%[========> ] 7,33M 1,49MB/s eta 6s
tutorial6_accesslog 48%[========> ] 7,55M 1,45MB/s eta 6s
tutorial6_accesslog 49%[========> ] 7,70M 1,38MB/s eta 5s
tutorial6_accesslog 50%[=========> ] 7,86M 1,33MB/s eta 5s
tutorial6_accesslog 51%[=========> ] 7,94M 1,24MB/s eta 5s
tutorial6_accesslog 51%[=========> ] 8,03M 1,25MB/s eta 5s
tutorial6_accesslog 52%[=========> ] 8,23M 1,13MB/s eta 5s
tutorial6_accesslog 53%[=========> ] 8,39M 1,04MB/s eta 6s
tutorial6_accesslog 56%[==========> ] 8,76M 1,08MB/s eta 6s
tutorial6_accesslog 57%[==========> ] 9,01M 1,06MB/s eta 6s
tutorial6_accesslog 58%[==========> ] 9,17M 1,02MB/s eta 6s
tutorial6_accesslog 60%[===========> ] 9,47M 1,03MB/s eta 6s
tutorial6_accesslog 63%[===========> ] 9,80M 1,02MB/s eta 5s
tutorial6_accesslog 64%[===========> ] 10,03M 1,02MB/s eta 5s
tutorial6_accesslog 67%[============> ] 10,42M 1,03MB/s eta 5s
tutorial6_accesslog 68%[============> ] 10,58M 1017KB/s eta 5s
tutorial6_accesslog 69%[============> ] 10,84M 983KB/s eta 5s
tutorial6_accesslog 71%[=============> ] 11,08M 1012KB/s eta 4s
tutorial6_accesslog 73%[=============> ] 11,39M 1,02MB/s eta 4s
tutorial6_accesslog 74%[=============> ] 11,59M 1,05MB/s eta 4s
tutorial6_accesslog 76%[==============> ] 11,91M 1,09MB/s eta 4s
tutorial6_accesslog 77%[==============> ] 12,11M 1,13MB/s eta 4s
tutorial6_accesslog 79%[==============> ] 12,41M 1,17MB/s eta 2s
tutorial6_accesslog 80%[===============> ] 12,55M 1,21MB/s eta 2s
tutorial6_accesslog 81%[===============> ] 12,66M 1,13MB/s eta 2s
tutorial6_accesslog 82%[===============> ] 12,89M 1,21MB/s eta 2s
tutorial6_accesslog 84%[===============> ] 13,19M 1,18MB/s eta 2s
tutorial6_accesslog 86%[================> ] 13,47M 1,19MB/s eta 2s
tutorial6_accesslog 88%[================> ] 13,73M 1,17MB/s eta 2s
tutorial6_accesslog 90%[=================> ] 14,06M 1,18MB/s eta 2s
tutorial6_accesslog 92%[=================> ] 14,41M 1,18MB/s eta 2s
tutorial6_accesslog 94%[=================> ] 14,76M 1,23MB/s eta 2s
tutorial6_accesslog 96%[==================> ] 15,03M 1,28MB/s eta 0s
tutorial6_accesslog 98%[==================> ] 15,34M 1,27MB/s eta 0s
tutorial6_accesslog 100%[===================>] 15,54M 1,26MB/s in 12s 2024-02-12 13:34:00 (1,27 MB/s) - ‘tutorial6_accesslog.npz’ saved [16300262/16300262]
['199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245' 'unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985' '199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /shuttle/missions/sts-73/mission-sts-73.html HTTP/1.0" 200 4085' 'burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0' '199.120.110.21 - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/missions/sts-73/sts-73-patch-small.gif HTTP/1.0" 200 4179']
a) [1 credits] Create the access histogram from the access log. The function create_access_histogram() gets a csv reader as input and should return a list of tuples with the filename as first tuple element and the number of accesses as second element. The list should be sorted with the file with the highest number of accesses being in the first place.
Note: For the purpose of this exercise your function should return only the 50 most accessed files.
def create_access_histogram(rdr):
# begin insert code
histo = {}
for row in rdr:
# Remark:
# For modern logs one must also mind the version of the HTTP protocol,
# i.e. add the numbers for HTTP/1.0, HTTP/1.1, and HTTP/2.
# This check is unnecessary here because in 1995, when the log was
# recoreded only HTTP/1.0 was available. So in our case we check only
# for the name of the file.
row = row.split('"')
filepath = row[1].split(" ")[1]
if filepath in histo:
histo[filepath] = histo[filepath] + 1
else:
histo[filepath] = 1
return sorted(histo.items(), key=lambda x:x[1], reverse=True)[:50]
# end insert code
return []
hist = create_access_histogram(data)
print(hist)
[('/images/NASA-logosmall.gif', 1662), ('/images/KSC-logosmall.gif', 1406), ('/shuttle/countdown/count.gif', 1056), ('/shuttle/countdown/', 1053), ('/images/MOSAIC-logosmall.gif', 624), ('/images/ksclogo-medium.gif', 620), ('/images/USA-logosmall.gif', 619), ('/images/WORLD-logosmall.gif', 612), ('/shuttle/missions/sts-71/sts-71-patch-small.gif', 606), ('/shuttle/missions/sts-71/images/images.html', 567), ('/images/launch-logo.gif', 547), ('/shuttle/countdown/liftoff.html', 506), ('/shuttle/countdown/video/livevideo.gif', 502), ('/shuttle/missions/sts-71/mission-sts-71.html', 497), ('/', 442), ('/images/ksclogosmall.gif', 382), ('/ksc.html', 368), ('/shuttle/missions/missions.html', 331), ('/images/launchmedium.gif', 268), ('/htbin/cdt_main.pl', 177), ('/icons/menu.xbm', 161), ('/history/apollo/images/apollo-logo.gif', 160), ('/icons/blank.xbm', 159), ('/shuttle/missions/sts-71/movies/movies.html', 159), ('/history/apollo/images/footprint-small.gif', 153), ('/shuttle/resources/orbiters/orbiters-logo.gif', 144), ('/history/apollo/apollo.html', 144), ('/history/apollo/images/footprint-logo.gif', 138), ('/history/apollo/apollo-13/apollo-13.html', 130), ('/shuttle/countdown/countdown.html', 122), ('/icons/image.xbm', 119), ('/history/history.html', 118), ('/shuttle/technology/sts-newsref/stsref-toc.html', 117), ('/icons/text.xbm', 110), ('/shuttle/resources/orbiters/atlantis.html', 110), ('/history/apollo/apollo-13/apollo-13-patch-small.gif', 97), ('/shuttle/resources/orbiters/atlantis-logo.gif', 94), ('/shuttle/missions/sts-71/images/KSC-95EC-0918.jpg', 93), ('/history/apollo/images/apollo-small.gif', 92), ('/software/winvn/winvn.html', 91), ('/images/construct.gif', 82), ('/software/winvn/winvn.gif', 81), ('/software/winvn/bluemarb.gif', 78), ('/software/winvn/wvsmall.gif', 75), ('/htbin/cdt_clock.pl', 75), ('/shuttle/missions/sts-71/images/KSC-95EC-0917.jpg', 74), ('/shuttle/missions/sts-71/movies/crew-arrival-t38.mpg', 74), ('/images/shuttle-patch-small.gif', 73), ('/images/kscmap-tiny.gif', 71), ('/shuttle/missions/sts-71/images/index71.gif', 66)]
b) [0.5 credits] Use a bar plot to visualize the access histogram.
Hint: Use a bar chart for visualization.
def plot_histogram(lst):
plt.figure(3,figsize=(20,10))
# begin insert code
N = len(lst)
x = range(N)
plt.bar(x, [i[1] for i in lst], align='center')
plt.xticks(x)
# end insert code
plt.xlabel('File rank')
plt.ylabel('# Accesses')
plt.title('Distribution of file accesses')
plt.show()
plot_histogram(hist)
c) [0.5 credits] Create a plot displaying the access probability for every file. Consider the evaluated histogram being representative for the access probability, i.e., you would see the same distribution for a larger access log file.
For the visualization, we use a so called CDF - a cumulative density function. This type of plot shows the access probability for the most accessed file, after that the access probability for the second most accessed files and so on.
Hint: Use a bar chart for visualization.
def plot_histogram(lst):
plt.figure(3,figsize=(20,10))
# begin insert code
cuml = sum(i[1] for i in lst)
values = []
tmp = 0
for _, v in lst:
values.append((v/cuml) + tmp)
tmp = tmp + v/cuml
N = len(values)
x = range(N)
plt.bar(x, values, align='center')
plt.xticks(x)
# end insert code
plt.xlabel('File rank')
plt.ylabel('# Cumulated access probability')
plt.title('Distribution of file accesses')
plt.show()
plot_histogram(hist)
d) [0.5 credits] Using the visualization of the previous subproblem. How many of the most popular files must be cached to answer >50% of the requests.
The files with the rank 0 to 8 make up over 50% of the requests in our histogram.
e) [0.5 credits] NASA currently distributes its users to three 3 web caches ($cache_{id}$: 0, 1, 2). The users are assigned to the web cache using the given hash_function(). Create the function resulting a dictionary mapping every user to the respective cache.
def hash_function(usr, num_caches):
return usr % num_caches
def assign_cache(users, num_caches):
# begin insert code
ndict = {}
for user in users:
ndict[user] = hash_function(user, num_caches)
return ndict
# end insert code
return {u: 0 for u in users}
# range(10) creates a list of 10 users
print("10 users: ", assign_cache(range(10), 3))
10 users: {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0, 7: 1, 8: 2, 9: 0}
f) [0.5 credits] NASA upgrades their server infrastructure and now has 4 web caches. How many users must be reassigned to a new web cache, when reusing the previous hash function? To answer that question, create a function compare_mappings() which compares two different user to cache mappings and results the number of mappings differing.
def compare_mappings(map3, map4):
# begin insert code
inequal = 0
for k in map3:
if map3[k] != map4[k]:
inequal += 1
return inequal
# end insert code
return 0
diff = compare_mappings(assign_cache(range(10), 3), assign_cache(range(10), 4))
print('3 caches: ', assign_cache(range(10), 3))
print('4 caches: ', assign_cache(range(10), 4))
print('Remapped users: ', diff)
print('Relative distribution: ', diff/len(assign_cache(range(10), 3)))
3 caches: {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0, 7: 1, 8: 2, 9: 0} 4 caches: {0: 0, 1: 1, 2: 2, 3: 3, 4: 0, 5: 1, 6: 2, 7: 3, 8: 0, 9: 1} Remapped users: 7 Relative distribution: 0.7
g) [0.5 credits] Compare the value calculated in the previous subproblem to the potential reallocation when using consistent hashing. How many clients must be remapped when using consistent hashing?
If consistent hashing is used $\frac{K}{N}$ of all users must be remapped, where K is the number of users and N is the number of web caches. $\frac{10}{4} = 2.5$.
Advanced Computer Networking by Prof. Dr.-Ing. Georg Carle
Teaching assistants: Sebastian Gallenmüller, Benedikt Jaeger, Max Helm, Patrick Sattler, Johannes Zirngibl, Marcel Kempf