Cloud-init bi-weekly status
Posted on Mon 10 June 2019 in status-meeting-minutes
Meeting information
- #cloud-init: Cloud-init bi-weekly status, 10 Jun at 16:19 — 17:31 UTC
- Full logs at http://ubottu.com/meetingology/logs/cloud-init/2019/cloud-init.2019-06-10-16.19.log.html
Meeting summary
LINK: https://cloud-init.github.io
Previous Actions
The discussion about "Previous Actions" started at 16:23.
Recent Changes
The discussion about "Recent Changes" started at 16:24.
In Progress Development
The discussion about "In Progress Development" started at 16:30.
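In the log below, rharper notes that OS scripts typically apply a metric to all non-primary routes so that default routes go to the primary, and AnhVoMSFT confirms that on Azure the first NIC returned by IMDS is the primary. A minimal sketch of that idea follows. It assumes the NIC names arrive as a list in IMDS order; the function name, the base metric and the step are hypothetical and are not taken from cloud-init's code.

```python
def assign_route_metrics(nic_names, base_metric=100, step=100):
    """Map each NIC name to a default-route metric.

    Assumption: nic_names is ordered as returned by the metadata
    service, so the first entry is the primary NIC. Lower metrics are
    preferred by the kernel, so the primary gets the lowest value.
    """
    return {
        name: base_metric + index * step
        for index, name in enumerate(nic_names)
    }


if __name__ == "__main__":
    # Hypothetical three-NIC instance; eth0 is treated as primary.
    for nic, metric in assign_route_metrics(["eth0", "eth1", "eth2"]).items():
        print(f"{nic}: default route metric {metric}")
```

The resulting values would still need to be written out by whichever network renderer is in use; that step is left out here.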
Office Hours
The discussion about "Office Hours" started at 16:45.
Office Hours (next ~30 mins)
The discussion about "Office Hours (next ~30 mins)" started at 16:48.
- LINK: https://netplan.io/faq#how-to-go-back-to-ifupdown
- ACTION: follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
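The office-hours discussion in the log below floats the idea of detecting when the same interface is configured in both netplan and ifupdown (eni). A rough sketch of such a check follows. It assumes the netplan YAML has already been parsed into a dict and the /etc/network/interfaces content has been read into a string; all function names are hypothetical and this is not cloud-init's or netplan's actual code. As cyphermox points out in the log, a name-based comparison is not fully reliable, because netplan can match by MAC and rename interfaces; the sketch only accounts for `set-name`.

```python
import re


def eni_interfaces(eni_text):
    """Return interface names found in 'iface' stanzas, ignoring loopback."""
    names = set()
    for line in eni_text.splitlines():
        match = re.match(r"\s*iface\s+(\S+)", line)
        if match and match.group(1) != "lo":
            names.add(match.group(1))
    return names


def netplan_interfaces(netplan_cfg):
    """Return interface names from an already-parsed netplan config dict.

    If a device sets 'set-name', that final name is used instead of the
    netplan device ID.
    """
    names = set()
    network = netplan_cfg.get("network") or {}
    for section in ("ethernets", "bonds", "bridges", "vlans"):
        for device_id, device_cfg in (network.get(section) or {}).items():
            names.add((device_cfg or {}).get("set-name", device_id))
    return names


def conflicting_interfaces(netplan_cfg, eni_text):
    """Return a sorted list of interfaces configured in both formats."""
    return sorted(netplan_interfaces(netplan_cfg) & eni_interfaces(eni_text))


if __name__ == "__main__":
    netplan_cfg = {
        "network": {
            "version": 2,
            "ethernets": {"eth0": {"dhcp4": True}, "eth1": {"dhcp4": True}},
        }
    }
    eni_text = "auto lo\niface lo inet loopback\n\nauto eth0\niface eth0 inet dhcp\n"
    print(conflicting_interfaces(netplan_cfg, eni_text))
```

A check like this could back a warning rather than an automatic cleanup, since user-written files such as a custom netplan YAML should not be removed without verification.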
Vote results
Done items
- (none)
People present (lines said)
- blackboxsw (39)
- rharper (39)
- AnhVoMSFT (29)
- cyphermox (12)
- robjo (6)
- meetingology (4)
- ubot5 (3)
- paride (1)
- Odd_Bloke (1)
Full Log
16:19 <blackboxsw>
#startmeeting Cloud-init bi-weekly status
16:19 <meetingology>
Meeting started Mon Jun 10 16:19:45 2019 UTC. The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.
16:19 <meetingology>
16:19 <meetingology>
Available commands: action commands idea info link nick
16:19 <rharper>
o/
16:20 <Odd_Bloke>
o/
16:20 <blackboxsw>
hi cloud-init folks. let's kick off the bi-weekly meeting again
16:21 <blackboxsw>
our last meeting minutes are hosted on github
16:21 <blackboxsw>
#link https://cloud-init.github.io
16:22 <blackboxsw>
welcome all. Generally cloud-init upstream uses this meeting to provide a platform for status updates, raising questions or concerns and feature discussion. All are encouraged to participate as you see fit.
16:22 <blackboxsw>
our format is the following topics: Previous Actions, Recent Changes, In-progress Development, Office Hours
16:23 <blackboxsw>
interjections and additional topics are welcome
16:23 <blackboxsw>
#topic Previous Actions
16:24 <blackboxsw>
Checking last meeting's minutes we were clear of old actions.
16:24 <blackboxsw>
so we'll jump to the next topic this week.
16:24 <blackboxsw>
#topic Recent Changes
16:26 <blackboxsw>
the following commits landed in cloud-init tip since the last status meeting
16:26 <blackboxsw>
- Allow identification of OpenStack by Asset Tag
16:26 <blackboxsw>
[Mark T. Voelker] (LP: #1669875)
16:26 <blackboxsw>
- Fix spelling error making 'an Ubuntu' consistent. [Brian Murray]
16:26 <blackboxsw>
- run-container: centos: comment out the repo mirrorlist [Paride Legovini]
16:26 <blackboxsw>
- netplan: update netplan key mappings for gratuitous-arp
16:26 <blackboxsw>
[Ryan Harper] (LP: #1827238)
16:26 <ubot5>
Launchpad bug 1669875 in OpenStack Compute (nova) "identify openstack vmware platform" [Wishlist,Confirmed]
16:26 <ubot5>
Launchpad bug 1827238 in cloud-init "Machines fail to deploy because cloud-init needs to accept both netplan spellings for grat arp" [Medium,Fix committed]
16:30 <blackboxsw>
I was poking around our trello board to see if we've moved other cloud-init related content into the done lane, but I think those commits about capture the recent work
16:30 <blackboxsw>
#link https://trello.com/b/hFtWKUn3/daily-cloud-init-curtin
16:30 <blackboxsw>
#topic In Progress Development
16:31 <blackboxsw>
our active reviews are located here (as mentioned in the topic)
16:31 <blackboxsw>
#link https://code.launchpad.net/cloud-init/+activereviews
16:32 <blackboxsw>
Goneri: thanks for all the work on freebsd branches, there has been some good momentum there
16:32 <blackboxsw>
there is ongoing work from Azure datasource that will likely land in the next week or two
16:33 <paride>
^^ "run-container: centos: comment out the repo mirrorlist" is only actually relevant when using an http/https proxy; in all the other cases the mirrorlist works as usual
16:33 <blackboxsw>
and some network-related changes landing shortly
16:33 <blackboxsw>
paride: thank you paride for the extra note
16:33 <AnhVoMSFT>
blackboxsw can you share more details on the work from Azure datasource ? Any bug that we can reference?
16:33 <blackboxsw>
I was thinking https://code.launchpad.net/~jasonzio/cloud-init/+git/cloud-init/+merge/364012 AnhVoMSFT
16:35 <rharper>
related to sorting out coverage of all the network-related scenarios, so that we configure network in a way that ensures access to IMDS and internet in the face of additional static ips on the same subnet as the primary interface, multiple dhcp interfaces with default routes, etc.
16:35 <AnhVoMSFT>
I see - I think a bigger change is potentially needed there, as there was some issue around identifying the primary/secondary NIC. We got confirmation from our networking team that the first NIC returned is the primary
16:35 <rharper>
AnhVoMSFT: good to know; that was our observation
16:36 <rharper>
AnhVoMSFT: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1815254 , related as well; the plan being to put in place some source-based routing;
16:36 <ubot5>
Launchpad bug 1815254 in cloud-init (Ubuntu) "Azure multiple ips prevent access to metadata service" [Undecided,Confirmed]
16:38 <AnhVoMSFT>
thanks rharper - is that something that should be changed/fixed from cloudinit, or is this more platform related?
16:38 <rharper>
that's a good question; generally it would be great if a platform were to include source-routes and metrics in the config they send
16:38 <AnhVoMSFT>
if the latter I will file a workitem on our side to go do some research and get the right team to take a look at it
16:39 <rharper>
currently no cloud does this, rather some indicate a primary via metadata, and then the OS scripts apply a metric to all non-primary routes to ensure that default routes go to the primary
16:39 <AnhVoMSFT>
I see - so I guess we can do similarly on Azure since we know what the primary is (first nic returned in IMDS)
16:40 <rharper>
AnhVoMSFT: so in the short term, I think cloud-init should (where possible with the OS network config) provide additional tuning (likely post-scripts in some cases) to tune the routing for what cloud-init knows is the primary route
16:40 <rharper>
AnhVoMSFT: yes, I prefer a primary=True or whatever, but it's good enough to have the current behavior documented (in the code)
16:40 <AnhVoMSFT>
thanks rharper
16:40 <rharper>
so if it changes/breaks, then we know
16:44 <rharper>
I think that covers our in-progress items for the moment
16:45 <rharper>
not sure if the bot will listen to me, but just in case
16:45 <robjo>
Be mindful that in Azure the metadata service may lag behind by minutes w.r.t. secondary IPs on an interface
16:45 <rharper>
#topic Office Hours
16:45 <rharper>
robjo: in general, my understanding is that the instance has to be offline to change vnets and such, and booting back up has been enough time to see IMDS updated; do you see differently?
16:46 <AnhVoMSFT>
robjo that is good to know, I will check on that
16:46 <robjo>
We've had various issues with cloud-netconfig due to the metadata server in Azure being slow and reverted to polling, which of course got us in trouble with API rate limits
16:46 <rharper>
robjo: interesting
16:47 <rharper>
We'll be here in channel, so if you've got merges or bugs that need an eye or just questions, fire away
16:47 <AnhVoMSFT>
robjo feel free to file a bug on that and we will investigate - IMDS is our partner team so we'll get some answer quickly there
16:48 <AnhVoMSFT>
rharper, a couple things I want to ask for Office Hours
16:48 <robjo>
AnhVoMSFT: We have been working with Stephen Zarkos on the issues
16:48 <blackboxsw>
#topic Office Hours (next ~30 mins)
16:48 <AnhVoMSFT>
robjo I will ping Stephen and get more detail and see if we have any follow up items
16:48 <blackboxsw>
sorry folks got pulled away for a bit thx rharper
16:48 <robjo>
And double checked that the polling direction was OK from the Microsoft perspective before we implemented that
16:49 <AnhVoMSFT>
I see, glad you're not blocked on it
16:50 <robjo>
rharper: We always had bug reports that upon reboot not everything was always configured when secondary IP addresses were in play. But theoretically yes upon reboot everything should be there
16:50 <AnhVoMSFT>
rharper we have a customer who booted up a VM based on 18.04, which uses netplan. Cloudinit wrote a netplan file to the image. He then installed ifupdown, then had some networking change which triggered a mac address change. Upon rebooting, cloudinit tries to use eni, but netplan file was still there, which caused his VM to mess up the network config
16:50 <robjo>
putting cloud-netconfig into polling mode pretty much addresses the issues we had reports about
16:51 <rharper>
AnhVoMSFT: yes; that sounds very likely
16:51 <rharper>
AnhVoMSFT: did they file a bug?
16:51 <rharper>
cloud-init net "detects" which service is present
16:51 <AnhVoMSFT>
I'm checking to see if this should be a bug, or that is expected behavior
16:51 <rharper>
so if they did not uninstall netplan.io then cloud-init will likely prefer that over eni
16:52 <AnhVoMSFT>
cloudinit actually prefers eni if ifupdown is installed, I think
16:52 <rharper>
AnhVoMSFT: so the /etc/netplan/*.yaml would only trigger things if netplan is still present; the systemd-generator will read yaml and write out networkd files
16:53 <AnhVoMSFT>
right, I think the customer's mistake was to not uninstall netplan (or remove any netplan configuration file) after installing ifupdown
16:53 <rharper>
AnhVoMSFT: right; I think we'll need to see the log and system state, but it sounds like an incomplete uninstall of netplan
16:53 <rharper>
uninstall of netplan should be enough to make the cloud-init.yaml inert
16:54 <rharper>
https://netplan.io/faq#how-to-go-back-to-ifupdown
16:54 <rharper>
AnhVoMSFT: it should have automatically uninstalled netplan.io
16:54 <AnhVoMSFT>
I'm not sure if there is much we can do from the cloudinit side - perhaps if choosing eni, disable the cloud-init netplan yaml
16:54 <rharper>
AnhVoMSFT: well, we could check writable paths of the renderers
16:54 <AnhVoMSFT>
rharper I don't think that is the behavior on 18.04 - installing ifupdown will not uninstall netplan
16:55 <rharper>
AnhVoMSFT: you're right; =(
16:55 <rharper>
that sort of feels like a bug in the packaging
16:55 <AnhVoMSFT>
yes, I share the same sentiment
16:56 <AnhVoMSFT>
I will go ahead and file a bug so even if we don't have a short term action we can still capture the discussion
16:57 <rharper>
AnhVoMSFT: thanks, I'm pinging in #netplan and the bug will be great so we can figure out the right plan
16:59 <AnhVoMSFT>
second question: We have an intern working in our team and as part of warming up in cloudinit he wrote some additional capabilities into cloud-init analyze, adding a "boot" module (in addition to show/blame/dump), which collects timestamps of phases happening during vm booting up, but before cloudinit started, such as kernel initialization, systemd initialization..
17:00 <AnhVoMSFT>
this should work for all clouds (he tested in AWS/GCP). Currently it only works for distros that use systemd. He'll try to figure out how to get those counters for freebsd and others
17:00 <AnhVoMSFT>
rharper since you were the original author of analyze, I'm trying to gauge the interest on this and we're open to suggestions/questions
17:01 <cyphermox>
rharper: they can coexist and configure each their own interface, so it's not a conflict. It's no different than coexisting ifupdown and NetworkManager, or also NetworkManager and systemd-networkd
17:01 <rharper>
AnhVoMSFT: that sounds excellent
17:01 <blackboxsw>
nice AnhVoMSFT on the commandline extensions!
17:01 <rharper>
AnhVoMSFT: happy to review branch or Work-in-Progress when it's available
17:02 <AnhVoMSFT>
thanks rharper blackboxsw we will have that in a branch very soon.
17:03 <AnhVoMSFT>
cyphermox if that is the case then either the customer or cloudinit needs to make sure the system does not have conflicting configuration for netplan/eni.
17:03 <rharper>
cyphermox: ok; would you be open to some sort of warning about having config in both or something? I dunno; it's just not a great experience to add the new package, configure it, reboot and not have networking since the same interface was configured (differently) in both packages
17:03 <blackboxsw>
yeah, I'm quite interested in any additional cli functionality that cloud-init more versatile as a system debug tool
17:04 <blackboxsw>
makes cloud-init more versatile
17:04 <cyphermox>
rharper: I'm not opposed to a warning, but that's not necessarily better UX.
17:05 <cyphermox>
debconf prompts are quite annoying to have at upgrade, and just writing it out people are likely to miss it altogether
17:05 <cyphermox>
(so you wouldn't really gain much)
17:05 <AnhVoMSFT>
blackboxsw yep that was the goal - we want to be able to deploy 1000 VMs, then use cloud-init analyze output to analyze the 50th/99th percentile of where the timing was spent during system boot, and we need some more insights into phases before cloud-init started as well
17:05 <rharper>
cyphermox: agreed; having a pointer to suggest cleaning/checking/confirming configs if /etc/netplan/ is non-empty and netplan.io is installed
17:06 <cyphermox>
rharper: one option is to parse enough of /etc/network/ to catch mentions of the interface, but that's not necessarily super solid (though it's the best option), because people can rename interfaces in netplan and match by mac
17:06 <rharper>
might be helpful; though I agree that they may still ignore that; and cloud-init could do some more work to see if an image has multiple renderers available and ensure it didn't leave config for a previous boot around
17:07 <rharper>
cyphermox: yeah; cloud-init knows more about the config and both formats; we're likely in a better spot to see "you've configured this interface twice"
17:08 <cyphermox>
rharper: so in short, I'm not opposed to improving the UX, but I'm not wowed by any solution right now (even mine)
17:09 <rharper>
cyphermox: that's fair; thanks
17:09 <AnhVoMSFT>
i think a fix in cloudinit might make most stakeholders happy here. It knows which configuration file it wrote, so it can definitely look for conflicting configurations
17:09 <rharper>
cyphermox: AnhVoMSFT is going to file the customer bug with details and we can discuss what (if any) improvements are to be made; I suspect cloud-init can help most here
17:09 <cyphermox>
yes, I think so too
17:09 <rharper>
cyphermox: thanks for the input
17:09 <AnhVoMSFT>
it can't be responsible for everything the customer does though. If customer writes some my-own-netplan.yml, we can't help much
17:10 <cyphermox>
rharper: but hey, if someone were to write a check when running netplan apply that there exists config in /etc/network, I wouldn't have many issues merging it
17:10 <rharper>
AnhVoMSFT: right, we have several "maybe_delete_if" where we verify expected output before we remove things
17:10 <cyphermox>
I just know I won't have time to look into this myself in the near future
17:10 <rharper>
cyphermox: ack
17:11 <cyphermox>
I think what will help most is aggressively deprecating and removing ifupdown
17:13 <cyphermox>
that said, the best we can realistically do for the time being is to demote it to universe
17:13 <cyphermox>
(and that's not going to change anything for UX)
17:15 <AnhVoMSFT>
we had another instance of someone installing ifupdown2, which had the effect of removing cloud-init on debian/ubuntu 16.04
17:16 <AnhVoMSFT>
and totally hosed his system, but that's a different issue altogether
17:27 <blackboxsw>
thanks for the good discussion folks, I guess we'll just add an action item to follow up on a netplan bug for next time to see where we are at
17:31 <blackboxsw>
#action follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
17:31 * meetingology follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
17:31 <blackboxsw>
ok, I'll post minutes on this. thank you again rharper for driving
17:31 <blackboxsw>
and for the participation robjo cyphermox and AnhVoMSFT
17:31 <blackboxsw>
#endmeeting
Generated by MeetBot 0.1.5 (http://wiki.ubuntu.com/meetingology)