Everything you need to know about Windows logons in one blog series continues here!
I have threatened on several occasions now to do a follow-up to my previous article on Windows logon times which incorporates the findings from my “logon times masterclass” that I have presented at a few events. The time has come for me to turn these threats into reality, so this series of articles and accompanying videos will explore every trick we know on how to improve your Windows logon times. As many of you know, I work predominantly in Remote Desktop Session Host (RDSH) environments such as Citrix Virtual Apps and Desktops, VMware Horizon, Windows Virtual Desktop, Amazon Workspaces, Parallels RAS, and the like, so a lot of the optimizations discussed here will be aligned to those sorts of end-user computing areas…but even if you are managing a purely physical Windows estate, there should be plenty of material here for you to use. The aim of this is to provide a proper statistical breakdown of what differences optimizations can make to your key performance indicators such as logon time.
This series of articles is being sponsored by uberAgent, because the most important point to make about logon times is that if you can’t measure them effectively, then you will never be able to improve them! uberAgent is my current tool of choice for measuring not just logons (which it breaks down into handy sections that we are going to use widely during this series) but every other aspect of the user’s experience. All of the measurements in this series are going to be done via uberAgent, and as it comes with free, fully-featured community and consultants’ editions, there’s absolutely no reason that you can’t download it and start using it straight away to assess your own performance metrics. I’ve written plenty about uberAgent on this blog before, and I stand by it as the best monitoring tool out there for creating customized, granular, bespoke consoles that can be used right across the business. I’ve recently deployed it into my largest current client, so you can be sure I am putting my money where my mouth is – if it didn’t do the job, I wouldn’t have used it for my customers, simple as. Go and try uberAgent right now – you won’t regret it!
Part 3 – the impact of the “first logon”
Today we are moving onto an image with all applications applied and we are going to start stepping through a huge series of articles on all the factors that can potentially affect the logon process. I was first intending to start by looking at policies, but I am still in the midst of creating all my policies – trying to replicate the masses of GPOs and GPPs that you commonly see in enterprises is no small job! So for part 3, let’s actually concentrate firstly on something which has already jumped out at us in our uberAgent monitoring in the first couple of parts – the fact that a “first logon after restart” always seems to be a bit slower than all subsequent ones.
Let’s look at the statistics that we have around this. Again, we are operating on Server 2019 and Windows 10 2004 instances here for our base.
Firstly, let’s restart the instances then log on a few times via Citrix. These images have around 110 apps deployed to each, they are not optimized in any way, and they are also having their user profiles removed at logoff. They were restarted at 14:50, so there is no indication that they are still “busy” after being restarted. Now, let’s open up uberAgent and see what the statistics look like. RDS02 is the Server 2019 host in use, DT002 is the Windows 10 instance.
You can see straight away that the first logons (highlighted in yellow) are substantially longer than the others. It’s nothing to do with provisioning of AppX packages – this is happening at each logon because the profiles are being removed.
Is it just because we aren’t leaving them long enough between reboots? Let’s restart them and wait for an hour this time.
It looks pretty much the same, and it is fascinating that it is the RDSH Server 2019 instance that has the worst first logon performance, rather than Windows 10. That was unexpected – I certainly thought Windows 10 would be worse, but it is considerably better.
Thinking about this, I was launching both the workloads simultaneously, so let’s test by staggering the launches after reboot this time to rule out any contention.
OK, so no real difference there anyway (in fact overall here our Server 2019 instance performed worse than normal, which again is surprising). But let’s stay focused on the specific issue we are facing here.
The first logon after a restart is taking 30-60 seconds longer than subsequent launches. That’s not good!
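If you want to put a number on that gap in your own environment, a minimal sketch along these lines might help. It assumes you have exported a list of logon durations (in seconds) for one machine, ordered from the first logon after restart onwards. The function name and the sample values are hypothetical and are not measurements from this article.

```python
from statistics import mean


def first_logon_penalty(durations_s):
    """Return how many seconds slower the first logon was than the mean of the rest."""
    if len(durations_s) < 2:
        raise ValueError("need the first logon plus at least one subsequent logon")
    first, *rest = durations_s
    return first - mean(rest)


# Hypothetical sample values in seconds, for illustration only.
sample = [95.0, 50.0, 48.0, 52.0]
print(f"First logon penalty: {first_logon_penalty(sample):.1f}s")
```

Comparing against the mean of the later logons, rather than just the second one, smooths out a single noisy logon.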
So how can we mitigate this? The easiest answer is to take the first logon out of the equation, and that’s exactly what we will do.
I have to give a hat tip to fellow CTP George Spiers here as he originally blogged about this method of mitigating the issue before I even wrote my first “logon times” article way back when. In fact I actually remember using his method to shave off the last couple of seconds of logon time back at a challenging customer sometime in 2015. I have added a few bits in to cover all of the process from end to end.
In order to set up an automatic logon, firstly, you need a user account to do it. Create a domain user called autologon (or similar). Make sure it has a secure password and is a member of no other group apart from Domain Users (although if you are using one-to-one Server VDI without RDSH, it must also be given the right to log on to the target boxes).
Also note that I have not set the password to never expire. Bear in mind that if you do it this way too, you will need to update both the password and the script that sets it whenever the password expires.
Next, download the SysInternals tool autologon.exe and ensure that it either exists on all of the boxes that you wish to log on automatically or is accessible from them. I opt to copy it into the \windows\system32 folder via any one of a number of centralized methods – how you achieve this is up to you. There is an x86 version of the executable (autologon.exe) and an x64 one (autologon64.exe) – use the appropriate one.
Next we need to call autologon when the device starts up. The command we need is shown below:-
REM 64-bit Windows
autologon64 USERNAME DOMAIN PASSWORD /accepteula
REM 32-bit Windows
autologon USERNAME DOMAIN PASSWORD /accepteula
REM So for this use case our command line would simply be
autologon64 autologon JAMES-RANKIN Aut0l0g0n /accepteula
Obviously, change as required for your environment.
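If you would rather build that command line from a script than hard-code it per architecture, here is a rough sketch in Python. The two executable names and the argument order come from the commands above. The function names are my own, and the assumption that a 64-bit machine reports itself as AMD64 via platform.machine() should be checked on your own builds.

```python
import platform


def autologon_executable(machine=None):
    """Pick the Sysinternals binary name for a CPU architecture string."""
    machine = (machine or platform.machine()).upper()
    # Assumption: x64 Windows reports AMD64; anything else is treated as x86.
    return "autologon64.exe" if machine in {"AMD64", "X86_64"} else "autologon.exe"


def build_autologon_command(username, domain, password, machine=None):
    """Return the argument list: USERNAME DOMAIN PASSWORD /accepteula."""
    return [autologon_executable(machine), username, domain, password, "/accepteula"]


# Example values taken from the article; change as required.
command = build_autologon_command("autologon", "JAMES-RANKIN", "Aut0l0g0n", machine="AMD64")
print(" ".join(command))
```

On a Windows target the resulting list could be passed to subprocess.run, but bear in mind that anything printing or logging the list will expose the password.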
The next step depends on whether you are using PVS or not. If you are using PVS, add the line to your image sealing script so that the autologon is enabled as the image is finalized.
If you’re not using PVS, you need to create a Group Policy Shutdown Script on the target devices. Save the command above as a batch file, or in whatever scripting language you prefer (you can save the script on the network, or build it into your image locally), and call it from a GPO Shutdown Script. We need to run it at shutdown, as running it at startup does not set the autologon in time for it to actually log on. Also, we can’t use a Scheduled Task, as the Task Scheduler has no shutdown trigger capability (this is because the Scheduler service generally gets stopped before it can execute a shutdown task).
Now we need to make sure that the autologon only happens once. We need to create a GPO that edits some Registry permissions and then sets up a Scheduled Task to run when the autologon user logs in.
Firstly, create a policy setting that adds the autologon user account to the ACL on the Winlogon key in the Registry, as below
Computer Configuration | Policies | Windows Settings | Security Settings | Registry
Select the HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon key
Add the autologon user to the ACL with Full Control privileges
Make sure you allow the permissions to propagate
Next, create a script with the content as below (you can use PowerShell or VBScript rather than batch commands if you wish) and save it somewhere accessible (either on the network, or on the local device)
reg delete "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon" /v DefaultUserName /f
reg add "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon" /v AutoAdminLogon /t REG_SZ /d 0 /f
Now we need to add this script as a Scheduled Task to run when the autologon user logs in. You can create the Scheduled Task within your image or use Group Policy Preferences or another method. We’ve used GPP below.
Computer Configuration | Preferences | Control Panel Settings | Scheduled Tasks | New Task (at Least Windows 7)
Configure the options as below
Once this is done and propagated to your target devices, we should now have a machine that boots up, automatically logs on your specified user…
…and then calls the batch script which removes the DefaultUserName and resets the AutoAdminLogon Registry values (thus disabling the automatic logon), and then logs the user out ready for the next one
There are a couple of issues that you may find nagging at you about this. The first is around security.
If you’re in a PVS environment then this method is more secure, because you only have the username and password in the sealing script, which is generally not available to all users. However, if you are not using PVS, then you need the username and password stored in a Shutdown Script, which is stored either on the device or on the network.
There are a few things you could do to mitigate this. Firstly, you could set the permissions on the actual file the Shutdown Script calls so that only SYSTEM and Administrators have the rights to access it. Shutdown Scripts run as SYSTEM so this would be OK.
You could also deny the autologon account the right to log on to, or access across the network, any device that does not use it for this first-logon workaround (you can lock this down by using security GPOs). Additionally, you can audit the times it is used: it should only be used just after reboot windows, so any usage outside of those would usually be an indicator of compromise.
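The auditing idea can be sketched as a simple time-window check. This assumes you have already collected the autologon account’s logon times (for example from event ID 4624 entries in the Security log) and the machine’s reboot times. The 15-minute window, the function name and the timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def suspicious_logons(logons, reboots, window=timedelta(minutes=15)):
    """Return logon times that do not fall within `window` after any reboot."""
    return [
        logon
        for logon in logons
        if not any(reboot <= logon <= reboot + window for reboot in reboots)
    ]


# Hypothetical timestamps, for illustration only.
reboots = [datetime(2020, 11, 2, 5, 0)]
logons = [datetime(2020, 11, 2, 5, 3), datetime(2020, 11, 2, 13, 40)]
for logon in suspicious_logons(logons, reboots):
    print("Unexpected autologon account usage at", logon.isoformat())
```

Tune the window to how long your machines actually take to boot and complete the automatic logon, otherwise slow boots will show up as false positives.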
You could also (thanks Francois-Xavier for reminding me) set up a local account as the autologon account rather than a domain one. You would need it on every target device, but this is easy to achieve. Using this method means that you’d need to store your Shutdown Script and the script called by the Scheduled Task locally rather than on the network, but this is again easy to achieve. This would mean that network resources would not be accessible with this account – other machines with the account created would be, but you could maybe mitigate against that with unique passwords (but that opens up a new can of worms as every script to set the autologon would be unique, unless you wanted to make a variable like the computer name part of the password). But the point stands – a local account for this is achievable if using a domain account causes ripples in your security team.
Finally, the reason we use AutoLogon from SysInternals rather than writing the DefaultPassword value into the Registry is that AutoLogon stores the password encrypted as an LSA secret under HKLM\SECURITY\Policy\Secrets, which is not accessible through the Registry Editor or command line tools, or even by a remote administrator. The only user that can decrypt this is a locally logged-on admin account, so it adds a layer of security.
The second issue worth raising is that if you are in a Citrix Virtual Desktops environment where machines restart when they are logged off, then this will not work as it is laid out here – essentially, you will go into an endless reboot loop. If this is the sort of environment you have, you will need to find some way to log the autologon user out without triggering a restart.
So, we’ve set all of this up – now it is time to go back to our workers, restart them, and see if we have improved our statistics!
This time we’ve done ten logons in total after the machines were restarted to give us a wider spread. Let’s fire up uberAgent and see what we shall see.
This time (Server 2019 is highlighted yellow, Windows 10 underlined in red) we can see that there isn’t as much variation going on. In fact, not only is that outlier “first logon” not skewing the stats, but they seem much more settled towards a proper average – in our initial testing, the “second” logon which quickly followed the first actually seemed to take a little longer as well. So I think we can conclude that not only does initiating an automatic logon make a big difference, but allowing it to then “settle down” after that first logon is a good idea as well.
So, what would I advise, based around these findings?
Firstly, setting up an initial automatic logon in this way seems to be a good idea. It definitely makes a big difference. These VMs were pretty unoptimized (although heavily loaded with applications), but we have dropped from logons that in some cases were 2 minutes or more down to 30-60 seconds for that first logon – a 50-75% decrease.
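For anyone checking the arithmetic, the percentage is just the saving divided by the original time. A quick sketch, using 120 seconds as the “2 minutes” starting point and 60 and 30 seconds as the two ends of the improved range (the function name is my own):

```python
def percent_decrease(before_s, after_s):
    """Return the percentage reduction going from before_s to after_s."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s * 100


for after in (60, 30):
    print(f"120s -> {after}s: {percent_decrease(120, after):.0f}% decrease")
```

The same function works for your own before and after averages when you baseline a change.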
Also, though, I think it is very important to pre-boot machines ahead of user logon times. Even with the automatic first logon, it seems that a “second” logon very soon after the first one still encounters some contention. So I would also recommend booting your targets up (and allowing that first logon/logoff to complete) at least ten minutes before you anticipate your users requiring access. I appreciate that with one-to-one desktops where the machines are powered up on demand this can be tricky, but it would definitely be worth it if your testing indicates that the same issue exists in your environment.
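To turn that recommendation into a power-on schedule, you can work backwards from the first expected user logon. In this sketch the ten-minute settle time comes from the advice above, while the five minutes allowed for boot plus the automatic logon/logoff is purely an assumption, as is the function name. Measure your own machines and substitute real figures.

```python
from datetime import datetime, timedelta


def latest_boot_time(
    first_user_logon,
    boot_and_autologon=timedelta(minutes=5),  # assumption: measure this yourself
    settle=timedelta(minutes=10),  # "at least ten minutes" from the advice above
):
    """Return the latest power-on time that still leaves the settle period."""
    return first_user_logon - settle - boot_and_autologon


# Hypothetical example: users start arriving at 08:00.
print(latest_boot_time(datetime(2020, 11, 2, 8, 0)))
```

Treat the result as the latest acceptable time rather than a target, since booting earlier only adds headroom.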
My advice to you: configure and secure an automatic logon, and boot your workers ahead of expected logon times if you possibly can!
Anyway – that covers the subject of “initial logon impact”. On Server 2019 and Windows 10, first logon impact is definitely a factor. Using the techniques and recommendations here can certainly make a difference, although the size of the difference will vary depending on your environment and the level of optimization. I worked with a heavily optimized customer where implementing automatic logon made a difference of two seconds – but given that their logon time before that was an average of eleven seconds, it was still a significant percentage decrease. Testing and baselining will show you where you can make the savings – and that’s why you should also go and download uberAgent and use it to measure your KPIs right now because monitoring is something you can’t do without.
The video tutorial showing how to set this up is linked below.
Next up for this series will be a deep look at policies and preferences, which shouldn’t be too long because a) Dave B has finally sent me some template policies so I can overload my systems with them, and b) they’ve closed the damned pubs. Later!