Microsoft Teams’ profile bloat is truly awful. Here’s a quick guide to getting around it!
This is a basic runthrough of a problem that I ran into today whilst gathering the data for part #2 of my “ultimate guide to logon optimizations” (and apologies for the delay on getting it out there – paid work and family have an unfortunate tendency to get in the way!).
The Teams mega-bloat
Many of those out there who use FSLogix Profile Containers have noticed that when you run Teams for the first time, there is a HUGE increase in the size of the profile. The confusing thing is, in real terms, there isn’t actually that much of an increase in the DATA in the profile – it all seems to be white space. This isn’t particularly confined to FSLogix Profile Containers, either – if you are using Citrix Provisioning Services, you will also notice a dramatic uptick in the utilization of your write cache when users run Teams for the first time. Now naturally, in an FSLogix environment, this problem is only observed when you are using dynamic sizing for your containers – but I think that is a pretty common configuration, for a number of reasons. Firstly, you aren’t committing all of the storage in one go, and secondly, it is easy to tell whose profiles are getting close to the maximum size limit, as you can actually see how much space has been used.
The reason that FSLogix Profile Containers expand like this is that the VHD or VHDX file that you are using never actually shrinks when space is relinquished. Profile Containers (and Office365 Containers, for the record) expand dynamically but then remain at the “high watermark” level if some of that space is subsequently released. This sounds a bit odd, but when you remember that FSLogix is clever enough to then use that “white space” before it expands the profile further, it isn’t that much of an issue. The same sort of principles apply to the PVS write cache, although as the write cache is a “system” rather than a “user” artifact, the behaviour is not an exact replica. If you’re using FSLogix Profile Containers, you can use any one of a number of shrink scripts to compact the VHD(X) files back down when not in use (I always use Jim Moyle’s script here, although there are many equally good other ones out there in the community) and reclaim that space properly. With the PVS write cache, you simply have to wait until the next device restart in order for that to be cleared out.
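For reference, the compaction those shrink scripts perform boils down to a diskpart pass over the container while it is not in use. A minimal sketch (the UNC path is a placeholder, and the profile must NOT be mounted by FSLogix when you run it, or you risk corruption):

```
rem Save as compact-vhdx.txt and run with: diskpart /s compact-vhdx.txt
select vdisk file="\\fileserver\profiles\Profile_user1.vhdx"
attach vdisk readonly
compact vdisk
detach vdisk
exit
```

The community scripts wrap this same operation with safety checks (is the file locked, how much space will be reclaimed, logging), which is why I'd use one of those rather than a bare diskpart pass in production.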
You can see the dramatic increase in utilization of the Profile Container through these screenshots. Here is a snapshot of the container after I have logged on to a Windows 10 VDA for the first time:-
Now, after I launch Teams, but prior to logging in, you can see it has expanded somewhat:-
Once I have signed into Teams properly (as I have to supply MFA), you can now see the increase becomes rather dramatic, to say the least:-
But once this has all finished, checking the size of the user’s profile (either in use, or by mounting the VHD) reveals a very much lower figure:-
It is clear that something is writing out a huge amount of data, and then removing it. Because of how VHD-based profile solutions (and PVS) operate, this means that storage is being committed, and then left as white space.
But is this really an issue?
Well in basic terms, probably not. Because FSLogix can (and will) use that white space before anything else, you’ve simply dynamically expanded the user’s profile ahead of time. You’re eventually going to use that 4-5GB anyway, so all you’ve done is a premature expansion. In PVS terms, as long as your write cache is sized properly and can absorb the uptick, then you simply have to wait for the next reboot to get around the problem (and on most PVS systems that’s usually somewhere between 24 and 48 hours away, sadly).
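To make the white-space mechanics concrete, here is a toy model (illustrative only – this is not FSLogix code) of a container whose allocated size only ever grows to the high watermark, while new writes reuse the white space before expanding the file further:

```python
# Toy model of "high watermark" behaviour in a dynamically expanding
# container: the file on disk only ever grows, while the live data
# inside it can shrink - and new writes reuse that white space first.
class ToyContainer:
    def __init__(self):
        self.used = 0        # live data in the profile (GB)
        self.allocated = 0   # size of the VHD(X) file on disk (GB)

    def write(self, gb):
        self.used += gb
        # the file only expands when live data exceeds the watermark
        self.allocated = max(self.allocated, self.used)

    def delete(self, gb):
        # deleting data frees space *inside* the VHD, but the file
        # itself never shrinks - that difference is the white space
        self.used = max(0, self.used - gb)

c = ToyContainer()
c.write(5)    # Teams' first-run burst: roughly 5 GB written...
c.delete(4)   # ...then most of it removed again
c.write(2)    # later writes land in the white space first
print(c.used, c.allocated)   # -> 3 5
```

So the 4-5GB first-run burst is a one-off premature expansion rather than a leak – the storage gets used again eventually, which is exactly the "is this really an issue?" calculus above.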
But if you’re working at scale, that’s still a helluva uptick. I have one customer who saw a 17TB increase overnight after they turned on the auto-launch for Teams. If you assume some of those users aren’t going to use it much, then that’s a whacking great storage increase to absorb in one go. And this customer also paid for their storage by the GB as part of a managed service, so to swell it by 17TB in one fell swoop was a huge pill to swallow. If there is a cost benefit to the business in reclaiming this storage capacity, then it is well worth finding a way around this issue.
Also, there is the point that Teams really shouldn’t be behaving like this, ideally. Particularly for situations where technologies like read-only disks are in use, an application that vomits out a huge amount of temporary data and then removes it all straight afterwards isn’t a great fit. In many EUC environments, having a slick, streamlined and predictable core set of applications that fit into your management patterns is vital. Any change in behaviour can have a knock-on impact on other aspects of the service that you are trying to provide.
Fixing it (for now!)
So how do we go about getting around this?
Firstly, make sure you are using the proper machine-based installer, otherwise you will give yourself more problems than just this one. This post is a good reference, and contains links to the machine-based installer downloads. Make sure you run the installer with the switches specified in the article, as below:-
msiexec /i <path_to_msi> /l*v <install_logfile_name> ALLUSER=1 ALLUSERS=1
If you don’t get the installer switches right, you will potentially have a horrendous time as the machine-based and user-based installers get into a fight, so make sure you don’t make a mistake! For what it’s worth, there seems to be very little difference in the actual layout of the installer whether you use the ALLUSER and ALLUSERS switches or not. They both simply drop a folder into %PROGRAMFILES(x86)% which then hooks into each new user’s session. I think the only difference between them is how updating is allowed, but I don’t know that for sure – my upcoming CUGC session on Teams with Rene Bigler may shed some light on it though 🙂
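If you want to audit existing profiles for leftovers of the per-user installer (which lives under %LOCALAPPDATA%\Microsoft\Teams, with Update.exe as its tell-tale updater stub), something like this rough sketch could flag them – the helper name and approach are mine, not an official tool:

```python
import os

# Hypothetical helper: flag a profile that still carries the per-user
# Teams installer, which will fight with the machine-wide install.
# The per-user version sits under %LOCALAPPDATA%\Microsoft\Teams and
# ships Update.exe as its self-update stub.
def has_per_user_teams(local_appdata):
    teams_dir = os.path.join(local_appdata, "Microsoft", "Teams")
    return os.path.isfile(os.path.join(teams_dir, "Update.exe"))

if __name__ == "__main__":
    la = os.environ.get("LOCALAPPDATA", "")
    print("Per-user Teams present:", bool(la and has_per_user_teams(la)))
```

You would loop this over C:\Users (or your profile store) to find the offenders before rolling out the machine-wide installer.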
Next, I had to go and do some Process Monitoring to try and find out exactly what was spitting out all of this data. However, I was a bit perturbed to find that precisely nothing showed up in my Process Monitor log to do with Teams data.
Fortunately, the community always saves you in these situations. Enter Jim Moyle to tell me that the Process Monitor filter driver runs at a different altitude to the FSLogix one, so I would be unable to see what I needed to whilst using FSLogix at the same time. Rather than try and adjust the Process Monitor altitude (it’s a Friday afternoon!), I opted to remove FSLogix instead and repeat the process *normally*.
This time, we could see a lot of activity around the Teams folders in the user profile. Some folders in particular seemed to be changing their allocated disk space values very quickly, both increasing and decreasing. A look at Process Monitor confirmed this – over 30,000 writes into one particular folder:-
It appears the culprit is a folder called %APPDATA%\Microsoft\Teams\Service Worker\CacheStorage. I have no idea what this is used for, but it is definitely the source of the huge amount of data being written and then removed. Interestingly, Process Monitor didn’t show a single “deletion” during this time, but I am wondering if the act of overwriting an entire folder’s contents doesn’t log as a “delete file” action per se. Maybe someone a bit more in touch with Process Monitor can tell me 🙂
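If you want to tally the write volume yourself, you can export the Process Monitor capture to CSV (File | Save, CSV format) and sum the WriteFile lengths under a given folder. This sketch assumes the standard column names of a Procmon CSV export and the "Length: n" format in the Detail column:

```python
import csv, io, re

# Sketch: total bytes written beneath a folder, from a Process Monitor
# CSV export. Assumes the standard export columns ("Operation", "Path",
# "Detail") and the "Offset: x, Length: y" detail format for WriteFile.
def bytes_written_under(csv_text, folder):
    total = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Operation"] == "WriteFile" and folder.lower() in row["Path"].lower():
            m = re.search(r"Length:\s*([\d,]+)", row["Detail"])
            if m:
                total += int(m.group(1).replace(",", ""))
    return total
```

Running that against my capture is what put a number on the churn – tens of thousands of writes into CacheStorage, nearly all of it gone again by logoff.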
Now interestingly, the article I linked earlier (https://docs.microsoft.com/en-us/microsoftteams/teams-for-vdi) has a section that is titled “Teams cached content exclusion list for non-persistent setup”. However, this doesn’t seem to be accurate, in my opinion. Certainly the blanket exclusion for *.txt files feels like a weird flex, so be careful if using it.
However, it is clear that to get around this aggressive expansion issue, we need to remove the folders concerned from the profile. Naturally, the way to do this is by using the FSLogix redirections.xml file. Firstly, configure an FSLogix redirections file with the following text:-
<?xml version="1.0" encoding="UTF-8"?>
<!--Generated BY ME FTMALM-->
<FrxProfileFolderRedirection ExcludeCommonFolders="0">
<Excludes>
<Exclude Copy="0">AppData\Roaming\Microsoft\Teams\Service Worker\CacheStorage</Exclude>
</Excludes>
</FrxProfileFolderRedirection>
Save this on a network share that is accessible from all your devices, and make sure it is called redirections.xml.
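Before pushing the file out, it's worth sanity-checking that it parses and has the expected FSLogix structure (a FrxProfileFolderRedirection root containing an Excludes block). A quick illustrative check – my own helper, not an FSLogix tool:

```python
import xml.etree.ElementTree as ET

# Quick sanity check before dropping redirections.xml on the share:
# confirm it parses, has the FSLogix FrxProfileFolderRedirection root,
# and actually contains our Teams Service Worker exclusion.
def check_redirections(xml_text):
    # encode first: ElementTree rejects str input that carries an
    # encoding declaration like the one in our header
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != "FrxProfileFolderRedirection":
        return False
    excludes = [e.text or "" for e in root.iter("Exclude")]
    return any("Service Worker" in x for x in excludes)
```

A malformed redirections.xml tends to fail silently (the exclusion just doesn't happen), so thirty seconds of validation here can save a confusing troubleshooting session later.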
Now, configure the FSLogix GPO from Computer Config | Admin Templates | FSLogix | Profile Containers | Advanced called Provide RedirXML file to customize redirections
Set this to Enabled and put the value as the folder path (not the full path) to your XML file you saved earlier. The file must be present in this folder and called redirections.xml.
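For completeness, that policy simply writes the RedirXMLSourceFolder value under HKLM\SOFTWARE\FSLogix\Profiles, so on a test rig without the GPO you could set it directly with a .reg fragment like this (the UNC path is a placeholder for your own share):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\FSLogix\Profiles]
"RedirXMLSourceFolder"="\\\\fileserver\\fslogix"
```

(The doubled backslashes are .reg file escaping – the actual value is \\fileserver\fslogix.)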
Now update the policy so it is applied to all of the target machines. Next time they log on, the folder specified in the XML file will not be directed into the Profile Container – it will be put into a local folder (C:\Users\local_%username%) and discarded when the user logs off. So all of those reads and writes will not be committed to the container – they will be directed locally.
So when we log our user on, run Teams for the first time, sign in – this time, when we have finished, we see a much healthier-looking profile size:-
Far, far better than the 5GB+ we saw earlier!
In summary, this is a workaround to cover for the bizarre way Teams behaves. In certain circumstances, you may want to avoid this initial expansion of the profile in such a dramatic way, and excluding the CacheStorage folder is the easiest and most effective way to do it.
Bear in mind that you are exchanging that storage hit for a big burst of local I/O as a user runs Teams for the first time, so assess the impact carefully. Unless you are logging hundreds or thousands of users onto Teams for the first time simultaneously, though, the potential for impact in this way should be limited.
Also, as always, be careful of using the redirections.xml file to aggressively trim your Profile Containers storage. FSLogix isn’t comparable to old file-based profile management solutions – leaner profiles do NOT give quicker logon times or make any performance difference. Excluding caches just means you pay the price in performance as everything re-caches at a later time. This particular example is an exception to the rule, because a) the extra storage space can be potentially HUGE, and b) it only happens at first Teams launch anyway, not every session.
This trick wouldn’t make any difference to PVS write cache impact – but again, bear in mind that this only happens at first launch of Teams, not every time, and write caches are cleared fairly frequently. Also, if you are using Citrix UPM or another file-based profile management solution, it wouldn’t hurt to exclude the CacheStorage folder from it.
Thanks to Jim Moyle and Kasper Johansen for their help on this. Also, a quick plug – look out for a session on Teams Deployment On Citrix from myself and fellow CTP Rene Bigler at the Phoenix and New Mexico CUGC event on October 7, time to be confirmed – follow us on Twitter to get the registration details.