A very dark topic for many people is CRL caching. It seems unimportant, too technical, not well documented and very difficult. Still, I think it’s important enough to embrace it and I hope you’ll see it’s a little bit easier than you probably think it is. In this article the focus is on Windows XP/Windows Server 2003 and upwards.
You can only understand the workings of CRL caching if you understand basic terms like a certificate, a PKI and a CRL. This article isn’t meant to explain everything about these terms (otherwise I’m actually writing a whole book), but I’m going to give a small introduction to CRLs nevertheless: many people have at least a decent idea about what a certificate or PKI is, but they start sweating from the point they hear terms like “CRL”. That’s why I’ll skip the theory about certificates and PKIs, but on the other hand introduce you to CRLs anyway. So if you don’t know what a certificate or PKI is, this article is above your head. If you do, don’t hesitate to continue, even if the abbreviation CRL doesn’t tell you a thing. After the CRL introduction, which should not be considered a full presentation of the subject, we’ll tackle the CRL caching itself. Be aware I’m only dealing with the X.509 certificate standard here, which is the normal and most common certificate standard abroad.
What the hell are CRLs?
If a certificate is presented to an application or OS for authentication or some other validation purpose, the application or OS should try to know if the certificate is valid (note it’s very, very common for applications to delegate validation to the underlying OS). There are a few checks that should be performed, like checking if the certificate isn’t expired (a certificate has a limited lifetime, indicated by a start and end time which are indicated in the certificate’s “Valid from” and “Valid to” fields, respectively). Other necessary checks include the verification of the certificate’s revocation status. For example, suppose you request and receive a certificate with its private key, but after a few weeks you think your private key has been obtained by an unauthorized person. You warn the issuer of your certificate of this suspicion and your certificate will (at least: should) be revoked (that’s how it’s called) and hopefully you will get a new certificate and private key instead 🙂 By revoking the certificate the issuer marks the certificate invalid, meaning it shouldn’t be used anymore for successful validation. Of course there are many other reasons why a certificate needs to be revoked, but let’s keep it with one example right now. Anyway, how can an application or OS know if a certain presented certificate has been revoked? Well, the issuer maintains a list of certificates it has issued, but revoked later. Such a list is called a Certificate Revocation List (CRL), which is actually just a file with the .crl extension.
The CRL Distribution Points (CDP) field in a certificate tells us where the CRL for that certificate can be downloaded and may include multiple addresses, typically HTTP(S) addresses, but LDAP is often supported too. If the certificate is mentioned in the CRL, the certificate is revoked and thus invalid; authentication with such a certificate should definitely fail! If the certificate is not present in the CRL, it’s not revoked, but this still doesn’t mean it’s valid of course (for example, it could be a certificate that’s not revoked, but is expired). You can configure some applications and operating systems to skip the revocation checking, but this is not a safe way of working of course, as it means revoked certificates could still be used for authentication or other validation purposes, meaning the unauthorized person from my example could still use your certificate to authenticate in your name, even after you have warned the issuer and your certificate has been revoked! So configuring applications and operating systems not to check the revocation status of certificates is not a good idea. This is especially true when the server side checks client certificates and when the client side checks server certificates: this way a revoked certificate won’t be accepted. Revocation checking of client certificates client side and server certificates server side isn’t really necessary then, but it’s an extra layer of security which is also a little bit sooner (a client will check its own client certificate before the request will occur, so before it will check the server certificate). Also, a server using a revoked server certificate or a client using a revoked client certificate is a waste of resources and not really “polite” (in a way you could compare it with handing over an invalid cheque to someone, even though the bank will never accept it). 
This politeness is especially “important” for servers, because they offer services (it’s more “acceptable” for a person to show an invalid ID than for a bank to prove its identity incorrectly). Also, typically a server is consumed by more users than one user consumes servers. Because it’s less necessary for clients to revocation check their own client certificates, quite a lot of clients have default settings skipping the revocation check for their client certificates. IMHO even at this level revocation checking should be enabled.
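To make the point that “not revoked” is not the same as “valid” a bit more tangible, here is a minimal Python sketch of the two checks discussed above (validity period and revocation status). It’s a toy model only: the `Certificate` class, the serial numbers and the dates are all made up for illustration, and a real validation (as done by the OS) involves far more than this.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Certificate:
    # Hypothetical, heavily simplified stand-in for an X.509 certificate.
    serial: int
    valid_from: datetime
    valid_to: datetime


def check_certificate(cert: Certificate, revoked_serials: set, now: datetime) -> str:
    """Return 'not yet valid', 'expired', 'revoked' or 'ok' (toy model)."""
    if now < cert.valid_from:
        return "not yet valid"
    if now > cert.valid_to:
        return "expired"
    if cert.serial in revoked_serials:
        return "revoked"
    return "ok"


# Serial numbers listed in an imaginary CRL.
crl = {1001, 1002}
cert = Certificate(2001, datetime(2010, 1, 1), datetime(2011, 1, 1))

# Not in the CRL and inside its validity period.
print(check_certificate(cert, crl, datetime(2010, 6, 1)))
# Still not in the CRL, but expired, so not valid.
print(check_certificate(cert, crl, datetime(2012, 6, 1)))
```

The second call shows the case from the text: the certificate isn’t mentioned in the CRL, yet it still must be rejected.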
Typically CRLs are downloaded and used automatically. In theory you could always download CRLs manually and import them manually (except when some application doing the validation itself doesn’t support such an import), but this is not recommended, because this usually implies a lot of time and work as you typically need to do this for different PKIs (and as we’re seeing shortly CRLs get new versions (very) quickly, so you would need to repeat this process very, very often). You could also easily forget to do this task, making this very error prone. The manual way is only interesting for testing and in very specific scenarios. Note that proxy settings must be configured correctly for a user to be able to successfully download CRLs; this also counts when CRLs are downloaded automatically of course: the user account used for the component downloading the CRL must have correct proxy settings. Be aware of the fact that many server side components use LOCAL SYSTEM, meaning proxy settings may also need to be configured correctly at machine level, especially on servers.
Obviously most CRLs grow in size over time, as typically more and more certificates get revoked and must be mentioned in the CRL. Normally CRLs are renewed and republished periodically, for example every 3 hours. In this example every 3 hours the address in the CDP field of a related certificate refers to an updated CRL. But what happens if an application or OS uses a certain CRL and an updated one becomes available in the meantime? Will it use the first retrieved CRL version forever or till the system is restarted, not noticing all the further possible updates to this file? Of course not! A CRL has, just like a certificate, a certain lifetime, for example 1 week. So if an application or OS uses a CRL, it will typically use this CRL till the end of its lifetime (so maximum 1 week in my example). When the CRL is expired, a well programmed and configured application or OS won’t use it again and will try to get a new one at some point. Note that I said the word “maximum” concerning my example. Well, consider the following. A CRL is published on day 1 (3 PM) and has a lifetime of 7 days. On day 3 (2 PM) the application uses the CRL for the first time. As the CRL expires on day 8 (3 PM), this means the application can use the CRL for 5 days and 1 hour after its first usage, not 7 days! If the application had started using the CRL just after it was published, then it could have used the CRL for exactly 7 days. I just want to stress the fact that the lifetime has a fixed end time and is not a duration value starting from the point an application or OS uses it the first time.
You could ask yourself when exactly an application or OS would retrieve a CRL. It can’t automatically retrieve all the possible CRLs in the world of course. First of all this is a huge job as there are many, many CRLs around. This means many, many downloads, having an impact on performance, bandwidth usage, etc. Secondly, such a list doesn’t exist, which is quite logical if you know a large percentage of the PKIs (and thus CRLs) are for internal purposes, so those shouldn’t appear in public lists. Normally CRLs are retrieved only at the moment they are needed and no valid version is already in use; on Windows this is certainly the case. This means a CRL is never retrieved if a related certificate is never presented. The first time such a certificate is presented, the CRL will be retrieved though. When the CRL is expired, normally it’s thrown away (like Windows does), but no new version is retrieved automatically. A new version is only retrieved at the moment it’s needed again (which is perhaps never!). The drawback is the certificate validation process could take longer, because of a possible CRL retrieval. But on the other hand no needless retrievals will occur, improving overall performance, bandwidth usage, etc.
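The arithmetic of this example can be written down in a few lines of Python. This is just a sketch: the calendar dates are invented (only “day 1, 3 PM”, “day 3, 2 PM” and the 7 day lifetime come from the example) and `remaining_lifetime` is a helper name I made up.

```python
from datetime import datetime, timedelta


def remaining_lifetime(expires: datetime, now: datetime) -> timedelta:
    """How long a CRL can still be used at time 'now' (never negative)."""
    return max(expires - now, timedelta(0))


published = datetime(2024, 1, 1, 15, 0)      # day 1, 3 PM (made-up date)
expires = published + timedelta(days=7)      # day 8, 3 PM: a fixed end time
first_use = datetime(2024, 1, 3, 14, 0)      # day 3, 2 PM

print(remaining_lifetime(expires, first_use))   # 5 days, 1:00:00
print(remaining_lifetime(expires, published))   # 7 days, 0:00:00
```

The end time is computed from the publication moment, not from the first usage, which is exactly why the first result is shorter than 7 days.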
Hmmm… but what happens if an application or operating system retrieves a just published CRL with a lifetime of 1 week, 1 hour later a certificate is revoked, 2 hours after this (so 3 hours after the previous publication) a new CRL is published containing the newly revoked certificate and in the soonest case the application only downloads a new CRL 1 week minus 1 hour later? Well, nothing actually… Even with newer CRLs, and thus possibly extra revoked certificates, the application will keep using the “old” CRL till it’s expired and only after this point the system will attempt to get an updated one the next time it’s needed again. This means that the revoked certificate from our example can still be seen as valid by the application or operating system for 1 week minus 1 hour after its revocation! This is not a great story security wise. There are a few workarounds and solutions to this problem:
• Delta CRLs: a delta CRL is based on a “normal” CRL, the kind we were already talking about (which are called base CRLs by the way). Delta CRLs contain differences/updates related to the base CRL they build on and therefore they are smaller, making them ideal to download more often than base CRLs. When a base CRL expires no new delta CRLs are published for this base CRL. Not every PKI distributes delta CRLs though and even if they exist, it’s still required for certificates to contain a field value indicating delta CRLs exist and where they can be downloaded (otherwise an application or OS doesn’t have a clue of their existence, let alone their download location), except when you would download and import delta CRLs manually, but again, this is a task you really don’t want to do! Delta CRLs also have a limited lifetime (for example 3 hours), so an application or OS knows when it should retrieve a new one; the mechanism works exactly the same as with base CRLs. If delta CRLs can be used, this still implies there can be a small delay between the revocation of a certificate and the moment an application or OS becomes aware of this, because there is still a certain amount of time between 2 publications.
• OCSP: this mechanism lets an application or OS contact an OCSP server (OCSP responder) to check “live” for a certificate’s revocation status, which is always the most up to date status. With OCSP you don’t really need CRLs anymore: the application or OS just contacts the OCSP responder of the PKI of the certificate and gets the revocation status of a certificate. It’s obvious that a system can’t get updated sooner concerning revocation statuses than with OCSP. For a system to use OCSP the address of the OCSP responder should be mentioned in the certificate. Also, an OCSP response is smaller than a CRL, making this more downloadable, even at a very frequent rate. Not every PKI contains such a responder and even if it does, the client still needs to support this. Windows Server 2003 for example doesn’t support OCSP (client nor server side), except when you install 3rd party software to take care of this. On the other hand starting from Vista Windows is able to act as an OCSP client (and even server, i.e. for your own PKI, although you need a server Windows for this, i.e. Windows Server 2008). You must be aware though some systems, like Windows Vista and upwards, cache OCSP responses (oh oh :-)) and in some cases a combination of OCSP and CRLs is used for performance reasons, which means we still don’t have a perfect “zero delay” security solution on Vista and later, but I’m not going to clarify this topic in this article. By the way, OCSP response caching cannot be disabled, which IMHO is a pity…:-(
• Forcing a system to retrieve newer CRLs sooner, so before their end times. This is not always possible though, for example with Windows XP and Windows Server 2003 (well, strictly speaking it depends, but I’ll explain in a minute). But even when it is possible, it still means you can’t get a “live” zero delay revocation status, as if you were using OCSP (well, without caching, that is…). First of all you can’t force an update constantly and secondly a base or delta CRL isn’t published with every change (OCSP responders can update their information at every possible moment).
I guess the best way to minimize or solve the “CRL lifetime problem” is through OCSP, but if this isn’t an option, delta CRLs are the best solution in “OCSP-less land”: it’s a standard solution, it has been proven to be working, you get “only” small delays, it’s very acceptable in terms of resource cost, performance and time and you don’t need to do a thing (except when you exploit your own PKI). If delta CRLs are not an option too, the only thing you can do is to force the system to use more recent CRLs, within the boundaries of what’s possible (see further for details).
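To put some numbers on the difference, here is a small Python sketch of the “exposure window”: the time a revoked certificate is still accepted by a client that keeps using its cached revocation data. The figures are the ones from my examples (a base CRL with a 1 week lifetime, a delta CRL with a 3 hour lifetime, a revocation 1 hour after the client fetched its data). The function name is made up, and I assume the client fetched the data right after publication and gets the updated version as soon as the cached one expires.

```python
from datetime import timedelta


def exposure_window(revoked_after: timedelta, cached_until: timedelta) -> timedelta:
    """Time a revoked certificate is still accepted.

    Both arguments are measured from the moment the client fetched its
    revocation data. Assumes the client keeps the data until it expires.
    """
    return max(cached_until - revoked_after, timedelta(0))


revocation = timedelta(hours=1)

# Base CRL only: cached for its full 1 week lifetime.
print(exposure_window(revocation, timedelta(days=7)))    # 6 days, 23:00:00

# With delta CRLs that have a 3 hour lifetime.
print(exposure_window(revocation, timedelta(hours=3)))   # 2:00:00
```

So in this scenario delta CRLs shrink the window from 1 week minus 1 hour to 2 hours, which is what I mean by “only” small delays.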
You could ask yourself why a CRL isn’t downloaded every time when needed, so at least the latest CRL version is used for revocation checking (thus creating a more secure environment). Well, suppose you have published a service that should check thousands of client certificates a minute, from many different PKIs… Would you like your system to check their status with for example caching-less OCSP? Well, perhaps… OCSP concentrates on the status of only 1 certificate, so perhaps your system can deal with thousands of OCSP requests/responses in 1 minute. But if your service needs to download thousands of CRLs in 1 minute, then we have a different story! Typically base CRLs are bigger and sometimes even “quite large” (delta CRLs are obviously smaller though, but still…). Downloading thousands of these files in such a short time is a hell of a job, especially when you want your application service to perform well. Heck, sometimes your application won’t even be able to answer requests before a time-out occurs! And for what? For downloading the same CRL over and over again, most of the time without even a minor change (don’t forget that CRLs are not published constantly, but normally only periodically, so most of the time there is not even a change at all!).
Okay, I must admit this story seems quite black-and-white. If the application knew how frequently a CRL is renewed, it could adapt itself, so it keeps using the same CRL till the moment a new CRL is published. If this period is 3 hours for example, this means your application could retrieve a new version every 3 hours (supposing it “constantly” uses a certificate related to this CRL, because if it doesn’t deal with such a certificate, no CRL is retrieved). The problem is that this period isn’t known by the application or OS using a certificate, as this information isn’t included in a certificate. Does this mean the X.509 certificate standard has a bad design? Well, yes and no. No, because the vision is that such a period could be changed. For example, you can issue certificates and think you need to publish a new CRL every 3 hours. But after a few months you see (for whatever reason) you should lower this to 1 hour. The certificates released in the first few months will never reflect this change! This is not a disaster of course, which is why a recommended period still could be part of a certificate’s field IMHO. On the other hand it’s not only the fault of the standard’s design. Even without information from a certificate an application or OS should be able to decide on its own, perhaps based on an administrator’s wishes, to download a new CRL sooner; but as you probably know already, this isn’t the case for Windows. Also, the standard actually provides something similar: delta CRLs, as I’ve told already. And those are even better, because they are smaller. But… not every PKI provides delta CRLs and I still think a recommended value for base CRLs could have its benefits.
The bottom line concerning base CRLs is that it certainly makes sense to keep using them for a while (and not downloading them every time over and over again), which is a balance between security and limiting resource usage (bandwidth for example) and performance. In many environments this means they are used till they are expired. This could be improved by the use of delta CRLs, which are also used till the end of their (typically shorter) lifetime. The whole thing could be improved by using OCSP, on Windows side by side with CRLs. But as I’ve told already, Windows also keeps OCSP responses for a while.
The keeping of CRLs and OCSP responses for a while is called caching: CRL caching and OCSP caching. Without OCSP and delta CRLs many scenarios exist though where this caching is not secure enough. Let’s take a look at this caching and how we can influence it.
What’s up with CRL caching in Windows?
To be honest, there are 2 caches. If you’re new to CRLs and OCSP, I’m sorry for that 😉 You probably thought the hard stuff was dealt with in the previous section, well, it isn’t. No, seriously, if a process needs a base CRL, delta CRL or OCSP response it takes a look in its own per process memory cache to see if it can find it. If so, well, that’s good news, isn’t it? If not, it has to get it from somewhere else and put it into its memory cache. This “somewhere else” is the per user disk cache.
Every user, including the system, possesses such a disk cache. So every process has its own memory cache and all processes running under the same user account share the same disk cache (or otherwise stated: every user has its own disk cache). Anyway, if the CRL or OCSP response is found in this place, it’s copied to the process’ memory cache (but not deleted from the disk cache) and used of course.
The interesting part begins when it’s not found here either… The only thing left to do is to retrieve it from a non cache location. You are probably thinking of the web server or LDAP server that publishes the CRL (or the OCSP responder for OCSP), but there is another step first in the case of CRLs: per user certificate stores. Although it’s possible for an administrator or user to act on the caches to a certain (quite limited) level, as we’ll see shortly, those caches are actually more seen as locations “not seriously managed by the administrator in normal situations”. Certificate stores, on the other hand, are meant for this purpose: they are intended to be managed and used by an administrator or user (including the system itself, which can be considered a user too). Certificate stores obviously can contain certificates, as the name implies, but also CRLs and other not-yet-mentioned certificate related stuff which I’m not going to talk about here. One way to manage certificate stores is through the Microsoft Management Console (MMC) snap-in Certificates. Be aware of the fact that your user certificate store isn’t the one used by the “system user” (LOCAL SYSTEM, representing the machine). If you want to peep into the machine’s store, tell the Certificates snap-in so. Attention: I said certificate stores are per user and that’s correct as long as we’re not talking about Windows services, as those have their own certificate stores too. If you’re running an application like IE as USER, USER’s certificate store can be used by this IE instance. But if you’re running a Windows service SERVICE as USER, the certificate store of SERVICE can be used, not the one from USER. For the rest of the article let’s assume we’re talking about non Windows services, just to make the text setup a little bit simpler.
Anyway, if a CRL isn’t found in the right disk cache, the corresponding certificate store is checked (that is, the certificate store of the same user as the disk cache’s one). If the required CRL can be found here, it’s copied to the user’s disk cache and process’ memory cache. If even this location doesn’t contain the required CRL, the last option is downloading the CRL again, based on the content of the CDP field (I’m not going to discuss which addresses are used in which way if multiple addresses are present in this field, that’s food for another story). Remember the certificate store step is skipped for OCSP responses: if an OCSP response can’t be found in the disk cache, an OCSP request is sent to the OCSP responder mentioned in the certificate.
If a CRL or OCSP response is downloaded through HTTP(S), this occurs through WinHttp and like every file downloaded via HTTP(S) it’s put in the user’s IE cache. But in this case it’s also put into the user’s disk cache and of course also the process’ memory cache, but not into the certificate store. The store isn’t used for automatic management or caching, so it’s only filled by explicit approval of the user or administrator. Note that expired CRLs and OCSP responses are removed from memory caches as well as disk caches, but not from certificate stores.
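The lookup order for CRLs described above (memory cache, then disk cache, then certificate store, then download) can be modelled in a few lines of Python. Be aware this is only a toy model to illustrate the order of the steps; it is in no way how CryptoAPI is implemented. The class, the `download` callback and the URL are invented for the illustration, and I’m assuming here that an expired CRL sitting in a certificate store is simply skipped.

```python
from datetime import datetime


class CrlLookupModel:
    """Toy model of the CRL lookup order; not the real Windows implementation."""

    def __init__(self, download):
        self.memory_cache = {}      # per process
        self.disk_cache = {}        # per user
        self.cert_store = {}        # per user, only filled explicitly
        self._download = download   # callable(url) -> (crl_data, expires)

    @staticmethod
    def _fresh(cache, url, now):
        entry = cache.get(url)
        if entry is not None and entry[1] <= now:
            del cache[url]          # expired entries are removed from the caches
            entry = None
        return entry

    def get_crl(self, url, now):
        entry = self._fresh(self.memory_cache, url, now)
        if entry is not None:
            return entry[0], "memory cache"
        entry = self._fresh(self.disk_cache, url, now)
        if entry is not None:
            self.memory_cache[url] = entry      # copied, not moved
            return entry[0], "disk cache"
        entry = self.cert_store.get(url)        # never cleaned automatically
        if entry is not None and entry[1] > now:
            self.disk_cache[url] = entry
            self.memory_cache[url] = entry
            return entry[0], "certificate store"
        entry = self._download(url)
        self.disk_cache[url] = entry
        self.memory_cache[url] = entry          # but not the certificate store
        return entry[0], "download"


def fake_download(url):
    # Stand-in for the HTTP/LDAP retrieval: returns (data, expiry time).
    return ("crl-v1", datetime(2024, 1, 8, 15, 0))


model = CrlLookupModel(fake_download)
url = "http://pki.example/ca.crl"
now = datetime(2024, 1, 3, 14, 0)

print(model.get_crl(url, now)[1])   # download
print(model.get_crl(url, now)[1])   # memory cache
model.memory_cache.clear()          # simulates a process restart
print(model.get_crl(url, now)[1])   # disk cache
```

The last three lines show why a process restart alone doesn’t give you a fresh CRL: the new process just picks up the copy from the user’s disk cache.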
It’s interesting to know the caches are also used for certificates. A user’s certificate store contains the user’s personal certificates, normally with their private key. In a client side scenario those certificates are typically client certificates, in a server side scenario they are typically server certificates. But I also see other certificates when I sniff around cert stores… CA certificates? What do they do there? Well, I’m not going to talk about this in detail in this article, but the thing is a so-called certificate path should be built up, starting from the presented certificate till a root CA. Every certificate in this certificate chain should be valid and thus for every certificate the same checks should be performed, like the expiration check and revocation status check. The top of this chain, the root CA’s certificate, should be trusted, meaning it should be present in the trusted root CA list of the user. This list is visible in your certificate store. Your store is personal, but some pieces are shared over every store on the system, including a part of the trusted root CA list: a user’s trusted root CA list is actually a superset of the system’s trusted root CA list and the user’s personally trusted root CAs.
It’s obvious a whole chain can only be built up if every certificate in the chain is available (a certificate refers to its issuing CA, except when it’s a root CA’s certificate of course). Certificates between a non CA certificate and a root CA certificate are called intermediate, i.e. a certificate from an intermediate CA, a non root CA. How an intermediate CA’s certificate can be obtained, is out of this article’s scope, but one way is through the certificate store. A piece of this store thus contains intermediate CA certificates and this piece is also partly shared over the different certificates stores at a system: a user’s intermediate CA list is the superset of the system’s intermediate CA list and the user’s personally added intermediate CAs. As root CA and possibly intermediate CA certificates are clearly needed for validation of a presented certificate, it isn’t surprising at all to find a place for them in certificate stores, and yes, also in the caches for CRLs and OCSP! Yes, yes, those caches also contain certificates. Just remember that it’s not always the case that an application or OS downloads certificates from intermediate CAs automatically (again, this topic is out of the scope of this article), meaning in some scenarios an administrator or user should add such intermediate CA certificates explicitly to a certificate store. Also remember that root CAs should be trusted too and it’s obvious this doesn’t happen automatically! It is a fact though that Windows does contain a default list of intermediate and root CA certificates and a list of which root CAs are trusted, which can be updated through Windows updates for example. 
By using Windows you actually accept those lists, but you can always customize them later and oh yes, this can improve performance and security (hey, I’ll tell you more, sometimes you must customize the trusted root CA list to avoid a bug, but please read my article about the installation rollback of the Symantec Endpoint Protection (SEP) client to find a piece of information about this, although the bug I’m talking about has nothing to do with SEP itself).
The whole algorithm used by Windows concerning certificate validation, including the way (CRL) caching works, is performed and implemented by an API called CryptoAPI (CAPI). This means that normally applications delegate all this work to the underlying OS, i.e. Windows. Be aware that if you download CRL files manually through a browser like IE, a script,… this has nothing to do with the “managed” CAPI component of Windows. This has an important consequence: the CRL is placed in the browser cache, but NOT in the disk cache, as in such a scenario it’s just considered as any other, normal file. When CAPI downloads it, it also places the CRL file in the right disk cache, which is obviously an extra step. Does this mean you can’t download the CRL manually and put it in the disk cache on demand? Well, you can, but then you need to do it with an application that forces CAPI to do this. The easiest way to achieve this is by using the certutil command line tool (which can also be used for other certificate related purposes, like sniffing around a certificate store) if used on Vista and upwards. Don’t forget though that this only works for CRLs that don’t yet exist in the disk cache! If it already exists, it’s NOT updated (but the CRL file is downloaded to the IE cache, although this doesn’t influence which or how CRLs are effectively used).
It’s also important to be aware of the fact that there are no means (not even programmatically!) to update existing CRLs in the memory cache on demand on Windows XP or Windows Server 2003. The only thing you can do here is restart the process to flush the memory cache, but technically speaking we’re talking about a different process then (and thus a different memory cache); so flushing the same cache (partly or wholly) isn’t technically possible. At a higher level though we could accept restarting the process as a workaround, but don’t forget that in many cases applications (especially server side applications) are not allowed to restart often… What would you say if you couldn’t use your PC banking website once every hour because the server applications of your bank had to restart? (OK, you could find a not-that-simple solution for this kind of situation, using load balanced servers with a central state server (bla bla, bla bla…), but I bet you know what I mean, right?) Starting with Windows Vista though, a process’ memory cache can be updated, but only by cleaning it completely, consequently resulting in new CRLs being retrieved. This cleaning doesn’t occur at a per CRL level, so you have to clear the memory cache as a whole. Actually this cleaning works on every (!) process on the system, so the need to flush 1 CRL in 1 process’ cache results in cleaning every CRL in every process’ cache on that system. This is far from ideal, but it’s at least better than on pre-Vista systems.
On the other hand you can’t update an existing CRL in the disk cache, except by first deleting it (which can occur on a per cache and even per CRL basis). Note that this doesn’t influence CRLs already existing in memory caches though! So what should you do if you want to force a certain application to use fresh, newly downloaded CRLs? Yup, you flush the disk cache of the application and then restart the application’s process (or instead of the latest step you can clean the memory cache starting with Windows Vista): both the disk and memory cache will become empty, resulting in new, fresh CRLs being downloaded and placed in disk and memory cache when needed. For example, if you have a web application published on the Internet and the users use certificates with CRLs with a 1 week lifetime, you could (perhaps, if allowed by management and policies) clean the disk cache at night and then restart the process hosting the web application or clear its memory cache (starting with Windows Vista). This way your web application will use CRLs of maximum 1 day old instead of 1 week. This isn’t perfect yet (especially when new CRL versions are published every hour for instance), but it’s already much better. Note that you should first clean the disk cache and then the memory cache. If you do it the other way around it’s always possible a (not yet refreshed) CRL is retrieved from the disk cache after the memory cache cleaning, but just before the disk cache cleaning.
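Why the order of the two cleaning steps matters is easiest to see in a tiny simulation. Again, this is a made-up Python model with 1 CRL, 1 memory cache and 1 disk cache (the function name and the “old”/“new” labels are mine), not something that talks to Windows.

```python
def simulate_refresh(disk_first: bool, validation_in_between: bool) -> str:
    """Return which CRL version ('old' or 'new') is used after both flushes."""
    memory = {"crl": "old"}
    disk = {"crl": "old"}

    def lookup() -> str:
        # Same order as before: memory cache, then disk cache, then download.
        if "crl" not in memory:
            if "crl" not in disk:
                disk["crl"] = "new"     # a fresh download
            memory["crl"] = disk["crl"]
        return memory["crl"]

    first, second = (disk, memory) if disk_first else (memory, disk)
    first.clear()
    if validation_in_between:
        lookup()    # a certificate gets validated between the two flushes
    second.clear()
    return lookup()


print(simulate_refresh(disk_first=True, validation_in_between=True))    # new
print(simulate_refresh(disk_first=False, validation_in_between=True))   # old
```

In the second call the memory cache is refilled with the old CRL from the not yet cleaned disk cache, so the process keeps using the old version even though both caches were “flushed”.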
Tuning the CRL caches is not only important for “real” security, i.e. to deny users with revoked certificates as soon as possible. Sometimes it’s possible that CRLs contain suspended certificates: already existing certificates which may/will be handed over later (to a user for example), but as this is not the case yet, they shouldn’t be used. They are not really revoked (because they should have been active before they could be revoked), but they are so-called suspended. It’s obvious these kind of certificates can be a member of CRLs too; actually we should see a CRL as a list of any certificate that shouldn’t be used, whether it’s a suspended or a revoked one. Some governments use the technique of suspended certificates. For example, in Belgium every citizen gets an electronic ID (eID) smart card (SC), containing a few certificates (don’t worry, I’m not going to discuss the details of eID here :-)). Those certificates already exist before they are handed over to the citizen (although they are still abstract at that time as the citizen’s name isn’t filled in yet), but before this hand-over it’s clear they shouldn’t be used yet (by a malicious person having access to these not yet handed over certificates for example), even if it’s quite unrealistic a not yet fully created certificate would be achieved by a malicious user. These suspended certificates are also mentioned in the CRLs. So far so good, but what happens if someone gets his eID? His/her certificate gets “unsuspended” and will be removed from the CRL. Web applications requiring authentication with the eID could take 1 week before they see this change (the eID CRLs have a 1 week lifetime) if they were already using an “older” CRL. For citizens it’s difficult to understand why “it is possible they can’t use their eID on some government sites for maximum 1 week”. But it’s far easier and more understandable that it could take 24 hours, isn’t it (although this isn’t perfect yet)?
It’s always possible of course that an application uses its own mechanism related to validation and thus CRL/OCSP caching, although this is very rare as this type of stuff is a very typical example of something “that happens at a lower level”, that can be reused and that can be built on, just like most applications also reuse, for example, the HTTP implementation of the OS and don’t implement their own HTTP package.
One last thing: why the hell do we need 2 caches and why are there memory and disk caches? Well, the cache used by a process should be very quickly accessible by that process (as the process could need CRLs quite often), which implies it resides in memory (which is of course faster to access than a disk cache) and is a per process thing, which is also useful to shield this cache from other processes (thus for security reasons). Because it would be a stupid thing to let multiple processes download the same CRL, there is a per user cache too: a certain CRL version needs to be downloaded only once per user. At the same time this implies a user’s cache is shielded from other users. As such a cache isn’t accessed constantly and it’s a good idea to make this cache persistent (surviving reboots), this cache is a disk cache. There is no central cross-user memory and/or disk cache (for example a cross-user memory cache backed by a cross-user disk cache), so if 5 users need a certain CRL version, it needs to be downloaded 5 times. Perhaps this could be improved in a future Windows release? 😉 Another enhancement could be that the user cache would be a combo of a memory cache (for faster access), backed by a disk cache (to make it persistent). This idea could possibly be extended to the process cache: the current memory cache, backed by a per process (persistent) disk cache. Probably such a sequence of many cache levels would slow everything down, so that’s why it has been stripped down to only 2 levels: a fast and very private one (per process memory cache) and a less fast and less private, but persistent one (per user disk cache). I have no idea which combination of caches would offer the best balance between all the aspects (security, performance, resource cost,…) in different scenarios; on the other hand giving an administrator the freedom to fine-tune this would be great (I know, I’m “too” critical now :-)).
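To make the two levels a bit more tangible, here is a toy model of the lookup order. It is not how CAPI is implemented: the class names, the dict standing in for the disk cache and the fake download function are all assumptions for illustration, and expiry is left out completely.

```python
class UserDiskCache:
    """Stand-in for one user's persistent disk cache (a dict instead of files)."""

    def __init__(self):
        self.entries = {}


class Process:
    """A process with its own private memory cache, backed by its user's disk cache."""

    def __init__(self, user_cache, download):
        self.memory = {}          # per process, fastest, invisible to other processes
        self.disk = user_cache    # shared by all processes of the same user
        self.download = download  # called only when both caches miss

    def get_crl(self, url):
        if url in self.memory:
            return self.memory[url]
        if url not in self.disk.entries:
            self.disk.entries[url] = self.download(url)
        self.memory[url] = self.disk.entries[url]
        return self.memory[url]


downloads = []

def fake_download(url):
    # Records every "network" fetch so we can count them afterwards.
    downloads.append(url)
    return b"CRL bytes for " + url.encode()

URL = "http://crl.example.test/ca.crl"  # hypothetical CRL location

alice, bob = UserDiskCache(), UserDiskCache()
processes = [Process(alice, fake_download),   # two processes of the same user
             Process(alice, fake_download),
             Process(bob, fake_download)]     # one process of another user

for proc in processes:
    proc.get_crl(URL)
    proc.get_crl(URL)  # second lookup is served from the memory cache

print(len(downloads))  # 2: once for alice, once for bob
```

The second process of the first user never downloads anything, while the other user has to fetch the same CRL again, which is exactly the missing cross-user cache mentioned above.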
The last theory
For the techie lovers amongst us I have some nitty gritty details left. First of all you probably would like to know where you can find those disk caches, wouldn’t you? Well, go ahead: copy the paths and paste them in your Windows Explorer:
• For every user except the system: %APPDATA%\Microsoft\CryptnetUrlCache
• For the system: %WINDIR%\System32\config\SystemProfile\Application Data\Microsoft\CryptnetUrlCache
It won’t surprise you that you won’t find this folder in the default user or all users profile, as this wouldn’t make much sense for caching purposes.
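If you want to build these two paths in a script instead of pasting them, a small helper could look like this. It only joins the folder names listed above; the function name and the sample values are assumptions, and you should verify that the locations match your Windows version.

```python
import ntpath  # Windows path rules, usable on any platform


def cryptnet_cache_dirs(env):
    """Return (per-user cache dir, system cache dir) built from APPDATA and WINDIR."""
    user_dir = ntpath.join(env["APPDATA"], "Microsoft", "CryptnetUrlCache")
    system_dir = ntpath.join(
        env["WINDIR"], "System32", "config", "SystemProfile",
        "Application Data", "Microsoft", "CryptnetUrlCache",
    )
    return user_dir, system_dir


# Sample values for illustration only; on a real machine pass os.environ instead.
sample_env = {
    "APPDATA": r"C:\Documents and Settings\alice\Application Data",
    "WINDIR": r"C:\WINDOWS",
}

for path in cryptnet_cache_dirs(sample_env):
    print(path)
```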
You should be aware of the fact that CAPI itself doesn’t download the CRL file: this task is delegated to WinHttp. A download from an LDAP address is also supported by CAPI.
In the past there was no separate disk cache: CRLs were downloaded by WinInet and put into the IE cache only, where CAPI took the CRL from. Also, please don’t spam me with the question whether CRL caching can be toggled off. First of all, I already explained why this would be a bad idea, although I must admit that in some cases it could be acceptable (for example, for a service which is seldom used). Secondly, it’s just not possible, although in my opinion the administrator should always have the possibility to do so (for those cases where it’s acceptable or when he executes his right to be stubborn and stupid).
If you use certutil you will meet the parameter “urlcache”. The URL cache is normally interpreted as the per user IE cache, but it should actually be extended with the per user disk cache.
Here are a few examples of certutil commands based on the urlcache switch:
Certutil -urlcache
Get a list of the content of the URL cache.
Certutil -v -urlcache
Get a more detailed list of the content of the URL cache. “v” stands for “verbose”.
Certutil -v -urlcache FILE
Get details about the file FILE, which resides in the URL cache. FILE can be a CRL, including a CRL from the disk cache.
Certutil -urlcache CRL
Get a list of the content of the CRL disk cache.
Certutil -urlcache OCSP
Get a list of the content of the OCSP disk cache.
Certutil -urlcache -f FILE
Download the file FILE, which may be a CRL, to the IE cache. If the file was already there, it’s overwritten, potentially with a new version. If the CRL wasn’t yet in the disk cache, the CRL is added to the disk cache too. If the CRL was already in the disk cache, it depends. On pre-Vista systems the disk cache isn’t updated, but starting from Vista it does force an update, so this is a way to force an update of a CRL in a disk cache in Vista and upwards! Remember that this doesn’t influence the process’ memory cache though!
Perhaps you wonder what’s the purpose of this? Well, as an administrator you could use this for testing, troubleshooting or very rare performance tuning (for example, via scripting some CRLs could be downloaded even before they were actually needed, speeding up later CRL searches (as the CRL doesn’t need to be downloaded anymore when a user presents his certificate for instance)).
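The pre-download idea could be scripted roughly as follows. This is a sketch under assumptions: the helper names and the URL are invented, the only command it builds is the certutil -urlcache -f form shown above, and nothing is executed unless you call prefetch yourself on a Windows machine.

```python
import subprocess


def prefetch_commands(crl_urls):
    """Build one 'certutil -urlcache -f URL' argument list per CRL URL."""
    return [["certutil", "-urlcache", "-f", url] for url in crl_urls]


def prefetch(crl_urls, run=subprocess.run):
    """Run the commands and return (url, exit code) pairs.

    'run' can be swapped for a fake when testing away from Windows.
    """
    results = []
    for cmd in prefetch_commands(crl_urls):
        completed = run(cmd, capture_output=True, text=True)
        results.append((cmd[-1], completed.returncode))
    return results


# Hypothetical CRL location, for illustration only.
urls = ["http://crl.example.test/ca1.crl"]

# Dry run: just show what would be executed.
for cmd in prefetch_commands(urls):
    print(" ".join(cmd))
```

Remember that the download ends up in the caches of the user running the script, so run it under the account whose cache you want to warm up.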
Certutil -urlcache FILE delete
Remove file FILE from the IE cache and disk cache if it resided there. FILE can of course be a CRL file (.crl).
Certutil -urlcache CRL delete
Remove every CRL from the IE cache and disk cache.
Certutil -urlcache OCSP delete
Remove every cached OCSP response from the disk cache.
Certutil -urlcache * delete
Remove every item from the IE cache and disk cache.
Note that certutil acts on caches from the user under whose user context it is running. It doesn’t affect caches from other users. If you need to act on the system’s caches, you need to run certutil under the system context.
If you need to flush every process’ memory cache on a system (containing CRLs and OCSP responses) you can do so starting from Windows Vista, again with certutil:
certutil -setreg chain\ChainCacheResyncFiletime @now
This flushes every process cache as a whole.
certutil -getreg chain\ChainCacheResyncFiletime
This provides you with the last time the process caches were flushed.
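If you ever read this value as a raw number instead of through certutil, it helps to know how a Windows FILETIME maps to a normal date. The sketch below assumes the value really is a FILETIME (the name suggests so), stored as 8 bytes in little-endian order; both function names are made up.

```python
from datetime import datetime, timedelta, timezone

# A FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME integer to an aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def filetime_from_bytes(raw):
    """Interpret 8 raw bytes (assumed little-endian) as a FILETIME integer."""
    return int.from_bytes(raw, "little")


# 116444736000000000 is the FILETIME of the Unix epoch.
print(filetime_to_datetime(116444736000000000))
```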
What if you need to access the disk cache of LOCAL/NETWORK SERVICE?
You can run a command line shell as the LOCAL SYSTEM user, actually representing the machine itself. From this command line shell you can run certutil and manage the disk cache of the LOCAL SYSTEM user (besides other things of course).
If you want to do the same for the NETWORK SERVICE though, things get quite complicated. You may wonder why you would do this, but don’t forget that a lot of stuff is running under this account. One example is the Microsoft Firewall service, used by ISA Server. If ISA Server is used as a reverse proxy server publishing web applications requiring client certificate authentication, there could be a need to flush the cache of the service’s process. As said before this must occur through a cleaning of the disk cache of the user, thus NETWORK SERVICE. We have to run the command line shell as NETWORK SERVICE, so certutil can be run to manage NETWORK SERVICE’s disk cache.
The problem with running cmd.exe as NETWORK SERVICE is the fact that NETWORK SERVICE cannot run interactively. So you can’t really run something as NETWORK SERVICE and interact with it through something visual. What you can do however, is create a Windows service. The following creates such a Windows service with the name RunAsNetworkService, although you can provide it with a different name of course:
sc create RunAsNetworkService binpath= "cmd /K start C:\scripts\RunAsNetworkService\RunAsNetworkService.bat" type= own type= interact
Just run this command as an administrator. The actual command is “sc” and the create parameter creates a Windows service with the name RunAsNetworkService. The binpath switch should tell sc what should be run when the service runs. In our example this is the command line shell (cmd.exe) with a few parameters. One parameter is a batch file which I have called RunAsNetworkService.bat, placed in the folder C:\Scripts\RunAsNetworkService. With 2 type switches you attach the properties “own” and “interact” to the service; “interact” means interaction with the Windows service is possible. Huh? Yup, indeed, Windows services can be interactive, although not everybody is aware of this. After the creation of the service, the service runs as LOCAL SYSTEM by default. Next thing to do is to open the Services console (services.msc) and change the user the service should run under. On the “Log On” tab change the setting from LOCAL SYSTEM to another account, enter “NT AUTHORITY\NETWORK SERVICE” as the user name, leave the password field blank and apply the changes. Voila, we have created a Windows service running as NETWORK SERVICE 🙂 Again, you can adapt the name of the Windows service, the name of the batch file and the location of the batch file to your desires. Also, there is no obligation to let cmd run a batch file, although you will see soon why I work this way. Last but not least, the service can run other executables than cmd of course, but in this example we need to run certutil, so through cmd 🙂
If we start the service it will run the batch file under the NETWORK SERVICE user. It’s very easy to change the batch file to whatever you want and just run the service to get the content of the batch file run under NETWORK SERVICE. That’s why I work this way: you don’t have to change a thing about the service, you can run more than 1 command (a batch file can contain a whole bunch of commands) and it’s very manageable. Just adapt the batch file and start the service et voila, you run as NETWORK SERVICE whatever and whenever you want! In my example I fill the script with the following:
certutil -v -urlcache >C:\logs\RunAsNetworkService\urlcacheBeforeCleaning.txt
certutil -urlcache * delete
certutil -v -urlcache >C:\logs\RunAsNetworkService\urlcacheAfterCleaning.txt
The 2nd command deletes everything in the URL cache; the 1st command logs the content of the URL cache before that happens and the 3rd command logs it again afterwards. This way we can see what we have deleted and we can check the cache is empty after the deletion. This is just an example; you can log more or less or you can only delete CRLs from the disk cache. It all depends on what exactly you want.
Then we can automate a little bit more. Let’s create another script with the following content:
sc start RunAsNetworkService
taskkill /IM cmd.exe /F
net stop fwsrv
net start fwsrv
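This script starts the service and then kills the cmd.exe process of NETWORK SERVICE by force (the /F switch). Between these two steps it should wait a little bit till the service has finished (to be sure all the certutils or other things to run as NETWORK SERVICE have completed); the listing above doesn’t show a wait command, so add one that suits your system. Be aware that taskkill /IM cmd.exe, as written, targets every cmd.exe process it is allowed to end, not only the one of NETWORK SERVICE; taskkill’s /FI filter on the user name can narrow this down. Then the script stops the Microsoft Firewall service and starts it again. “net stop” only returns when the service has been stopped, so the “net start” command will never run too early. With “sc stop” this is different, because it returns immediately, and you would need some waiting time between the stop and start command: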
sc stop fwsrv
sc start fwsrv
It’s obvious this mechanism needs an extra wait command between the two lines (not shown above), is slower and is more error-prone. So use “net stop” instead of “sc stop”.
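If you are stuck with “sc stop” anyway, polling the service state is more reliable than waiting a fixed time. The following is a minimal sketch: the helper names are invented, and the sample text only mimics the STATE line that sc query prints, so treat the exact layout as an assumption.

```python
import re
import time


def parse_sc_state(output):
    """Extract the state name (e.g. 'STOPPED') from 'sc query' style output."""
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", output, re.MULTILINE)
    return match.group(1) if match else None


def wait_until_stopped(query, timeout=60.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll query() until it reports STOPPED; return False on timeout."""
    deadline = clock() + timeout
    while True:
        if parse_sc_state(query()) == "STOPPED":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# On Windows, 'query' could be something like:
#   lambda: subprocess.run(["sc", "query", "fwsrv"],
#                          capture_output=True, text=True).stdout
# Here we replay two sample outputs instead.
SAMPLE_PENDING = "SERVICE_NAME: fwsrv\n        STATE              : 3  STOP_PENDING\n"
SAMPLE_STOPPED = "SERVICE_NAME: fwsrv\n        STATE              : 1  STOPPED\n"

outputs = iter([SAMPLE_PENDING, SAMPLE_STOPPED])
print(wait_until_stopped(lambda: next(outputs), sleep=lambda seconds: None))
```

Only when this returns True is it safe to issue the start command.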
Running this second script cleans the URL cache of the NETWORK SERVICE user and flushes the process cache of the Microsoft Firewall service. Otherwise stated: just start the 2nd script and ISA Server will use more recent CRLs for client certificate checking for your published web applications! Schedule it with a scheduled task and you don’t have to do anything to make this happen automatically! Extremely simple, very flexible and greatly automated, although the underlying layer took some research and thinking (which you don’t have to do anymore, just adapt this mechanism to your own needs).
Oh yes, strictly speaking our Windows service is not compliant with the Windows service rules, so an error will be returned and the service will be stopped. However, that’s no big deal, because in the meantime Windows has tried to start the service, meaning the cmd.exe from binpath has started. This cmd.exe is ended because of the error, but our service’s command also contains some parameters for cmd.exe, i.e. to start another cmd.exe, and this one is not killed after the error! So we still have a cmd.exe left after the service has stopped and this one executes our first batch file (and thus certutil). This is also the reason our 2nd script has to kill this 2nd cmd.exe. You see, it seems illogical and complicated at first, but once you know the story behind it, things aren’t that bad 🙂
Of course the above can be done for LOCAL SERVICE too.
More related information can be found here: http://technet.microsoft.com/en-us/library/bb457027.aspx.
I hope this article was useful for you. If you have comments you can leave them here. Thanks and CU!