Tuesday, January 6, 2026

Enumerating Processes on Windows

Enumerating processes is a common task on a computer.  Whether you are using something like Task Manager, Process Explorer, or System Informer on Windows, or 'ps' or 'top' on Linux/Unix/macOS, sometimes you just need to know what processes are running on your system.  As a developer, sometimes you need to know programmatically what processes are running.  However, enumerating the running processes on Windows is not straightforward, or more accurately, there are multiple ways to accomplish the task: no fewer than 5.  This raises some questions.  Which should I use?  Are there benefits to different methods?  Which method is the most efficient?  In this post we will look at the different methods and discuss their differences, pros, and cons.

To start off, the 5 methods of enumerating processes programmatically on Windows are...

  1. EnumProcesses Win32 API
  2. Toolhelp library
  3. WTSEnumerateProcesses WTS API
  4. Win32_Process table in WMI
  5. NtQuerySystemInformation (an undocumented API)

 

EnumProcesses Win32 API

Let's start off by looking at the EnumProcesses() API.  I suspect this is one of the oldest methods on Windows, if not the original method.  The API is simple: you provide a buffer and the API fills it with a list of PIDs for all running processes.  It is then up to you to call additional APIs to get useful info about each PID, such as the process filename.

EnumProcesses() is not a very graceful API: if the buffer you provide is too small, the API does not tell you how large it should be.  The only signal is that the returned size equals the buffer size, so you may end up calling the API multiple times until the buffer is larger than the amount of data returned.

Here is a sample function demonstrating the use of the EnumProcesses API.

std::map<uint32_t, std::wstring> EnumProcessesWin32(void)
{
    std::map<uint32_t, std::wstring> mapProcesses;

    size_t nAllocCount = 1024;
    auto pBuffer = std::make_unique<DWORD[]>(nAllocCount);

    size_t nLoopCount = 0;
    DWORD dwReturnedSize = 0;    // Stays 0 if the first EnumProcesses call fails
    EnumProcesses(pBuffer.get(), (DWORD)nAllocCount * sizeof(DWORD), &dwReturnedSize);
    while(dwReturnedSize / sizeof(DWORD) == nAllocCount && nLoopCount++ < 10)
    {
        nAllocCount += 1024;    // Increase the allocation and try again
        pBuffer = std::make_unique<DWORD[]>(nAllocCount);
        EnumProcesses(pBuffer.get(), (DWORD)nAllocCount * sizeof(DWORD), &dwReturnedSize);
    }

    if(dwReturnedSize)
    {
        size_t nCount = dwReturnedSize / sizeof(DWORD);
        for(size_t n = 0; n < nCount; ++n)
            mapProcesses.emplace(pBuffer.get()[n], GetProcessFilename(pBuffer.get()[n]));
    }

    return mapProcesses;
}
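
The sample above relies on a GetProcessFilename helper that is not shown in this post.  As a rough idea of what such a helper could look like, here is a minimal sketch using OpenProcess() and QueryFullProcessImageNameW().  The name GetProcessFilenameSketch is hypothetical and this is not necessarily how the real helper works.  The sketch assumes paths fit in MAX_PATH characters, and the non-Windows branch exists only so the code compiles elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical stand-in for the GetProcessFilename helper used above.
// Returns the executable's file name, or an empty string when the
// process cannot be opened (e.g. protected or already-exited processes).
std::wstring GetProcessFilenameSketch(uint32_t nPid)
{
#ifdef _WIN32
    std::wstring strName;

    // PROCESS_QUERY_LIMITED_INFORMATION is enough for QueryFullProcessImageNameW
    HANDLE hProcess = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, nPid);
    if(hProcess)
    {
        wchar_t szPath[MAX_PATH];   // Assumption: longer paths are simply skipped
        DWORD dwLen = MAX_PATH;     // In: buffer size in characters, out: characters written
        if(QueryFullProcessImageNameW(hProcess, 0, szPath, &dwLen))
        {
            assert(dwLen < MAX_PATH);
            strName.assign(szPath, dwLen);

            // Keep only the part after the last backslash
            size_t nSlash = strName.find_last_of(L'\\');
            if(nSlash != std::wstring::npos)
                strName.erase(0, nSlash + 1);
        }
        CloseHandle(hProcess);
    }
    return strName;
#else
    (void)nPid;     // Non-Windows builds: stub so the sketch still compiles
    return std::wstring();
#endif
}
```

Expect empty names for some PIDs: an unelevated caller cannot open every process, and a process can exit between the EnumProcesses() call and the OpenProcess() call.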


Toolhelp library

The Toolhelp library is another fairly old method for enumerating processes.  It works by first calling CreateToolhelp32Snapshot(), which takes a snapshot of the processes at that moment.  You then call Process32FirstW() and Process32NextW() to walk the snapshot one record at a time.  When you are done, you close the handle, which frees the resources.  This may sound complex, but the code can be fairly simple.

std::map<uint32_t, std::wstring> EnumProcessesToolhelp(void)
{
    std::map<uint32_t, std::wstring> mapProcesses;

    HANDLE hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if(hSnapshot != INVALID_HANDLE_VALUE)
    {
        PROCESSENTRY32W pe32 = {0};
        pe32.dwSize = sizeof(pe32);

        if(Process32FirstW(hSnapshot, &pe32))
        {
            do{
                mapProcesses.emplace(pe32.th32ProcessID, pe32.szExeFile);
            }while(Process32NextW(hSnapshot, &pe32));
        }

        CloseHandle(hSnapshot);
    }

    return mapProcesses;
}


WTSEnumerateProcesses WTS API

The WTS APIs are newer, having first appeared in Windows Vista.  This API is designed for services running on multi-user systems, but the API is available on all versions of Windows.  In my opinion, the WTS API is probably the easiest and cleanest code for general use.  You call a single API which returns the results in an allocated block of memory.  After you analyze the results you call a second API to free the memory.  Here is an example function.

std::map<uint32_t, std::wstring> EnumProcessesWts(void)
{
    std::map<uint32_t, std::wstring> mapProcesses;

    PWTS_PROCESS_INFOW pwtspi;
    DWORD dwCount;
    if(WTSEnumerateProcessesW(WTS_CURRENT_SERVER_HANDLE, 0, 1, &pwtspi, &dwCount))
    {
        for(DWORD dw = 0; dw < dwCount; ++dw)
            mapProcesses.emplace(pwtspi[dw].ProcessId, pwtspi[dw].pProcessName);

        WTSFreeMemory(pwtspi);
    }

    return mapProcesses;
}


Win32_Process table in WMI

Windows Management Instrumentation is a horrible API to interact with, at least from C++.  The amount of overhead required makes this method ugly and slow.  WMI is best used from scripting languages (e.g. PowerShell or VBScript).  WMI does have one notable benefit - you can use it to query a remote machine (assuming the firewall does not block it).  Of the methods shown here, it is the most general way to enumerate processes remotely; the WTS API can also accept a handle to a remote server from WTSOpenServer, but that path is aimed at Remote Desktop Session Host servers.

The following function shows an example, but note that this example uses a helper class to handle the complexity of WMI.  So the actual code is far more complicated.

std::map<uint32_t, std::wstring> EnumProcessesWmi(void)
{
    std::map<uint32_t, std::wstring> mapProcesses;

    CWmiService wmiSvc;
    if(SUCCEEDED(wmiSvc.Open()))
    {
        CWmiClass wmiClass = wmiSvc.GetClassFormat(L"Win32_Process", L"ProcessId", L"Name");

        CWmiInstance wmiInst = wmiClass.GetFirstInstance();
        while(wmiInst.IsOpen())
        {
            uint32_t nPid = (uint32_t)wmiInst.GetAsUInteger(L"ProcessId");
            std::wstring str(wmiInst.GetAsString(L"Name"));
            mapProcesses.emplace(nPid, str);

            wmiInst = wmiClass.GetNextInstance();
        }
    }

    return mapProcesses;
}

 

NtQuerySystemInformation (an undocumented API)

The last method is the undocumented Windows API NtQuerySystemInformation.  I call the API "undocumented" but this is a little misleading.  The API is documented: Microsoft details what parameters to pass in and the structs that are returned.  But Microsoft would like to discourage the use of this API, so 1) you cannot statically link to it and must load it dynamically, 2) Microsoft says to use alternate APIs (though they don't list recommended alternatives), and 3) they warn that the API may change at any time in the future.  So use of this API carries some risk.  I suspect the undocumented API is the one underlying method of enumerating processes on Windows, and that all of the other methods are wrappers around it, each providing its own set of flags and memory management requirements.

Because you cannot statically link to the API, the function to use this method appears more complicated than the rest.

std::map<uint32_t, std::wstring> EnumProcessesUndocumented(void)
{
    const auto SystemBasicProcessInformation = 252;
    constexpr auto STATUS_INFO_LENGTH_MISMATCH = 0xC0000004;
    using PFNNTQUERYSYSTEMINFORMATION = NTSTATUS(NTAPI*)(ULONG, PVOID, ULONG, PULONG);

    std::map<uint32_t, std::wstring> mapProcesses;

    HMODULE hNtDll = GetModuleHandleW(L"ntdll.dll");
    auto pfnNtQuerySystemInformation = hNtDll ? (PFNNTQUERYSYSTEMINFORMATION)GetProcAddress(hNtDll, "NtQuerySystemInformation") : nullptr;
    if(!pfnNtQuerySystemInformation)
        return mapProcesses;    // Function not available; return an empty map

    NTSTATUS status;
    ULONG nBufferSize = 0x80000;
    auto pProcessBuf = std::make_unique<BYTE[]>(nBufferSize);
    PSYSTEM_PROCESS_INFORMATION pspi = (PSYSTEM_PROCESS_INFORMATION)pProcessBuf.get();

    ULONG nRequired = 0;
    while((status = pfnNtQuerySystemInformation(SystemProcessInformation, pspi, nBufferSize, &nRequired)) == STATUS_INFO_LENGTH_MISMATCH && nRequired > nBufferSize)
    {    // Increase the buffer and try again
        nBufferSize = nRequired + 4096;
        pProcessBuf = std::make_unique<BYTE[]>(nBufferSize);
        pspi = (PSYSTEM_PROCESS_INFORMATION)pProcessBuf.get();
    }

    if(NT_SUCCESS(status))
    {
        while(1)
        {
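            // UniqueProcessId is pointer-sized; go through ULONG_PTR to avoid a truncation warning
            const uint32_t nPid = (uint32_t)(ULONG_PTR)pspi->UniqueProcessId;

            // ImageName is a UNICODE_STRING, so use its Length (in bytes); Buffer is NULL for the idle process
            std::wstring strName;
            if(pspi->ImageName.Buffer)
                strName.assign(pspi->ImageName.Buffer, pspi->ImageName.Length / sizeof(WCHAR));
            mapProcesses.emplace(nPid, strName);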

            if(pspi->NextEntryOffset == 0)
                break;

            pspi = (PSYSTEM_PROCESS_INFORMATION)((size_t)pspi + pspi->NextEntryOffset);
        }
    }

    return mapProcesses;
}


The undocumented method actually has a variant.  The above code uses the flag "SystemProcessInformation".  You can also call the API with the flag "SystemBasicProcessInformation", which returns less information but does so more quickly and efficiently.


Comparing the APIs and their performance

Any time you have multiple APIs that do the same thing, the question arises of how they compare performance-wise.  So I wrote test code to compare the 5 different methods.  You could call each API once and time that single call.  Or you could call each API a set number of times while timing the overall loop.  Both of these work, but I decided to go with a different method.  I created a 5-second kernel timer, and then called each API repeatedly in a loop as many times as possible.  You simply count the number of times the API was called during the 5 seconds; the more calls a method completes, the more efficient it is.
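
The counting approach described above can be sketched in portable C++.  This is a minimal sketch and not the test code behind the numbers in this post: it assumes std::chrono::steady_clock in place of the kernel timer, and CountCallsInWindow is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Calls fn() repeatedly until the time window elapses and returns how many
// calls completed.  Assumption: std::chrono::steady_clock stands in for the
// kernel timer used in the original test code.
template <typename Fn>
uint64_t CountCallsInWindow(Fn&& fn, std::chrono::milliseconds window)
{
    assert(window.count() > 0);

    const auto deadline = std::chrono::steady_clock::now() + window;
    uint64_t nCalls = 0;
    do
    {
        fn();       // The API under test
        ++nCalls;
    } while(std::chrono::steady_clock::now() < deadline);

    return nCalls;
}
```

For example, CountCallsInWindow([]{ EnumProcessesWts(); }, std::chrono::seconds(5)) would count calls to the WTS sample over 5 seconds.  Note that this sketch reads the clock after every call, which adds a small fixed cost that matters more for the fastest methods.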

No surprise, WMI is the least performant of the methods at only 144 calls in 5 seconds.  That's an average of 35 milliseconds per API call.  As you'll see compared to the other methods, this is horrible.

EnumProcesses is the next worst API at 1287 calls in 5 seconds.  That's roughly 9x better than WMI, but still pretty bad.

The Toolhelp library did slightly better at 1664 calls in 5 seconds.

The WTS API bested all of the documented methods at 2061 calls in 5 seconds.

If you are willing to use the undocumented API, then the performance increased to 3838 calls in 5 seconds.  That's almost double the performance of the WTS API.

But this is nothing compared to the undocumented API variant.  The simplified SystemBasicProcessInformation version clocked in at 102155 calls in 5 seconds.  That's about 50x faster than the WTS API and over 700x faster than WMI.
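
The per-call times and ratios quoted in this section follow directly from the call counts.  Here is a small sketch of that arithmetic, using only the counts reported above.  The function and constant names are made up for illustration.

```cpp
// Length of each measurement window in this post, in milliseconds
constexpr double kWindowMs = 5000.0;

// Average time per call for a method that completed nCalls in one window
constexpr double MsPerCall(unsigned nCalls)
{
    return kWindowMs / nCalls;
}

// How many times more calls nFaster completed than nSlower
constexpr double Speedup(unsigned nFaster, unsigned nSlower)
{
    return static_cast<double>(nFaster) / nSlower;
}

// Call counts reported in this post
constexpr unsigned kWmi = 144;
constexpr unsigned kEnumProcesses = 1287;
constexpr unsigned kWts = 2061;
constexpr unsigned kNtQuery = 3838;
constexpr unsigned kNtQueryBasic = 102155;

// WMI: just under 35 ms per call
static_assert(MsPerCall(kWmi) > 34.7 && MsPerCall(kWmi) < 34.8, "WMI ms per call");
// EnumProcesses vs. WMI: a little under 9x
static_assert(Speedup(kEnumProcesses, kWmi) > 8.9 && Speedup(kEnumProcesses, kWmi) < 9.0, "EnumProcesses vs WMI");
// NtQuerySystemInformation vs. WTS: almost double
static_assert(Speedup(kNtQuery, kWts) > 1.8 && Speedup(kNtQuery, kWts) < 1.9, "NtQuery vs WTS");
// The basic variant vs. WTS: just under 50x
static_assert(Speedup(kNtQueryBasic, kWts) > 49.5 && Speedup(kNtQueryBasic, kWts) < 49.6, "Basic vs WTS");
// The basic variant: under 0.05 ms (50 microseconds) per call
static_assert(MsPerCall(kNtQueryBasic) < 0.05, "Basic ms per call");
```

Because everything is constexpr, the checks run at compile time and the block produces no runtime code.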


Conclusion

I feel like the WTS method is the best method for most uses.  It's both fast and simple to use.  EnumProcesses and Toolhelp are both old and, in my view, have been superseded.  WMI should only be considered if you need remote access.

Which leaves the undocumented API.  If you are comfortable calling an undocumented API, then it is the most performant, with the variant being far and away the fastest method.  There is one big caveat: the variant requires Windows 11 build 26100.4770 or newer.  So you would likely need to code multiple methods and fall back to another one on older supported OSes.
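
One way to structure that fallback is to pick the method once, based on the OS build.  The following is only a sketch under stated assumptions: all of the names are hypothetical, it assumes every build at or above the minimum supports the variant, and it falls back to SystemProcessInformation (the WTS API would be an equally reasonable choice).  How you obtain the build and revision numbers is left to the caller.

```cpp
#include <cstdint>

// Hypothetical names throughout; none of these come from the Windows SDK.
enum class ProcessEnumMethod
{
    NtQueryBasic,   // NtQuerySystemInformation with SystemBasicProcessInformation
    NtQueryFull,    // NtQuerySystemInformation with SystemProcessInformation
};

struct WindowsBuild
{
    uint32_t nBuild;      // e.g. 26100
    uint32_t nRevision;   // e.g. 4770
};

// Minimum build quoted in this post for the SystemBasicProcessInformation variant
constexpr WindowsBuild kMinBasicBuild = {26100, 4770};

// Assumption: every build at or above the minimum supports the variant.
constexpr bool SupportsBasicVariant(WindowsBuild build)
{
    return build.nBuild > kMinBasicBuild.nBuild ||
          (build.nBuild == kMinBasicBuild.nBuild && build.nRevision >= kMinBasicBuild.nRevision);
}

// Picks the fastest method the given build is assumed to support.
constexpr ProcessEnumMethod ChooseMethod(WindowsBuild build)
{
    return SupportsBasicVariant(build) ? ProcessEnumMethod::NtQueryBasic
                                       : ProcessEnumMethod::NtQueryFull;
}
```

Comparing the build first and the revision second matters: a higher revision on an older build (say 22631.9999) must not pass the check.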


You can find the full code sample on my GitHub page.


One final note: the code and performance numbers in this post are for demonstration purposes only.  They do not represent the maximum performance possible.  For example, the undocumented API sample looks up the function pointer and allocates a buffer on every call.  If you need to call this API repeatedly, it would be more efficient to perform that work once, outside of the loop.

 
