How to calculate the frequency of CPU cores
Asked Answered
C

1

7

I am trying to use RDTSC but it seems like my approach may be wrong to get the core speed:

#include "stdafx.h"
#include <windows.h>
#include <process.h>
#include <iostream>

using namespace std;

struct Core
{
    int CoreNumber;
};

static void startMonitoringCoreSpeeds(void *param)
{
    Core core = *((Core *)param);
    SetThreadAffinityMask(GetCurrentThread(), 1 << core.CoreNumber);
    while (true)
    {
        DWORD64 first = __rdtsc();
        Sleep(1000);
        DWORD64 second = __rdtsc();
        cout << "Core " << core.CoreNumber << " has frequency " << ((second - first)*pow(10, -6)) << " MHz" << endl;
    }
}

int GetNumberOfProcessorCores()
{
    DWORD process, system;
    if (GetProcessAffinityMask(GetCurrentProcess(), &process, &system))
    {
        int count = 0;
        for (int i = 0; i < 32; i++)
        {
            if (system & (1 << i))
            {
                count++;
            }
        }
        return count;
    }
    SYSTEM_INFO sysinfo;
    GetSystemInfo(&sysinfo);
    return sysinfo.dwNumberOfProcessors;
}

int _tmain(int argc, _TCHAR* argv[])
{
    for (int i = 0; i < GetNumberOfProcessorCores(); i++)
    {
        Core *core = new Core {0};
        core->CoreNumber = i;
        _beginthread(startMonitoringCoreSpeeds, 0, core);
    }
    cin.get();
}

It always prints out values around 3.3 GHz, which is wrong because things like Turbo Boost are on from time to time and my cores jump to 4.3 GHz for sure. Let me cross-reference some articles behind this idea.

Firstly (http://users.utcluj.ro/~ancapop/labscs/SCS2.pdf): "The TSCs on the processor’s cores are not synchronized. So it is not sure that if a process migrates during execution from one core to another, the measurement will not be affected. To avoid this problem, the measured process’s affinity has to be set to just one core, to prevent process migration." This tells me that RDTSC should return a different value per core my thread is on using the affinity mask I set, which is great.

Secondly, and please check this article (http://randomascii.wordpress.com/2011/07/29/rdtsc-in-the-age-of-sandybridge/): "If you need a consistent timer that works across cores and can be used to measure time then this is good news. If you want to measure actual CPU clock cycles then you are out of luck. If you want consistency across a wide range of CPU families then it sucks to be you. Update: section 16.11 of the Intel System Programming Guide documents this behavior of the Time-Stamp Counter. Roughly speaking it says that on older processors the clock rate changes, but on newer processors it remains uniform. It finishes by saying, of Constant TSC, “This is the architectural behavior moving forward." Okay, this tells me that RDTSC stays consistent, which makes my above results make sense since my CPU cores are rated at a standard 3.3 GHz...

Which REALLY begs the question, how do applications like Intel's Turbo Boost Technology Monitor and Piriform's Speccy and CPUID's CPU-Z measure a processor's clock speed while undergoing turbo boost, realtime?

Cloudcapped answered 23/4, 2014 at 17:55 Comment(32)
They directly access the BIOS to read the bus speeds/settings and do their own arithmetic. This requires having a database of all the different systems. Which is why CPUz needs to be updated every time a new processor comes out.Ilia
Suggestion : If you are not bound to use RDTSC, give try a try using Win32_Processor class of WMI for your approachZaid
possible duplicate of Finding out the CPU clock frequency (per core, per processor)Emalia
@Zaid I tried that but that's also always constant.Cloudcapped
@Mysticial: Not the BIOS so much as the model-specific registers in the CPU.Emalia
@Ilia Do you know of any examples to do this?Cloudcapped
@Alexandru: Yes, search this site. That exact "rdtsc in the age of Sandy Bridge" has been cited before.Emalia
@BenVoigt Ah. Then you know more than me. :)Ilia
@BenVoigt "read frequency from model-specific registers" didn't turn up anything as a search query :)Cloudcapped
@Alexandru: Most of these look relevant.Emalia
Something like this? #65595Cloudcapped
@Alexandru: As a side note, in your startMonitoringCoreSpeeds function and after you Sleep, you should check the actual passed time (using an alternate method, e.g. timeGetTime, etc.) to have more accurate measurements. Sleep is not guaranteed to actually sleep for the given time. (I do realize that my point is moot, but you will run into the problem I'm talking about if you solve the RDTSC issue.)Guillemot
@yzt: See my "has been cited before" link above, which does that correctly.Emalia
@BenVoigt Just ran ProcSpeedCalc() from that answer; it always returns 3300 MHz for me. :(Cloudcapped
@Alexandru: Yes, it measures the RDTSC frequency, and does that correctly. On new processors that won't be the same as the dynamic frequency.Emalia
@BenVoigt But I am looking to get the current, dynamic frequencyCloudcapped
@Alexandru: Did you read the answers on the question I marked this as a duplicate of?Emalia
@BenVoigt You're really tempting me to try out these special registers. :) But I can't do that in VS since it doesn't allow inline assembly on x64. :(Ilia
@BenVoigt Yes, that's the first Stack Overflow article I consulted with for the past while, while trying to implement an algorithm to do this.Cloudcapped
@Mysticial: You don't need assembly, there's an intrinsic But the whole instruction only works from privileged code (drivers).Emalia
@BenVoigt Damn... Ring 0. That sucks... :(Ilia
So the practical answer to this question is that you need to find a driver that reads MSRs and returns them to your program, and recommendation questions are off-topic. Problems encountered while writing such a driver would be on-topic, but you'd need a lot of background in driver development. OTOH, you can figure out what driver is used by CPU-Z or the TurboBoost monitor.Emalia
Also relevant: software.intel.com/en-us/forums/topic/278056Emalia
@BenVoigt No, this can't be the only way to do it...can I play devil's advocate for a second? I just queried my system for drivers: driverquery > "%userprofile%\drivers.txt", nothing shows up with Speccy or Piriform and yet this application still seems to get it right when it detects CPU core speeds.Cloudcapped
My justification is that if this was the only way to measure the real frequency, I would expect some of these applications to actually piggyback off of a kernel mode driver that reads the MSR values...Cloudcapped
@Alexandru: I'm sure they do call into a kernel-mode driver to get those values. The fact that you don't recognize anything in the driverquery output is not very convincing. I'd suggest that you profile the program for CreateFile calls (drivers are exposed to user-mode as pseudo-files). Try Process Explorer to see what pseudo-files are open, and Process Monitor or API Monitor to capture calls to CreateFile.Emalia
Or you could just read the source code of OpenHardwareMonitor: code.google.com/p/open-hardware-monitor/source/browse/trunk/… Look, I see a function in there named Ring0.Rdmsr() which is used for everything. Here's the implementation of that class, which drops a driver on the system. You could use it from your application too, via IOCTL_OLS_READ_MSR, if you abide by the licenseEmalia
@BenVoigt Ah, thanks man! I was able to get what I wanted. Actually, that led me to reading the copyright: Copyright (C) 2010-2012 Michael Möller <[email protected]>. Then I went to openhardwaremonitor.org, and checked out and built the SVN branch of it at openhardwaremonitor.org/downloads using TortoiseSVN: open-hardware-monitor.googlecode.com/svn/trunkCloudcapped
Alright so from this sample code I can see that they install either WinRing0x64.sys or WinRing0.sys on runtime (included inside the project) which are kernel level drivers; they use them to call Ring0.Rdmsr() like you said to calculate the actual frequency.Cloudcapped
If you install WDK you can use this article to create an IO device and run communications between your kernel mode driver and the app: ericasselin.com/…Cloudcapped
@BenVoigt Full solution below.Cloudcapped
@Ilia Full solution below.Cloudcapped
C
4

Full solution follows. I've adapted the IOCTL sample driver on MSDN to do this. Note, the IOCTL sample is the only relative WDM sample skeleton driver I could find and also the closest thing I could find to a WDM template because most kernel mode templates out of the box in WDK are WDF-based drivers (any WDM driver template is actually blank with absolutely no source code), yet the only sample logic I've seen to do this input/output was through a WDM-based driver. Also, some fun facts I've learned along the way: kernel drivers don't like floating arithmetic and you can't use "windows.h" which really limits you to "ntddk.h", a special kernel-mode header. This also means I can't do all of my computations inside of kernel mode because I can't call functions like QueryPerformanceFrequency in there, so I had to get the mean performance ratio between timestamps and return them back to user mode for some computations (without QueryPerformanceFrequency, the values you get from CPU registers that store ticks like what QueryPerformanceCounter uses are useless because you don't know the step size; maybe there's a workaround to this but I opted to just use the mean since it works pretty damn well). Also, as per the one second sleep, the reason I used that is because otherwise you're almost spin-computing shit on multiple threads, which really messes up your calculations because your frequencies will go up per core constantly checking results from QueryPerformanceCounter (you drive your cores up as you do more computations) - NOT TO MENTION - its a ratio...so the delta time is not that important since its cycles per time...you can always increase the delta, it should still give you the same ratio relative to the step size. Furthermore, this is as minimalistic as I could get it to be. Good luck making it much smaller or shorter than this. Also, if you want to install the driver, you have two options unless you want to buy a Code Signing certificate from some third party, both suck, so pick one and suck it up. Let's start with the driver:

driver.c:

//
// Include files.
//

#include <ntddk.h>          // various NT definitions
#include <string.h>
#include <intrin.h>

#include "driver.h"

#define NT_DEVICE_NAME      L"\\Device\\KernelModeDriver"
#define DOS_DEVICE_NAME     L"\\DosDevices\\KernelModeDriver"

#if DBG
#define DRIVER_PRINT(_x_) \
                DbgPrint("KernelModeDriver.sys: ");\
                DbgPrint _x_;

#else
#define DRIVER_PRINT(_x_)
#endif

//
// Device driver routine declarations.
//

DRIVER_INITIALIZE DriverEntry;

_Dispatch_type_(IRP_MJ_CREATE)
_Dispatch_type_(IRP_MJ_CLOSE)
DRIVER_DISPATCH DriverCreateClose;

_Dispatch_type_(IRP_MJ_DEVICE_CONTROL)
DRIVER_DISPATCH DriverDeviceControl;

DRIVER_UNLOAD DriverUnloadDriver;

VOID
PrintIrpInfo(
    PIRP Irp
    );
VOID
PrintChars(
    _In_reads_(CountChars) PCHAR BufferAddress,
    _In_ size_t CountChars
    );

#ifdef ALLOC_PRAGMA
#pragma alloc_text( INIT, DriverEntry )
#pragma alloc_text( PAGE, DriverCreateClose)
#pragma alloc_text( PAGE, DriverDeviceControl)
#pragma alloc_text( PAGE, DriverUnloadDriver)
#pragma alloc_text( PAGE, PrintIrpInfo)
#pragma alloc_text( PAGE, PrintChars)
#endif // ALLOC_PRAGMA


NTSTATUS
DriverEntry(
    _In_ PDRIVER_OBJECT   DriverObject,
    _In_ PUNICODE_STRING      RegistryPath
    )
/*++

Routine Description:
    This routine is called by the Operating System to initialize the driver.

    It creates the device object, fills in the dispatch entry points and
    completes the initialization.

Arguments:
    DriverObject - a pointer to the object that represents this device
    driver.

    RegistryPath - a pointer to our Services key in the registry.

Return Value:
    STATUS_SUCCESS if initialized; an error otherwise.

--*/

{
    NTSTATUS        ntStatus;
    UNICODE_STRING  ntUnicodeString;    // NT Device Name "\Device\KernelModeDriver"
    UNICODE_STRING  ntWin32NameString;    // Win32 Name "\DosDevices\KernelModeDriver"
    PDEVICE_OBJECT  deviceObject = NULL;    // ptr to device object

    UNREFERENCED_PARAMETER(RegistryPath);

    RtlInitUnicodeString( &ntUnicodeString, NT_DEVICE_NAME );

    ntStatus = IoCreateDevice(
        DriverObject,                   // Our Driver Object
        0,                              // We don't use a device extension
        &ntUnicodeString,               // Device name "\Device\KernelModeDriver"
        FILE_DEVICE_UNKNOWN,            // Device type
        FILE_DEVICE_SECURE_OPEN,        // Device characteristics
        FALSE,                          // Not an exclusive device
        &deviceObject );                // Returned ptr to Device Object

    if ( !NT_SUCCESS( ntStatus ) )
    {
        DRIVER_PRINT(("Couldn't create the device object\n"));
        return ntStatus;
    }

    //
    // Initialize the driver object with this driver's entry points.
    //

    DriverObject->MajorFunction[IRP_MJ_CREATE] = DriverCreateClose;
    DriverObject->MajorFunction[IRP_MJ_CLOSE] = DriverCreateClose;
    DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = DriverDeviceControl;
    DriverObject->DriverUnload = DriverUnloadDriver;

    //
    // Initialize a Unicode String containing the Win32 name
    // for our device.
    //

    RtlInitUnicodeString( &ntWin32NameString, DOS_DEVICE_NAME );

    //
    // Create a symbolic link between our device name  and the Win32 name
    //

    ntStatus = IoCreateSymbolicLink(
                        &ntWin32NameString, &ntUnicodeString );

    if ( !NT_SUCCESS( ntStatus ) )
    {
        //
        // Delete everything that this routine has allocated.
        //
        DRIVER_PRINT(("Couldn't create symbolic link\n"));
        IoDeleteDevice( deviceObject );
    }


    return ntStatus;
}


NTSTATUS
DriverCreateClose(
    PDEVICE_OBJECT DeviceObject,
    PIRP Irp
    )
/*++

Routine Description:

    This routine is called by the I/O system when the KernelModeDriver is opened or
    closed.

    No action is performed other than completing the request successfully.

Arguments:

    DeviceObject - a pointer to the object that represents the device
    that I/O is to be done on.

    Irp - a pointer to the I/O Request Packet for this request.

Return Value:

    NT status code

--*/

{
    UNREFERENCED_PARAMETER(DeviceObject);

    PAGED_CODE();

    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;

    IoCompleteRequest( Irp, IO_NO_INCREMENT );

    return STATUS_SUCCESS;
}

VOID
DriverUnloadDriver(
    _In_ PDRIVER_OBJECT DriverObject
    )
/*++

Routine Description:

    This routine is called by the I/O system to unload the driver.

    Any resources previously allocated must be freed.

Arguments:

    DriverObject - a pointer to the object that represents our driver.

Return Value:

    None
--*/

{
    PDEVICE_OBJECT deviceObject = DriverObject->DeviceObject;
    UNICODE_STRING uniWin32NameString;

    PAGED_CODE();

    //
    // Create counted string version of our Win32 device name.
    //

    RtlInitUnicodeString( &uniWin32NameString, DOS_DEVICE_NAME );


    //
    // Delete the link from our device name to a name in the Win32 namespace.
    //

    IoDeleteSymbolicLink( &uniWin32NameString );

    if ( deviceObject != NULL )
    {
        IoDeleteDevice( deviceObject );
    }



}

NTSTATUS
DriverDeviceControl(
    PDEVICE_OBJECT DeviceObject,
    PIRP Irp
    )

/*++

Routine Description:

    This routine is called by the I/O system to perform a device I/O
    control function.

Arguments:

    DeviceObject - a pointer to the object that represents the device
        that I/O is to be done on.

    Irp - a pointer to the I/O Request Packet for this request.

Return Value:

    NT status code

--*/

{
    PIO_STACK_LOCATION  irpSp;// Pointer to current stack location
    NTSTATUS            ntStatus = STATUS_SUCCESS;// Assume success
    ULONG               inBufLength; // Input buffer length
    ULONG               outBufLength; // Output buffer length
    void                *inBuf; // pointer to input buffer
    unsigned __int64    *outBuf; // pointer to the output buffer

    UNREFERENCED_PARAMETER(DeviceObject);

    PAGED_CODE();

    irpSp = IoGetCurrentIrpStackLocation( Irp );
    inBufLength = irpSp->Parameters.DeviceIoControl.InputBufferLength;
    outBufLength = irpSp->Parameters.DeviceIoControl.OutputBufferLength;

    if (!inBufLength || !outBufLength || outBufLength != sizeof(unsigned __int64)*2)
    {
        ntStatus = STATUS_INVALID_PARAMETER;
        goto End;
    }

    //
    // Determine which I/O control code was specified.
    //

    switch ( irpSp->Parameters.DeviceIoControl.IoControlCode )
    {
    case IOCTL_SIOCTL_METHOD_BUFFERED:

        //
        // In this method the I/O manager allocates a buffer large enough to
        // to accommodate larger of the user input buffer and output buffer,
        // assigns the address to Irp->AssociatedIrp.SystemBuffer, and
        // copies the content of the user input buffer into this SystemBuffer
        //

        DRIVER_PRINT(("Called IOCTL_SIOCTL_METHOD_BUFFERED\n"));
        PrintIrpInfo(Irp);

        //
        // Input buffer and output buffer is same in this case, read the
        // content of the buffer before writing to it
        //

        inBuf = (void *)Irp->AssociatedIrp.SystemBuffer;
        outBuf = (unsigned __int64 *)Irp->AssociatedIrp.SystemBuffer;

        //
        // Read the data from the buffer
        //

        DRIVER_PRINT(("\tData from User :"));
        //
        // We are using the following function to print characters instead
        // DebugPrint with %s format because we string we get may or
        // may not be null terminated.
        //
        PrintChars(inBuf, inBufLength);

        //
        // Write to the buffer
        //

        unsigned __int64 data[sizeof(unsigned __int64) * 2];
        data[0] = __readmsr(232);
        data[1] = __readmsr(231);

        DRIVER_PRINT(("data[0]: %d", data[0]));
        DRIVER_PRINT(("data[1]: %d", data[1]));

        RtlCopyBytes(outBuf, data, outBufLength);

        //
        // Assign the length of the data copied to IoStatus.Information
        // of the Irp and complete the Irp.
        //

        Irp->IoStatus.Information = sizeof(unsigned __int64)*2;

        //
        // When the Irp is completed the content of the SystemBuffer
        // is copied to the User output buffer and the SystemBuffer is
        // is freed.
        //

       break;

    default:

        //
        // The specified I/O control code is unrecognized by this driver.
        //

        ntStatus = STATUS_INVALID_DEVICE_REQUEST;
        DRIVER_PRINT(("ERROR: unrecognized IOCTL %x\n",
            irpSp->Parameters.DeviceIoControl.IoControlCode));
        break;
    }

End:
    //
    // Finish the I/O operation by simply completing the packet and returning
    // the same status as in the packet itself.
    //

    Irp->IoStatus.Status = ntStatus;

    IoCompleteRequest( Irp, IO_NO_INCREMENT );

    return ntStatus;
}

VOID
PrintIrpInfo(
    PIRP Irp)
{
    PIO_STACK_LOCATION  irpSp;
    irpSp = IoGetCurrentIrpStackLocation( Irp );

    PAGED_CODE();

    DRIVER_PRINT(("\tIrp->AssociatedIrp.SystemBuffer = 0x%p\n",
        Irp->AssociatedIrp.SystemBuffer));
    DRIVER_PRINT(("\tIrp->UserBuffer = 0x%p\n", Irp->UserBuffer));
    DRIVER_PRINT(("\tirpSp->Parameters.DeviceIoControl.Type3InputBuffer = 0x%p\n",
        irpSp->Parameters.DeviceIoControl.Type3InputBuffer));
    DRIVER_PRINT(("\tirpSp->Parameters.DeviceIoControl.InputBufferLength = %d\n",
        irpSp->Parameters.DeviceIoControl.InputBufferLength));
    DRIVER_PRINT(("\tirpSp->Parameters.DeviceIoControl.OutputBufferLength = %d\n",
        irpSp->Parameters.DeviceIoControl.OutputBufferLength ));
    return;
}

VOID
PrintChars(
    _In_reads_(CountChars) PCHAR BufferAddress,
    _In_ size_t CountChars
    )
{
    PAGED_CODE();

    if (CountChars) {

        while (CountChars--) {

            if (*BufferAddress > 31
                 && *BufferAddress != 127) {

                KdPrint (( "%c", *BufferAddress) );

            } else {

                KdPrint(( ".") );

            }
            BufferAddress++;
        }
        KdPrint (("\n"));
    }
    return;
}

driver.h:

//
// Device type           -- in the "User Defined" range."
//
#define SIOCTL_TYPE 40000
//
// The IOCTL function codes from 0x800 to 0xFFF are for customer use.
//
#define IOCTL_SIOCTL_METHOD_IN_DIRECT \
    CTL_CODE( SIOCTL_TYPE, 0x900, METHOD_IN_DIRECT, FILE_ANY_ACCESS  )

#define IOCTL_SIOCTL_METHOD_OUT_DIRECT \
    CTL_CODE( SIOCTL_TYPE, 0x901, METHOD_OUT_DIRECT , FILE_ANY_ACCESS  )

#define IOCTL_SIOCTL_METHOD_BUFFERED \
    CTL_CODE( SIOCTL_TYPE, 0x902, METHOD_BUFFERED, FILE_ANY_ACCESS  )

#define IOCTL_SIOCTL_METHOD_NEITHER \
    CTL_CODE( SIOCTL_TYPE, 0x903, METHOD_NEITHER , FILE_ANY_ACCESS  )


#define DRIVER_FUNC_INSTALL     0x01
#define DRIVER_FUNC_REMOVE      0x02

#define DRIVER_NAME       "ReadMSRDriver"

Now, here is the application that loads up and uses the driver (Win32 Console Application):

FrequencyCalculator.cpp:

#include "stdafx.h"
#include <iostream>
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strsafe.h>
#include <process.h>
#include "..\KernelModeDriver\driver.h"

using namespace std;

BOOLEAN
ManageDriver(
_In_ LPCTSTR  DriverName,
_In_ LPCTSTR  ServiceName,
_In_ USHORT   Function
);

HANDLE hDevice;
TCHAR driverLocation[MAX_PATH];

void InstallDriver()
{
    DWORD errNum = 0;
    GetCurrentDirectory(MAX_PATH, driverLocation);
    _tcscat_s(driverLocation, _T("\\KernelModeDriver.sys"));

    std::wcout << "Trying to install driver at " << driverLocation << std::endl;

    //
    // open the device
    //

    if ((hDevice = CreateFile(_T("\\\\.\\KernelModeDriver"),
        GENERIC_READ | GENERIC_WRITE,
        0,
        NULL,
        CREATE_ALWAYS,
        FILE_ATTRIBUTE_NORMAL,
        NULL)) == INVALID_HANDLE_VALUE) {

        errNum = GetLastError();

        if (errNum != ERROR_FILE_NOT_FOUND) {

            printf("CreateFile failed!  ERROR_FILE_NOT_FOUND = %d\n", errNum);

            return;
        }

        //
        // The driver is not started yet so let us the install the driver.
        // First setup full path to driver name.
        //

        if (!ManageDriver(_T(DRIVER_NAME),
            driverLocation,
            DRIVER_FUNC_INSTALL
            )) {

            printf("Unable to install driver. \n");

            //
            // Error - remove driver.
            //

            ManageDriver(_T(DRIVER_NAME),
                driverLocation,
                DRIVER_FUNC_REMOVE
                );

            return;
        }

        hDevice = CreateFile(_T("\\\\.\\KernelModeDriver"),
            GENERIC_READ | GENERIC_WRITE,
            0,
            NULL,
            CREATE_ALWAYS,
            FILE_ATTRIBUTE_NORMAL,
            NULL);

        if (hDevice == INVALID_HANDLE_VALUE){
            printf("Error: CreatFile Failed : %d\n", GetLastError());
            return;
        }
    }
}

void UninstallDriver()
{
    //
    // close the handle to the device.
    //

    CloseHandle(hDevice);

    //
    // Unload the driver.  Ignore any errors.
    //
    ManageDriver(_T(DRIVER_NAME),
        driverLocation,
        DRIVER_FUNC_REMOVE
        );
}

double GetPerformanceRatio()
{
    BOOL bRc;
    ULONG bytesReturned;

    int input = 0;
    unsigned __int64 output[2];
    memset(output, 0, sizeof(unsigned __int64) * 2);

    //printf("InputBuffer Pointer = %p, BufLength = %d\n", &input, sizeof(&input));
    //printf("OutputBuffer Pointer = %p BufLength = %d\n", &output, sizeof(&output));

    //
    // Performing METHOD_BUFFERED
    //

    //printf("\nCalling DeviceIoControl METHOD_BUFFERED:\n");

    bRc = DeviceIoControl(hDevice,
        (DWORD)IOCTL_SIOCTL_METHOD_BUFFERED,
        &input,
        sizeof(&input),
        output,
        sizeof(unsigned __int64)*2,
        &bytesReturned,
        NULL
        );

    if (!bRc)
    {
        //printf("Error in DeviceIoControl : %d", GetLastError());
        return 0;

    }
    //printf("    OutBuffer (%d): %d\n", bytesReturned, output);
    if (output[1] == 0)
    {
        return 0;
    }
    else
    {
        return (float)output[0] / (float)output[1];
    }
}

struct Core
{
    int CoreNumber;
};

int GetNumberOfProcessorCores()
{
    SYSTEM_INFO sysinfo;
    GetSystemInfo(&sysinfo);
    return sysinfo.dwNumberOfProcessors;
}

float GetCoreFrequency()
{
    // __rdtsc: Returns the processor time stamp which records the number of clock cycles since the last reset.
    // QueryPerformanceCounter: Returns a high resolution time stamp that can be used for time-interval measurements.
    // Get the frequency which defines the step size of the QueryPerformanceCounter method.
    LARGE_INTEGER frequency;
    QueryPerformanceFrequency(&frequency);
    // Get the number of cycles before we start.
    ULONG cyclesBefore = __rdtsc();
    // Get the Intel performance ratio at the start.
    float ratioBefore = GetPerformanceRatio();
    // Get the start time.
    LARGE_INTEGER startTime;
    QueryPerformanceCounter(&startTime);
    // Give the CPU cores enough time to repopulate their __rdtsc and QueryPerformanceCounter registers.
    Sleep(1000);
    ULONG cyclesAfter = __rdtsc();
    // Get the Intel performance ratio at the end.
    float ratioAfter = GetPerformanceRatio();
    // Get the end time.
    LARGE_INTEGER endTime;
    QueryPerformanceCounter(&endTime);
    // Return the number of MHz. Multiply the core's frequency by the mean MSR (model-specific register) ratio (the APERF register's value divided by the MPERF register's value) between the two timestamps.
    return ((ratioAfter + ratioBefore) / 2)*(cyclesAfter - cyclesBefore)*pow(10, -6) / ((endTime.QuadPart - startTime.QuadPart) / frequency.QuadPart);
}

struct CoreResults
{
    int CoreNumber;
    float CoreFrequency;
};

CRITICAL_SECTION printLock;

static void printResult(void *param)
{
    EnterCriticalSection(&printLock);
    CoreResults coreResults = *((CoreResults *)param);
    std::cout << "Core " << coreResults.CoreNumber << " has a speed of " << coreResults.CoreFrequency << " MHz" << std::endl;
    delete param;
    LeaveCriticalSection(&printLock);
}

bool closed = false;

static void startMonitoringCoreSpeeds(void *param)
{
    Core core = *((Core *)param);
    SetThreadAffinityMask(GetCurrentThread(), 1 << core.CoreNumber);
    while (!closed)
    {
        CoreResults *coreResults = new CoreResults();
        coreResults->CoreNumber = core.CoreNumber;
        coreResults->CoreFrequency = GetCoreFrequency();
        _beginthread(printResult, 0, coreResults);
        Sleep(1000);
    }
    delete param;
}

int _tmain(int argc, _TCHAR* argv[])
{
    InitializeCriticalSection(&printLock);
    InstallDriver();
    for (int i = 0; i < GetNumberOfProcessorCores(); i++)
    {
        Core *core = new Core{ 0 };
        core->CoreNumber = i;
        _beginthread(startMonitoringCoreSpeeds, 0, core);
    }
    std::cin.get();
    closed = true;
    UninstallDriver();
    DeleteCriticalSection(&printLock);
}

It uses install.cpp which you can get from the IOCTL sample. I will post a working, fully working and ready solution (with code, obviously) on my blog over the next few days, if not tonight.

Edit: Blogged it at http://www.dima.to/blog/?p=101 (full source code available there)...

Cloudcapped answered 10/6, 2014 at 0:15 Comment(3)
If you're going to make it public, it might be worth cleaning up a little. For example, is it named KernelModeDriver or ReadMSRDriver. The code uses both.Emalia
Also, skip the performance ratio averaging stuff. Use the broken method with __rdtsc() once to calculate the base rate, thereafter you can get instantaneous clock speed by checking the performance ratio and precalculated base rate.Emalia
rdtsc will not report real cpu cycles in most chips, there is 16.11.1 Invariant TSC feature in intel ("Processor's support for invariant TSC is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. ..the OS may use the TSC for wall clock timer services (instead of ACPI or HPET timers).", more at https://mcmap.net/q/19854/-rdtsc-accuracy-across-cpu-cores). If it can be used to provide wall clock, it is not dynamic cpu frequency counter. To get real cycle counter, ask for hardware PMU (in linux perf stat ./any_program does it) UNHALTED_CORE_CYCLESMacadam

© 2022 - 2024 — McMap. All rights reserved.