Is it possible to check in Java if the CPU is hyper threading?

Asked 31/7, 2012 at 10:31 Answered 24/7, 2018 at 13:52

Solved java multithreading hyperthreading

I would like to know the optimal number of threads I can run. Normally, this equals Runtime.getRuntime().availableProcessors().

However, the returned number is twice as high on a CPU supporting hyper threading. Now, for some tasks hyper threading is good, but for others it does nothing. In my case, I suspect it does nothing, and so I wish to know whether I have to divide the number returned by Runtime.getRuntime().availableProcessors() by two.

For that I have to deduce whether the CPU is hyper threading. Hence my question - how can I do it in Java?

Thanks.

EDIT

OK, I have benchmarked my code. Here is my environment:

Lenovo ThinkPad W510 (i.e. i7 CPU with 4 cores and hyperthreading), 16G of RAM
Windows 7
84 zipped CSV files with zipped sizes ranging from 105M to 16M
All the files are read one by one in the main thread - no multithreading access to the HD.
Each CSV file row contains some data, which is parsed and a fast context-free test determines whether the row is relevant.
Each relevant row contains two doubles (representing longitude and latitude, for the curious), which are coerced into a single Long, which is then stored in a shared hash set.

Thus the worker threads do not read anything from the HD, but they do occupy themselves with unzipping and parsing the contents (using the opencsv library).

Below is the code, w/o the boring details:

public void work(File dir) throws IOException, InterruptedException {
  Set<Long> allCoordinates = Collections.newSetFromMap(new ConcurrentHashMap<Long, Boolean>());
  int n = 6;
  // NO WAITING QUEUE !
  ThreadPoolExecutor exec = new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
  StopWatch sw1 = new StopWatch();
  StopWatch sw2 = new StopWatch();
  sw1.start();
  sw2.start();
  sw2.suspend();
  for (WorkItem wi : m_workItems) {
    for (File file : dir.listFiles(wi.fileNameFilter)) {
      MyTask task;
      try {
        sw2.resume();
        // The only reading from the HD occurs here:
        task = new MyTask(file, m_coordinateCollector, allCoordinates, wi.headerClass, wi.rowClass);
        sw2.suspend();
      } catch (IOException exc) {
        System.err.println(String.format("Failed to read %s - %s", file.getName(), exc.getMessage()));
        continue;
      }
      boolean retry = true;
      while (retry) {
        int count = exec.getActiveCount();
        try {
          // Fails if the maximum of the worker threads was created and all are busy.
          // This prevents us from loading all the files in memory and getting the OOM exception.
          exec.submit(task);
          retry = false;
        } catch (RejectedExecutionException exc) {
          // Wait for any worker thread to finish
          while (exec.getActiveCount() == count) {
            Thread.sleep(100);
          }
        }
      }
    }
  }
  exec.shutdown();
  exec.awaitTermination(1, TimeUnit.HOURS);
  sw1.stop();
  sw2.stop();
  System.out.println(String.format("Max concurrent threads = %d", n));
  System.out.println(String.format("Total file count = %d", m_stats.getFileCount()));
  System.out.println(String.format("Total lines = %d", m_stats.getTotalLineCount()));
  System.out.println(String.format("Total good lines = %d", m_stats.getGoodLineCount()));
  System.out.println(String.format("Total coordinates = %d", allCoordinates.size()));
  System.out.println(String.format("Overall elapsed time = %d sec, excluding I/O = %d sec", sw1.getTime() / 1000, (sw1.getTime() - sw2.getTime()) / 1000));
}

public class MyTask<H extends CsvFileHeader, R extends CsvFileRow<H>> implements Runnable {
  private final byte[] m_buffer;
  private final String m_name;
  private final CoordinateCollector m_coordinateCollector;
  private final Set<Long> m_allCoordinates;
  private final Class<H> m_headerClass;
  private final Class<R> m_rowClass;

  public MyTask(File file, CoordinateCollector coordinateCollector, Set<Long> allCoordinates,
                Class<H> headerClass, Class<R> rowClass) throws IOException {
    m_coordinateCollector = coordinateCollector;
    m_allCoordinates = allCoordinates;
    m_headerClass = headerClass;
    m_rowClass = rowClass;
    m_name = file.getName();
    m_buffer = Files.toByteArray(file);
  }

  @Override
  public void run() {
    try {
      m_coordinateCollector.collect(m_name, m_buffer, m_allCoordinates, m_headerClass, m_rowClass);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

Please, find below the results (I have slightly changed the output to omit the repeating parts):

Max concurrent threads = 4
Total file count = 84
Total lines = 56395333
Total good lines = 35119231
Total coordinates = 987045
Overall elapsed time = 274 sec, excluding I/O = 266 sec

Max concurrent threads = 6
Overall elapsed time = 218 sec, excluding I/O = 209 sec

Max concurrent threads = 7
Overall elapsed time = 209 sec, excluding I/O = 199 sec

Max concurrent threads = 8
Overall elapsed time = 201 sec, excluding I/O = 192 sec

Max concurrent threads = 9
Overall elapsed time = 198 sec, excluding I/O = 186 sec

You are free to draw your own conclusions, but mine is that hyperthreading does improve the performance in my concrete case. Also, having 6 worker threads seems to be the right choice for this task and my machine.

Sylvestersylvia answered 31/7, 2012 at 10:31 Comment(13)

Interesting question. +1. I found something which may interest you although it may not answer your question. https://mcmap.net/q/225819/-threads-per-processor – Demandant 31/7, 2012 at 10:37

Hyperthreading adds "logical processors", you can turn off Hyperthreading if you don't want to use it. – Lenardlenci 31/7, 2012 at 10:37

Will it make any noticeable, or significant, difference to your app performance if, on some machines, you have twice as many threads as cores? – Toolmaker 31/7, 2012 at 10:37

Note that hyperthreading does not give you two cpu cores - it "just" utilizes the single cpu better. – Ineludible 31/7, 2012 at 10:50

My code does no video/audio encoding/decoding, which, if I understood it correctly, is the only type of code that actually benefits from hyperthreading. If I/O is not involved, then having twice as many threads as there are actual cores seems only to slow down the computation. – Sylvestersylvia 31/7, 2012 at 10:57

Just curious - how much does it slow it down? You have any figures/timings? – Toolmaker 31/7, 2012 at 11:4

You could run a short (5-10 second) test computation to determine the best threading setup for the machine you're running on – Tuppeny 31/7, 2012 at 11:15

Nope, I personally do not have the figures. I have read about it here and there. I know, of course, that one has to do performance tests before jumping to optimizations, so the question is rather theoretical - I will test my code with several thread counts to discover the optimum. Still, I am curious whether I can deduce the presence of hyper threading. – Sylvestersylvia 31/7, 2012 at 11:16

It's practically impossible for a developer to foresee the performance - for example if your app has a lot of cache misses(eg. if it randomly reads memory) it will benefit from multiple HW threads per core. You can only measure it and check that resources are properly utilized, there is no magic formula like dividing by 2 or multiplying by Pi. – Belfast 31/7, 2012 at 11:31

+1 for Boris, (Treukhov, not Johnson), 'I have read about it here and there' - that's what I suspected <g> Lot of FUD-spreading re. multithreads.. – Toolmaker 31/7, 2012 at 12:4

How would you guarantee that your T/2 threads each end up on different cores, rather than using just half of the cores and still incurring the HT overheads? – Spinks 31/7, 2012 at 12:28

@Spinks - I don't think that the scheduler/dispatcher would do that, (but there again, not tested:). – Toolmaker 31/7, 2012 at 12:57

As @ThorbjørnRavnAndersen pointed out, Hyper-threading is all about utilization. If your CPU is already well utilized, you will not get much performance boost. Do not fall for marketing. Also, for some code, you may get better performance with it off. – Pedigree 21/11, 2013 at 2:44

For Windows, if the number of logical cores is higher than the number of cores, you have hyper-threading enabled. Read more about it here.

You can use wmic to find this information:

C:\WINDOWS\system32>wmic CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List


NumberOfCores=4
NumberOfLogicalProcessors=8

Hence, my system has hyper-threading. The number of logical processors is double the number of cores.

But you may not even need to know. Runtime.getRuntime().availableProcessors() already returns the number of logical processors.

A full example on getting the physical cores count (Windows only):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class PhysicalCores
{
    public static void main(String[] arguments) throws IOException, InterruptedException
    {
        int physicalNumberOfCores = getPhysicalNumberOfCores();
        System.out.println(physicalNumberOfCores);
    }

    private static int getPhysicalNumberOfCores() throws IOException, InterruptedException
    {
        ProcessBuilder processBuilder = new ProcessBuilder("wmic", "CPU", "Get", "NumberOfCores");
        processBuilder.redirectErrorStream(true);
        Process process = processBuilder.start();
        String processOutput = getProcessOutput(process);
        // Sum every numeric row (typically one per CPU socket); header and blank lines are skipped.
        int cores = 0;
        for (String line : processOutput.split(System.lineSeparator()))
        {
            String value = line.trim();
            if (value.matches("\\d+"))
            {
                cores += Integer.parseInt(value);
            }
        }
        return cores;
    }

    private static String getProcessOutput(Process process) throws IOException, InterruptedException
    {
        StringBuilder processOutput = new StringBuilder();

        try (BufferedReader processOutputReader = new BufferedReader(
                new InputStreamReader(process.getInputStream())))
        {
            String readLine;

            while ((readLine = processOutputReader.readLine()) != null)
            {
                processOutput.append(readLine);
                processOutput.append(System.lineSeparator());
            }

            process.waitFor();
        }

        return processOutput.toString().trim();
    }
}
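
Once both counts are known, the comparison described above (more logical processors than cores means hyper-threading is on) can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the physical core count is assumed to come from somewhere else, such as the wmic call above.

```java
// Hypothetical helper: names are illustrative and not part of any library.
final class SmtCheck {
    /** True when more logical processors than physical cores are reported. */
    static boolean smtLikelyEnabled(int logicalProcessors, int physicalCores) {
        requirePositive(logicalProcessors, physicalCores);
        return logicalProcessors > physicalCores;
    }

    /** Logical processors per physical core, rounded down, never below 1. */
    static int threadsPerCore(int logicalProcessors, int physicalCores) {
        requirePositive(logicalProcessors, physicalCores);
        return Math.max(1, logicalProcessors / physicalCores);
    }

    private static void requirePositive(int logical, int physical) {
        if (logical <= 0 || physical <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
    }

    public static void main(String[] args) {
        int logical = Runtime.getRuntime().availableProcessors();
        // Assumption: the physical core count is passed as the first argument;
        // without it we fall back to the logical count (i.e. "no SMT detected").
        int physical = args.length > 0 ? Integer.parseInt(args[0]) : logical;
        System.out.println("SMT likely enabled: " + smtLikelyEnabled(logical, physical));
        System.out.println("Threads per core:   " + threadsPerCore(logical, physical));
    }
}
```

Note that availableProcessors() reports what the JVM is allowed to use, so on a machine with restricted affinity the ratio can be misleading.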

Faddish answered 24/7, 2018 at 13:52 Comment(1)

If you show how to obtain the number of physical cores in Java, then your response is the answer to my question. – Sylvestersylvia 25/7, 2018 at 15:48

Unfortunately, this is not possible from Java. If you know that the app will run on a modern Linux variant, you can read the file /proc/cpuinfo and infer whether HT is enabled.

Reading the output of this command does the trick:

grep -i "physical id" /proc/cpuinfo | sort -u | wc -l
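
The same idea can be sketched in Java by parsing /proc/cpuinfo directly. This is a rough sketch, not a definitive implementation: it assumes the x86 Linux layout, in which each "processor" block lists "physical id" before "core id", and it counts distinct (physical id, core id) pairs instead of only "physical id", which also covers multi-socket machines. The class name CpuInfo is hypothetical. If those fields are missing (as on some virtual machines), it falls back to the logical count.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: assumes the x86 Linux /proc/cpuinfo layout.
final class CpuInfo {
    final int logicalProcessors;
    final int physicalCores;

    CpuInfo(int logicalProcessors, int physicalCores) {
        this.logicalProcessors = logicalProcessors;
        this.physicalCores = physicalCores;
    }

    static CpuInfo parse(List<String> lines) {
        int logical = 0;
        String physicalId = null;
        Set<String> cores = new HashSet<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank separator or unexpected line
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (key.equals("processor")) {
                logical++;          // one block per logical processor
                physicalId = null;  // reset for the new block
            } else if (key.equals("physical id")) {
                physicalId = value;
            } else if (key.equals("core id")) {
                cores.add(physicalId + "/" + value); // unique per socket and core
            }
        }
        // Fall back to the logical count when no core ids are reported.
        int physical = cores.isEmpty() ? logical : cores.size();
        return new CpuInfo(logical, physical);
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("/proc/cpuinfo");
        if (!Files.isReadable(path)) {
            System.out.println("/proc/cpuinfo is not available on this system");
            return;
        }
        CpuInfo info = parse(Files.readAllLines(path, StandardCharsets.UTF_8));
        System.out.println("Logical processors: " + info.logicalProcessors);
        System.out.println("Physical cores:     " + info.physicalCores);
    }
}
```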

Garett answered 31/7, 2012 at 10:38 Comment(5)

+1 for the info, but alas, the code is in Java, so it may be running on Windows as well as on Linux... – Sylvestersylvia 31/7, 2012 at 10:54

There is no platform independent way to do this. If you have a CPU which support hyper-threading, but this has been disabled, it can still look like you have hyper-threading. – Seabrook 31/7, 2012 at 11:14

No need to sort. Under some virtual configurations all ids are 0, which means sort will reduce the count to 1. – Euphemie 13/12, 2013 at 11:30

If you don't sort you might count HT cores, again, which defeats the purpose. But still I have the same issue, that physical id always returns 0 for my i7-3930K. What works for me though is: grep -i "cpu cores" /proc/cpuinfo | sort -u – Earle 18/8, 2016 at 8:33

If you have multiple sockets each with multiple cores, this may not work. superuser.com/a/932418/294432 seems like a good solution that essentially multiplies the number of sockets by the cores per socket. That may not work if you have sockets with different numbers of cores, but I think that is an unlikely scenario. – Sybarite 19/4, 2023 at 16:47

Few more musings:

Hyperthreading may have more than 2 threads per core (Sparc can have 8)
Garbage collector needs CPU time to work as well.
Hyperthreading may help a concurrent GC - or may not; or the JVM may request to be exclusive (not hyperthreading) owner of the core. So hampering the GC to get your better results during a test could be hurting in the long run.
Hyperthreading is usually useful if there are cache-misses, so the CPU is not stalled but switched to another task. Hence, "to hyperthreading or not" would depend both on the workload and the CPU L1/L2 cache size/memory speed, etc.
OS's may have bias towards/against some threads and Thread.setPriority may not be honored (on Linux it's usually not honored).
It's possible to set affinity of the process, disallowing some cores. So knowing that there is hyperthreading won't be of any significant virtue in such cases.

That being said: you should have a setting for the number of worker threads and a recommendation on how to set it given the specifics of the architecture.
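
A minimal sketch of such a setting, assuming it is passed as a JVM system property. The property name app.workerThreads and the class name are made up for illustration; invalid or missing values fall back to the logical processor count.

```java
// Hypothetical configuration helper; the property name is an assumption.
final class WorkerThreads {
    static final String PROPERTY = "app.workerThreads";

    /** Returns the configured thread count, or the fallback if it is absent or invalid. */
    static int resolve(String configured, int fallback) {
        if (configured != null) {
            try {
                int n = Integer.parseInt(configured.trim());
                if (n > 0) {
                    return n;
                }
            } catch (NumberFormatException ignored) {
                // fall through to the default
            }
        }
        return Math.max(1, fallback);
    }

    public static void main(String[] args) {
        int n = resolve(System.getProperty(PROPERTY),
                        Runtime.getRuntime().availableProcessors());
        System.out.println("Worker threads = " + n);
    }
}
```

With this in place the value could be overridden at launch, for example with -Dapp.workerThreads=6.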

Coretta answered 31/7, 2012 at 10:31 Comment(0)

There is no reliable way to determine whether you have hyper threading which is on, hyper threading which is off or no hyper threading.

Instead, a better approach is to calibrate the first time you run (or each time): run a short test that determines which approach to use.

Another approach is to use all the processors even if hyper threading doesn't help (provided it doesn't make the code dramatically slower)
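
A calibration pass along these lines might look like the sketch below. It is only an outline under stated assumptions: the caller supplies a Runnable that is a representative slice of the real workload, and the class and method names are hypothetical. Each candidate thread count is timed several times and the fastest run is kept, which reduces the influence of GC pauses and JIT warm-up but does not remove it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical calibrator; a sketch, not a rigorous benchmark harness.
final class ThreadCountCalibrator {
    /** Runs `tasks` copies of `unit` on `threads` threads; returns elapsed nanoseconds. */
    static long timeRun(Runnable unit, int tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(unit));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Returns the candidate thread count whose best run was fastest. */
    static int calibrate(Runnable unit, int tasks, int[] candidates, int repeats) {
        int best = candidates[0];
        long bestTime = Long.MAX_VALUE;
        for (int threads : candidates) {
            long fastest = Long.MAX_VALUE;
            for (int r = 0; r < repeats; r++) {
                fastest = Math.min(fastest, timeRun(unit, tasks, threads));
            }
            if (fastest < bestTime) {
                bestTime = fastest;
                best = threads;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int logical = Runtime.getRuntime().availableProcessors();
        int[] candidates = {Math.max(1, logical / 2), logical};
        // Placeholder workload: replace with a slice of the real task.
        Runnable unit = () -> {
            double x = 0;
            for (int i = 1; i < 2_000_000; i++) {
                x += Math.sqrt(i);
            }
            if (x < 0) {
                System.out.println(x); // keeps the loop from being optimized away
            }
        };
        System.out.println("Chosen thread count = " + calibrate(unit, 4 * logical, candidates, 5));
    }
}
```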

Seabrook answered 31/7, 2012 at 11:15 Comment(3)

I'd not advise running benchmarks to calibrate as they can be skewed by the GC or class-compilation and during the 1st run usually the process doesn't have other tasks, i.e. doing a benchmark in isolation would be counter-productive. – Coretta 1/8, 2012 at 10:10

If you run multiple small (~20ms - 200ms) samples for a CPU bound task and take the best or the median, you can easily eliminate GCs or warmup time. – Seabrook 1/8, 2012 at 14:31

if it's truly CPU bound - hyperthreading is likely to lose. The benchmark needs some form of memory access too and that's the noisy part. No doubt - it can be estimated via benchmarks but unless the process is expected to have very similar workload (incl memory bandwidth) during its course, I won't put my trust in a benchmark even if I hand-optimize it (myself) – Coretta 1/8, 2012 at 15:30

There is no way to determine that from pure Java (after all, a logical core is a core, whether it's implemented using HT or not). Beware that the solutions proposed so far can solve your requirement (as you asked), but Intel CPUs are not the only ones that offer a form of hyperthreading (Sparc comes to mind, and I'm sure there are others as well).

You also did not take into account that even if you determine the system uses HT, you will not be able to control a thread's affinity with the cores from Java. So you are still at the mercy of the OS's thread scheduler. While there are plausible scenarios where fewer threads could perform better (because of reduced cache thrashing), there is no way to determine statically how many threads should be used. After all, CPUs have very different cache sizes: a range from 256KB on the low end to >16MB in servers can reasonably be expected nowadays, and this is bound to change with each new generation.

Simply make it a configurable setting; any attempt to determine this without knowing the target system exactly is futile.

Waterborne answered 31/7, 2012 at 11:48 Comment(0)

There is no way to do that. One thing you can do is create a thread pool of Runtime.getRuntime().availableProcessors() threads in your application and use them as and when requests come in.

This way you can have between 0 and Runtime.getRuntime().availableProcessors() threads.
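
As a rough sketch of that suggestion, a fixed pool sized to the logical processor count could be set up like this. The class name and the counting workload are placeholders. Executors.newFixedThreadPool creates its threads lazily as tasks arrive, up to the given limit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example; the workload is a stand-in for real requests.
final class ProcessorSizedPool {
    /** Runs `tasks` trivial jobs on a pool sized to the logical processor count. */
    static int runAll(int tasks) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(completed::incrementAndGet);
        }
        pool.shutdown(); // already submitted tasks still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runAll(100));
    }
}
```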

Byerly answered 31/7, 2012 at 11:3 Comment(0)

You may not be able to query the OS or Runtime reliably, but you could run a quick benchmark.

Progressively increase the number of spin-lock threads, testing whether each new thread iterates as fast as the previous ones. Once the performance of one of the threads is less than around half that of the previous tests (at least for Intel, I don't know about SPARC), you know you have started sharing a core with a hyperthread.
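
A rough sketch of that probe is below. The names and the 0.5 threshold are assumptions taken from the description above; per-thread throughput is averaged and compared against the single-thread baseline. Results will be noisy in practice, since the OS scheduler, other processes and clock-frequency scaling all affect the numbers, so treat the output as a hint at best.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical probe; a sketch of the idea, not a reliable detector.
final class SpinProbe {
    /** Average number of loop iterations per thread over roughly `millis` ms. */
    static double iterationsPerThread(int threads, long millis) {
        AtomicBoolean stop = new AtomicBoolean(false);
        long[] counts = new long[threads];
        Thread[] workers = new Thread[threads];
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // all workers begin spinning together
                } catch (InterruptedException e) {
                    return;
                }
                long n = 0;
                while (!stop.get()) {
                    n++;
                }
                counts[id] = n; // visible to the caller after join()
            });
            workers[i].start();
        }
        start.countDown();
        try {
            Thread.sleep(millis);
            stop.set(true);
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            stop.set(true);
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        return (double) total / threads;
    }

    /** First thread count whose throughput falls below ratio * baseline, or -1 if none. */
    static int firstDrop(double[] perThread, double ratio) {
        for (int i = 1; i < perThread.length; i++) {
            if (perThread[i] < ratio * perThread[0]) {
                return i + 1; // index i corresponds to i + 1 threads
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int max = Runtime.getRuntime().availableProcessors();
        double[] perThread = new double[max];
        for (int k = 1; k <= max; k++) {
            perThread[k - 1] = iterationsPerThread(k, 200);
            System.out.printf("%d threads: %.0f iterations per thread%n", k, perThread[k - 1]);
        }
        System.out.println("First drop below half at: " + firstDrop(perThread, 0.5));
    }
}
```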

Polychromy answered 2/5, 2018 at 5:8 Comment(0)
