How can I detect a hang in QEventLoop?
Asked Answered
G

2

8

I am not sure if the title of my question is formulated correctly, so to explain what I really mean, consider the following example:

I create a QApplication and a QWidget with a QPushButton on it. Then I attach a handler to the click signal from the button that looks like this:

void MyWidget::on_pushButton_clicked(){
    //Never return
    while(true);
}

Finally I start the event loop for the application and when I run the program and the window shows up click the button.

This will in my case stall the entire application. So my question at this point is how can I "detect" that this kind of hangup has occurred in my application from the code?

I know writing code that does not return in the signal handler is bad practice, I ask this question because I want to detect errors and recover from them, possibly by restarting the application altogether in an effort to improve resilience when in production.

Thanks!

Gelid answered 30/7, 2014 at 14:0 Comment(6)
Are you asking if you can detect that a particular piece of code terminates its execution?Ganley
@Ganley Ahh, not the reductio-ad-halting-problem again :) No, I see it as a rather pragmatic issue: detect when code halts the event loop for long periods of time, where long is subject to parametrization.Madelina
Internally: periodically post events carrying a timestamp to yourself, and check how long it takes before they got handled. Too much => event loop got stuck somewhere. Externally: same thing, f.i. implementing a "ping" via a TCP socket (which again gets picked up and handled via your app's event loop). No reply => your app is stuck.Ganley
@Ganley This doesn't handle an immediate notification that things got stuck. It only works when, by definition, the loop has been "unstuck".Madelina
Well, it works if you do it externally (no reply to ping => show an immediate alert; i.e. a watchdog). If the app is stuck it can't, by definition, be doing something else. As a side note: this "ping" thing is exactly how X11 WMs detect that your app has frozen (other OSes might require one to use their own event loops, where they could've implemented something similar).Ganley
@peppe: not terminating, just spending a long time. I already have provisions in place to handle unexpected termination. And how do you suggest I implement the watchdogs? I actually have watchdogs for all my threads, but this is for the main eventloop in the application so I am a bit unsure on how to proceed with implementing such a watchdog. Thanks!Gelid
S
17

To detect a hung event loop, you need to wrap the operation that can hang the event loop, to detect when it takes too long. The "operation" is the QCoreApplication::notify member. It is called to deliver events to all event loops, in all threads. An event loop hang occurs when the code that processes an event takes too long.

When notify is entered for a given thread, you can note the time of entry. A scanner running in a dedicated thread can then iterate the list of such times, and pick up the threads that have been stuck for too long.

The example below illustrates this, with histograms, too. A thread with an event loop that was stuck longer than the timeout threshold will be highlighted in red. I also illustrate how one can wrap a viewmodel around a data source. Qt 5 and a C++11 compiler are required.

Note: The runtime warnings QBasicTimer::stop: Failed. Possibly trying to stop from a different thread are not a real issue. They are a Qt bug of only cosmetic consequence and can be ignored in this particular case. You can work around them -- see this question.

screenshot

// https://github.com/KubaO/stackoverflown/tree/master/questions/eventloop-hang-25038829
#include <QtWidgets>
#include <QtConcurrent>
#include <random>

std::default_random_engine reng;

int ilog2(qint64 val) {
  Q_ASSERT(val >= 0);
  int ret = -1;
  while (val != 0) { val >>= 1; ret++; }
  return ret;
}

/// The value binned to contain at most \a binaryDigits significant digits.
/// The less significant digits are reset to zero.
qint64 binned(qint64 value, int binaryDigits)
{
  Q_ASSERT(binaryDigits > 0);
  qint64 mask = -1;
  int clrBits = ilog2(value) - binaryDigits;
  if (clrBits > 0) mask <<= clrBits;
  return value & mask;
}

/// A safely destructible thread for perusal by QObjects.
class Thread final : public QThread {
  Q_OBJECT
  void run() override {
    connect(QAbstractEventDispatcher::instance(this),
            &QAbstractEventDispatcher::aboutToBlock,
            this, &Thread::aboutToBlock);
    QThread::run();
  }
  QAtomicInt inDestructor;
public:
  using QThread::QThread;
  /// Take an object and prevent timer resource leaks when the object is about
  /// to become threadless.
  void takeObject(QObject *obj) {
    // Work around to prevent
    // QBasicTimer::stop: Failed. Possibly trying to stop from a different thread
    static constexpr char kRegistered[] = "__ThreadRegistered";
    static constexpr char kMoved[] = "__Moved";
    if (!obj->property(kRegistered).isValid()) {
      QObject::connect(this, &Thread::finished, obj, [this, obj]{
        if (!inDestructor.load() || obj->thread() != this)
          return;
        // The object is about to become threadless
        Q_ASSERT(obj->thread() == QThread::currentThread());
        obj->setProperty(kMoved, true);
        obj->moveToThread(this->thread());
      }, Qt::DirectConnection);
      QObject::connect(this, &QObject::destroyed, obj, [obj]{
        if (!obj->thread()) {
          obj->moveToThread(QThread::currentThread());
          obj->setProperty(kRegistered, {});
        }
        else if (obj->thread() == QThread::currentThread() && obj->property(kMoved).isValid()) {
          obj->setProperty(kMoved, {});
          QCoreApplication::sendPostedEvents(obj, QEvent::MetaCall);
        }
        else if (obj->thread()->eventDispatcher())
          QTimer::singleShot(0, obj, [obj]{ obj->setProperty(kRegistered, {}); });
      }, Qt::DirectConnection);

      obj->setProperty(kRegistered, true);
    }
    obj->moveToThread(this);
  }
  ~Thread() override {
    inDestructor.store(1);
    requestInterruption();
    quit();
    wait();
  }
  Q_SIGNAL void aboutToBlock();
};

/// An application that monitors event loops in all threads.
class MonitoringApp : public QApplication {
  Q_OBJECT
  Q_PROPERTY(int timeout READ timeout WRITE setTimeout MEMBER m_timeout)
  Q_PROPERTY(int updatePeriod READ updatePeriod WRITE setUpdatePeriod MEMBER m_updatePeriod)
public:
  using Histogram = QMap<qint64, uint>;
  using Base = QApplication;
private:
  struct ThreadData {
    /// A saturating, binned histogram of event handling durations for given thread.
    Histogram histogram;
    /// Number of milliseconds between the epoch and when the event handler on this thread
    /// was entered, or zero if no event handler is running.
    qint64 ping = 0;
    /// Number of milliseconds between the epoch and when the last histogram update for
    /// this thread was broadcast
    qint64 update = 0;
    /// Whether the thread's event loop is considered stuck at the moment
    bool stuck = false;
    /// Whether the thread is newly detected
    bool newThread = true;
  };
  using Threads = QMap<QThread*, ThreadData>;
  QMutex m_mutex;
  Threads m_threads;
  int m_timeout = 1000;
  int m_updatePeriod = 250;

  class StuckEventLoopNotifier : public QObject {
    MonitoringApp *m_app;
    QBasicTimer m_timer;
    struct State { QThread *thread; qint64 elapsed; };
    QVector<State> m_toEmit;
    void timerEvent(QTimerEvent * ev) override {
      if (ev->timerId() != m_timer.timerId()) return;
      int timeout = m_app->m_timeout;
      auto now = QDateTime::currentMSecsSinceEpoch();
      m_toEmit.clear();
      QMutexLocker lock(&m_app->m_mutex);
      for (auto it = m_app->m_threads.begin(); it != m_app->m_threads.end(); ++it) {
        if (it->ping == 0) continue;
        qint64 elapsed = now - it->ping;
        it->stuck = elapsed > timeout;
        m_toEmit.push_back({it.key(), it->stuck ? elapsed : 0});
      }
      lock.unlock();
      for (auto &sig : qAsConst(m_toEmit)) emit m_app->loopStateChanged(sig.thread, sig.elapsed);
    }
  public:
    explicit StuckEventLoopNotifier(MonitoringApp * app) : m_app(app) {
      m_timer.start(100, Qt::CoarseTimer, this);
    }
  };
  StuckEventLoopNotifier m_notifier{this};
  Thread m_notifierThread;
  void threadFinishedSlot() {
    auto const thread = qobject_cast<QThread*>(QObject::sender());
    QMutexLocker lock(&m_mutex);
    auto it = m_threads.find(thread);
    if (it == m_threads.end()) return;
    auto const histogram(it->histogram);
    bool stuck = it->stuck;
    m_threads.erase(it);
    lock.unlock();
    emit newHistogram(thread, histogram);
    if (stuck) emit loopStateChanged(thread, 0);
    emit threadFinished(thread);
  }
  Q_SIGNAL void newThreadSignal(QThread*, const QString &);
protected:
  bool notify(QObject * receiver, QEvent * event) override {
    auto const curThread = QThread::currentThread();
    QElapsedTimer timer;
    auto now = QDateTime::currentMSecsSinceEpoch();
    QMutexLocker lock(&m_mutex);
    auto &thread = m_threads[curThread];
    thread.ping = now;
    bool newThread = false;
    std::swap(newThread, thread.newThread);
    lock.unlock();
    if (newThread) {
      connect(curThread, &QThread::finished, this, &MonitoringApp::threadFinishedSlot);
      struct Event : QEvent {
        QThread *thread;
        QPointer<MonitoringApp> app;
        explicit Event(QThread *thread, MonitoringApp *app) :
          QEvent(QEvent::None), thread(thread), app(app) {}
        ~Event() override {
          // objectName() can only be invoked from the object's thread
          emit app->newThreadSignal(thread, thread->objectName());
        }
      };
      QCoreApplication::postEvent(curThread, new Event(curThread, this));
    }
    timer.start();
    auto result = Base::notify(receiver, event); // This is where the event loop can get "stuck".
    auto duration = binned(timer.elapsed(), 3);
    now += duration;
    lock.relock();
    if (thread.histogram[duration] < std::numeric_limits<Histogram::mapped_type>::max())
      ++thread.histogram[duration];
    thread.ping = 0;
    qint64 sinceLastUpdate = now - thread.update;
    if (sinceLastUpdate >= m_updatePeriod) {
      auto const histogram = thread.histogram;
      thread.update = now;
      lock.unlock();
      emit newHistogram(curThread, histogram);
    }
    return result;
  }
public:
  explicit MonitoringApp(int & argc, char ** argv);
  /// The event loop for a given thread has gotten stuck, or unstuck.
  /** A zero elapsed time indicates that the loop is not stuck. The signal will be
    * emitted periodically with increasing values of `elapsed` for a given thread as long
    * as the loop is stuck. The thread might not exist when this notification is received. */
  Q_SIGNAL void loopStateChanged(QThread *, int elapsed);
  /// The first event was received in a newly started thread's event loop.
  /** The thread might not exist when this notification is received. */
  Q_SIGNAL void newThread(QThread *, const QString & threadName);
  /// The thread has a new histogram available.
  /** This signal is not sent more often than each updatePeriod().
    * The thread might not exist when this notification is received. */
  Q_SIGNAL void newHistogram(QThread *, const MonitoringApp::Histogram &);
  /// The thread has finished.
  /** The thread might not exist when this notification is received. A newHistogram
    * signal is always emitted prior to this signal's emission. */
  Q_SIGNAL void threadFinished(QThread *);
  /// The maximum number of milliseconds an event handler can run before the event loop
  /// is considered stuck.
  int timeout() const { return m_timeout; }
  Q_SLOT void setTimeout(int timeout) { m_timeout = timeout; }
  int updatePeriod() const { return m_updatePeriod; }
  Q_SLOT void setUpdatePeriod(int updatePeriod) { m_updatePeriod = updatePeriod; }
};
Q_DECLARE_METATYPE(MonitoringApp::Histogram)

MonitoringApp::MonitoringApp(int &argc, char **argv) :
  MonitoringApp::Base(argc, argv)
{
  qRegisterMetaType<MonitoringApp::Histogram>();
  connect(this, &MonitoringApp::newThreadSignal, this, &MonitoringApp::newThread,
          Qt::QueuedConnection);
  m_notifierThread.setObjectName("notifierThread");
  m_notifierThread.takeObject(&m_notifier);
  m_notifierThread.start();
}

QImage renderHistogram(const MonitoringApp::Histogram &h) {
  const int blockX = 2, blockY = 2;
  QImage img(1 + h.size() * blockX, 32 * blockY, QImage::Format_ARGB32_Premultiplied);
  img.fill(Qt::white);
  QPainter p(&img);
  int x = 0;
  for (auto it = h.begin(); it != h.end(); ++it) {
    qreal key = it.key() > 0 ? log2(it.key()) : 0.0;
    QBrush b = QColor::fromHsv(qRound(240.0*(1.0 - key/32.0)), 255, 255);
    p.fillRect(QRectF(x, img.height(), blockX, -log2(it.value()) * blockY), b);
    x += blockX;
  }
  return img;
}

class MonitoringViewModel : public QStandardItemModel {
  Q_OBJECT
  struct Item {
    bool set = false;
    QStandardItem *caption = 0, *histogram = 0;
    void setCaption(QThread* thread, const QString &name) {
      auto text = QStringLiteral("0x%1 \"%2\"").arg(std::intptr_t(thread), 0, 16).arg(name);
      caption->setText(text);
    }
  };
  QMap<QThread*, Item> m_threadItems;
  Item &itemFor(QThread *thread, bool set = true) {
    Item &item = m_threadItems[thread];
    if (set && !item.set) {
      item.caption = new QStandardItem;
      item.histogram = new QStandardItem;
      item.caption->setEditable(false);
      item.histogram->setEditable(false);
      int row = rowCount() ? 1 : 0;
      insertRow(row);
      setItem(row, 0, item.caption);
      setItem(row, 1, item.histogram);
      item.set = true;
      newHistogram(thread, MonitoringApp::Histogram());
    }
    return item;
  }
  void newThread(QThread *thread, const QString &name) {
    itemFor(thread).setCaption(thread, name);
  }
  void newHistogramImage(QThread *thread, const QImage &img) {
    auto &item = itemFor(thread, false);
    if (!item.set) return;
    item.histogram->setSizeHint(img.size());
    item.histogram->setData(img, Qt::DecorationRole);
  }
  Q_SIGNAL void newHistogramImageSignal(QThread *thread, const QImage &img);
  void newHistogram(QThread *thread, const MonitoringApp::Histogram &histogram) {
    QtConcurrent::run([this, thread, histogram]{
      emit newHistogramImageSignal(thread, renderHistogram(histogram));
    });
  }
  void loopStateChanged(QThread *thread, int elapsed) {
    auto &item = itemFor(thread);
    item.caption->setData(elapsed ? QColor(Qt::red) : QColor(Qt::transparent), Qt::BackgroundColorRole);
  }
  void threadFinished(QThread *thread) {
    auto &item = itemFor(thread);
    item.caption->setText(QStringLiteral("Finished %1").arg(item.caption->text()));
    item.set = false;
  }
public:
  MonitoringViewModel(QObject *parent = 0) : QStandardItemModel(parent) {
    connect(this, &MonitoringViewModel::newHistogramImageSignal,
            this, &MonitoringViewModel::newHistogramImage);
    auto app = qobject_cast<MonitoringApp*>(qApp);
    connect(app, &MonitoringApp::newThread, this, &MonitoringViewModel::newThread);
    connect(app, &MonitoringApp::newHistogram,  this, &MonitoringViewModel::newHistogram);
    connect(app, &MonitoringApp::threadFinished, this, &MonitoringViewModel::threadFinished);
    connect(app, &MonitoringApp::loopStateChanged, this, &MonitoringViewModel::loopStateChanged);
  }
};

class WorkerObject : public QObject {
  Q_OBJECT
  int m_trials = 2000;
  double m_probability = 0.2;
  QBasicTimer m_timer;
  void timerEvent(QTimerEvent * ev) override {
    if (ev->timerId() != m_timer.timerId()) return;
    QThread::msleep(std::binomial_distribution<>(m_trials, m_probability)(reng));
  }
public:
  using QObject::QObject;
  Q_SIGNAL void stopped();
  Q_SLOT void start() { m_timer.start(0, this); }
  Q_SLOT void stop() { m_timer.stop(); emit stopped(); }
  int trials() const { return m_trials; }
  void setTrials(int trials) { m_trials = trials; }
  double probability() const { return m_probability; }
  void setProbability(double p) { m_probability = p; }
};

int main(int argc, char *argv[])
{
  MonitoringApp app(argc, argv);
  MonitoringViewModel model;
  WorkerObject workerObject;
  Thread workerThread;
  workerThread.setObjectName("workerThread");

  QWidget w;
  QGridLayout layout(&w);
  QTableView view;
  QLabel timeoutLabel;
  QSlider timeout(Qt::Horizontal);
  QGroupBox worker("Worker Thread");
  worker.setCheckable(true);
  worker.setChecked(false);
  QGridLayout wLayout(&worker);
  QLabel rangeLabel, probabilityLabel;
  QSlider range(Qt::Horizontal), probability(Qt::Horizontal);

  timeoutLabel.setMinimumWidth(50);
  QObject::connect(&timeout, &QSlider::valueChanged, &timeoutLabel, (void(QLabel::*)(int))&QLabel::setNum);
  timeout.setMinimum(50);
  timeout.setMaximum(5000);
  timeout.setValue(app.timeout());
  view.setModel(&model);
  view.verticalHeader()->setSectionResizeMode(QHeaderView::ResizeToContents);

  layout.addWidget(&view, 0, 0, 1, 3);
  layout.addWidget(new QLabel("Timeout"), 1, 0);
  layout.addWidget(&timeoutLabel, 1, 1);
  layout.addWidget(&timeout, 1, 2);
  layout.addWidget(&worker, 2, 0, 1, 3);

  QObject::connect(&range, &QAbstractSlider::valueChanged, [&](int p){
    rangeLabel.setText(QString("Range %1 ms").arg(p));
    workerObject.setTrials(p);
  });
  QObject::connect(&probability, &QAbstractSlider::valueChanged, [&](int p){
    double prob = p / (double)probability.maximum();
    probabilityLabel.setText(QString("Probability %1").arg(prob, 0, 'g', 2));
    workerObject.setProbability(prob);
  });
  range.setMaximum(10000);
  range.setValue(workerObject.trials());
  probability.setValue(workerObject.probability() * probability.maximum());

  wLayout.addWidget(new QLabel("Sleep Time Binomial Distribution"), 0, 0, 1, 2);
  wLayout.addWidget(&rangeLabel, 1, 0);
  wLayout.addWidget(&range, 2, 0);
  wLayout.addWidget(&probabilityLabel, 1, 1);
  wLayout.addWidget(&probability, 2, 1);

  QObject::connect(&workerObject, &WorkerObject::stopped, &workerThread, &Thread::quit);
  QObject::connect(&worker, &QGroupBox::toggled, [&](bool run) {
    if (run) {
      workerThread.start();
      QMetaObject::invokeMethod(&workerObject, "start");
    } else
      QMetaObject::invokeMethod(&workerObject, "stop");
  });
  QObject::connect(&timeout, &QAbstractSlider::valueChanged, &app, &MonitoringApp::setTimeout);
  workerThread.takeObject(&workerObject);
  w.show();
  app.exec();
}

#include "main.moc"
Sexpot answered 4/8, 2014 at 12:6 Comment(0)
G
7

Some ideas, the actual solution really depends on what you need to do and what kind of feedback you need to have (a UI popping up? something recorded in logs? a debugging feature?)

Use a QTimer and record the time between invocations of a slot of yours

If the slot invocation is delayed by a significant w.r.t. the expected timer timeout, your event loop has been stuck somewhere. This will let you know there has been a problem, but won't tell you where it got stuck.

Use a separate QThread

Send periodically a signal to an object living in the main thread (or send an event; a cross-thread signal is implemented via events anyhow); the slot connected to that signal sends a signal back to the thread. If the "pong" takes too much (you can have a separate timer in the thread) do something -- abort(), raise(), i.e. an action which will cause a debugger to stop and you to see the main's thread stack trace, to deduce where you got stuck.

Note that since you're running a separate thread you can't just pop up messageboxes or similar -- no UI in other threads! At most, log the event.

Use a separate thread (2)

Qt's event loop emits some signals (see QAbstractEventLoop), in theory you could attach to those in a separate thread and detect if control is not returning to it any more. Or, subclass QAEL to the same means.

Use a QTcpServer / QLocalServer

Same ping/pong concept, but using separate processes -- write a small TCP / local socket client which periodically sends a ping to your application, if the pong doesn't get back in a short while act (now you can also use an UI).

Ganley answered 1/8, 2014 at 9:40 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.