I don't have much experience with Serial I/O, but have recently been tasked with fixing some highly flawed serial code, because the original programmer has left the company.
The application is a Windows program that talks to a scientific instrument serially via a virtual COM port running over USB. The virtual COM port USB drivers are provided by FTDI, since they manufacture the USB chip we use on the instrument.
The serial code is in an unmanaged C++ DLL, which is shared by both our old C++ software and our new C# / .NET (WinForms) software.
There are two main problems:
Problem 1: Fails on many XP systems
When the first command is sent to the instrument, there's no response. When you issue the next command, you get the response from the first one.
Here's a typical usage scenario (full source for methods called is included below):
char szBuf [256];
CloseConnection ();
if (OpenConnection ())
{
    ClearBuffer ();
    // try to get a firmware version number
    WriteChar ((char) 'V');
    BOOL versionReadStatus1 = ReadString (szBuf, 100);
    ...
}
On a failing system, the ReadString call will never receive any serial data, and times out. But if we issue another, different command, and call ReadString again, it will return the response from the first command, not the new one!
But this only happens on a large subset of Windows XP systems, and never on Windows 7. As luck would have it, our XP dev machines worked OK, so we did not see the problem until we started beta testing. I can also reproduce the problem by running an XP VM (VirtualBox) on my XP dev machine. Also, the problem only occurs when using the DLL with the new C# version; it works fine with the old C++ app.
This seemed to be resolved when I added a Sleep(21) to the low-level BytesInQue method before calling ClearCommError, but this exacerbated the other problem, CPU usage. Sleeping for less than 21 ms makes the failure mode reappear.
Problem 2: High CPU usage
When doing serial I/O, CPU use is excessive, often above 90%. This happens with both the new C# app and the old C++ app, but is much worse in the new app. It often (but not always) makes the UI very unresponsive.
Here's the code for our Port.cpp class, in all its terrible glory. Sorry for the length, but this is what I'm working with. The most important methods are probably OpenConnection, ReadString, ReadChar, and BytesInQue.
//
// Port.cpp: Implements the CPort class, which is
// the class that controls the serial port.
//
// Copyright (C) 1997-1998 Microsoft Corporation
// All rights reserved.
//
// This source code is only intended as a supplement to the
// Broadcast Architecture Programmer's Reference.
// For detailed information regarding Broadcast
// Architecture, see the reference.
//
#include <windows.h>
#include <stdio.h>
#include <assert.h>
#include "port.h"
// Construction code to initialize the port handle to null.
CPort::CPort()
{
    m_hDevice = (HANDLE)0;
    // default parameters
    m_uPort = 1;
    m_uBaud = 9600;
    m_uDataBits = 8;
    m_uParity = 0;
    m_uStopBits = 0; // = 1 stop bit
    m_chTerminator = '\n';
    m_bCommportOpen = FALSE;
    m_nTimeOut = 50;
    m_nBlockSizeMax = 2048;
}
// Destruction code to close the connection if the port
// handle was valid.
CPort::~CPort()
{
    if (m_hDevice)
        CloseConnection();
}
// Open a serial communication port for writing short
// one-byte commands, that is, overlapped data transfer
// is not necessary.
BOOL CPort::OpenConnection()
{
    char szPort[64];
    m_bCommportOpen = FALSE;
    // Build the COM port string as "COMx" where x is the port.
    if (m_uPort > 9)
        wsprintf(szPort, "\\\\.\\COM%d", m_uPort);
    else
        wsprintf(szPort, "COM%d", m_uPort);
    // Open the serial port device.
    m_hDevice = CreateFile(szPort,
                           GENERIC_WRITE | GENERIC_READ,
                           0,
                           NULL, // No security attributes
                           OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL,
                           NULL);
    if (m_hDevice == INVALID_HANDLE_VALUE)
    {
        SaveLastError ();
        m_hDevice = (HANDLE)0;
        return FALSE;
    }
    return SetupConnection(); // After the port is open, set it up.
} // end of OpenConnection()
// Configure the serial port with the given settings.
// The given settings enable the port to communicate
// with the remote control.
BOOL CPort::SetupConnection(void)
{
    DCB dcb; // The DCB structure differs between Win16 and Win32.
    dcb.DCBlength = sizeof(DCB);
    // Retrieve the DCB of the serial port.
    BOOL bStatus = GetCommState(m_hDevice, (LPDCB)&dcb);
    if (bStatus == 0)
    {
        SaveLastError ();
        return FALSE;
    }
    // Assign the values that enable the port to communicate.
    dcb.BaudRate = m_uBaud; // Baud rate
    dcb.ByteSize = m_uDataBits; // Data bits per byte, 4-8
    dcb.Parity = m_uParity; // Parity: 0-4 = no, odd, even, mark, space
    dcb.StopBits = m_uStopBits; // 0,1,2 = 1, 1.5, 2
    dcb.fBinary = TRUE; // Binary mode, no EOF check : Must use binary mode in NT
    dcb.fParity = dcb.Parity == 0 ? FALSE : TRUE; // Enable parity checking
    dcb.fOutX = FALSE; // XON/XOFF flow control used
    dcb.fInX = FALSE; // XON/XOFF flow control used
    dcb.fNull = FALSE; // Disable null stripping - want nulls
    dcb.fOutxCtsFlow = FALSE;
    dcb.fOutxDsrFlow = FALSE;
    dcb.fDsrSensitivity = FALSE;
    dcb.fDtrControl = DTR_CONTROL_ENABLE;
    dcb.fRtsControl = RTS_CONTROL_DISABLE ;
    // Configure the serial port with the assigned settings.
    // Return TRUE if the SetCommState call was not equal to zero.
    bStatus = SetCommState(m_hDevice, &dcb);
    if (bStatus == 0)
    {
        SaveLastError ();
        return FALSE;
    }
    DWORD dwSize;
    COMMPROP *commprop;
    DWORD dwError;
    dwSize = sizeof(COMMPROP) + sizeof(MODEMDEVCAPS) ;
    commprop = (COMMPROP *)malloc(dwSize);
    memset(commprop, 0, dwSize);
    if (!GetCommProperties(m_hDevice, commprop))
    {
        dwError = GetLastError();
    }
    m_bCommportOpen = TRUE;
    return TRUE;
}
void CPort::SaveLastError ()
{
    DWORD dwLastError = GetLastError ();
    LPVOID lpMsgBuf;
    FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                  FORMAT_MESSAGE_FROM_SYSTEM |
                  FORMAT_MESSAGE_IGNORE_INSERTS,
                  NULL,
                  dwLastError,
                  MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // Default language
                  (LPTSTR) &lpMsgBuf,
                  0,
                  NULL);
    strcpy (m_szLastError,(LPTSTR)lpMsgBuf);
    // Free the buffer.
    LocalFree( lpMsgBuf );
}
void CPort::SetTimeOut (int nTimeOut)
{
    m_nTimeOut = nTimeOut;
}
// Close the opened serial communication port.
void CPort::CloseConnection(void)
{
    if (m_hDevice != NULL &&
        m_hDevice != INVALID_HANDLE_VALUE)
    {
        FlushFileBuffers(m_hDevice);
        CloseHandle(m_hDevice);
    }
    // Set the device handle to NULL to confirm
    // that the port has been closed.
    m_hDevice = (HANDLE)0;
    m_bCommportOpen = FALSE;
}
int CPort::WriteChars(char * psz)
{
    int nCharWritten = 0;
    while (*psz)
    {
        nCharWritten +=WriteChar(*psz);
        psz++;
    }
    return nCharWritten;
}
// Write a one-byte value (char) to the serial port.
int CPort::WriteChar(char c)
{
    DWORD dwBytesInOutQue = BytesInOutQue ();
    if (dwBytesInOutQue > m_dwLargestBytesInOutQue)
        m_dwLargestBytesInOutQue = dwBytesInOutQue;
    static char szBuf[2];
    szBuf[0] = c;
    szBuf[1] = '\0';
    DWORD dwBytesWritten;
    DWORD dwTimeOut = m_nTimeOut; // 500 milli seconds
    DWORD start, now;
    start = GetTickCount();
    do
    {
        now = GetTickCount();
        if ((now - start) > dwTimeOut )
        {
            strcpy (m_szLastError, "Timed Out");
            return 0;
        }
        WriteFile(m_hDevice, szBuf, 1, &dwBytesWritten, NULL);
    }
    while (dwBytesWritten == 0);
    OutputDebugString(TEXT(strcat(szBuf, "\r\n")));
    return dwBytesWritten;
}
int CPort::WriteChars(char * psz, int n)
{
    DWORD dwBytesWritten;
    WriteFile(m_hDevice, psz, n, &dwBytesWritten, NULL);
    return dwBytesWritten;
}
// Return number of bytes in RX queue
DWORD CPort::BytesInQue ()
{
    COMSTAT ComStat ;
    DWORD dwErrorFlags;
    DWORD dwLength;
    // check number of bytes in queue
    ClearCommError(m_hDevice, &dwErrorFlags, &ComStat ) ;
    dwLength = ComStat.cbInQue;
    return dwLength;
}
DWORD CPort::BytesInOutQue ()
{
    COMSTAT ComStat ;
    DWORD dwErrorFlags;
    DWORD dwLength;
    // check number of bytes in queue
    ClearCommError(m_hDevice, &dwErrorFlags, &ComStat );
    dwLength = ComStat.cbOutQue ;
    return dwLength;
}
int CPort::ReadChars (char* szBuf, int nMaxChars)
{
    if (BytesInQue () == 0)
        return 0;
    DWORD dwBytesRead;
    ReadFile(m_hDevice, szBuf, nMaxChars, &dwBytesRead, NULL);
    return (dwBytesRead);
}
// Read a one-byte value (char) from the serial port.
int CPort::ReadChar (char& c)
{
    static char szBuf[2];
    szBuf[0] = '\0';
    szBuf[1] = '\0';
    if (BytesInQue () == 0)
        return 0;
    DWORD dwBytesRead;
    ReadFile(m_hDevice, szBuf, 1, &dwBytesRead, NULL);
    c = *szBuf;
    if (dwBytesRead == 0)
        return 0;
    return dwBytesRead;
}
BOOL CPort::ReadString (char *szStrBuf , int nMaxLength)
{
    char str [256];
    char str2 [256];
    DWORD dwTimeOut = m_nTimeOut;
    DWORD start, now;
    int nBytesRead;
    int nTotalBytesRead = 0;
    char c = ' ';
    static char szCharBuf [2];
    szCharBuf [0]= '\0';
    szCharBuf [1]= '\0';
    szStrBuf [0] = '\0';
    start = GetTickCount();
    while (c != m_chTerminator)
    {
        nBytesRead = ReadChar (c);
        nTotalBytesRead += nBytesRead;
        if (nBytesRead == 1 && c != '\r' && c != '\n')
        {
            *szCharBuf = c;
            strncat (szStrBuf,szCharBuf,1);
            if (strlen (szStrBuf) == nMaxLength)
                return TRUE;
            // restart timer for next char
            start = GetTickCount();
        }
        // check for time out
        now = GetTickCount();
        if ((now - start) > dwTimeOut )
        {
            strcpy (m_szLastError, "Timed Out");
            return FALSE;
        }
    }
    return TRUE;
}
int CPort::WaitForQueToFill (int nBytesToWaitFor)
{
    DWORD start = GetTickCount();
    do
    {
        if (BytesInQue () >= nBytesToWaitFor)
            break;
        if (GetTickCount() - start > m_nTimeOut)
            return 0;
    } while (1);
    return BytesInQue ();
}
int CPort::BlockRead (char * pcInputBuffer, int nBytesToRead)
{
    int nBytesRead = 0;
    int charactersRead;
    while (nBytesToRead >= m_nBlockSizeMax)
    {
        if (WaitForQueToFill (m_nBlockSizeMax) < m_nBlockSizeMax)
            return nBytesRead;
        charactersRead = ReadChars (pcInputBuffer, m_nBlockSizeMax);
        pcInputBuffer += charactersRead;
        nBytesRead += charactersRead;
        nBytesToRead -= charactersRead;
    }
    if (nBytesToRead > 0)
    {
        if (WaitForQueToFill (nBytesToRead) < nBytesToRead)
            return nBytesRead;
        charactersRead = ReadChars (pcInputBuffer, nBytesToRead);
        nBytesRead += charactersRead;
        nBytesToRead -= charactersRead;
    }
    return nBytesRead;
}
Based on my testing and reading, I see several suspicious things in this code:
COMMTIMEOUTS is never set. MS docs say "Unpredictable results can occur if you fail to set the time-out values". But I tried setting this, and it didn't help.
Many methods (e.g. ReadString) go into a tight loop and hammer the port with repeated reads if they don't get data immediately. This seems to explain the high CPU usage.
Many methods have their own timeout handling, using GetTickCount(). Isn't that what COMMTIMEOUTS is for?
In the new C# (WinForms) program, all these serial routines are called directly from the main thread, from a MultiMediaTimer event. Maybe they should run on a different thread?
The BytesInQue method seems to be a bottleneck. If I break into the debugger when CPU usage is high, that's usually where the program stops. Also, adding a Sleep(21) to this method before calling ClearCommError seems to resolve the XP problem, but exacerbates the CPU usage problem.
Code just seems unnecessarily complicated.
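To make the tight-loop concern concrete, here is a minimal, portable sketch of the loop shape I would expect instead of the one in ReadString. It is not the DLL's code: `ByteSource`, `ReadLine`, `idleTimeout`, and `pollInterval` are hypothetical names, and the byte source stands in for something like ReadChar so the sketch has no Windows dependency. The idea is to drain whatever is queued without pausing, and to yield the CPU briefly when the queue is empty rather than calling back into the driver immediately.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical byte source: stores the next byte in `c` and returns true if
// one is available right now, or returns false if the receive queue is empty.
// In the DLL this role would be played by something like ReadChar().
using ByteSource = std::function<bool(char&)>;

// Reads until `terminator` arrives, or until no byte has shown up for
// `idleTimeout`. Returns true if a complete line was read. '\r' and '\n'
// are never stored in `line`, mirroring what ReadString does.
bool ReadLine(const ByteSource& nextByte,
              std::string& line,
              char terminator,
              std::chrono::milliseconds idleTimeout,
              std::chrono::milliseconds pollInterval = std::chrono::milliseconds(1))
{
    using Clock = std::chrono::steady_clock;
    assert(nextByte);                      // an empty std::function is a caller bug
    line.clear();
    Clock::time_point lastByteTime = Clock::now();

    for (;;)
    {
        char c = '\0';
        if (nextByte(c))
        {
            if (c == terminator)
                return true;               // complete line received
            if (c != '\r' && c != '\n')
                line.push_back(c);
            lastByteTime = Clock::now();   // restart the idle timer
            continue;                      // keep draining without sleeping
        }
        if (Clock::now() - lastByteTime > idleTimeout)
            return false;                  // timed out; partial data stays in `line`
        std::this_thread::sleep_for(pollInterval);  // yield instead of spinning
    }
}
```

The trade-off is that each empty poll adds up to one `pollInterval` of latency, and the actual sleep granularity depends on the OS timer resolution. As I understand it, the alternative on Windows is to let ReadFile itself block for a bounded time by configuring COMMTIMEOUTS, which would remove the polling loop entirely.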
My Questions
Can anyone explain why this code, when called from the C# program, works on only a small number of XP systems?
Any suggestions on how to rewrite this? Pointers to good sample code would be most welcome.