Background
This is a multi-threaded batch application in which each thread has its own file. I have logic elsewhere that stops the file rename from happening if file creation fails.
This process runs as a daemon and generates a few thousand files each day. The exception occurs for roughly one file every three days, so the method we are using works most of the time.
The machine running the batch is Red Hat Enterprise Linux Server release 6.7 (Santiago)
Java version is 1.8.0_162
The temp filenames are generated by appending the result of UUID.randomUUID() (from java.util.UUID) to the real filename.
The real filename may have duplicates, which is why we used a random UUID instead of .tmp for the temp file name. This shouldn't be an issue, since the move portion is in a synchronized block.
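For reference, the temp-name scheme described above can be sketched as follows. This is a minimal illustration, not the production code: the class name `TempNames` and the helper `tempPathFor` are hypothetical, and the "." separator is inferred from the path shown in the exception.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical helper class, for illustration only.
class TempNames {

    // Builds a temp path by appending "." and a random UUID to the real file path,
    // so that two threads targeting the same real name still get distinct temp files.
    static Path tempPathFor(String realFilePath) {
        return Paths.get(realFilePath + "." + UUID.randomUUID());
    }

    public static void main(String[] args) {
        // Two temp paths for the same (made-up) real file name.
        System.out.println(tempPathFor("example.xml"));
        System.out.println(tempPathFor("example.xml"));
    }
}
```

Because each temp name carries its own UUID, two threads should never write to the same temp file, even when their real file names collide.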
Exception:
2018-07-26 15:06:01,743 ERROR (ProcessRecordsTask.java:renameFileAfterProcess():674) - Error: Unable to rename file:
java.nio.file.NoSuchFileException: /logs/apps/appname/FILNAMESTUFF_07_26_2018_15_05_51.xml.5c80331c-3b7e-4e16-90d7-c0d7810451c5 -> /logs/apps/appname/FILNAMESTUFF_07_26_2018_15_05_51.xml
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
    at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
    at java.nio.file.Files.move(Files.java:1395)
    at com.filetransferbatch.task.ProcessRecordsTask.renameFileAfterProcess(ProcessRecordsTask.java:664)
    at com.filetransferbatch.task.ProcessRecordsTask.saveFileData(ProcessRecordsTask.java:349)
    at com.filetransferbatch.task.ProcessRecordsTask.xmlTransfer(ProcessRecordsTask.java:244)
    at com.filetransferbatch.task.ProcessRecordsTask.call(ProcessRecordsTask.java:162)
    at com.filetransferbatch.task.ProcessRecordsTask.call(ProcessRecordsTask.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
I am getting the exception from the following snippet:
    private boolean renameFileAfterProcess(String tmpFileName) {
        boolean fileRenamed = false;
        try {
            if (null != tmpFileName && (!("".equals(tmpFileName)))) {
                Path tmpFilePath = Paths.get(tmpFileName);
                logger.info("tmpFilePath:" + tmpFilePath + ":Renamed Filepath: " + realFilePath);
                Path realFile = Paths.get(realFilePath);
                synchronized (this) {
                    logger.info("File " + tmpFilePath + " exists: " + Files.exists(tmpFilePath));
                    Files.move(tmpFilePath,
                               realFile,
                               StandardCopyOption.REPLACE_EXISTING,
                               StandardCopyOption.ATOMIC_MOVE);
                    logger.info(tmpFileName + ":File was successfully renamed to :" + realFilePath);
                    fileRenamed = true;
                }
            }
        } catch (IOException e) {
            fileRenamed = false;
            logger.error("Error: Unable to rename file:", e);
        } catch (Exception e) {
            logger.error("Error :", e);
        }
        return fileRenamed;
    }
The file is created this way:

    private boolean createFile(String fileName, byte[] fileDataMerged) {
        boolean fileCreated = false;
        if (fileName.trim().length() != 0) {
            try {
                Path createdFilePath = Files.write(Paths.get(tmpFilePath),
                                                   fileDataMerged,
                                                   StandardOpenOption.SYNC,
                                                   StandardOpenOption.CREATE,
                                                   StandardOpenOption.WRITE);
                if (createdFilePath != null) {
                    fileCreated = Files.exists(createdFilePath);
                }
            } catch (IOException e) {
                logger.error("Error writing temp file: ", e);
            } catch (Exception e) {
                logger.error("Error writing temp file: ", e);
            }
        }
        return fileCreated;
    }
The only fix I can think of is to sleep the thread for a few milliseconds before the rename, in case this is a file-system-level problem. The difficulty is that the exception is very hard to reproduce in the non-production environment.
I suspect the exception happens when nearly all the threads have the same real file name, so that there are a bunch of renames to the same file name, but I can't be sure of this.
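To make the sleep idea concrete, here is a rough sketch of what I have in mind: retry the move a few times with a short pause when NoSuchFileException is thrown. The class name `RetryingMove`, the method `moveWithRetry`, and the attempt count and delay are all hypothetical, and this would only help if the cause really is transient, which I have not established.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, for illustration only.
class RetryingMove {

    // Attempts the same move as renameFileAfterProcess, retrying on NoSuchFileException.
    // Returns the number of attempts used; rethrows the exception once attempts run out.
    static int moveWithRetry(Path source, Path target, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                Files.move(source, target,
                           StandardCopyOption.REPLACE_EXISTING,
                           StandardCopyOption.ATOMIC_MOVE);
                return attempt;
            } catch (NoSuchFileException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller log the failure
                }
                Thread.sleep(sleepMillis); // short pause before the next attempt
            }
        }
    }
}
```

A call such as `RetryingMove.moveWithRetry(tmpFilePath, realFile, 3, 50L)` would replace the direct Files.move call. If the temp file is genuinely gone, the retries only delay the same error.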
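One way to test that suspicion outside production would be a small stress harness along these lines. It is only a sketch under my own assumptions: the class `SameTargetStress`, the method `countFailures`, and the file name `shared.xml` are made up, and I make no claim that it reproduces the exception.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress harness, for illustration only.
class SameTargetStress {

    // Each worker repeatedly writes a UUID-suffixed temp file and moves it onto the
    // same target name. Returns how many write-or-move operations threw IOException.
    static int countFailures(Path dir, int threads, int movesPerThread) throws Exception {
        Path target = dir.resolve("shared.xml");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < movesPerThread; i++) {
                        Path tmp = dir.resolve("shared.xml." + UUID.randomUUID());
                        try {
                            Files.write(tmp, new byte[] {1, 2, 3});
                            Files.move(tmp, target,
                                       StandardCopyOption.REPLACE_EXISTING,
                                       StandardCopyOption.ATOMIC_MOVE);
                        } catch (IOException e) {
                            failures.incrementAndGet();
                        }
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every worker to finish
            }
        } finally {
            pool.shutdown();
        }
        return failures.get();
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("same-target-stress");
        System.out.println("failures: " + countFailures(dir, 8, 500));
    }
}
```

If this reports no failures on the same file system as production, contention between renames to the same name becomes a less likely explanation, and something external to the application would be worth looking at.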
Thanks
**Edit:**
We had a Skybot job running that was grabbing files with a .csv extension that were older than a day. I think the job was locking all the files in the folder while it looked for files to move. After I made a code fix that allowed me to remove the Skybot job, the issue went away.