I got a typescript module (used by a VSCode extension) which accepts a directory and parses the content contained within the files. For directories containing large number of files this parsing takes a bit of time therefore would like some advice on how to optimize it.
I don't want to copy/paste the entire class files therefore will be using a mock pseudocode containing the parts that I think are relevant.
class Parser {
constructor(_dir: string) {
this.dir = _dir;
}
parse() {
let tree: any = getFileTree(this.dir);
try {
let parsedObjects: MyDTO[] = await this.iterate(tree.children);
} catch (err) {
console.error(err);
}
}
async iterate(children: any[]): Promise<MyDTO[]> {
let objs: MyDTO[] = [];
for (let i = 0; i < children.length; i++) {
let child: any = children[i];
if (child.type === Constants.FILE) {
let dto: FileDTO = await this.heavyFileProcessingMethod(file); // this takes time
objs.push(dto);
} else {
// child is a folder
let dtos: MyDTO[] = await this.iterateChildItems(child.children);
let dto: FolderDTO = new FolderDTO();
dto.files = dtos.filter(item => item instanceof FileDTO);
dto.folders = dtos.filter(item => item instanceof FolderDTO);
objs.push(FolderDTO);
}
}
return objs;
}
async heavyFileProcessingMethod(file: string): Promise<FileDTO> {
let content: string = readFile(file); // util method to synchronously read file content using fs
return new FileDTO(await this.parseFileContent(content));
}
async parseFileContent(content): Promise<any[]> {
// parsing happens here and the file content is parsed into separate blocks
let ast: any = await convertToAST(content); // uses an asynchronous method of an external dependency to convert content to AST
let blocks = parseToBlocks(ast); // synchronous method called to convert AST to blocks
return await this.processBlocks(blocks);
}
async processBlocks(blocks: any[]): Promise<any[]> {
for (let i = 0; i < blocks.length; i++) {
let block: Block = blocks[i];
if (block.condition === true) {
// this can take some time because if this condition is true, some external assets will be downloaded (via internet)
// on to the caller's machine + some additional processing takes place
await processBlock(block);
}
}
return blocks;
}
}
Still sort of a beginner to TypeScript/NodeJS. I am looking for a multithreading/Java-esque solution here if possible. In the context of Java, this.heavyFileProcessingMethod
would be a instance of Callable
object and this object would be pushed into a List<Callable>
which would then be executed parallelly by an ExecutorService
returning List<Future<Object>>
.
Basically I want all files to be processed parallelly but the function must wait for all the files to be processed before returning from the method (so the entire iterate
method will only take as long as the time taken to parse the largest file).
Been reading on running tasks in worker threads in NodeJS, can something like this be used in TypeScript as well? If so, can it be used in this situation? If my Parser
class needs to be refactored to accommodate this change (or any other suggested change) it's no issue.
EDIT: Using Promise.all
async iterate(children: any[]): Promise<MyDTO>[] {
let promises: Promies<MyDTO>[] = [];
for(let i = 0; i <children.length; i++) {
let child: any = children[i];
if (child.type === Constants.FILE) {
let promise: Promise<FileDTO> = this.heavyFileProcessingMethod(file); // this takes time
promises.push(promise);
} else {
// child is a folder
let dtos: Promise<MyDTO>[] = this.iterateChildItems(child.children);
let promise: Promise<FolderDTO> = this.getFolderPromise(dtos);
promises.push(promise);
}
}
return promises;
}
async getFolderPromise(promises: Promise<MyDTO>[]): Promise<FolderDTO> {
return Promise.all(promises).then(dtos => {
let dto: FolderDTO = new FolderDTO();
dto.files = dtos.filter(item => item instanceof FileDTO);
dto.folders = dtos.filter(item => item instanceof FolderDTO);
return dto;
})
}
await
in a for loop - it will effectively block until the next loop, then block again, until it is done. Instead you can create an array of promises and then useawait Promise.all(promises)
to resolve them concurrently. – Neuman