I have created a script that read data from a CSV file, check if the data already exists in the database and import it if it does not. If the data does exist (the code of a specific product) then the rest of the information needs to be updated from the CSV file.
For example; I have a member with code WTW-2LT, named Alex and surname Johnson in my CSV file. The script checks if the member with code WTW-2LT, named Alex and surname Johnson already exist, if it does, the contact details and extra details needs to be updated from the script (other details like subject and lecturer also needs to be checked, all details are in one line in the CSV), if it doesn't exist the new member just have to be created.
My script what I have so far with minimum other checks to prevent distraction for now;
while ($row = fgetcsv($fp, null, ";")) {
if ($header === null) {
$header = $row;
continue;
}
$record = array_combine($header, $row);
$member = $this->em->getRepository(Member::class)->findOneBy([
'code' =>$record['member_code'],
'name' =>$record['name'],
'surname' =>$record['surname'],
]);
if (!$member) {
$member = new Member();
$member->setCode($record['member_code']);
$member->setName($record['name']);
$member->setName($record['surname']);
}
$member->setContactNumber($record['phone']);
$member->setAddress($record['address']);
$member->setEmail($record['email']);
$subject = $this->em->getRepository(Subject::class)->findOneBy([
'subject_code' => $record['subj_code']
]);
if (!$subject) {
$subject = new Subject();
$subject->setCode($record['subj_code']);
}
$subject->setTitle($record['subj_title']);
$subject->setDescription($record['subj_desc']);
$subject->setLocation($record['subj_loc']);
$lecturer = $this->em->getRepository(Lecturer::class)->findOneBy([
'subject' => $subject,
'name' => $record['lec_name'],
'code' => $record['lec_code'],
]);
if (!$lecturer) {
$lecturer = new Lecturer();
$lecturer->setSubject($subject);
$lecturer->setName($record['lec_name']);
$lecturer->setCode($record['lec_code']);
}
$lecturer->setEmail($record['lec_email']);
$lecturer->setContactNumber($record['lec_phone']);
$member->setLecturer($lecturer);
$validationErrors = $this->validator->validate($member);
if (!count($validationErrors)) {
$this->em->persist($member);
$this->em->flush();
} else {
// ...
}
}
You can notice this script has to query the database 3 times to check if one CSV line exists. In my case I have files with up to 2000+ lines, so for every line to perform 3 queries to check if that line exists or not is quite time-consuming.
Unfortunately, I also can't import the rows in batches because if one subject doesn't exist it will create it so many times until the batch is flushed to the database and then I sit with duplicate records that serve no point.
How can I improve performance and speed to the max? Like first get all records from the database and store it in arrays (memory consuming?) and then do the checks and add the line to the array and check from there...
Can somebody please help me find a way to improve this (with sample code please?)
set_time_limit
, because the timeouts are mostly the problem. – Chromoplast