Challenges with PDF Generation Performance
A discussion on the performance issues faced when generating PDFs for thousands of students and how I optimized the process.
Today I want to talk about a problem I personally ran into during MoraExams 2025.
MoraExams is a series of exams for A/L Physical and Biological Science students, conducted by Mora E-Tamils students. We usually conduct these exams about two months before the final GCE A/L exams.
There are several teams with specific responsibilities:
- the tech team handles technology-related work and web development,
- the marketing team handles social media and design,
- exam coordinators focus on exam paper preparation and answer explanations,
- parcel coordinators focus on sending and receiving exam papers.
There are also leadership roles like president, secretary, and treasurer.
I was part of the tech team. We redeveloped the admin dashboard and tried to build a finance system so the treasurer and finance team could manage their finances, which had previously been tracked in Google Sheets. I was mostly assigned to backend work, and sometimes I uploaded model papers to the public website, moraexams.org.
While working on the backend, I got a special task from our tech coordinator, Sahithyan: generate student attendance sheets. Previously, our senior batch used a template and LaTeX to generate these sheets manually with hardcoded logic. So we decided to add an endpoint and control this from the dashboard.
For this work, I tested multiple libraries and found gopdf to be the best choice for PDF generation in Go. I refactored the previous code, added the endpoint in the backend, tested it with the previous student database, finalized the format, and removed old layouts.
However, the generation process was too slow. It took more than 3 minutes to generate sheets for around 5,000 students because the preparation function had a time complexity of O(n^5). So I removed some loops and added multithreading to several parts. The final version took about 1.5 minutes on average, but that was still not good enough.
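As a rough illustration of the kind of change described above (this is not the actual MoraExams code), the sketch below groups students in a single pass with a map and then hands the groups to a small pool of goroutines. The `Student` fields, `groupKey`, and both function names are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Student is a hypothetical record; the real schema is not shown in this post.
type Student struct {
	Name    string
	Subject string
	Center  string
}

// groupKey identifies one attendance sheet: one subject at one exam center.
type groupKey struct {
	Subject string
	Center  string
}

// groupStudents buckets students in a single pass over the slice,
// instead of re-scanning the full list inside nested loops.
func groupStudents(students []Student) map[groupKey][]Student {
	groups := make(map[groupKey][]Student)
	for _, s := range students {
		k := groupKey{Subject: s.Subject, Center: s.Center}
		groups[k] = append(groups[k], s)
	}
	return groups
}

// sortGroups sorts the names inside each group using a fixed number of
// worker goroutines. Each worker only touches the slice for the key it
// received, so no two goroutines write to the same memory.
func sortGroups(groups map[groupKey][]Student, workers int) {
	if workers < 1 {
		workers = 1
	}
	keys := make(chan groupKey)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for k := range keys {
				g := groups[k] // the map itself is only read here
				sort.Slice(g, func(a, b int) bool { return g[a].Name < g[b].Name })
			}
		}()
	}
	for k := range groups {
		keys <- k
	}
	close(keys)
	wg.Wait() // return only after every group is sorted
}

func main() {
	students := []Student{
		{Name: "Bala", Subject: "Physics", Center: "Jaffna"},
		{Name: "Anu", Subject: "Physics", Center: "Jaffna"},
		{Name: "Chitra", Subject: "Biology", Center: "Colombo"},
	}
	groups := groupStudents(students)
	sortGroups(groups, 2)
	for k, g := range groups {
		fmt.Println(k.Subject, k.Center, len(g))
	}
}
```

The worker count is fixed on purpose: on a machine with a single core, spawning more goroutines does not make CPU-bound work any faster.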
So I updated the function to generate PDFs subject by subject and added a separate endpoint for that. Then I created a frontend button to generate and download the zipped PDFs. I tested everything locally, and it worked fine. Performance was much better than before.
After all other work was finished and admissions were closed, we deployed the backend to a DigitalOcean VPS. We did not test this flow after deployment. We started generation, clicked download, and sent those ZIP files to friends handling printing.
They called us at night, around two days before the first exam. At that time, we were on our way to Jaffna for exam work. They told us many student names were missing.
Then we found that the VPS had only 1 GB of RAM and 1 core, while my laptop had 8 GB of RAM and 12 cores. This hardware difference was the main reason for the issue. The student count had also grown to about 10,000, roughly double what I had tested with. Generation was slower than expected, and we started downloading too early, so partially generated attendance sheets were zipped and downloaded.
We started generation again, waited, and downloaded again, but the same problem happened. We modified the code to generate each part of the exam separately for each subject, such as Part 1 and Part 2, but still got the same issue. Then we decided to back up the student data and generate sheets locally. We did that, and all attendance sheets were generated safely.
With help from our friends on the train, we manually compared all student names with the admin dashboard records page. We found some missing names caused by admission-approval mistakes and corrected them. Then we checked student counts in the regenerated sheets and sent them to friends in the exam preparation hall.
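The manual comparison we did on the train could, in principle, be automated. Here is a minimal sketch that assumes both the approved list and the generated sheet are available as plain slices of names; `missingNames` is a hypothetical function, and a real check would more likely compare index numbers or IDs than names.

```go
package main

import "fmt"

// missingNames returns every expected name that does not appear in generated.
// Duplicates are respected: two students with the same name need two rows.
func missingNames(expected, generated []string) []string {
	counts := make(map[string]int, len(generated))
	for _, name := range generated {
		counts[name]++
	}
	var missing []string
	for _, name := range expected {
		if counts[name] > 0 {
			counts[name]-- // this generated row is now accounted for
			continue
		}
		missing = append(missing, name)
	}
	return missing
}

func main() {
	expected := []string{"Anu", "Bala", "Chitra", "Bala"}
	generated := []string{"Bala", "Anu"}
	fmt.Println(missingNames(expected, generated)) // prints [Chitra Bala]
}
```

Running a check like this right after generation, before anything is sent to print, would flag a partial sheet immediately.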
Still, they said some names were missing, but we could not find the error. I also realized I had forgotten to generate sheets gender-wise, so we could not use those either. I went directly from the train to the hall around 5:00 a.m. and fixed the gender-wise sheet issue.
By that time, our juniors had already sent parcels to some out-district exam centers with the previously generated sheets. So our committee members decided not to use the regenerated sheets for the first exam, even though I said the bugs were fixed. For various reasons, they decided to use the corrected sheets from the next subject onward.
On the first exam day, I added new students to the database and generated updated attendance sheets. Those sheets were printed and used after the first exam. It really helped invigilators check and verify students easily.
Because of these errors, our finance team spent extra money printing incorrect sheets. The cost was in five digits.
It’s been a hell of a journey, but I learned a valuable lesson the hard way: what we build and test in development mode does not always behave the same way in a real production environment. The challenges we faced reinforced the importance of testing critical workflows in the actual deployment environment.
MoraExams 2025 taught me not only technical skills like Go, PDF generation and downloading, and deploying to a VPS, but also the importance of thorough testing and preparation for real-world scenarios. It’s crucial to make sure that everything works smoothly, especially when working on large-scale projects that affect many people.
Here are some useful related links:
- MoraExams Official Website - The platform for students to access exam papers and resources.
- Go Programming Language Documentation - Official documentation for learning and using Go.
- gopdf GitHub Repository - The Go PDF library used for generating attendance sheets.
- Docker Documentation - A comprehensive guide to using Docker for containerization.
- DigitalOcean VPS Documentation - Official resources on managing and deploying applications on DigitalOcean VPS.
- LaTeX Documentation - For readers interested in learning LaTeX for document generation.
- PDF Generate Golang (My Testing Repo) - My testing repository for generating PDFs using Go.