Study question
How does the performance of a blastulation prediction model using full-frame data over 3 days compare to that of a single-frame model using day 3 data?
Summary answer
The single-frame model, based on a single day 3 embryo image, showed predictive performance comparable to that of the full-frame model (AUROC 0.76 vs. 0.75; p = 0.4884).
What is known already
Pregnancy rates following blastocyst transfer have been reported to be higher than those after cleavage-stage transfer. However, not all cleavage-stage embryos reach the blastocyst stage, which has prompted the exploration of predictive models based on embryo images and artificial intelligence. In clinical practice, the decision to extend culture to the blastocyst stage is often made on day 3, based on embryo quality assessment. Recently, there has been interest in using morphokinetic data to predict blastulation and pregnancy outcomes. While the importance of morphokinetics for IVF outcomes is acknowledged, it remains unclear which time frames and features contribute most to accurate predictions.
Study design, size, duration
This retrospective study analyzed a dataset of 15,615 time-lapse images from 906 patients at a single IVF center. The Single-Frame Model used a single image collected at 66.7 hours post-ICSI, and the Full-Frame Model used 145 frames collected from fertilization to 66.7 hours post-ICSI. Blastulation was defined as reaching the blastocyst stage on day 5, with the blastocyst deemed suitable for transfer or freezing by an experienced embryologist.
Participants/materials, setting, methods
We trained a ResNet for the Single-Frame Model and a 3D CNN for the Full-Frame Model. The ResNet was initialized with pre-trained weights, while the 3D CNN was trained from scratch without pre-trained weights. The models were evaluated using AUROC, accuracy, and F1 score. The statistical significance of the difference in AUROC between the two models was assessed using DeLong’s test.
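As an illustration of the three evaluation metrics named above, the following is a minimal sketch in plain Python. It is not the study's evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for demonstration. AUROC is computed as the fraction of (blastulated, non-blastulated) pairs in which the blastulated embryo receives the higher score, with ties counted as one half.

```python
def auroc(labels, scores):
    """AUROC as the Mann-Whitney statistic over all positive/negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A pair counts 1 if the positive outscores the negative, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(labels, scores, threshold=0.5):
    """Accuracy and F1 after thresholding scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy data (hypothetical): 1 = reached blastocyst, 0 = did not.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]

toy_auroc = auroc(toy_labels, toy_scores)
toy_accuracy, toy_f1 = accuracy_and_f1(toy_labels, toy_scores)
print(toy_auroc, toy_accuracy, toy_f1)
```

Unlike accuracy and F1, AUROC does not depend on the chosen threshold, which is one reason it is a natural primary metric when comparing two models.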
Main results and the role of chance
The Single-Frame and Full-Frame Models showed similar AUROC for day 3 embryos. The Single-Frame Model achieved an AUROC of 0.76, an accuracy of 0.71, and an F1 score of 0.75, while the Full-Frame Model achieved an AUROC of 0.75, an accuracy of 0.70, and an F1 score of 0.72. DeLong’s test showed that the difference in AUROC was not statistically significant (p = 0.4884). These results indicate that the Single-Frame Model predicts blastulation about as well as the Full-Frame Model.
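For readers unfamiliar with DeLong's test, the comparison of two AUROCs measured on the same embryos can be sketched as follows. This is a minimal plain-Python illustration of the standard DeLong procedure for two correlated ROC curves, not the authors' implementation. The function names and the toy labels and scores are hypothetical and unrelated to the study data.

```python
import math


def _psi(p, n):
    """Pairwise kernel: 1 if positive outscores negative, 0.5 on a tie."""
    return 1.0 if p > n else 0.5 if p == n else 0.0


def _components(labels, scores):
    """DeLong structural components for positives (v10) and negatives (v01)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    v10 = [sum(_psi(p, n) for n in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, n) for p in pos) / len(pos) for n in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance with an (N - 1) divisor."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(labels, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    v10_a, v01_a = _components(labels, scores_a)
    v10_b, v01_b = _components(labels, scores_b)
    m, n = len(v10_a), len(v01_a)  # numbers of positives and negatives
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0.0:
        # Degenerate case (e.g. identical scores): the statistic is undefined.
        return auc_a, auc_b, float("nan"), float("nan")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Toy data (hypothetical): both models score the same eight embryos.
demo_labels = [1, 1, 1, 1, 0, 0, 0, 0]
demo_a = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]
demo_b = [0.8, 0.5, 0.3, 0.6, 0.7, 0.4, 0.2, 0.1]

demo_auc_a, demo_auc_b, demo_z, demo_p = delong_test(demo_labels, demo_a, demo_b)
print(demo_auc_a, demo_auc_b, demo_z, demo_p)
```

Because both models are evaluated on the same embryos, their AUROC estimates are correlated. The covariance term accounts for this, which is why a paired test such as DeLong's is preferred over comparing two independent confidence intervals.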
Limitations, reasons for caution
Our single-frame analysis was limited to the image taken at 66.7 hours post-ICSI, chosen to mirror clinical practice. Further investigations at different time points are necessary to determine the optimal time for blastulation prediction. Additional analysis of single frames from day 1 and day 2 may offer further insights.
Wider implications of the findings
Accurate prediction of blastulation on day 3 is pivotal for embryo survival and IVF success. Our study challenges the assumption that more data necessarily improves AI models and highlights the importance of day 3 observations. Further research is warranted to identify the key time points and features that improve predictive accuracy.