Description
Image augmentation is an important aspect of training a deep learning model, as it can improve the model's performance. However, for a model of a given size, adding augmentations may yield no gain in performance while lengthening training, wasting potentially limited time and resources. Because of this, we tested the importance of both image augmentation and training-dataset size using Zoobot, a pre-existing, state-of-the-art deep learning model that classifies several morphological features of galaxy images.
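To illustrate the kind of augmentation pipeline being varied, here is a minimal sketch using torchvision transforms. This is an assumed, generic setup for clarity, not Zoobot's actual training code; the specific transforms and parameters are illustrative choices.

# Minimal sketch of varying image augmentation, assuming torchvision;
# this is illustrative and not Zoobot's actual training pipeline.
from torchvision import transforms

# Baseline: no augmentation, just resize and convert to a tensor.
baseline = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
])

# With augmentation: random flips and rotations are natural choices for
# galaxy images, whose morphology labels are invariant under these symmetries.
augmented = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=180),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])

Comparing models trained with pipelines like these, of differing lengths and on training sets of differing sizes, is the kind of experiment described above.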
We find that, in general, increasing the number of augmentations and the size of the training data does improve model performance, although the gains are often minimal (a 2-3% improvement). Models trained with some level of image augmentation typically perform similarly, suggesting that the exact selection of augmentations matters less than having some augmentation process at all. Due to limited model capacity, the different models often converge on a performance ceiling they are unable to surpass; because more complex questions require more data, the models are less likely to converge for those questions. This research demonstrates the careful balancing required between model capacity and data diversity, and shows that we often don't need to throw everything but the kitchen sink into a deep learning model to achieve good results. In an era when sustainability is paramount, astronomers might consider how to minimise computing resource use going forward.