data = data_average_by_colony(data) Averages the spectra within each colony. Spectra are grouped using data.colony_codes. If two spectra in the same colony have different classes, the class assumed for the colony is the one assigned to the first spectrum in that colony.
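The behaviour described above can be illustrated with a minimal Python sketch. It is not the actual implementation of data_average_by_colony. The function name average_by_colony, the plain-list arguments (spectra, colony_codes, classes), and the choice to return colonies in order of first appearance are all assumptions made for illustration.

```python
def average_by_colony(spectra, colony_codes, classes):
    """Average spectra per colony (hypothetical helper, not the library function).

    spectra      -- list of rows, one spectrum per row, all of equal length
    colony_codes -- one colony code per spectrum
    classes      -- one class label per spectrum
    """
    # dicts keep insertion order, so colonies come out in order of first
    # appearance (an assumption; the original may order them differently)
    groups = {}
    for row, code, cls in zip(spectra, colony_codes, classes):
        if code not in groups:
            # the class of the first spectrum seen in a colony wins
            groups[code] = {"rows": [], "cls": cls}
        groups[code]["rows"].append(row)

    out_spectra, out_codes, out_classes = [], [], []
    for code, group in groups.items():
        rows = group["rows"]
        # column-wise mean over all spectra in this colony
        out_spectra.append([sum(col) / len(rows) for col in zip(*rows)])
        out_codes.append(code)
        out_classes.append(group["cls"])
    return out_spectra, out_codes, out_classes


# Colony "A" has two spectra with conflicting classes (0 and 1);
# the averaged spectrum keeps class 0, the class of the first one.
avg, codes, cls = average_by_colony(
    [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]],
    ["A", "A", "B"],
    [0, 1, 1],
)
```

In this example, avg holds one averaged spectrum per colony, and codes and cls hold the matching colony code and assumed class.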