For 209 objects, an X-data set (467 variables) and a y-data set (1 variable) are available. The data describe GC-retention indices of polycyclic aromatic compounds (y), which have been modeled by molecular descriptors (X).
The data consist of mass spectra from 600 chemical compounds, of which 300 contain a phenyl substructure (group 1) and 300 do not contain this substructure (group 2). The mass spectra have been transformed to 658 variables containing the mass spectral features. The 2 groups are coded as -1 (group 1) and +1 (group 2), and this group code is provided as an additional variable.
For 166 alcoholic fermentation mashes of different feedstocks (rye, wheat and corn), we have 235 variables (X) containing the first derivatives of near-infrared (NIR) spectroscopy absorbance values at 1115-2285 nm, and two variables (Y) containing the concentrations of glucose and ethanol (in g/L).
For 15 cereals, X and Y data sets, measured on the same objects, are available. The X data are 145 infrared spectra, and the Y data are 6 chemical/technical properties (Heating value, C, H, N, Starch, Ash). The scaled Y data (mean 0, variance 1 for each column) are also included. The cereals come from 5 groups: B = Barley, M = Maize, R = Rye, T = Triticale, W = Wheat.
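The column-wise scaling mentioned for the Y data (mean 0, variance 1) can be illustrated with a minimal sketch. The function name `autoscale` and the numbers are made up for illustration and are not taken from the cereal data. The sketch assumes the sample variance (divisor n - 1) is meant, which the description does not state.

```python
import statistics


def autoscale(columns):
    """Scale each column to mean 0 and sample variance 1 (assumed divisor n - 1)."""
    scaled = []
    for col in columns:
        mean = statistics.fmean(col)
        sd = statistics.stdev(col)  # sample standard deviation
        scaled.append([(value - mean) / sd for value in col])
    return scaled


# Hypothetical values for two properties measured on four objects.
y_columns = [
    [18.1, 18.4, 18.0, 18.7],
    [1.2, 1.9, 1.5, 2.2],
]
y_scaled = autoscale(y_columns)

for col in y_scaled:
    # Each scaled column has mean ~0 and sample variance ~1.
    print(round(statistics.fmean(col), 6), round(statistics.variance(col), 6))
```

After scaling, properties with very different units and ranges contribute on a comparable footing when they are modeled together.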