This research investigates the Dry Bean dataset produced by Koklu et al. [1] and aims to present a classification model that can discriminate among bean varieties for segregation in agricultural applications, such as separating seeds before sowing to maintain crop purity.
Dry Bean Dataset
Images of 13,611 grains of 7 different registered dry beans were taken with a high-resolution camera. A total of 16 features (12 dimensions and 4 shape forms) were obtained from the grains.
Murat KOKLU Faculty of Technology, Selcuk University, TURKEY. ORCID : 0000-0002-2737-2360 mkoklu@selcuk.edu.tr
Ilker Ali OZKAN Faculty of Technology, Selcuk University, TURKEY. ORCID : 0000-0002-5715-1040 ilkerozkan@selcuk.edu.tr
Data Set Characteristics: Multivariate
Associated Tasks: Classification
Attribute Characteristics: Categorical, Integer, Real
Area: CS / Engineering
Format Type: Matrix
Missing Values: No
Number of Instances: 13,611
Number of Attributes: 17
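The counts stated above (13,611 instances, 17 attributes, no missing values) can be checked against a local copy of the data. The following is a minimal sketch, assuming the data has been exported as comma-separated text with one header row; the function name `check_table` and the CSV layout are illustrative assumptions, not part of the original distribution.

```python
import csv
import io


def check_table(text, n_rows=13611, n_cols=17):
    """Compare a CSV export against the expected shape of the dataset.

    Assumes (hypothetically) a comma-separated export with one header row.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = list(reader)
    # Count cells that are empty or whitespace only.
    missing = sum(1 for row in rows for cell in row if cell.strip() == "")
    return {
        "columns_ok": len(header) == n_cols
        and all(len(row) == n_cols for row in rows),
        "rows_ok": len(rows) == n_rows,
        "missing_cells": missing,
    }
```

For the full file, the text would be read from disk and passed in with the default arguments; a smaller sample can be checked by overriding `n_rows` and `n_cols`.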
Seven different types of dry beans were used in this research, taking into account features such as form, shape, type, and structure, as valued in the market. A computer vision system was developed to distinguish seven different registered varieties of dry beans with similar features in order to obtain uniform seed classification. For the classification model, images of 13,611 grains of 7 different registered dry beans were taken with a high-resolution camera. The bean images obtained by the computer vision system were subjected to segmentation and feature extraction stages, and a total of 16 features (12 dimensions and 4 shape forms) were obtained from the grains.
1.) Area (A): The area of a bean zone and the number of pixels within its boundaries.
2.) Perimeter (P): Bean circumference, defined as the length of its border.
3.) Major axis length (L): The distance between the ends of the longest line that can be drawn from a bean.
4.) Minor axis length (l): The longest line that can be drawn from the bean while standing perpendicular to the major axis.
5.) Aspect ratio (K): Defines the relationship between L and l.
6.) Eccentricity (Ec): Eccentricity of the ellipse having the same moments as the region.
7.) Convex area (C): Number of pixels in the smallest convex polygon that can contain the area of a bean seed.
8.) Equivalent diameter (Ed): The diameter of a circle having the same area as a bean seed area.
9.) Extent (Ex): The ratio of the bean area to the pixels in its bounding box.
10.) Solidity (S): Also known as convexity. The ratio of the pixels in the bean to those in its convex shell.
11.) Roundness (R): Calculated with the following formula: (4*pi*A)/(P^2)
12.) Compactness (CO): Measures the roundness of an object: Ed/L
13.) ShapeFactor1 (SF1)
14.) ShapeFactor2 (SF2)
15.) ShapeFactor3 (SF3)
16.) ShapeFactor4 (SF4)
17.) Class (Seker, Barbunya, Bombay, Cali, Dermason, Horoz and Sira)
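Several of the attributes listed above are derived from the four primary measurements A, P, L and l. The sketch below shows how the stated formulas for equivalent diameter, roundness and compactness could be evaluated. The function name is hypothetical, and the aspect ratio is computed as L / l, which is an assumption, since the list only says that K relates L and l.

```python
import math


def derived_shape_features(area, perimeter, major_axis, minor_axis):
    """Evaluate derived shape attributes from A, P, L and l."""
    # Ed: diameter of a circle with the same area, from A = pi * (Ed / 2)^2.
    equivalent_diameter = math.sqrt(4.0 * area / math.pi)
    return {
        # Assumption: K is taken as L / l (not spelled out in the list).
        "aspect_ratio": major_axis / minor_axis,
        "equivalent_diameter": equivalent_diameter,
        # R = (4 * pi * A) / P^2, as stated in the attribute list.
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
        # CO = Ed / L, as stated in the attribute list.
        "compactness": equivalent_diameter / major_axis,
    }
```

As a sanity check, an ideal circle of radius 10 (A = 100*pi, P = 20*pi, L = l = 20) gives an equivalent diameter of 20 and a value of 1 for roundness, compactness and aspect ratio, up to floating-point error. Elongated shapes give roundness and compactness below 1.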
[1] KOKLU, M. and OZKAN, I.A. (2020). "Multiclass Classification of Dry Beans Using Computer Vision and Machine Learning Techniques." Computers and Electronics in Agriculture, 174, 105507. DOI: https://doi.org/10.1016/j.compag.2020.105507