Part one of two: a data and statistical analysis of the abalone population in California. Keep in mind that this is a basic statistical analysis for academic purposes. The full document can be read here: Abalones Part 1 PDF Document
The purpose of this paper is to carry out an exploratory data analysis of a failed abalone population observational study. The goal is to determine probable reasons why the original study was not successful, and whether any other variables should be considered when predicting age from physical characteristics. This exploratory data analysis intends to identify possible relationships between the physical characteristics and the other variables observed in the abalone data collection, and to show how these relationships help to understand the underlying relations among the variables and to improve future observational studies and the conclusions of the second delivery.
Blacklip Abalone, according to the Wild Fisheries Research Program, is a large, flattened marine mollusk used mainly as a food source for humans; populations of this mollusk are distributed from Cabo San Lucas in Baja California, Mexico, all the way up to Oregon. According to the Center for Biological Diversity, the current Blacklip Abalone population is only 1% of the population that existed in 1985.
The factors most often cited as relevant to this population decline are:
- Predation.
- Mortality of small abalone for many reasons.
- Overharvesting. Abalone are easily overharvested because of slow growth and variable reproductive success.
- Competition. Sea urchins and other species use abalone food and living space.
- Illegal harvesting, the most important reason for the decline.
- Loss of habitat. Coastal "development" and pollution have ruined large areas of abalone habitat.
- Environmental factors, such as global warming, pollution or other changing environmental conditions.
- Diseases, such as withering syndrome.
Making sustained efforts to understand the Blacklip abalone is very important for the survival of the species. This paper applies the statistical and graphical methods of exploratory data analysis to the abalone data collection and focuses on the approach the scientists followed to gather the information. The main purpose is to understand the different variables and methods used during the gathering process, and to provide a systematic scheme for looking at the data and extracting the patterns it contains.
The tools that will be used in this exploratory data analysis are:
- Exploratory statistical analysis of the whole data set, regardless of differentiators such as sex or class, to get an overview of the data and understand each variable as a separate entity.
- Exploratory graphs for single variables, to recognize their structure and distribution and to identify outliers or any other atypical characteristics.
- Analysis of the possible relationships between variables, more specifically between categorical and numerical variables, to comprehend possible patterns and structures.
- Principles of analytic graphics, applied to understand the relations among variables, the type of distribution, the variation in the different variables, the measures of central tendency and other data characteristics, and to identify outliers. Plotting the data lets the analyst determine the extent to which the assumptions are valid and catch obvious errors in data entry.
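The first three tools in the list above can be sketched numerically with the Python standard library alone. This is a minimal illustration, not the procedure used in the paper: the records are hypothetical toy values (not data from the study), the helper names `summarize`, `iqr_outliers` and `mean_by_group` are invented for this example, and the 1.5 × IQR fence is just one common convention for flagging outliers.

```python
import statistics
from collections import defaultdict

# Hypothetical toy records, NOT real study data:
# (sex, shell length in mm, ring count)
RECORDS = [
    ("M", 91, 15), ("F", 106, 9), ("I", 66, 7), ("M", 88, 10),
    ("I", 85, 8), ("F", 109, 20), ("F", 110, 16), ("M", 95, 9),
    ("I", 48, 5), ("M", 160, 11),
]


def summarize(values):
    """Central tendency and spread for a single numeric variable."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "stdev": statistics.stdev(values),
        "q1": q1,
        "q3": q3,
    }


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def mean_by_group(records, group_index, value_index):
    """Mean of a numeric field for each level of a categorical field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_index]].append(record[value_index])
    return {group: statistics.mean(vals) for group, vals in groups.items()}


if __name__ == "__main__":
    lengths = [record[1] for record in RECORDS]
    print("Length summary:", summarize(lengths))
    print("Length outliers:", iqr_outliers(lengths))
    print("Mean rings by sex:", mean_by_group(RECORDS, 0, 2))
```

A flagged value is only a candidate for inspection: it may be a data-entry error or a genuinely unusual animal, which is why the graphical checks in the list remain necessary.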
- Physical abalone characteristics are partially helpful for predicting age, but when physical characteristics are similar across observations, the investigators' subjective judgment becomes part of the data classification.
- Even though many variables were included in the investigation, the model has a weakness because factors such as weather, location or government regulations are not part of it.
- Similarly, failing to discover causal relations between variables (or assuming one exists), or failing to establish the real impact of the different variables on predicting an abalone's age, leads to false discoveries and incorrect conclusions.
When there is no clear relation or correlation between variables, the probability of reaching wrong conclusions is high. As mentioned before, “ring clarity can be an issue. At the completion of the breeding season sexing abalone can be difficult. Similar difficulties are experienced when trying to determine the sex of immature abalone referred to as infants.” Physical features are not as clear as we might think, which implies that individual judgment plays a part when the data is grouped. At this point, strict criteria with no room for interpretation should be put in place so that investigators classify data by standard features and not in personal terms. Causality can be accepted if statistical procedures are run to determine the degree of correlation, but assuming causality just from looking at data in tables or graphs carries a great risk of drawing mistaken conclusions.
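One way to quantify the degree of correlation, rather than judging it by eye from a table or graph, is the Pearson correlation coefficient. The sketch below is an illustration only: the paired values are hypothetical toy numbers (not data from the study), and `pearson_r` is a helper written for this example. The coefficient measures linear association only, so even a value near 1 or -1 does not by itself establish causality.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / (dev_x * dev_y)


# Hypothetical toy measurements, NOT real study data.
LENGTHS_MM = [91, 106, 66, 88, 85, 109, 110, 95, 48, 160]
RINGS = [15, 9, 7, 10, 8, 20, 16, 9, 5, 11]

if __name__ == "__main__":
    r = pearson_r(LENGTHS_MM, RINGS)
    print(f"Pearson r between length and rings: {r:.3f}")
```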