-
According to the kernel density distribution in Section 3 Distribution Plots, the sample contains more younger people than older people. Two bumps, a global maximum and an almost flat local maximum, indicate that most ages are concentrated around the older and younger middle ages, respectively.
-
Most people have fewer than 5 axillary nodes, as also inferred from Section 3 Distribution Plots. The same section shows that most people survive after removal of the nodes.
-
The hexagonal density plot in Section 4.1 Viewing Density in Age VS Nodes indicates that, across all age groups, most people had fewer than 10 nodes. Between the ages of 50 and 60, the number of nodes increased.
-
In Section 6.1 Strip Plot, the plot of Age VS Nodes shows that it is mostly the middle-aged who go through such a procedure; extremely young or old patients are rare. However, the correlation appears only slightly strong, judging by the scatter of the colouration. To check whether outliers are weakening the correlation, a box plot is drawn in Section 6.2 Box Plot. It shows that the outliers point to a combination of factors in a non-linear relationship, rather than a simple relationship between nodes and age.
-
According to Section 6.2 Relational Plot of Age VS Nodes with Colouration of Survival Status, early detection increases survival, at or, preferably, before the early 40s.
-
The heat map in Section 7 Again, Looking at Whole Dataset shows a medium correlation between survival and nodes. This can be explained in two ways:
a. Small dataset
b. Medical intervention correlates positively with survival, while damage caused by the nodes correlates negatively with it. Hence, the relationship between survival and nodes is not a straightforward two-variable relationship.
NOTE: So far, no strong demarcation has been observed for linear classification, because the dataset's columns cover only a limited set of factors. The demarcations found correspond to slightly strong or medium correlations.
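
The correlation strengths and proportions quoted in these conclusions are read off plots. A numeric cross-check could look like the sketch below. It is a minimal illustration in plain Python, not the code used in this analysis. It assumes each record is an `(age, nodes, survived)` tuple with survival encoded as 1 (survived) or 0 (did not survive). The function names, the encoding, and the toy rows are all hypothetical; the toy rows are made up only to exercise the functions and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def share_below(values, threshold):
    """Fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


def summarize(rows):
    """Summarise (age, nodes, survived) rows; survived is assumed to be 1 or 0."""
    ages = [r[0] for r in rows]
    nodes = [r[1] for r in rows]
    survived = [r[2] for r in rows]
    return {
        "share_under_5_nodes": share_below(nodes, 5),
        "survival_rate": sum(survived) / len(survived),
        "corr_age_nodes": pearson(ages, nodes),
        "corr_nodes_survival": pearson(nodes, survived),
    }


# Toy rows, invented for illustration only; NOT the dataset analysed above.
toy_rows = [
    (34, 0, 1), (41, 2, 1), (47, 1, 1), (52, 12, 0),
    (55, 4, 1), (58, 20, 0), (63, 0, 1), (70, 3, 0),
]
summary = summarize(toy_rows)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Running `summarize` on the real rows would give coefficients to compare against the heat map. A Pearson coefficient only captures linear association, so a weak or medium value is consistent with the non-linear, multi-factor relationship suggested above rather than evidence against it.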