Overall Conclusion:
After completing all of our model tests, we found that the multiple linear regressions did not perform as well as we had hoped. After further evaluation, we determined that, of all the tests run, the three best results came from our 5th, 6th, and 7th neural network tests, with R2 scores of 56.535%, 58.178%, and 59.190%, respectively. These neural networks had 3 or 4 densely connected layers of 30 perceptrons each, with ReLU activation functions.
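For readers unfamiliar with the metric, the R2 score reported above is the fraction of variance in the target that a model explains. The following is a minimal sketch of that calculation in plain Python. The function name and the sample values are hypothetical and chosen only for illustration; they are not taken from our tests, and this is not necessarily the exact routine used to produce our scores.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical target values and predictions, for illustration only.
actual = [9, 10, 7, 12, 8]
predicted = [9.5, 9.0, 8.0, 11.0, 8.5]

score = r2_score(actual, predicted)
print(f"R2 = {score:.3f} ({score:.1%} of variance explained)")
```

A score of 1 means perfect predictions, while a score of 0 means the model does no better than always predicting the mean, which is why a score near 59% leaves a large share of the variance unexplained.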
Improvements:
As we progressed through our testing, we encountered several possible problems: the data set may simply have been a poor overall fit, a different tuner might have worked better, and some extraneous outliers may have distorted our data.
1. Our data set may not have been a good fit for machine learning. It is likely that other factors, not captured in the original data set, influenced the data. The physical characteristics do not, by themselves, explain all of the variance in the data set.
2. With more time to work with the Keras Tuner, we could have tried a different tuner and possibly reached a more conclusive result.
3. Even after the ETL process, several data "shenanigans" remained that could have skewed the data, e.g., a 0.5 mm shell length. The numbers in the data set from UC Irvine's website did not seem to make sense with the units given (mm and grams), and the values did not appear to have been converted into other units. We aren't sure why the values and units don't seem to match. This is something we would want to look into further if we had more time.
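One way to catch implausible values such as the 0.5 mm shell length before training is a simple outlier screen. The sketch below uses the common interquartile range (IQR) rule, which is an illustrative choice and not part of our actual ETL process. The function names, the multiplier k=1.5, and the sample lengths are all assumptions made for this example.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the indices of values that fall outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical shell lengths in mm, including one implausibly small value.
lengths = [91.0, 70.0, 106.0, 88.0, 66.0, 85.0, 109.0, 0.5, 95.0, 110.0]

suspect = flag_outliers(lengths)
for i in suspect:
    print(f"row {i}: length {lengths[i]} mm looks suspicious")
```

Flagged rows would still need to be reviewed by hand, since a statistical fence cannot tell a data-entry error from a genuinely unusual specimen.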
Overall Thoughts:
While we did not achieve the strong predictive scores that we had sought, the process of using machine learning to try to make a positive, real-life impact on the world, even for something as small and niche as abalone shells, was informative and helped us incorporate and apply the skills that we have learned in this course.