Introduction To Statistical Analysis of Socioeconomic Data
Part I: Describe your data and graphical analysis
- a) Definition of variables
| Variable Name | Definition |
| --- | --- |
| Country | The nation from which the data is collected; in this case, Uruguay. |
| Year | The calendar year in which the survey data was recorded; here it is 2021. |
| Global Region | The geographical region where the country is located; Uruguay is in 'Latin America & Caribbean'. |
| Country IncomeLevel2021 | The World Bank's classification of the country's income level in 2021; Uruguay is 'High income'. |
| Age | The respondent's age at the time of the survey. |
| AgeGroups4 | A categorization of ages into four broad groups for demographic segmentation. |
| Gender | The respondent's self-identified gender. |
| Education | The highest level of education completed by the respondent. |
| IncomeFeelings | The respondent's subjective feeling about their current income situation. |
| INCOME_5 | The quintile distribution of the respondent's income compared to the national level. |
| EMP_2010 | The respondent's employment status as of 2010, which could indicate long-term employment trends. |
| Resilience_Index | A composite measure indicating the respondent's capacity to recover from economic disturbances. |
Variable categorisation
Country (Uruguay, or any other country if present in the full dataset)
Global Region (Latin America & Caribbean)
Gender (Male, Female)
Urbanicity (Large city/suburb, Rural area/small town)
Education (Primary, Secondary, Tertiary)
IncomeFeelings (Getting by, Finding it difficult, Living comfortably, etc.)
EMP_2010 (Employment status such as Employed, Unemployed, Out of workforce)
- b) Histogram
The frequencies vary widely across bins, which indicates that the underlying data are not uniformly distributed. The highest frequencies are seen in the bins near 28.23 and 33.52, indicating a concentration of values in this range (Makwashi et al., 2019). With some exceptions, the frequency falls as the bin value rises, which suggests that the data are concentrated at the lower end of the range with a longer tail towards higher values (a right skew). The lowest-frequency bins, especially those starting at 83.77, show that there are few observations in the upper range of the data (Anders Lindanger et al., 2021). In other words, higher values of the variable are less common.
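The binning behind a histogram like the one described above can be sketched in a few lines of Python. This is only an illustration: the function name `equal_width_histogram`, the number of bins, and the list of ages are all assumptions invented for the example, not the survey data or the bin settings used in the report.

```python
def equal_width_histogram(values, n_bins):
    """Return (bin_edges, counts) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical ages, invented for illustration only (not the survey data).
ages = [18, 22, 25, 28, 29, 31, 33, 34, 36, 41, 45, 52, 58, 63, 71, 84]
edges, counts = equal_width_histogram(ages, n_bins=6)

# Print a simple text histogram: one '#' per observation in each bin.
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:6.2f} to {right:6.2f}: {'#' * n} ({n})")
```

In this made-up sample the counts fall as the bin value rises, which is the same right-skewed shape the paragraph describes.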
c)
| Gender | Count |
| --- | --- |
| Female | 595 |
| Male | 405 |
| Grand Total | 1000 |
A total of 1000 people were surveyed. Female respondents made up 59.5% of the sample (Sidiropoulou et al., 2022), and male respondents made up the remaining 40.5%. This disparity points to a potential gender bias in the sampling process or in the response rate (Régner et al., 2019). If the gender ratio in the population is more balanced than this, the imbalance could lead to biased conclusions when interpreting the survey data or extrapolating the findings to the wider population.
d)
| Age group | Count |
| --- | --- |
| 15-29 | 131 |
| 30-49 | 183 |
| 50-64 | 142 |
| 65+ | 139 |
| Grand Total | 595 |
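The percentage shares implied by the two count tables above can be checked with a short calculation. The sketch below uses the counts exactly as they appear in the tables; the helper name `shares` and the dictionary names are assumptions made for the example.

```python
def shares(counts):
    """Map each category to its percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts copied from the gender table (total 1000).
gender_counts = {"Female": 595, "Male": 405}
# Counts copied from the age-group table (total 595).
age_group_counts = {"15-29": 131, "30-49": 183, "50-64": 142, "65+": 139}

gender_shares = shares(gender_counts)
age_group_shares = shares(age_group_counts)

print(gender_shares)
print(age_group_shares)
```

The age-group total of 595 equals the female count in the gender table, so the second set of shares is expressed relative to 595 rather than to the full sample of 1000.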
i)
The data show a detailed breakdown of respondents' ages and genders within a large sample of 125,911, comprising 59,417 men and 66,494 women (ResearchGate, 2020). Young adults are well represented, particularly those aged eighteen, indicating that this group is actively involved. Respondents aged 30 have the highest response count, which suggests that this group is highly engaged with the survey or could be a focus for targeted outreach (Sommers et al., 2020). Female participation consistently exceeds male participation across all age groups, suggesting either a demographic trend or a greater tendency for women to take part. Participation remains high even in the 65+ age group, which shows the importance of including older respondents' perspectives. Notably, there is a category above 99 years old that may need to be verified for accuracy, as well as a category for respondents who chose not to declare their age.
- ii) Gender and education
Part II: Numerical Analysis I
Calculate the descriptive statistics for all variables using the add-in data analysis in Excel
The descriptive statistics for age and the resilience index, for a sample of 139 senior respondents, show an average age of 74.4 years with a moderate spread, as indicated by a standard deviation of 7.04 years (K Bailey, 2022). The age data are modestly right-skewed, meaning that most respondents are concentrated at the younger end of this age range, with a tail extending towards the oldest ages. The resilience index has a mean of 0.53 and a median very close to the mean, so its distribution around the centre appears almost symmetrical (K Bailey, 2022). Nonetheless, the mode is lower than the mean, which indicates that the most common resilience score lies below the average.
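The measures reported by Excel's descriptive statistics tool can also be reproduced outside Excel. The following is a minimal sketch using Python's standard `statistics` module. The ages are hypothetical values invented for the example, and the helper names `sample_skewness` and `describe` are assumptions. The skewness uses the adjusted sample formula that Excel's SKEW function is documented to use.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)


def describe(values):
    """Collect the main descriptive statistics in one dictionary."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
        "std_dev": statistics.stdev(values),
        "skewness": sample_skewness(values),
    }


# Hypothetical senior ages, invented for illustration (not the survey data).
ages = [65, 66, 66, 67, 68, 70, 72, 75, 80, 88]
summary = describe(ages)
for name, value in summary.items():
    print(name, value)
```

A positive skewness value indicates a longer tail towards higher values, and in that case the mean usually lies above the median, as it does in this made-up sample.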
Part III: Numerical Analysis II
- a) Age and resilience
| | Age | resilience_index |
| --- | --- | --- |
| Age | 1 | |
| resilience_index | -0.04836 | 1 |
The diagonal of the matrix always contains ones, since a variable is perfectly correlated with itself. The correlation coefficient of about -0.04836 indicates a very weak negative association between age and resilience_index. Because this value is so close to zero, the linear relationship between age and the resilience index is negligible. In practical terms, the resilience index in this sample does not systematically increase or decrease with age.
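The coefficient in a correlation matrix like the one above is the Pearson correlation. A minimal sketch of the calculation is given below. The function name `pearson_r` and both data lists are hypothetical and serve only to show the mechanics, so the printed value is not expected to match the reported coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, invented for illustration (not the survey data).
age = [66, 70, 74, 79, 85]
resilience = [0.55, 0.48, 0.60, 0.51, 0.52]

print(pearson_r(age, age))         # a variable with itself: the diagonal entry
print(pearson_r(age, resilience))  # the off-diagonal entry
```

The first call illustrates why the diagonal is always 1, and the second corresponds to the single off-diagonal value in a two-variable matrix.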
- b)
Education and gender
| | Gender | Education |
| --- | --- | --- |
| Gender | 1 | |
| Education | 0.060199 | 1 |
The diagonal entries of the matrix are 1 because every variable is perfectly correlated with itself (Portillo et al., 2020). The off-diagonal value, the correlation coefficient between gender and education, is 0.060199, which indicates a very weak positive association. In practical terms, there is little to no linear relationship between gender and education in this dataset. The positive sign means that higher values of one variable tend to go with marginally higher values of the other (Endri and Fathony, 2020). Because the coefficient is so close to zero, the association is of little practical importance, and whether it is statistically significant depends on the sample size as well as on the size of the coefficient.
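Whether a coefficient of this size is statistically significant depends on the number of observations, which the text does not state for this correlation. The sketch below computes the usual t statistic for a Pearson coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2), for several assumed sample sizes. The values of `n` are assumptions chosen for illustration, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
import math


def correlation_t_stat(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def approx_two_sided_p(t):
    """Two-sided p-value from the normal approximation (large samples only)."""
    return math.erfc(abs(t) / math.sqrt(2))


r = 0.060199  # coefficient reported for gender and education

# Assumed sample sizes: the text does not say which n underlies this coefficient.
for n in (139, 1000, 5000):
    t = correlation_t_stat(r, n)
    p = approx_two_sided_p(t)
    print(f"n={n}: t={t:.3f}, approximate p={p:.4f}")
```

The same coefficient gives a larger t statistic as the assumed sample grows, so closeness to zero alone does not settle the question of significance.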
Income quintiles and gender
| | Gender | IncomeFeelings |
| --- | --- | --- |
| Gender | 1 | |
| IncomeFeelings | 0.22919444 | 1 |
As in the previous matrices, the diagonal entries are 1 because each variable is perfectly correlated with itself. The off-diagonal coefficient of 0.22919444 indicates a weak positive association between gender and IncomeFeelings (MM Amoli, 2020). Although this is larger than the coefficient of 0.060199 for gender and education, it is still far from 1, so gender accounts for only a small share of the variation in how respondents feel about their income. Because both variables are coded categories, the sign of the coefficient depends on the coding used and shows only that one gender tends to report slightly higher IncomeFeelings codes than the other.
c)
The data showed a greater percentage of female respondents, but without gender-specific resilience indicators it is unclear whether resilience differs between males and females (Lundorff et al., 2020). Similarly, although gender and education showed a very weak positive association, this does not tell us how resilience varies across levels of educational attainment. It was not possible to analyse resilience across different income levels because no income data were available. Resilience scores were symmetrically distributed around the mean, while the age data showed a wide representation across the senior age range and a slight right skew (Manimaran et al., 2023). This points to a population in which ageing does not significantly affect the moderate degree of resilience exhibited by older adults. The overall picture may be one of a society in which older people maintain their resilience, possibly supported by social, cultural, or health policies that offer equal assistance to people of all ages and educational levels.
References
- Anders Lindanger, M. Marisaldi, Sarria, D., Nikolai Østgaard, Lehtinen, N.G., Chris Alexander Skeie, A. Mezentzev, P. Kochkin, K. Ullaland, Yang, S., Georgi Genov, Carlson, B.E., Christoph Köhn, J. Navarro-Gonzalez, Connell, P., V. Reglero and Neubert, T. (2021). Spectral Analysis of Individual Terrestrial Gamma-Ray Flashes Detected by ASIM. Journal of Geophysical Research: Atmospheres, 126(23). doi:https://doi.org/10.1029/2021jd035347.
- Endri, E. and Fathony, M. (2020). Determinants of firm's value: Evidence from financial industry. Management Science Letters, [online] 10(1), pp.111–120. Available at: http://m.growingscience.com/beta/msl/3393-determinants-of-firms-value-evidence-from-financial-industry.html.
- K Bailey (2022). Evaluating the Impact of Post-Traumatic Stress Disorder on Suicidal and Self-Injurious Behaviors in Individuals with Borderline Personality Disorder - ProQuest. [online] www.proquest.com. Available at: https://search.proquest.com/openview/bd551f2595442a7d1341b6510eaf90c9/1?pq-origsite=gscholar&cbl=18750&diss=y [Accessed 23 Nov. 2023].
- Lundorff, M., Bonanno, G.A., Johannsen, M. and O'Connor, M. (2020). Are there gender differences in prolonged grief trajectories? A registry-sampled cohort study. Journal of Psychiatric Research, 129, pp.168–175. doi:https://doi.org/10.1016/j.jpsychires.2020.06.030.
- Makwashi, N., Barros, D.S., Sarkodie, K., Zhao, D. and Diaz, P.A. (2019). Depositional Behaviour of Highly Macro-Crystalline Waxy Crude Oil Blended with Polymer Inhibitors in a Pipe with a 45-Degree Bend. Day 4 Fri, September 06, 2019. doi:https://doi.org/10.2118/195752-ms.
- Manimaran, M., Dhar, M., Norabuena-Figueroa, R., Mahaveerakannan, R., Saraswathi, S. and Selvakumarasamy, K. (2023). Implementing Machine Learning-based Autonomic Cyber Defense for IoT-enabled Healthcare Devices. Journal of Artificial Intelligence and Technology, [online] 3(4), pp.162–172. doi:https://doi.org/10.37965/jait.2023.0209.
- MM Amoli (2020). The Effect of Medicaid Expansion on Diabetes Care - ProQuest. [online] www.proquest.com. Available at: https://search.proquest.com/openview/1cb41e2a42dfbd325423baf783e93ab4/1?pq-origsite=gscholar&cbl=18750&diss=y [Accessed 23 Nov. 2023].
- Portillo, J., Garay, U., Tejada, E. and Bilbao, N. (2020). Self-Perception of the Digital Competence of Educators during the COVID-19 Pandemic: A Cross-Analysis of Different Educational Stages. Sustainability, 12(23), p.10128. doi:https://doi.org/10.3390/su122310128.
- Régner, I., Thinus-Blanc, C., Netter, A., Schmader, T. and Huguet, P. (2019). Committees with implicit biases promote fewer women when they do not believe gender bias exists. Nature Human Behaviour, 3(11), pp.1171–1179. doi:https://doi.org/10.1038/s41562-019-0686-3.
- ResearchGate (2020). (PDF) Sample Size for Survey Research: Review and Recommendations. [online] ResearchGate. Available at: https://www.researchgate.net/publication/343303677_Sample_Size_for_Survey_Research_Review_and_Recommendations.
- Sidiropoulou, M., Gerogianni, G., Kourti, F.E., Pappa, D., Zartaloudi, A., Koutelekos, I., Dousis, E., Margari, N., Mangoulia, P., Ferentinou, E., Giga, A., Zografakis-Sfakianakis, M. and Dafogianni, C. (2022). Perceptions, Knowledge and Attitudes among Young Adults about Prevention of HPV Infection and Immunization. Healthcare, 10(9), p.1721. doi:https://doi.org/10.3390/healthcare10091721.
- Sommers, B.D., Chen, L., Blendon, R.J., Orav, E.J. and Epstein, A.M. (2020). Medicaid Work Requirements In Arkansas: Two-Year Impacts On Coverage, Employment, And Affordability Of Care. Health Affairs, 39(9), pp.1522–1530. doi:https://doi.org/10.1377/hlthaff.2020.00538.