1. Department of Nutrition, Gillings School of Global Public Health, UNC Center for Health Promotion and Disease Prevention, University of North Carolina-Chapel Hill, 2200 McGavran-Greenberg Hall, Chapel Hill, NC;
2. Department of Public Health, East Carolina University, Lakeside Annex 8, Room 126, Greenville, NC;
3. UCLA Fielding School of Public Health, University of California, Los Angeles, CA;
4. Department of Kinesiology & Nutritional Science, California State University, Los Angeles, CA
Received: 25 February 2015
Accepted: 17 August 2015
Published: 26 August 2015
Efforts to transform corner stores to better meet community dietary needs have mostly occurred in urban areas but are also needed in rural areas. Given important contextual differences between urban and rural areas, it is important to understand which elements might translate successfully to similar interventions involving stores in more rural areas. Thus, an in-depth examination and comparison of corner stores in each setting is needed. A mixed methods approach, including windshield tours, spatial visualization with analysis of frequency distribution, and spatial regression techniques, was used to compare a rural (North Carolina) and a large urban (Los Angeles) food environment. Important similarities and differences were seen between the two settings with regard to food environment context, spatial distribution of stores, food products available, and the factors predicting corner store density. Urban stores were more likely to have fresh fruits (Pearson chi2 = 27.0423; p < 0.001) and vegetables (Pearson chi2 = 27.0423; p < 0.001). In the urban setting, corner stores in high income areas were more likely to have fresh fruit (Pearson chi2 = 6.00; p = 0.014), while in the rural setting there was no difference between high and low income areas in fresh fruit availability. For the urban area, total population, no vehicle, and Hispanic population were significantly positively associated (p < 0.05), and median household income (p < 0.001) and percent minority (p < 0.05) were significantly negatively associated, with corner store count. For the rural area, total population (p < 0.05) and supermarket count (p < 0.001) were positively associated, and median household income was negatively associated (p < 0.001), with corner store count. Translational efforts should be informed by these findings, which might influence the success of future interventions and policies in both rural and urban contexts.
Citation: Jared T McGuirt, Stephanie B. Jilcott Pitts, Alice Ammerman, Michael Prelip, Kathryn Hillstrom, Rosa-Elena Garcia, William J. McCarthy. A Mixed Methods Comparison of Urban and Rural Retail Corner Stores[J]. AIMS Public Health, 2015, 2(3): 554-582. doi: 10.3934/publichealth.2015.3.554
1. Introduction
It is well established that the COVID-19 pandemic, caused by the novel coronavirus (SARS-CoV-2), has affected people's mental health and behavior worldwide [1],[2]. Furthermore, preventive measures such as isolation and quarantine aggravated the problem, and people experienced significant levels of anxiety, anger, confusion, and stress [3]. One of the groups most affected by the pandemic and its consequences was young adults enrolled in higher education, as they faced additional uncertainty regarding academic success, future careers, and social life during college, among other concerns [4]. The psychological health issues of this group have become a primary concern of psychological health practitioners and researchers across the world.
Psychological problems may be very complex in nature and may have long-lasting effects. As such, clear and appropriate identification of these problems is essential to dealing with them. The choice of an appropriate tool is the first step toward identifying the problem. One of the standard tools used by researchers is a questionnaire designed for a specific method and a specific target group [5]. For example, the Patient Health Questionnaire (PHQ-9) is a 9-item questionnaire that is widely used to measure the severity of depression [6]. The Generalized Anxiety Disorder Scale-7 (GAD-7) is a 7-item, self-rated screening tool used for generalized anxiety disorder [7]. These tools can be administered to groups of respondents as well as to an individual respondent.
The Strengths and Difficulties Questionnaire (SDQ) is a brief instrument that measures a respondent's psychological behavior problems and social dysfunction, assessing strengths and difficulties simultaneously [8],[9]. There are many versions of the SDQ, designed according to the needs of different target groups. The 4–11 years version is completed by the parent/teacher of the subject. The 11–17 years version is completed by the subject as well as by their parent/teacher. The 17+ version, which has been used in this study, is a self-assessment questionnaire. Currently, there are three versions of the SDQ for each of these age groups: a short/basic version with 25 items, a longer/extended version with an impact supplement, and an extended version with an added follow-up form. The 25 items of the basic version are grouped into five scales: the first scale (prosocial behavior) is the strength scale, and the remaining four are difficulty scales (namely, “conduct problems”, “peer problems”, “emotional symptoms”, and “hyperactivity-inattention”). The extended versions of the SDQ further enquire about chronicity, distress, social impairment, and burden to others through items 28–33. These five items, along with item 27, are answered only if the response to item 26 is “yes” (i.e., if the respondent feels difficulties in the areas of emotions, concentration, behavior, or being able to get along with other people). Item 27 measures the duration of distress, and item 33 measures the burden of distress on the family and friends of the respondent.
A useful analysis of psychological data requires identifying and applying an appropriate statistical technique. Psychological data are generally categorical in nature, and many quality-of-life scales are ordinal. Earlier works have found ordinal and multinomial regression models quite useful for estimating categorical response variables from independent predictors. Previous works have also suggested that the classification for medical diagnosis is ordered, corresponding to the level of health risk. Ordinal regression (OR) models provide an appropriate strategy for analysing the effects of multiple explanatory variables on an ordered, observed categorical outcome that cannot be assumed to be a continuous measurement with a normal distribution [10]. In OR analysis, link functions are used to build specific models. Commonly used link functions include the logit, complementary log-log, negative log-log, probit, and Cauchit links, which are chosen on the basis of the characteristics of the underlying data. Generally, the logit link is considered suitable for ordered categorical data that are evenly distributed among all categories; the complementary log-log link is often used when higher categories are more probable, whereas the negative log-log link is used when lower categories are more probable [11].
OR models have been used frequently with medical data, and a vast literature exists on applications of OR models and their variants to medical and biostatistical data. To name a few, the proportional odds and partial proportional odds models have been used by Lall et al. (2002) to study cognitive function health and aging [12]; by Liu et al. (2018) in diabetic retinopathy (DR) diagnosis with five risk levels [13] and in breast cancer imaging reporting [14]; by French & Shotwell (2022) to assess COVID-19 status 14 days after a randomization test on a seven-point scale [15]; and by Wolde et al. (2022) to study three levels of hypertension [16].
In this study, OR has been used to estimate the categories of distress resulting in social dysfunction using the impact scores of the SDQ. Using the SDQ 17+ extended version, two surveys were conducted during the COVID-19 pandemic: the first during May–June 2020, and the second during October 2020–February 2021. The aim of the surveys was to assess the impact of COVID-19 on the mental health of 18–25-year-old college/university students. The numbers of responses in the two surveys were 1,020 and 743, respectively. Data reliability was tested using Cronbach's alpha and Guttman's lambda. The questionnaire had two components, namely the “Difficulty” and “Impact” scores of the SDQ, which measure behavioral problems and social dysfunction, respectively. Furthermore, a study was conducted to understand whether the two scores lead to similar conclusions about the mental health of the respondents, under the hypothesis that the impact scores in the “Normal”, “Borderline”, and “Abnormal” bands can be estimated from “Difficulty” scores in the same bands. The hypothesis was tested by formulating ordinal models that estimate the probability/category of the impact score for every participant from four independent predictors (conduct problems, peer problems, emotional symptoms, and hyperactivity-inattention) using a negative log-log link function of the form −ln(−ln(Fk(xi))). The fit of the models was assessed with the Cox and Snell, Nagelkerke, and McFadden statistics. The significance of each predictor was obtained using Wald statistics. Significant factors obtained for each category were compared to the base stage and the cutoff points. Using the fitted model, the category of distress of each respondent was predicted. The assumption of parallel lines was tested, since the model assumes that the odds ratio is the same across the categories of distress. Finally, the predicted categories were compared with the observed categories.
The novelty of the study is that the population under investigation was not unhealthy: these were psychologically healthy individuals facing unprecedented, unhealthy times. The study collected data from the same population twice, at a gap of one year, when the severity of the effect of the pandemic was not the same in the Indian subcontinent. The study clearly indicated the effect of the pandemic on the psychological health of the respondents; additionally, it estimated the predictive efficiency of the behavioral scales for the social dysfunction of these respondents during pandemic times. To the best of our knowledge, this is the first study of its kind in India involving statistical modeling based on two surveys conducted during pandemic times on the same population throughout the country. The rest of the paper is organized as follows: materials and methods are explained in Section 2, and results are presented in Section 3, followed by a discussion in Section 4 and a conclusion in Section 5.
2. Materials and methods
2.1. Material
During the COVID-19 pandemic period, data were collected through two surveys conducted in online and offline modes, on students studying in various colleges and higher educational institutes across India using the SDQ 17+ self-reported extended version. The surveys were conducted as follows: i) in the months of May–June 2020 almost two months after a nationwide lockdown was imposed; and ii) in the months of October 2020 to February 2021. The first survey was conducted entirely in the online mode and 1,020 students participated in the study. The survey gathered information on demographic variables such as age and gender, and 33 items of the SDQ 17+ questionnaire. The second survey was conducted both in online and physical modes and 743 undergraduate and postgraduate students participated in it. The questionnaire was divided into two sections. The first section had questions regarding the demographic details of the respondents such as their age, gender, place of living, family composition, and family income, along with details of the direct impact of COVID-19 in terms of the occurrence of the disease and resulting hospitalization in the family (including themselves) of the respondents. The second section (common in surveys 1 and 2) of the questionnaire was based on the SDQ 17+ extended version. The SDQ scores were categorized according to the standard classification of cut-off points in the SDQ manual [17].
2.2. Methods
OR models belong to the class of generalized linear regression models as they allow for a more generalized distribution of error terms that differs from the normal distribution of errors. OR models are used to predict ordinal-level dependent variables with a set of independent variables. The first category is usually considered the lowest category, the last category is the highest category (numerically coded from 0 on up), and the independent variable may be either categorical or continuous [18].
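Let $y_i$ be the response of the $i$th individual, $i = 1, 2, \ldots, n$, and let $y_i^*$ be the corresponding latent variable. The OR model makes the assumption that $y_i^*$ (and not $y_i$) depends on $x_i$, i.e.

$$y_i^* = x_i'\tilde{\beta} + \epsilon_i; \quad i = 1, 2, \ldots, n,$$

where $\tilde{\beta}$ is the vector of regression coefficients to be estimated and $y_i^*$ is the unobserved dependent variable. The relationship between $y_i^*$ and the observed variable $y$ is as follows:

$$y = \begin{cases} 1 & \text{if } 0 \le y^* \le \theta_1 \\ 2 & \text{if } \theta_1 \le y^* \le \theta_2 \\ \vdots & \\ N & \text{if } \theta_{N-1} \le y^*. \end{cases}$$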
Let $p_1(x_i), p_2(x_i), \ldots, p_K(x_i)$ denote the response probabilities of the $K$ ordered categories at the values $x_i$ of a set of explanatory variables. The cumulative probabilities are given by:

$$F_k(x_i) = P(y_i \le k \mid x_i) = p_1(x_i) + p_2(x_i) + \cdots + p_k(x_i), \quad k = 1, 2, \ldots, K.$$

The parameters $\alpha_1, \alpha_2, \ldots, \alpha_{K-1}$ are non-decreasing in $k$ and are known as the intercepts or the “cut-points”. The parameter vector $\tilde{\beta}$ contains the regression coefficients for the covariate vector $\tilde{x}_i$. Inherent in this model is the proportional odds assumption, also called the “parallel lines assumption”: the cumulative odds ratio for any two values of the covariates is constant across response categories. There is one regression equation for each category except the last; the probability of the last category is obtained as one minus the cumulative probability of the second-last category.
The model contains $K-1$ response curves of the same shape, so it cannot be fitted by fitting separate logit models for each cut-point. Instead, the multinomial likelihood is maximized subject to constraints. The model assumes that the effects of the variables are the same for each cut-point, $k = 1, 2, \ldots, K-1$.
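As a sketch, the cumulative link model described above can be written in the following generic form, where $g$ stands for any link function applied to the cumulative probability $F_k(x_i)$. The minus sign in front of the linear predictor is an assumption (a common convention in which the linear predictor is subtracted from the cut-point); some texts add it instead, which only flips the sign of the coefficients. The second line shows the special case of the negative log-log link.

```latex
g\bigl(F_k(x_i)\bigr) = \alpha_k - x_i'\tilde{\beta},
\qquad k = 1, 2, \ldots, K-1,
\qquad p_K(x_i) = 1 - F_{K-1}(x_i)

-\ln\bigl(-\ln F_k(x_i)\bigr) = \alpha_k - x_i'\tilde{\beta}
\;\Longleftrightarrow\;
F_k(x_i) = \exp\bigl(-\exp\bigl(-(\alpha_k - x_i'\tilde{\beta})\bigr)\bigr)
```

Because the same $\tilde{\beta}$ appears for every $k$, only the cut-points $\alpha_k$ differ between the $K-1$ equations, which is what makes the response curves parallel.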
One advantage of an ordered analysis over the corresponding nominal analysis is that, generally, fewer parameters are needed to describe a model for the response [20]. As a result, the ordinal regression models are more powerful.
In order to fit generalized linear models to ordinal response outcomes, distinct “link functions” are used to link the (cumulative) response to the set of predictor variables. The available link functions are listed in Table 1 [11].
Table 1. Various link functions used in ordinal regression methods.

| Link function | Form | Conditions to be used |
| --- | --- | --- |
| Logit | $\ln\left(\dfrac{F_k(x_i)}{1 - F_k(x_i)}\right)$ | Categorical data is evenly distributed among all categories. Here, the errors are distributed according to a logistic distribution. |
| Probit | $\Phi^{-1}(F_k(x_i))$ | Probit regression assumes that the errors are distributed normally. |
| Complementary log-log | $\ln(-\ln(1 - F_k(x_i)))$ | For skewed data, when higher categories are more probable. |
| Negative log-log | $-\ln(-\ln(F_k(x_i)))$ | For skewed data, when lower categories are more probable. |
| Cauchit | $\tan(\pi(F_k(x_i) - 0.5))$ | This type of link bears the same relation to the Cauchy distribution as the probit link bears to the normal. One characteristic of this link function is that the tail is heavier relative to the other links. |
Norusis (2012) [21] suggests that the choice of link function should be based on the distribution of the response variable. In this study, we have used the negative log-log link function [22].
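The five link functions in Table 1 can be illustrated with a minimal Python sketch that uses only the standard library. The dictionary name `LINKS` and the helper `inverse_nloglog` are illustrative names, not part of the study's code; each function maps a cumulative probability F in (0, 1) to the real line.

```python
import math
from statistics import NormalDist

# Each link maps a cumulative probability F in (0, 1) to the real line.
# The formulas follow Table 1; the names are illustrative only.
LINKS = {
    "logit": lambda F: math.log(F / (1.0 - F)),
    "probit": lambda F: NormalDist().inv_cdf(F),
    "cloglog": lambda F: math.log(-math.log(1.0 - F)),
    "nloglog": lambda F: -math.log(-math.log(F)),
    "cauchit": lambda F: math.tan(math.pi * (F - 0.5)),
}


def inverse_nloglog(eta):
    """Invert the negative log-log link: F = exp(-exp(-eta))."""
    return math.exp(-math.exp(-eta))


if __name__ == "__main__":
    # Print every link evaluated at a few cumulative probabilities.
    for F in (0.1, 0.5, 0.9):
        row = ", ".join(f"{name}={link(F):+.3f}" for name, link in LINKS.items())
        print(f"F={F}: {row}")
```

The logit, probit, and Cauchit links are symmetric around F = 0.5, whereas the two log-log links are not, which is why they are suggested for skewed response distributions.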
2.2.1. Parallel lines assumption
In OR models, an important assumption is that the relationship between the independent variables and the dependent variable does not change across the dependent variable's categories; that is, the parameter estimates do not change across cut-points. In other words, the regression lines for the categories of the dependent variable are parallel to each other. The likelihood ratio test, the Wald chi-square test, and other related tests are used to test the parallel lines assumption [23],[24]. In OR, these tests examine the equality of the coefficients across the different categories and decide whether the assumption holds. If the assumption does not hold, interpretations of the results will be wrong; in that case, alternative models are used instead of the ordinal logit regression model. For every independent variable, the null hypothesis that its coefficients are equal across categories is tested:
$$H_0: \beta_{1j} = \beta_{2j} = \cdots = \beta_{(k-1)j} = \beta_j; \quad j = 1, 2, \ldots, J$$
2.3. The goodness-of-fit tests
The null hypothesis for the goodness-of-fit tests is that the model fits the data well against the alternative hypothesis, which refers to an unspecific problem with the fit. Thus, a small p-value is an indication of lack of fit of the model. The following are the three pseudo-R2 statistics for OR.
Table 2. Test statistics for testing the goodness of fit of an ordinal model.

| Test | Formula | Explanation |
| --- | --- | --- |
| McFadden's $R^2$ | $R^2_L = 1 - \dfrac{LL_{model}}{LL_0}$ | This is the natural logarithmic linear ratio $R^2$. A value close to 0 indicates that the model has no predictive value. |
| Cox and Snell's $R^2$ | $R^2_{CS} = 1 - \left(\dfrac{L_0}{L_{model}}\right)^{2/n}$, where $n$ = sample size | This is a “generalized” $R^2$ (used in linear regression as well) rather than a pseudo $R^2$. A problem with this $R^2$ is that the upper bound of this statistic, given by $1 - \left(p^p (1-p)^{1-p}\right)^2$, is less than 1, where $p$ is the marginal proportion of cases with events. |
| Nagelkerke's $R^2$ | $R^2_{Nagel} = \dfrac{R^2_{CS}}{1 - e^{2LL_0/n}}$ | It measures the proportion of the total variation of the dependent variable that can be explained by the independent variables. |

$LL_{model}$ = log-likelihood of the full model including all coefficients (depending on the number of predictors);
$LL_0$ = log-likelihood of the model with fewer coefficients (the model with only the intercept $b_0$), with $\ln(L_0)$ being analogous to the residual sum of squares in linear regression. $L_0$ and $L_{model}$ denote the corresponding likelihoods, so that $LL = \ln(L)$.
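To make the three statistics in Table 2 concrete, the following minimal Python sketch computes them from a pair of log-likelihoods. The function name and the numerical inputs are hypothetical and are not taken from the study; the Cox and Snell statistic is written in its log-likelihood form, 1 − exp(2(LL0 − LLmodel)/n), which equals one minus the likelihood ratio raised to the power 2/n.

```python
import math


def pseudo_r2(ll_null, ll_model, n):
    """Return McFadden, Cox-Snell and Nagelkerke R^2 from two log-likelihoods.

    ll_null  : log-likelihood of the intercept-only model (LL0)
    ll_model : log-likelihood of the full model (LLmodel)
    n        : sample size
    """
    mcfadden = 1.0 - ll_model / ll_null
    # (L0 / Lmodel)^(2/n) expressed with log-likelihoods to avoid underflow.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Rescale Cox-Snell by its maximum attainable value, 1 - L0^(2/n).
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
    return {"mcfadden": mcfadden, "cox_snell": cox_snell, "nagelkerke": nagelkerke}


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic.
    for name, value in pseudo_r2(ll_null=-100.0, ll_model=-80.0, n=100).items():
        print(f"{name}: {value:.4f}")
```

Working with log-likelihoods rather than raw likelihoods is a deliberate choice here: for realistic sample sizes the likelihood itself is often too small to represent in floating point.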
3. Results
In order to study the effect of COVID-19 on the psychological health of college/university students, two surveys were conducted in online and offline modes using the SDQ 17+ extended version. A total of 1,020 and 743 students participated in survey 1 and survey 2, respectively. Among these, 462 (45.29%) and 383 (51.55%) were males, and 558 (54.71%) and 360 (48.45%) were females, in survey 1 and survey 2, respectively. The participants came from several streams, viz. humanities, commerce, sciences, law, management, engineering, medicine, nursing, and interns. All the responses were scored according to the SDQ manual. The scores on all five SDQ scales were valid for all participants in both surveys. Table 3 below presents the descriptive statistics, stratified by gender, for both surveys: first the five scales of five items each, and then the “Impact” scores for only those respondents who answered “yes” to item 26.
The SDQ was designed to screen for behavioral problems in youths based on cutoff points that favor the instrument's diagnostic sensitivity [9],[16]. The distribution of all respondents in both surveys across the three SDQ categories defined by these cutoff points is displayed graphically. Figure 1a presents the “Normal”, “Borderline”, and “Abnormal” categories, defined by the cutoff points of the “Difficulty” score, in the two surveys. It can be observed from Figure 1a that students with lower scores are more frequent than students with higher scores. However, more than 30% of respondents are in the affected groups (facing behavioral problems). Figure 1b depicts the proportion of respondents in each category of the “Impact” score in the two surveys, viz. “No distress”, “Normal”, “Borderline”, and “Abnormal”. It can be observed that students with a score < 1 are in the Normal band (either the answer to item 26 is “no” or the impact score is 0), and more than 45% are in the affected groups (i.e., facing social dysfunction during the surveys).
Table 3. Descriptive statistics of the two surveys giving the mean, standard deviation, median, mode, minimum, and maximum of the five strength and difficulty scales, and of the Difficulty and Impact scores.
3.1. The probability/category of impact Score of every respondent with the ordinal regression model
Ordinal models have been applied to estimate the probability/category of the impact score with the following independent predictors for every participant: conduct problem, peer problem, emotional symptoms, and hyperactivity-inattention. The difficulty scores of those respondents have been considered whose impact scores are available. The data (Figure 1a,b) suggest that the lower values of the impact score have a higher frequency than the higher values. Thus, the negative log-log link function is most appropriate for the OR model to be used.
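As a first step of the OR analysis, the intercept-only model is compared with the full model. The null hypothesis is that the intercept-only model fits the data as well as the full model (all predictor coefficients are zero); the alternative hypothesis is that the full model fits better (at least one predictor coefficient is nonzero).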
The full model was found to be good for both surveys with a p-value < 0.001. Furthermore, the fitting of OR models with the negative log-log link function is tested using Cox and Snell, Nagelkerke, and McFadden test. The models were found to be appropriate for both surveys with p-values > 0.05. The upper bound of Cox and Snell R2 was found to be 0.952248 for p = 0.44 in survey 1 and 0.729325 for p = 0.36 in survey 2.
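The comparison of the intercept-only model with the full model is typically carried out as a likelihood-ratio test, which can be sketched as follows. This is an illustration under stated assumptions, not the study's computation: the log-likelihood values are hypothetical, the degrees of freedom are set to 4 to match the four predictors, and the chi-square tail probability uses a closed form that is valid only for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))


def likelihood_ratio_test(ll_null, ll_model, df):
    """Return the likelihood-ratio statistic and its p-value."""
    stat = 2.0 * (ll_model - ll_null)
    return stat, chi2_sf_even_df(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods; df = 4 predictors in the full model.
    stat, p_value = likelihood_ratio_test(ll_null=-950.0, ll_model=-900.0, df=4)
    print(f"LR statistic = {stat:.2f}, p-value = {p_value:.3g}")
```

A small p-value leads to rejecting the intercept-only model in favor of the full model.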
3.2. Social Dysfunction estimated with behavioral problems using Ordinal Regression
In the present study, the interest in applying OR lies in deciding whether or not the predictors contribute to the predictive efficiency of the model. The regression coefficients for the hyperactivity-inattention, conduct problems, emotional symptoms, and peer problems factors give the size of the effect that each variable has on the dependent variable, and the sign of the coefficient gives the direction of the effect. In survey 1, hyperactivity-inattention and emotional symptoms were found to be significant contributors for estimating a respondent's probability of belonging to a category (p < 0.05). In survey 2, peer problems, hyperactivity-inattention, and emotional symptoms were significant contributors (p < 0.05). Conduct problems was not a significant factor in either survey. If the response variable takes the value 0, the respondent is in the Normal category (distress is not causing social dysfunction); if the response variable takes the value 1 or 2, the respondent is in the Borderline or Abnormal category, respectively (presence of social dysfunction). The detailed results are given below in Table 5.
Table 5. Ordinal regression showing the partial effects of the components of the difficulty scales on the impact scores of the participants in the two surveys.
OR models are based on the assumption of parallel lines (i.e., parameter estimations do not change for cut points). In other words, the dependent variable's categories are parallel. The assumption is needed for an accurate interpretation of the results. To test this assumption, the following null and alternative hypotheses were set:
H0: The slope coefficients of predictors in the model are the same across all response categories.
H1: The slope coefficients of predictors in the model are not the same at least for one of the response categories.
The significance values are found to be 0.071 and 0.251 for surveys 1 and 2, respectively. Since both values exceed 0.05, the null hypothesis is not rejected, and the proportional odds/parallel lines assumption is retained. The detailed results are presented in Table 6 below.
3.3. Comparison of the estimated categories with the observed ones
The principal objective of the study is to estimate impact scores with behavioral problem (difficulty) scores. For this, the probability of each category of impact score (indicating social dysfunction) through behavioral problems has been computed for all the respondents. The criterion for categorization of distress is that the probability of that category should be highest among all the categories. The comparison between the observed and estimated bands of impact scores for both surveys is presented in Table 7 below.
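The categorization rule described above (assign each respondent to the category with the highest estimated probability) can be sketched in Python for a negative log-log cumulative model. Everything numerical here is hypothetical: the coefficients, cut-points, and example scores are invented for illustration, and the sign convention (cut-point minus linear predictor) is an assumption rather than something stated in the text.

```python
import math

BANDS = ["Normal", "Borderline", "Abnormal"]


def category_probabilities(eta, cutpoints):
    """Category probabilities under a negative log-log cumulative model.

    Assumes F_k = exp(-exp(-(alpha_k - eta))) with non-decreasing cut-points;
    the last category receives 1 - F_{K-1}.
    """
    cumulative = [math.exp(-math.exp(-(alpha - eta))) for alpha in cutpoints]
    cumulative.append(1.0)
    probs, previous = [], 0.0
    for F in cumulative:
        probs.append(F - previous)
        previous = F
    return probs


def predict_category(eta, cutpoints):
    """Index of the category with the highest estimated probability."""
    probs = category_probabilities(eta, cutpoints)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Hypothetical coefficients and cut-points, for illustration only.
    beta = {"conduct": 0.05, "peer": 0.10, "emotional": 0.20, "hyperactivity": 0.15}
    cutpoints = [1.0, 1.8]
    scores = {"conduct": 2, "peer": 3, "emotional": 4, "hyperactivity": 5}
    eta = sum(beta[name] * scores[name] for name in beta)
    probs = category_probabilities(eta, cutpoints)
    print([round(p, 3) for p in probs], BANDS[predict_category(eta, cutpoints)])
```

Because the rule picks the single most probable category, a middle category with a narrow gap between its two cut-points is rarely selected.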
The OR model estimated the observed normal band as the normal category and the observed advance band as the advance category with almost 70% accuracy. The model has good predictive power, but it fails to estimate the slightly raised (Borderline) band under all the categories, despite being an appropriate model in terms of the prerequisites listed in Table 1. Furthermore, it is clear from the estimated results that there were young adults (16.5% in survey 1 and 30.5% in survey 2) whose difficulty score was in the normal band but who still faced the advance level of social dysfunction. This means that for these participants, the difficulty scores were less than 15; however, they had a problem rated “a great deal” in at least one area of behavior problems, resulting in an abnormal level of distress causing social dysfunction. On the other hand, among respondents in the advance category of behavioral problems, almost everyone experienced distress (more than 90% in survey 1 and 99% in survey 2). All analyses were done in SPSS, version 26, and R, version 4.2.1.
Table 7. Comparison between the observed and estimated bands of impact scores for both surveys.
4. Discussion
The subjects of this study were young adults who were otherwise psychologically healthy; however, they were facing unprecedented and difficult circumstances during the COVID-19 pandemic period. They were investigated to determine the effect of the pandemic on their psychological health. The SDQ (extended version) was chosen for data collection due to its effectiveness and reliability in studying behavioral problems, as well as social dysfunction, in a generally healthy population of young adults.
In order to study the statistical relationship between the categories of the two components of SDQ scores, namely the difficulty and the impact scores, the OR model was selected because it is a robust technique when the response variable is an ordered variable with a few mutually exclusive categories that can be ordered by their clinical preference. This model has been used repeatedly in medical and biostatistical studies and has been useful in estimating the output variable (stages of disease) in diseases like cancer and chronic kidney disease with independent predictors. The models have been applied in COVID-19 related studies with aims such as the identification of factors responsible for COVID-19 infection, and the effect of space on these factors, by application of a geographically weighted ordinal logistic regression model [25], and the effect of various treatments for the disease by assessing COVID-19 status 14 days after a randomization test on a seven-point scale [15].
The choice of an appropriate link function is of crucial importance in OR. As the number of respondents was highest in the normal categories for both social dysfunction and behavioral problems, the most suitable link function is the negative log-log link, which is used when the lower values are more probable. The statistical relationship was examined by assessing how efficiently the predictors (hyperactivity-inattention, conduct problems, peer problems, and emotional symptoms scores) predict the level of distress causing social dysfunction. One of the strengths of OR models is that they consider the items and participants, incorporating all data information into the model, and control for dependencies between ratings from the same person and between ratings of the same item. The parameters include multiple intercepts, which are the thresholds (cut points).
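To make the role of the thresholds and the negative log-log link concrete, the sketch below turns a set of subscale scores into band probabilities. It is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the cumulative form −ln(−ln(F_k)) = θ_k − Σβx (the parameterization is not given in the text), it takes the Survey 1 thresholds and coefficients from the parameter-estimates table, and the respondent's scores and the function name `band_probabilities` are hypothetical.

```python
import math


def band_probabilities(thresholds, coefficients, scores):
    """Return the probability of each ordered band under a negative log-log link.

    Assumed cumulative form: -ln(-ln(F_k)) = theta_k - eta, with eta = sum(beta * x),
    so that F_k = exp(-exp(-(theta_k - eta))).
    """
    eta = sum(coefficients[name] * scores[name] for name in coefficients)
    cumulative = [math.exp(-math.exp(-(theta - eta))) for theta in thresholds]
    cumulative.append(1.0)  # the highest band closes the distribution
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Survey 1 estimates as reported in the parameter-estimates table.
THRESHOLDS = [1.217, 1.810]
COEFFICIENTS = {"hyperactivity": 0.096, "emotional": 0.176,
                "conduct": -0.030, "peer": 0.023}

# Hypothetical respondent, used only for illustration.
scores = {"hyperactivity": 2, "conduct": 5, "peer": 7, "emotional": 4}

probs = band_probabilities(THRESHOLDS, COEFFICIENTS, scores)
for band, p in zip(("no problem", "slightly raised", "advanced"), probs):
    print(f"{band}: {p:.3f}")
```

Under these assumptions, the probability of the middle band is the difference between two cumulative curves whose thresholds lie close together, so it tends to be small relative to the two outer bands.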
The cut points in the data of the present study indicate the levels of distress of the respondents. The applied model correctly estimated the distress category of about 70% of respondents, so its predictive efficiency was quite good. However, respondents with a “Borderline” difficulty score were either under- or over-estimated by this model. This is due to the complex and multi-component data collected through the SDQ questionnaire, which are not only subjective in nature but also take values in limited and narrow categories. For an impact score to lie in the “Abnormal” band, the respondent has either at least two problem areas in the “quite a lot” category or at least one problem area in the “a great deal” category, while some other areas may be in the “not at all” or “only a little” categories. For an impact score to lie in the “borderline” band, the respondent has at most one problem area in the “quite a lot” category and “no problem” in other areas. The independent predictors, however, contribute to the difficulty score of the respondent, which indicates the behavioral problem. For a difficulty score to lie in the “borderline” band, the respondent has to have two or three problem areas in the “quite a lot” category, or two problem areas, of which one is in the “quite a lot” category and one is in the “a great deal” category. As an example, three respondents in survey 1 with scores (2,5,7,4), (3,6,6,4), and (3,5,5,5) are all in the borderline category of the difficulty score (as per total). However, the first respondent has a “quite a lot” problem in one area (conduct problem) and “a great deal” problem in one area (peer problem); the second respondent has “quite a lot” problems in two areas (conduct problem and peer problem); and the third respondent has “quite a lot” problems in three areas (conduct problem, peer problem and emotional symptoms).
The complexity of the data is evident from the fact that a respondent with a “normal” category difficulty score had scores of 1, 2, 2, 8 on the individual scales. The emotional symptoms of this respondent were in the “a great deal” band.
For both surveys, the OR model estimated the impact scores of all the respondents having the “borderline” difficulty scores; however, with two problem areas, one in “a great deal” category while the other in “quite a lot” category or with three problem areas, all in the “borderline” category or in the “abnormal” category. This means that all those cases that were in the “borderline” category as per difficulty score but were estimated in the “abnormal” category of impact score might be as problematic as the “abnormal” difficulty score cases. While the OR model captures the relationship reasonably well for the extreme categories (i.e., “normal” and “abnormal” difficulty and impact scores), it also suggests case-by-case investigation of “borderline” cases. Therefore, the OR model has been able to provide useful additional information for clinicians and researchers with an interest in psychiatric scores. All the “borderline” cases and the “normal” cases with scores close to being “borderline” should be investigated further to determine the need for expert intervention.
The results of the study are consistent with earlier studies. The observed and empirical conclusion that up to 50% of the respondents (both males and females) were facing severe distress corroborates the findings of earlier studies, which suggested that a very high proportion of young adults were facing severe mental health issues during the pandemic [26],[27].
The novelty of this study is the assessment of the general psychological behaviour of a healthy population in unhealthy times, not only through observations but also through statistical modelling. The study clearly shows the deviations of the population proportions from the standard population proportions of (normal: borderline: abnormal) 80%:10%:10% in normal times. Additionally, the study emphasizes the need for case-by-case investigation of ‘borderline’ and ‘close to borderline’ cases if the questionnaire has been administered to a group of young adults.
The SDQ 17+ version is meant to identify the psychological problems of young adults. However, the data were collected mostly online from young adults enrolled in higher educational institutions, which limits the scope of this study to such young adults. Further, the investigators were not in direct contact with the respondents at the time of data collection and therefore could not ensure that the requirements for answering the SDQ were met (i.e., following the time limit and not revisiting the responses). However, the data of the two surveys were consistent and had good reliability quotients. In the future, the model can be applied to a larger group of respondents, not necessarily students only. Additionally, applying the model to time series data may provide clinicians with useful insight into respondents' behaviour on a mass scale as well as for individual respondents.
5. Conclusions
OR models are good at estimating the extreme categories, though the “Borderline” category was not estimated well. One of the reasons was the use of qualitative data with the narrowest “Borderline” category, both for the difficulty and the impact scores. Normal difficulty scores do not necessarily indicate the absence of distress, but advanced levels of difficulty scores correspond to advanced levels of distress. Even normal difficulty scores can have components lying in the “quite a lot” or “a great deal” categories. Such cases should be dealt with individually. The extended version of the SDQ should be preferred over the commonly used basic version of the questionnaire.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
References
[1]
Wang Y, Beydoun MA. (2007) The obesity epidemic in the United States — gender, age, socioeconomic, racial/ethnic, and geographic characteristics: a systematic review and meta-regression analysis. Epide Rev. 29:6-28. doi: 10.1093/epirev/mxm007
[2]
Winkleby M. A., Jatulis D. E., Frank E., et al. (1992) Socioeconomic status and health: How education, income, and occupation contribute to risk factors for cardiovascular disease. Am J Pub Health. 82(6): 816-820.
[3]
Bonaccio M, Bonanni AE, Di Castelnuovo A, et al. (2012) Low income is associated with poor adherence to a Mediterranean diet and a higher prevalence of obesity: cross-sectional results from the Moli-sani study. BMJ Open. 2(6):9
[4]
Diez-Roux AV, Nieto FJ, Caulfield L, et al. (1999) Neighbourhood differences in diet: the Atherosclerosis Risk in Communities (ARIC) Study. J. Epide Commun Health. 53(1):55-63.
[5]
Aggarwal A, Monsivais P, Cook AJ, et al. (2011) Does diet cost mediate the relation between socioeconomic position and diet quality? Eur. J. Clin. Nutr. 65(9):1059-1066.
[6]
Morton LW, Blanchard TC. (2007) Starved for access: Life in rural America's food deserts. Rural Realities.1(4):1-9. http://www.iatp.org/files/258_2_98043.pdf. Accessed February 15, 2015.
[7]
Gibson DM. (2011) The neighborhood food environment and adult weight status: estimates from longitudinal data. Am J Public Health. 101(1):71-8.
[8]
Ver Ploeg M, Breneman V, Farrigan T, et al. (2009) "Access to affordable and nutritious food: measuring and understanding food deserts and their consequences. Report to Congress." USDA Economic Research Service.
[9]
Fielding JE, Simon PA. (2011) Food Deserts or Food Swamps?: Comment on “Fast Food Restaurants and Food Stores”. Arch Intern Med. 171(13):1171-1172.
[10]
Powell LM, Slater S, Mirtcheva D, et al. (2007) Food store availability and neighborhood characteristics in the United States. Prev Med. 44:189-95 doi: 10.1016/j.ypmed.2006.08.008
[11]
Jetter KM, Cassady DL. (2006) The availability and cost of healthier food alternatives. Am J Prev Med. 30:38-44. doi: 10.1016/j.amepre.2005.08.039
[12]
Larson NI, Story MT, Nelson, MC. (2009) Neighborhood environments: disparities in access to healthy foods in the US. Am J Prevent Med. 36(1): 74-81.
[13]
Jilcott SB, Hurwitz J, Moore JB, et al. (2010) Qualitative perspectives on the use of traditional and nontraditional food venues among middle- and low-income women in Eastern North Carolina. Ecol Food Nutr. 49:373
[14]
Jilcott SB, Laraia BA, Evenson KR, et al. (2009) Perceptions of the community food environment and related influences on food choice among midlife women residing in rural and urban areas: a qualitative analysis. Women. Health. 49:164-180. doi: 10.1080/03630240902915085
[15]
Glanz K, Yaroch AL. (2004) Strategies for increasing fruit and vegetable intake in grocery stores and communities: policy, pricing, and environmental change. Prev Med. 39 (Suppl 2):S75-80.
[16]
Seymour JD, Yaroch AL, Serdula M, et al. (2004) Impact of nutrition environmental interventions on point-of-purchase behavior in adults: a review. Prev Med. 39 (Suppl 2):S108-36.
[17]
Flint E, Cummins S, Matthews SA. (2012) Do Supermarket Interventions Improve Food Access, Fruit and Vegetable Intake and BMI? Evaluation of the Philadelphia Fresh Food Financing Initiative. J Epidemiol Community Health. 66:A33
[18]
Evans A, Jennings R, Smiley A, et al. (2012) Introduction of farm stands in low income communities increases fruit and vegetable among community residents. Health Place. 18:1137-1143 doi: 10.1016/j.healthplace.2012.04.007
[19]
Wrigley N, Warm D, Margetts B. (2003) Deprivation, diet, and food-retail access: findings from the Leeds “Food Deserts” Study. Environ Plann. 35(1):151-188.
[20]
Song HJ, Gittelsohn J, Kim M, et al. (2009) A corner store intervention in a low-income urban community is associated with increased availability and sales of some healthy foods. Public Health Nutr. 12(11):2060-2067.
[21]
Hoffman J, Morris V, Cook J. (2009) The Boston Middle School-Corner Store Initiative: Development, implementation, and initial evaluation of a program designed to improve adolescents' beverage-purchasing behaviors. Psychology in the Schools. Special Issue: Obesity in the Schools. 46 (8):756-766.
[22]
Gittelsohn J, Franceschini MC, Rasooly I, et al. (2007) Understanding the food environment in a low-income urban setting: implications for food store interventions. J Hunger Envr Nutr. 2(2/3):33-50.
[23]
Story M, Kaphingst KM, Robinson-O'Brien R, et al. (2008) Creating healthy food and eating environments: policy and environmental approaches. Annu Rev Public Health. 29:253-72. doi: 10.1146/annurev.publhealth.29.020907.090926
[24]
Escaron A, Meinen A, Nitzke S, et al. (2013). Supermarket and Grocery Store-Based Interventions to Promote Healthful Food Choices and Eating Practices: A Systematic Review. Prev Chronic Dis. 10: E50.
[25]
Sloane DC, Diamant AL, Lewis LB, et al. (2003) Improving the nutritional resource environment for healthy living through community-based participatory research. J Gen Intern Med. 18(7):568-75.
[26]
Donkin AJ, Dowler EA, Stevenson SJ,et al. (2000) Mapping access to food in a deprived area: the development of price and availability indices. Public Health Nutr. 3(1):31-8.
[27]
Liese AD, Weis KE, Pluto D, et al. (2007) Food store types, availability, and cost of foods in a rural environment.J Am Diet Assoc. 107(11):1916-23.
[28]
Franco M, Diez Roux AV, Glass TA, et al. (2008) Neighborhood characteristics and availability of healthy foods in Baltimore. Am J Prev Med. 35(6):561-7.
[29]
Laska MN, Borradaile KE, Tester J, et al.(2010) Healthy food availability in small urban food stores: a comparison of four US cities. Public Health Nutr. 13(7):1031-5.
[30]
Cummins S, Smith DM, Taylor M, et al. (2009) Variations in fresh fruit and vegetable quality by store type, urban-rural setting and neighbourhood deprivation in Scotland. Public Health Nutr. 12(11):2044-50.
[31]
Bodor JN, Rose D, Farley TA, et al. (2008) Neighbourhood fruit and vegetable availability and consumption: the role of small food stores in an urban environment. Public Health Nutr. 11(4):413-20.
[32]
Zenk SN, Schulz AJ, Hollis-Neely T, et al. (2005) Fruit and vegetable intake in Blacks: income and store characteristics. Am J Prev Med. 29(1):1-9.
[33]
Galvez MP, Morland K, Raines C, et al. (2008) Race and food store availability in an inner-city neighbourhood. Public Health Nutr. 11(6):624-31.
[34]
Morland K, Wing S, Diez-Roux A, et al. (2002) Neighborhood characteristics associated with the location of food stores and food service places. Am J Prev Med. 22(1):23-9.
[35]
Smoyer-Tomic KE, Spence JC, Raine KD, et al. (2008) The association between neighborhood socioeconomic status and exposure to supermarkets and fast food outlets. Health Place. 14(4):740-54.
[36]
Raja S, Ma C, Yadav P. (2008) Beyond food deserts: measuring and mapping racial disparities in neighborhood food environments. J Plan Educ Res. 27(4):469-82.
[37]
Gittelsohn J, Rowan M, Gadhoke P. (2012) Interventions in small food stores to change the food environment, improve diet, and reduce risk of chronic disease. Prev Chronic Dis. 9:110015.
[38]
Webber CB, Sobal J, Dollahite JS. (2010) Shopping for fruits and vegetables. Food and retail qualities of importance to low-income households at the grocery store. Appetite. 54(2): 297-303.
[39]
Bailey-Davis L, Virus A, McCoy TA, et al. (2013) Middle school student and parent perceptions of government-sponsored free school breakfast and consumption: A qualitative inquiry in an urban setting.J Acad Nutr Diet . 113(2): 251-257.
[40]
Borradaile KE, Sherman S, Vander Veur S, et al. (2009) Snacking in children: the role of urban corner stores. Pediatrics. 124(5): 1293-1298
[41]
Jilcott SB, Wade S, McGuirt, JT, et al. (2011) The association between the food environment and weight status among eastern North Carolina youth. Public Health Nutri.14: 1610-1617.
[42]
Jilcott Pitts SB, Bringolf K, Lawton K, et al. (2013) Formative evaluation for a healthy corner store initiative in Pitt County, North Carolina: assessing the rural food environment, part 1. Prev Chronic Dis. 10:E121
[43]
Jilcott Pitts SB, Bringolf K, Lloyd C, et al. (2013) Formative evaluation for a healthy corner store initiative in Pitt County, North Carolina: engaging stakeholders for a healthy corner store initiative, part 2. Prev Chronic Dis. 10:E120
[44]
Dutko P, Ver Ploeg M, Farrigan T. (2012). Characteristics and influential factors of food deserts. U.S. Department of Agriculture, Econ Res Serv, ERR-1401
[45]
Ahern M, Brown C, Dukas S. (2011). A national study of the association between food environments and county-level health outcomes. J Rural Health 27:367-379. doi: 10.1111/j.1748-0361.2011.00378.x
[46]
Monica, F., 2007. Why is US poverty higher in nonmetropolitan than in metropolitan areas? Growth and Change 38, 56-76
[47]
Deller S, Canto A, Brown L. (2015) Rural poverty, health and food access. Regional Science Policy & Practice 7(2): 61-75.
[48]
Sharkey JR, Johnson CM, Dean WR. (2010) Food access and perceptions of the community and household food environment as correlates of fruit and vegetable intake among rural seniors. BMC Geriatrics. 10 (1): p. 32
[49]
Smith C, Morton LW. (2009) Rural food deserts: low-income perspectives on food access in Minnesota and Iowa. J Nutr Educ Behav. 41(3): 176-187.
[50]
Yeager CD, Gatrell JD. (2014) Rural food accessibility: An analysis of travel impedance and the risk of potential grocery closures. Applied Geogr. 53: 1-10. doi: 10.1016/j.apgeog.2014.05.018
[51]
Hendrickson D, Smith C, Eikenberry N. (2006) Fruit and vegetable access in four low-income food deserts communities in Minnesota. Agri Human Values. 23(3): 371-383.
[52]
Lake A, Townshend T. (2006) Obesogenic environments: exploring the built and food environment. J R Soc Promot Health. 126 (6): 262–267
[53]
Peterson SL, Dodd KM, Kim K, et al. (2010) Food cost perceptions and food purchasing practices of uninsured, low-income, rural adults. J Hunger Environ Nutr. 5 (1): 41–55
[54]
Yeager CD, Gatrell JD. (2014) Rural food accessibility: An analysis of travel impedance and the risk of potential grocery closures. Applied Geogr. 53: 1-10. doi: 10.1016/j.apgeog.2014.05.018
[55]
Wang M, Kim S, Gonzalez A, MacLeod K, Winkleby M. (2007) Socioeconomic and food-related physical characteristics of the neighborhood environment are associated with body mass index. J Epidemiol Community Health, 61: 491–498. doi: 10.1136/jech.2006.051680
[56]
Baker E, Schootman M, Barnidge E, Kelly C. (2006) The role of race and poverty in access to foods that enable individuals to adhere to dietary guidelines. Prev Chronic Dis, 3 (3): A76
[57]
Sharma, A. (2014). Spatial analysis of disparities in LDL-C testing for older diabetic adults: A socio-environmental framework focusing on race, poverty, and health access in Mississippi. Applied Geogr, 55, 248-256.
Ortega AN, Albert S, Sharif M, et al. (2015) "Proyecto MercadoFRESCO: A Multi-level, Community-Engaged Corner Store Intervention in East Los Angeles and Boyle Heights." J Commun Health. 40(2):347-56.
[60]
East Carolina University-Center for Health Systems Research and Development. (2013) "Regional Health Status: 41-County East." Center for Health Systems Research and Development.
[61]
Howard G, Labarthe DR, Hu J, et al. (2007) Regional differences in Blacks' high risk for stroke: the remarkable burden of stroke for Southern Blacks. Ann Epidemiol. 17(9):689–696.
[62]
US Census Bureau. (2010) "U.S. Census Bureau Releases Data on Population Distribution and Change in the U.S. Based on Analysis of 2010 Census Results". U.S. Census Bureau. https://www.census.gov/newsroom/releases/archives/2010_census/cb11-cn124.html
[63]
University of Wisconsin Population Health Institute. (2014) County Health Rankings & Roadmaps." County Health Rankings & Roadmaps. http://www.countyhealthrankings.org/app/northcarolina/2014/rankings/lenoir/county/factors/overall/snapshot
[64]
McGuirt JT, Jilcott SB, Vu MB, et al. (2011) Conducting Community Audits to Evaluate Community Resources for Healthful Lifestyle Behaviors: An Illustration From Rural Eastern North Carolina. Prevent Chron Dis. 8(6).
[65]
Kegler MC, Rodine S, McLeroy K, Oman R. (1998) Combining Quantitative and Qualitative Techniques in Planning and Evaluating a Community-Wide Project to Prevent Adolescent Pregnancy. The International Electronic J Health Edu. 1:39-48.
[66]
Pitts SB, Vu MB, Garcia BA, et al. (2013) A community assessment to inform a multilevel intervention to reduce cardiovascular disease risk and risk disparities in a rural community. Fam Community Health. 36(2): 135–146.
[67]
Andreyeva T, Blumenthal DM, Schwartz MB, et al. (2008) Availability and prices of foods across stores and neighborhoods: the case of New Haven, Connecticut. Health Aff (Millwood). 27(5):1381-1388.
[68]
LA County Department of Public Health-Environmental Health. (2012) "LA County Department of Public Health - Facility Rating." “Information about licensed food serving facilities in Los Angeles county available at: http://publichealth.lacounty.gov/eh/misc/ehpost.htm”
[69]
West S, Houseman R, Orenstein D, et al. (2010) California Grocery Store Observational Protocol Survey and Key. Ithaca, NY: Cornell University. Accessed August 28, 2015. http://envirocancer.cornell.edu/obesity/tools.cfm/#FoodTools.
[70]
Duncan DT, Castro MC, Blossom JC, et al. (2011) Evaluation of the positional difference between two common geocoding methods. Geospat. Health. 5: 265-273. doi: 10.4081/gh.2011.179
[71]
U.S. Department of Health & Human Services. (2014) "Poverty Guidelines, Research, and Measurement." Poverty Guidelines, Research, and Measurement.Web. Accessed August 28, 2015. http://aspe.hhs.gov/poverty/14poverty.cfm
[72]
Logan JR, Zhang W, and Xu, H. Applying spatial thinking in social science research. GeoJournal. 2010 Jan 1; 75(10): 15–27.
[73]
Link BG, Phelan J. Social conditions as fundamental causes of disease. (1995) J Health Soci Behav, 35: 80-94 doi: 10.2307/2626958
[74]
Williams DR, Collins C. (2001) Racial residential segregation: a fundamental cause of racial disparities in health. Public Health Rep, 116 (5): 404-416
[75]
Williams DR, Jackson PB. (2005) Social sources of racial disparities in health. Health Aff (Millwood), 24 (2): 325–334
[76]
Fotheringham AS, Brunsdon C, Charlton M. (2003) Geographically weighted regression: the analysis of spatially varying relationships. John Wiley & Sons. Hoboken, New Jersey.
[77]
Anselin L. (2005) Exploring spatial data with GeoDa: a workbook. Urbana-Champaign, IL, Spatial Analysis Laboratory Department of Geography, University of Illinois. http://geodacenter.asu.edu/system/files/geodaworkbook.pdf
[78]
Anselin L. (1989) What is special about spatial data? Alternative perspectives on spatial data analysis. Technical Report 89-4 (Santa Barbara, CA: National Center for Geographic Information and Analysis). http://www.irss.unc.edu/content/pdf/anselin%201989.pdf
[79]
Bellenger DN, Valencia H. (1982) Understanding the Hispanic market. Business Horizons, 25(3): 47-50.
[80]
Lopez J, Madigan B, Calderon N, et al. (2014) Challenges to Walking for Health in East Los Angeles. Paper presented at: Annual meeting of the Centers for Population Health & Health Disparities; Marina del Rey, CA.
[81]
Morris PM, Neuhauser L, Campbell C. (1992) Food security in rural America: a study of the availability and costs of food. J Nutri Edu. 24(1): 52S-58S.
[82]
Grigsby-Toussaint DS, Zenk SN, Odoms-Young A, et al. (2010) Availability of commonly consumed and culturally specific fruits and vegetables in African-American and Latino neighborhoods. J Am Die Ass. 110: 746-752. doi: 10.1016/j.jada.2010.02.008
[83]
Sharkey JR, Horel S. (2008) Neighborhood socioeconomic deprivation and minority composition are associated with better potential spatial access to the ground-truthed food environment in a large rural area. J Nutr.138(3):620-627
[84]
Sharkey JR, Horel S, Han D, Huber JC. (2009) Association between neighborhood need and spatial access to food stores and fast food restaurants in neighborhoods of Colonias. Int J Health Geogr.8:9.
[85]
Lee RE, Heinrich KM, Medina AV, et al. (2010) A picture of the healthful food environment in two diverse urban cities. Environ Health Insights.4:49-60
[86]
Public Health Institute. "Target Marketing Soda & Fast Food: Problems with Business as Usual." Berkeley Media Studies Group. Dec. 2010. < http://www.bmsg.org/sites/default/files/bmsg_cche_marketing_brief_target_marketing_soda_and_fast_food.pdf>.
[87]
Grier SA, Kumanyika SK. (2008) The context for choice: health implications of targeted food and beverage marketing to Blacks. Am J Public Health. 98:1616-1629. doi: 10.2105/AJPH.2007.115626
[88]
Kotler P, Armstrong G. (2003) Principles of Marketing. 10th ed. Upper Saddle River, NJ: Prentice-Hall.
[89]
Payne C, Niculesu M. (2012) Social Meaning in Supermarkets as a Direct Route to Improve Parents' Fruit and Vegetable Purchases. Agri Res Eco Rev .41 (1): 124-137
[90]
Fleischhacker SE, Evenson KR, Sharkey J, et al. (2013) Validity of secondary retail food outlet data: a systematic review. Am J Prev Med. 45(4):462-73.
[91]
Coulton C, Korbin J, Chan T, Su M. (2001) Mapping residents' perceptions of neighborhood boundaries: a methodological note. Am J Community Psychol. 29: 371-383. doi: 10.1023/A:1010303419034
Jared T McGuirt, Stephanie B. Jilcott Pitts, Alice Ammerman, Michael Prelip, Kathryn Hillstrom, Rosa-Elena Garcia, William J. McCarthy. A Mixed Methods Comparison of Urban and Rural Retail Corner Stores[J]. AIMS Public Health, 2015, 2(3): 554-582. doi: 10.3934/publichealth.2015.3.554
Table 1. Various link functions used in Ordinal Regression methods.

| Link function | Form | Conditions to be used |
|---|---|---|
| Logit | $\ln\left(\frac{F_k(x_i)}{1 - F_k(x_i)}\right)$ | Categorical data is evenly distributed among all categories. Here, the errors are distributed according to a logistic distribution. |
| Probit | $\Phi^{-1}(F_k(x_i))$ | Probit regression assumes that the errors are distributed normally. |
| Complementary log-log | $\ln(-\ln(1 - F_k(x_i)))$ | For skewed data, when higher categories are more probable. |
| Negative log-log | $-\ln(-\ln(F_k(x_i)))$ | For skewed data, when lower categories are more probable. |
| Cauchit | $\tan(\pi(F_k(x_i) - 0.5))$ | This type of link bears the same relation to the Cauchy distribution as the probit link bears to the normal. One characteristic of this link function is that the tail is heavier relative to the other links. |
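The forms in Table 1 can be written down directly together with their inverses, which map a linear predictor back to a cumulative probability. The following is a minimal sketch using only the Python standard library; the dictionary `LINKS` and its short keys are names chosen here for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link

# Each entry maps a link name to (link, inverse link); F is a cumulative
# probability strictly between 0 and 1.
LINKS = {
    "logit": (lambda f: math.log(f / (1 - f)),
              lambda z: 1 / (1 + math.exp(-z))),
    "probit": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "cloglog": (lambda f: math.log(-math.log(1 - f)),
                lambda z: 1 - math.exp(-math.exp(z))),
    "nloglog": (lambda f: -math.log(-math.log(f)),
                lambda z: math.exp(-math.exp(-z))),
    "cauchit": (lambda f: math.tan(math.pi * (f - 0.5)),
                lambda z: 0.5 + math.atan(z) / math.pi),
}

for name, (link, inverse) in LINKS.items():
    z = link(0.2)
    print(f"{name}: link(0.2) = {z:.4f}, inverse(link) = {inverse(z):.4f}")
```

Each inverse should return the starting probability, which gives a quick check that a link and its inverse were paired correctly.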
Table 2. Test statistics for testing the goodness of fit of an ordinal model.

| Test | Formula | Explanation |
|---|---|---|
| McFadden's $R^2$ | $R^2_L = 1 - \frac{LL_{model}}{LL_0}$ | This is the natural logarithmic linear ratio $R^2$. A value close to 0 indicates that the model has no predictive value. |
| Cox and Snell's $R^2$ | $R^2_{CS} = 1 - \left(\frac{L_0}{L_{model}}\right)^{2/n}$, where $n$ = sample size and $L = e^{LL}$ is the likelihood | This is a “generalized” $R^2$ (used in linear regression as well) rather than a pseudo $R^2$. A problem with this $R^2$ is that the upper bound of this statistic, given by $1 - \left(p^{p}(1-p)^{1-p}\right)^{2}$, is less than 1, where $p$ is the marginal proportion of cases with events. |
| Nagelkerke's $R^2$ | $R^2_{Nagel} = \frac{R^2_{CS}}{1 - e^{2LL_0/n}}$ | It measures the proportion of the total variation of the dependent variable that can be explained by the independent variables. |
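As a worked illustration of the three statistics in Table 2, the sketch below evaluates them from a pair of −2 log-likelihood values. It uses the Survey 2 intercept-only and full-model values from the model-fitting table and n = 584. This is an assumption-laden sketch: the text does not say whether the reported −2 log-likelihoods include any constant term, so the McFadden and Nagelkerke values computed this way need not match software output. The Cox and Snell value depends only on the difference between the two log-likelihoods. The function name `pseudo_r2` is introduced here.

```python
import math


def pseudo_r2(neg2ll_null, neg2ll_model, n):
    """Return (McFadden, Cox-Snell, Nagelkerke) from -2 log-likelihood values."""
    ll_null = -neg2ll_null / 2.0
    ll_model = -neg2ll_model / 2.0
    mcfadden = 1.0 - ll_model / ll_null
    # (L_0 / L_model)^(2/n) written in terms of log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Rescale by the largest value Cox-Snell can take for this null model.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
    return mcfadden, cox_snell, nagelkerke


# Survey 2 values as reported in the model-fitting table.
mcfadden, cox_snell, nagelkerke = pseudo_r2(1186.088, 1031.238, n=584)
print(f"McFadden:   {mcfadden:.3f}")
print(f"Cox-Snell:  {cox_snell:.3f}")
print(f"Nagelkerke: {nagelkerke:.3f}")
```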
Table 3. Descriptive statistics of two surveys giving mean, standard deviation, median, mode, minimum, and maximum of five strength and difficulty scales; and Difficulty and Impact scores.

| Scale (Items) | Survey | Total | Mean | Sd | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Prosocial behaviour (1, 4, 9, 17, 20) | Survey 1 | 772 | 7.891 | 1.686 | 1 | 10 |
| | Survey 2 | 584 | 7.932 | 1.754 | 0 | 10 |
| Hyperactivity-inattention (2, 10, 15, 21, 25) | Survey 1 | 772 | 4.104 | 2.034 | 0 | 9 |
| | Survey 2 | 584 | 3.724 | 2.059 | 1 | 10 |
| Emotional symptoms (3, 8, 13, 16, 24) | Survey 1 | 772 | 4.193 | 2.450 | 1 | 10 |
| | Survey 2 | 584 | 3.995 | 2.488 | 0 | 10 |
| Conduct problem (5, 7, 12, 18, 22) | Survey 1 | 772 | 2.935 | 1.456 | 0 | 9 |
| | Survey 2 | 584 | 2.785 | 1.441 | 1 | 7 |
| Peer problem (6, 11, 14, 19, 23) | Survey 1 | 772 | 2.902 | 1.713 | 0 | 10 |
| | Survey 2 | 584 | 2.942 | 1.787 | 0 | 9 |
| Difficulty score | Survey 1 | 772 | 14.136 | 5.142 | 1 | 31 |
| | Survey 2 | 584 | 13.443 | 5.568 | 2 | 33 |
| Impact Score (28, 29, 30, 31, 32) | Survey 1 | 772 | 1.528 | 1.721 | 0 | 7 |
| | Survey 2 | 584 | 1.885 | 2.100 | 0 | 9 |
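
| Survey | Model | -2 Log Likelihood | Chi-Square | Df | Significance |
|---|---|---|---|---|---|
| Survey 1 | Intercept Only | 1590.920 | | | |
| | Full | 1486.807 | 144.113 | 4 | <0.001 |
| Survey 2 | Intercept Only | 1186.088 | | | |
| | Full | 1031.238 | 154.851 | 4 | <0.001 |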
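
| Survey | Type | Parameter | Estimate | Std. Error | Wald | Df | Sig. | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound |
|---|---|---|---|---|---|---|---|---|---|
| Survey 1 | Threshold | [impact = 0] | 1.217 | 0.148 | 67.784 | 1 | <0.001 | 0.927 | 1.506 |
| | | [impact = 1] | 1.810 | 0.155 | 136.651 | 1 | <0.001 | 1.506 | 2.113 |
| | Location | Hyperactivity | 0.096 | 0.025 | 15.113 | 1 | <0.001 | 0.048 | 0.144 |
| | | Emotional | 0.176 | 0.022 | 66.691 | 1 | <0.001 | 0.134 | 0.219 |
| | | Conduct | -0.030 | 0.032 | 0.894 | 1 | 0.344 | -0.094 | 0.033 |
| | | Peer | 0.023 | 0.028 | 0.655 | 1 | 0.418 | -0.032 | 0.077 |
| Survey 2 | Threshold | [impact = 0] | 1.387 | 0.161 | 74.722 | 1 | <0.001 | 1.073 | 1.702 |
| | | [impact = 1] | 2.117 | 0.174 | 148.813 | 1 | <0.001 | 1.777 | 2.457 |
| | Location | Hyperactivity | 0.110 | 0.030 | 13.817 | 1 | <0.001 | 0.052 | 0.168 |
| | | Emotional | 0.197 | 0.026 | 58.552 | 1 | <0.001 | 0.147 | 0.248 |
| | | Conduct | -0.038 | 0.036 | 1.081 | 1 | 0.298 | -0.034 | 0.109 |
| | | Peer | 0.063 | 0.031 | 4.021 | 1 | 0.045 | 0.001 | 0.124 |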
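The Wald statistic and confidence bounds in the parameter-estimates table are related to the estimate and its standard error in the usual way: the Wald statistic is the squared ratio of estimate to standard error, and the 95% bounds are the estimate plus or minus about 1.96 standard errors. The sketch below applies this to the Survey 1 hyperactivity coefficient. Because the tabulated inputs are rounded to three decimals, the results only approximate the tabulated values. The function name `wald_summary` is hypothetical.

```python
def wald_summary(estimate, std_error, z=1.96):
    """Return (Wald chi-square, lower bound, upper bound) for one coefficient."""
    wald = (estimate / std_error) ** 2
    return wald, estimate - z * std_error, estimate + z * std_error


# Survey 1, hyperactivity: estimate 0.096, standard error 0.025 (as tabulated).
wald, lower, upper = wald_summary(0.096, 0.025)
print(f"Wald = {wald:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```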
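
| Survey | Model | -2 Log Likelihood | Chi-Square | Df | Sig. |
|---|---|---|---|---|---|
| Survey 1 | Null Hypothesis | 1446.807 | | | |
| | General | 1438.185 | 8.622 | 4 | 0.071 |
| Survey 2 | Null Hypothesis | 1031.238 | | | |
| | General | 1025.865 | 5.372 | 4 | 0.251 |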
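The chi-square and significance columns of this comparison follow from the two −2 log-likelihood values: the statistic is their difference, referred to a chi-square distribution with 4 degrees of freedom. The sketch below reproduces that arithmetic with the standard library only. It relies on the closed-form tail probability that holds for even degrees of freedom, and the function name `chi2_sf_even_df` is introduced here for illustration.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# (-2LL under the null hypothesis, -2LL under the general model), as tabulated.
PARALLEL_LINES = {
    "survey 1": (1446.807, 1438.185),
    "survey 2": (1031.238, 1025.865),
}

for survey, (null_m2ll, general_m2ll) in PARALLEL_LINES.items():
    statistic = null_m2ll - general_m2ll
    p_value = chi2_sf_even_df(statistic, df=4)
    print(f"{survey}: chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

Both p-values exceed 0.05, which is consistent with the significance values shown in the table.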

| Survey | Impact score | Difficulty score | No problem | Slightly raised | Advanced | Total |
|---|---|---|---|---|---|---|
| Survey 1 | Observed | No problem | 283 | 34 | 172 | 489 |
| | | Slightly raised | 46 | 28 | 94 | 168 |
| | | Advanced | 14 | 24 | 77 | 115 |
| | Estimated | No problem | 408 | 0 | 81 | 489 |
| | | Slightly raised | 34 | 0 | 134 | 168 |
| | | Advanced | 2 | 0 | 113 | 115 |
| Survey 2 | Observed | No problem | 190 | 73 | 128 | 391 |
| | | Slightly raised | 19 | 21 | 66 | 106 |
| | | Advanced | 5 | 9 | 73 | 87 |
| | Estimated | No problem | 272 | 0 | 119 | 391 |
| | | Slightly raised | 0 | 0 | 106 | 106 |
| | | Advanced | 0 | 0 | 87 | 87 |
Figure 1. GWR Coefficient Distribution of Relationship for No Vehicle to Corner Store Count, Los Angeles
Figure 2. GWR Coefficient Distribution of Relationship for Percent Minority to Corner Store Count, Los Angeles
Figure 3. GWR Coefficient Distribution of Relationship for Percent Hispanic to Corner Store Count, Los Angeles
Figure 4. GWR Coefficient Distribution of Relationship for MHHI to Corner Store Count, Los Angeles
Figure 5. GWR Coefficient Distribution of Relationship for Total Population to Corner Store Count, Los Angeles
Figure 6. GWR Coefficient Distribution of Relationship for Total Population to Corner Store Count, with NEMS-S-Rev Results, Los Angeles