Assignment 6 - Regression Analysis
The goals of this assignment are to understand how to perform a regression analysis and to predict results from a regression output. The results are then mapped in ArcMap.
Background: In order to understand this assignment, it is important to understand the basic elements of a regression analysis.
Regression Analysis- A statistical tool used to investigate the relationship between two variables.
Regression Equation- Y = a + bX. This is the equation of a straight line, with a as the intercept and b as the slope.
Independent Variable- The X variable in the equation. It explains the dependent variable.
Dependent Variable- The Y variable in the equation. It is explained by the independent variable.
Regression Coefficient- The b value, or slope of the line. It shows how much the dependent variable changes for a one-unit change in the independent variable.
Constant- The a value, or the value of Y when X = 0.
Coefficient of Determination (r^2 value)- Measures, from 0 to 1, the proportion of the variation in the dependent variable that is explained by the independent variable.
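The terms defined above can be tied together in a short sketch. The Python below is only an illustration of an ordinary least-squares fit, not the SPSS procedure used in this assignment; the function name `fit_line` and the five data points are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a + bX. Returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                    # regression coefficient (slope)
    a = mean_y - b * mean_x          # constant (value of Y when X = 0)
    r_squared = sxy ** 2 / (sxx * syy)  # coefficient of determination
    return a, b, r_squared


# Hypothetical data, not taken from the assignment
xs = [1, 2, 3, 4, 5]   # independent variable
ys = [2, 4, 5, 4, 5]   # dependent variable
a, b, r_squared = fit_line(xs, ys)
print(round(a, 3), round(b, 3), round(r_squared, 3))  # 2.2 0.6 0.6
```

With these made-up numbers, each one-unit increase in X raises the predicted Y by 0.6, and X explains 60 percent of the variation in Y.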
Methods:
Part 1- The first part of this assignment focuses on understanding regression outputs. An Excel spreadsheet was provided which contained data for an unnamed town. The data contains the percentage of children who receive free lunch in several different neighborhoods, as well as crime data for the same locations. A local news station is investigating whether or not there is a significant relationship between poverty and crime rate. They claim that as the number of children who receive free lunch increases, so does crime. Using this data, a regression analysis was performed in SPSS statistical software to determine whether this is true. The results may be seen below.
| Figure 1 Coefficients and Model Summary for Part 1 |
| Figure 2 Table displaying conflicting results |
Part 2
The second part of the assignment uses a dataset of 911 calls in Portland, Oregon. The scenario is that the City of Portland is concerned about response times for 911 calls and would like to know what might explain where calls are coming from. Separately, a company is interested in building a hospital and would like to know the best place to put it. I performed three separate regression analyses to identify factors that may be related to the number of 911 calls.
_________________________________________________________________________________
| Figure 4 Regression analysis results comparing alcohol sales to 911 calls |
_______________________________________________________________________________
| Figure 5 Regression analysis results comparing low education and 911 calls |
The second regression analysis examines the relationship between the number of individuals without a high school degree and the number of 911 calls. The coefficient of determination is .567, meaning that low education explains about 57 percent of the variation in 911 calls, a moderately strong relationship. The slope is positive, meaning that as one variable increases, so does the other. The significance value for low education is far below .05, so the null hypothesis of no linear relationship is rejected. Because the slope is measured in calls per person rather than as a percentage, it indicates that each additional person without a high school degree is associated with an increase of about .16 in the predicted number of 911 calls. These results all indicate that there is a significant linear relationship between individuals without high school degrees and 911 calls.
_________________________________________________________________________________
| Figure 6 Regression analysis results comparing foreign born populations and 911 calls |
_________________________________________________________________________________
The final phase of this project involved creating a residual map and a choropleth map that illustrate the results.
The standard deviation map illustrates the census tracts in Portland that have higher numbers of 911 calls. The average number of calls per tract was 25 and the standard deviation was 28. This means that the census tracts in the north that were more than 1.5 standard deviations above the mean had at least 42 more calls than the average tract (1.5 × 28 = 42), or 67 or more calls in total.
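The arithmetic behind the standard deviation classes can be sketched in a few lines of Python. The mean of 25 and standard deviation of 28 come from the text above; the tract names and call counts are hypothetical, and the helper names `threshold` and `z_score` are chosen only for this example.

```python
MEAN_CALLS = 25  # average number of calls, from the text
STD_CALLS = 28   # standard deviation of calls, from the text


def threshold(k):
    """Call count that lies k standard deviations above the mean."""
    return MEAN_CALLS + k * STD_CALLS


def z_score(calls):
    """Number of standard deviations a call count sits from the mean."""
    return (calls - MEAN_CALLS) / STD_CALLS


cutoff = threshold(1.5)  # 25 + 1.5 * 28 = 67.0

# Hypothetical tracts, not values from the Portland dataset
tracts = {"A": 12, "B": 40, "C": 70, "D": 95}
high_tracts = [name for name, calls in tracts.items() if z_score(calls) >= 1.5]
print(cutoff, high_tracts)  # 67.0 ['C', 'D']
```

A tract falls in the top class of the map only if it has 67 or more calls, which is 42 calls above the mean.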
The final map shows the residual for each census tract, or how far each tract's observed number of 911 calls deviates from the value predicted by the best-fit line. In other words, this map shows the locations of major outliers in the data. The orange and red areas show tracts with more 911 calls than the model predicts, and the blue areas show tracts with fewer calls than predicted. Comparing this map to the choropleth map, one can see that the northern census tracts receive more 911 calls than the other tracts. If I were a company looking to build a new hospital, this would be the best place to do so.
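A residual is simply the observed value minus the value predicted by the regression line. The sketch below shows that calculation in Python; the constant, slope, and tract values are hypothetical placeholders rather than the SPSS results, and the actual map was produced in ArcMap.

```python
def residuals(xs, ys, a, b):
    """Observed Y minus the Y predicted by Y = a + bX, for each tract."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical constant and slope, not the values from the regression output
a, b = 2.0, 0.5

# Hypothetical tracts: people without a high school degree, observed 911 calls
low_education = [10, 20, 30]
calls = [9, 10, 20]

res = residuals(low_education, calls, a, b)
# Positive residual: more calls than predicted (orange/red on the map)
# Negative residual: fewer calls than predicted (blue on the map)
labels = ["above" if r > 0 else "below" for r in res]
print(res, labels)  # [2.0, -2.0, 3.0] ['above', 'below', 'above']
```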
| Figure 7 Standard deviation map of 911 calls in Portland |
| Figure 8 Residual Map of 911 calls compared to lower education |
| Figure 9: Ideal Census tracts for a new hospital (shown in blue) |