We will be using the following SPSS files: titanic.sav and cat.sav.
Open the titanic.sav file.
In SPSS go to Graphs -> Legacy Dialogs -> Bar, then choose 'Simple' and press 'Define'.
We shall draw a bar chart showing the passengers' gender.
Choose 'Passenger Gender' from the left-hand box and use the right arrow to place it in 'Category Axis:'.
Under 'Bars Represent' we can choose whether the vertical axis shows the number of cases ('N of cases') or percentages ('% of cases'). Press OK.
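To make the difference between the two 'Bars Represent' options concrete, here is a minimal Python sketch of what each option puts on the vertical axis. The gender values are hypothetical and are not taken from titanic.sav; the variable names are illustrative only.

```python
from collections import Counter

# Hypothetical gender values for illustration only; not taken from titanic.sav.
genders = ["male", "female", "male", "male", "female"]

# 'N of cases': each bar's height is the number of cases in that category.
counts = Counter(genders)

# '% of cases': each bar's height is that count as a share of all cases.
total = sum(counts.values())
percentages = {gender: 100 * n / total for gender, n in counts.items()}

for gender in counts:
    print(f"{gender}: N = {counts[gender]}, {percentages[gender]:.1f}%")
```

Both options give bars with the same relative heights; only the scale on the vertical axis changes.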
Simple bar charts can also be shown side by side for comparison. Again go to Graphs -> Legacy Dialogs -> Bar and choose 'Simple'. Click 'Reset' at the bottom of the window to clear the previous settings. This time place 'Passenger Survived' in 'Category Axis:' and 'Passenger Gender' in the 'Columns' box (under 'Panel by'). This draws a separate bar chart of survival for each gender. Placing gender in the 'Columns' box puts the charts side by side; placing it in the 'Rows' box would instead put them one above the other.
We can compare groups with a clustered bar chart.
We will stay with the titanic.sav file.
Go to Graphs -> Legacy Dialogs -> Bar, choose 'Clustered' and click 'Define'.
Set the dialog box up to match:
We will stay with the titanic.sav file.
A stacked bar chart compares groups by stacking one bar on top of another. Go to Graphs -> Legacy Dialogs -> Bar and choose 'Stacked'. We shall again use passenger class as the category axis and define the stacks by 'Passenger Survived'.
Press OK to draw the chart.
Looking at the total height of each bar, we can see how the number of people in each passenger class compares; within each bar we can see how the number who survived compares with the number who did not. Clearly the chances of surviving in 3rd class were much lower than in 1st class!
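The quantities a stacked bar chart displays can be sketched in a few lines of Python. The records below are hypothetical and are not the real titanic.sav counts; they only show how the segment heights, the total bar heights and the survival share within each class relate to one another.

```python
from collections import Counter

# Hypothetical (class, survived) records for illustration only;
# not taken from titanic.sav.
passengers = [
    ("1st", "yes"), ("1st", "yes"), ("1st", "no"),
    ("3rd", "yes"), ("3rd", "no"), ("3rd", "no"), ("3rd", "no"),
]

# Height of each stacked segment: cases per (class, survived) combination.
segments = Counter(passengers)

# Total height of each bar: cases per class.
bar_totals = Counter(cls for cls, _ in passengers)

# Share of each bar taken up by survivors.
survival_share = {
    cls: segments[(cls, "yes")] / bar_totals[cls] for cls in bar_totals
}

for cls in sorted(bar_totals):
    print(f"{cls}: total = {bar_totals[cls]}, survived = {survival_share[cls]:.0%}")
```

Because the bars show counts, comparing chances of survival between classes means comparing the share of each bar, not the raw segment heights.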
We will need to open the titanic.sav file in SPSS.
Histograms are used to display the distribution of a continuous variable. The continuous variable in the titanic file is 'Passenger Age'. Go to Graphs -> Legacy Dialogs -> Histogram.
Put 'Passenger Age' into the 'Variable' box by using the right arrow button.
We now choose OK.
We can use the histogram to explore the shape of the distribution, for example whether it is symmetric or skewed and where most of the ages lie.
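A histogram is built by dividing the variable's range into intervals (bins) and counting the cases in each one. The following Python sketch illustrates that idea with hypothetical ages and an assumed bin width of 10 years; SPSS chooses its own bins by default, so this is not a reproduction of its output.

```python
# Hypothetical ages for illustration only; not taken from titanic.sav.
ages = [2, 9, 18, 21, 22, 24, 29, 30, 35, 36, 41, 58]
bin_width = 10  # assumed width; SPSS picks its own bins by default

# Count how many ages fall in each interval [start, start + bin_width).
bins = {}
for age in ages:
    start = (age // bin_width) * bin_width
    bins[start] = bins.get(start, 0) + 1

# Print a rough text histogram, one row per bin.
for start in sorted(bins):
    print(f"{start:>2}-{start + bin_width - 1:<2} {'#' * bins[start]}")
```

The length of each row plays the role of the bar height, so the shape of the distribution can be read off in the same way as from the SPSS chart.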
We will be using the titanic.sav SPSS file.
Boxplots are used to compare distributions across groups or variables. A boxplot summarises a distribution's minimum value, maximum value, lower quartile (the value below which 25% of cases lie), upper quartile (the value above which 25% of cases lie) and median (the 50% point).
We shall explore the age distribution of the passengers in the different classes.
Go to Graphs -> Legacy Dialogs -> Boxplot.
Choose 'Simple' and click 'Define'.
The 'Variable' will be 'Passenger Age' (select it and move it across with the right arrow button) and the 'Category Axis' will be 'Passenger Class'.
The box represents the middle 50% of the data: its lower edge is the lower quartile and its upper edge is the upper quartile. The line inside the box is the median. For the first class passengers, the lower quartile is approximately 28 years, the median approximately 40 years and the upper quartile approximately 50 years. The median age falls from 1st class down to 3rd class, so on average the older passengers were in 1st class.
The lines on either side of the box (called the 'whiskers') extend to the smallest and largest values that are not outliers; when there are no outliers they span the full range of the data (minimum to maximum). The dots are 'outliers': values that SPSS has deemed to lie outside the normal range of the data, namely any value more than 1.5 box lengths (the box length is the inter-quartile range) above the upper quartile or below the lower quartile. The numbers beside the dots are the case numbers in the dataset.
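The 1.5 box length rule can be worked through by hand. Below is a minimal Python sketch using hypothetical ages (not values from titanic.sav). Quartile conventions differ between packages, so the quartiles computed here by Python's `statistics.quantiles` may not match the ones SPSS draws exactly; the point is only to show how the outlier limits and whisker ends follow from the quartiles.

```python
import statistics

# Hypothetical ages for illustration only; not taken from titanic.sav.
ages = [1, 22, 24, 26, 28, 30, 32, 34, 36, 38, 80]

# Lower quartile, median and upper quartile (Python's default convention,
# which may differ slightly from the one SPSS uses).
q1, median, q3 = statistics.quantiles(ages, n=4)
box_length = q3 - q1  # the inter-quartile range

# Values beyond these limits are treated as outliers.
lower_limit = q1 - 1.5 * box_length
upper_limit = q3 + 1.5 * box_length

outliers = [a for a in ages if a < lower_limit or a > upper_limit]

# The whiskers end at the most extreme values that are not outliers.
inside = [a for a in ages if lower_limit <= a <= upper_limit]
whiskers = (min(inside), max(inside))

print("quartiles:", q1, median, q3)
print("outlier limits:", lower_limit, upper_limit)
print("outliers:", outliers, "whiskers:", whiskers)
```

With these values the box runs from 24 to 36, so the limits fall at 6 and 54; the ages 1 and 80 lie outside them and would be drawn as separate dots, while the whiskers stop at 22 and 38.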
Scatter plots are used to explore the relationship between two continuous variables. We can look for correlations.
Open the cat.sav SPSS file. We will look for the relationship between the body weight and the heart weight of cats.
Go to Graphs -> Legacy Dialogs -> Scatter/Dot.
Choose 'Simple Scatter' and click 'Define'.
Scatter plots compare two variables, so we need to define two axes (x is horizontal and y is vertical). Move the body weight into the 'X Axis' box and the heart weight into the 'Y Axis' box.
We can see that the body and heart weights are positively correlated: cats with a heavier body tend to have a heavier heart.
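The strength of a relationship like this is commonly summarised with Pearson's correlation coefficient r, which ranges from -1 to +1. The sketch below computes r in plain Python for a handful of hypothetical weights; they are not values from cat.sav, and the function name `pearson_r` is only illustrative.

```python
import math

# Hypothetical weights for illustration only; not taken from cat.sav.
body_weight = [2.0, 2.5, 3.0, 3.5]
heart_weight = [7.0, 9.5, 12.0, 12.5]


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the means.
    co_dev = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    sq_dev_x = sum((a - mean_x) ** 2 for a in x)
    sq_dev_y = sum((b - mean_y) ** 2 for b in y)
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


r = pearson_r(body_weight, heart_weight)
print(f"r = {r:.2f}")
```

A value of r close to +1 corresponds to points rising from the lower left to the upper right of the scatter plot; a value close to -1 would correspond to points falling from upper left to lower right.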
We can also mark the scatter points by gender. Return to the scatter plot dialog and place 'Cat Gender' into the 'Set Markers by' box.
We have the same scatter plot but the dots are coloured based on the gender of the cat.