# ANOVA for word length in novels using SPSS

The literary styles of different authors can vary widely. One factor that may differ is the average length of the words used in their sentences. Here are two fiction novels of reasonable length, both written in English, by two different authors.

Novel 1: Memoirs of a Geisha by Arthur Golden.
Novel 2: The Pact by Jodi Picoult.

To get a random sample of twelve sentences from each novel, I used a random number table to select 12 different pages and then took the first complete sentence on each of those pages.
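The page selection could also be sketched in code. The following is only an illustration: it swaps the random number table for Python's pseudo-random generator, and the function name `sample_pages` and the page count of 450 are hypothetical placeholders, not values taken from either novel.

```python
import random


def sample_pages(total_pages, n_pages=12, seed=None):
    """Pick n_pages distinct page numbers from 1..total_pages."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # random.sample draws without replacement, so no page repeats
    return sorted(rng.sample(range(1, total_pages + 1), n_pages))


# 450 is an assumed page count, used only for illustration
pages = sample_pages(total_pages=450, seed=1)
print(pages)
```

Sampling without replacement matches the requirement of 12 different pages; the first complete sentence on each selected page would then be recorded by hand.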

For each of the chosen sentences I recorded the following variables:
Page Number: the page number that the sentence appeared on.
Number Words: the number of words in the sentence.
Number Letters: the number of letters in the sentence.

I entered my data into an Excel spreadsheet, creating two separate tables - one for each novel. I added two new variables to my spreadsheet: Word Length - the average number of letters per word (Number Letters divided by Number Words), and Word Density - a variable that classifies sentences as "Low", "Medium" or "High" density depending on whether they have an average word length of less than 4.0 letters, at least 4.0 letters but less than 5.0 letters, or at least 5.0 letters respectively.
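The two derived variables can be expressed as a short sketch. The function names `word_length` and `word_density` are hypothetical, and the sketch assumes the density class is assigned from the unrounded average, with rounding to 2 d.p. applied only for display.

```python
def word_length(n_words, n_letters):
    """Average number of letters per word in a sentence."""
    return n_letters / n_words


def word_density(avg_length):
    """Classify a sentence by its average word length."""
    if avg_length < 4.0:
        return "Low"
    if avg_length < 5.0:
        return "Medium"
    return "High"  # at least 5.0 letters per word


# A sentence of 37 words and 168 letters
length = word_length(37, 168)
print(round(length, 2), word_density(length))
```

Note that the boundaries are inclusive on the lower side: an average of exactly 4.0 is "Medium" and exactly 5.0 is "High".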

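Memoirs of a Geisha Literary Data

| Page Number | Number Words | Number Letters | Word Length (2 d.p.) | Word Density |
|---|---|---|---|---|
| 2 | 37 | 168 | 4.54 | Medium |
| 219 | 30 | 119 | 3.97 | Low |
| 361 | 20 | 76 | 3.80 | Low |
| 392 | 7 | 39 | 5.57 | High |
| 276 | 13 | 37 | 2.85 | Low |
| 14 | 28 | 104 | 3.71 | Low |
| 180 | 20 | 75 | 3.75 | Low |
| 183 | 27 | 114 | 4.22 | Medium |
| 205 | 14 | 60 | 4.29 | Medium |
| 208 | 11 | 44 | 4.00 | Medium |
| 378 | 22 | 100 | 4.55 | Medium |
| 149 | 45 | 284 | 6.31 | High |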

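The Pact Literary Data

| Page Number | Number Words | Number Letters | Word Length (2 d.p.) | Word Density |
|---|---|---|---|---|
| 390 | 5 | 25 | 5.00 | High |
| 76 | 15 | 49 | 3.27 | Low |
| 287 | 17 | 56 | 3.29 | Low |
| 35 | 16 | 47 | 2.94 | Low |
| 183 | 10 | 31 | 3.10 | Low |
| 149 | 17 | 53 | 3.12 | Low |
| 95 | 5 | 19 | 3.80 | Low |
| 78 | 16 | 64 | 4.00 | Medium |
| 248 | 30 | 114 | 3.80 | Low |
| 440 | 14 | 62 | 4.43 | Medium |
| 189 | 1 | 2 | 2.00 | Low |
| 309 | 17 | 75 | 4.41 | Medium |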
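As a rough cross-check on any SPSS output, a one-way ANOVA F statistic for Word Length across the two novels could be computed by hand along the following lines. This is a minimal sketch, not the SPSS procedure itself: the helper names are hypothetical, the (words, letters) pairs are copied from the two tables above, and the usual ANOVA assumptions (independent samples, roughly normal data, similar variances) are taken for granted rather than checked.

```python
from statistics import mean

# (Number Words, Number Letters) for each sampled sentence
geisha = [(37, 168), (30, 119), (20, 76), (7, 39), (13, 37), (28, 104),
          (20, 75), (27, 114), (14, 60), (11, 44), (22, 100), (45, 284)]
pact = [(5, 25), (15, 49), (17, 56), (16, 47), (10, 31), (17, 53),
        (5, 19), (16, 64), (30, 114), (14, 62), (1, 2), (17, 75)]


def word_lengths(rows):
    """Average letters per word for each sentence."""
    return [letters / words for words, letters in rows]


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova_f(word_lengths(geisha), word_lengths(pact))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With only two groups, this F statistic is the square of the pooled two-sample t statistic, so either test leads to the same conclusion.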

a) Using Excel or SPSS (Version 14 only), draw side-by-side dot plots of the Word Lengths of the two novels. Using plain English, compare the Word Lengths of the two novels.

b) Using either Excel or SPSS (Version 14 only), use an appropriate exploratory tool to display the relationship between the variables Word Length and Number of Words for the second novel. Comment on what this display shows.