aaronpb, Aaron Brzezinski

This lab was fun. I enjoy linguistics, so it was interesting to explore the use of language over time in Part 1, and to see the ways in which computers are still far from perfect in analyzing language in Part 3.

Part 1

For my first graph, I wanted to chart the use of the word "whom" in English over time. The advanced feature I used was the * wildcard, which showed me the top ten words preceding "whom". Naturally, all ten were prepositions. The graph shows a steady decline in the use of "whom" from 1800 to the present, which makes sense because the word has clearly fallen out of fashion in modern English.

For my second graph, I was interested in the use of the word "tweet", both as a noun and as a verb. I used the _VERB and _NOUN tags to distinguish the two forms. The graph shows a sharp increase in the use of both the noun and the verb starting around 2006, which just so happens to be the year Twitter was created.
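Under the hood, the _NOUN/_VERB distinction amounts to tallying a word's occurrences separately by part-of-speech tag. A minimal sketch of that idea, using a tiny hand-tagged token list invented for illustration (not real Ngram data):

```python
# Sketch of what the Ngram Viewer's _NOUN/_VERB tags do: count a word's
# occurrences separately under each part-of-speech tag.
from collections import Counter

# A tiny hand-tagged sample, invented for illustration (not real corpus data).
tagged_tokens = [
    ("tweet", "NOUN"), ("tweet", "VERB"), ("tweet", "NOUN"),
    ("bird", "NOUN"), ("sing", "VERB"),
]

def count_by_pos(tokens, word):
    """Return how often `word` appears under each POS tag."""
    return Counter(tag for w, tag in tokens if w == word)

print(count_by_pos(tagged_tokens, "tweet"))  # Counter({'NOUN': 2, 'VERB': 1})
```

On a real tagged corpus, running a count like this for each year of publication would produce the two separate curves the graph shows.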

Part 2

For this part of the lab, I used the same text as I did in Lab 6 - The Scarlet Letter, by Nathaniel Hawthorne.

This is the word cloud.


The tool I found the most interesting was the Trends graph. When I selected just the words "child", "little", and "pearl", I found that their usage followed similar curves, which makes sense, since they were most likely used together a lot in the text.
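At its core, a Trends graph of this kind is a relative-frequency count over equal slices of the document. A minimal sketch of that idea (the input text below is a stand-in, not the actual novel):

```python
def trend(text, word, segments=10):
    """Relative frequency of `word` in each of `segments` equal slices of `text`."""
    words = text.lower().split()
    size = max(1, len(words) // segments)
    chunks = [words[i:i + size] for i in range(0, len(words), size)][:segments]
    return [chunk.count(word.lower()) / len(chunk) for chunk in chunks]

# Three equal slices, each containing "pearl" once, give a flat curve.
print(trend("pearl ran off pearl sat down pearl smiled wide", "pearl", segments=3))
```

Words whose curves rise and fall together, like "child" and "pearl", tend to cluster in the same parts of the book.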

I also found the Summary tool to be very interesting, mostly because the stats it reports looked familiar to me. The word counts and such are reminiscent of activities we did in Lab 6.

Here is a view of the Trends graph with all five words selected. I liked this a lot because it gave me a visual representation of how these words are used throughout the book.

Part 3

From the Sentimood list, I think that "competitive" and "underestimated" could be considered to have a positive or negative connotation, depending on context.

I think the weighting isn't quite right on "cheer" (a 2 that could easily be a 3 or 4) and "forced" (a -1 that I think should be a -2 or -3).
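Lexicon-based scorers of this kind give each word a signed weight and score a sentence by summing them. A minimal sketch, with weights that are my own illustrative guesses rather than Sentimood's actual values, shows why those exact numbers matter:

```python
# Minimal lexicon-based sentiment scorer. The weights are illustrative
# guesses, not Sentimood's real list.
WEIGHTS = {"cheer": 2, "forced": -1, "brilliant": 3, "fool": -3}

def score(sentence, weights=WEIGHTS):
    """Sum the signed weights of every known word in the sentence."""
    return sum(weights.get(w.strip(".,!?").lower(), 0) for w in sentence.split())

print(score("The forced cheer felt hollow"))                             # → 1
print(score("The forced cheer felt hollow", {**WEIGHTS, "forced": -3}))  # → -1
```

Changing a single weight flips an obviously negative phrase from a positive score to a negative one, which is why small weighting errors add up.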

Two sentences where the two tools agree and appear to be correct are "She is a brilliant mathematician" and "He is a certifiable fool".

Two examples where they disagree are "Blissful ignorance" and "Not the brightest bulb in the box".


And they agree on "If he was the only one competing, he would win by a landslide" and "She was wicked smart", but both are incorrect in their assessment.


Part 4

For this part, "It was love at first sight" and "The apple doesn't fall far from the tree" both translated to Spanish and back without issue.

However, "I have two wrists" and "Sir, double the pages" did not do so well.

All of these translation services did incredibly well, even up against more complex sentences and archaic language. The only thing I could find to really trip them up was Spanish words that have completely different meanings depending on context. Perhaps it would have been easier to find some egregious mistranslations in a language less closely related to English. Overall, though, the translation services did very well and would definitely be useful if I needed them in a real situation.
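One way to see why context-dependent words break round trips: when two English words map to the same Spanish word, the trip back has to pick one meaning. "Muñeca", for instance, means both "wrist" and "doll". The toy word-for-word dictionaries below are made up for the demo; real systems use statistical context but face the same underlying ambiguity:

```python
# Toy round-trip "translation" showing meaning collapse: "wrists" and
# "dolls" both map to "muñecas", so the return trip guesses one of them.
EN_TO_ES = {"i": "yo", "have": "tengo", "two": "dos",
            "wrists": "muñecas", "dolls": "muñecas"}
ES_TO_EN = {"yo": "i", "tengo": "have", "dos": "two",
            "muñecas": "dolls"}  # only one English meaning survives

def round_trip(sentence):
    """Translate word-by-word to Spanish and back."""
    es = [EN_TO_ES.get(w, w) for w in sentence.lower().split()]
    return " ".join(ES_TO_EN.get(w, w) for w in es)

print(round_trip("I have two wrists"))  # prints "i have two dolls"
```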

Part 5

For this part, I first did an Image Project using photo samples of me holding a Class of 2024 pennant, me holding nothing, and me wearing my glasses. I wanted to see how well it would work with something big and obvious (like a pennant) vs something a little more subtle (like glasses).

With about 30 samples each, the machine had a bit of a hard time pinning down exactly what I was showing it.




But once I gave it more to learn from, it did very well.



It wasn't really sure what to do when I was wearing glasses and holding the pennant at the same time, so it chose the pennant, presumably because it's bigger and more obvious.

Note: I realize now that there is a typo in my label for the "pennant" class, but in principle it makes no difference. Apologies.


For the second part, I wanted to see how the audio machine worked with different songs that had similar melodic patterns, so I compared an audio clip of me singing The Star Spangled Banner against Happy Birthday.

The machine had no problem recognizing background noise, as expected.

It did not do as well as I had hoped with the tunes, though. These were the results while I was singing Happy Birthday.


The prediction for The Star Spangled Banner was a bit better.