Week 8

Makino replied that he has found a bug in the message class; fixing it seems to have resolved the memory leak problems, at least with polling and advertisements turned off. He will run more tests on it over the weekend and then report back.
I am planning to implement some form of semantic analysis of the SMS input, as this should provide a deeper understanding of the utterance. Potential problems include lexical ambiguity (definitions extracted from the dictionary may contain multiple senses for each chosen word), referential ambiguity (e.g. what "it" refers to in a sentence with multiple subjects/objects) and scopal ambiguity.
Start with representing knowledge in forms that will allow for semantic interpretation:
Knowledge Representation for Semantic Interpretation
A. First order logic (predicates and inferences) - Prolog
B. Semantic Networks
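To make option B concrete, here is a minimal sketch of a semantic network in Python. It is only an illustration: the class name, the relation labels ("is_a", "can") and the example facts are all assumptions of mine, not something taken from our codebase. Nodes are joined by labelled edges, and a fact stored on a more general node is inherited through "is_a" links.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy semantic network: nodes joined by labelled, directed edges."""

    def __init__(self):
        # edges[source][relation] -> set of target nodes
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        """Store the fact `source --relation--> target`."""
        self.edges[source][relation].add(target)

    def ancestors(self, node):
        """All nodes reachable from `node` by following "is_a" edges."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self.edges[current]["is_a"]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def holds(self, node, relation, target):
        """True if the fact is stored on `node` or inherited via "is_a"."""
        for candidate in {node} | self.ancestors(node):
            if target in self.edges[candidate][relation]:
                return True
        return False


# Hypothetical example facts.
net = SemanticNetwork()
net.add("sparrow", "is_a", "bird")
net.add("bird", "is_a", "animal")
net.add("bird", "can", "fly")

print(net.holds("sparrow", "can", "fly"))  # True, inherited from "bird"
print(net.holds("animal", "can", "fly"))   # False, inheritance only goes upward
```

Option A (first-order logic in Prolog) would express the same facts as predicates and rules; the sketch above only covers the network view.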
For future reference, I'm attaching a table of the POS Tags used in our PosTagger. These were obtained by simply executing the query "SELECT * FROM blogwall.pos_tags" in MySQL Administrator.
id type description
1 CC Coordinating conjunction
2 CD Cardinal number
3 DT Determiner
4 EX Existential there
5 FW Foreign word
6 IN Preposition or subordinating conjunction
7 JJ Adjective
8 JJR Adjective, comparative
9 JJS Adjective, superlative
10 LS List item marker
11 MD Modal
12 NN Noun, singular or mass
13 NNS Noun, plural
14 NNP Proper noun, singular
15 NNPS Proper noun, plural
16 PDT Predeterminer
17 POS Possessive ending
18 PRP Personal pronoun
19 PRP$ Possessive pronoun
20 RB Adverb
21 RBR Adverb, comparative
22 RBS Adverb, superlative
23 RP Particle
24 SYM Symbol
25 TO to
26 UH Interjection
27 VB Verb, base form
28 VBD Verb, past tense
29 VBG Verb, gerund or present participle
30 VBN Verb, past participle
31 VBP Verb, non-3rd person singular present
32 VBZ Verb, 3rd person singular present
33 WDT Wh-determiner
34 WP Wh-pronoun
35 WP$ Possessive wh-pronoun
36 WRB Wh-adverb
For semantic analysis, I have decided to try using first-order logic to represent the meaning of each sentence, as described in "Handbook of Natural Language Processing", pp. 101-102. On this website, this approach is called "Predicate argument structure analysis". To quote from the website, "A predicate such as a verb and an adjective is the central element of a sentence, and it describes the movement or state of an event. The argument is a person or thing associated with the event."
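As a rough illustration of what a predicate argument structure could look like for a tagged sentence, here is a small Python sketch. It assumes the input is already a list of (word, tag) pairs using the POS tags from the table above. The heuristic (first verb is the predicate, every noun or personal pronoun is an argument) is a simplification I am assuming for the example, not the method from the handbook, and the function names are hypothetical.

```python
# Tags taken from the POS tag table; the grouping is an assumption.
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
ARGUMENT_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}


def predicate_argument(tagged):
    """Return (predicate, [arguments]) for a list of (word, tag) pairs.

    Heuristic: the first verb is the predicate; every noun or personal
    pronoun is an argument, in sentence order. Returns None if the
    sentence contains no verb.
    """
    predicate = None
    arguments = []
    for word, tag in tagged:
        if tag in VERB_TAGS and predicate is None:
            predicate = word.lower()
        elif tag in ARGUMENT_TAGS:
            arguments.append(word.lower())
    if predicate is None:
        return None
    return predicate, arguments


def to_logic(structure):
    """Format a (predicate, arguments) pair as predicate(arg1, arg2, ...)."""
    predicate, arguments = structure
    return f"{predicate}({', '.join(arguments)})"


# Hypothetical tagged sentence: "The moon lights the sea"
tagged = [("The", "DT"), ("moon", "NN"), ("lights", "VBZ"),
          ("the", "DT"), ("sea", "NN")]
print(to_logic(predicate_argument(tagged)))  # lights(moon, sea)
```

This ignores adjectives as predicates, negation, and quantifier scope, so it is only a starting point for short sentences.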
I plan to use this idea in the input analysis by doing the following:
1) Extend the current poem database by representing every poem line in this predicate argument structure form.
2) Perform semantic analysis on the SMS input sentence and represent it using first-order logic.
3) Find lines in the poem database that have first-order logic of the same or similar form.
4) Select these lines.
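Steps 3 and 4 above could be sketched roughly as follows. This assumes each poem line is already stored alongside a (predicate, arguments) pair. The scoring (half the weight for an identical predicate, half for the Jaccard overlap of the arguments), the 0.5 threshold, and the in-memory list standing in for the poem database are all assumptions made for illustration.

```python
def similarity(a, b):
    """Score two (predicate, arguments) pairs between 0.0 and 1.0."""
    pred_a, args_a = a
    pred_b, args_b = b
    set_a, set_b = set(args_a), set(args_b)
    union = set_a | set_b
    # Jaccard overlap of the argument sets (0.0 if both are empty).
    overlap = len(set_a & set_b) / len(union) if union else 0.0
    predicate_match = 1.0 if pred_a == pred_b else 0.0
    return 0.5 * predicate_match + 0.5 * overlap


def select_lines(query, poem_lines, threshold=0.5):
    """Return poem lines whose structure scores at least `threshold`,
    best match first."""
    scored = [(similarity(query, structure), line)
              for line, structure in poem_lines]
    return [line for score, line in sorted(scored, reverse=True)
            if score >= threshold]


# Hypothetical stand-in for the poem database.
poem_lines = [
    ("The moon lights the sea", ("lights", ["moon", "sea"])),
    ("I love the sea", ("love", ["i", "sea"])),
    ("Birds sing at dawn", ("sing", ["birds", "dawn"])),
]

# Hypothetical structure for an SMS such as "I love the moon".
query = ("love", ["i", "moon"])
print(select_lines(query, poem_lines))  # ['I love the sea']
```

Exact string matching on predicates and arguments is very strict; "similar form" would probably also need synonym handling, which this sketch leaves out.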
To begin, I will attempt semantic analysis of short sentences using this method, then work on extending the poem database. Like the tf-idf calculations, the predicate argument structure representation of each line in the database should only need to be computed once for each new poem/line added to the database.