Week 9

Makino replied with new visualisation code, which Nimesha and Ken tested, but the memory leak problems persisted. After several e-mail exchanges, Makino suggested that the problem lies in the parts of the code that interact with the database (which is in CLI, Microsoft's Common Language Infrastructure), since TextDisplay.exe runs not just the visualisation code but also the code that interacts with the database. He has sent us new code, which we are testing again.

Following Prof Cheok's e-mail asking FYP students to update these blogs in detail, I have scanned some pages of one of the books I've been studying, choosing only those pages that I think help summarise the ideas of predicate calculus that I am trying to implement. There is plenty of material that I have gone through but not scanned, as it covers other topics such as the background to predicate calculus and natural language understanding in general. I have also chosen to scan pages from only one book, as it is the only one that provides some insight into how I might obtain a predicate representation of a sentence. The following scans are from the book "Artificial Intelligence - Structures and Strategies for Complex Problem Solving" by George F Luger. The other books are listed in previous entries.

Firstly, some definitions of the symbols in use:

Next, an explanation of what an atomic sentence is:

REMINDER TO SELF: One important thing to note here is that "A PREDICATE RELATION IS DEFINED BY ITS NAME AND ITS ARITY". This is important because I'm planning to add two attributes to the table for poetry lines in the database - "predicate" and "arity", which are of types string and integer respectively. This is to aid in the comparison and matching of the SMS input's predicate to those of the poetry lines.
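To make the name-and-arity idea concrete, here is a minimal Python sketch of how the matching might work. The `Predicate` type and the `same_relation` helper are hypothetical names made up for illustration; they are not part of the existing database code or the PosTagger.

```python
from typing import NamedTuple, Tuple


class Predicate(NamedTuple):
    """Hypothetical in-memory form of a predicate such as bites(man, dog)."""
    name: str               # would map to the planned "predicate" column (string)
    args: Tuple[str, ...]   # the arguments of the predicate

    @property
    def arity(self) -> int:
        # would map to the planned "arity" column (integer)
        return len(self.args)

    def signature(self) -> Tuple[str, int]:
        # A predicate relation is defined by its name and its arity.
        return (self.name, self.arity)


def same_relation(a: Predicate, b: Predicate) -> bool:
    """True when both predicates have the same name and the same arity."""
    return a.signature() == b.signature()


sms = Predicate("bites", ("man", "dog"))
poem_line = Predicate("bites", ("cat", "mouse"))
other_line = Predicate("bites", ("man",))

print(same_relation(sms, poem_line))   # same name, same arity
print(same_relation(sms, other_line))  # same name, different arity
```

Under this sketch, two predicates with the same name but a different number of arguments count as different relations, which is why both columns would be needed for the comparison.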

The following shows some syntax rules, which help determine whether a sentence is legal and also form the basis for parsing a sentence and obtaining a parse tree:

The rules above may then be applied to parse the sentence 'The man bites the dog.', and the resulting parse tree is shown as well:
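As a rough illustration of how such rules can drive a parser, the Python sketch below assumes a small grammar of the noun_phrase verb_phrase kind (sentence = noun_phrase verb_phrase; noun_phrase = noun or article noun; verb_phrase = verb or verb noun_phrase). The word lists and function names are my own assumptions for this example and may not match the scanned rules exactly.

```python
# Assumed toy vocabulary; a real system would use the PosTagger instead.
ARTICLES = {"a", "the"}
NOUNS = {"man", "dog"}
VERBS = {"likes", "bites"}


def tokenize(sentence):
    """Lower-case the sentence, drop a trailing full stop, split on spaces."""
    return sentence.lower().rstrip(".").split()


def parse_noun_phrase(tokens, i):
    """noun_phrase -> article noun | noun. Returns (subtree, next_index)."""
    if i < len(tokens) and tokens[i] in ARTICLES:
        if i + 1 < len(tokens) and tokens[i + 1] in NOUNS:
            tree = ("noun_phrase", ("article", tokens[i]), ("noun", tokens[i + 1]))
            return tree, i + 2
        return None, i
    if i < len(tokens) and tokens[i] in NOUNS:
        return ("noun_phrase", ("noun", tokens[i])), i + 1
    return None, i


def parse_verb_phrase(tokens, i):
    """verb_phrase -> verb noun_phrase | verb. Returns (subtree, next_index)."""
    if i < len(tokens) and tokens[i] in VERBS:
        verb = ("verb", tokens[i])
        noun_phrase, j = parse_noun_phrase(tokens, i + 1)
        if noun_phrase is not None:
            return ("verb_phrase", verb, noun_phrase), j
        return ("verb_phrase", verb), i + 1
    return None, i


def parse_sentence(sentence):
    """sentence -> noun_phrase verb_phrase. Returns a parse tree or None."""
    tokens = tokenize(sentence)
    noun_phrase, i = parse_noun_phrase(tokens, 0)
    if noun_phrase is None:
        return None
    verb_phrase, j = parse_verb_phrase(tokens, i)
    if verb_phrase is None or j != len(tokens):
        return None  # not a legal sentence under these rules
    return ("sentence", noun_phrase, verb_phrase)


print(parse_sentence("The man bites the dog."))
```

A sentence that the rules cannot derive, such as "bites the man", makes `parse_sentence` return `None`, which is the sense in which the rules identify whether a sentence is legal.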


The pseudocode for procedures used for semantic representation (which I am trying to implement) is as follows:


I am trying to obtain a predicate representation for any SMS input sentence using the preceding pseudocode.
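The sketch below shows one simple way a predicate could be read off a parse tree. It is not the book's pseudocode, only a simplified stand-in: it assumes the tree is a nested tuple in the noun_phrase verb_phrase shape, takes the verb as the predicate name, and takes the nouns as the arguments. All names here are hypothetical.

```python
# Assumed parse tree for "The man bites the dog." as nested tuples.
EXAMPLE_TREE = (
    "sentence",
    ("noun_phrase", ("article", "the"), ("noun", "man")),
    ("verb_phrase", ("verb", "bites"),
     ("noun_phrase", ("article", "the"), ("noun", "dog"))),
)


def head_noun(noun_phrase):
    """Return the noun inside a noun_phrase subtree, ignoring any article."""
    for child in noun_phrase[1:]:
        if child[0] == "noun":
            return child[1]
    raise ValueError("noun_phrase has no noun")


def to_predicate(tree):
    """Turn a sentence tree into (name, args): verb as name, nouns as args."""
    _, noun_phrase, verb_phrase = tree
    name = verb_phrase[1][1]          # the word under the "verb" node
    args = [head_noun(noun_phrase)]   # subject comes first
    if len(verb_phrase) > 2:          # an object noun_phrase is optional
        args.append(head_noun(verb_phrase[2]))
    return name, tuple(args)


def format_predicate(name, args):
    """Render the predicate in the usual name(arg1, arg2) notation."""
    return f"{name}({', '.join(args)})"


name, args = to_predicate(EXAMPLE_TREE)
print(format_predicate(name, args))  # bites(man, dog)
print(len(args))                     # arity of the predicate
```

The name and the length of `args` are exactly the two values that the planned "predicate" and "arity" attributes would hold.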

To do this, I am trying to understand the PosTagger code and see if there is already a parse tree that follows the noun_phrase verb_phrase structure above, which I can then use to form the predicate representation.

(Edit:
I've studied the PosTagging code. I still don't fully understand how it works, but I have at least found where it stores the words of the SMS input and the PosTag assigned to each of them. This means I can now identify each word in the SMS input separately, together with the PosTag that the PosTagger gave it.

Next, I will attempt to identify the noun_phrase and/or verb_phrase in the input so that I can form the predicate representation.)
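As a possible starting point for the noun_phrase/verb_phrase identification, here is a minimal Python sketch that groups already-tagged words into phrases. It assumes the tagger output can be read as a list of (word, tag) pairs and that the tags follow the Penn Treebank style (DT for articles, NN* for nouns, VB* for verbs). Both are assumptions, since I have not confirmed which tag set the PosTagger uses, and the function names are made up.

```python
def match_noun_phrase(tagged, i):
    """Match an optional DT followed by one or more NN* tags from index i.

    Returns (list_of_words, next_index), or (None, i) when nothing matches.
    """
    j = i
    if j < len(tagged) and tagged[j][1] == "DT":
        j += 1
    first_noun = j
    while j < len(tagged) and tagged[j][1].startswith("NN"):
        j += 1
    if j == first_noun:
        return None, i  # no noun found, so this is not a noun_phrase
    return [word for word, _ in tagged[i:j]], j


def match_verb_phrase(tagged, i):
    """Match a VB* word, optionally followed by a noun_phrase."""
    if i >= len(tagged) or not tagged[i][1].startswith("VB"):
        return None, i
    verb = tagged[i][0]
    obj, j = match_noun_phrase(tagged, i + 1)
    return {"verb": verb, "noun_phrase": obj}, j


def chunk_sentence(tagged):
    """Split a tagged sentence into noun_phrase and verb_phrase, or None."""
    subject, i = match_noun_phrase(tagged, 0)
    if subject is None:
        return None
    verb_phrase, j = match_verb_phrase(tagged, i)
    if verb_phrase is None or j != len(tagged):
        return None
    return {"noun_phrase": subject, "verb_phrase": verb_phrase}


tagged_input = [("The", "DT"), ("man", "NN"), ("bites", "VBZ"),
                ("the", "DT"), ("dog", "NN")]
print(chunk_sentence(tagged_input))
```

This only handles the simple subject, verb, object pattern; SMS input with adjectives, pronouns or missing words would need more patterns or a fallback.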