Demos - Sabine Buchholz

MBSP: Memory-based shallow parsing for English

Shallow parsing is a useful preprocessing step for many Natural Language Processing applications. After shallow parsing, a sentence is no longer just a sequence of words but carries some structure: groups of words that closely belong together are marked, and specific relations between words or groups of words are identified. In contrast to full parsing, shallow parsing does not attempt to build a structure spanning the whole sentence, so it is generally much faster. The Memory-Based Shallow Parser (MBSP) applies several modules to an English sentence supplied by the user. It first assigns a part-of-speech tag to each word in the sentence (see MBT). Next, MBSP recognises chunks (non-overlapping, non-embedded constituents). Finally, MBSP assigns subjects and objects to the verbal chunks in the sentence. MBSP is trained on the Wall Street Journal (WSJ) treebank; a link to more recent WSJ material is included.
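To make the three steps above concrete (tagging, chunking, subject/object assignment), here is a minimal toy sketch in Python. It is not MBSP and does not use memory-based learning: the lexicon, the chunking rule and the subject/object heuristic are all hypothetical stand-ins chosen for illustration, and every function name (tag, chunk, relations, shallow_parse) is an assumption rather than part of any MBSP interface. It only shows what kind of flat structure a shallow parser produces.

```python
# Toy illustration of a three-stage shallow-parsing pipeline.
# NOTE: hypothetical stand-in, NOT the MBSP implementation or its API.

# Hypothetical mini-lexicon with Penn Treebank style tags.
TOY_LEXICON = {
    "the": "DT", "a": "DT",
    "dog": "NN", "cat": "NN", "garden": "NN",
    "chased": "VBD",
    "in": "IN",
}


def tag(words):
    """Step 1: assign a part-of-speech tag to each word (unknown words get NN)."""
    return [(w, TOY_LEXICON.get(w.lower(), "NN")) for w in words]


def chunk_type(pos):
    """Map a POS tag to a coarse chunk type (toy rule)."""
    if pos in ("DT", "JJ") or pos.startswith("NN"):
        return "NP"
    if pos.startswith("VB"):
        return "VP"
    if pos == "IN":
        return "PP"
    return "O"


def chunk(tagged):
    """Step 2: merge adjacent words of the same chunk type into flat,
    non-overlapping, non-embedded chunks."""
    chunks = []
    for word, pos in tagged:
        ctype = chunk_type(pos)
        if chunks and ctype != "O" and chunks[-1][0] == ctype:
            chunks[-1][1].append(word)
        else:
            chunks.append((ctype, [word]))
    return chunks


def relations(chunks):
    """Step 3: for each verbal chunk, take the nearest NP to its left as the
    subject and an NP immediately to its right as the object (toy heuristic)."""
    rels = []
    for i, (ctype, words) in enumerate(chunks):
        if ctype != "VP":
            continue
        subject = next(
            (" ".join(w) for t, w in reversed(chunks[:i]) if t == "NP"), None
        )
        has_object = i + 1 < len(chunks) and chunks[i + 1][0] == "NP"
        obj = " ".join(chunks[i + 1][1]) if has_object else None
        rels.append({"verb": " ".join(words), "subject": subject, "object": obj})
    return rels


def shallow_parse(sentence):
    """Run all three steps on a whitespace-tokenised sentence."""
    chunks = chunk(tag(sentence.split()))
    return chunks, relations(chunks)
```

Calling shallow_parse("The dog chased a cat in the garden") with this toy lexicon yields five flat chunks (NP, VP, NP, PP, NP) and a single relation linking "chased" to the subject "The dog" and the object "a cat". Note that the prepositional phrase is left unattached, which reflects the point made above: no structure spanning the whole sentence is built.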