Everything from Exam 1, plus…
Ethics
- in-class discussion; see your own notes/readings…
Text Classification
- text categorization
- sentiment analysis
- decision list classifier
- naive Bayes classifier
- bag-of-words
- Bayesian inference
- prior probability
- likelihood
- naive Bayes assumption
- linear classifiers
- unknown words
- stop words
- multinomial NB
- binary NB
- sentiment lexicons
- feature selection
- information gain
Evaluation
- gold labels
- contingency table
- microaveraging
- macroaveraging
- statistical significance testing
- accuracy
- evalb
- mean reciprocal rank
- perplexity
POS Tagging
- part of speech (POS)
- closed class
- open class
- function word
- lots of specific parts of speech
- Penn Treebank tag set
- disambiguation
- sequence model
- Markov chain
- Markov assumption
- Hidden Markov Model
- decoding
- Viterbi algorithm
- MEMM
- feature templates
- word shapes
- label/observation bias
Grammars
- preposed and postposed constructions
- Context-Free Grammar
- lexicon
- terminal symbols
- non-terminal symbols
- derivation
- parse tree
- start symbol
- bracketed notation
- (un)grammatical
- generative grammar
- subcategorization frame
- metarules
- treebanks
- traces
- syntactic movement
- Chomsky normal form
- unit productions
- combinatory categorial grammar
- forward composition
- backward composition
- type raising
Parsing
- structural ambiguity
- attachment ambiguity
- coordination ambiguity
- syntactic disambiguation
- CKY parsing
- parsing vs. recognizing
- partial parse
- shallow parse
- chunking
- IOB tagging
- PCFG
- consistent (for a PCFG)
- yield
- probabilistic CKY
- inside-outside algorithm
- EM algorithm
- parent annotation
- lexicalized grammar
- head tag
- probabilistic CCG parsing
Meaning Representation
- meaning representation
- computational semantics
- knowledge base
- verifiability
- canonical form
- inference
- non-logical vocabulary
- logical vocabulary
- denotation
- domain (of a model)
- truth-conditional semantics
- First-Order Logic
- term
- constant
- function
- variable
- logical connectives
- quantifiers
- lambda notation
- lambda-reduction
- currying
- modus ponens
- forward chaining
- backward chaining
- abduction
- event variables
- neo-Davidsonian event representation
- temporal logic
- tense logic
- reference point
Machine Translation
- fertility
- permutation
- spurious words
- word alignments
- fractional counts
- alignment probabilities
- distortion
Information Extraction
- named entity recognition
- relation extraction
- event extraction
- temporal expressions
- temporal normalization
- template filling
- template recognition
- gazetteer
- RDF
- seed patterns
- bootstrapping
- confidence values
- semantic drift
- noisy-or technique
- distant supervision
- open information extraction
- lexical triggers
- temporal anchor
Question Answering
- query reformulation
- question classification
- answer type
- passage retrieval
- span labeling
- n-gram tiling
- reading comprehension datasets
- sentence selection
- focus
Other Concepts From Labs
- word sense
- polysemous
- most frequent sense
- feature vector
- collocation
- sparse matrix
- iterative parsing