520.666 Information Extraction from Speech and Text Homework # 4: Alternative strategies for smoothing a bigram language model

(520|600).666 Information Extraction from Speech and Text, Homework #4. Due March 10, 2023.

In class, we discussed linear interpolation for smoothing a bigram language model, namely

$$P(w \mid v) = \lambda f(w \mid v) + (1-\lambda) f(w),$$

where $f(\cdot \mid \cdot)$ and $f(\cdot)$ denoted the appropriate relative frequency estimates, and $\lambda$ was chosen so as to maximize the probability of some held-out data.
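As a point of reference, the interpolation scheme recalled above can be sketched in Python. This is only an illustrative sketch: the function names `interpolated_prob` and `best_lambda`, the dictionary representation of the counts, and the use of a grid search in place of an analytic or EM-based maximization are all assumptions, not part of the assignment. The unigram estimate is taken from the counts of each word in the second bigram position, which is also an assumption.

```python
import math

def interpolated_prob(v, w, bigram_counts, lam):
    """lam * f(w|v) + (1 - lam) * f(w), with relative frequencies from bigram_counts."""
    # C(v): number of training bigrams whose history is v (assumed definition).
    hist = sum(c for (x, _), c in bigram_counts.items() if x == v)
    # C(w): number of training bigrams whose predicted word is w (assumed definition).
    uni = sum(c for (_, y), c in bigram_counts.items() if y == w)
    total = sum(bigram_counts.values())
    f_bigram = bigram_counts.get((v, w), 0) / hist if hist else 0.0
    f_unigram = uni / total
    return lam * f_bigram + (1 - lam) * f_unigram

def best_lambda(bigram_counts, heldout_counts, grid):
    """Pick the grid value of lam with the highest held-out log-probability."""
    def loglik(lam):
        # Raises ValueError if some held-out bigram gets probability zero.
        return sum(n * math.log(interpolated_prob(v, w, bigram_counts, lam))
                   for (v, w), n in heldout_counts.items())
    return max(grid, key=loglik)

# Toy counts, purely hypothetical.
train = {("a", "a"): 2, ("a", "b"): 1, ("b", "a"): 1}
held = {("a", "b"): 1, ("b", "b"): 1}
lam_hat = best_lambda(train, held, [i / 10 for i in range(1, 10)])
```

The grid deliberately excludes $\lambda = 1$, since a held-out bigram unseen in training would then receive probability zero.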

This homework considers alternative strategies for smoothing a bigram language model by directly modifying the counts observed in the training data. In particular, let $C(v,w)$ denote the count of a bigram $\langle v,w \rangle$ in the training text, and let $\tilde{C}(v,w)$ be the modified count. For some constant $\delta > 0$, consider the three cases

(i) $\tilde{C}(v,w) = C(v,w) + \delta$,
(ii) $\tilde{C}(v,w) = C(v,w) + \delta\, C(w)$, and
(iii) $\tilde{C}(v,w) = C(v,w) + \delta\, C(v) f(w)$.

In each case, the smoothed bigram probability is calculated as

$$\tilde{P}(w \mid v) = \frac{\tilde{C}(v,w)}{\sum_{w' \in V} \tilde{C}(v,w')}.$$

Let $N(v,w)$ denote the count of a bigram $\langle v,w \rangle$ in the held-out text $H$.
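The three count modifications and the resulting smoothed probability can be illustrated with a minimal Python sketch. The names `smoothed_bigram` and `heldout_logprob`, the argument `delta` for the smoothing constant, and the toy counts are hypothetical. The sketch assumes that $C(v)$ is the number of times $v$ occurs as a history, that $C(w)$ is the number of times $w$ occurs as a predicted word, and that $f(w)$ is $C(w)$ divided by the total bigram count; the assignment itself does not spell these out.

```python
import math
from collections import Counter

def smoothed_bigram(bigram_counts, vocab, delta, case):
    """Return the smoothed P(w|v) as a nested dict for case 'i', 'ii' or 'iii'."""
    hist = Counter()  # C(v), assumed to be sum over w of C(v, w)
    uni = Counter()   # C(w), assumed to be sum over v of C(v, w)
    for (v, w), c in bigram_counts.items():
        hist[v] += c
        uni[w] += c
    total = sum(uni.values())

    def modified(v, w):
        c = bigram_counts.get((v, w), 0)
        if case == "i":
            return c + delta
        if case == "ii":
            return c + delta * uni[w]
        if case == "iii":
            return c + delta * hist[v] * uni[w] / total
        raise ValueError(f"unknown case: {case}")

    probs = {}
    for v in vocab:
        row = {w: modified(v, w) for w in vocab}
        z = sum(row.values())  # zero in case (iii) if v never occurs as a history
        probs[v] = {w: x / z for w, x in row.items()}
    return probs

def heldout_logprob(probs, heldout_counts):
    """Sum of N(v, w) * log P(w|v) over the held-out bigrams."""
    return sum(n * math.log(probs[v][w]) for (v, w), n in heldout_counts.items())

# Toy counts, purely hypothetical; every vocabulary word occurs as a history.
vocab = ["a", "b"]
train = {("a", "a"): 2, ("a", "b"): 1, ("b", "a"): 1}
held = {("a", "b"): 1, ("b", "b"): 1}
scores = {case: heldout_logprob(smoothed_bigram(train, vocab, 1.0, case), held)
          for case in ("i", "ii", "iii")}
```

Evaluating `heldout_logprob` over a range of `delta` values gives a numerical check on any analytic expression for the optimal constant.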

  1. Derive an expression for the $\delta$ that maximizes the log-probability $\log P(H) = \sum_{v \in V} \sum_{w \in V} N(v,w) \log \tilde{P}(w \mid v)$ of the held-out text in each of the three cases (i), (ii) and (iii) above.

  2. Show that if $N(v,w) = C(v,w)$ for all bigrams $\langle v,w \rangle$, then the optimal value is $\delta = 0$ in each case. Why is this an expected result?

  3. Show, in each case, that $\tilde{P}$ may be written as the linear interpolation of a bigram and a lower order language model, though not necessarily $f(w)$: $\tilde{P}(w \mid v) = \lambda f_2(w \mid v) + (1-\lambda) f_1(w)$; i.e., identify $f_1$, $f_2$ and $\lambda$, and discuss the merits/drawbacks of each smoothing strategy. After finishing the homework, carefully review all sections of Chapter 4 again.
