(520|600).666
Information Extraction from Speech and Text
Homework # 4
Due March 10, 2023.
In class, we discussed linear interpolation for smoothing a bigram language model, namely

P(w|v) = λ f(w|v) + (1 − λ) f(w),

where f(·|·) and f(·) denote the appropriate relative frequency estimates, and λ is chosen so as to maximize the probability of some held-out data.
This homework considers alternative strategies for smoothing a bigram language model by directly modifying the counts observed in the training data. In particular, let C(v, w) denote the count of a bigram ⟨v, w⟩ in the training text, and let C*(v, w) be the modified count. For some constant β > 0, consider the three cases

(i)   C*(v, w) = C(v, w) + β,
(ii)  C*(v, w) = C(v, w) + β C(w), and
(iii) C*(v, w) = C(v, w) + β C(v) f(w).
In each case, the smoothed bigram probability is calculated as

P(w|v) = C*(v, w) / Σ_{w′∈V} C*(v, w′).
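The three count-modification schemes and the renormalization above can be sketched in Python. The toy corpus and all names below are illustrative assumptions, not part of the assignment:

```python
from collections import Counter

# Hypothetical toy training text; counts and relative frequencies come from it.
train = "a b a b b a a b".split()

C2 = Counter(zip(train, train[1:]))   # bigram counts C(v, w)
C1 = Counter(train[:-1])              # context counts C(v)
Cw = Counter(train)                   # unigram counts C(w)
V = sorted(set(train))                # vocabulary
N = sum(Cw.values())
f_uni = {w: Cw[w] / N for w in V}     # relative frequency f(w)

def modified_count(v, w, beta, scheme):
    """C*(v, w) under schemes (i)-(iii) from the assignment."""
    if scheme == "i":
        return C2[(v, w)] + beta
    if scheme == "ii":
        return C2[(v, w)] + beta * Cw[w]
    if scheme == "iii":
        return C2[(v, w)] + beta * C1[v] * f_uni[w]
    raise ValueError(scheme)

def p_smoothed(w, v, beta, scheme):
    """P(w | v) = C*(v, w) / sum over w' of C*(v, w')."""
    denom = sum(modified_count(v, wp, beta, scheme) for wp in V)
    return modified_count(v, w, beta, scheme) / denom

# For any beta > 0, each conditional distribution sums to 1 by construction.
for scheme in ("i", "ii", "iii"):
    assert abs(sum(p_smoothed(w, "a", 0.5, scheme) for w in V) - 1.0) < 1e-12
```

Note that for β = 0 all three schemes reduce to the unsmoothed relative frequency f(w|v), which is useful context for the second question below.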
Let N(v, w) denote the count of a bigram ⟨v, w⟩ in the held-out text H.
- Derive an expression for the β that maximizes the log-probability

  log P(H) = Σ_{v∈V} Σ_{w∈V} N(v, w) log P(w|v)

  of the held-out text in each of the three cases (i), (ii) and (iii) above.
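A numerical sanity check for the derivation (not a substitute for the closed form): for scheme (i), the held-out log-probability can be evaluated directly and β optimized by grid search. The corpora and the grid below are hypothetical:

```python
import math
from collections import Counter

# Hypothetical toy corpora standing in for the training text and held-out text H.
train = "a b a b b a a b a".split()
heldout = "b a b a a b".split()

C2 = Counter(zip(train, train[1:]))      # training bigram counts C(v, w)
V = sorted(set(train))                   # vocabulary
N2 = Counter(zip(heldout, heldout[1:]))  # held-out bigram counts N(v, w)

def log_p_heldout(beta):
    """log P(H) under scheme (i), where C*(v, w) = C(v, w) + beta."""
    total = 0.0
    for (v, w), n in N2.items():
        denom = sum(C2[(v, wp)] + beta for wp in V)
        total += n * math.log((C2[(v, w)] + beta) / denom)
    return total

# Coarse grid search over beta in (0, 5]; the maximizer found here should
# agree with the closed-form expression the homework asks you to derive.
grid = [0.01 * k for k in range(1, 501)]
beta_best = max(grid, key=log_p_heldout)
```

Plotting log_p_heldout over the grid for each of the three schemes is also a quick way to see that the objective is well behaved in β.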
- Show that if N(v, w) = C(v, w) for all bigrams ⟨v, w⟩, then the optimal value is β = 0 in each case. Why is this an expected result?
- Show, in each case, that P may be written as the linear interpolation of a bigram and a lower-order language model, though not necessarily f(w),

  P(w|v) = λ f_2(w|v) + (1 − λ) f_1(w),

  i.e., identify f_1, f_2 and λ, and discuss the merits/drawbacks of each smoothing strategy.
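For intuition about the requested form, case (i) can be worked out as a sketch (the other two cases are left to the derivation): since Σ_{w′} (C(v, w′) + β) = C(v) + β|V|,

```latex
P(w \mid v) \;=\; \frac{C(v,w)+\beta}{C(v)+\beta\,|V|}
\;=\; \underbrace{\frac{C(v)}{C(v)+\beta\,|V|}}_{\lambda}\, f(w \mid v)
\;+\; \underbrace{\frac{\beta\,|V|}{C(v)+\beta\,|V|}}_{1-\lambda}\cdot \frac{1}{|V|},
```

so here f_2(w|v) = f(w|v), the lower-order model f_1 is the uniform distribution 1/|V| rather than f(w), and λ = C(v)/(C(v) + β|V|) varies with the context count C(v).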
After finishing the homework, carefully review all sections of Chapter 4 again.