Machine learning questions

Answer the following machine learning questions regarding gradient estimators.

2. Gradient estimators, 20 points.

All else being equal, it’s useful for a gradient estimator to be unbiased. The unbiasedness of a gradient estimator guarantees that, if we decay the step size and run stochastic gradient descent for long enough (see Robbins & Monro), it will converge to a local optimum.

The standard REINFORCE, or score-function, estimator is defined as:

$$\hat{g}_{SF}[f] = f(b)\,\frac{\partial}{\partial\theta} \log p(b|\theta), \qquad b \sim p(b|\theta) \tag{2.1}$$

(a) [5 points] First, let’s warm up with the score function. Prove that the score function has zero expectation, i.e. $E_{p(x|\theta)}[\nabla_\theta \log p(x|\theta)] = 0$. Assume that you can swap the derivative and integral operators.

(b) [5 points] Show that REINFORCE is unbiased: $E_{p(b|\theta)}\left[f(b)\,\frac{\partial}{\partial\theta} \log p(b|\theta)\right] = \frac{\partial}{\partial\theta} E_{p(b|\theta)}[f(b)]$.

(c) [5 points] Show that REINFORCE with a fixed baseline is still unbiased, i.e. show that $E_{p(b|\theta)}\left[[f(b) - c]\,\frac{\partial}{\partial\theta} \log p(b|\theta)\right] = \frac{\partial}{\partial\theta} E_{p(b|\theta)}[f(b)]$ for any fixed $c$.

(d) [5 points] If the baseline depends on $b$, then REINFORCE will in general give biased gradient estimates. Give an example where $E_{p(b|\theta)}\left[[f(b) - c(b)]\,\frac{\partial}{\partial\theta} \log p(b|\theta)\right] \neq \frac{\partial}{\partial\theta} E_{p(b|\theta)}[f(b)]$ for some function $c(b)$, and show that it is biased.

The takeaway is that you can use a baseline to reduce the variance of REINFORCE, but not one that depends on the current action.
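Before attempting the proofs, it can help to check the four claims numerically on a small case. The sketch below is only a sanity check, not a proof, and everything in it is an assumption made for illustration: the distribution is taken to be a Bernoulli with p(b=1|θ) = θ, the objective `f`, the value θ = 0.3, the fixed baseline 2.0 and the action-dependent baseline c(b) = b are all hypothetical choices. Because b takes only two values, each expectation is computed exactly by enumeration rather than by sampling.

```python
import math


def score(b, theta):
    """d/dtheta log p(b | theta) for a Bernoulli with p(b=1 | theta) = theta."""
    return b / theta - (1 - b) / (1 - theta)


def expect(g, theta):
    """Exact expectation of g(b) under Bernoulli(theta), enumerating b in {0, 1}."""
    return (1 - theta) * g(0) + theta * g(1)


def f(b):
    # Hypothetical objective, chosen only for illustration.
    return (b - 0.45) ** 2


theta = 0.3  # assumed parameter value

# E[f(b)] = theta * f(1) + (1 - theta) * f(0), so its derivative in theta is:
true_grad = f(1) - f(0)

# (a) expected score, (b) plain REINFORCE, (c) fixed baseline c = 2.0,
# (d) baseline c(b) = b that depends on the sampled action.
score_mean = expect(lambda b: score(b, theta), theta)
reinforce_mean = expect(lambda b: f(b) * score(b, theta), theta)
fixed_baseline_mean = expect(lambda b: (f(b) - 2.0) * score(b, theta), theta)
dependent_baseline_mean = expect(lambda b: (f(b) - b) * score(b, theta), theta)

print("true gradient:           ", true_grad)
print("(a) E[score]:            ", score_mean)
print("(b) E[REINFORCE]:        ", reinforce_mean)
print("(c) E[fixed baseline]:   ", fixed_baseline_mean)
print("(d) E[b-dependent base.]:", dependent_baseline_mean)
```

In this particular setup, (a) comes out as zero up to floating-point rounding, (b) and (c) match the true gradient, and (d) differs from it by exactly E[c(b) ∂/∂θ log p(b|θ)], which is 1 for c(b) = b. The written answers still need the general argument using the swap of derivative and integral.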
