Chapter 6: Bayesian Multi-armed Bandits Code #546

Open
Ander-MZ opened this issue Jul 13, 2022 · 0 comments
Ander-MZ commented Jul 13, 2022

After carefully studying the example code for the multi-armed bandit in Chapter 6, I found a line that I believe is missing a parameter:

def sample_bandits(self, n=1):

    bb_score = np.zeros(n)
    choices = np.zeros(n)

    for k in range(n):
        # sample from the bandits' priors, and select the largest sample
        choice = np.argmax(np.random.beta(1 + self.wins, 1 + self.trials - self.wins))

        # sample the chosen bandit
        result = self.bandits.pull(choice)

Here, np.random.beta(1 + self.wins, 1 + self.trials - self.wins) is called without the size parameter, so I believe it returns a single value rather than an array. That would make the np.argmax() call useless for picking a bandit, as it would always return 0.

Shouldn't the code be np.random.beta(1 + self.wins, 1 + self.trials - self.wins, len(self.n_bandits))?
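
One way to check this is a small standalone script. The sketch below assumes that self.wins and self.trials are NumPy arrays with one entry per bandit; the excerpt above does not show how they are initialised, so the names wins, trials and n_bandits here are hypothetical stand-ins. Under that assumption, np.random.beta broadcasts its array parameters and draws one sample per bandit even when size is omitted, and only returns a single value when both parameters are scalars.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the script is repeatable

# Hypothetical stand-ins for self.wins and self.trials (assumed to be arrays).
n_bandits = 3
wins = np.zeros(n_bandits)
trials = np.zeros(n_bandits)

# No size argument: array parameters are broadcast, one draw per bandit.
samples = np.random.beta(1 + wins, 1 + trials - wins)
print(samples.shape)  # (3,)

# Passing size explicitly yields the same shape.
samples_sized = np.random.beta(1 + wins, 1 + trials - wins, size=n_bandits)
print(samples_sized.shape)  # (3,)

# Scalar parameters and no size: a single float is returned.
single = np.random.beta(1.0, 1.0)
print(np.ndim(single))  # 0

# Repeat the argmax selection many times and record which bandits get picked.
picked = {
    int(np.argmax(np.random.beta(1 + wins, 1 + trials - wins)))
    for _ in range(1000)
}
print(sorted(picked))
```

If wins and trials really are arrays, picked should contain more than one index, which would mean the argmax call is not stuck at 0. If they turn out to be scalars in the actual class, the behaviour described in the question would apply instead.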
