Training an AI with RL to win pokemon battles, how should i define the rewards so it learns to play properly and get a consistent winrate?

  1. 3 weeks ago
    Anonymous

    >self.
    >self.
    >self.
    >self.

    • 3 weeks ago
      Anonymous

      >alive(0)
      >alive(1)
      >alive(2)
      >alive(3)
      >alive(4)
      is this bait?

      • 3 weeks ago
        Anonymous

        It's not bait, it looks cute.

        https://i.imgur.com/3R6ijrn.jpg

        >self.
        >self.
        >self.
        >self.

        Don't worry about that.

        Anyways, i was thinking of the obvious rewards like positive rewards for fainting an enemy pokemon, and another one that's calculated with comparing the previous current HP and the one after taking the action.

        What i'm unsure of is what to do with cases where a switch-in is the optimal move. In that case no pokemon would be fainted and the user's HP would decrease, which would give either no reward or a negative one despite being the optimal move.

        • 3 weeks ago
          Anonymous

          >Anyways, i was thinking of the obvious rewards like positive rewards for fainting an enemy pokemon
          i think that will bias it towards frontloading strong attackers and just trying to get a few kills. i would try setting it up to reward only winning. since pokemon is just clicking menus, it shouldn't have trouble winning a few battles by chance to kick off the training. (i think micro-rewards are for complicated games, like an action game, where it needs early rewards because it won't win even once until it gets a lot of training.)
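A minimal sketch of the win-only idea, assuming a reward of +1 for a win and -1 for a loss given on the final turn only (the function names and the discount factor are hypothetical, not taken from OP's code). It shows how the discounted return still assigns credit to earlier turns, so a turn spent switching is not punished on its own:

```python
from typing import List


def terminal_reward(won: bool) -> float:
    """Sparse reward: +1 for a win, -1 for a loss, given only on the last turn."""
    return 1.0 if won else -1.0


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Compute the discounted return G_t for every turn of one battle."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each turn accumulates the discounted future reward.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical 4-turn battle: every turn gives 0 except the winning final turn.
rewards = [0.0, 0.0, 0.0, terminal_reward(won=True)]
returns = discounted_returns(rewards, gamma=0.9)
print(returns)
```

Every turn of a won battle ends up with a positive return, including turns where nothing fainted. The trade-off is the one raised in the thread: the signal is weak, so learning will probably take longer.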

          • 3 weeks ago
            Anonymous

            I'll try this first and see how it does then, but i assume it will take it a lot longer to learn

            >comparing the previous current HP and the one after taking the action.
            >wouldn't that completely remove moves that self-injure like curse or something?

            Like double edge? i forgot about that, you are right. I'll use both HP values instead, the enemy's and the user's, so that if the enemy's HP is reduced more, it gives a higher reward. Nice catch.
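One possible reading of the "use both HP values" idea, as a rough sketch: reward the damage dealt minus the damage taken in a turn, plus a bonus per faint. All names, the HP-as-fraction convention, and the `faint_bonus` value are assumptions for illustration, not OP's actual code:

```python
from typing import Sequence


def step_reward(
    own_before: Sequence[float],
    own_after: Sequence[float],
    foe_before: Sequence[float],
    foe_after: Sequence[float],
    faint_bonus: float = 0.5,  # arbitrary weight, would need tuning
) -> float:
    """Per-turn reward from HP fractions (0.0 to 1.0 per pokemon) of both teams."""
    damage_dealt = sum(foe_before) - sum(foe_after)
    damage_taken = sum(own_before) - sum(own_after)
    # Count pokemon that went from alive to fainted during this turn.
    foe_faints = sum(1 for b, a in zip(foe_before, foe_after) if b > 0 and a <= 0)
    own_faints = sum(1 for b, a in zip(own_before, own_after) if b > 0 and a <= 0)
    return (damage_dealt - damage_taken) + faint_bonus * (foe_faints - own_faints)


# Hypothetical recoil move: the user loses 25% HP, the enemy loses 50%.
recoil_turn = step_reward([1.0], [0.75], [1.0], [0.5])
print(recoil_turn)
```

With these numbers the recoil move still nets a positive reward because the enemy lost more than the user. A turn spent switching into a hit would still score negative under this scheme, which is the bias the other anon warned about.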

        • 3 weeks ago
          Anonymous

          >comparing the previous current HP and the one after taking the action.
          wouldn't that completely remove moves that self-injure like curse or something?

      • 3 weeks ago
        Anonymous

        I've seen worse code from China and India in real companies, tbh. It's still dogshit.

        https://i.imgur.com/4yK31fx.png

        >Training an AI with RL to win pokemon battles, how should i define the rewards so it learns to play properly and get a consistent winrate?

        You have a bunch of things with very specific use cases. Why do you have a generic list that mangles all of them together? How do you later easily find out any of the information you've stuck in that? Why are you using magic strings to compare to values? Why are you commenting the obvious, like "pokemon at slot 0"?

        >i think there is nothing to laugh at but you can teach other people to do it. We all talked about it: everyone in this economy has 10 years of python experience on their CV but maybe 6 month at most for real. Tell me how to improve and do not laugh at me because i can be your future coworker.

        Anon is right. It is something to laugh at and deserves to be laughed at. It's ok to not know a cereal spoon isn't the best answer, it's not ok to be told you're a moron and then ignore it.

        • 3 weeks ago
          Anonymous

          I'm putting all the input data for the actor neural network into an array that will later be turned into a tensor. It doesn't need any particular organization as far as i'm aware.
          The pokemon is alive 0-5 checks will just become 5 0/1 values in the array that the machine (hopefully) takes into account when choosing an action.
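A rough sketch of how the flat input array could still be built from named pieces, which addresses the earlier objections about a generic list and magic strings without changing what the network sees. The `Status` values, the `PokemonState` fields, and the function names are all hypothetical, since the actual code is only visible in the screenshot:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    """Named constants instead of comparing against magic strings."""
    NONE = 0
    BURN = 1
    PARALYSIS = 2
    POISON = 3
    SLEEP = 4
    FREEZE = 5


@dataclass
class PokemonState:
    hp_fraction: float  # remaining HP as a value between 0.0 and 1.0
    alive: bool
    status: Status = Status.NONE


def encode_pokemon(p: PokemonState) -> List[float]:
    """HP, alive flag, then a one-hot vector over Status (8 floats total)."""
    one_hot = [1.0 if p.status is s else 0.0 for s in Status]
    return [p.hp_fraction, 1.0 if p.alive else 0.0] + one_hot


def encode_team(team: List[PokemonState]) -> List[float]:
    """Concatenate every pokemon's features into one flat list."""
    features: List[float] = []
    for p in team:
        features.extend(encode_pokemon(p))
    return features


team = [PokemonState(0.5, True, Status.BURN), PokemonState(0.0, False)]
observation = encode_team(team)
print(len(observation))
```

The result is still a plain list of floats that can be turned into a tensor later. The difference is that each position has a known meaning, which makes it easier to debug what the network is actually being fed.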

          • 3 weeks ago
            Anonymous

            what i guess the other anon found as a problem is that you do not use for loops: https://www.w3schools.com/python/python_for_loops.asp
            a python for each in the style of:
            "
            for x in "banana":
                print(x)
            "

            • 3 weeks ago
              Anonymous

              That's fine, i'm not really looking to improve the code or make it look nice, it's just a personal project. The one asking about standards and formats is another anon, i just care about defining the right rewards.

              • 3 weeks ago
                Anonymous

                still, if you used for loops you could have just 9-10 lines of code instead of like 40. it would save time to use loops.

          • 3 weeks ago
            Anonymous

            so, you see this example:
            "
            fruits = ["apple", "banana", "cherry"]
            for x in fruits:
                print(x)
            "
            put all the strings you have in such an array and do a for each.
            then next you see:
            "
            for x in range(6):
                print(x)
            "
            this is how you do the numbered calls (range(6) gives 0 through 5).
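Applied to the repeated alive(0), alive(1), ... lines, the loop could look roughly like this. The `Team` class and its `alive(slot)` method are hypothetical stand-ins, since the real object is only visible in the screenshot:

```python
from typing import List, Sequence


class Team:
    """Hypothetical stand-in for whatever object exposes alive(slot)."""

    def __init__(self, hp_values: Sequence[int]):
        self.hp_values = list(hp_values)

    def alive(self, slot: int) -> bool:
        return self.hp_values[slot] > 0


def alive_flags(team: Team, slots: int = 6) -> List[int]:
    """One 0/1 value per slot, replacing the copy-pasted alive(i) lines."""
    return [1 if team.alive(i) else 0 for i in range(slots)]


team = Team([120, 0, 85, 0, 40, 200])
print(alive_flags(team))  # prints [1, 0, 1, 0, 1, 1]
```

The `slots` parameter is there because the thread is not consistent about whether there are five or six slots to check.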

        • 3 weeks ago
          Anonymous

          i still want to know how to omit "self." in code. the funny thing is that the code i have open right now uses :
          "
          QMainWindow.ui1 = Ui_MainWindow()
          QMainWindow.ui1.setupUi(QMainWindow)

          QMainWindow.resize(800, 800)
          QMainWindow.move(200, 200)
          "
          where QMainWindow would have been "self" if it was a normal app that did not have 2 windows. this is the only way i see to omit "self": write the name of the object that self refers to.

          • 3 weeks ago
            Anonymous

            >I've seen worse code from China and India in real companies, tbh. It's still dogshit.
            >[...]
            >You have a bunch of things with very specific use cases. Why do you have a generic list that mangles all of them together? How do you later easily find out any of the information you've stuck in that? Why are you using magic strings to compare to values? Why are you commenting the obvious, like "pokemon at slot 0"?
            >[...]
            >Anon is right. It is something to laugh at and deserves to be laughed at. It's ok to not know a cereal spoon isn't the best answer, it's not ok to be told you're a moron and then ignore it.

            and let's be honest: you cannot omit the object.function structure, so i do not understand the problem with "self"

    • 3 weeks ago
      Anonymous

      funny enough i program like that in Qt. It works. I would not do the >alive(0)...(1) thing, as it can be put in an array and a for each loop, but what is wrong with >self.?

      • 3 weeks ago
        Anonymous

        >It works.
        You can dig a ditch with a cereal spoon if you want but people will laugh at you and you will deserve to be laughed at.

        • 3 weeks ago
          Anonymous

          i think there is nothing to laugh at but you can teach other people to do it. We all talked about it: everyone in this economy has 10 years of python experience on their CV but maybe 6 month at most for real. Tell me how to improve and do not laugh at me because i can be your future coworker.

          • 3 weeks ago
            Anonymous

            SAAR do not redeem

            • 3 weeks ago
              Anonymous

              1. if i want to redeem i will.
              2. i do not think there is an easy way to omit writing "self" as it is a part of the code that has meaning
              3. i am not talking about the parts that are just an array and a for each loop, i am talking specifically about the "self" part.

              • 3 weeks ago
                Anonymous

                have you asked chat gpt

              • 3 weeks ago
                Anonymous

                i have not. but i think you are a butthole and i will not. now i will explain to the first anon what i find to be a problem in his code and will consider your posts null and void.

          • 3 weeks ago
            Anonymous

            To be fair Python is literally a joke language (named after comedy troupe Monty Python) so maybe I'm taking it too seriously. Enjoy getting paid to copy paste the same thing 100x in a row, if you can get away with it.

          • 3 weeks ago
            Anonymous

            use this, for one thing https://black.readthedocs.io/en/stable/

            • 3 weeks ago
              Anonymous

              the frick

              • 3 weeks ago
                Anonymous

                >use this, for one thing https://black.readthedocs.io/en/stable/

                https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html
                this just formats the code? I do not think it is relevant to the question of why you would use "self"

              • 3 weeks ago
                Anonymous

                >use this, for one thing https://black.readthedocs.io/en/stable/

                >the frick

                although it is useful and i could use it. So thank you.

  2. 3 weeks ago
    Anonymous

    give my you're code sar

  3. 3 weeks ago
    Anonymous

    Actually the Mixtral response is bad. I am posting the Gemma response and deleting the Mixtral one:

    >can you omit "self." calls in python?

    Omitting "self." calls in Python
    Yes, it is possible to omit "self." calls in Python when calling methods within the same class. This is because "self." is the default argument for the first parameter of instance methods, and it is used to refer to the current instance of the class.

    Here's a breakdown:

    With "self.":

    class MyClass:
        def __init__(self, value):
            self.value = value

        def add(self, other):
            return self.value + other

    # Create an instance
    my_object = MyClass(5)

    # Call the add method with "self."
    result1 = my_object.add(3)

    Without "self.":

    class MyClass:
        def __init__(self, value):
            self.value = value

        def add(self, other):
            return self.value + other

    # Create an instance
    my_object = MyClass(5)

    # Call the add method without "self."
    result2 = MyClass.add(my_object, 3)

    Both result1 and result2 will have the same value of 8.

    While omitting "self." is allowed, it is generally recommended to include it for clarity and consistency:

    Clarity: Including "self." makes it clear that the method is an instance method, and it avoids ambiguity with static methods.
    Consistency: Consistent use of "self." promotes readability and avoids potential errors.

    Therefore, omitting "self." should be reserved for situations where it improves readability or avoids unnecessary verbosity:

    Nested classes: In nested classes, omitting "self." can reduce nesting depth and improve readability.
    Static methods: For static methods, "self." is unnecessary as there is no instance associated with the method.

    In general, the decision of whether to omit "self." should be based on the specific context and readability of the code.

    End to End Time: 2.13 s
    Model: Gemma-7b-It
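Worth noting that both snippets in the quoted answer still write `self.value` inside the method. The only thing that changes is how `add` is called (bound call vs. passing the instance explicitly). A small sketch with a hypothetical `Counter` class, showing that dropping `self.` inside a method body makes Python look for a plain local or global name instead of the attribute:

```python
class Counter:
    def __init__(self, start: int):
        self.value = start

    def add_with_self(self, other: int) -> int:
        return self.value + other

    def add_without_self(self, other: int) -> int:
        # "value" is resolved as a local/global name, not as an attribute,
        # so this raises NameError unless such a name happens to exist.
        return value + other


c = Counter(5)
bound = c.add_with_self(3)              # usual bound-method call
unbound = Counter.add_with_self(c, 3)   # same method, instance passed explicitly

try:
    c.add_without_self(3)
    omitted_self_works = True
except NameError:
    omitted_self_works = False

print(bound, unbound, omitted_self_works)
```

So the two call styles are equivalent, but attribute access inside the method still needs the instance name in front of it, which matches what the other anon said about not being able to omit the object.function structure.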
