Following up from last week, I talked to Nathan and decided to work on implementing the length function before trying to improve the objective function. The length function seemed fairly easy to implement, so I was puzzled that it wasn't working at all. It took an embarrassingly long time before I realized that a relic of an improperly deleted earlier attempt was still in my code, so the random tensors being used for generation were entirely different from the leftover ones. Once that was remedied, the length function worked, and more excitingly, the text improved further. This was much more of a success than anticipated.
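To make the idea concrete, here is a minimal sketch of what a length term could look like. This is my own illustrative version, not the actual implementation: the names `length_penalty` and `scored`, the squared-deviation form, and the target length are all assumptions.

```python
import numpy as np

def length_penalty(lengths, target_len, weight=0.1):
    # Hypothetical length term: penalize the squared deviation of each
    # generated sentence's length from a target length.
    lengths = np.asarray(lengths, dtype=float)
    return weight * (lengths - target_len) ** 2

def scored(base_scores, lengths, target_len=12):
    # Combine a base objective (higher = better) with the length term
    # by subtracting the penalty.
    return np.asarray(base_scores, dtype=float) - length_penalty(lengths, target_len)
```

Under this kind of formulation, a sentence at exactly the target length pays no penalty, and the penalty grows quadratically as the length drifts away from it.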
The disadvantage of the length function, though, was that because the random numbers get optimized universally, and because there were very few long sentences in the dataset, long sentences tended to be very bad and drag the shorter random numbers down with them. It helped to set the maximum length much lower, but even so, it still struggled to construct viable 10-15 word sentences. In a move that echoes my previous attempts, I decided to work more on the objective function, training it with different setups to try to make it more accurate. It worked, but unfortunately, it now effectively returns binary scores of either 0 or 1, with nothing in between. To make progress, we decided to try to resolve that. In my last week, I'm going to make a final effort to improve the objective function, and hopefully come up with a result that works!
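One generic way to soften a classifier that has collapsed to 0-or-1 outputs is to read the pre-sigmoid logit and rescale it with a temperature greater than 1 before squashing. This is only a sketch of that general trick, not the actual objective network; the name `graded_score` and the temperature value are assumptions of mine.

```python
import math

def graded_score(logit, temperature=4.0):
    # Dividing the logit by a temperature > 1 flattens the sigmoid,
    # so a confidently positive logit maps to a score below 1.0
    # instead of saturating, leaving a usable gradient in between.
    return 1.0 / (1.0 + math.exp(-logit / temperature))
```

For example, a logit of 8 passed through a plain sigmoid is nearly 1, while `graded_score(8.0)` stays noticeably below 1, so differences between strong candidates remain visible to the optimizer.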