Week 10

This was the final week of my project, and I’m genuinely sad to see it go. I started out by trying to improve the objective function, as promised. I did manage to make it slightly more accurate on the test data; unfortunately for me, this did not really change the quality of the generated sentences. I tweaked the design a lot, but in the end, the objective function remained the limiting factor. Over the course of this project, I’ve come to the conclusion that maybe I should have implemented it with a GAN; the problem is that so many design iterations had passed by the time that became clear that it was a little too late to switch.


Week 9

Following up from last week, I talked to Nathan and decided to work on implementing the length function before trying to improve the objective function. The length function seemed like it should be fairly easy to implement, so I was very puzzled when it didn’t work at all. It took an embarrassingly long time before I realized that a relic of an improperly deleted earlier attempt was still lurking in my code, so the random tensors actually being used for generation were entirely different from the ones I intended to use. Once that was remedied, the length function worked, and more excitingly, the text improved further. This was much more of a success than I had anticipated.
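For reference, the length measure itself really is simple; the hard part was the stale code feeding it the wrong tensors. Here’s a minimal sketch of what counting generated sentence lengths might look like, assuming generation produces padded batches of token indices (the `PAD_IDX` value is hypothetical; the real one depends on the vocabulary):

```python
import torch

PAD_IDX = 0  # hypothetical padding index, not the project's actual value

def sentence_lengths(token_batch: torch.Tensor) -> torch.Tensor:
    """Count non-padding tokens in each generated sentence.

    token_batch: (batch_size, seq_len) tensor of token indices.
    Returns a (batch_size,) tensor of per-sentence lengths.
    """
    return (token_batch != PAD_IDX).sum(dim=1)
```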


Week 8

After working with the objective function for a while last week, there were still a lot of issues. For one thing, the new model didn’t train unless the learning rate was absurdly high: close to 0.1, which is orders of magnitude larger than it should be. For another, it still performed poorly. After talking about it with Nathan, he suggested that the reason it was doing so badly could be that feeding in raw token numbers implied a kind of ordinality that doesn’t exist. The usual solution would be one-hot vectors, but the vocabulary was large enough that they took a very long time to work with. Eventually I came across PyTorch’s embedding layer, which worked amazingly well; so well, in fact, that when I tested a trained model with pyribs, it actually generated text! Even without using an embedding as a seed, it generated somewhat plausible sentences (admittedly, still within the bounds of what the generator could produce), which was huge progress considering how terrible the other generated text could be.
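To make the change concrete, here’s a minimal sketch of the idea, with illustrative sizes rather than the real ones: an embedding layer consumes raw token indices directly, so there’s no need to materialize enormous one-hot tensors, and the index values themselves carry no spurious ordering.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10000, 128  # illustrative sizes, not the real ones

# One-hot input would require a (batch, seq_len, vocab_size) tensor;
# an embedding layer instead maps each token index to a learned dense vector.
embedding = nn.Embedding(vocab_size, embed_dim)

tokens = torch.randint(0, vocab_size, (32, 20))  # (batch, seq_len) of indices
dense = embedding(tokens)                        # (batch, seq_len, embed_dim)
```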


Week 7

As expected, I started out last week by trying to speed up and optimize the process of generating text from larger batches of random vectors. That went very successfully, apart from having to run the objective function separately for each sentence. The batching did speed up the process somewhat, allowing more iterations, but despite my best hopes, it didn’t help the quality at all. Eventually, I decided to compare how the objective function (which gives a number that indicates how “good” a sentence is) rated a terrible generated sentence (generally the generator would pick a common word, such as “and” or “with”, and repeat it the entire time) versus a preexisting sentence. Unfortunately, it rated the bad sentences extremely well; if I had to guess, it gauges how probable each word is and assigns very common words a low loss, which reads as a good score. This helped explain why all of the sentences were terrible, whether the seed was random noise or an embedded sentence: the objective function itself was flawed.
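A toy illustration of that failure mode, using made-up unigram probabilities (the real objective function is more involved, but the incentive is the same): if the score is something like an average per-word negative log-probability, a sentence of nothing but high-frequency words gets a very low, i.e. “good”, loss.

```python
import math

# Made-up unigram probabilities, purely for illustration.
unigram_p = {"and": 0.03, "with": 0.02, "food": 0.001, "was": 0.01, "great": 0.002}

def avg_neg_log_prob(sentence: str) -> float:
    """Average per-word negative log-probability: lower looks 'better'."""
    words = sentence.split()
    return sum(-math.log(unigram_p.get(w, 1e-6)) for w in words) / len(words)

print(avg_neg_log_prob("and and and and and"))  # ~3.5: the degenerate sentence wins
print(avg_neg_log_prob("food was great"))       # ~5.9: the plausible sentence loses
```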


Week 6

Coming off of last week, I was very excited to find out which version of the model was the “best”. Of the models I’d worked with, the best performer was the one from the 61st epoch, but I had a suspicion that it was overfitted, which was confirmed when we looked at the error on the validation set and found that the best-performing models were around the 40-45 epoch mark. The model we selected to continue with was the one trained for 40 epochs, as it was the earliest model with error comparable to the other four lowest-error models.
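This kind of selection is easy to script. Here’s a minimal sketch of scoring saved checkpoints on the validation set; the `load_checkpoint` helper and the loop details are hypothetical stand-ins for however the checkpoints were actually saved.

```python
import torch

def validation_loss(model, val_loader, criterion, device="cpu"):
    """Average the criterion over the validation set without updating weights."""
    model.eval()
    total, batches = 0.0, 0
    with torch.no_grad():
        for inputs, targets in val_loader:
            outputs = model(inputs.to(device))
            total += criterion(outputs, targets.to(device)).item()
            batches += 1
    return total / batches

# Hypothetical usage: score every saved epoch, then take the earliest epoch
# whose error is comparable to the minimum rather than the raw argmin.
# losses = {epoch: validation_loss(load_checkpoint(epoch), val_loader, criterion)
#           for epoch in range(1, 62)}
```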


Week 5

After last week, I was really excited to train my model on the lab’s GPU! Unfortunately, things did not go according to plan. I accessed the computer fine with AnyDesk and downloaded my files to it. The disadvantage of using the lab computer, though, was that I had to be careful not to update, downgrade, or install any packages. I was able to get around that with a virtual environment, but when I tried to run my code inside it, it couldn’t access CUDA. Outside the virtual environment, the system installation could access CUDA, but there I couldn’t install the packages I would need later, and I couldn’t load any of my data because the PyTorch version was too outdated. Eventually, the solution we came up with was to train the model on the personal computer of Nathan (my PhD student mentor), which has a GPU. After some minor errors, and after fixing the code to run marginally faster (paradoxically, by taking the data off the GPU), it started training.
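In case it’s useful to anyone hitting the same wall, the quickest diagnostic is to check what PyTorch itself reports in each environment. A CPU-only PyTorch build inside the virtual environment would be one plausible culprit, though I never pinned it down:

```python
import torch

# Run this both inside and outside the virtual environment and compare.
print(torch.__version__)          # a '+cpu' suffix would mean a CPU-only build
print(torch.cuda.is_available())  # False in the broken environment
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

# The usual pattern for code that has to run in both places:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```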


Week 4

This week has been a mixed bag with the LSTM. Initially, I loosely followed a tutorial for a seq2seq model that used a GRU; ultimately that may have made things more complicated, since the embeddings I calculated act similarly to the encoder portion of a traditional seq2seq model, and I had to figure out how to combine the structure of the tutorial with what I wanted to use the model for. Once that worked, I tried to get it to run more efficiently by adding batching. What I’ve come to realize while working on this project is that the most frustrating part is making sure all the dimensions line up. I want to say I’m getting better at it, though, and after a few hours of trying to understand the documentation, I succeeded in getting it to run on batches. Finally, I converted the GRU to an LSTM.
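For the curious, here’s a minimal sketch (with made-up sizes) of the dimension bookkeeping, and of how small the GRU-to-LSTM swap is at the module level: the LSTM just adds a cell state alongside the hidden state.

```python
import torch
import torch.nn as nn

batch, seq_len, in_dim, hid_dim = 32, 20, 128, 256  # illustrative sizes
x = torch.randn(batch, seq_len, in_dim)  # batch_first layout: (batch, seq, feature)

gru = nn.GRU(in_dim, hid_dim, batch_first=True)
lstm = nn.LSTM(in_dim, hid_dim, batch_first=True)

# A GRU returns (output, h_n); the LSTM swap mostly means handling the
# extra cell state c_n packed into an (h_n, c_n) tuple.
out_g, h_n = gru(x)           # out_g: (batch, seq_len, hid_dim)
out_l, (h_n, c_n) = lstm(x)   # h_n, c_n: (num_layers, batch, hid_dim)
```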


Week 3

This week I finally started coding for my project! Where I left off, I was still looking for datasets to use for sentiment/style transfer. Ultimately, it came down to a choice between a dataset of emails with varying degrees of politeness and a dataset of Yelp reviews (positive versus negative). Eventually we agreed to use the Yelp dataset: both options were about equally accessible, but the Yelp reviews had clearer delineations between positive and negative. (Also, the emails were incredibly dry, and I’d be looking at them for the next eight weeks or so.)


Week 2

After spending a lot of time trying to code fully connected networks last week, I started working on LSTMs and RNNs this week. Given that LSTMs are what I’ll be working with in my project, it’s important that I understand the fundamentals of how they work, even if I’m a little frustrated with how long it’s taken me to learn. I think it’s been worth it, since I’m pretty satisfied with my understanding of them now; in fact, I feel more confident that I can understand and talk about backpropagation, which I’ve always struggled with in particular.


Week 1

My first week has been a lot of trying to get settled into the lab; the first task I needed to do was complete the lab onboarding. I had to take two online certifications, one for human subjects research and one for HIPAA training. Even though my project won’t involve human subjects, it was interesting to learn about the laws and rules governing these areas.
