Before I get started here's some formalities for the course:
My work on this release can be found here:
https://github.com/cadecairos/mozilla-central/tree/bugs/686137
And now, the story:
As I described in my previous post, I've been hacking on Firefox media code for my OSD700 class. This post is going to detail more about the bug I was working on and how things went with it.
Bug 686137 was an audio API related issue where the MozAudioAvailable event on an audio element would report an incorrect frameBufferLength value. This happened after it had been manually changed through the mozFrameBufferLength attribute on the <audio> element. The first thing I did was create a test case using code that David Humphrey included in his bug report. Using this test I narrowed down the list of possible bugs that caused this regression and eventually discovered the bug that created it. From there I zeroed in on the content/media folder within mozilla-central, where all the media code lives. Using the Mozilla Cross-Reference website, I searched for a couple of keywords: mozFrameBufferLength and MozAudioAvailable.
This search led me to more files than I'd have liked to have seen. I spent the majority of my time last week reading through these files trying to familiarize myself with what was going on in them, but this ultimately got me no closer to a fix. I spoke with Jon Buckley, who told me that in his experience, using GDB was the best way to trace the execution of the bug down. I hadn't used GDB since OOP344 with Fardad, where I and two other team members created a simple text editor in C/C++ called Bite. After a simple search, I found a wiki detailing techniques for debugging Firefox with GDB.
The obvious thing for me to do here was find all the places in the code where the frameBufferLength was being handled in some manner. My plan was to set break points at parts where I thought it was being set and do some checking of the data from within GDB. For your information, the test case I was using used a default buffer length of 2048 Bytes, which was set in JavaScript to 4096 Bytes. When debugging against this test case, I found it very curious that at all the break points I had set, the frame buffer length was always 2048. There was no trace of the 4096 value in the test case.
What this told me is that I needed to follow the path that the value 4096 was taking. Using MXR I located the setter method for the mozFrameBufferLength attribute and using GDB, set a break point within the function. Stepping through the code, it looked as if everything was being set properly. At this point in my investigation, I found myself completely stumped. I spent many hours stepping through the code trying to find the difference in the initial setting of the buffer length on the
Then I noticed it. The key to the puzzle was a class called nsBuiltinDecoder. This class inherited from a base class named nsMediaDecoder. Each one contains a RequestFrameBufferLength() method that accepts a 32 bit integer for the new buffer size and handle it in slightly different ways. Why is this important? Well, When the nsBuiltinDecoder::RequestFrameBufferLength(). On the other hand, when setting the buffer length from the element, it passes the argument into nsMediaDecoder::RequestFrameBufferLength! I opened the header file for nsMediaDecoder and found the function definition, and lo and behold, there was no "virtual" keyword! By this point I'd spent over 6 hours that day tracing this thing down, and I was quite fatigued. I made the addition and using a tool called "smartmake" suggested to me by a Mozillian in #introduction on Moznet, began an incremental build. I was 100% sure it would work. I opened up Nightly and navigated to my test page and.... nothing changed. What. The. Fudge.
It was a huge let down. But I wasn't let down for long. After posting about where I was, a developer named Yuri posted a comment, indicating that the virtual keyword was the problem. After discussions with him and Dave Humphrey on IRC, it became clear that I should perform a top-level build. I'm not gonna lie, I was pretty excited. After my top-level build completed, the test case finally passed! Vindication at last.
Now that I had a fix I needed to create a patch for it. Matthew Schranz posted a guide on developing for Firefox using GIT (which is what I was doing) and he had included a command that would create a properly formatted patch. Using that, I created this attachment on the bug. I then set out to write a test for it so that it would never break unknowingly again. Using existing tests as a base, and after reading several guides on writing tests for Firefox, I produced this attachment to the bug.
From there, my navigation through the review process was a little rocky. I found that upon setting an attachment for review, I should set a person responsible for it (something I did not do). Luckily it got picked up by Matthew Gregan AKA kinetik who r+ the first patch. He also gave me some pointers for the future. In my patch I didn't follow the regular commit message conventions, something I won't do again. The second patch needed some work done before it could be approved. They were just minor issues relating to comments I'd forgotten to remove and unneeded listeners I'd used. I also took his advice and added a second assertion for a case I had not considered. After wrestling with git for an hour trying to get the patch right I posted a new attachment with the fixes, and a commit message that met the expectation I mentioned above.
This new attachment got a positive review from Kinetik. The next day, after discussion about the bug I was advised to set the bug to have the checkin-needed flag. I also uploaded a new attachment with a conforming commit message, obsoleting the first attachment I posted. Later that evening the patches were merged into mozilla-inbound. I learned that this is a sort of staging area, where new patches have the tests run on multiple platforms to make sure that they don't break the entire program. While driving to work the next morning I got an email notification that my patches were now merged into mozilla-central.
Hacking on this bug taught me a lot. I got a great introduction to some of the media code and its inner workings. I got a nice refresh on using GDB. I now have a deeper understanding of building Firefox from the source (when in doubt, do a top-level build). I learned more about the firefox review process (what NOT to do). And I personally learned that even the simplest of fixes can take a ridiculous amount of time to track down. Especially when you're not familiar with the code like I was. Coming out of this thing I feel I'm in a better position to take on more bugs.
My plan for the next two weeks is to get on a couple other bugs I've been looking at ( bug 715323 and bug 711742 ). I'm also going to find more bugs to work on related to the media code.