Bug 1146955 - Make GMP plugin crash reporting UI work in e10s
Let me copy the salient bits from my notes for bug 1110887.

GMP

Both of the GMP cases rely on the gmp-plugin-crash observer notification firing in the same process that content is running in, which is not going to work in e10s land, but I don't need to worry about that. It should still work properly for the non-e10s case, assuming the PluginCrashed event handler doesn't change behaviour too much.

GMP + WebRTC case


Here's the bug for GMP e10s support. This has landed!

I think I might need the "runId" thing for GMPs as well, but this might be far simpler because PGMPService.ipdl's LoadGMP will return a processID - I can probably pass a run ID back too.

Node ID isn't what we want. According to this, node ID distinguishes as follows:

"... site A.com running EME in an <iframe> from CDN.com will be considered a different node id than B.com running EME in an <iframe> from CDN.com."

GMPProcessParent might need to be able to get at our static run ID counter. Is there some common place where it makes sense for GMP and PluginModuleChromeParent to get the static counter? Or should they not use the same counter?

So, facts:

  1. GMP already uses the PluginCrashed event, but sets the gmpPlugin property to true on it to differentiate, and these events are dispatched on the document itself (see the snippet after this list).
  2. There doesn't seem to be a way of iterating GMPs for some document like there is for NPAPI plugins. I guess that makes sense - GMPs are deep in the plumbing and aren't actually represented in the DOM anywhere.
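To make fact 1 concrete, here's a minimal sketch of listening for one of these in content. The event name and the gmpPlugin flag come from the facts above; the pluginName property is my assumption.

// Sketch: GMP crashes surface as PluginCrashed events on the document,
// flagged with gmpPlugin = true, rather than on a plugin element.
document.addEventListener("PluginCrashed", function(event) {
  if (event.gmpPlugin) {
    dump("A GMP crashed: " + event.pluginName + "\n");
  }
}, true);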

Suppose each GMP process has a gmpRunId. On crash, the parent gets the observer notification, right?

If so, send message down to the child saying, "Hey yo, we heard a crash with such and such a GMP run ID".

Child hears it, and checks to see if it's ever seen a PluginCrashed event for a gmpPlugin with that run ID. If so, sends a message up saying so. Otherwise, adds the run ID to a set of crashed GMP run IDs until pagehide.

If a GMP PluginCrashed event is seen, check to see if there is a run ID being stored that matches, and send up a message immediately saying, "Hey, I was using that GMP when it crashed". Otherwise, toss the run ID into the same set so that we can send a message back up immediately when we get the message from the parent.
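Here's that matching logic as a rough sketch. The set names and the notifyParent helper are made up; the point is that the two signals can arrive in either order, and whichever arrives second completes the match.

// Content-side sketch: match the parent's crash message against
// PluginCrashed events for gmpPlugins, in whichever order they arrive.
let crashedGMPRunIDs = new Set();   // run IDs heard from the parent
let seenGMPCrashEvents = new Set(); // run IDs seen in PluginCrashed events

function onParentCrashMessage(runID) {
  if (seenGMPCrashEvents.has(runID)) {
    notifyParent(runID);          // we were using that GMP when it crashed
  } else {
    crashedGMPRunIDs.add(runID);  // hold onto it until pagehide
  }
}

function onGMPPluginCrashedEvent(runID) {
  if (crashedGMPRunIDs.has(runID)) {
    notifyParent(runID);
  } else {
    seenGMPCrashEvents.add(runID);
  }
}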

Whoa whoa whoa whoa whoa. It looks like the PluginCrashed event is fired from dom/media/PeerConnection.js. I need to understand this thing more.

Hm… this is going to break PeerConnection.js's crash handling model. Like, as far as I can tell, it's not going to get the observer notification in the content process that it currently uses to bubble up the PluginCrashed events.

Ok, so suppose the parent process hears this observer notification (which it will), and then, like the NPAPI case, it messages all browsers that the GMP with run ID X crashed.
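As a sketch of that parent-side rebroadcast - the message name Plugins:GMPCrashed is made up, and I'm treating the leading plugin ID in the observer data as the run ID, which these notes later conclude is effectively the same thing:

Components.utils.import("resource://gre/modules/Services.jsm");

// Parent-process sketch: when the crash notification fires, forward it to
// every content process via the global message manager.
Services.obs.addObserver(function(subject, topic, data) {
  // data is "<pluginId> <pluginName> <dumpId>" (see GMPNotifyObservers below)
  let runID = data.slice(0, data.indexOf(" "));
  let gmm = Components.classes["@mozilla.org/globalmessagemanager;1"]
                      .getService(Components.interfaces.nsIMessageBroadcaster);
  gmm.broadcastAsyncMessage("Plugins:GMPCrashed", { runID: runID });
}, "gmp-plugin-crash", false);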

Then somehow I need to be able to identify which browsers are making use of a GMP to know whether or not to display the notification bar. Maybe have PeerConnection.js add a message listener to the child process message manager.

It hears the message, then for each PeerConnection implementation that it's tracking, it calls pluginCrash, passing down the GMP run ID. Wait… how is it supposed to match run IDs to running PeerConnection instances? Fuck.

Maybe expose the run ID on PeerConnectionImpl.webidl? Then when we hear the message, PeerConnection.js iterates the list of all active peer connections, queries the PeerConnectionImpls for matching run IDs, and if they match, calls PluginCrash on the match. That causes the PluginCrashed event to be dispatched on the active document in the window that owns the PeerConnection.

That's when PluginContent.jsm picks it up. It notices that it's a gmpPlugin and shunts it accordingly - it will send a message directly to the parent to show the notification bar.
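Pulling the last few paragraphs together, the child side could look something like this. The message name matches the parent-side sketch above; the allActivePeerConnections iterator and the pluginId attribute on PeerConnectionImpl are assumptions, not existing API.

// Child-process sketch, PeerConnection.js-ish: find the peer connections
// whose GMP run ID matches the crashed one and have each fire PluginCrashed.
let cpmm = Components.classes["@mozilla.org/childprocessmessagemanager;1"]
                     .getService(Components.interfaces.nsIMessageListenerManager);
cpmm.addMessageListener("Plugins:GMPCrashed", function(message) {
  let runID = message.data.runID;
  for (let pc of allActivePeerConnections()) { // hypothetical iterator
    if (pc._impl.pluginId == runID) {          // run ID exposed via webidl
      // pluginCrash dispatches PluginCrashed on the document of the window
      // that owns this PeerConnection; PluginContent.jsm takes it from there.
      pc._impl.pluginCrash(runID);
    }
  }
});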

4:02 PM <mconley> billm: suppose I wanted GMPs and NPAPI plugins to use the same counter for their process run IDs. Is there a good place to put that common counter? Or do you think it makes more sense to keep GMPs absolutely separate, having their own run ID counter?
4:03 PM <&billm> mconley: maybe we could put it in GeckoChildProcessHost?
4:04 PM <&billm> mconley: that's the only shared piece I can think of

With bug 1057908 landed, how do the PluginCrashed events work? Do they even? I think they probably do… right?

STR

Go here: https://mozilla.github.io/webrtc-landing/pc_test.html, start up a camera, and crash the plugin-container process that gets started up.

So I think this will require me to dig a bit into GMP / EME, since it looks like all of the crash handling code assumes a single process. :(

I need a strategy to break this stuff up.

So blassey wrote up a patch that messages the content process when the parent notices a crash… let's examine it.

So blassey's patch sends a message down to the content process, which does the job of bubbling up PluginCrashed events.

Is that even necessary? What's necessary here?

So the message comes down, the child receives it in PeerConnection.js, PeerConnection finds each GMP per window and calls PeerConnectionImpl::PluginCrash on each, passing in the plugin ID. Each plugin is responsible for making sure that the plugin ID matches its own, and then dispatches the PluginCrashed event on its window.

The run ID thing is the first bit. PeerConnection and PeerConnectionImpl should be using that to map crashes to plugins, not the plugin ID.

Ah… billm writes:

"I'm not familiar with PeerConnection.js at all. But if this is the same plugin ID as appears in the GMP code, then it is already a run ID. That is, we create a new one for each GMPParent instance. So I think it should be fine to use. "

Interesting. Let's look at where the plugin ID comes from.

Here's some code from PeerConnection.js:

} else if (topic == "gmp-plugin-crash") {
  // a plugin crashed; if it's associated with any of our PCs, fire an
  // event to the DOM window
  let sep = data.indexOf(' ');
  let pluginId = data.slice(0, sep);
  let rest = data.slice(sep + 1);
  // This presumes no spaces in the name!
  sep = rest.indexOf(' ');
  let name = rest.slice(0, sep);
  let crashId = rest.slice(sep + 1);
  for (let winId in this._list) {
    broadcastPluginCrash(this._list, winId, pluginId, name, crashId);
  }
}

PeerConnection uses this to fire off the PluginCrashed event. The pluginId comes from the observer notification. And who fires that? GMPParent.cpp, it seems...

static void
GMPNotifyObservers(const nsACString& aPluginId, const nsACString& aPluginName,
                   const nsAString& aPluginDumpId)
{
  nsCOMPtr<nsIObserverService> obs = mozilla::services::GetObserverService();
  if (obs) {
    nsString id;
    AppendUTF8toUTF16(aPluginId, id);
    id.Append(NS_LITERAL_STRING(" "));
    AppendUTF8toUTF16(aPluginName, id);
    id.Append(NS_LITERAL_STRING(" "));
    id.Append(aPluginDumpId);
    obs->NotifyObservers(nullptr, "gmp-plugin-crash", id.Data());
  }

  nsRefPtr<gmp::GeckoMediaPluginService> service =
    gmp::GeckoMediaPluginService::GetGeckoMediaPluginService();
  if (service) {
    service->RunPluginCrashCallbacks(aPluginId, aPluginName, aPluginDumpId);
  }
}

So GMPNotifyObservers gets it, called from:

void
GMPParent::ActorDestroy(ActorDestroyReason aWhy)
{
  LOGD("%s: (%d)", __FUNCTION__, (int)aWhy);
#ifdef MOZ_CRASHREPORTER
  if (AbnormalShutdown == aWhy) {
    Telemetry::Accumulate(Telemetry::SUBPROCESS_ABNORMAL_ABORT,
                          NS_LITERAL_CSTRING("gmplugin"), 1);
    nsString dumpID;
    GetCrashID(dumpID);

    // NotifyObservers is mainthread-only
    NS_DispatchToMainThread(WrapRunnableNM(&GMPNotifyObservers,
                                           mPluginId, mDisplayName, dumpID),
                            NS_DISPATCH_NORMAL);
  }

So each GMPParent gets an mPluginId. Where is that assigned?

Ah, in the constructor:

GMPParent::GMPParent()
  : mState(GMPStateNotLoaded)
  , mProcess(nullptr)
  , mDeleteProcessOnlyOnUnload(false)
  , mAbnormalShutdownInProgress(false)
  , mIsBlockingDeletion(false)
  , mGMPContentChildCount(0)
  , mAsyncShutdownRequired(false)
  , mAsyncShutdownInProgress(false)
#ifdef PR_LOGGING
  , mChildPid(0)
#endif
{
  LOGD("GMPParent ctor");
  // Use the parent address to identify it.
  // We could use any unique-to-the-parent value.
  mPluginId.AppendInt(reinterpret_cast<uint64_t>(this));
}

So I guess that is sufficient as a run ID. Ok, good.

I can work with this.

  1. Have PluginModuleChromeParent and GMPParent use GeckoChildProcessHost to host the seed for run IDs.
  2. Change PeerConnectionMedia to use a 32-bit pluginID instead of a 64-bit one.
  3. Modify blassey's patch to only send the pluginID and name down to the child.
  4. Remove the plugin dump ID from PeerConnectionImpl::PluginCrash.
  5. Have PeerConnectionImpl set the run ID on the event.
  6. "PluginContent:ShowGMPCrashedNotification" should only pass the run ID up to browser-plugins.js (see the sketch after this list).

Now for testing.

13:19 (jesup) mconley|livehacking: https://mozilla.github.io/webrtc-landing/pc_test.html and check "Require H.264"
13:20 (jesup) that will use GMP for the OpenH264 plugin
13:20 (felipe) mconley|livehacking: actually when I worked on that bug the machinery was not fully set up yet, so I don't know that myself either. We implemented the bug assuming the platform would send the expected notification on a crash
13:20 (mconley|livehacking) interesting
13:20 (jesup) there's also a test of the fake GMP plugin in dom/media/tests/mochitest

I want to find a better, cross-app (so non-browser-specific) way of firing the PluginCrashed event in the child process.

Ok, got this kinda working. Let's talk to jesup to see if it's sensical.

Try push.

Hrm. Crash when I crash the plugin-container with single process Firefox… why?

* thread #80: tid = 0x6a53c2, 0x0000000105870df1 XUL`webrtc::RtpPacketizerH264::PacketizeFuA(this=0x000000011bd1cf20, fragment_offset=4, fragment_length=1418) + 193 at rtp_format_h264.cc:180 , name = 'GMPEncoded', stop reason = EXC_BAD_ACCESS (code=1, address=0x116ded004)
* frame #0: 0x0000000105870df1 XUL`webrtc::RtpPacketizerH264::PacketizeFuA(this=0x000000011bd1cf20, fragment_offset=4, fragment_length=1418) + 193 at rtp_format_h264.cc:180
frame #1: 0x0000000105870cf2 XUL`webrtc::RtpPacketizerH264::GeneratePackets(this=0x000000011bd1cf20) + 114 at rtp_format_h264.cc:158
frame #2: 0x0000000105870c6d XUL`webrtc::RtpPacketizerH264::SetPayloadData(this=0x000000011bd1cf20, payload_data=0x0000000116ded000, payload_size=1419, fragmentation=0x0000000145ee88b8) + 221 at rtp_format_h264.cc:150
frame #3: 0x00000001058ab672 XUL`webrtc::RTPSenderVideo::Send(this=0x0000000152b3c800, videoType=kRtpVideoH264, frameType=kVideoFrameDelta, payloadType='~', captureTimeStamp=2025040744, capture_time_ms=0, payloadData=0x0000000116ded000, payloadSize=1419, fragmentation=0x0000000145ee88b8, rtpTypeHdr=0x000000000000000c) + 322 at rtp_sender_video.cc:343
frame #4: 0x00000001058a5e8e XUL`webrtc::RTPSenderVideo::SendVideo(this=0x0000000152b3c800, videoType=kRtpVideoH264, frameType=kVideoFrameDelta, payloadType='~', captureTimeStamp=2025040744, capture_time_ms=0, payloadData=0x0000000116ded000, payloadSize=1419, fragmentation=0x0000000145ee88b8, codecInfo=0x0000000000000000, rtpTypeHdr=0x000000000000000c) + 286 at rtp_sender_video.cc:295
frame #5: 0x000000010589f9bd XUL`webrtc::RTPSender::SendOutgoingData(this=0x000000014f7fe008, frame_type=kVideoFrameDelta, payload_type='~', capture_timestamp=2025040744, capture_time_ms=0, payload_data=0x0000000116ded000, payload_size=1419, fragmentation=0x0000000145ee88b8, codec_info=0x0000000000000000, rtp_type_hdr=0x000000000000000c) + 1373 at rtp_sender.cc:499
frame #6: 0x000000010589f033 XUL`webrtc::ModuleRtpRtcpImpl::SendOutgoingData(this=0x000000014f7fe000, frame_type=kVideoFrameDelta, payload_type='~', time_stamp=2025040744, capture_time_ms=0, payload_data=0x0000000116ded000, payload_size=1419, fragmentation=0x0000000145ee88b8, rtp_video_hdr=0x0000000000000000) + 387 at rtp_rtcp_impl.cc:519
frame #7: 0x000000010589f3fe XUL`webrtc::ModuleRtpRtcpImpl::SendOutgoingData(this=0x000000014f4d0000, frame_type=kVideoFrameDelta, payload_type='~', time_stamp=2025040744, capture_time_ms=0, payload_data=0x0000000116ded000, payload_size=1419, fragmentation=0x0000000145ee88b8, rtp_video_hdr=0x0000000000000000) + 1358 at rtp_rtcp_impl.cc:567
frame #8: 0x0000000105a02fd7 XUL`webrtc::ViEEncoder::SendData(this=0x000000013fd121a0, frame_type=kVideoFrameDelta, payload_type='~', time_stamp=2025040744, capture_time_ms=0, payload_data=0x0000000116ded000, payload_size=1419, fragmentation_header=0x0000000145ee88b8, rtp_video_hdr=0x0000000000000000) + 151 at vie_encoder.cc:780
frame #9: 0x0000000105a0305a XUL`_ZThn8_N6webrtc10ViEEncoder8SendDataENS_9FrameTypeEhjxPKhjRKNS_22RTPFragmentationHeaderEPKNS_14RTPVideoHeaderE(this=0x000000013fd121a8, frame_type=kVideoFrameDelta, payload_type='~', time_stamp=2025040744, capture_time_ms=0, payload_data=0x0000000116ded000, payload_size=1419, fragmentation_header=0x0000000145ee88b8, rtp_video_hdr=0x0000000000000000) + 122 at Unified_cpp_webrtc_video_engine0.cpp:788
frame #10: 0x000000010580ffef XUL`webrtc::AudioEncoderPcm::AudioEncoderPcm(this=0x000000013fc27a10, config=0x0000000145ee9000) + 28575 at audio_encoder_pcm.cc:37
frame #11: 0x000000010286de90 XUL`mozilla::WebrtcGmpVideoEncoder::Encoded(this=0x000000013fc89bc0, aEncodedFrame=0x0000000129abca90, aCodecSpecificInfo=0x000000011c7955d8) + 1584 at WebrtcGmpVideoCodec.cpp:568
frame #12: 0x000000010286e247 XUL`_ZThn16_N7mozilla21WebrtcGmpVideoEncoder7EncodedEP20GMPVideoEncodedFrameRK8nsTArrayIhE(this=0x000000013fc89bd0, aEncodedFrame=0x0000000129abca90, aCodecSpecificInfo=0x000000011c7955d8) + 55 at Unified_cpp_webrtc_signaling0.cpp:572
frame #13: 0x0000000104638396 XUL`mozilla::gmp::EncodedCallback(aCallback=0x000000013fc89bd0, aEncodedFrame=0x0000000129abca90, aCodecSpecificInfo=0x000000011c7955d8, aThread=0x0000000145ee8a40) + 70 at GMPVideoEncoderParent.cpp:291
frame #14: 0x000000010463d8dc XUL`mozilla::runnable_args_nm_4<void (this=0x000000014f2d4a80)(GMPVideoEncoderCallbackProxy*, GMPVideoEncodedFrame*, nsTArray<unsigned char>*, nsCOMPtr<nsIThread>), GMPVideoEncoderCallbackProxy*, mozilla::gmp::GMPVideoEncodedFrameImpl*, nsTArray<unsigned char>*, nsCOMPtr<nsIThread> >::Run() + 108 at runnable_utils_generated.h:325
frame #15: 0x0000000101723a06 XUL`nsThread::ProcessNextEvent(this=0x0000000127019f10, aMayWait=true, aResult=0x0000000145ee8c6e) + 2086 at nsThread.cpp:868
frame #16: 0x000000010177e8d8 XUL`NS_ProcessNextEvent(aThread=0x0000000127019f10, aMayWait=true) + 168 at nsThreadUtils.cpp:265
frame #17: 0x0000000101dbfb47 XUL`mozilla::ipc::MessagePumpForNonMainThreads::Run(this=0x00000001486d5b00, aDelegate=0x000000014d5e0e50) + 951 at MessagePump.cpp:355
frame #18: 0x0000000101d32225 XUL`MessageLoop::RunInternal(this=0x000000014d5e0e50) + 117 at message_loop.cc:233
frame #19: 0x0000000101d32135 XUL`MessageLoop::RunHandler(this=0x000000014d5e0e50) + 21 at message_loop.cc:226
frame #20: 0x0000000101d320dd XUL`MessageLoop::Run(this=0x000000014d5e0e50) + 45 at message_loop.cc:200
frame #21: 0x0000000101721ea6 XUL`nsThread::ThreadFunc(aArg=0x0000000127019f10) + 358 at nsThread.cpp:364
frame #22: 0x0000000101379dcf libnss3.dylib`_pt_root(arg=0x000000013fc27a10) + 463 at ptthread.c:212
frame #23: 0x00007fff8c2a9772 libsystem_c.dylib`_pthread_start + 327

Ah… heh, ok, that's not my problem. This happens without my patch as well.

I think I'm good to land.

Landed and merged!

  1. Unify the browser-plugins.js notification bar stuff.
  2. File a bug to signal the content process to fire PluginCrashed somewhere more sensical - filed bug 1161587.
  3. File a bug for the crash - filed bug 1161589.
  4. File a bug for GMP crash reporting (e10s and non-e10s) to work on Windows - filed bug 1161814.