Bugfix: Fix manifest update event being added many times #1779
KunXi-Fox wants to merge 1 commit into canalplus:dev from
Conversation
);

// reload if the Representation disappears from the Manifest
manifest.removeEventListener("manifestUpdate");
This would completely remove all manifestUpdate listeners currently set, which we probably don't want.
The listener below should normally clean up automatically when we don't need it anymore (that's what its fnCancelSignal argument is for - it's a CancellationSignal, which serves the same role as an AbortSignal: cancelling stuff).
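For illustration, here is a minimal self-contained sketch of that registration pattern, with made-up MiniCancelSignal / MiniManifest types standing in for the real RxPlayer classes: a listener registered together with a cancellation signal is removed automatically once that signal triggers, much like an AbortSignal passed to the DOM's addEventListener.

// Minimal sketch only (not the actual RxPlayer API).
type Listener = () => void;

class MiniCancelSignal {
  private cbs: Listener[] = [];
  public register(cb: Listener): void {
    this.cbs.push(cb);
  }
  public trigger(): void {
    this.cbs.forEach((cb) => cb());
    this.cbs = [];
  }
}

class MiniManifest {
  private listeners: Listener[] = [];
  public addEventListener(fn: Listener, cancelSignal: MiniCancelSignal): void {
    this.listeners.push(fn);
    // Clean-up is tied to the given signal: when it triggers, only this
    // specific listener is removed, not every listener on the emitter.
    cancelSignal.register(() => {
      this.listeners = this.listeners.filter((l) => l !== fn);
    });
  }
  public triggerUpdate(): void {
    this.listeners.forEach((l) => l());
  }
}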
fnCancelSignal seems like it will clear all of these event listeners; I think what we really need is to clear only the one we don't need anymore.
I've read through all that code again. If I understand correctly, what we actually want to do is: when manifestUpdate is triggered, check whether the adaptation we are linked to has been removed, and if so, call the waitingMediaSourceReload method.
Am I right?
> I've read through all that code again. If I understand correctly, what we actually want to do is: when manifestUpdate is triggered, check whether the adaptation we are linked to has been removed, and if so, call the waitingMediaSourceReload method.
That's right, that code is some resilience logic to handle cases where the manifest totally changes server-side (e.g. due to a server issue, a TV channel being "rebooted" etc.) by detecting that the currently-played quality isn't there anymore and scheduling a reload.
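As a rough sketch of what that resilience check amounts to (simplified, assumed types only - hasRepresentation and scheduleReload are placeholders for whatever the real code does, the latter standing in for the waitingMediaSourceReload call):

// Not the real RxPlayer signatures, just an illustration of the idea.
interface SketchManifest {
  addEventListener(
    evt: "manifestUpdate",
    fn: () => void,
    cancelSignal: { register(cb: () => void): void },
  ): void;
  // Assumed helper for the sketch only.
  hasRepresentation(representationId: string): boolean;
}

function watchRepresentationRemoval(
  manifest: SketchManifest,
  playedRepresentationId: string,
  cancelSignal: { register(cb: () => void): void },
  scheduleReload: () => void, // stands in for `waitingMediaSourceReload`
): void {
  manifest.addEventListener(
    "manifestUpdate",
    () => {
      if (!manifest.hasRepresentation(playedRepresentationId)) {
        // The quality we were playing disappeared from the Manifest
        // (e.g. the channel was "rebooted" server-side): trigger a reload.
        scheduleReload();
      }
    },
    cancelSignal,
  );
}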
> fnCancelSignal seems like it will clear all of these event listeners; I think what we really need is to clear only the one we don't need anymore.
fnCancelSignal will only clear the event listeners to which it is linked, and only when it triggers. If you saw a leak, maybe that signal is not the right one here?
fnCancelSignal seems to originate from here:
/**
 * When triggered, cancel all `RepresentationStream`s currently created.
 * Set to `undefined` initially.
 */
let cancelCurrentStreams: TaskCanceller | undefined;

// Each time the list of wanted Representations changes, we restart the logic
content.representations.onUpdate(
  (val) => {
    if (cancelCurrentStreams !== undefined) {
      cancelCurrentStreams.cancel("locked representations changed");
    }
Considering that it is only canceled when "locked representations changed" (e.g. as a consequence of a lockVideoRepresentations call), it might not be the right signal here...
I'm wondering if the intent wasn't to call createRepresentationStream with repStreamTerminatingCanceller.signal instead...
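To illustrate the difference between those two lifetimes (purely hypothetical SketchCanceller type below, not the real TaskCanceller): a clean-up registered on cancelCurrentStreams.signal would only ever run when locked Representations change, while one registered on a per-stream canceller such as repStreamTerminatingCanceller would run as soon as that particular RepresentationStream is dropped, e.g. on a quality switch.

// Hypothetical sketch only; the real RxPlayer classes differ.
class SketchCanceller {
  private onCancel: Array<() => void> = [];
  public signal = {
    register: (cb: () => void): void => {
      this.onCancel.push(cb);
    },
  };
  public cancel(): void {
    this.onCancel.forEach((cb) => cb());
    this.onCancel = [];
  }
}

const cancelCurrentStreams = new SketchCanceller(); // long-lived
const repStreamTerminatingCanceller = new SketchCanceller(); // per-stream

// Binding the clean-up to the per-stream signal means it runs on every
// quality switch, not only when locked Representations change.
repStreamTerminatingCanceller.signal.register(() => {
  console.log("manifestUpdate listener removed with its RepresentationStream");
});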
Looking at it!
The #1778 and #1779 issues / PR noticed a leak that seems to arise when multiple quality switches happen. I'm still unsure of the severity (what this fixes looks very minimal, and we have not noticed it yet in production at Canal+, including on low-memory devices, for what seems to be a change that has been here for 2 years - but external contributors did notice a leak, so maybe some set of conditions amplifies the issue). Still, looking closely at the code in question, there does seem to be an improper event-listener clean-up on a quality switch.

The issue is rooted in the complexity behind how quality switches happen:

- Depending on heuristics, we may either perform an "urgent" quality switch (where we directly cancel the requests linked to the older quality) or a non-urgent one (where we wait for the current requests to finish and only then load the new quality).
- If non-urgent, we still want to start the requests for the new quality as soon as we can, so we parallelize them with the pushing operations of the segments we just loaded from the previous quality.

Thus when a "non-urgent" quality switch happens, there might be a short time during which several quality-linked modules are running at the same time (the old one to push segments, the new one to load them), whereas at first glance they would seem conflicting (one loads and pushes one quality, the other loads and pushes another quality of the same thing).

This led to an awkward architecture where the clean-up process of those modules is subtly different from other RxPlayer modules - this one actually has 2 means of terminating:

- its `terminate` parameter, kind of like a SIGTERM: just finish what you're doing (e.g. finish loading segments and/or pushing them, then stop). Once the `RepresentationStream` (the module in question) has finished loading segments, it sends a `terminating` event - but it might still be pushing segments. It has no event to indicate that segments have been pushed, for now.
- its `cancelSignal` parameter, more akin to a SIGKILL: terminate everything now, without delay. This one is triggered e.g. when stopping the content, changing the track, etc.

The leaking event listener was wrongly linked to that "SIGKILL" signal, even though it was intended to be cleaned up when the module is not needed anymore. When the module was only "SIGTERMed", it was not cleaned up.

---

I chose to clean it up not right when "SIGTERMed", but when the module itself announced that it is "terminating" (it is done loading and is now pushing segments). I found that more appropriate for the logic in question, and a corresponding `CancellationSignal` was already used for other similar logic linked to the same lifetime.
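To make the two termination paths more concrete, here is a very simplified, hypothetical sketch (illustrative names only, not the actual RepresentationStream code); the fix amounts to cleaning the manifestUpdate listener up on the "terminating" lifetime rather than on the "SIGKILL" cancelSignal:

// Simplified sketch; real RxPlayer types and names differ.
type Cb = () => void;

class SketchSignal {
  private cbs: Cb[] = [];
  public register(cb: Cb): void {
    this.cbs.push(cb);
  }
  public trigger(): void {
    this.cbs.forEach((cb) => cb());
    this.cbs = [];
  }
}

class SketchRepresentationStream {
  /** Triggered once the stream is done loading segments (the "SIGTERM" path). */
  public terminatingSignal = new SketchSignal();

  constructor(cancelSignal: SketchSignal) {
    // "SIGKILL" path: everything, including this lifetime, stops right away.
    cancelSignal.register(() => this.terminatingSignal.trigger());
  }

  /**
   * Soft stop: the real module first finishes loading segments; this sketch
   * just announces "terminating" immediately.
   */
  public terminate(): void {
    this.terminatingSignal.trigger();
  }
}

// The leaking listener was registered against the "SIGKILL" cancelSignal;
// registering it against the terminating lifetime also removes it on a
// soft stop (i.e. on a non-urgent quality switch).
const cancelSignal = new SketchSignal();
const stream = new SketchRepresentationStream(cancelSignal);
stream.terminatingSignal.register(() => {
  console.log("manifestUpdate listener cleaned up");
});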
0142e34 to 1fd9df3
This PR fixes #1778