Overview
I am using Spark (PySpark) to distribute big data jobs across servers. I would like to use Spark to distribute PyAV encoding and filtering across many servers.
Expected behavior
The distributed function should run PyAV features that need to be aware of the previous and/or next frames in the GOP.
How can these features be implemented so that they stay GOP-aware?
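To make the question concrete, here is a minimal sketch of the kind of GOP-aware partitioning I have in mind. It is not working PyAV or Spark code: `split_into_gops` and the `(is_keyframe, payload)` tuples are hypothetical stand-ins for demuxed packets (in PyAV the flag would presumably come from `packet.is_keyframe`). It also assumes closed GOPs, where no frame references a frame outside its own group.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for a demuxed packet: (is_keyframe, payload).
Packet = Tuple[bool, bytes]


def split_into_gops(packets: Iterable[Packet]) -> Iterator[List[Packet]]:
    """Group consecutive packets so that each group starts at a keyframe.

    Assumes closed GOPs: every frame in a group depends only on frames
    of that same group.
    """
    current: List[Packet] = []
    for packet in packets:
        is_keyframe = packet[0]
        # A keyframe closes the group collected so far and opens a new one.
        if is_keyframe and current:
            yield current
            current = []
        current.append(packet)
    # Emit the trailing group, if any.
    if current:
        yield current


# Toy stream: two GOPs, each starting with an I-frame.
stream = [
    (True, b"I0"), (False, b"P1"), (False, b"B2"),
    (True, b"I3"), (False, b"P4"),
]
gops = list(split_into_gops(stream))
```

Each list in `gops` could then become one element of an RDD (for example via `sc.parallelize(gops)`), so that a single task receives a whole GOP rather than an isolated frame. Whether PyAV packets can be shipped to the workers this way is exactly what I am unsure about.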
Actual behavior
A single frame sent to a separate job cannot be processed on its own, because the job has no access to the other frames of its GOP.
Investigation
Giving Spark the context by serializing the Frame object with pickle is not enough; see #652.
Research
I have done the following:
https://gitter.im/mikeboers/PyAV?at=5eab0dd59f0c955d7d97bbb1
Additional context
@koenvo, maybe you can help me with this, as you proposed in #652 (comment).