One of the most important research areas for Livepeer is the verification of work on the network. The verification solutions used in the network have evolved in the past few years:
- The original whitepaper proposed using Truebit, an off-chain computation protocol, to report whether a CPU-based deterministic video transcoding task was executed correctly; the result would then be used to determine whether to slash stake
- Initially, the network used trusted “solver nodes” to report whether a CPU-based deterministic video transcoding task was executed correctly, as a temporary trusted solution before migrating to a trust-minimized solution
- The Streamflow network upgrade introduced GPU transcoding, which dramatically increased the capacity and performance potential of the network by allowing cryptocurrency miners to leverage the chips on their existing GPUs for transcoding. However, while there are ways to force CPU video transcoding to be deterministic (with some performance tradeoffs), it became clear that trying to force determinism with GPU video transcoding would be impractical given the lack of available configuration knobs and the diversity of hardware implementations
- As of the Streamflow network upgrade, slashing was disabled in favor of real-time statistical verification that broadcasters can perform using a verifier powered by a machine learning model. The model was trained to detect tampered videos, and a research paper on the model was published. A broadcaster would sample and verify transcoded results returned by orchestrators in real-time and stop working with orchestrators that trigger verification failures
At the moment, broadcasters can use the machine learning verifier mentioned above, but it has a number of shortcomings:
- Increased computation burden for broadcasters because transcoded results need to be decoded (but not re-encoded) to run verification
- Increased latency for streams because there is a delay before results are usable by video players while they are being verified
- While individual broadcasters are able to use the verifier and filter out potentially malicious orchestrators, there is no way for misbehavior to be made transparent to the rest of the network which would protect other broadcasters and would increase the cost incurred by a misbehaving orchestrator
This post outlines a roadmap for making improvements to transcoding verification on the network through a combination of a node software upgrade called “fast verification” and a protocol upgrade called “full verification”.
Fast verification improves on the current machine learning verifier approach by pushing any additional computation away from broadcasters, which are typically constrained in compute resources, to orchestrators, which typically have substantially more compute resources available via the hardware that they’re already using for transcoding.
The following algorithms are used:
- A perceptual hash algorithm `f` that returns a fingerprint for a video segment
- A collision detection algorithm `d` that returns a boolean indicating whether two fingerprints are sufficiently similar to be considered fingerprints for the same video segment
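To make the roles of the two algorithms concrete, here is a minimal toy sketch. It is not the perceptual hash or collision detection used by the network: the frame representation, the mean-threshold hash, the Hamming-distance comparison and the `max_distance_ratio` parameter are all assumptions chosen for illustration.

```python
from typing import List, Sequence

Frame = Sequence[int]  # hypothetical: one frame as a flat list of 0-255 luma samples


def f(frames: Sequence[Frame]) -> List[int]:
    """Toy perceptual hash: one bit per sample, set when the sample exceeds its frame's mean."""
    bits: List[int] = []
    for frame in frames:
        mean = sum(frame) / len(frame)
        bits.extend(1 if sample > mean else 0 for sample in frame)
    return bits


def d(a: Sequence[int], b: Sequence[int], max_distance_ratio: float = 0.1) -> bool:
    """Toy collision detection: collide when the normalized Hamming distance is small."""
    if len(a) != len(b) or not a:
        return False
    distance = sum(x != y for x, y in zip(a, b))
    return distance / len(a) <= max_distance_ratio


original = [[10, 200, 30, 220], [15, 210, 25, 230]]
reencoded = [[12, 198, 33, 215], [14, 205, 28, 226]]  # same picture, slightly different samples
tampered = [[200, 10, 220, 30], [210, 15, 230, 25]]   # different picture content

print(d(f(original), f(reencoded)))  # True
print(d(f(original), f(tampered)))   # False
```

The point of the split is that `f` can be run where the video is already decoded, while `d` only needs the two small fingerprints.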
A broadcaster will maintain an internal list of trusted and untrusted orchestrators. A broadcaster may consider an orchestrator as trusted based on reputation measured through a combination of historical performance on the network and community interaction. A broadcaster would consider an orchestrator as untrusted if it has never interacted with the orchestrator before and has no reputation providing information for the orchestrator. Over the course of using the network, a broadcaster can update the reputation scores for orchestrators in order to update its list of trusted and untrusted orchestrators.
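One way a broadcaster might keep such a list is sketched below. The numeric score, the `trust_threshold` cutoff and the `OrchestratorBook` name are hypothetical; the post does not define how reputation is measured.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class OrchestratorBook:
    """Hypothetical local reputation store kept by a single broadcaster."""

    trust_threshold: float = 0.8  # assumed cutoff, not a protocol parameter
    scores: Dict[str, float] = field(default_factory=dict)

    def update(self, orchestrator: str, score: float) -> None:
        """Record a new reputation score for an orchestrator."""
        self.scores[orchestrator] = score

    def is_trusted(self, orchestrator: str) -> bool:
        # An orchestrator with no recorded reputation is untrusted.
        score = self.scores.get(orchestrator)
        return score is not None and score >= self.trust_threshold

    def partition(self, candidates: Iterable[str]) -> Tuple[List[str], List[str]]:
        """Split candidates into (trusted, untrusted) lists."""
        trusted: List[str] = []
        untrusted: List[str] = []
        for orchestrator in candidates:
            (trusted if self.is_trusted(orchestrator) else untrusted).append(orchestrator)
        return trusted, untrusted


book = OrchestratorBook()
book.update("orch-a", 0.95)
book.update("orch-b", 0.40)
print(book.partition(["orch-a", "orch-b", "orch-new"]))
# (['orch-a'], ['orch-b', 'orch-new'])
```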
During a stream, a broadcaster will run fast verification on certain sampled segments using the following process:
- A segment will be sent to multiple orchestrators, one trusted orchestrator and `N` untrusted orchestrators, to be transcoded
- Orchestrators are required to return the perceptual hash of each of the transcoded renditions for a segment alongside the renditions themselves
- The perceptual hashes from each of the untrusted orchestrators are compared with the perceptual hashes returned from the trusted orchestrator using the collision detection algorithm
- If an untrusted orchestrator either does not return a response within a timeout or returns perceptual hashes that do not collide with those of the trusted orchestrator, it is discarded
If at least one untrusted orchestrator passes fast verification, the broadcaster will choose to use results from one of the untrusted orchestrators that passed. Otherwise, the broadcaster will fall back to the trusted orchestrator. This mechanism allows the broadcaster to progressively interact with more and more untrusted orchestrators while using trusted orchestrators it is already aware of to run fast verification.
Any untrusted orchestrators that fail fast verification will be subject to local ban policies that a broadcaster can use to filter out the orchestrators during selection.
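The per-segment process described above can be sketched roughly as follows. The function names, the use of `None` to model a timeout, and exact string equality as a stand-in for the collision detection algorithm are assumptions for illustration; what a local ban policy does with the discarded orchestrators is left to the caller.

```python
from typing import Callable, Dict, List, Optional, Tuple

Hashes = List[str]  # hypothetical: one perceptual hash per transcoded rendition


def fast_verify(
    trusted_hashes: Hashes,
    untrusted_responses: Dict[str, Optional[Hashes]],
    collide: Callable[[str, str], bool],
) -> Tuple[List[str], List[str]]:
    """Split untrusted orchestrators into (passed, discarded) for one sampled segment."""
    passed: List[str] = []
    discarded: List[str] = []
    for orchestrator, hashes in untrusted_responses.items():
        # None models "no response within the timeout".
        ok = (
            hashes is not None
            and len(hashes) == len(trusted_hashes)
            and all(collide(t, u) for t, u in zip(trusted_hashes, hashes))
        )
        (passed if ok else discarded).append(orchestrator)
    return passed, discarded


def choose_result(trusted: str, passed: List[str]) -> str:
    """Prefer an untrusted orchestrator that passed; otherwise fall back to the trusted one."""
    return passed[0] if passed else trusted


responses = {"orch-a": ["h1", "h2"], "orch-b": ["h1", "XX"], "orch-c": None}
passed, discarded = fast_verify(["h1", "h2"], responses, collide=lambda a, b: a == b)
print(choose_result("trusted-orch", passed))  # orch-a
print(discarded)                              # ['orch-b', 'orch-c']
```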
Full verification complements fast verification by re-introducing in-protocol slashing/economic penalties for orchestrator misbehavior.
The following algorithms are used:
- A verification algorithm `v` that accepts a source segment and a transcoded rendition and returns true if the transcoded rendition is valid relative to the source and false if the transcoded rendition is tampered relative to the source
- `v` requires the full decoding of both the source segment and transcoded rendition
- A natural initial candidate for `v` is to use the perceptual hash algorithm for fast verification, compute the hashes for both the source segment and transcoded rendition and then pass the hashes to the collision detection algorithm
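The candidate construction of the verification algorithm from the perceptual hash and collision detection algorithms amounts to a simple composition. The sketch below assumes both are available as plain functions; the stand-in hash and comparison in the usage lines are placeholders, not real algorithms.

```python
from typing import Callable, TypeVar

Video = TypeVar("Video")
Fingerprint = TypeVar("Fingerprint")


def make_v(
    f: Callable[[Video], Fingerprint],
    d: Callable[[Fingerprint, Fingerprint], bool],
) -> Callable[[Video, Video], bool]:
    """Build v(source, rendition) = d(f(source), f(rendition))."""

    def v(source: Video, rendition: Video) -> bool:
        # Both inputs are hashed, so both must be fully decoded by f.
        return d(f(source), f(rendition))

    return v


# Placeholder f and d: a "video" is a list of scene labels, hashed as a tuple.
v = make_v(f=lambda video: tuple(video), d=lambda a, b: a == b)
print(v(["intro", "goal"], ["intro", "goal"]))  # True
print(v(["intro", "goal"], ["intro", "ad"]))    # False
```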
A broadcaster is configured with the following parameters:
- `fullVerifierList`: A list of verifiers that support executing `v`. Initially, these verifiers can be the same entities on the trusted orchestrator list used for fast verification
- `fullVerificationFrequency`: The percentage of segments that the broadcaster will run full verification for. For example, 1% would result in 1 out of every 100 segments being fully verified, which means running full verification roughly once every 3 minutes (200 seconds, given 2-second segments)
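The arithmetic behind that example can be checked with a few lines. The function name is made up for illustration, and the frequency is expressed as a fraction (0.01 for 1%).

```python
def seconds_between_full_verifications(frequency: float, segment_seconds: float) -> float:
    """Average wall-clock time between fully verified segments."""
    if not 0 < frequency <= 1:
        raise ValueError("frequency must be a fraction in (0, 1]")
    # On average one segment in every 1/frequency segments is verified.
    return segment_seconds / frequency


interval = seconds_between_full_verifications(frequency=0.01, segment_seconds=2.0)
print(round(interval), round(interval / 60, 1))  # 200 3.3
```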
Whenever full verification is required based on `fullVerificationFrequency`, the broadcaster will send a full verification request with the source segment and a transcoded rendition from an orchestrator response to a verifier from `fullVerifierList`. The full verification request is asynchronous, meaning that the broadcaster immediately inserts transcoded renditions from the response into the playlist without waiting for full verification to complete.
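A minimal sketch of this non-blocking flow is shown below, using a thread pool to stand in for the network request. The `handle_response` name, the random sampling and the random choice of verifier are assumptions; the post does not say how a verifier is picked from the list.

```python
import random
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence

Verifier = Callable[[bytes, bytes], bool]  # hypothetical: (source, rendition) -> valid?


def handle_response(
    source: bytes,
    rendition: bytes,
    playlist: List[bytes],
    full_verifier_list: Sequence[Verifier],
    full_verification_frequency: float,
    pool: ThreadPoolExecutor,
    rng: random.Random,
) -> Optional[Future]:
    """Optionally start full verification, then insert the rendition without waiting."""
    pending: Optional[Future] = None
    if rng.random() < full_verification_frequency:
        verifier = rng.choice(full_verifier_list)
        pending = pool.submit(verifier, source, rendition)  # runs in the background
    playlist.append(rendition)  # playback never waits on the verifier
    return pending


playlist: List[bytes] = []
with ThreadPoolExecutor(max_workers=2) as pool:
    pending = handle_response(
        b"source", b"rendition", playlist,
        [lambda s, r: len(r) > 0],  # placeholder verifier
        1.0, pool, random.Random(0),
    )
    verified = pending.result() if pending else None
print(playlist, verified)  # [b'rendition'] True
```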
If full verification fails, the broadcaster will save the following pieces of data:
- The source segment
- The transcoded rendition
- The orchestrator’s digital signature bound to the source segment and transcoded rendition
The purpose of full verification is not to detect tampered video in real-time, but rather to asynchronously detect tampered video and collect cryptographically binding evidence that can be used for dispute resolution.
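The saved evidence could be modelled as a small record like the one below. How the orchestrator’s signature is actually bound to the two pieces of video data is not specified here, so the SHA-256 digest construction is purely an assumption, and the signature is treated as opaque bytes that are not verified.

```python
import hashlib
from dataclasses import dataclass


def binding_digest(source_segment: bytes, transcoded_rendition: bytes) -> bytes:
    """Assumed binding: hash of the two individual SHA-256 hashes, in a fixed order."""
    source_hash = hashlib.sha256(source_segment).digest()
    rendition_hash = hashlib.sha256(transcoded_rendition).digest()
    return hashlib.sha256(source_hash + rendition_hash).digest()


@dataclass(frozen=True)
class Evidence:
    """What a broadcaster keeps after a failed full verification."""

    source_segment: bytes
    transcoded_rendition: bytes
    orchestrator_signature: bytes  # opaque here; the signature scheme is not modelled

    def digest(self) -> bytes:
        """The message the signature would be expected to cover, under the assumption above."""
        return binding_digest(self.source_segment, self.transcoded_rendition)


evidence = Evidence(b"source bytes", b"rendition bytes", b"signature bytes")
print(evidence.digest().hex()[:16])
```

Hashing each input separately before combining them avoids ambiguity about where the source ends and the rendition begins.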
An on-chain dispute resolution mechanism is defined to allow a broadcaster to present cryptographically binding evidence of tampering in order to economically penalize the offending orchestrator.
A broadcaster should only raise a dispute if they are confident that they will win the dispute. This can be achieved by establishing an operational process that involves manually reviewing transcoded renditions that have been flagged by full verification during a certain time period (similar to how automated content moderation systems for certain UGC platforms work). Each transcoded rendition can be eligible for disputes for a fixed time period, e.g. 2 weeks, giving broadcasters ample time to manually review flagged data. If there is manual confirmation that a transcoded rendition is tampered, the broadcaster or anyone else acting on the broadcaster’s behalf can then raise a dispute on-chain by staking `disputeStake` LPT and submitting the source segment, transcoded rendition and the orchestrator’s signature bound to these two pieces of data (in practice, of course, the video data would not be posted on-chain; details on the actual approach are left out for now). The expectation would be that during the dispute window, the arbitrator(s) would inspect evidence presented for a dispute and rule in either the broadcaster’s or the orchestrator’s favor.
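The conditions for raising a dispute can be summarized as a simple check. The 2-week window is the example value from the text; the function and parameter names are hypothetical, and a real implementation would enforce these rules on-chain rather than in broadcaster code.

```python
from datetime import datetime, timedelta

DISPUTE_WINDOW = timedelta(weeks=2)  # example value from the text, not a final parameter


def can_raise_dispute(
    transcoded_at: datetime,
    now: datetime,
    manually_confirmed: bool,
    available_lpt: int,
    dispute_stake: int,
) -> bool:
    """True when the rendition is still eligible, reviewed, and the stake can be posted."""
    within_window = timedelta(0) <= now - transcoded_at <= DISPUTE_WINDOW
    return within_window and manually_confirmed and available_lpt >= dispute_stake


transcoded_at = datetime(2022, 1, 1)
print(can_raise_dispute(transcoded_at, datetime(2022, 1, 10), True, 100, 50))  # True
print(can_raise_dispute(transcoded_at, datetime(2022, 1, 20), True, 100, 50))  # False
```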
For additional flexibility, broadcasters and orchestrators can agree on the arbitrator that should be used in the event that dispute resolution is required. Orchestrators can configure the arbitrators that they support either on-chain or off-chain (probably on-chain to provide visibility to their delegators) and broadcasters can select orchestrators that support their preferred arbitrators. In practice, at the very start, there will only be a single arbitrator implementation, but this architecture would provide flexibility for additional parallel arbitrator implementations to co-exist, allowing broadcasters and orchestrators to choose the one that is best suited to their own preferences. Additionally, this would allow different arbitrator implementations to be used for different video tasks beyond video transcoding, such as AI inference tasks.
If the arbitrator rules in the broadcaster’s favor, the orchestrator would be economically penalized. Potential implementations of this penalty include:
- Slashing the orchestrator’s self-delegated stake
- Slashing all of the orchestrator’s delegated stake
- Jailing the orchestrator by freezing all of its stake for a period of time and excluding it from rewards while it is jailed
If the arbitrator rules in the orchestrator’s favor, the broadcaster’s stake would be seized and could be burned or sent to a protocol treasury.
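The two rulings can be sketched as a small settlement function. This picks one option from each list above (slashing self-delegated stake, and sending the seized stake to a treasury); the 10% default slash, the basis-point parameter and the assumption that a winning broadcaster keeps its dispute stake are illustrative choices, not decisions made in this post.

```python
from dataclasses import dataclass
from enum import Enum


class Ruling(Enum):
    BROADCASTER = "broadcaster"
    ORCHESTRATOR = "orchestrator"


@dataclass(frozen=True)
class Balances:
    orchestrator_self_stake: int    # in the smallest LPT unit
    broadcaster_dispute_stake: int
    treasury: int = 0


def settle(state: Balances, ruling: Ruling, slash_bps: int = 1000) -> Balances:
    """Apply the economic outcome of a dispute; slash_bps is in basis points (1000 = 10%)."""
    if ruling is Ruling.BROADCASTER:
        penalty = state.orchestrator_self_stake * slash_bps // 10_000
        # Assumption: the slashed amount is burned and the dispute stake is kept.
        return Balances(
            state.orchestrator_self_stake - penalty,
            state.broadcaster_dispute_stake,
            state.treasury,
        )
    # Orchestrator wins: the broadcaster's stake is seized and sent to the treasury.
    return Balances(
        state.orchestrator_self_stake,
        0,
        state.treasury + state.broadcaster_dispute_stake,
    )


start = Balances(orchestrator_self_stake=1000, broadcaster_dispute_stake=50)
print(settle(start, Ruling.BROADCASTER))
print(settle(start, Ruling.ORCHESTRATOR))
```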
Candidate arbitrator implementations include:
- A multisig operated by a governance elected council
- A DAO governed by network stakeholders
- A decentralized court system such as Aragon Court or Kleros, which are based on “Schelling coin games” where jurors are incentivized to resolve a dispute based on what they believe other jurors think. Tamper detection may actually be a good use case for Schelling coin games since the tampers should be immediately obvious to the human eye (if they aren’t, then either it is questionable whether a tamper actually occurred or the definition of tampering used for disputes needs to be clarified). If the tampers are immediately obvious to the human eye, then that is a clear Schelling point, i.e. common knowledge that jurors can use to make assumptions about how other jurors will rule
- An escalation game system such as reality.eth that allows resolutions to be challenged by paying higher and higher fees until the final arbitration is invoked which could actually be a system like Aragon Court or Kleros - for example, Gnosis’ Omen prediction market uses Kleros as the final arbitrator. Alternatively, the final arbitration implementation could be based on a system exclusive to LPT holders. The expectation is that in most cases, final arbitration is not needed, but its mere presence in the mechanism is an incentive for people to tell the truth so that it is never invoked
- A cryptographic system for proving that a video was transformed in certain permissible ways - see PhotoProof for a precedent with authenticating image transformations
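To illustrate the incentive behind the Schelling coin games mentioned in the list above, here is a deliberately simplified payout rule. It is a toy model and does not reflect the actual mechanics of Aragon Court or Kleros: every juror posts the same deposit, votes on whether the rendition is tampered, and jurors in the minority forfeit their deposit to the majority.

```python
from collections import Counter
from typing import Dict


def schelling_payouts(votes: Dict[str, bool], deposit: int) -> Dict[str, int]:
    """Return each juror's payout; votes map juror -> 'rendition is tampered'."""
    tally = Counter(votes.values())
    if tally[True] == tally[False]:
        return {juror: deposit for juror in votes}  # tie: refund everyone
    outcome = tally[True] > tally[False]
    winners = [juror for juror, vote in votes.items() if vote == outcome]
    forfeited = deposit * (len(votes) - len(winners))
    share = forfeited // len(winners)  # integer split; any remainder is ignored here
    return {juror: deposit + share if votes[juror] == outcome else 0 for juror in votes}


print(schelling_payouts({"alice": True, "bob": True, "carol": False}, deposit=10))
# {'alice': 15, 'bob': 15, 'carol': 0}
```

When a tamper is obvious to the eye, each juror expects the others to vote "tampered", so voting honestly is also the profitable choice.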
- Fast verification can provide a real-time verification mechanism with security guarantees that can be good enough for certain use cases
- Full verification with dispute resolution complements fast verification by serving as the last line of defense: if an attacker bypasses fast verification, the threat of being caught by full verification and being economically penalized during dispute resolution raises the cost of attack
- Deploy fast verification
- Complete research for full verification
- Define an initial arbitrator implementation
- Define the initial penalty implementation
- Create a pre-proposal with the initial specs for a full verification protocol upgrade
- Get community feedback on the pre-proposal