Transcoding Verification Improvements: Fast & Full Verification

How sensitive is the collision detection algorithm comparing the result of the perceptual hash? Could the output from different generation NVENCs be enough to cause a failed verification?

The latest tests we’ve run using MPEG-7 video signatures as the perceptual hash (phash) compared:

  • phashes generated for different renditions (e.g. 720p vs. 360p) transcoded by the same GPU
  • phashes generated for the same rendition transcoded by a GPU vs. a CPU

In these tests, I believe the accuracy has been around 98%. However, the tests are not exhaustive in the scenarios they cover (for example, we have not thoroughly tested comparisons across different GPU models). For this reason, the intention is to do additional testing in the wild on the network, where there is likely a greater variety of GPU models, and use the results to improve the perceptual hash algorithm, the fast verification implementation, or both.

As mentioned in an earlier comment, we expect false negatives to be possible (i.e. no collision even though the phashes correspond to videos with the same content), so the fast verification implementation (i.e. in the local ban policy) will consider the frequency of failures instead of treating a single failure as an indication of orchestrator misbehavior.
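
To make the "frequency of failures" idea concrete, here is a minimal Python sketch of what such a policy could look like. It is only an illustration, not the actual implementation: the class name, the sliding window size, the minimum sample count and the failure-rate threshold are all assumptions I picked for the example, and real values would need to come out of the testing on the network described above.

```python
from collections import deque


class FailureRateBanPolicy:
    """Hypothetical local ban policy (illustration only).

    Bans an orchestrator based on how often fast verification fails
    over a sliding window, not on any single failed check.
    """

    def __init__(self, window=50, min_samples=10, max_failure_rate=0.2):
        # All three parameters are assumed values for this sketch.
        self.window = window                      # recent checks kept per orchestrator
        self.min_samples = min_samples            # checks required before judging
        self.max_failure_rate = max_failure_rate  # tolerated fraction of failures
        self._results = {}                        # orchestrator id -> deque of bools (True = failed)

    def record(self, orchestrator, passed):
        """Store the outcome of one fast verification check."""
        results = self._results.setdefault(orchestrator, deque(maxlen=self.window))
        results.append(not passed)

    def failure_rate(self, orchestrator):
        """Fraction of failed checks in the current window (0.0 if none recorded)."""
        results = self._results.get(orchestrator)
        if not results:
            return 0.0
        return sum(results) / len(results)

    def should_ban(self, orchestrator):
        """True only when enough samples exist and failures exceed the threshold."""
        results = self._results.get(orchestrator, ())
        if len(results) < self.min_samples:
            return False
        return self.failure_rate(orchestrator) > self.max_failure_rate
```

With a policy shaped like this, an orchestrator that occasionally trips a false negative stays well under the threshold, while one that fails a large share of its checks gets banned. A single failure on its own never triggers a ban because of the minimum sample count.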
