Hi everyone,
My name is Alex, but some of you may know me as “syrinx node”.
As a TPM at LivePeer and an orchestrator, I see one recurring pain point in our community: efficiently communicating and raising issues with LivePeer Inc and its engineers. Critical issues could be breaking changes to Go-LivePeer or an ingest node going down for a few hours. Less critical issues could be things like UI improvements to LivePeer explorer, persistent (non-breaking) bugs in Go-LivePeer, or future feature requests that aren’t quite defined enough to be submitted as a GitHub issue.
As the LivePeer org has grown, keeping communication with engineering efficient, balanced, and rich in context has become paramount. Unfortunately, it’s become more and more difficult to handle this process solely with GitHub issues and scattered Discord threads. That said, we love Discord and would like to propose a draft solution to make this process as easy as possible for orchestrators.
Below is v0.1 of this process. If you have suggestions or disagree with the approach, please respond below; ideally, in a week or so we’ll have something we can deploy.
Incident Escalation Pipeline
The first iteration will take the form of a Discord ticketing plugin: tickets will be created directly from Discord and assigned a ticket number.
Creating a ticket will require:
- category describing the nature of the issue
  - incident / outage
  - service abnormality / broken feature
  - feature request
  - long term eng request (in response to broken feature or service abnormality)
- description including context of issue (will require a minimum length)
- indication of severity
  - this may be removed, but if used properly would help triage faster
  - Low - service is functioning well, but would greatly benefit by raising this issue
    - should require more context than others
    - each should require a minimum length explanation / description of what is requested
  - Medium - service is functioning but sub-optimally
  - High - service is down / immediately negatively affecting LivePeer network
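To make the required fields a little more concrete, here is a rough sketch of what a ticket record and its submission check could look like. Nothing here is the actual plugin: the `Ticket` class, the `validation_errors` helper, and the 80-character minimum are all hypothetical, chosen only to illustrate the category, description, and severity requirements listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    INCIDENT = "incident / outage"
    ABNORMALITY = "service abnormality / broken feature"
    FEATURE_REQUEST = "feature request"
    LONG_TERM_ENG = "long term eng request"


class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical threshold: the proposal only says a minimum length is required.
MIN_DESCRIPTION_CHARS = 80


@dataclass
class Ticket:
    number: int
    category: Category
    severity: Severity
    description: str


def validation_errors(ticket: Ticket) -> list[str]:
    """Return a list of problems; an empty list means the ticket can be submitted."""
    errors = []
    length = len(ticket.description.strip())
    if length < MIN_DESCRIPTION_CHARS:
        errors.append(
            f"description has {length} characters, "
            f"needs at least {MIN_DESCRIPTION_CHARS}"
        )
    return errors


# Example: a ticket with too little context is rejected with a clear message.
draft = Ticket(1, Category.INCIDENT, Severity.HIGH, "ingest node down")
print(validation_errors(draft))
```

Using fixed enumerations for category and severity means a ticket can only ever carry one of the agreed values, which should make the later triage step simpler.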
Once a ticket is submitted, its associated ticket number should be used when other Discord users are referring to a similar issue. Ideally, this will help us understand how widespread the issue is, instead of multiple users submitting identical tickets.
For now, tickets will be triaged by me or other members of the product team and then escalated to engineering as necessary. For instance, urgent queries regarding transcoder infrastructure (ingest nodes, test streams, etc.) will be routed with higher priority than feature requests or UI bugs.
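As an illustration of the routing idea, the sketch below orders a small queue so that severe incidents come before feature requests. The exact ranking tables and the `triage_key` function are assumptions made for this example, not a committed policy; in practice the product team would still make the final call.

```python
# Lower rank = handled first. Both rankings are hypothetical.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}
CATEGORY_RANK = {
    "incident / outage": 0,
    "service abnormality / broken feature": 1,
    "long term eng request": 2,
    "feature request": 3,
}


def triage_key(ticket: dict) -> tuple:
    """Sort by severity, then category, then ticket number (oldest first)."""
    return (
        SEVERITY_RANK[ticket["severity"]],
        CATEGORY_RANK[ticket["category"]],
        ticket["number"],
    )


queue = [
    {"number": 101, "severity": "low", "category": "feature request"},
    {"number": 102, "severity": "high", "category": "incident / outage"},
    {"number": 103, "severity": "medium",
     "category": "service abnormality / broken feature"},
]

ordered = sorted(queue, key=triage_key)
print([t["number"] for t in ordered])  # [102, 103, 101]
```

Breaking ties on the ticket number keeps older tickets ahead of newer ones at the same severity and category.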
When a ticket is completed, a response will be posted with the ticket number. Ideally, this will make responses easier to find and help keep a better track record of recurring errors or queries.
Again, this is a draft proposal - we’re curious to hear your feedback and suggestions to improve this before its initial implementation next week.
Eager to hear your feedback!