Livepeer + Ethereum (ETH) mining

One of the reasons I joined the Livepeer community is that, as a traditional cryptocurrency miner, I can run Livepeer in parallel with traditional Proof-of-Work (PoW) mining with minimal impact on either service.

I think the obvious choice for PoW + Livepeer is ETH, since the fees required to make reward calls are paid in ETH. The basis for my research is the following paper:

What we can tell you is that we have experienced instability with this configuration, to the point where the hardware platform required a complete hard power-down and restart. Let us present the hardware components, software versions, and added or modified configuration files.

Triple AMD Radeon Sapphire RX470 8GB - Stock Firmware
1000 W EVGA power supply
Built on a custom open-air frame

OS: Ubuntu 18.04.2 LTS
Livepeer: v0.3.3
Mining Software: Ethminer 0.17.1

As part of our optimization, we limit the maximum power draw of the GPUs and spin the fans faster than the factory setting to keep temperatures at a reasonable level (< 60 °C).

GPU Setup

# OpenCL environment variables commonly set for AMD mining
export GPU_FORCE_64BIT_PTR=1
export GPU_MAX_HEAP_SIZE=100
# The sysfs writes below must be run as root.
# Enable manual fan control (pwm1_enable: 1 = manual)
echo 1 > /sys/class/drm/card0/device/hwmon/hwmon0/pwm1_enable
echo 1 > /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable
echo 1 > /sys/class/drm/card2/device/hwmon/hwmon2/pwm1_enable
# Set fan speed (pwm1 range is 0-255, so 180 is roughly 70%)
echo 180 > /sys/class/drm/card0/device/hwmon/hwmon0/pwm1
echo 180 > /sys/class/drm/card1/device/hwmon/hwmon1/pwm1
echo 180 > /sys/class/drm/card2/device/hwmon/hwmon2/pwm1
# Set power cap in microwatts (default value 110000000, i.e. 110 W)
echo 95000000 > /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
echo 95000000 > /sys/class/drm/card1/device/hwmon/hwmon1/power1_cap
echo 95000000 > /sys/class/drm/card2/device/hwmon/hwmon2/power1_cap
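
Since the same three writes are repeated for every card, they could also be done in a loop. The following is only a minimal sketch under a few assumptions: the names DRM_ROOT, FAN_PWM, POWER_CAP_UW and apply_gpu_settings are made up for illustration, and the hwmon directory is found with a glob because its index is not guaranteed to match the card index. The values are the same as in the setup above, and nothing is written unless the script is called with --apply.

```shell
#!/usr/bin/env bash
# Sketch: apply the same fan and power settings to every card in a loop.
# DRM_ROOT, FAN_PWM and POWER_CAP_UW are illustrative names (assumptions).
DRM_ROOT="${DRM_ROOT:-/sys/class/drm}"
FAN_PWM="${FAN_PWM:-180}"                  # 0-255 scale
POWER_CAP_UW="${POWER_CAP_UW:-95000000}"   # microwatts

apply_gpu_settings() {
    local hwmon
    # card[0-9] matches card0..card9 only; the hwmon index is globbed
    # because it does not have to equal the card index.
    for hwmon in "$DRM_ROOT"/card[0-9]/device/hwmon/hwmon[0-9]*; do
        [ -d "$hwmon" ] || continue
        echo 1 > "$hwmon/pwm1_enable"            # manual fan control
        echo "$FAN_PWM" > "$hwmon/pwm1"          # fan speed
        echo "$POWER_CAP_UW" > "$hwmon/power1_cap"  # power cap
        echo "configured $hwmon"
    done
}

# Only touch the hardware when explicitly asked to (needs root).
if [ "${1:-}" = "--apply" ]; then
    apply_gpu_settings
fi
```

Saved under a hypothetical name such as gpu_setup.sh, it would be run as: sudo ./gpu_setup.sh --apply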

Mining Command Line

./ethminer --tstop 60 --tstart 50 --farm-recheck 200  --HWMON 1 -R -G stratum://

This configuration is currently running at 74 MH/s.
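
As we understand the ethminer options, --tstop 60 and --tstart 50 in the command line above suspend mining on a GPU that gets hotter than 60 °C and resume it once it cools below 50 °C. To check the temperatures independently of the miner, the same hwmon interface used for the fan and power settings can be read directly. This is a small sketch, not part of our setup: DRM_ROOT, TEMP_LIMIT_C and report_gpu_temps are hypothetical names, and it assumes the driver exposes temp1_input in millidegrees Celsius.

```shell
#!/usr/bin/env bash
# Sketch: print each GPU temperature from the hwmon temp1_input file,
# which reports millidegrees Celsius.
# DRM_ROOT and TEMP_LIMIT_C are illustrative names (assumptions).
DRM_ROOT="${DRM_ROOT:-/sys/class/drm}"
TEMP_LIMIT_C="${TEMP_LIMIT_C:-60}"

report_gpu_temps() {
    local sensor milli celsius
    for sensor in "$DRM_ROOT"/card[0-9]/device/hwmon/hwmon[0-9]*/temp1_input; do
        [ -r "$sensor" ] || continue
        milli=$(cat "$sensor")
        celsius=$((milli / 1000))   # integer degrees Celsius
        if [ "$celsius" -ge "$TEMP_LIMIT_C" ]; then
            echo "$sensor: ${celsius}C (at or above ${TEMP_LIMIT_C}C)"
        else
            echo "$sensor: ${celsius}C"
        fi
    done
}

# Print a report only when asked to.
if [ "${1:-}" = "--report" ]; then
    report_gpu_temps
fi
```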

We have seen issues with system stability. We believe they are caused either by the mining software alone or by the mining software running in conjunction with Livepeer. When the system runs Livepeer alone, we have not seen any issues.

Once we see these errors, the system becomes nearly unresponsive. It takes around three weeks to reach this state.

The next step is to cycle the mining software at set intervals to give the operating system a chance to perform garbage collection and reset system resources. This will be done with a combination of timeout and cron.

If you would like to support the efforts of this project, please consider bonding to transcoder 0x71d:

Here’s an update on what we have done and what we have found. It is all interesting, and the bottom line up front is that it appears to be an issue with the AMD drivers released for Ubuntu 18.04.

We started by creating a cron job, as we mentioned, that used timed run/cool-down cycles. This was an attempt to clear out any memory stack issues or other cumulative errors that appear. The errors have been isolated 100% to ethminer, and it appears that the Livepeer transcoder has no negative impact. Here is the cycle chart of our run/cool-down: 50 min hot, 10 min off.
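
For anyone who wants to try the same 50 min on, 10 min off cycle, a wrapper along these lines would do it with cron and timeout. Treat it as a sketch rather than our exact script: run_cycle, RUN_TIME and the paths in the crontab comment are hypothetical, and the miner command is passed in as arguments instead of being hard-coded.

```shell
#!/usr/bin/env bash
# Sketch of one run/cool-down cycle: cron starts this script at the top of
# every hour and timeout stops the miner after RUN_TIME.
# RUN_TIME, run_cycle and the crontab paths are assumptions.
RUN_TIME="${RUN_TIME:-50m}"

run_cycle() {
    # timeout sends SIGTERM when RUN_TIME expires and exits with status 124.
    timeout "$RUN_TIME" "$@"
    local status=$?
    if [ "$status" -eq 124 ]; then
        echo "miner stopped after $RUN_TIME, cooling down until the next cron start"
        return 0
    fi
    # Any other status means the miner ended by itself before the time limit.
    echo "miner exited early with status $status" >&2
    return "$status"
}

# Example crontab entry (hypothetical paths):
#   0 * * * * /home/miner/run_cycle.sh --run /home/miner/ethminer <miner options>
if [ "${1:-}" = "--run" ]; then
    shift
    run_cycle "$@"
fi
```

With an hourly cron start and a 50 minute limit, the remaining 10 minutes of each hour are the cool-down window.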

It ran for days but eventually began throwing errors, and only during the windows in which ethminer was running. After much research, it appears that other miners in the traditional Proof-of-Work space who mine on Ubuntu see the same problem with both Ethminer and Claymore (dual miner). The common result is that once the issue occurs, the only viable option is to reboot the machine to clear the state.

All in all, it appears we need an update from AMD.