AI Alpha subnet GPU / hardware poll

The objective of this thread is to have an open discussion on choosing the optimal Nvidia GPUs for Orchestrators participating in the AI Alpha Subnet. A top 5 or top 10 list, if you will.

This chart summarizes the GPUs and their power consumption:

Cards benched (512x512, 50 steps, SD v1.5 webui):

  • 1080 Ti 11 GB (Not buying this GPU at release will haunt me in my dreams)
  • 3060 Ti 12 GB (Cheapish low-end)
  • 3070 Ti 8 GB (way worse than I expected)
  • 2080 Ti 11 GB (Cheapest low-end, especially used)
  • 3080 12 GB (Prices have dropped since 4xxx release)
  • 3090 24 GB (Best)

I found image generation performance similar to Tom’s Hardware’s results.
The absolute best GPU appears to be the US $2200 24 GB RTX 4090 (Top GPU in 2023 / 2024 or until the 4090 Ti / Titan release).

  • It should smoke even the A100 assuming inference workloads that fit within 24 GB.
  • It will run just fine in PCIe Gen 3.0 x16 (~16 GB/s) slots, with only a roughly 5% performance drop compared to PCIe Gen 4.0 x16 (~32 GB/s) and a negligible difference vs. PCIe Gen 5. This is useful when selecting a motherboard: even older HEDT platforms like X99 (40 PCIe lanes) will work.
  • Driving more than two 4090s in a system will draw significant power and three might be the limit before liquid cooling is required.
  • 4 - 6x 4090s can be driven in custom desktop and 2U rackmount chassis (dual-socket Epyc 1+TB RDIMM) which ship with 4x 2400W redundant power supplies.
  • Workloads are sensitive to VRAM, and I assume they use only FP16. Please correct me if I am wrong.
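
For anyone who wants to sanity-check the PCIe point in the list above, here is a rough sketch of the theoretical one-direction bandwidth of an x16 link per generation. It uses the standard per-lane transfer rates and 128b/130b encoding; the function name is just illustrative, and real-world throughput will be lower than these ceilings.

```python
# Theoretical one-direction bandwidth of a PCIe link, per generation.
# Gen 3, 4 and 5 all use 128b/130b line encoding.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0, 5: 32.0}  # per lane
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(generation: int, lanes: int = 16) -> float:
    """Return the theoretical bandwidth in GB/s (1 GB = 1e9 bytes)."""
    gbit_s = TRANSFER_RATE_GT_S[generation] * lanes * ENCODING_EFFICIENCY
    return gbit_s / 8  # bits -> bytes


for gen in sorted(TRANSFER_RATE_GT_S):
    print(f"PCIe Gen {gen} x16: {link_bandwidth_gb_s(gen):.1f} GB/s")
```

Each generation doubles the previous one, so a Gen 3 x16 slot offers about half the link bandwidth of Gen 4 x16. Inference that keeps the model resident in VRAM rarely saturates the link, which would be consistent with the small drop mentioned above.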

This post does not cover benchmarks in detail; as noted above, my results were similar to Tom’s Hardware’s.


Also see GPU Benchmarks for Deep Learning | Lambda

It seems dual-GPU work is out, and fractional workloads aren’t supported on consumer GPUs.
We need a benchmark tool that internally categorises these GPUs by capability (generation?) first, then by performance.
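
A minimal sketch of what that two-level ordering could look like, assuming "capability" simply means architecture generation. The `BenchResult` type, the `rank` function, and the scores are all hypothetical; the scores are placeholders, not measurements.

```python
from dataclasses import dataclass

# Architecture order, oldest to newest; used as the primary sort key.
GENERATION_RANK = {"Pascal": 0, "Turing": 1, "Ampere": 2, "Ada": 3}


@dataclass
class BenchResult:
    name: str
    generation: str  # key into GENERATION_RANK
    score: float     # e.g. images per minute; higher is better


def rank(results):
    """Sort by generation (newest first), then by score (highest first)."""
    return sorted(
        results,
        key=lambda r: (GENERATION_RANK[r.generation], r.score),
        reverse=True,
    )


# Placeholder scores, NOT benchmark results.
sample = [
    BenchResult("2080 Ti", "Turing", 1.0),
    BenchResult("3090", "Ampere", 2.0),
    BenchResult("3080", "Ampere", 1.0),
    BenchResult("4090", "Ada", 1.0),
]

for r in rank(sample):
    print(r.generation, r.name, r.score)
```

Sorting on a tuple keeps the generation as the dominant key, so a fast older card never outranks a newer one; whether that is the right policy is exactly the open question here.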

Any Turing GPU can be a starting point (with patience), but dual air-cooled 3090 / 4090 rigs should be the sweet spot?
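
To put rough numbers on the dual-card idea, here is a back-of-the-envelope PSU estimate. The board power figures are Nvidia's reference specs (350 W for the 3090, 450 W for the 4090); the 300 W allowance for the rest of the system and the 20% headroom are my assumptions, so adjust them for your own build.

```python
# Reference board power in watts (Nvidia reference specs).
GPU_BOARD_POWER_W = {"3090": 350, "4090": 450}


def psu_estimate_w(gpu: str, count: int,
                   rest_of_system_w: float = 300,  # assumption: CPU, RAM, drives, fans
                   headroom: float = 0.2) -> float:  # assumption: 20% margin
    """Return a suggested PSU capacity in watts for `count` identical GPUs."""
    load = GPU_BOARD_POWER_W[gpu] * count + rest_of_system_w
    return load * (1 + headroom)


for gpu in ("3090", "4090"):
    print(f"2x {gpu}: ~{psu_estimate_w(gpu, 2):.0f} W PSU")
```

This ignores transient power spikes, which can exceed the rated board power for short periods.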

Practical low end, pool entry-level: 2080 Ti 11 GB < 3060 Ti 12 GB
Orchestrator grade: 3080 12 GB < 4080 16 GB < 3090 24 GB < 4090 24 GB

Which GPUs would you deploy?

  • RTX 2080 Ti 11 GB
  • RTX 3060 Ti 12 GB
  • RTX 3080 12 GB
  • RTX 3080 Ti 12 GB
  • RTX 3090 24 GB
  • RTX 3090 Ti 24 GB
  • RTX 4070 Ti 12 GB
  • RTX 4080 16 GB
  • RTX 4090 24 GB
  • Other

For me it would depend. The capex on 3090s, 3090 Tis, or 4090s is a tad high, so it would really come down to how quickly that investment pays off. For now, 4080s seem like the best value option to me, even though they have a bit less VRAM.
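
The payback question can be framed as simple arithmetic. Below is a rough sketch; the function names are illustrative, and the daily revenue and electricity price are purely hypothetical inputs, not network data. Only the US $2200 4090 price comes from the opening post.

```python
def daily_power_cost(watts: float, price_per_kwh: float) -> float:
    """Cost of running a card at `watts` for 24 hours."""
    return watts / 1000 * 24 * price_per_kwh


def payback_days(capex: float, daily_revenue: float,
                 watts: float, price_per_kwh: float) -> float:
    """Days until net income covers the purchase price."""
    net = daily_revenue - daily_power_cost(watts, price_per_kwh)
    if net <= 0:
        return float("inf")  # never pays off at these rates
    return capex / net


# Hypothetical inputs: 10.0/day revenue and 0.15 per kWh are made up.
days = payback_days(capex=2200, daily_revenue=10.0, watts=450, price_per_kwh=0.15)
print(f"Payback: {days:.0f} days")
```

Plugging in real earnings once the subnet has some history would make the comparison between cards much less of a guess.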


What about a card like the A6000 or A6000 Ada? They would have much more VRAM available, but are also slower and a lot more expensive. Would there be any value in such a card, or would multiple 4090s simply be a more efficient choice?

EDIT:
I asked about this during the last treasury chat; here is the convo:

Marco | captain-stronk.eth — 2024/01/03, 18:26:08
@Lazydayz137 is there an gpu you would consider ‘optimal’?

Lazydayz137 — 2024/01/03, 18:26:38
24gb ram…3090 Best Buy if not sending to data center and you can run consumer.

Marco | captain-stronk.eth — 2024/01/03, 18:28:07
is there a merit to looking into cards like A6000? or better to get multiple 4090 or 3090’s ?

Lazydayz137 — 2024/01/03, 18:53:25
If in data center want a500 or 6000

Lazydayz137 — 2024/01/03, 18:53:59
3090 best bang for buck and that 25 holds 7b models to train at like 16gp or something that makes it magic

Lazydayz137 — 2024/01/03, 18:54:21
But I’m trying to spread o we and see what workloads can be distributed amongst gpu’s like already on network
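
As a rough check on the 7B comment in the chat above, here is a sketch that estimates how much VRAM the model weights alone take at a given precision. It only counts weights; activations, gradients, and optimizer state (which matter a lot for training) are not modelled, and the function names are illustrative.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weights_gb(num_params: float, dtype: str = "fp16") -> float:
    """Size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def weights_fit(num_params: float, vram_gb: float, dtype: str = "fp16") -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weights_gb(num_params, dtype) <= vram_gb


print(f"7B @ fp16: {weights_gb(7e9):.1f} GB")
print("Fits in 24 GB:", weights_fit(7e9, 24))
print("Fits in 12 GB:", weights_fit(7e9, 12))
```

A 7B model at FP16 needs about 14 GB for weights, so it fits on a 24 GB card with room left for activations, but not on a 12 GB card without quantization.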


Maybe the 4070 Ti Super with 16 GB of VRAM could be a nice option to start with?

I’m also wondering about the importance of the CPU and RAM in this type of configuration.