{"id":11,"date":"2023-09-26T14:21:55","date_gmt":"2023-09-26T13:21:55","guid":{"rendered":"http:\/\/patrick.direct\/?p=11"},"modified":"2023-10-31T10:25:42","modified_gmt":"2023-10-31T10:25:42","slug":"low-cost-hardware","status":"publish","type":"post","link":"http:\/\/patrick.direct\/index.php\/2023\/09\/26\/low-cost-hardware\/","title":{"rendered":"Getting Started"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">The barrier to entry for machine learning experiments is pretty high &#8211; to run any model yourself, let alone train one, you need a decently modern CPU, ideally a CUDA-capable graphics card with at least 8GB of VRAM, oodles of disk space and oodles of system RAM (64GB is generally considered a minimum starting point). This kind of hardware should be sufficient to run large language models (LLMs) either from the command line or using one of a handful of web-server-based front-ends such as [<a href=\"https:\/\/github.com\/oobabooga\/text-generation-webui\">Oobabooga<\/a>]. Whilst these needs can be met by cloud providers, and there are even brokerages that link those with spare processor cycles to end-users, it all comes at a hefty cost, and cloud-based solutions have significant disadvantages given the size of the machine learning models that we&#8217;ll be shuffling across the intertubes. So instead I decided to explore a less conventional route and look for an ex-enterprise server that I could run from home.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Selecting the right server: A false start<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">My first try was a false start: an HP DL380 Gen 8 with a couple of E5-2609 Xeon processors and 64GB of RAM, bought for \u00a375.
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The immaculate hardware spun up perfectly (I love the way ex-data centre gear is always so clean and dust-free), and I opted to install Proxmox on an SSD to give me some flexibility in experimenting with alternative configurations as either containers or virtual machines on the physical host. However, I rapidly came to realise the shortcomings of this configuration &#8211; much of the open source LLM-related software requires additions to the CPU instruction set that only appeared relatively recently. It <em>is<\/em> possible to upgrade the CPUs, but as far as I can tell the best the server supports is the E5-2600 v2 family (the E5-2690 v2, for example), which offers AVX but not the elusive AVX2; that only arrived with the later v3 (Haswell) generation. Given that 64GB of RAM was also clearly going to be a limiting factor, I decided to cut my losses and shop around again.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Second Time Lucky?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Trawling the world&#8217;s largest online bootsale, I lucked out with a DL380 Gen 9 with a monster spec: 180GB RAM, a pair of Xeon processors totalling 48 cores, modern enough to support AVX2 instructions, and a bonus dual-port 10Gb network card, all for the princely sum of \u00a3220.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As is common for this kind of purchase, the server came with no disks or caddies. Whilst caddies are readily available, I opted for a <a href=\"https:\/\/www.printables.com\/model\/470886-lff-caddy-for-hpe-proliant-gen9-for-35-or-25-drive\">quick 3D print<\/a> instead.
Now, 2U form-factor servers have a bit of a reputation for being noisy, but I&#8217;ve found that when using SSDs (nothing special, normal consumer-grade SATA disks) in preference to spinning rust, the server is essentially silent.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I&#8217;ve decided to run with Proxmox to give me flexibility in terms of run-time OS environments, and to allow for throw-away experiments on VMs. Once I have a bit of a handle on the basics, I&#8217;ll think about adding a GPU or two.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The barrier to entry for machine learning experiments is pretty high &#8211; to run any model yourself, let alone train one, you need a decently modern CPU, ideally a CUDA-capable graphics card with at least 8GB of VRAM, oodles of disk space and oodles of system RAM (64GB is generally considered a minimum starting point). [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"class_list":["post-11","post","type-post","status-publish","format-standard","hentry","category-hardware"],"_links":{"self":[{"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/posts\/11","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/comments?post=11"}],"version-history":[{"count":9,"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/posts\/11\/revisions"}],"predecessor-version":[{"id":25,"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/posts\/11\/revisions\/25"}
],"wp:attachment":[{"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/media?parent=11"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/categories?post=11"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/patrick.direct\/index.php\/wp-json\/wp\/v2\/tags?post=11"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}