Hi,
I have a friend who is looking to run a few simulations he has implemented in Python and needs around 256 GB of RAM. He estimates it will take a couple of hours, but he is studying economics, so take that with a grain of salt 🤣
For this instance, I recommended GCP, but I felt a bit dirty doing that. So, I was wondering if any of you have a buttload of memory he can borrow? Generally, would you lend your RAM for a short amount of time to a stranger over the internet? (assuming internet access is limited to a single SSH port and other necessary safeguards are in place)
got a bag full of SIMMs, probably not a whole buttload but I don’t think even that amount would add up to 256GB
Apply for compute time at a university cluster. It is free and usually easy.
AWS has an r4.8xlarge with 244 GB of RAM and 32 vCPUs for $2.13 an hour if they can handle Linux, or $2.81 an hour for Windows.
First, define what you are asking for.
Do you want someone to send you a cardboard box full of RAM? Then forget it. Nobody would be stupid enough to lend that much expensive hardware to someone on the internet.
Or are you asking for someone to let you run random code on their PC for a few hours? Then forget it. Nobody would be stupid enough to open “a single SSH port” to someone on the internet to run potential malware on their PC.
That’s exactly what cloud platforms are there for, and if you don’t like google, get any other cloud provider.
Seconded. If they can’t optimize their code (which I find doubtful, since I have never seen an application require 256 gigs of RAM, even at FAANG), then they need to rent a machine. The cloud is where you rent it. If not Google, then AWS, Azure, DigitalOcean, or any number of other places will let you rent compute.
Yeah, it’s an economics student running something in Python. I can guarantee that it’s horribly unoptimized.
Tell your friend to open source the algorithm… Somebody will surely point out an easy optimization. 100 others will just shit on your friend.
I do not have any RAM to share, sorry.
An economics simulation in Python needing 200+GB of RAM sounds preventable.
In your friend’s shoes, I might start asking for pointers over on the programming.dev Lemmy.
As others have said, a rewrite in a faster language like C or Go could help, but my guess is there are also ways to cut that memory need way down while still using Python.
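To make the "still using Python" part concrete, here is a minimal sketch of one common trick: storing numbers in a packed array instead of a list of float objects. Nobody here has seen the friend's code, so the data and the size N are made up for illustration, and the byte counts are approximate figures for CPython.

```python
import sys
from array import array

N = 100_000  # hypothetical number of values, just for the comparison

# A list stores pointers to separately allocated float objects.
as_list = [float(i) for i in range(N)]
# array("d") stores raw 8-byte doubles in one contiguous buffer.
as_array = array("d", (float(i) for i in range(N)))


def list_footprint(values):
    """Approximate bytes used: the list container plus every float object."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


list_bytes = list_footprint(as_list)
array_bytes = sys.getsizeof(as_array)

print(f"list : {list_bytes / 1e6:.1f} MB")
print(f"array: {array_bytes / 1e6:.1f} MB")
```

On a typical 64-bit CPython the packed version should come out several times smaller. NumPy arrays give the same kind of saving, plus the option of narrower dtypes if the simulation can tolerate them.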
I have 8 gigs that’s been living on my desk for the last 4 years.
First of all, he should drop Python for anything as resource intensive as such a simulation. And then think about how to optimize the algorithm.
Why not get a 0.5 or 1 TB NVMe SSD and set it all up as swap?
It will probably run 10 times slower, but it’s cheap and doable.
This is the way.
Depending on the nature of the sim, it could probably even be done with ~80 GB or less of existing SSD space using zram w/ zstd.
Needing that much RAM is usually a red flag that the algo is not optimized.
*looking at the 14TB cluster I had running for 18 hours
Yep, nobody would ever need that much memory
Wow, yea I think you win that contest lol.
To be honest, it was a very parallel process. I could do a fraction of the compute, needing a fraction of the RAM, but taking a shit ton more time.
Also, there’s no perfect machine for this use. I can have 3.5 times more RAM than needed, or start swapping and waste time.
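The "fraction of the compute, fraction of the RAM" trade-off can be sketched in a few lines. This is a toy, not anyone's real workload: simulate() is a hypothetical stand-in for a batch of simulation draws, and the statistic is just a mean so that it can be accumulated batch by batch.

```python
import random


def simulate(rng, n):
    """Hypothetical stand-in for one batch of simulation draws."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mean_all_at_once(seed, total):
    rng = random.Random(seed)
    draws = simulate(rng, total)  # every draw is held in memory at once
    return sum(draws) / total


def mean_in_batches(seed, total, batch_size):
    rng = random.Random(seed)
    running_sum = 0.0
    done = 0
    while done < total:
        n = min(batch_size, total - done)
        batch = simulate(rng, n)  # only one batch is alive at a time
        running_sum += sum(batch)
        done += n
    return running_sum / total


full = mean_all_at_once(seed=42, total=10_000)
batched = mean_in_batches(seed=42, total=10_000, batch_size=500)
print(full, batched)
```

Both versions consume the same random stream, so the two means agree up to floating-point rounding, while the batched one never holds more than batch_size draws. Whether a real simulation can be split like this depends on whether its output can be accumulated incrementally.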
Nope. Some algorithms are fastest when the whole data set is held in memory. You could design it to page data in from disk as needed, but it would be slower.
OpenTripPlanner, for example, will hold the entire road network of the US in memory for fast driving directions, and it uses an amount of RAM in that ballpark.
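For anyone wondering what "page data in from disk as needed" looks like in Python, here is a minimal sketch using the standard mmap module. The file layout (one 8-byte float per record) and the helper names are assumptions for illustration and have nothing to do with how OpenTripPlanner stores its graph.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<d")  # assumed layout: one little-endian double per record


def write_values(path, values):
    """Write the values to disk as fixed-size binary records."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def read_value(mm, index):
    """Fetch one record; the OS pages in only the region that is touched."""
    return RECORD.unpack_from(mm, index * RECORD.size)[0]


path = os.path.join(tempfile.mkdtemp(), "values.bin")
write_values(path, (i * 0.5 for i in range(100_000)))

with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    sample = [read_value(mm, i) for i in (0, 10, 99_999)]

print(sample)  # [0.0, 5.0, 49999.5]
```

Random access through the mapping works without reading the whole file up front, which is the slower-but-smaller design described above. The cost shows up as disk latency whenever a page is not already cached.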
Sure, that is why I said usually. The fact that 2 people replied with the same OpenStreetMap data set is kinda proving my point.
Also, do you need the entire US road system in memory if you are going somewhere 10 minutes away? Seems inefficient, but I am not an expert here. I guess it is one giant graph, if you slice it up, suddenly there are a bunch of loose ends that break the navigation.
I host routing for customers across the US, so yes I need it all. There are ways to solve the problem with less memory but the point is that some problems really do require a huge amount of memory because of data scale and performance requirements.
Researchers always make some of the worst coders unfortunately.
Scientists, pair up with an engineer to implement your code. You’ll thank yourself later.
True, but there are also some legitimate applications for 100s of gigabytes of RAM. I’ve been working on a thing for processing historical OpenStreetMap data and it is quite a few orders of magnitude faster to fill the database by loading the 300GiB or so of point data into memory, sorting it in memory, and then partitioning and compressing it into pre-sorted table files which RocksDB can ingest directly without additional processing. I had to get 24x16GiB of RAM in order to do that, though.
Yea, that makes sense. You could sort it in chunks, but it would probably be slower. If you are regularly doing that and can afford the RAM, go for it. Otherwise maybe extract the bits that need to be sorted and zip them back up later.
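Sorting in chunks usually means an external merge sort: sort pieces that fit in RAM, spill each sorted run to disk, then merge the runs lazily. A minimal sketch with the standard library, assuming plain integers stored one per line (the function names are made up, and this is not the pipeline described above):

```python
import heapq
import os
import tempfile


def _write_run(sorted_chunk, tmpdir):
    """Spill one sorted chunk to a temporary text file, one int per line."""
    fd, path = tempfile.mkstemp(dir=tmpdir, text=True)
    with os.fdopen(fd, "w") as f:
        f.writelines(f"{x}\n" for x in sorted_chunk)
    return path


def _read_run(path):
    """Stream one sorted run back from disk."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_sort(values, chunk_size):
    """Yield ints in sorted order while holding about chunk_size of them in RAM."""
    with tempfile.TemporaryDirectory() as tmpdir:
        runs, chunk = [], []
        for v in values:
            chunk.append(v)
            if len(chunk) >= chunk_size:
                runs.append(_write_run(sorted(chunk), tmpdir))
                chunk = []
        if chunk:
            runs.append(_write_run(sorted(chunk), tmpdir))
        # heapq.merge keeps only one pending item per run in memory.
        yield from heapq.merge(*(_read_run(p) for p in runs))


data = [7, 3, 9, 1, 8, 2, 6]
result = list(external_sort(data, chunk_size=3))
print(result)  # [1, 2, 3, 6, 7, 8, 9]
```

The extra disk writes and reads are where the slowdown comes from, which fits the point that the in-memory sort is faster when you can afford the RAM.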
That’s kinda an insane amount of RAM for most simulations. Is this like a machine learning thing? Is his Python code just super unoptimized? Is it possible he’s making a bunch of big objects and then not freeing the references when he’s done with them so they’re never garbage collected?
All my extra RAM was super old and I let it get offed when I hired a junk hauling company to clear out my last place when I moved (I’d been there for like 15 years, so there was a lot of worn out furniture and stuff).
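The lingering-references theory is easy to check with the standard tracemalloc module. A minimal sketch, with a made-up loop standing in for the simulation steps:

```python
import gc
import tracemalloc

tracemalloc.start()

results = []
for step in range(5):
    big = [0.0] * 1_000_000  # roughly 8 MB of pointers on 64-bit CPython
    results.append(big)      # this reference keeps every step's data alive

held, _ = tracemalloc.get_traced_memory()

results.clear()  # drop the references held by the list
del big          # the loop variable still pointed at the last list
gc.collect()

released, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"while referenced: {held / 1e6:.0f} MB")
print(f"after release   : {released / 1e6:.0f} MB")
```

If the traced figure keeps climbing across steps in the real code, something is holding on to old results. tracemalloc.take_snapshot() can then show which lines allocated the memory.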
As a hardware guy there is so little info here
DDR2, 3, 4, or 5? Clock speed? ECC? Registered?
Yeah, I have boxes of older memory. But there need to be a lot more specifics. Most of my home lab machines have at least 384 GB (VMs need a lot of memory).
I don’t think OP wants you to lend them physical RAM modules but asks about letting his friend run random code on your high-RAM machine.
Maybe? After rereading it I’m really not sure …
That’s at least what I got from the comment with the SSH port.
Yeah I can definitely see your point.
That’s probably way too much for any sane Python algorithm. If they can’t run it, how do they even know how much is needed?
They should probably only make a prototype in Python, and then reimplement it in a compiled language. It should reduce the resource usage massively.
Borrow it from NewEgg, then return it
Newegg isn’t so bad. Do it to a shit corporation like Best Buy.