LL3M: Large Language 3D Modelers
threedle.github.io
439 points by simonpure 4 days ago
I've had surprising success with meshy.ai as part of a workflow to turn images my friends want modeled into good 3D models. The workflow is:
1. Have GPT-5, or really any image model (Midjourney retexture is also good), convert the original image to something closer to a matte rendered mesh, i.e. remove extraneous detail and any transparency / other confusing volumetric effects.
2. Throw it into meshy.ai's image-to-3D mode and select the best result, or maybe return to step 1 with a different simplified image style if I don't like the results.
3. Pull it into Blender and make whatever mods I want in mesh editing mode, e.g. specific fits and sizing to assemble with other stuff, adding some asymmetry to an almost-symmetric thing (the model has strong symmetry priors, and turning them off in the UI doesn't realllyyy turn them off), or modeling on top of the AI'd mesh to get a cleaner one for further processing. (A scripted version of that first cleanup pass is sketched below.)
The meshes are fairly OK structure-wise; clearly some sort of marching cubes or perhaps dual contouring approach on top of a NeRF-ish generator.
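To make that concrete: a minimal sketch of what that extraction step looks like, using scikit-image's marching cubes on a synthetic density field (the sphere below is just a stand-in for whatever density the generator actually predicts):

    import numpy as np
    from skimage import measure  # pip install scikit-image

    # Synthetic density field: a sphere, standing in for a learned NeRF-style density grid.
    n = 64
    ax = np.linspace(-1.0, 1.0, n)
    x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
    density = 0.5 - np.sqrt(x**2 + y**2 + z**2)  # positive inside a radius-0.5 sphere

    # Extract a triangle mesh at the zero level set -- this is the marching cubes step.
    verts, faces, normals, values = measure.marching_cubes(density, level=0.0)
    print(verts.shape, faces.shape)  # (N, 3) vertex positions and (M, 3) triangle indices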
I'm an extremely fast mechanical CAD user and a mediocre Blender artist, so getting an AI starting point to block out the overall shape and let me just do edits is quite handy. E.g., a friend wanted to recreate a particular statue of a human; tweaking some T-posed generic human model into the right pose and proportions would have taken "more hours than I'm willing to give him for this", i.e. I wouldn't have done it, but with this workflow it was 5 minutes of AI and then an hour of fussing in Blender to go from the solid model to the curvilinear wireframe style of the original statue.
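The cleanup pass I mentioned, as a minimal bpy sketch run from Blender's scripting tab (the file path and merge threshold are hypothetical, and it assumes the import produces a single selected mesh):

    import bpy

    # Import a Meshy-style glTF export. Path is made up for illustration.
    bpy.ops.import_scene.gltf(filepath="/tmp/meshy_export.glb")
    obj = bpy.context.selected_objects[0]
    bpy.context.view_layer.objects.active = obj

    # Bake rotation/scale so later edits for fit and sizing use real dimensions.
    bpy.ops.object.transform_apply(location=False, rotation=True, scale=True)

    # Merge near-duplicate vertices that marching-cubes-style outputs tend to have.
    bpy.ops.object.mode_set(mode="EDIT")
    bpy.ops.mesh.select_all(action="SELECT")
    bpy.ops.mesh.remove_doubles(threshold=0.0005)
    bpy.ops.object.mode_set(mode="OBJECT")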
> 1. […] convert the original image to something closer to a matte rendered mesh […]
Sounds interesting. Do you have any example images like that you could share? I understand the part about making transparent surfaces not transparent, but I'm not sure what the whole image looks like after this step.
Also, would you be willing to share the prompt you type to achieve this?
It works if you just plainly describe what you're looking for. I write a new prompt for each image, something like: "re-render this as a matte untextured 3d model, remove all details except geometric form"
GPT-5 is a text-only model. ChatGPT still uses 4o for images.
The naming is very confusing. I thought the underlying model was gpt-image-1 in the API, just surfaced transparently as part of the same chat model in the UI?
As someone who has been using Blender for ~7 years, with over 1000 answers on Blender Stack Exchange and a total score of 48,000:
This tool is maybe useful if you want to learn Python, in particular the basics of the Blender Python API; I don't really see any other use for it. All the examples given are extremely simple to do. Please don't use a tool like this, because it takes your prompt and generates the most bland version of it possible.
It really takes only about a day to go through some tutorials and learn how to make models like these in Blender, with solid colors or some basic textures. The other thousands of days are what you would spend on creating correct topology, making an armature, animating, making more advanced shaders, creating parametric Geometry Nodes setups... But simple models like these you can create effortlessly, and those will be YOUR models, the way (roughly, of course) you imagined them. After a few weeks you're probably going to model them faster than the time it takes for prompt engineering. By that time your imagination, your skill in Blender and your understanding of 3D technicalities will have improved, and they will keep improving from there. And what will you learn using this AI?
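For a sense of what those Blender Python basics look like, here's a minimal bpy sketch that creates a primitive and gives it a solid-color material (the names and values are arbitrary, purely illustrative):

    import bpy

    # Add a simple primitive -- the kind of "bland" starting shape a prompt tool
    # produces, and also a typical first bpy exercise.
    bpy.ops.mesh.primitive_cylinder_add(radius=0.5, depth=2.0, location=(0, 0, 1))
    obj = bpy.context.active_object

    # Give it a flat-colored Principled BSDF material.
    mat = bpy.data.materials.new(name="SolidColor")
    mat.use_nodes = True
    bsdf = mat.node_tree.nodes["Principled BSDF"]
    bsdf.inputs["Base Color"].default_value = (0.8, 0.2, 0.2, 1.0)  # RGBA
    obj.data.materials.append(mat)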
I think meshy.ai is much more promising, but even then I'd only consider using it if I wanted to convert a photo/render into a mesh with a texture properly positioned on it, and then refine the mesh by sculpting - sculpting being one of my weakest skills in Blender. BTW I made a test showcasing how meshy.ai works: https://blender.stackexchange.com/a/319797/60486
As someone who has tried to go through Blender tutorials over multiple days, I can tell you there is no chance I could get close to any of these examples.
I think you might be projecting your abilities a bit too much.
As someone who wants to make and use 3d models, not someone who wants to be a 3d model artist, this tech is insanely useful.
0.0001% of the population can sculpt 3D and leverage complex 3D toolchains. The rest of us (80% or whatever - the market will be big) don't want to touch those systems. We don't have the time, patience, or energy for it, yet we'd love to have custom 3D games and content quickly and easily. For all sorts of use cases.
But that misses the fact that this is only the beginning. These models will soon generate entire worlds. They will eventually surpass human modeller capabilities and they'll deliver stunning results in 1/100,000th the time. From an idea, photo, or video. And easy to mold, like clay. With just a few words, a click, or a tap.
Blender's days are numbered.
I'm short on Blender, Houdini, Unreal Engine, Godot, and the like. That entire industry is going to be reinvented from scratch and look nothing like what exists today.
That said, companies like CSM, Tripo, and Meshy are probably not the right solutions. They feel like steam-powered horses.
Something like Genie, but not from Google.
> These models will soon generate entire worlds.
They may. It's hard to expect this when we already see LLMs plateauing at their current abilities. Nothing you've said is certain.
I don't see them plateauing; I see that they are in their infancy. So far AI people have just been persistently doing the dumbest possible thing that turned out to work, with very limited understanding, insight and innovation. At some point they will have bought all the GPUs and all the GWh they can and will be forced to actually figure out how to really improve what they are doing. Then the real breakthroughs will start showing up. There are probably improvements of 3-4 orders of magnitude waiting right behind the finish line of the low-hanging-fruit picking contest.
> That entire industry is going to be reinvented from scratch
Hey, I heard that one before! The entire financial industry was supposed to have been reinvented from scratch by crypto.
Well, it kinda did change things up a bit. Being able to receive payments across borders without significant delay or crazy fees is a decent perk. You can hate crypto culture and the grifters trying to make a quick buck, but its applications are very real.
Won’t you get taxed on “gains” when you do that and then eventually convert to fiat?
I was considering this path a few years ago, but all my research pointed to me being taxed for moving my own money from one country to another, which would’ve cost significantly more than a good ol’ bank transfer (I needed the fiat on the other end).
My understanding was that, as far as the receiving bank is concerned, the converted crypto would’ve appeared out of an investment/trading platform and needed to be taxed.
The bank transfer cost like a couple of bucks anyway, so in the end it wasn’t worth the risk of trying the crypto route for me.
> These models will soon generate entire worlds. They will eventually surpass human modeller capabilities and they'll deliver stunning results in 1/100,000th the time. From an idea, photo, or video. And easy to mold, like clay. With just a few words, a click, or a tap.
This is a pretty sweeping and unqualified claim. Are you sure you’re not just trying to sell snake oil?
I'm sure he is just trying to sell snake oil.
I've been predicting this since Deep Dream (which feels like a century ago) and HN loves to naysay.
I claimed three years ago that AI would totally disrupt the porn and film industries and we're practically on the cusp of it.
If you can't see how these models work and can't predict how they can be used to build amazing things, then that's on you. I have no reason to lift up anybody that doubts. More opportunity on the table.
FWIW I'm a 3D modeller (hard surface Blender modelling, ~10yrs) and I've been reading your comments for a while now. Reality wasn't disrupted quite as far as you suggested, most of the naysayers that advised restraint under your comments have largely been proven right. Time and time again, you made enormous claims and then refused to back them up with evidence or technical explanations. We waited just like you asked, and the piper still isn't paid.
Have you ever asked yourself why this revolution hasn't come yet? Why we're still "on the cusp" of it all? Because you can't push a button and generate better pornography than what two people can make with a VHS camera and some privacy. The platonic ideal of pornography and music and film and roleplaying video games and podcasting is already occupied by their human equivalent. The benchmark of quality in every artistic application of AI is inherently human, flawed, biased and petty. It isn't possible to commoditize human art with AI art unless there's a human element to it, no matter how good the AI gets.
There's merit to discussing the technical impetus for improvement (which I'm always interested in discussing), but the dependent variables here seem exclusively social; humanity simply might never have a Beatlemania for AI-generated content.
I don't work in the field, but I observe it pretty closely, and comments like this remind me of the people I spoke to in the 1990s who said that Windows and Intel would never replace their Unix workstations.
Right now, if I go on LinkedIn, most header images on people's posts are AI-generated. On video posts it's a lot less common, but we are beginning to see it there too. The static-image transition has taken maybe 3 years? The video transition will probably take about the same.
There's a set of content where people care about the human content of art, but there is a lot of content where people just don't care.
The thing is that there is a lot of money in generating this content. That money drives tool improvement and those improved tools increase accessibility.
> Have you ever asked yourself why this revolution hasn't come yet?
We are in the middle of the revolution which makes it hard to see.
I hope the walls don't cave in on you. Eyes up. My friends in VFX are adopting AI workflows and they say that it's essential.
> Why OnlyFans May Sell for 75% Less Than It’s Worth [1, 2]
> Netflix uses AI effects for first time to cut costs [3]
Look at all of the jobs Netflix has posted for AI content production [4].
> Gabe Newell says AI is a 'significant technology transition' on a par with the emergence of computers or the internet, and will be 'a cheat code for people who want to take advantage of it' [5]
Jeffrey Katzenberg, the cofounder of DreamWorks [6]:
> "Well, the good old days when, you know, I made an animated movie, it took 500 artists five years to make a world-class animated movie," he said. "I don't think it will take 10% of that three years out from now," he added.
I could keep going; there's no shortage of sources, but I don't want to waste my time.
I've brushed shoulders with the C-suite at Disney and Pixar and talked at length about this with them. This world is absolutely changing.
The best evidence is what you can already see.
[1] https://www.theinformation.com/articles/onlyfans-may-sell-75...
[3] https://www.bbc.com/news/articles/c9vr4rymlw9o
[4] https://explore.jobs.netflix.net/careers?query=Machine%20Lea...
[5] https://www.pcgamer.com/software/ai/gabe-newell-says-ai-is-a...
[6] https://www.yahoo.com/entertainment/cofounder-dreamworks-say...
Frankly, that is all just speculative, once again. AI is hitting a significant roadblock. Look at how disappointing GPT-5 was. No amount of compute is ever going to live up to the hype in those quotes.
The C-suite who don't realize how wrong they are about AI's potential are going to face a harsh reality. And artists will be the first to be hurt by their HYPE TRAIN management style and mindset.
Edit: most of all, the 3D generation in this LL3M system is about the same as the genAI 3D models from a year ago... and two years ago... A good counterpoint would be Tubi's recently released, mostly AI-generated short films: they were garbage and looked like garbage.
Netflix's foray, if memory serves, was a single scene where a building collapses. Hardly industry-shattering. And 3D modeling and genAI images/videos are substantially different.