AWS in 2025: Stuff you think you know that's now wrong
lastweekinaws.com | 124 points by keithly 4 hours ago
I think there are more of us who kind of degenerated from doing it the AWS way - API Gateway, serverless Lambdas, messing around with IAM roles until it works, ... - to: give me an EC2 / Lightsail VPS instance, maybe an S3 bucket, let's set the domain up through Route 53, and go away with the rest of your orchestration, AWS.
S3: "Block Public Access is now enabled by default on new buckets."
On the one hand, this is obviously the right decision. The number of giant data breaches caused by incorrectly configured S3 buckets is enormous.
But... every year or so I find myself wanting to create an S3 bucket with public read access so I can serve files out of it. And every time I need to do that, I find something has changed, my old recipe doesn't work any more, and I have to figure it out again from scratch!
The thing to keep in mind with the "Block Public Access" setting is that it is a redundancy built in to save people from making really big mistakes.
Even if you have a terrible and permissive bucket policy or ACLs (legacy but still around) configured for the S3 bucket, if you have Block Public Access turned on - it won't matter. It still won't allow public access to the objects within.
If you turn it off but you have a well-scoped and ironclad bucket policy - you're still good! The bucket policy will dictate who, if anyone, has access. Of course, you have to make sure nobody inadvertently modifies that bucket policy over time, or adds an IAM role with access, or modifies the trust policy for an existing IAM role that has access, and so on.
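For anyone re-deriving the recipe yet again, here's a minimal boto3 sketch of the two layers working together (the bucket name is hypothetical); nothing becomes public until Block Public Access permits a public policy *and* the policy grants it:

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-public-assets"  # hypothetical bucket name

# Layer 1: relax Block Public Access just enough to allow a public *policy*
# (ACLs stay blocked; they're legacy anyway).
s3.put_public_access_block(
    Bucket=bucket,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": False,
    },
)

# Layer 2: grant anonymous read-only access to objects via the bucket policy.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```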
This sort of thing drives me nuts in interviews, when people are like, are you familiar with such-and-such technology?
Yeah, what month?
I just stick CloudFront in front of those buckets. You don't need to expose the bucket at all then and can point it at a canonical hostname in your DNS.
That’s definitely the “correct” way of doing things if you’re writing infra professionally. But I do also get that more casual users might prefer not to incur the additional cost or complexity of having CloudFront in front. Though at that point, one could reasonably ask if S3 is the right choice for casual users.
S3 + CloudFront is also incredibly popular, so you can just find recipes for automating that in any technology you want: Terraform, Ansible, plain bash scripts, CloudFormation (god forbid).
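If you'd rather script it yourself, here's a rough boto3 sketch of the modern wiring (Origin Access Control, not the legacy OAI); every name and ID is a placeholder, and you'd still need a bucket policy letting the distribution call s3:GetObject:

```python
import boto3

cf = boto3.client("cloudfront")
bucket_domain = "my-assets.s3.us-east-1.amazonaws.com"  # placeholder bucket

# OAC lets CloudFront sign its requests to a private bucket, so the bucket
# never needs public access at all.
oac_id = cf.create_origin_access_control(
    OriginAccessControlConfig={
        "Name": "my-assets-oac",
        "SigningProtocol": "sigv4",
        "SigningBehavior": "always",
        "OriginAccessControlOriginType": "s3",
    }
)["OriginAccessControl"]["Id"]

cf.create_distribution(
    DistributionConfig={
        "CallerReference": "my-assets-2025",  # any unique string
        "Comment": "private S3 bucket behind CloudFront",
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": "s3-origin",
                "DomainName": bucket_domain,
                "OriginAccessControlId": oac_id,
                "S3OriginConfig": {"OriginAccessIdentity": ""},  # empty = OAC, not OAI
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": "s3-origin",
            "ViewerProtocolPolicy": "redirect-to-https",
            # Managed "CachingOptimized" cache policy ID from the CloudFront docs.
            "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
        },
    }
)
```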
Yeah, holy crap, why is CloudFormation so terrible?
Last time I tried to use CF, the third-party IaC tools were faster to release new features than CF itself. (Like Terraform would support some S3 bucket feature when creating a bucket, but CF did not.)
I'm not sure if that's changed recently, I've stopped using it.
It's actually incredibly cheap. I think our software distribution costs, in the account I run, are around $2.00 a month. That's pushing out several thousand MSI packages a day.
I honestly don't mind that you have to jump through hurdles to make your bucket publicly available and that it's annoying. That to me seems like a feature, not a bug.
I think the OP's objection is not that hurdles exist but that they move them every time you try to run the track.
Sure... but last time I needed to jump through those hurdles I lost nearly an hour to them!
I'm still not sure I know how to do it if I need to again.
I have a preemptible workload for which I could use Spot instances or Savings Plans.
Does anyone have experience running Spot in 2025? If you were to start over, would you keep using Spot?
- I observe with pricing that Spot is cheaper
- I am running on three different architectures, which should limit Spot unavailability
- I've been running about 50 Spot EC2 instances for a month without issue. I'm debating turning it on for many more instances
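Not an answer to the pricing question, but for anyone starting out: a one-off Spot request in boto3 is just a flag on run_instances. AMI and instance type below are placeholders; a real fleet would diversify across instance types with create_fleet or an ASG so one exhausted capacity pool doesn't stall the workload:

```python
import boto3

ec2 = boto3.client("ec2")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="m7g.large",         # one of several types you'd spread across
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": "one-time",
            # Preemptible workloads can usually just terminate on interruption;
            # AWS gives a two-minute warning via instance metadata.
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)
print(resp["Instances"][0]["InstanceId"])
```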
Strictly off topic:
Everything you know is wrong.
Weird Al. https://www.youtube.com/watch?v=W8tRDv9fZ_c
Firesign Theatre. https://www.youtube.com/watch?v=dAcHfymgh4Y
You know what's still stupid? That if you have an S3 bucket in the same region as your VPC, you will get billed on your NAT Gateway to send data out to the public internet and right back in to the same datacenter. There is simply no reason for the VPC endpoint not to be the default (opt-out rather than opt-in), beyond AWS profiting off of people's lack of knowledge in this realm. The number of people who would want the current opt-in behavior is... if not zero, infinitesimally small.
It's a design that is secure by default. If you have no NAT gateway and no VPC Gateway Endpoint for S3 (and no other means of Internet egress) then workloads cannot access S3. Networking should be closed by default, and it is. If the user sets up things they don't understand (like NAT gateways), that's on them. Managed NAT gateways are not the only option for Internet egress and users are responsible for the networks they build on top of AWS's primitives (and yes, it is indeed important to remember that they are primitives, this is an IaaS, not a PaaS).
Fine for when you have no NAT gateway and have a subnet with truly no egress allowed. But if you're adding a NAT gateway, it's crazy that you need to set up the gateway endpoint for S3/DDB separately. And even crazier that you have to pay for PrivateLink per AWS service endpoint.
This is the intended use case for S3 VPC Gateway Endpoints, which are free of charge.
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpo...
(Disclaimer: I work for AWS, opinions are my own.)
I think they know it. They are complaining it's not enabled by default (and so do I).
AWS VPCs are secure by default, which means no traffic traverses their boundaries unless you intentionally enable it.
There are many IaC libraries, including the standard CloudFormation VPC template and CDK VPC class, that can create them automatically if you so choose. I suspect the same is also true of commonly-used Terraform templates.
The problem is that the default behavior for this is opt-in, rather than opt-out. No one prefers opt-in. So why is it opt-in?
> AWS VPCs are secure by default, which means no traffic traverses their boundaries unless you intentionally enable it.
Having experienced the joy of setting up VPC, subnets and PrivateLink endpoints the whole thing just seems absurd.
They spent the effort of branding private VPC endpoints "PrivateLink". Maybe it took some engineering effort on their part, but it should be the default out of the box, and an entirely unremarkable feature.
In fact, I think if you have private subnets, the only way to use S3 etc. is PrivateLink (correct me if I'm wrong).
It's just baffling.
You can provision gateway endpoints for S3 and DynamoDB. They are free and considered best practice. They are opt-in, but easy to enable.
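Enabling one is a single API call; a minimal boto3 sketch with hypothetical IDs:

```python
import boto3

ec2 = boto3.client("ec2")

# Adds an S3 prefix-list route to the given route tables, so S3 traffic
# stops hairpinning through the NAT gateway. Gateway endpoints are free.
ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",             # hypothetical VPC
    ServiceName="com.amazonaws.us-east-1.s3",  # match your region
    VpcEndpointType="Gateway",
    RouteTableIds=["rtb-0123456789abcdef0"],   # hypothetical route table
)
```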
VPC endpoints in general should be free and enabled by default. That you need to pay extra to reach AWS' own API endpoints from your VPC feels egregious.
Gateway endpoints are free. Interface endpoints (which are basically AWS-managed ENIs that can tunnel through VPC boundaries) are not free.
S3 can use either, and we recommend establishing VPC Gateway endpoints by default whenever you need S3 access.
(Disclaimer: I work for AWS, opinions are my own.)
That is price segmentation. People who are price insensitive will not invest the time to fix it.
People who are price sensitive probably shouldn't be on AWS - but they usually have to be for unrelated reasons, and they will work to reduce their bill.
> People who are price insensitive will not invest the time to fix it
This just sounds like a polite way of saying "we're taking peoples' money in exchange for nothing of value, and we can get away with it because they don't know any better".
It's more like: we made loads of stuff super cheap but here's where we make some money because it scales with use.
Price segmentation happens all the time in pretty much every industry.
There’s an entire Pandora’s box of shitty things that happen in pretty much every industry. I don’t think you want to use that defense.
>People who are price insensitive will not invest the time to fix it
Hideous.
The problem is that VPC endpoints aren't free.
They should be, of course, at least when the destination is an AWS service in the same region.
[edit: I'm speaking about interface endpoints, but S3 and DynamoDB can use gateway endpoints, which are free to the same region]
Gateway endpoints are free. Interface endpoints (which are basically AWS-managed ENIs that can tunnel through VPC boundaries) are not free.
S3 can use either, and we recommend establishing VPC Gateway endpoints by default whenever you need S3 access.
(Disclaimer: I work for AWS, opinions are my own.)
Fair point, and valid for S3 (the topic at hand) and DynamoDB.
Other AWS services, though, don't support gateway endpoints.
Well, yeah, that's the point... why route through the public internet?
I doubt the traffic ever actually leaves AWS. Assuming it does make it all the way out to their edge routers, the destination ASN will still be one of their own. Not that the pricing will reflect this, of course.
The other problem with (interface) VPC endpoints is that they eat up IP addresses. Every service/region permutation needs a separate IP address drawn from your subnets. Immaterial if you're using IPv6, but can be quite limiting if you're using IPv4.
>VPC peering used to be annoying; now there are better options like Transit Gateway, VPC sharing between accounts, resource sharing between accounts, and Cloud WAN.
TGW is... twice as expensive as VPC peering?
VPC sharing is the sleeper here. You can do cross account networking all in the same VPC and skip all the expensive stuff.
More than twice, since same-AZ traffic is free with peering. But if you're big enough you can get better deals on cost.
But unlike peering, TGW traffic flows through an additional compute layer, so it has additional cost.
I just saw Weird Al in concert, and one of my favorite songs of his is "Everything You Know is Wrong." This is the AWS version of that song! Nice work Corey!
> You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.
As of when? According to internal support, this is still required as of 1.5 years ago.
He's not talking about the prefix, just the beginning of the object key.
It would've been nice if each of those claims in the article also linked to either the relevant announcement or to the documentation. If I'm interested in any of these headline items, I'd like to learn more.
Would love an AWS equivalent to Cloud Run but the lambda changes are welcome nonetheless.
> Glacier restores are also no longer painfully slow.
Wouldn't this always depend on the length of the queue to access the robotic tape library? Once your tape is loaded it should move really quickly:
https://www.ibm.com/docs/en/ts4500-tape-library?topic=perfor...
> Once upon a time Glacier was its own service that had nothing to do with S3. If you look closely (hi, billing data!) you can see vestiges of how this used to be, before the S3 team absorbed it as a series of storage classes.
Your assumption holds if they still use tape. But this paragraph hints at it not being tape anymore. The eternal battle between tape and drive backup takes another turn.
I am also assuming that Amazon intends for the Deep Archive tier to be a profitable offering. At $0.00099/GB-month, I don't see how it could be anything other than tape.
API gateway timeout increase has been nice.
It was always there but it required much more activity to get it done (document your use case & traffic levels and then work with your TAM to get the limit changed).
I've had two people tell me in the last week that SQS doesn't support FIFO queues.
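(It has, since 2016. A minimal boto3 sketch with hypothetical names; the queue name must end in .fifo, and ordering is per message group:)

```python
import boto3

sqs = boto3.client("sqs")

queue = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={
        "FifoQueue": "true",
        "ContentBasedDeduplication": "true",  # dedupe on a hash of the body
    },
)

sqs.send_message(
    QueueUrl=queue["QueueUrl"],
    MessageBody="order-123",
    MessageGroupId="customer-42",  # FIFO ordering applies within a group
)
```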
> You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.
Not strictly true.
Elaborate.
The whole auto-balancing thing isn't instant. If you have a burst of writes with the same key prefix, you'll get throttled.
Not the OP but I’ve had AWS-staff recommend different prefixes even as recently as last year.
If key prefixes don’t matter much any more, then it’s a very recent change that I’ve missed.
Might just be that the AWS staff wasn't up to date on this
I have had the same experience within the last 18 months. The storage team came back to me and asked me to spread my ultra high throughput write workload across 52 (A-Za-z) prefixes and then they pre-partitioned the bucket for me.
S3 will automatically do this over time now, but I think there are/were edge cases still. I definitely hit one and experienced throttling at peak load until we made the change.
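For anyone facing the same thing, the workaround is mechanical: derive a stable fan-out prefix from the key. A sketch of the 52-way (A-Za-z) scheme described above; the prefix count and hash are illustrative, not anything AWS prescribes:

```python
import hashlib
import string

PREFIXES = string.ascii_uppercase + string.ascii_lowercase  # 52 choices

def spread_key(key: str) -> str:
    """Prepend a deterministic prefix so a burst of writes fans out across
    S3 partitions instead of hammering a single one."""
    i = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(PREFIXES)
    return f"{PREFIXES[i]}/{key}"

print(spread_key("2025/06/01/events-000123.json.gz"))  # e.g. "q/2025/06/..."
```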
That sounds like the problem we were having. Lots of writes to a prefix over a short period of time, and then low activity to it after about 2 weeks.
That’s possible but they did consult with the storage team prior to our consultation.
But I don’t know what conversations did or did not happen behind the scenes.
By the way, that happens quite frequently. I regularly ask them about new AWS technologies or recent changes, and most of the time they are not aware. They usually say they will call back later after doing some research.
CloudFront also has 1TB of free data transfer a month under the forever-free perks.
I'll add: when doing instance-to-instance communication (in the same AZ), always use private IPs. If you use public IP routing (even in the same AZ), it is charged as regional data transfer.
Even worse, if you run self-hosted NAT instance(s), don't use an EIP attached to them. Just use an auto-assigned public IP (no EIP).
NAT instance with EIP
- AWS routes it through the public AWS network infrastructure (hairpinning).
- You get charged $0.01/GB regional data transfer, even if in the same AZ.
NAT instance with auto-assigned public IP (no EIP)
- Traffic routes through the NAT instance’s private IP, not its public IP.
- No regional data transfer fee — because all traffic stays within the private VPC network.
- The auto-assigned public IP may change if the instance is shut down or re-created, so have automation to handle that. Though you should be using the network interface ID reference in your VPC route tables anyway (see the sketch below).
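A boto3 sketch of that route, with hypothetical IDs; targeting the ENI means swapping the NAT instance behind the same interface never touches the route table:

```python
import boto3

ec2 = boto3.client("ec2")

# Default route for a private subnet, pointed at the NAT instance's ENI
# rather than the instance ID.
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",        # hypothetical route table
    DestinationCidrBlock="0.0.0.0/0",
    NetworkInterfaceId="eni-0123456789abcdef0",  # NAT instance's ENI
)
```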