EXIF orientation info in PNGs isn't used for image-orientation: from-image
bugzilla.mozilla.org | 85 points by justin-reeves 10 hours ago
As a former metadata completionist, my mind starts to dissociate when I think about my battles with EXIF metadata, vendor-specific metadata, and the way different software supports, or refuses to support, any of it.
It gets even worse when ingesting images into Apple Photos, where you have to confront papercut bugs that you know will never be fixed.
I love ExifTool. It’s one of the great utilities. It works for almost every file I throw at it. But reading its output can be unsettling. It’s like getting a glimpse of eudaimonia, only to have it rudely interrupted by the reality of Apple Photos misreading every lens in your collection.
I guess orientation isn't even metadata any more, but data: the RGB(A) value of each pixel is data, and the location of this pixel is also data. But the location of the pixel changes depending on the orientation. Of course, absent any orientation tag, it can be understood as "information of the image is stored in the file describing the image" in left-to-right, top-to-bottom order (but with BMPs it's bottom to top!), but with orientation "metadata", it's whatever is defined there.
So yeah, I think "Stripping all EXIF metadata doesn't change an image" deserves an entry as a "falsehood programmers believe about...".
The same applies to color space. You can’t interpret the pixel data without knowing the color space. If it’s not in the metadata, you just have to assume sRGB and hope for the best.
It’s the same with rotation. Both are essential information on how to interpret the pixel data for display, but we’re so very used to assuming certain defaults that it’s easy to forget about this.
I've been trying to create clean metadata for a collection of Blu-ray rips recently. The MKV format has a bunch of defined metadata fields but handling of it is inconsistent between players. VLC seems to be the worst in that it doesn't even bother displaying important pieces of the metadata. You can work around that by effectively duplicating the important parts in the track name, but then other software ends up doubling up on that because it's displaying both the track name and the values pulled from the other track metadata. And I'm being driven crazy by how I should use the subtitle track flags that indicate if a track is Forced or Default, because it seems like the auto-selection behavior based on those flags is arbitrary from player to player.
I should probably just give up and let it all be a mess. Not sure I'll be able to though. The only thing that freed me from metadata obsession when it came to my music collection is that I switched to streaming services.
I'm comforted that it's not only me :D. I made a tool to index/exfiltrate media from phone backups and DSLR storage and the behavior has been changing over the years without me changing anything.
> It gets even worse when ingesting images into Apple Photos, where you have to confront papercut bugs that you know will never be fixed.
I wish they open-sourced their built-in macOS apps.
A nice compromise would be to open source the libraries that consume and emit data as well as core processing. Then they can add their own secret sauce UX and integration.
Likewise, that will never happen either.
Yes, they used to compromise more along these lines in the past. e.g., Samba's vfs_fruit would never have gotten as good as it is without Apple open sourcing their SMBClient. Everyone benefits, even Apple (I'm sure they're running vfs_fruit on their server storage arrays internally). Wish they'd do it more.
Browsers starting to rotate images based on EXIF is such a pain. I maintain an image annotation tool and all of a sudden images were shown differently to users depending on the browser they used. Then you have to jump through all sorts of hoops to ignore the EXIF orientation again. In some cases you are not allowed to see if the orientation was changed for security reasons. And then the only way to control this is through a CSS property which only works if the element is in the DOM.
The amount of time I've spent dealing with this over the years is just incredible. It's gotten to the point where during ingestion we auto-rotate everything just in case and strip out EXIF orientation metadata and never have to deal with it again.
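For anyone wondering what "auto-rotate and strip" boils down to, here is a minimal sketch. It assumes the decoded image is just a list of pixel rows (a stand-in for real image data) and uses a hypothetical helper name, `apply_exif_orientation`; a real pipeline would do the same thing through its image library. The eight cases follow the Orientation values defined by Exif.

```python
def apply_exif_orientation(grid, orientation):
    """Return a new pixel grid with the EXIF Orientation value (1-8) baked in.

    `grid` is a list of rows; each row is a list of pixels. This is an
    illustrative stand-in for decoded image data, not a real image API.
    """
    def flip_lr(g):
        return [row[::-1] for row in g]

    def flip_tb(g):
        return [list(row) for row in g[::-1]]

    def transpose(g):
        return [list(col) for col in zip(*g)]

    transforms = {
        1: lambda g: [list(row) for row in g],        # already upright
        2: flip_lr,                                   # mirrored horizontally
        3: lambda g: flip_lr(flip_tb(g)),             # rotated 180 degrees
        4: flip_tb,                                   # mirrored vertically
        5: transpose,                                 # flipped across the main diagonal
        6: lambda g: transpose(flip_tb(g)),           # needs a 90 degree clockwise turn
        7: lambda g: transpose(flip_lr(flip_tb(g))),  # flipped across the anti-diagonal
        8: lambda g: flip_tb(transpose(g)),           # needs a 90 degree counter-clockwise turn
    }
    if orientation not in transforms:
        raise ValueError(f"unknown EXIF orientation: {orientation}")
    return transforms[orientation](grid)


# A 3x2 "image" tagged with orientation 6 comes out as a 2x3 upright image.
upright = apply_exif_orientation([[1, 2, 3], [4, 5, 6]], 6)
```

After this step the file would be written with orientation 1 (or with the tag dropped), so downstream software that ignores the tag and software that honors it show the same thing.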
Yes, came to the same conclusion - it's a pain, but solves the problem permanently.
This is the correct approach in my opinion. Metadata should not be used to control rotation - there are just too many edge cases for where it can go wrong.
Yes. I wrote a little image uploader script to easily upload images from my phone for embedding in web forums etc, and it strips out all the EXIF orientation and just converts it to the correct orientation. Aside from that I'm always having to fiddle with it in my image tools and hope every software I use supports it. It's such a crap feature. Just rotate the damn image, phones!
Wouldn’t that degrade the quality for a lossy format, especially if done repeatedly? I see why people would not want their phone to do that. If you’re uploading it somewhere that might not support the tag, it would be worth it, but I don’t want my phone to silently degrade images that are just sitting in my gallery.
JPEG allows for lossless 90 degree rotation, not sure about other formats
Pretty sure that any software from after ~2005 that supports image rotation, isn’t doing so at decoding time, but rather is decoding the image into a GPU texture buffer and then rotating the quad that texture gets applied to. Which should always be lossless for multiple-of-90-degree rotations. (Presuming that the image doesn’t depend on sub-pixel rendering tricks.)
Even without a GPU, the JPEG format itself allows for totally lossless rotation. It is also quite fast, and doesn't require reading and writing every pixel in the image.
Isn't this only true for images where the resolution is divisible by the block size or something like that?
IIRC in other cases you have to cut the edge of the image off (rounded to the nearest block) or recompress those edge blocks
Is that still true if the image dimensions aren't a multiple of the block size?
Rotate it at capture time, before encoding. This would get rid of like 95% of these exif orientation tags. For images that need to be manually rotated after for whatever reason, sure I guess you have a point, though I'd argue the quality loss would be unnoticeable in practice unless you're like spinning the image in circles 100 times for some reason.
> In some cases you are not allowed to see if the orientation was changed for security reasons.
This is only true for cross-origin images, no? Which is expected: you can't access data loaded from another origin unless it's been loaded with CORS.
Isn't it a touch on the required side, though? I'm assuming the orientation is a common metadata element of phone produced images, in particular. I'd assume same for decent cameras.
Would love to see a good rundown of when you should rely on different approaches? Another thread pointed out that you should also use the color space metadata.
Some systems seem to produce images where the pixel arrangement matches the sensor layout, which moves when you rotate the device, and they'll add EXIF metadata to indicate the orientation.
Other cameras and phones and apps produce images where the device adjusts the aspect ratio and order of the array of pixels in the image regardless of the way the sensor was pointed, such that the EXIF orientation is always the default 0-degree rotation. I'd argue that this is simpler, it's the way that people ignorant of the existence of the metadata method would expect the system to work. That method always works on any device or browser, rotating with EXIF only works if your whole pipeline is aware of that method.
The advantage of the EXIF approach is you don't have to do nearly as much post processing of the data? In particular, I don't expect my camera application to need to change memory layout just because I have rotated my camera. So, if you want it to change the rows/columns on saving the image, that has to be post capture from the sensor. Right?
I think this is what you meant by "some systems" there. But, I would expect that of every sensor system? I legit never would have considered that they would try the transpose on saving the image off the sensor.
I maintain a photo library project, I feel your pain. JPEG format is so crooked in so many ways.
It's a shame that after so many years of development we ended up with such horrible formats like JPEG and MP4.
> Further findings: neither Safari, Chrome, or Firefox respects exiftool's default output, which appends EXIF to the end of a PNG.
Makes sense. I have to imagine there is a performance impact to waiting until you've downloaded the entire image _just in case_ there's some metadata telling you to rotate it right at the end of the stream.
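To check where the Exif data sits in a given PNG, you can walk the chunk list and compare the position of `eXIf` with the first `IDAT`. The following is a rough sketch using only the standard library; the helper names (`make_chunk`, `chunk_types`, `exif_precedes_image_data`) are made up for illustration, and the walker skips CRC validation.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def chunk_types(png_bytes):
    """Return the chunk type names in file order (CRCs are not verified)."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    types = []
    pos = 8
    while pos + 8 <= len(png_bytes):
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        types.append(ctype.decode("ascii"))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return types


def exif_precedes_image_data(png_bytes):
    """True if eXIf comes before the first IDAT, False if after, None if either is absent."""
    types = chunk_types(png_bytes)
    if "eXIf" not in types or "IDAT" not in types:
        return None
    return types.index("eXIf") < types.index("IDAT")


# Synthetic byte streams: only the chunk layout matters here, not the pixel data.
exif_last = (PNG_SIGNATURE + make_chunk(b"IHDR", bytes(13)) + make_chunk(b"IDAT", b"")
             + make_chunk(b"eXIf", b"MM\x00\x2a") + make_chunk(b"IEND", b""))
exif_first = (PNG_SIGNATURE + make_chunk(b"IHDR", bytes(13)) + make_chunk(b"eXIf", b"MM\x00\x2a")
              + make_chunk(b"IDAT", b"") + make_chunk(b"IEND", b""))
```

A streaming decoder that sees the layout in `exif_last` has already been handed all the pixel data by the time the orientation shows up, which is the performance concern described above.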
Interesting. I was not aware that was a thing. Orientation info seems way less useful in a lossless format like PNG. It makes sense in JPEG for instance because rotating and re-encoding would be lossy and slightly degrade the image.
JPEGs can be rotated losslessly as long as their size is a multiple of eight (or to be pedantic, the block size, which is usually eight).
It's hella useful when the encoder doesn't have the RAM to hold the entire image. But this is a pretty rare case.
JPEG rotation only has to be lossy when the image is not evenly divisible into macro blocks - rather than transcoding just rotate the macro blocks, and where they're placed.
Looping through inflate/deflate on rotated pixels still takes more time than updating a bit in the Exif (and the chunk’s associated CRC)
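A sketch of how small that metadata-only edit is, assuming a minimal Exif payload whose first IFD carries the Orientation tag (0x0112) as a SHORT stored inline. The function names are hypothetical, and a real file would have more tags and possibly a different byte order; the point is only that one value changes and the chunk CRC is recomputed, with no pixel data touched.

```python
import struct
import zlib

ORIENTATION_TAG = 0x0112


def build_exif_orientation(value):
    """Minimal big-endian TIFF/Exif payload holding a single Orientation entry."""
    header = b"MM" + struct.pack(">HI", 42, 8)            # byte order, magic 42, IFD0 at offset 8
    entry = struct.pack(">HHIHH", ORIENTATION_TAG, 3, 1, value, 0)  # tag, type SHORT, count 1, value, padding
    return header + struct.pack(">H", 1) + entry + struct.pack(">I", 0)


def set_orientation(exif, value):
    """Return a copy of `exif` with the Orientation value in IFD0 overwritten."""
    endian = {b"II": "<", b"MM": ">"}[exif[:2]]
    (ifd_offset,) = struct.unpack(endian + "I", exif[4:8])
    (count,) = struct.unpack(endian + "H", exif[ifd_offset:ifd_offset + 2])
    patched = bytearray(exif)
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i                   # each IFD entry is 12 bytes
        (tag,) = struct.unpack(endian + "H", exif[entry:entry + 2])
        if tag == ORIENTATION_TAG:
            struct.pack_into(endian + "H", patched, entry + 8, value)
            return bytes(patched)
    raise KeyError("no Orientation tag in IFD0")


def exif_chunk(exif):
    """Wrap an Exif payload in a PNG eXIf chunk, computing the CRC over type + data."""
    body = b"eXIf" + exif
    return struct.pack(">I", len(exif)) + body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)


original = build_exif_orientation(6)
patched = set_orientation(original, 1)
new_chunk = exif_chunk(patched)
```

Compare that with rotating the pixels, which means inflating every IDAT chunk, rearranging the scanlines, re-filtering, and deflating again.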
It's still negligible from the consumer standpoint.
Like, if you had millions of images you needed to rotate on a server in a batch job, then OK.
But if you're just rotating one photo, or even a hundred, that you've just taken, it's plenty fast enough.
The image could have been encoded with a high compression ratio, or even something like OxiPNG. In that case, while re-encoding it wouldn't lose quality, it could still have the side-effect of making the file bigger.
At the very least it will take time. Rotating is a fairly common operation, even simple photo viewers often have buttons to quickly rotate the image. Being able to do this efficiently is beneficial.
You can also have other situations where this is useful like a primarily hardware pipeline that doesn't support rotation, but you can mark the rotation at the end. Although this is probably less of an issue for PNG than formats that typically come out of cameras and scanners.
The orientation data is defined as part of Exif. Both JPEG and PNG have officially supported ways of embedding Exif data. It's not defined specifically for PNG, but you would expect the Exif tag to work the same way regardless of image data format.
I don't think I would. Exif contains a bunch of metadata that affects the interpretation of image data in JPEGs and doesn't make sense for PNGs. I would expect Exif in PNG to be only for metadata meant for humans, like who the author is, not things that alter the display of the image.
This is super useful if you need to learn how to manipulate exif orientation https://github.com/recurser/exif-orientation-examples
Orientation in EXIF was an ugly hack and we're living with its fallout today.
Cameras should have just rotated the actual image pixels when saving, instead of cheating. If that's too slow, implement it in hardware, or schedule a deferred process and don't let the images be exported until that's done.
> Cameras should have just rotated the actual image pixels when saving, instead of cheating. If that's too slow, implement it in hardware, or schedule a deferred process and don't let the images be exported until that's done.
What if I want to rotate an image by 90 degrees because my camera didn't correctly detect up & down?
To my understanding, rotation via the tag is lossless, whereas moving the data will incur quality loss (with certain exceptions).
JPEG rotation can be lossless for certain image dimensions (multiples of 8 or 16 pixels respectively, depending on chroma subsampling).
I suppose it's no coincidence that the native output format of many sensors (or ISPs, to be precise) is divisible by 16 in both width and height.
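The "multiples of 8 or 16" rule can be written down as a quick check. This is a simplified sketch with hypothetical function names: it only reports whether an image has partial blocks at its right or bottom edge, given the largest horizontal and vertical sampling factors in the file (2 and 2 for typical 4:2:0, 1 and 1 for 4:4:4). How a particular tool treats those partial blocks during a lossless transform is a separate question it does not answer.

```python
def mcu_size(h_max, v_max):
    """MCU dimensions in pixels: 8x8 DCT blocks scaled by the largest sampling factors."""
    return 8 * h_max, 8 * v_max


def has_partial_edge_blocks(width, height, h_max=2, v_max=2):
    """True if the image does not divide evenly into whole MCUs.

    Defaults assume 4:2:0 subsampling (16x16 MCUs); pass h_max=1, v_max=1 for 4:4:4.
    """
    mcu_w, mcu_h = mcu_size(h_max, v_max)
    return width % mcu_w != 0 or height % mcu_h != 0


# 4032x3024 divides evenly into 16x16 MCUs; 1000x750 does not.
sensor_sized = has_partial_edge_blocks(4032, 3024)
odd_sized = has_partial_edge_blocks(1000, 750)
```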
> Orientation in EXIF was an ugly hack and we're living with its fallout today.
No, it was an elegant hack given all the constraints which mostly no longer exist on modern hardware (although I wouldn't be so sure about really small embedded systems).
Sure, modern cameras will have no issues loading the full JPEG into memory, but how would you have implemented this in cameras that only have enough for exactly one line's worth of compression blocks?
> or schedule a deferred process and don't let the images be exported until that's done.
Good luck doing this on a battery-powered camera writing directly to an SD card that's expected to be mountable after removing it from the camera without an opaque postprocessing step.
Eh, the coordinate frame can really be anything. It's important to disambiguate what is really meant. The convention in images is that images are +X-Y, but for certain applications, the PNG may represent data that is +X+Y, or mirrored -X+Y, landscape, or portrait. Is the coordinate system the camera coordinates or the world coordinates?
It's true that automatic handling of all input images is difficult, but imo it's important to document.
An example I recently encountered is that in neurological imaging, the axes are patient's right, anterior, superior, whereas in radiology they are patient's left, anterior, superior. Tricky to get right...
http://www.grahamwideman.com/gw/brain/orientation/orientterm...
> Eh, the coordinate frame can really be anything.
Well, in JPEG, there's exactly one coordinate frame in the absence of EXIF metadata: Left to right, top to bottom. So there's really only one.
I personally like the status quo that PNGs don't encode orientation. I can dump PNGs when I'm debugging and I know I'm looking at the bits the same way up as the code is!
PNG now does - and they've been as vague as they could be in the spec about whether any exif data should affect the image display or not. The spec says:
"It is recommended that unless a decoder has independent knowledge of the validity of the Exif data, the data should be considered to be of historical value only."
Instead of either saying: "yes you must rotate it" or "no you shall not rotate it" to make everyone do the same thing. And if it were yes, they should also have made this a mandatory chunk since now they made it optional to read.
That’s pretty typical in technical standards. It’s so that existing software isn’t forced to choose between the Scylla of not being able to claim conformance to the updated standard and the Charybdis of breaking backwards compatibility.
EXIF orientation has always been a massive pain to deal with. Especially with HEIC, where libheif keeps getting updated and causing all sorts of compatibility issues.
I don’t know why the format defines a way to set the orientation instead of just storing the pixel matrix the way it should be displayed.
It's setting one bit vs transforming each block with potential further loss of precision.
See also: The VLC bug that incorrectly applies right crops as left crops [1]. This bug report is from 2023, however the bug has existed as long as VLC has as far as I know.
I'm always surprised to see bugs like this where an extremely easy to test part of the spec just seemingly isn't tested and ends up as a bug that never gets fixed until many years later.
I firmly believe every product team needs to be split in two: one half works on the issue of highest importance, the other works on the easiest issues. If only to avoid the embarrassment of easy to fix bugs that were passed over for eons just because they weren't priority-high.
There's something to this, although I think the idea needs some refinement. Anyone who's worked on a real software product knows that the "easy" bugs usually aren't actually easy (or else they would've been fixed already!).
The way I've seen it implemented at a small company I worked at before was to explicitly endorse the "20% time" idea that Google made famous, where you may choose your own priorities for a fraction of your working time regardless of the bug tracker priority order. Even if in practice you don't actually have that spare time allocated in your schedule, it does give you some cover to tell your manager why you are prioritizing little UI papercuts over product features this week.
> Anyone who's worked on a real software product knows that the "easy" bugs usually aren't actually easy (or else they would've been fixed already!).
Not really. It's hard to see the difference from the outside without actually digging into it first, but in my experience while there's plenty of "easy" bugs that aren't actually easy, there's also plenty of easy bugs that are actually easy and that apparently everyone else assumed they're not, or else they would have been fixed already :P
Easy bugs might exist at small and medium size companies, but when you are a $1T+ company, there is no such thing as an easy fix. Your change could have unforeseen side effects that take down some critical revenue-generating service that causes us to lose $millions. It's got to go through multiple code reviews, have unit and integration tests written, be able to show those tests passing more than once, it may need to get reviewed by legal, it may need to get reviewed by security and privacy teams. And tons of other process overhead I'm not even recalling. Just getting a one-liner from an engineer's fingertips properly deployed into production could take months.
Whether or not you fix a bug weighs on the scale against the cost of all of the above things, the cost of time, the cost of these people's attention, and the opportunity cost of them doing something else. And these costs tend to not scale with the size of the pull request. They're fixed costs that have to be paid no matter how small an issue is.
I work at a BigCo, and occasionally get comments from developer friends about "Hey, why doesn't BigCo fix this obvious bug I reported! It's simple! Why are you guys so incompetent??" I look at the bug internally, and it's either 1. got a huge internal comment chain showing it's not as simple as an outsider would think, or 2. it's indeed trivial, but the effort to fix it does not outweigh the costs I outlined above.
> Anyone who's worked on a real software product knows that the "easy" bugs usually aren't actually easy (or else they would've been fixed already!).
Well, could be many reasons, "priorities" is usually the reason I see as the top reason for things like that to not be fixed immediately, rather than "we looked into it and it was hard". Second most popular reason is "workaround exists", and then after that probably something like "looks easy but isn't".
I think the solution would be to stop considering "easy-but-isn't" bugs as easy, even if they might appear so. So the "easy bugs" team would have their worklog, and if they discover one of those bugs isn't actually easy and would need large changes, they reject it and push it somewhere else, and start working on something that is actually easy instead.
Thinking about it more, maybe the better approach is the 2nd team works on the oldest tickets. That's an objective measure that has no surprises and more directly addresses the problem of long-standing issues that are sometimes embarrassing because they turned out to be easy.
But in general, I do believe that teams should be split on the priority issue in some way. If all you are doing is chasing the highest priority stuff, you're going to miss important things because priority isn't an exact science either.
> the "easy" bugs usually aren't actually easy (or else they would've been fixed already!).
This is a perfect expression of something I like to call Chesterton's Inertia. It's exactly the same as Chesterton's Fence.
What happens is that there is a mess on the floor, and somebody walks around it, maybe just because they were in a hurry, or maybe they didn't even see it. Somebody else walks in, maybe doesn't even notice the mess, just notices the faint trail that the last person left, and follows it. The next person walks in, sees a mess, and sees a trod path around the mess, and follows the path.
Years later, the path has been paved, has signage posted that doesn't refer to the mess, and has walls blocking the mess from sight. The mess has fused with the ground it used to just sit on, and is partially being used to support the wall. Every once in a while, someone asks why there's a weird bend in the path that makes no sense, and an old hand who's been around since the beginning tells him that the bend is a fence, and not for you to understand.
That will be great popcorn stuff to watch from the outside I think.
Or tragic, but I'd expect more drama than joy from this approach. The main thing with bug fixing is that it can affect a whole lot of other areas, or introduce completely new bugs. So you'd end up with both teams fighting over changes.
Now a really trivial bug with no side effects, sure thing, no issue, but like a sibling commenter has said, the really trivial bugs are usually fixed already. And quick fixes of seemingly trivial things can induce a world of pain for someone else.
In other words, I think project management and prioritising things remain hard, with no magic bullet solutions available. (But I would also prefer a stronger emphasis on quality control in general, vs new features.)
I feel like with opensource projects those kind of "easy to fix but not priority" bugs are a really nice way to keep the door open to new contributors.
You're a new coder and would like to help a project, if possible a big one for your resume? Here is something to get started.
Yeah, people really underestimate how many low hanging fruits are left there to reach for even in fairly popular projects. Don't just assume that "surely someone must have tried to fix this already", it's not always the case.
I don't think this makes business sense in general.
I do however think that there are quite a few bugs that might be triaged as "easy" but if worked on would reveal much more serious problems. Which is why some random selection of "easy" issues should make it to work queues.
Makes business sense if you want to fill in developers' downtime: instead of waiting for CR or QA feedback, pick up a small bug.
Working on two or more big features at the same time is not possible. But throw in some pebbles and a dev can take them on.
The business sense! Would someone please think of the business!
I've yet to find a business that really, truly knows what it wants. Whatever is "good for the business case" today could change overnight after the President reads some cockamamie article in Harvard Business Review, and again in two weeks after the CEO spends a weekend in Jackson Hole.
We do high-priority stuff in sprint planning. Non-priority bug fixes can be dragged in by devs when they are out of sprint tasks or need to switch context to unfold thoughts.
You cannot just push high prio stuff on people.
The business gets its predictable workload done; bonus stuff, like things the team wants to fix, of course takes a back seat, but it has its place.
Also, people in the first team should also alternate hard/complicated issues with easy fixes.
I remember EXIF orientation in JPEG also took a few years to get fixed:
https://issues.chromium.org/issues/40448628
When it got fixed, some sites were still depending on the old behavior of not rotating JPEGs, and had to add "image-orientation:none" to explicitly ignore EXIF:
I once had to deal with an old website that ignored the orientation flag in jpg, so my iPhone portrait photos showed up landscape when I uploaded them.
Thankfully Finder in macOS has a way to remove the flag:
How to remove orientation from portrait photo from iPhone on macOS https://youtu.be/lWOlfjVyes4
I couldn’t find a way to do it in Preview, but Finder could do it.
If this gets supported after the fact, aren’t people going to find some of their PNGs displaying upside down out of the blue?
Yeah, ideally the browsers would phase this change in gradually to minimize disruption.
If the EXIF data specifies a 180° rotation, then start at 0° and gradually increase the rotation by 1° per day until full spec compliance is reached.
On a related note, there is exiftool, which tries to understand all those different formats.
And it has its own forums with tens of thousands of posts!
5 years is a short time for Firefox bugs.
Probably for WebKit and Chromium too; I have an open (silly) WebKit CSS bug for about 4 years now.
[dead]
Why does a bug report get shared on HN?
https://news.ycombinator.com/newsguidelines.html
> On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.
Someone probably thought it was interesting, and based on the fact it's on the front page and receiving comments, at least some other people agree.
One thing I miss in PNGs is text wrapping.
That, or even just a small javascript interpreter in PNG would greatly improve things for a lot of my clients.