AsuharietYgvar/AppleNeuralHash2ONNX

Working Collision?

dxoigmn opened this issue · 119 comments

Can you verify that these two images collide?
beagle360
collision

Here's what I see from following your directions:

$ python3 nnhash.py NeuralHash/model.onnx neuralhash_128x96_seed1.dat beagle360.png
59a34eabe31910abfb06f308
$ python3 nnhash.py NeuralHash/model.onnx neuralhash_128x96_seed1.dat collision.png
59a34eabe31910abfb06f308

Yes! I can confirm that both images generate the exact same hashes on my iPhone. And they are identical to what you generated here.

@dxoigmn Can you generate an image for any given hash (preimage attack) or do you need access to the source image first (second preimage attack)?

@dxoigmn Can you generate an image for any given hash (preimage attack) or do you need access to the source image first (second preimage attack)?

If I'm not mistaken a preimage attack was done here
Edit: this should be the working script (I haven't tested it)

If I'm not mistaken a preimage attack was done here

hmmm
"This is so fake it's not even funny. These are just images generated by the model from https://thisartworkdoesnotexist.com . It's hilarious to see so many people falling for it here" (in the comments)

@fuomag9 Interesting, but one ethical question remains -- how did they obtain the CSAM hashes? I was under the impression that the NeuralHash outputs of the NCMEC images were not readily available. This strongly suggests that the authors must have obtained child pornography, hashed it, and then generated spoofs.

"This is so fake it's not even funny. These are just images generated by the model from https://thisartworkdoesnotexist.com . It's hilarious to see so many people falling for it here"

Or they used images generated by that site as the starting point. As I noted in my previous comment, it's impossible to know without having the NeuralHash NCMEC database.

This is not only a collision, it is a pre-image, which breaks the algorithm even more.

Collision:
Find two random images with the same hash.

Pre-image:
Find an image with the same hash as a known, given image.

@fuomag9 Interesting, but one ethical question remains -- how did they obtain the CSAM hashes? I was under the impression that the NeuralHash outputs of the NCMEC images were not readily available. This strongly suggests that the authors must have obtained child pornography, hashed it, and then generated spoofs.

If I'm not mistaken the DB with the hashes is stored locally and you can extract it from the iOS 15 beta

Edit: the hashes are stored locally but not in a way that makes them recoverable to the end user, see below

Holy Shit.

@erlenmayr No, this is likely a second preimage attack. In a preimage attack you're just given the hash, not the image.

@fuomag9 so the list of known CP hashes is shipped on every device? Isn't this a huge security issue?
This was untrue. See this comment.

@fuomag9 so the list of known CP hashes is shipped on every device? Isn't this a huge security issue?

This is from Apple's PDF on the technical details of their implementation. Feel free to correct me, but from my understanding the blinded hash is the CSAM hash DB

image

Edit: I was wrong and we cannot extract them

Wow. Couldn't they have hashed the image locally, sent it to the server and compared it there?

image
My bad. The hashes on the device have gone through a blinding process, so my train of thought was not correct. Does anyone know why they chose to compare it on the client instead of on the server, seeing as it is only run on images uploaded to iCloud?

https://twitter.com/angelwolf71885/status/1427922279778881538?s=20 this seems interesting though, a CSAM sample hash seems to exist?

@dxoigmn Can you generate an image for any given hash (preimage attack) or do you need access to the source image first (second preimage attack)?

Not sure. My sense is that seed1 seems to be mixing bits from the output of the model. But we already know from the literature that it is very likely feasible. At least, I am reasonably confident one could generate a noisy gray image that outputs some desired hash value.

@fuomag9 Interesting, but one ethical question remains -- how did they obtain the CSAM hashes? I was under the impression that the NeuralHash outputs of the NCMEC images were not readily available. This strongly suggests that the authors must have obtained child pornography, hashed it, and then generated spoofs.

No one has obtained the CSAM hashes, AFAIK. This repo is just a neural network that takes an input and produces a hash. It's as if someone released a new hash algorithm. (What you do with those hashes is a different story and the comparison of the hash with CSAM hashes is a whole other system.) This just shows that an image of a dog has the same hash value as a noisy gray image. That is, ask the model for the hash of the dog image, then ask the model how to change a gray image to make it output the same hash as the dog image. So it's a second-preimage, which we know has to be possible (by the pigeonhole principle); but it's really just a matter of feasibility. The interesting thing about neural network models (when compared to cryptographic hashes) is that they give you a gradient which tells you how you can change stuff to optimize some objective. This is the basis of deep dream, adversarial examples, and even just plain old training of neural network models.
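To make the gradient point concrete, here is a rough sketch (not how the image above was made; model_fn and seed1 are placeholder names) of how one could run gradient descent against a target hash, assuming the ONNX model has been converted to a differentiable TensorFlow function and seed1 is the 96x128 projection matrix from neuralhash_128x96_seed1.dat:

import numpy as np
import tensorflow as tf

def soft_hash_loss(image, target_bits, model_fn, seed1):
    # Differentiable surrogate: push sign(seed1 @ embedding) toward the target bits.
    emb = tf.squeeze(model_fn(image))                                 # 128-dim descriptor
    logits = tf.linalg.matvec(tf.constant(seed1, tf.float32), emb)    # 96 pre-sign values
    signs = tf.constant(2.0 * target_bits - 1.0, tf.float32)          # {0,1} -> {-1,+1}
    # hinge-style loss: zero once every bit has the desired sign with a small margin
    return tf.reduce_sum(tf.nn.relu(0.1 - signs * logits))

def second_preimage(start_image, target_bits, model_fn, seed1, steps=2000, lr=0.01):
    x = tf.Variable(start_image, dtype=tf.float32)
    opt = tf.keras.optimizers.Adam(lr)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            loss = soft_hash_loss(x, target_bits, model_fn, seed1)
        opt.apply_gradients([(tape.gradient(loss, x), x)])
        x.assign(tf.clip_by_value(x, -1.0, 1.0))   # stay in the model's input range
        if float(loss) == 0.0:                     # every target bit matched
            break
    return x.numpy()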

@tmechen

If I'm not mistaken a preimage attack was done here

hmmm
"This is so fake it's not even funny. These are just images generated by the model from https://thisartworkdoesnotexist.com . It's hilarious to see so many people falling for it here" (in the comments)

From what I understood, the story the HN comment is referring to is AI generated porn, which can match as "real" porn. Aka the "send dunes" story: https://petapixel.com/2017/12/20/uk-police-porn-spotting-ai-gets-confused-desert-photos/

This issue here, however, is a hash collision. This is huge, if confirmed. Really ugly stuff.

Does anyone know why they chose to compare it on the client instead of on the server, seeing as it is only run on images uploaded to iCloud?

If they compare on the server then Apple is able to know the outcome of the comparison even if there is only a single match. The idea is that Apple can only know that a certain user has matching images in their iCloud account as soon as a certain threshold number of matches is reached. That's my understanding as a non-expert in this field.

Does anyone know why they chose to compare it on the client instead of on the server, seeing as it is only run on images uploaded to iCloud?

If they compare on the server then Apple is able to know the outcome of the comparison even if there is only a single match. The idea is that Apple can only know that a certain user has matching images in their iCloud account as soon as a certain threshold number of matches is reached. That's my understanding as a non-expert in this field.

I think they also inject false positives for results on the device side, so they don't know if there is only a single match. They only know when the threshold is reached. It was in the PSI paper.

This issue here, however, is a hash collision. This is huge, if confirmed. Really ugly stuff.

Every practical hashing algorithm maps many inputs to one output. It is only a matter of time before collisions appear. This repo contains the model from an older release and we don't really know how they have tweaked parameters on Apple's side or even how the huge data set is improving the accuracy. NN has many layers, so training is rather important.
The goal of the algorithm is to seek matches for a specific kind of material (CSAM), and that is used for training. Can we expect that it calculates hashes with similar accuracy for all kinds of images? (e.g. is it equally hard to make a collision for CSAM material as for the picture of the dog?)

This issue here, however, is a hash collision. This is huge, if confirmed. Really ugly stuff.

Every practical hashing algorithm maps many inputs to one output. It is only a matter of time before collisions appear. This repo contains the model from an older release and we don't really know how they have tweaked parameters on Apple's side or even how the huge data set is improving the accuracy. NN has many layers, so training is rather important.

Let me quote from this article of someone who can explain this better than me:

https://www.hackerfactor.com/blog/index.php?/archives/929-One-Bad-Apple.html

In the six years that I've been using these hashes at FotoForensics, I've only matched 5 of these 3 million MD5 hashes. (They really are not that useful.) In addition, one of them was definitely a false-positive. (The false-positive was a fully clothed man holding a monkey -- I think it's a rhesus macaque. No children, no nudity.)

and:

According to NCMEC, I submitted 608 reports to NCMEC in 2019, and 523 reports in 2020. In those same years, Apple submitted 205 and 265 reports (respectively). It isn't that Apple doesn't receive more pictures than my service, or that they don't have more CP than I receive. Rather, it's that they don't seem to notice and therefore, don't report.

Let that sink in.

Now back to our issue:

While hash detection is a really bad idea in general for such purposes, the specific implementation from Apple does not matter, because it is closed source. You literally don't know what Apple is doing, with what data, and what result comes out of it.

Even if Apple's specific algorithm is the best in the world and does not have these drawbacks: you would not know. You would have to live in constant fear of your phone "thinking" you may be doing something wrong. That's scary.

Even if Apple's specific algorithm is the best in the world and does not have these drawbacks: you would not know. You would have to live in constant fear of your phone "thinking" you may be doing something wrong. That's scary.

That is very true. We can either give total trust or nothing at all on closed systems.

@fuomag9 so the list of known CP hashes is shipped on every device? Isn't this a huge security issue?

It's not. It's encrypted using elliptic-curve cryptography first. They keep the actual CSAM neuralhash db on their server.

Why do they do that? A) To prevent reversing the hashes, as mentioned. B) So they can derive the encryption keys for the image locally, which is what the blinded hash table is also used for. I'm leaving out a lot of details that are mentioned in the whitepaper they released, so read that.

@weskerfoot Yeah I edited my other comment with an explanation of the blinded hash.

all we need now is a hash of what is considered 'CSAM' to the iPhone, and we can start making images collide with it.
if we could somehow make collisions with arbitrary content (e.g. a funny meme that someone would be likely to save to their phone), that would be great-

I'm not sure how to get a hash without having such a file to hash, though..

Oh just curious, since the hashing is done on the client side, would it be possible to tell iCloud that the hash matched every time? and if so, what's stopping you from just flooding it with random shit?

Oh just curious, since the hashing is done on the client side, would it be possible to tell iCloud that the hash matched every time? and if so, what's stopping you from just flooding it with random shit?

In fact it already does that by design, in order to obscure how many matching images there are before the threshold is crossed where it can decrypt all matching images
image

So if you can figure out a way to generate synthetic matches on demand, you could make it think there are lots of matches, but it would soon discover they're "fake" once the threshold is crossed, since it wouldn't be able to decrypt them all. Not sure what it would do if you did that repeatedly, maybe it would cause issues.

Edit: to be clear, the inner encryption key is associated with the NeuralHash. So if you have a false positive NeuralHash output, it would trigger manual review, but you would need to actually have that, which is why they keep them a secret.
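For intuition on why matches below the threshold (real or synthetic) reveal nothing by themselves, here is a toy threshold-secret-sharing sketch. This is not Apple's actual construction -- their system wraps this kind of idea inside PSI and per-image encryption -- it only illustrates the "nothing recoverable below t shares" property:

import random

P = 2**127 - 1  # a Mersenne prime; fine for a toy field

def make_shares(secret, t, n):
    # Split `secret` into n Shamir shares; any t of them reconstruct it.
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    f = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 over the prime field.
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

secret = 1234567890
shares = make_shares(secret, t=30, n=100)
print(reconstruct(shares[:30]) == secret)  # True: threshold reached
print(reconstruct(shares[:29]) == secret)  # almost surely False: below threshold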

56C19CA9-E56E-4742-A403-A8D478ECE688

I wonder if this code can be ported so you can generate the collision on an Android device...

can we take any given image and then make its hash totally different, despite the image looking basically identical?
just curious if this even works for the one thing it's supposed to do.. haha

taking a given image and then making its hash totally different, despite the image looking basically identical. can we do that?
just curious if this even works for the one thing it's supposed to do..

cropping the image seems to work since the algorithm (at least what we have now) is vulnerable to that

cropping the image seems to work since the algorithm (at least what we have now) is vulnerable to that

ah yes, because pedos are so sophisticated as to use end-to-end encryption these days that we need to backdoor everyone's phones, because think of the children. but they're NOT sophisticated enough to ..checks notes.. crop images,

The cropping sensitivity is likely due to aliasing in convolutional neural networks. [1] is a starting point into that line of research. By starting point I really mean: read the related works and ignore the actual paper.

[1] https://openaccess.thecvf.com/content/CVPR2021/html/Chaman_Truly_Shift-Invariant_Convolutional_Neural_Networks_CVPR_2021_paper.html

The cropping sensitivity is likely due to aliasing in convolutional neural networks. [1] is a starting point into that line of research.

So, you're saying if you just alias the fuck out of the image it won't be detected?
can this thing even handle basic shit like JPEG-ing a PNG?

taking a given image and then making its hash totally different, despite the image looking basically identical. can we do that?
just curious if this even works for the one thing it's supposed to do..

cropping the image seems to work since the algorithm (at least what we have now) is vulnerable to that

Technically though, the cropped image could end up with a colliding hash with another CSAM image, couldn't it?

Technically though, the cropped image could end up with a colliding hash with another CSAM image, couldn't it?

i guess it's possible (though, so could cropping any image, right?)
what are the chances of that happening? can someone who's smarter than me calculate the odds? :?

Technically though, the cropped image could end up with a colliding hash with another CSAM image, couldn't it?

i guess it's possible (though, so could cropping any image, right?)
what are the chances of that happening? can someone who's smarter than me calculate the odds? :?

An illegal image containing something like a vase that gets cropped down to just the vase (which by itself is not illegal) and then gets sent to you as an innocuous image -- could it be considered illegal since it'd match part of a bigger picture?

An illegal image containing something like a vase that gets cropped down to just the vase (which by itself is not illegal) and then gets sent to you as an innocuous image -- could it be considered illegal since it'd match part of a bigger picture?

now imagine another scenario: say you happen to have the same/similar looking vase, and happen to put it on a similar/same table with a similar/same looking background and took a photo of it. would your completely unrelated but similar looking image now also be flagged?

Apple had good intentions but Jesus, this is not the way. I prefer how things currently are. Not this. False positives are a real thing, and I also don't want a fucking bitcoin-mining AI cringe daemon running amok chewing through my battery.

A goal for someone producing a second-preimage-image is to construct one that appears to be some arbitrary innocuous image. An actual attacker wouldn't use an innocuous image, but would likely instead use nude photographs-- as they would be more likely to be confused for true positives for longer. A random innocuous image would be a reasonable POC that it could also be done with nudes.

Comments above that the target database would be needed by an attacker are mistaken in my view. The whole point of the database they're claiming to use is that it contains millions of highly circulated abuse images. An attacker could probably scan through darknet sites for a few hours and find a collection of suitable images. They don't have to know for sure exactly which images are in the database: any widely circulated child abuse image is very likely to be in it. If they obtain, say, 50 of them and use them to create 50 matching second-preimage-images, then it should also be likely they get the 30 or so hits required to trigger the Apple infrastructure.

A real attacker looking to frame someone isn't going to worry too much that possessing the images needed to create the attack is a crime-- so is the framing itself. That fact will just stand in the way of researchers producing a POC to show that the system is broken and endangers the public.

I think it's important to understand that even though second-preimage-image attacks exist, concealing the database does not meaningfully protect the users-- as real attackers won't mind handling widely circulated child porn. The database encryption serves only to protect Apple and its (partially undisclosed!) list sources from accountability, and from criticism-via-POC by researchers who aren't out to break the law.

now imagine another scenario: say you happen to have the same/similar looking vase, and happen to put it on a similar/same table with a similar/same looking background and took a photo of it. would your completely unrelated but similar looking image now also be flagged?

I had been living under the illusion that CSAM material was used during development to make the model more accurate for that kind of material, to single out this kind of abuse. Is that just false information, or a misunderstanding on my part? Otherwise, this is just a general perceptual hashing function whose purpose can be changed at any given point, and it will indeed be less accurate.

@gmaxwell, you have misunderstood how the system works. If you generate 30 matching NeuralHashes, that just means that the 30 images are then matched on Apple's end with a different system. If that also says the images match, only then is it escalated to something actionable.

Hm, does anyone know if it counts anime loli/shota content as CSAM? because then there's a legal way to get hashes-
just get someone from Japan to hash the files for you, and send the hashes back. easy-

it's legal over there so no laws broken there, and assuming it's not where you live.. you're only getting hashes, not the real files, so it would be totally legal.. right?

i mean it's ethically questionable, maybe. (i guess it depends where you stand on that issue) but it should be legal right??
i donno, i'm not a lawyer >_<

Hm, does anyone know if it counts anime loli/shota content as CSAM? because then there's a legal way to get hashes-
just get someone from Japan to hash the files for you, and send the hashes back. easy-

it's legal over there so no laws broken there, and assuming it's not where you live.. you're only getting hashes, not the real files, so it would be totally legal.. right?

i mean it's ethically questionable, maybe. (i guess it depends where you stand on that issue) but it should be legal right??
i donno, i'm not a lawyer >_<

That would surely be an interesting question if Apple starts adding more databases or expands the system to other countries

cmsj commented

That would surely be an interesting question if Apple starts adding more databases or expands the system to other countries

FWIW, Apple clarified recently that hashes in their CSAM database would need to be sourced from two separate countries.

@cmsj for reference:

https://www.theverge.com/2021/8/13/22623859/apple-icloud-photos-csam-scanning-security-multiple-jurisdictions-safeguard

image

And the hashes are indeed included in the iOS system image and can't be updated remotely without an iOS update.

Laim commented

Hm, does anyone know if it counts anime loli/shota content as CSAM? because then there's a legal way to get hashes-
just get someone from Japan to hash the files for you, and send the hashes back. easy-
it's legal over there so no laws broken there, and assuming it's not where you live.. you're only getting hashes, not the real files, so it would be totally legal.. right?
i mean it's ethically questionable, maybe. (i guess it depends where you stand on that issue) but it should be legal right??
i donno, i'm not a lawyer >_<

That would surely be an interesting question if Apple starts adding more databases or expands the system to other countries

It would be interesting to know what they're considering CP: is it all going under US law, or will it be broken down per country? Countries have different definitions of what is and isn't CP in some cases, such as loli.

taking a given image and then making its hash totally different, despite the image looking basically identical. can we do that?
just curious if this even works for the one thing it's supposed to do..

cropping the image seems to work since the algorithm (at least what we have now) is vulnerable to that

Has anyone worked out the percentage of cropping needed to avoid a match using NeuralHash?
With PhotoDNA, it's about 2% off the width or height.

Has anyone worked out the percentage of cropping needed to avoid a match using NeuralHash?
With PhotoDNA, it's about 2% off the width or height.

i tried cropping; it seems to only change the hash by 1 or 2 bits. is that really enough?
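If someone wants to measure that more systematically, here is a rough sketch that sweeps crop percentages and counts flipped bits by shelling out to the repo's nnhash.py (the model/seed paths are placeholders for wherever yours live):

import subprocess
from PIL import Image

def neural_hash(path):
    out = subprocess.run(
        ["python3", "nnhash.py", "model.onnx", "neuralhash_128x96_seed1.dat", path],
        capture_output=True, text=True, check=True)
    return int(out.stdout.strip(), 16)

def crop_sweep(path, percents=(1, 2, 5, 10, 20)):
    base = neural_hash(path)
    img = Image.open(path)
    w, h = img.size
    for p in percents:
        dx, dy = int(w * p / 200), int(h * p / 200)  # p% total, split across opposite edges
        img.crop((dx, dy, w - dx, h - dy)).save("/tmp/cropped.png")
        flipped = bin(base ^ neural_hash("/tmp/cropped.png")).count("1")
        print(f"{p}% crop: {flipped} of 96 bits flipped")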

that just means that the 30 images are then matched on Apple's end with a different system.

Apple has stated that they are reviewed by a human at that point.

From a legal perspective, a human review prior to reporting is absolutely necessary to prevent the subsequent search by an agent of the government from being a fourth amendment violation. (see e.g. US v. Miller (6th Cir. 2020))

For PR reasons, Apple has claimed that the human review protects people against governments secretly expanding the scope of the databases without Apple's knowledge.

Because possession of child porn images is a strict liability crime, Apple could not perform a second-pass match by comparison with the actual image. They could use another, different fingerprint as a prefilter before human review-- but if that fingerprint isn't similarly constructed, the tolerance to resizing will be lost, and they might as well have just used SHA-256 over the decoded pixels in the first step and completely escaped the attack described in this thread.

From a legal perspective, a human review prior to reporting is absolutely necessary to prevent the subsequent search by an agent of the government from being a fourth amendment violation. (see e.g. US v. Miller (6th Cir. 2020))

When reporting to NCMEC's CyberTipline, they have a checkbox: have you reviewed it? If you say "no", then NCMEC's staff will review it. If you say "yes", then they may still review it, but it's not a guarantee.

Also, NCMEC forwards reports to the appropriate ICAC, LEO, or other enforcement organization. The report includes a copy of the picture and the recipient enforcement group reviews it.

It seems like it would be much harder to create a collision that also passes a sanity test like running it through OpenAI's CLIP model and verifying that the image is indeed plausibly CSAM.

I tested it out and CLIP identifies the generated image above as generated (in fact, of the top 10,000 words in the English language, "generated" is the closest match, followed by IR, computed, lcd, tile, and canvas): https://blog.roboflow.com/apples-csam-neuralhash-collision/
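For reference, a minimal sketch of that kind of CLIP check using the open-source CLIP package; the word list and filename here are just illustrative:

import torch
import clip
from PIL import Image

device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

words = ["generated", "dog", "noise", "person", "photo"]  # illustrative subset of a dictionary
image = preprocess(Image.open("collision.png")).unsqueeze(0).to(device)
text = clip.tokenize(words).to(device)

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)

img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
sims = (img_feat @ txt_feat.T).squeeze(0)   # cosine similarity per word

for word, sim in sorted(zip(words, sims.tolist()), key=lambda t: -t[1]):
    print(f"{word}\t{sim:.4f}")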

When reporting to NCMEC's CyberTipline, they have a checkbox: have you reviewed it? If you say "no", then NCMEC's staff will review it.

There is currently no such checkbox: https://report.cybertip.org/submit Perhaps they had one in the past prior to adverse rulings in Federal court.

Again, see US v. Miller, if you perform a match based on an NCMEC database and send the image to NCMEC without reviewing, then the user's fourth amendment protection against warrantless searches is preserved, and NCMEC (as an agent of the government with special statutory authority) cannot lawfully inspect the image. The suppression of the user's fourth amendment rights is dependent on a prior review by the provider which is lawful under the color of their private commercial agreement. Apple's human review is a critical step in the suppression of the constitutional rights of their users, without it any subsequent review by NCMEC or law enforcement would be unlawful and the whole reporting process would be useless (except to build databases of kompromat and other such due-process-less results).

this seems to be enough to drastically change any image's hash:
add borders, about 1000px larger than the main image, and fill them with as much entropy as you can.
Untitled
hash: ff0dcf8b9371ebd28a4f5d2d

windows_xp_bliss-wide
hash: 9f3bce9b9d716bf399cf4f21

this process is completely lossless since i just added more pixels around the edges of the original image, and in the photo viewer, you can simply zoom in and see the original image anyway, >-<
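A minimal PIL/NumPy sketch of the trick described above (the 1000px border is the value mentioned; uniform random noise is my choice of "entropy", and the filenames are placeholders):

import numpy as np
from PIL import Image

def add_noise_border(in_path, out_path, border=1000):
    img = np.asarray(Image.open(in_path).convert("RGB"))
    h, w, _ = img.shape
    canvas = np.random.randint(0, 256, (h + 2 * border, w + 2 * border, 3), dtype=np.uint8)
    canvas[border:border + h, border:border + w] = img  # original pixels left untouched
    Image.fromarray(canvas).save(out_path)

add_noise_border("original.png", "bordered.png")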

@gmaxwell Can't Apple just take consent from a user's previous agreement to the icloud TOS/EULA? Be interested to hear if that works in a court

Again, see US v. Miller, if you perform a match based on an NCMEC database and send the image to NCMEC without reviewing, then the user's fourth amendment protection against warrantless searches is preserved, and NCMEC (as an agent of the government with special statutory authority) cannot lawfully inspect the image. The suppression of the user's fourth amendment rights is dependent on a prior review by the provider which is lawful under the color of their private commercial agreement. Apple's human review is a critical step in the suppression of the constitutional rights of their users, without it any subsequent review by NCMEC or law enforcement would be unlawful and the whole reporting process would be useless (except to build databases of kompromat and other such due-process-less results).

@gmaxwell Can't Apple just take consent from a user's previous agreement to the icloud TOS/EULA? Be interested to hear if that works in a court

obviously it won't, because Apple plans to use this for more than just iCloud

@deftdawg Yes, Apple's commercial relationship with the target of the search is what makes it lawful for Apple to search the user's otherwise private data. And if Apple staff search the data and find something they believe to be unlawful, when they pass it on to an agent of the government, Apple's search can lawfully be "repeated" without a warrant. But if Apple didn't review the material, and just continued matching against some government-provided database, then when they hand the content over to a government agent a warrant would be required to inspect the image.

In any case, this is getting pretty far afield. The point I was making, which I don't think anyone has seriously disputed, is that ideally an attack would manage to create second-preimage-images which are visually similar to an arbitrarily selected image. A noise image like has been constructed so far is an interesting first step and could potentially cause problems, but it's not as powerful of an attack as an arbitrary benign-looking image which is a hash match against a specific fixed entirely different image.

Edit: E.g. see this HN comment where someone is being distracted by the fact that the current POC example image is random noise: https://news.ycombinator.com/item?id=28222041 I think it's an example for why the public discourse would be improved with second-preimage-images that looked like natural-ish images.

@gmaxwell, getting back to the security issues of how this system can be used to attack victims. Given that the network here is neural network based in nature, it seems implicitly possible to also guide collisions towards an arbitrarily selected image.

If there are millions of images in the CSAM database, it is almost certain that there are close-ups of genitalia. As CSAM covers everything up to 17 years and 364 days old, and the law treats it all the same, there would be images that are CSAM and illegal but visually very similar (at least to a human inspecting a visual derivative) to images that are adult and legal.

With this in mind, I propose a different and perhaps more sinister attack:

  1. Attacker A goes on the dark web and collects known CSAM, particularly images that are close-ups of genitalia and do not visibly appear to be 'child porn', just normal porn. Attacker A computes the NeuralHashes of these images and sends the list of hashes to Attacker B.

  2. Attacker B receives a list of 96-bit hashes, and then finds (or produces) legal amateur pornography of 18+ consenting adults that are also similar close-ups of genitalia. Attacker B never works or interacts with CSAM.

  3. Attacker B perturbs the legal close-up images until they match the neural hashes provided by Attacker A.

  4. The disturbed legal images get sent to the victim's device via some means.

  5. Apple's CSAM detection will then flag a neural hash match, and once the threshold is reached, Apple will be alerted. An Apple employee will then review the 'visual derivative', and they will see body genitalia. At this point, they will have a legal obligation to report and forward this to law enforcement.

  6. Law enforcement, after seeing 30 matches of close-up pornography that perceptually hashes to CSAM, may raid the victim's place, despite the fact that no CSAM material was ever sent to or interacted with by either Attacker B or the victim.

In this scenario:

  • Attacker B does not have to ever interact with CSAM in any way, shape, or form, yet can generate images that will fool both Apple's systems and the law enforcement / judicial system into imprisoning the victim.

  • Should Attacker B be caught, as they were never in possession of any CSAM material in the first place, and have only sent legal pornography that is CSAM-hash-matched, it is debatable what criminal exposure (if any) they have.

  • However, the victim, on the other hand, faces the full wrath of a child pornography possession charge.

  • This would be an example of a "low-risk, high-impact" attack, where you can get someone in jail for possessing "CSAM" without ever touching CSAM yourself.

There is currently no such checkbox: https://report.cybertip.org/submit Perhaps they had one in the past prior to adverse rulings in Federal court.

Here's a screenshot from the cybertipline's current reporting form. (The checkbox has been there for years...)
It's the checkbox that says "File Viewed By Company".
ncmec

this seems to be enough to drastically change any image's hash:
add borders, about 1000px larger than the main image, and fill them with as much entropy as you can.
...
this process is completely lossless since i just added more pixels around the edges of the original image, and in the photo viewer, you can simply zoom in and see the original image anyway, >-<

You're losing sight of what NeuralHash claims to do. It is NOT a content identification or sub-picture search. It is a "perceptual hash". Think of it this way: If you print out both photos, scaled to the same size, and mount them on a wall 20 feet away, would a human say they look the same?

Your "add a thick border of noise" would be a resounding "no, they look different because one has a big noisy border." And lo and behold, the hashes are very different.

A significant crop? "No, one has been cropped."

Flip? Rotate? "Nope, they look different because one has been flipped and/or rotated."

Try this: Take the dog photo, draw a tiny mustache on him, and check the hash. A minor alteration should result in little or no difference in the hash.

The things to test:
How much of a difference must be made to make the hash significantly different? (Lots of bits different.)
Does the amount of change needed vary based on where in the picture it occurs?
How much cropping off of each edge? Is it symmetrical or asymmetrical?
Can we associate specific bits in the hash with a specific type of alteration?

Here's a fun one that I'd like to see someone test:
Start with the dog. Erase (not crop) the left side. Do half of the bits remain the same?

Picture "AA" has hash "aa". Picture "BB" has hash "bb"
If I create a side-by-side splice "AB", do I get the hash "ab"? If so, then we can easily force a hash collision.
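A quick sketch anyone could use to test the splice question with nnhash.py (file names and size are placeholders):

from PIL import Image

def splice_side_by_side(path_a, path_b, out_path, size=(360, 360)):
    a = Image.open(path_a).convert("RGB").resize(size)
    b = Image.open(path_b).convert("RGB").resize(size)
    canvas = Image.new("RGB", size)
    canvas.paste(a.crop((0, 0, size[0] // 2, size[1])), (0, 0))                    # left half of A
    canvas.paste(b.crop((size[0] // 2, 0, size[0], size[1])), (size[0] // 2, 0))   # right half of B
    canvas.save(out_path)

splice_side_by_side("AA.png", "BB.png", "AB.png")
# then: python3 nnhash.py model.onnx neuralhash_128x96_seed1.dat AB.png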

Can you verify that these two images collide?
beagle360
collision

Here's what I see from following your directions:

$ python3 nnhash.py NeuralHash/model.onnx neuralhash_128x96_seed1.dat beagle360.png
59a34eabe31910abfb06f308
$ python3 nnhash.py NeuralHash/model.onnx neuralhash_128x96_seed1.dat collision.png
59a34eabe31910abfb06f308

Both white noise images are the same and I can't see any difference?! Somebody is seeing a dog in the 1st photo...

I generated another working collision:

Target image
lena
32da5083d4f9db9d45b4c397

Generated image
download-60
32da5083c4f9db9d45b4c397

Lena: 32da5083d4f9db9d45b4c397
 Dog: 32da5083c4f9db9d45b4c397

@kjsman I ran those images through CLIP as described here and the dog image is still identified by CLIP as generated (but it does also see the dog, as the 3rd most similar word).

Top matching English words according to CLIP for each of those two colliding images (and their feature vector's cosine similarity with the image's):

Lenna:

britney 0.2710131274330054
hat 0.2695037749031152
caroline 0.267320863072384
hats 0.261636163397781
catherine 0.2614189858805191
judy 0.25487058977059807
claire 0.2535490374489525
heather 0.25337776325078754
sapphire 0.25308272433397716
blake 0.2522795140967653

Dog:

generated 0.2695470207125587
regard 0.26522406265702764
dog 0.2647775811672407
shepherd 0.26429458470715683
vincent 0.26314263493135504
lycos 0.26132000000095534
marilyn 0.2608674788806449
lion 0.2595975315789243
face 0.25944818166185624
gnu 0.2579544381392359

@kjsman Awesome! That's closer! Attacks only get better.

@hackerfactor Thanks for the screenshot! It appears that particular form isn't something the public has access to (or at least I can't find it). It's not on the form I linked to. Regardless, the case law is now completely unambiguous. If the NCMEC is getting scanning results from companies who have not actually inspected the image and believe it to be child porn on the basis of their inspection, then the NCMEC's subsequent review is a search by an agent of the government and requires a warrant. It seems really foolish to have that checkbox there, a footgun to let child predators go free. (And as much as I disapprove of the warrantless searching of users' private files, regardless of who's doing it-- I find it really disappointing that once the invasion has happened they're not doing everything to make sure charges can actually be filed.)

lena-output

python3 nnhash.py model.onnx neuralhash_128x96_seed1.dat lena-output.png 
59a34eabe31910abfb06f308

59a34eabe31910abfb06f308 (Dog Hash)

You are aware that hashes aren't conventional hashes, right? It's just a series of comma separated numbers.

@unrealwill that one at least doesn't match generated higher than any other word in the dictionary, but CLIP still picks it out quite highly (10th most relevant word in the dictionary):

CLIP feature vector cosine similarity with that image & English dictionary words:

lcd 0.2685186813767101
cover 0.2656788641452026
duration 0.2610448567100487
banner 0.2610146163956426
ebook 0.2607660721001148
pixels 0.2596127825121441
dvd 0.25817185406107845
poster 0.258116692755925
gif 0.25807298663266715
generated 0.25786197122020243

Is it possible to steer it away from CLIP's generated feature vector as you permute from the base?

Impose a constraint that would get rid of the high frequencies in the image, such as an L1 norm in the DCT domain.

Adding an L1 DCT penalty should result in much more natural images generally, not just getting rid of the HF noise. Just reiterating this good suggestion.
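A sketch of what that penalty could look like (NumPy/SciPy form for clarity; in a gradient-based attack you would express the same term inside whatever autodiff framework you are optimizing in, and the weight would need tuning):

import numpy as np
from scipy.fftpack import dctn

def l1_dct_penalty(image, weight=1e-3):
    # Penalize high-frequency content via the L1 norm of the 2-D DCT coefficients.
    coeffs = dctn(image, norm="ortho", axes=(0, 1))
    return weight * np.abs(coeffs).sum()

# total_loss = hash_matching_loss(image) + l1_dct_penalty(image)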

@yeldarby Is it possible to steer it away from CLIP's generated feature vector as you permute from the base?
Yes, you can steer it away easily; just add a term to the loss function (but obviously you will need to compute CLIP every iteration, so it will take more time to converge).

The L1 DCT penalty is probably a good suggestion, but you can probably also use the discriminator of a GAN network to maintain naturalness.

To give an idea of the time complexity:
My above picture was generated in less than 5 minutes on a laptop, CPU only, using non-optimized code.
Starting from a given image + basic gradient descent + clipping the image to its [-1,1] domain at each iteration (not even L-BFGS yet).
Optimized code would probably run ~100x faster, so there is plenty of room to add quality.

Sorry for the NSFW, but it was to show that you can produce a soft-porn image with any given hash, one that a user can know is perfectly legal and have legitimate reasons to store, but whose subject's age a manual reviewer will have more trouble determining.

Put yourself in the shoes of the manual reviewer presented with a nude picture with a "matching CSAM hash" and ask yourself, from the grainy picture that looks like a home-made scanned slide, whether you can determine if the girl is underage or not and whether you should report the picture.

I'm convinced you can take almost any image and smoothly transition it to very close to any other image while retaining its original perceptual hash value. The properties of what a perceptual hash needs to do practically guarantee it.

If you think of the space of all possible images of a given height / width, the perceptual hash should divide that space up into clustered continuous chunks where locality is determined by tiny edits. An obvious way to define the space is as one dimension for each color of each pixel with values from 0 to 255. The issue is that as we increase the number of dimensions, these clusters of identical hashes are more forced to stretch to fill that space.

To get some intuition on this: consider as you increase dimensions, the "volume" of an n-dimensional sphere divided by the "volume" of the corresponding n-dimensional cube approaches 0. This is concerning because the sphere is the best possible packing you can get, in other words the ability of well packed objects to fill space gets worse as dimensions go up. Consider that even a tiny 360 by 360 image would represent a huge 388,800 dimensional space. Instead of getting nice clusters, we're getting a kind of interwoven series of sponges for each hash value which completely fills the space, which is admittedly hard to visualize.

Not being able to tightly pack these hash values together defeats the main goals of the perceptual hash (or any perceptual hash?) to have similar images give similar values. I'm about a third of the way done with turning a picture of samwise into a bowl of frootloops with the identical perceptual hash.

@unrealwill I used the code in your gist and swapped the actual model and hashing function but got nowhere near the performance described. Any chance you could share the updated code?

@yeldarby Is it possible to steer it away from CLIP's generated feature vector as you permute from the base?
Yes, you can steer it away easily; just add a term to the loss function (but obviously you will need to compute CLIP every iteration, so it will take more time to converge).

@unrealwill can you maintain the same NeuralHash while doing so though?

@mcdallas Here is an example implementation that I hacked together in a couple hours: https://github.com/anishathalye/neural-hash-collider It doesn't have any particularly fancy features, just uses vanilla gradient descent, and the loss currently doesn't have a term for distance from starting image, but it seems to work all right on some examples I've tried so far. It seems a bit sensitive to some parameter choices.

For example, starting from this cat picture I found online, the program finds a mutation that collides with the dog in the original post above, taking about 2.5 minutes on a 2015-era CPU:

x

$ python3 nnhash.py adv.png
59a34eabe31910abfb06f308

It's all well and good for Apple to run images through CLIP and discard the generated ones, but consider @KuromeSan's suggestion:

add borders, about 1000px larger than the main image, fill them with as much entropy as you can.

CLIP would surely identify this modified image as generated.

@thecadams It's absolutely trivial for someone with interdicted images to reliably evade the hash match with trivial modifications, even simpler and less visually obnoxious than that. So their threat model must not assume that the child porn peepers are actively trying to evade detection.

Given that, one might wonder why they used this vulnerable ML hash instead of just SHA-256ing a downsampled and quantized copy of the decompressed image-- getting a slight amount of robustness to scaling/recompression but avoiding any real risk of generating second pre-images. It's hard to resist being uncharitable and suggest that someone might have been justifying headcount for a low-value, high-buzzword-factor enhancement that ended up adding a vulnerability.

Yes! I can confirm that both images generate the exact same hashes on my iPhone.

Sorry if I've missed something (I'm a bit of a brainlet), but this makes it sound like you're hashing on your iPhone somehow, rather than using the extracted model on a computer.
Is there currently any known way of generating hashes directly on a (presumably jailbroken) iPhone using Apple's own implementation?
Sorry if this is a dumb question, but I'd be interested in tinkering around with the device-side hashing implementation if it's already there somewhere.

Yes! I can confirm that both images generate the exact same hashes on my iPhone.

Sorry if I've missed something (I'm a bit of a brainlet), but this makes it sound like you're hashing on your iPhone somehow, rather than using the extracted model on a computer.
Is there currently any known way of generating hashes directly on a (presumably jailbroken) iPhone using Apple's own implementation?
Sorry if this is a dumb question, but I'd be interested in tinkering around with the device-side hashing implementation if it's already there somewhere.

https://github.com/KhaosT/nhcalc

You can port the code to iOS, it's trivial.

"check this cute dog photo, set it as your wallpaper"
"ok"
5 hours later
"why am I in jail"

really hope the above doesn't happen and Apple can actually moderate

@mcdallas

I used the code in your gist and swapped the actual model and hashing function but got nowhere near the performance described. Any chance you could share the updated code?

Your wish is my command :

https://gist.github.com/unrealwill/d64d653a7626b825ef332aa3b2aa1a43

Above is a script to generate collisions.
It has some additional improvements over my previous, more general script.

Most notably a SciPy optimizer applied to the real weights, and an optional alternative dual loss function.

Some images may take longer to converge, but others can be really quick.
It takes between 10 seconds and 10 minutes using CPU only.

Jesus

"check this cute dog photo, set it as your wallpaper"
"ok"
5 hours later
"why am I in jail"

really hope the above doesn't happen and Apple can actually moderate

I would not trust my personal freedom to be moderated by a corporate entity.

"check this cute dog photo, set it as your wallpaper"
"ok"
5 hours later
"why am I in jail"
really hope the above doesn't happen and Apple can actually moderate

I would not trust my personal freedom to be moderated by a corporate entity.

This. This entire thing shouldn't happen in the first place.

"check this cute dog photo, set it as your wallpaper"
"ok"
5 hours later
"why am I in jail"
really hope the above doesn't happen and Apple can actually moderate

Oh good heavens. If you would bother to read all of the documentation that Apple has produced about the system, then you would know that (a) these generated collisions are useless because you need the hashes to generate them, and the CSAM hashes aren't available to users, and (b) flagged accounts are moderated.

In discussion earlier, someone devious pointed out that it would be relatively easy to find sample CSAM on a low-level scumbag fest via your local darknet -- one common enough that it is probably known to the fuzz -- then get the hash from that.

Hi all. I can now generate second-preimage images that don't look like a glitch-art horror show.

They still aren't perfect, but I'm totally new to all these tools and only spent a bit over an hour on it. It took me more time trying to get onnx/python/tensorflow working than I spent breaking the hash, so I didn't get around to trying anything too fancy. I found essentially everything I tried more or less worked, so I'm confident that with some more intelligence much better can be done.

The way I got this result is to apply the same iteration as @anishathalye: taking the derivative of the Apple hash function and applying it to the image, but I also perform noise shaping by using a Gaussian high-pass filter on the error signal relative to the original image and then feeding that back in. The appearance is pretty sensitive to the feedback schedule so I twiddled until I got a good result. In this case I started with an initially high feedback decreasing hyperbolically, and then once it hit a hash Hamming distance of zero I had it switch to a control loop that tries to exponentially increase feedback while the distance is zero and decrease feedback while it is above zero. Then I picked the diff=0 image that had the lower PSNR vs the input (which wasn't the last one, because none of my objectives were MSE).

Maybe now that I've shown it can be done, someone who knows these tools will do it better. I'm sure bolting on a proper optimizer would at least make it require less fiddling to get an okay result, and something smarter than a dumb high-pass filter will assuredly get better result even with this simple noise shaping approach. The next noise-feedback I would want to try is to apply an 8x8 jpeg-like dct on the noise, quantize it and round down the coefficients, and feed that back in.
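My reading of that noise-shaping step, as a rough sketch (this is not the author's code; feedback and sigma are the knobs that needed the fiddling described above):

import numpy as np
from scipy.ndimage import gaussian_filter

def highpass_feedback(x, x_orig, feedback=0.5, sigma=2.0):
    err = x - x_orig                                # deviation from the source image
    lowpass = np.stack([gaussian_filter(err[..., c], sigma)
                        for c in range(err.shape[-1])], axis=-1)
    highpass = err - lowpass                        # the grainy component of the error
    return x - feedback * highpass                  # push the grain back out of the image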

image

$ python3 ./nnhash.py ./model.onnx ./neuralhash_128x96_seed1.dat /tmp/a.png
59a34eebe31910abfb06f308

(matches the dog at the top of the thread.)

Input image:

image

Feel free to use this image in further examples. I was tired of unimaginative people who thought images of dogs were an irrelevant consideration, and my partner volunteered a mid-80s picture of herself.

@gmaxwell my version (lost in the thread above) has the SciPy L-BFGS-B optimizer working: https://gist.github.com/unrealwill/d64d653a7626b825ef332aa3b2aa1a43 -- it should help you get familiar with the tooling.

@gmaxwell - that version appears genuine to CLIP. Similarity to generated is only 0.19051728 (top matches: costume, britney, dress, hs, costumes).

But the cosine similarity between your image and the original dog photo is only 0.4458430417488104 (compared to 0.7915248896442771 for this arbitrary image of a dog I pulled from Google).

If their server-side model truly is independent and has a different hash database, you'd need to create a single adversarial image that fools both. Not sure how easy this is; has anyone tried it or seen any studies?

Feel free to use this image in further examples. I was tired of unimaginative people who thought images of dogs were an irrelevant consideration, and my partner volunteered a mid-80s picture of herself.

Be careful: your partner's childhood picture may become the new Lena.

I found two pairs of naturally occurring collisions in the ImageNet dataset. I created a repo to track these organic NeuralHash collisions (please do not submit adversarially generated pairs to this repo; only examples found in the wild).

I wrote up the findings here as well: https://blog.roboflow.com/nerualhash-collision/

Has anyone tried creating a universal perturbation? That is, a perturbation that, when added to a non-trivial number of images, causes this model to hash them all to the same value? That would be an interesting datapoint to have from a robustness perspective.

The other interesting data point would be an image that hashes to some interesting value. I had initially tried generating an image that would hash to the zero vector, but had trouble flipping some of the bits. I also tried things like 0xdeadc0de and 0xbaadbeef with no luck. Would be interesting to understand this! I imagine seed0 may have something to do with this, but I haven't really given it much analysis.

Be careful: your partner's childhood picture may become the new Lena.

I'm not concerned. It's obviously a bit different. In Apple's case... literally: these images have a single bit difference in their neuralhashes:

image
image

$ ./nnhash.py model.onnx neuralhash_128x96_seed1.dat 130190095-1c83424f-fb50-4586-a9bf-f659d8094250.png
32dac883f7b91bbf45a48696
$ ./nnhash.py model.onnx neuralhash_128x96_seed1.dat 130190121-5f452c06-e9a8-4290-9b8f-d300ead94c13.png
32dac883f7b91bbf45a48296

but had trouble flipping some of the bits.

Indeed, they're far from uniformly easy to flip. From my experience producing collisions, I expect that no single universal perturbation exists, but suspect there are small numbers of distinct clusters of images where it's easier to produce collisions between them (or apply a specific adversarial perturbation for them), but harder across groups. I say this based on the fact that there are some image pairs where I've found it extremely easy to produce collisions, and other pairs where I can get them as close as 1 bit different but not collide (as above).

If I were trying to produce many adversarial images I'd want to try to be flexible with my starting images.

I found that if I sample a region around the current working image and average the gradients before taking an update, in addition to the noise shaping, I get a less noisy and better-looking result. The results also appear to be more robust-- they keep their hashes better even with compression noise etc.
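A sketch of that gradient-averaging idea (expectation-over-transformation style); grad_fn is assumed to return the hash-loss gradient for a single image, e.g. from the earlier gradient-descent sketch:

import numpy as np

def averaged_gradient(x, grad_fn, n_samples=8, sigma=0.02):
    # Average the gradient over a small noisy neighborhood of the working image.
    grads = [grad_fn(x + np.random.normal(0.0, sigma, x.shape).astype(x.dtype))
             for _ in range(n_samples)]
    return np.mean(grads, axis=0)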

So I updated my earlier example preimage of the dog hash, the girl/lena near collision was already using these techniques (which is why those looked better).

One thing I thought I should make clear: I created this image without ever using the dog image (or even downloading it, until I fetched it for comparison after the fact). I used only dog image's hash (and the original picture of the girl, of course).

image
image

$ ./nnhash.py model.onnx neuralhash_128x96_seed1.dat 129860794-e7eb0132-d929-4c9d-b92e-4e4faba9e849.png
59a34eabe31910abfb06f308
$ ./nnhash.py model.onnx neuralhash_128x96_seed1.dat 130296784-bdf329d0-06f6-454e-ba87-d83b010614a7.png
59a34eabe31910abfb06f308

This is not only a collision, it is a pre-image, which breaks the algorithm even more.

Collision:
Find two random images with the same hash.

Pre-image:
Find an image with the same hash as a known, given image.

I don't see how that differs at all. Basically to match two random image hashes you have to have both as known given images before hashing them.

I'm convinced you can take almost any image and smoothly transition it to very close to any other image while retaining its original perceptual hash value. The properties of what a perceptual hash needs to do practically guarantee it.

If you think of the space of all possible images of a given height / width, the perceptual hash should divide that space up into clustered continuous chunks where locality is determined by tiny edits. An obvious way to define the space is as one dimension for each color of each pixel with values from 0 to 255. The issue is that as we increase the number of dimensions, these clusters of identical hashes are more forced to stretch to fill that space.

To get some intuition on this: consider as you increase dimensions, the "volume" of an n-dimensional sphere divided by the "volume" of the corresponding n-dimensional cube approaches 0. This is concerning because the sphere is the best possible packing you can get, in other words the ability of well packed objects to fill space gets worse as dimensions go up. Consider that even a tiny 360 by 360 image would represent a huge 388,800 dimensional space. Instead of getting nice clusters, we're getting a kind of interwoven series of sponges for each hash value which completely fills the space, which is admittedly hard to visualize.

Not being able to tightly pack these hash values together defeats the main goals of the perceptual hash (or any perceptual hash?) to have similar images give similar values. I'm about a third of the way done with turning a picture of samwise into a bowl of frootloops with the identical perceptual hash.

So one attack vector for a Perp of Mass Deception (PMD) would be to create actual CSAM, then create an identically hashing image that looks like something totally different, and spread that second image widely to create an endless stream of false positives. Basically, if this is done for each CSAM image produced, it renders this detection method totally moot.

And... it looks like it was done: #1 (comment), #1 (comment), #1 (comment)

Actually, according to that second linked post above, a typical computer could produce dozens of random matching images per hour. So you could easily create a dozen false-positive images for every CSAM image you produce, making this detection method by Apple totally useless and costing Apple tons of money hiring people to look for matches.

Another attack vector is for the PMD to perturb their CSAM images multiple times so each CSAM image has multiple hashes -- another way to throw a wrench into this detection method and make everyone's job much harder. Each image could have countless perturbations, making the database balloon to an impossibly large size from perturbations of the same CSAM image alone. Also, the hashes stored on the Apple device are much larger than the actual neural hashes and so would take up an enormous amount of space on each device.

This is not only a collision, it is a pre-image, which breaks the algorithm even more.
Collision:
Find two random images with the same hash.
Pre-image:
Find an image with the same hash as a known, given image.

I don't see how that differs at all. Basically to match two random image hashes you have to have both as known given images before hashing them.

Because this 'hash' function is entirely differentiable you can easily modify existing images to match an arbitrary hash without ever having seen the other image that the arbitrary hash matches yourself. There may not even be another image that exists.

Collisions modify both images (I gave an example with the lena/girl pair above), so they're fundamentally easier to compute. Not only do you have to have access to both images, you have to modify them. If only collisions were possible, then I couldn't make false matches against their CSAM database except by distributing images that get included in the database. Since preimages are possible, I can construct matches if someone gives me a list of hashes which are likely to be in it.

So your disruptive attacker that distributes images all over the place doesn't need to create or distribute abuse images, though the attack you describe would also work.

Perhaps a better version is this: Imagine you are an authoritarian regime that dislikes certain ideological factions or ethnicities. You identify images which may be possessed by participants in those groups. You then obtain child porn images (very easy as an authoritarian regime; you can even command the creation of new images if it'll work better) and then modify the child porn images to have hashes matching the political images that you're targeting. You place these hashes in your child abuse image database and also forward the abuse images on to other governments and NGOs, and they will enter them into their databases with plausible deniability (or not even knowing at all), mooting any requirement that multiple reporting groups must have the hashes in question.

This attack can still work even when there is a second non-public perceptual hash, assuming that it's also of a flawed design like Apple's published hash function: as a state actor you have access to this second hash function (as you needed to generate the hashes, as parties are not sending child porn to apple). So you can generate images which match with both functions.

That doesn't change the fact that it is just as easy to "find" two images that match hashes as to "find" an image whose hash matches the hash of a known image. As people have shown above it takes just minutes to do either.

Perhaps a better version is this: Imagine you are an authoritarian regime that dislikes certain ideological factions or ethnicities. You identify images which may be possessed by participants in those groups. You then obtain child porn images (very easy as an authoritarian regime; you could even command the creation of new images if that would work better) and modify the child porn images to have hashes matching the political images you're targeting.

That is a very strong attack by a government on its people, if for nothing other than to get people onto a government database like America's version of a "suspected terrorist" list.

That doesn't change the fact that it is just as easy to "find" two images that match hashes as to "find" an image whose hash matches the hash of a known image. As people have shown above it takes just minutes to do either.

It's easier to find two images that match than it is to find an image that hashes to a known hash, even in this case. But this is always the case: even for an arbitrary black-box hash function, a completely generic collision attack needs only about the square root of the work required to find a preimage, due to the birthday paradox.

Both attacks (preimage and collision) are easy with this hash function because it has a fundamentally weak design. For stronger hash functions like, say, MD5, one can now compute collisions easily, but a second preimage is still unachievable. The poster you were responding to was emphasizing that the attack is a preimage attack rather than a collision because we all expect collision attacks to be much easier, but they're less useful to the attacker, who has to influence both inputs rather than just one.

It's easier to find two images that match than it is to find an image that hashes to a known hash, even in this case.

I just don't see it because one hash always has to be held constant when comparing it to another. You can vary both, or you can vary one twice as many times.

It depends on the type of attack. When it is brute force (just varying inputs and comparing outputs), you can vary one or both for the same cost.

I just don't see it because one hash always has to be held constant when comparing it to another. You can vary both, or you can vary one twice as many times.

It's a surprising fact. The reason a collision is fundamentally easier in a generic attack is that the number of things you match against keeps growing.

Imagine that you want to find some x such that H(x) matches either H(y) or H(z). Clearly that's easier than finding something that matches H(y) alone because you get two chances instead of one: it takes half as many comparisons on average.

So when you want to find x, y such that H(x) == H(y), you can alternate between varying x and varying y, and each time you compare against all the previously computed values, not just the most recent. With each operation the chance of success increases because you're comparing against more and more alternatives. This takes roughly the square root of the number of operations needed when changing only one input. It's the same mathematical fact that makes a room with surprisingly few people likely to contain two who share a birthday: with just 23 people, the odds of a shared birthday are over 50%, even though there are 365 days in a year.
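
A toy illustration of that gap, using a made-up 24-bit hash (nothing to do with NeuralHash): the generic collision search below typically succeeds after a few thousand random tries (on the order of sqrt(2^24)), while the brute-force preimage search needs about 2^24, roughly 16.7 million.

```python
import hashlib
import os

def toy_hash(data: bytes) -> bytes:
    # Truncate SHA-256 to 3 bytes (24 bits) so the experiment finishes quickly.
    return hashlib.sha256(data).digest()[:3]

def find_collision():
    # Generic birthday attack: remember every hash seen so far and stop as soon
    # as any two different random inputs share an output.
    seen = {}
    tries = 0
    while True:
        x = os.urandom(16)
        h = toy_hash(x)
        tries += 1
        if h in seen and seen[h] != x:
            return tries
        seen[h] = x

def find_preimage(target: bytes):
    # Generic preimage attack: keep guessing until we hit one fixed target hash.
    # Expect ~16.7 million tries, so this may take a minute or two in plain Python.
    tries = 0
    while True:
        tries += 1
        if toy_hash(os.urandom(16)) == target:
            return tries

print("collision found after", find_collision(), "tries")
print("preimage found after", find_preimage(toy_hash(b"known image")), "tries")
```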

There are fancy algorithms that eliminate the storage and search costs of comparing against all the prior hashes, at the price of a little more computation. A good place to read about memory-efficient collision algorithms is the Wikipedia article on cycle detection: if you choose the next value to try based on the hash of the prior value, finding a collision is equivalent to finding a cycle in the graph created by feeding the hash's output back into itself.
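
And a minimal sketch of that cycle-detection trick on the same kind of toy 24-bit hash, using Floyd's tortoise-and-hare algorithm; again, this just shows the generic technique on a made-up hash, not anything specific to NeuralHash.

```python
import hashlib
import os

def toy_hash(data: bytes) -> bytes:
    # Same toy 24-bit hash as in the previous sketch.
    return hashlib.sha256(data).digest()[:3]

def rho_collision(seed: bytes):
    # Walk the sequence x -> toy_hash(x) -> toy_hash(toy_hash(x)) -> ...
    # The output space is finite, so the walk must eventually enter a cycle, and
    # the two distinct values that both map to the cycle's entry point collide.
    step = toy_hash
    # Phase 1: Floyd's tortoise and hare find some point on the cycle with O(1) memory.
    tortoise, hare = step(seed), step(step(seed))
    while tortoise != hare:
        tortoise = step(tortoise)
        hare = step(step(hare))
    # Phase 2: restart the tortoise from the seed; moving both one step at a time,
    # they meet exactly at the cycle entry. The values each held just before meeting
    # are two different inputs with the same hash. (The seed is 16 random bytes, so
    # it cannot equal any 3-byte hash output and is never already on the cycle.)
    tortoise = seed
    prev_t = prev_h = None
    while tortoise != hare:
        prev_t, prev_h = tortoise, hare
        tortoise = step(tortoise)
        hare = step(hare)
    return prev_t, prev_h

a, b = rho_collision(os.urandom(16))
assert a != b and toy_hash(a) == toy_hash(b)
print(a.hex(), b.hex(), "->", toy_hash(a).hex())
```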

That applies to generic attacks that work even without any knowledge of the internal structure. With the Apple hash function the hash is differentiable, so we're not limited to generic attacks. In this case it's easier to find a collision because we're not limited to updating one image at a time. Instead, we can construct a function that computes the difference between two images' hashes and use autodiff to find the partial derivatives of that difference with respect to all of the pixels at once, modifying both images in the same step. The extra dimensions give the attack a lot more degrees of freedom.
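
A minimal sketch of that joint search, under the same assumptions as the earlier sketch (a hypothetical differentiable PyTorch port of the model plus its 96x128 projection matrix; no perceptual-similarity term is included, which a real run would need to keep the results looking like the source pictures):

```python
import torch

def soft_hash(model, seed_matrix, image):
    # Continuous stand-in for the 96-bit hash: the sign of each projected
    # component is one hash bit, so matching signs means matching bits.
    return (model(image) @ seed_matrix.T).squeeze(0)   # shape (96,)

def joint_collide(model, seed_matrix, img_a, img_b, steps=1000, lr=1e-2):
    # Optimize BOTH images at once: one backward pass gives the partial derivatives
    # of the bit disagreements with respect to every pixel of both images.
    a = img_a.detach().clone().requires_grad_(True)
    b = img_b.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([a, b], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        ha = soft_hash(model, seed_matrix, a)
        hb = soft_hash(model, seed_matrix, b)
        # A bit disagrees exactly when the product of its two pre-sign values is negative.
        loss = torch.nn.functional.relu(-ha * hb).sum()
        loss.backward()
        opt.step()
        if loss.item() == 0:      # every one of the 96 signs agrees
            break
    return a.detach(), b.detach()
```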

Here are some more good-looking examples: (prior examples here)

Altering Lena to match Barbara:

image
image

$ ./nnhash.py ./model.onnx ./neuralhash_128x96_seed1.dat 130310372-d6b2c633-c5d3-4b3e-bc12-d3b2f3104ec6.png
a426dae78cc63799d01adc32
$ ./nnhash.py ./model.onnx ./neuralhash_128x96_seed1.dat 130310383-9fcc80ba-117e-4383-9177-0df24bc99f9a.png
a426dae78cc63799d01adc32

Hey @gmaxwell

What happens if you resize the image down to say 1/4 then upsample back to the original size, then run the hash again? Do they still collide?

What happens if you resize the image down to say 1/4 then upsample back to the original size, then run the hash again? Do they still collide?

They're still fairly close, e.g. using the lena/barbara above:

$ convert -scale 90x90 130310372-d6b2c633-c5d3-4b3e-bc12-d3b2f3104ec6.png x.png ; convert -scale 360x360 x.png 1.png
$ convert -scale 90x90 130310383-9fcc80ba-117e-4383-9177-0df24bc99f9a.png x.png ; convert -scale 360x360 x.png 2.png
$ ./nnhash.py ./model.onnx ./neuralhash_128x96_seed1.dat 1.png
a426dae78cc23399501acc32
$ ./nnhash.py ./model.onnx ./neuralhash_128x96_seed1.dat 2.png
a42692e78ce22b99d11ace32

I suspect the property could be met by making the search work that way. E.g. stick an upsampler on the computation graph for the model so that it works with a 90x90 image, and do the search there. Then upsample it and use it as a starting point for the full res search (or jointly test the full res and down/up path).
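
For what it's worth, a rough sketch of what "stick an upsampler on the computation graph" could mean, again assuming the same hypothetical differentiable PyTorch port of the model and its projection matrix:

```python
import torch
import torch.nn.functional as F

def upsampled_soft_hash(model, seed_matrix, small_image):
    # small_image: (1, 3, 90, 90) tensor being optimized at low resolution.
    # Upsample to the model's native 360x360 input before hashing, so the search
    # only ever explores perturbations that survive the down/up round trip.
    big = F.interpolate(small_image, size=(360, 360), mode="bilinear",
                        align_corners=False)
    return (model(big) @ seed_matrix.T).squeeze(0)   # pre-sign hash values, shape (96,)
```

Because the optimizer only sees the 90x90 tensor, whatever it finds can then be upsampled and used as a starting point for the full-resolution search.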

Is there some particular reason that this property is important? I could imagine that doing the search in a multiscale way might result in better-looking images... but is there something about the Apple setup that works this way?

I won't be posting any more preimages for the moment. I've come to learn that Apple has begun responding to this issue by telling journalists that they will deploy a different version of the hash function.

Given Apple's consistent dishonest conduct on the subject I'm concerned that they'll simply add the examples here to their training set to make sure they fix those, without resolving the fundamental weaknesses of the approach, or that they'll use improvements in the hashing function to obscure the gross recklessness of their whole proposal. I don't want to be complicit in improving a system with such a potential for human rights abuses.

I'd like to encourage people to read some of my posts on the Apple proposal to scan users' data which were made prior to the hash function being available. I'm doubtful they'll meaningfully fix the hash function (this entire approach is flawed), but even if they do, it hardly improves the ethics of the system at all. In my view the gross vulnerability of the hash function is mostly relevant because it speaks to a pattern of incompetence and a failure to adequately consider attacks and their consequences.

And this post written after: