Avoid Duplicate Submissions with SHA1

I had actually forgotten that I had done this in a project that I worked on a few months ago but came across it again today.  I figured I’d blog about it for a moment.  Not because I think it likely that too many of you, dear readers, will care deeply about my PHP tricks, but in the hopes that I’ll remember the technique and be able to find it here if I need it again in the future.

So here’s the scenario: you want to be sure that a visitor doesn’t submit the same information through a web form multiple times.  Maybe you’re worried about them refreshing the page, for example.  Regardless of the circumstances, all you have is the data they submitted and you need a quick way to try to make sure it doesn’t match earlier data.  The way I recently did that was to use an anonymous function, a handy feature that shipped with PHP 5.3 if I remember correctly, to reduce the data posted to the server to a string and then hash it.  It looked like this:

$hash = sha1(array_reduce($values, function ($reduction, $value) { return $reduction . $value; }, ''));

And that’s it.  Here’s what’s going on though:

  1. The array_reduce function takes an array and a callback function as parameters (plus an optional initial value) and returns whatever the callback builds up, which here is a string.
    • Before we had anonymous functions you needed to use create_function or define your function elsewhere in your code.  But with anonymous functions, you can just jam it right in there.
    • That callback takes two parameters; I’ve named them $reduction and $value.  That callback is called as the array is walked and, in the code above, simply concatenates all of the array values into one long string.
    • That string is then the return value from the function.
  2. The sha1 function then takes that string and makes a hash out of it.
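To make the two steps above easier to see, here is the same one-liner unpacked into separate statements.  This is only a sketch: the contents of $values are made up for illustration, and the intermediate $joined variable is a name I've introduced here, not something in the original code.

```php
<?php
// Hypothetical form data; in practice this is whatever was posted.
$values = ['name' => 'Ada', 'email' => 'ada@example.com', 'comment' => 'Hello'];

// Step 1: walk the array, appending each value to the running string.
$joined = array_reduce(
    $values,
    function ($reduction, $value) {
        return $reduction . $value;
    },
    '' // start from an empty string instead of the default null
);

// Step 2: hash the concatenated string.
$hash = sha1($joined);

echo $joined, PHP_EOL;       // Adaada@example.comHello
echo strlen($hash), PHP_EOL; // 40
```

Note that the array keys play no part here; only the values, in order, end up in the string.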

I suppose you could just leave the string as a string, but if you need to jam the information into a database, it’s nice to know exactly the size of the data you’re about to insert or update so you can optimize your table: a SHA-1 hash rendered as hex is always 40 characters.  A cursory glance at the PHP array functions didn’t turn up anything that does this for me out of the box, though on a second look implode with an empty string as the separator builds the same concatenated string.  I’m still pretty pleased with it.
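A quick way to check both points, sketched with made-up input: the reduce-based concatenation appears to match what implode produces with an empty separator, and sha1 gives back 40 hexadecimal characters no matter how long the input is.  The variable names here are mine, chosen for illustration.

```php
<?php
$values = ['first', 'second', 'third']; // hypothetical input

// The concatenation from the post.
$viaReduce = array_reduce($values, function ($reduction, $value) {
    return $reduction . $value;
}, '');

// The built-in alternative: join with an empty separator.
$viaImplode = implode('', $values);

var_dump($viaReduce === $viaImplode); // bool(true)

// sha1() returns a 40-character hex string for any input length.
$short = sha1('');
$long  = sha1(str_repeat('x', 100000));

var_dump(strlen($short)); // int(40)
var_dump(strlen($long));  // int(40)
```

That fixed length is what lets you size a database column for the hash up front, for example as a 40-character fixed-width field.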

How does this prevent duplication?  Well if you have a record of these hashes — in a database or in the Session or whatever — you can compare the hash for new information against one or more of the hashes for the old and if you find a match you can be pretty sure that you’ve got a duplicate.  Sure, it’s possible to collide but Wikipedia tells me it’s pretty unlikely to happen.
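Here is a minimal sketch of that comparison, assuming the earlier hashes are kept in a plain array.  The function name isDuplicateSubmission and the $seen variable are hypothetical; in a real page the array might live in the session or be replaced by a database lookup.

```php
<?php
// Hypothetical helper: returns true if $values hashes to something already
// recorded in $seenHashes; otherwise records the new hash and returns false.
// Assumes every value is a string or other scalar (no nested arrays).
function isDuplicateSubmission(array $values, array &$seenHashes): bool
{
    $hash = sha1(array_reduce($values, function ($reduction, $value) {
        return $reduction . $value;
    }, ''));

    if (in_array($hash, $seenHashes, true)) {
        return true;
    }

    $seenHashes[] = $hash;
    return false;
}

// Stand-in for stored hashes, e.g. an array kept in $_SESSION.
$seen = [];
$post = ['name' => 'Ada', 'comment' => 'Hello'];

$first  = isDuplicateSubmission($post, $seen);
$second = isDuplicateSubmission($post, $seen);

var_dump($first);  // bool(false)
var_dump($second); // bool(true)
```

One thing to keep in mind: because the values are concatenated with nothing between them, two different submissions such as ['ab', 'c'] and ['a', 'bc'] produce the same string and therefore the same hash.  If that matters for your form, joining with a separator that can't appear in the data is one way around it.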

Hope it’s useful to you.  And if you’re me, then thanks, me!
