Sorting for Humans : Natural Sort Order

December 12, 2007

The default sort functions in almost every programming language are poorly suited for human consumption. What do I mean by that? Well, consider the difference between sorting filenames in Windows explorer, and sorting those very same filenames via Array.Sort() code:

Explorer shell sort Array.Sort()
alphabetical sort ASCIIbetical sort

Quite a difference.

I can say without the slightest hint of exaggeration that this exact sorting problem has been a sore point on every single project I've ever worked on. Users will inevitably complain that their items aren't sorting properly, and file bugs on these "errors". Being card-carrying members of the homo logicus club, we programmers produce a weary sigh, and try to keep any obvious eye-rolling in check as we patiently inform our users that this isn't an error. Items are sorting in proper order. Proper ASCII order, that is. As we're walking away, hopefully you won't hear us mutter under our breath what we're actually thinking-- "Stupid users! They don't even understand how sorting works!"

I always felt a pang of regret when rejecting these requests. Honestly, look at those two lists-- what sane person would want ASCII order? It's a completely nonsensical ordering to anyone who doesn't have the ASCII chart committed to memory (and by the way, uppercase A is decimal 65). I never really understood that there was another way to sort, even though natural sort has been right in front of us all along in the form of Mac Finder and Windows Explorer file listings. I had language-induced blinders on. If our built-in sort returns in ASCII order, then that must be correct. It was bequeathed upon us by the Language Gods. Can there be any other way?

Kate Rhodes is a bit up in arms about our collective ignorance of ASCIIbetical vs. Alphabetical. Can't say I blame her. I'm as guilty as anyone. Turns out the users weren't the stupid ones after all -- I was.

Silly me, I just figured that alphabetical sorting was such a common need (judging by the number of people asking how to do it I'm not wrong either) that I wouldn't have to write the damn thing. But I didn't count on the stupid factor. Jesus Christ people. You're programmers. You're almost all college graduates and none of you know what the f**k "Alphabetical" means. You should all be ashamed. If any of you are using your language's default sort algorithm, which is almost guaranteed to be ASCIIbetical (for good reason) to get alphabetical sorting you proceed to the nearest mirror and slap yourself repeatedly before returning to your desks and fixing your unit tests that didn't catch this problem.

It isn't called "Alphabetical sort"; it's collectively known as natural sort. But she's right about one thing: it's hard to find information on natural sorting, and many programmers are completely ignorant of it. None of the common computer languages (that I know of) implement anything other than ASCIIbetical sorts. There are a few places you can find natural sort algorithms, however:

Don't let Ned's clever Python ten-liner fool you. Implementing a natural sort is more complex than it seems, and not just for the gnarly i18n issues I've hinted at, above. But the Python implementations are impressively succinct. One of Ned's commenters posted this version, which is even shorter:

import re 

def sort_nicely( l ): 
  """ Sort the given list in the way that humans expect. 
  """ 
  convert = lambda text: int(text) if text.isdigit() else text 
  alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
  l.sort( key=alphanum_key ) 

I tried to come up with a clever, similarly succinct C# 3.0 natural sort implementation, but I failed. I'm not interested in a one-liner contest, necessarily, but it does seem to me that a basic natural sort shouldn't require the 40+ lines of code it takes in most languages.

As programmers, we'd do well to keep Kate's lesson in mind: ASCIIbetical does not equal alphabetical. ASCII sorting serves the needs of the computer and the compiler, but what about us human beings? Perhaps a more human-friendly natural sort option should be built into mainstream programming languages, too.

Posted by Jeff Atwood
205 Comments

This discussion is funny.

The problem isn't with sorting. The problem is with the data structure. If you want a hierarchy use one. Make a tree and sort the levels of the tree.

Take this shopping list for an example:

Wheat bread
White Bread
White Milk
Chocolate Milk


It's a list. If i sort the list i'll get:
Chocolate milk
Wheat Bread
White Bread
White Milk


Whaaah /cry the user says 'cause milk is seperated and not sorted propery!
Whaaah /cry the developer says "get used to it" this is the way computers have worked since the iron age.

the problem isnt the sort, the problem is the list. What we want is a Tree:

Bread:
Wheat
White
Milk:
Chocolate
White

Then when we sort we sort each level of the tree.

When you sort the strings:

dumbuser10
dumbuser2
dumbdeveloper10
dumbdeveloper2

If you expect the result:

dumbdeveloper2
dumbdeveloper10
dumbuser2
dumbuser10

Then you are expecting the tree:

levels of the tree.

Take this shopping list for an example:

Wheat bread
White Bread
White Milk
Chocolate Milk


It's a list. If i sort the list i'll get:
Chocolate milk
Wheat Bread
White Bread
White Milk


Whaaah /cry the user says 'cause milk is seperated and not sorted propery!
Whaaah /cry the developer says "get used to it" this is the way computers have worked since the iron age.

the problem isnt the sort, the problem is the list. What we want is a Tree:

Bread:
Wheat
White
Milk:
Chocolate
White

Then when we sort we sort each level of the tree.

When you sort the strings:

dumbuser10
dumbuser2
dumbdeveloper15
dumbdeveloper3
smartdeveloper1

If you expect the result:

dumbdeveloper3
dumbdeveloper15
dumbuser2
dumbuser10
smartdeveloper1


You are expecting a tree:
dumb:
developer:
3
15
user:
2
10

smart:
developer:
1


So fricken use a tree user and developer alike.

brian on December 13, 2007 1:03 AM

omg sorry about that post, i guess i'm dumb user 2 'cause i cut and pasted all fooked

brian on December 13, 2007 1:04 AM

"natural sort groups numerical substrings"

Yes, but if "natural" means "the way humans do it", it has to do much more and handle other popular ordered substrings...

Days of week:
•tuesday
•wednesday
•thursday

Date/time:
•15-8-2004
•1-10-2004
•1-1-05

•5:00 AM
•14:00
•4:00 PM
(special tricky for 12:00-1:00 AM/PM)

Street adresses:
•9, James street
•10, James street
•10 bis, James street
•10 ter, James street
•5, Rudolph street

Product reference, or anywhere digits are not numerical, like phone "number":
-01K305
-1K205

Hexadecimal:
0001
F01A
f01b
(0-9 and A-Z/a-z in straight order)

Version:
•v1.9
•v1.10
•v1.10.1
(a period is not a decimal separator)

Sorting with or without leading article (Mr, Miss, St, "The" etc).

etc etc

And then there is the issue that some sorts depend on locale.
And then there is the issue that some strings were built with *other*, untold locale (like filenames).
And then there is the issue that some strings are ambiguous and require looking at the entire set.
And then there is the issue that some sets are fundamentally ambiguous.
And then there is the issue that since the sort depends on the set, it is not stable nor predictable.


If you want to make good human-like sort... good luck.

Musaran on December 13, 2007 1:35 AM

And in the case of the version numbering... thats a tree:

v:
--1:
----9:
------0
----10:
------0
------1:
--------0

If human want tree... human should use tree.

brian on December 13, 2007 1:42 AM

Windows Explorer in your example sorts by file name, not by file name with extension. Hence the difference.

If you apply Array.Sort() to file names only -- you would get result that is pleasant for users.

Dennis Gorelik on December 13, 2007 1:52 AM

I have both learned to use padding zeroes in the filenames, as well as written thesorting a couple of times in my profession.

Hmm.. I bet there's an ISO standard for this somewhere. If people liked communicating across borders more, they'd be less exited about localization and more into internationalization. There is, seriously, no need to have a local way or writing dates. But then again, what do you expect from people who don't use the metric system...

Peterh on December 13, 2007 1:52 AM

Musaran, I agree with you totally - natural sorting is not well-defined. I was simply describing how most natural sort algorithms are implemented.

Aaron G, funny you should say nobody needs to sort decimal numbers as a sub-string. This natural sort algorithm actually supports sorting decimal sub-strings by NOT ignoring leading zeroes (although that behaviour can be disabled):
http://sourcefrog.net/projects/natsort/
Before you say "well, that's just some random guy's sort routine", he claims that his code has been incorporated into PHP:
http://us3.php.net/manual/en/function.strnatcmp.php

Actually, this whole discussion boils down to a central issue in human factors: How Software Works (well-defined algorithms) versus What Users Want (fuzzy, ill-defined specifications).

Once AI is smart enough to figure what users really want, we'll all be out of jobs. (The code will write itself).

Will on December 13, 2007 1:54 AM

Daniel 'Dang' Griffiths:

Didn't your teacher tell you you'd go blind if you continued to "Embrace the Python".

Maybe you could try "polishing the Ruby"?

On the topic of this post:

One sort approach is never going to work in all situations. Let your users try out the sort with various filenames etc, get feedback from them, change the sort to behave as close to their expectations as possible. Easy really.

Ash on December 13, 2007 2:00 AM

Ash wrote:
"One sort approach is never going to work in all situations. Let your users try out the sort with various filenames etc, get feedback from them, change the sort to behave as close to their expectations as possible. Easy really."

------

Yeah, I totally can see how that would work with Windows Explorer.
Each time you open Explorer, Clippy could pop out of nowhere and say "You seem to be viewing a list of files. How do you like the way I sorted them?"

Will on December 13, 2007 2:06 AM

Aaron G wrote:

"As if you'd ever need to sort hexadecimal in a system with actual users. If you find yourself encountering this issue frequently, it's a strong indication of awful UI design and potentially terrible back-end design."
---------

Are you sure nobody uses hexadecimal?

What if your users are developers (e.g. Visual Studio)? Or your application is highly technical (Modbus master/slave, network packet sniffer such as WireShark, debugger, etc.)?

Will on December 13, 2007 2:12 AM

Peterh wrote: "If people liked communicating across borders more, they'd be less exited about localization and more into internationalization."

------

The funny thing is, people do it all the time. It's called the Internet.

And what do you think happens when someone posts on an "American" message board using non-American spelling/grammar (e.g. British English treats companies and other groups as plural nouns) or non-American date conventions (e.g. DAY/MONTH/YEAR instead of the American MONTH/DAY/YEAR). They get flamed. People yell at them for using "bad grammar" or "not knowing how to write dates".

Of course, I'm sure this also happens when Americans post on non-American websites.

The point is that people hold on to their little cultural conventions and idiosyncracies as if they were authoritative and universal.

Good luck on shifting the focus from "localization" to "internationalization".

Will on December 13, 2007 2:25 AM

Hey Jeff, first time commenting. :)

Great blog, your recent blog about the 80/20 actually inspired me to start my own blog.

Saw this one and have to give it my own try:
Quick Java Comparator:
http://simplesql.tigris.org/servlets/ProjectDocumentList?folderID=0

Cheers!

Eric B on December 13, 2007 2:26 AM

@Tom:
"What? Really? Hex?
Not only do only only you and DEAD people know Hex, but most of them don't use it that regularly."

That's not the point. The point is what is the natural order for hex strings? The 'natural order' is under defined, you can't have it in the language (or in its standard library).
Only the programmer can decide what is the natural order. No, the user can't, the user sees only what she'd like you have to take into account all your users.

E[X] on December 13, 2007 2:35 AM

@ Aaron G:

"If some of you are really so concerned about this obscure edge case, put in an heuristic that checks for a string composed of only decimal digits and same-case letters A-F, with at least two characters from each category. "

Oh yes... right... lets put there an heuristcs so that we use two different ordering depending on the elements of the set. Wait... what appens if only some strings are hexadecimal?

"Jesus guys, it IS important to consider edge cases, but that doesn't mean you should scrap the whole idea if a few of those edge cases will produce incorrect results. You just make sure that you've defined this behaviour, and set expectations accordingly"

No, this is the problem. A lexicographical ordering is the same everywhere. The 'natural order' is not, you can't have it as a function in the programming language, it depends on the domain.

If you want to show a sorted list of strings to the user you have to choose one ordering. If, depending on the domain, there is an ordering that is *always*correct* you should implement that. If the list of strings is general enough you have to choose an ordering and screw some of your users. The lexicographical ordering has the undeniable advantage of screwing the users in a predictable fashion with a solid mathematical background.
The natural order is just something you pulled off your @ss, subject to change when you find a "bug".

E[X] on December 13, 2007 2:46 AM

E[X], well whether you like it or not, strnatcmp seems to be standard in PHP:
http://ca.php.net/strnatcmp

Just because "natural sorting" is ambiguous, doesn't mean someone can't make up a "standard" for it.

Will on December 13, 2007 2:47 AM

@Will
well PHP is also the language where "no one needs namespaces", calling that a *standard* is quite a bit far fetched. Is just the way they think it should be.

E[X] on December 13, 2007 2:52 AM

E[X] wrote: "well PHP is also the language where "no one needs namespaces", calling that a *standard* is quite a bit far fetched. Is just the way they think it should be."

-------------------

E[X], fair enough.

Natural sorting has its place (e.g. sorting lists of numbers, if the list is in text form, for some reason). Lexicographical ordering also makes sense in a lot of places. Unfortunately, as this blog post and many of its replies show, most people do not even understand the concept behind lexicographical (uh, sorry, should I say "ASCIIbetical") ordering, so good luck explaining to the end-user why his digital photos are listed in the "wrong order".

I agree with the sentiment that computer software should strive to do "what the user wants". I disagree with the knee-jerk reaction that developers and computer scientists are stupid because the language doesn't always help the developer do what the users want.

Will on December 13, 2007 2:57 AM

Leading zeros are your friends.

mpbk on December 13, 2007 3:09 AM

PHP has a built in natural sort function as well: http://be2.php.net/natsort

Would be nice to have it built in .NET as well.

David Cumps on December 13, 2007 3:10 AM

I don't see how this would actually be a "sorting" problem for .NET. Nothing should be wrong with how .NET sorts, but rather you would need to provide it a custom comparer. You can still use the same Sort method, just provide a comparer that can tell if "z2" is greater than "z10" or not.

John Chapman on December 13, 2007 3:18 AM

Valid point.

Natural sort would increase productivity across America. (White House paying attention?)

Rightnow, workaround is the best:

z001.txt
z011.txt
z100.txt

Kalman Toth on December 13, 2007 3:18 AM

You're absolutely right, Jeff - there's no reason that there shouldn't, somewhere in the libraries for .NET or Java, be a natural sorting algorithm. I know that in Java there's more than enough locale specific functionality that the argument of different character sets or sorting rules (someone in one of the links mentions that, in Sweden, 'V' and 'W' sort together) doesn't really hold water. Besides, it's Sun and Microsoft - don't they have motivated interns to do that sort of work?

BrotherBeal on December 13, 2007 3:25 AM

Though I don't disagree with the theme that we should do what we can as programmers to reduce cognative friction, make lives easier, and exceed user expectations, I have to wonder, what exactly is coming out unsorted here? Obviously the file names are a contrived example, because let's be honest, no (rational) human would name files with such a scheme if they ever intended those files for human consumption.

I can think of only a few situations (though admittedly I may be missing some) where sorting ASCIIbetically produces counter-intuitive results. For example, numbers/dates tacked on to the beginning (or end) of file names as an ad hoc method of version control. Maybe, a user ID number (perhaps an integer) that's used as a primary key in a database might sort out of order. In answer to these "incorrect" sorts, I would argue that the users are not using the proper tools (perhaps because we as programmers haven't supplied them, or due to other reasons such as cost).

My point is that the types of things that tend to sort poorly in ASCIIbetical are the type of things that we as programmers should be abstracting away in our software so that users don't have to consider whether something's sorted "incorrectly." Certainly there will be situations where abstracting such things is infeasible, or unnecessary, and such a "natural" sort might be appropriate, but when that happens, write the sort, port it out to a library assembly (especially now with .NET 3.0 extension methods this becomes tractable), and get on with it. If you come across a situation where it's necessary later, link the library into your assembly.

Just don't put the cart before the horse Jeff.

Michael Starke on December 13, 2007 3:26 AM

Leading zeros are your friends.

because let's be honest, no (rational) human would name files with such a scheme if they ever intended those files for human consumption.

We are thinking as programmers here, ask any normal person to write a list of file names and he/she will write
photo1
photo2
photo3
photo..
photo10
photo11

Natural sort is the way!

Edddy on December 13, 2007 3:34 AM

If you want to sort like Explorer, you are apparently supposed to use StrCmpLogicalW. See: (http://msdn2.microsoft.com/en-us/library/bb759947.aspx) and (http://www.pinvoke.net/default.aspx/shlwapi/StrCmpLogicalW.html). I think it is only suited to sorting filenames though, because it may not take account of sorting properly for different languages/cultures.

Chris L on December 13, 2007 3:35 AM

I'm sorry, but I don't agree. Let's call it alphanumerical.

z100 should always go BEFORE z2, the same way that table goes before tar.

Also, anyone naming their files *name*-*number* and not putting in the extra digits should not order by name, but by date.

And I take offense at Kate Rhodes' comments. It's yet another case of stupid users.

paketep on December 13, 2007 3:38 AM

Off the top of my head...

Can't you just treat the whole "name" as a base-62* numbering system?

The number line would look something like:
0 1 2 3 4 5 6 7 8 9 a A b B c C d D e E f F g G h H i I ... etc

Convert to decimal in the usual way so:

z1 = (61 * 62) + (1) = 3783
z2 = 3784
z10 = (61 * 62^2) + (1 * 62) + 0 = 234546
z11 = 234547
z100 = (61 * 62^3) + (1 * 62^2) + (0 * 62) + 0 = 14541852
z101 = 14541853

That seems to work.

Downsides are that you need to be able to handle very large numbers to get support a useful filename length and base-62 only covers alphanumeric, you'd need a much bigger base to cover punctuation and unicode.

(Apologies if my maths is a bit wonky on the above. It's before lunch and my coffee isn't kicking in. But I think the principle is sound.)

Graham Stewart on December 13, 2007 3:39 AM

We are thinking as programmers here, ask any normal person to write a list of file names and he/she will write
photo1
photo2
photo3
photo..
photo10
photo11

Why are they choosing file names like that though? Could it be because we've failed as developers to provide them tools to give their files more descriptive names?
Why not name those same photos:
birthday cake.jpg
football cheerleader.jpg
Pam smiling.jpg
zoo animals.jpg

That certainly seems more like the kind of names a user would pick. I would wager if you asked a person if they'd rather have files named rationally with a description of their content, rather than some arbitrary numbering scheme, just about everyone would rather have descriptive file names.

Let's solve the root of the problem before so that we don't have to continually solve the symptoms.

Michael Starke on December 13, 2007 3:41 AM

My Ruby version of Dave Koelle's algorithm:
http://rikkus.info/arch/sensible_sort.rb

Rik Hemsley on December 13, 2007 3:43 AM

Leading zeros are your friends

How many leading zeros?? 3, 4, 10??

Then you have to type them in to access whatever the thing is you've indexed with them - really unfriendly when there are more than about 3.

More thought required methinks.

Why are they choosing file names like that though?

How many keyboards have you ever seen on a digital camera?

Francis Fish on December 13, 2007 3:46 AM

Certainly computers should always do what the user wants, but in this circumstance I doubt that you would find a consensus amongst users at what the proper order should be anyhow.

Appending numbers to the end of filenames is a dead give away that the file was produced by a machine and in that case the name itself is not relevant. I mean if every single one of your digital camera pictures start with “PIC” and you had dozens of folders with PIC1.JPG then putting them together and sorting them will just end up with a nonsensical order anyhow.

Sorting them by the date would be much more meaningful to the user.

I also agree with one of the earliest posters who said this wasn’t a sorting problem at all but a compare routine problem that the sort program calls.

Davide on December 13, 2007 3:47 AM

How many keyboards have you ever seen on a digital camera?

Well, duh! The camera has no way of knowing if you're taking a picture of a pumpkin or a polar bear, and without a keyboard you cannot tell it. But most digital cameras these days come with a suite of software that you use to sink the photos from the camera to the computer. Why is that software not gently prodding those users to provide descriptions of the content of those files so that an intelligible value can be used for the file name? If a user chooses not to provide those descriptions, then they have to remember that DSC01722.jpg is the picture of Pam from the office party. If that's what they want to do, you can't stop them. But you can put the right tools in the hands of users who do want the feature.

Michael Starke on December 13, 2007 3:57 AM

Cocoa has numerical comparison as follows: [someString compare:someOtherString options:NSNumericalSearch];.

It's trivial to implement a function for NSArray's sortUsingFunction:context: that uses it, or a category on NSString to add a -numericalCompare: method to be used with sortUsingSelector:.

millenomi on December 13, 2007 3:57 AM

Anyone coding for European languages will have certainly come across a variant of this where bugs crop up complaining that accented characters (like ) end up at the bottom of a sort instead of next to their a-z equivalents. It gets even more fun when you start considering compound letters like ''.

To the poster who suggested treating the name in base-62, what you're essentially doing is sorting by string length first, then alphabetically - not addressing the core problem, e.g. if you filenames were all the same length (z1.., z10., z100) the system falls down.

Dino on December 13, 2007 3:58 AM

Davide wrote: "Certainly computers should always do what the user wants, but in this circumstance I doubt that you would find a consensus amongst users at what the proper order should be anyhow."
--------

No kidding, that's sort of the point of this whole discussion.

The problem is that the "human factors" crowd is flaming the "stupid developers" because programming languages don't have a built-in "natural sort" function. Look at the excerpt of the blog Jeff posted:

"Jesus Christ people. You're programmers. You're almost all college graduates and none of you know what the f**k "Alphabetical" means. You should all be ashamed."

This idiot doesn't even realize that computers DO sort in alphabetical order. The issue is that the ordering is lexicographical, not that it isn't alphabetical.

BTW, human beings produce numbered lists all the time. Even if you think they don't, your compare function should certainly handle that case, don't you think?

Will on December 13, 2007 3:58 AM

@Francis: most digicams I've seen actually use padding zeros. They mostly pad to 5 digits, since it's usually IMG*.JPG, and since we are still using useless FAT16 as a common denominator filesystem on the memory cards, we are not getting long file names. Smells like 1995, don't you think? And it's actually almost 2008 outside... That's just sad.

Nikolai on December 13, 2007 4:00 AM

(Side note: Cocoa calls numerical sorting what you refer to as natural sorting.)

millenomi on December 13, 2007 4:01 AM

How many keyboards have you seen on a digital camera?

That made me laugh out loud. Good shit, Francis.

And i think the person who wrote about using leading zeros meant that the leading zeros would be stripped after the sort was done. Or atleast that's how i pictured it. That would be quite a task on bigger lists though.

Jazz on December 13, 2007 4:03 AM

paketep:
Sorry, but if your users are asking for something and your response is "no thats stupid", instead of "ok, I'll do it that way" then you have some problems with your logic. Users are called users for a reason: they _use_ the stuff.

John on December 13, 2007 4:08 AM

On the Michael Herf website, you can find that :
http://www.stereopsis.com/strcmp4humans.html

I must mention that on his website you can find lots of nice peaces of code.

Alex on December 13, 2007 4:08 AM

@Dino: not sure I understand your criticism?

In my example z100 and z101 are the same length and would be sorted correctly.

If you want to sort "z1..", "z10.", "z100" then it would still work, provided you assign the "." character a value on the number line to determine where it appears relative to other values.

On reflection I think where it falls down is on filenames with DIFFERENT lengths (e.g. apple, ball, cab would be sorted as cab, ball, apple - which isn't quite right!)

Graham Stewart on December 13, 2007 4:11 AM

Why is that software not gently prodding those users to provide descriptions of the content of those files so that an intelligible value can be used for the file name?

Well duh, probably because it's kind of silly to rename the file if you're going to add tags to it later anyway? Besides, if I download 150 pics from my cam, I'm not really eager to have to name them all...

Gabri on December 13, 2007 4:33 AM

@Andrew: Meta-data should be your friend, EXIF when talking about photos. Filenames are obsolete.

Nikolai on December 13, 2007 4:49 AM

It's not as straightforward as it seems -- there are some really nasty corner cases. Can you think of an algorithm that will do "the right thing" in all of these cases?

1_1 vs 1_10

1.1 vs 1.10

1.1 vs 1.01

1.1.1 vs 1.10.1

12.25.2007 vs 9.9.08

Christmas 2007 vs My 32nd Birthday

If you can't get it right every time, you might be better off doing something that breaks simply and consistently.

So not giving numbers special handling is actually quite a good solution. Treating the digits at the *end* of the filename (photo1, photo2, photo10 ...) also makes sense, but going any further is just asking for trouble.

Peter on December 13, 2007 4:51 AM

@Michael Starke
Because when I take my camera out for a day out I can easily take over 100 pictures. It is not a fun job naming them all, and is quite labor intensive, despite the tools we are given. Instead I just whack them in a folder with a name and date that sums up that day.

Ross on December 13, 2007 4:59 AM

Implement NaturalSortComparer as an extension method for IEnumerablestring.

Job Done.

Stuart Cam on December 13, 2007 5:02 AM

If you want the same sort order as explorer just use the same function explorer use, it's already there.

Here is "natural sort" in 6 lines:
First, somewhere in your class import StrCmpLogicalW:

[DllImport("Shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string psz1, string psz2);

And then just use it with Array.Sort:

Array.Sortstring(myStringArray, delegate(string l, string r)
{
return StrCmpLogicalW(l, r);
});

That's all.

NirD on December 13, 2007 5:04 AM

Oh, and for those of you who still think this is an "ASCII sorting" problem, check this out:

CompareString()
http://msdn2.microsoft.com/en-us/library/ms647476.aspx

This is a Win32 string comparison function that does NOT use ASCII comparisons. It works with different locales and character sets (including double-byte characters), it supports multi-character elements and has options to "intelligently" sort words and it supports case-insensitive comparisons.

However, it will STILL produce the following sorted list, because it doesn't do "natural sort" with numerical grouping:
z1.txt
z100.txt
z1000.txt
z2.txt

Go ahead, try it with strings like "z1" and "z100". You will get the same result as plain old "ASCII-betical" strcmp().

And there IS a built-in natural sort function with numerical grouping in Win32:
StrCmpLogicalW()
http://msdn2.microsoft.com/en-us/library/bb759947.aspx
"Compares two Unicode strings. Digits in the strings are considered as numerical content rather than text. This test is not case sensitive."

While "ASCII-betical sorting" does have a meaning, it is not the problem here.


Will on December 13, 2007 5:05 AM

Oh, and for those of you who still think this is an "ASCII sorting" problem, check this out:

CompareString()
http://msdn2.microsoft.com/en-us/library/ms647476.aspx

This is a Win32 string comparison function that does NOT use ASCII comparisons. It works with different locales and character sets (including double-byte characters), it supports multi-character elements and has options to "intelligently" sort words and it supports case-insensitive comparisons.

However, it will STILL produce the following sorted list, because it doesn't do "natural sort" with numerical grouping:
z1.txt
z100.txt
z1000.txt
z2.txt

Go ahead, try it with strings like "z1" and "z100". You will get the same result as plain old "ASCII-betical" strcmp().

And there IS a built-in natural sort function with numerical grouping in Win32:
StrCmpLogicalW()
http://msdn2.microsoft.com/en-us/library/bb759947.aspx
"Compares two Unicode strings. Digits in the strings are considered as numerical content rather than text. This test is not case sensitive."

While "ASCII-betical sorting" does have a meaning, it is not the problem here.


Will on December 13, 2007 5:06 AM

Sorting is also a problem with internationalization.
Hebrew is written right to left for example...

You also have to think about equality and comparison.
How do you sort between z1.txt and z01.txt?

Tsahi Levent-Levi on December 13, 2007 5:14 AM

If that's the best we can do for the internationalized version I don't even want to start. Basically it relies on having a dictionary of all possible special characters you might encounter. I think such a feature might actually exist in unicode (to find out if something is just A with an accent) but putting the dictionary together yourself is code worthy of posting on the daily WTF. :-)

wpp on December 13, 2007 5:25 AM

I think that you are correct in certain instances. Sure, it would be preferable to do a natural sort, but in most instances an alphabetical sort is just fine. SQL Server does all of its sorting by alphabetical order (they refer to it as Dictionary Order) and I would argue that this sorting is what is used quite often when displaying data, especially on the web. I honestly think the problem is nowhere near as big as you are stating it (since it only affect leading or trailing numbers), but in the case of a file manager, I can see where it would be a problem.

Justin Etheredge on December 13, 2007 5:27 AM

What's all this fuss about filenames, anyway? I think filenames are becoming increasingly irrelevant to users, you know, regular, user-type users. Users dont care about filenames, but about content. It's all tags and metadata and stuff.

I think natural sorting is an improvement for displaying filenames, but it's about a decade too late

Rik on December 13, 2007 5:28 AM

I think there is a much deeper problem here. They need to sort their file names... (Almost) Human like sorting of file names is a non-solution to a much deeper problem (very common actually).

Also, these functions will do an almost job as they will only work for us and with the numbers at the end and might or not work with extensions. It sucks and is not constant. Users will use it but will still have to adapt (albeit in an easier way).

So why do they need to sort those hundred and twenty files again?

JF on December 13, 2007 5:28 AM

The problem isn't well defined.

z1.txt and z01.txt were mentioned by Tsahi, but what about zz.txt and z1.txt?

With that Python code, I think the first are going to stay in the order in which they were, and the second order is undefined, as far as I know.

Remco Gerlich on December 13, 2007 5:33 AM

First of all, the problem is not ASCII sort order or whatever. It's called "sorting with lexicographical order". Calling it ASCII sort only muddles the issue (a different character encoding will no solve this issue).
Someone has already mentioned earlier, the problem is that you are conflating the sort operation with the element comparison operation.

Yes, users should not be expected to know or care. They want "natural sorting". But we as programmers should be expected to know the difference.
A programmer should know what comparison sort is (and the difference between it and other types), and how it's composed of discrete comparisons.
A programmer should know more than just to use "items.Sort()".

M on December 13, 2007 5:39 AM

Asks a bunch of users to sort physical objects (or a list of them). They will all come up with different schemes. There is only one that makes sense in the case that was described and its sorting by object number of a specific type.

. on December 13, 2007 5:42 AM

If you want to sort considering international alphabet you should add also a collation algorithm: http://en.wikipedia.org/wiki/Unicode_collation_algorithm

Eduardo Diaz on December 13, 2007 5:52 AM

This statement bugs me, and I'm sort of surprised that no one as brought this up:

"Perhaps a more human-friendly natural sort option should be built into mainstream programming languages, too"

The programming language is NOT the collection of standard libraries or framework that are generally provided with a compiler. This statement would lead one to believe C# and the .NET framework are one and the same, or that C is not C unless you have implementations of stdio, etc. Too many people already think like this, so please don't perpetuate this point of view.

Daniel on December 13, 2007 5:54 AM

The Python implementation use some methods, like isdigit(), split,...

if you implement a few routines its possible to write C, C++, C#
code to natural compare strings using a couple of lines!

Here is a simple algorithm:
//1., 'ax103cool' 'ax2008hot' - 'ax'
offset = GetCommonStringPrefix(strA, strB)
No(offset =0): use the usual compare routine

//2., '103cool' '2008hot' - '103', '2008'
endOfsA = AreAnyNumbers( strA +offset )
endOfsB = AreAnyNumbers( strB +offset )

No: use the usual compare routine
Yes: CompareNumbers( strA +offset, strB +offset, endOfsA, endOfsB )

Nikos on December 13, 2007 5:54 AM

"Perhaps a more human-friendly natural sort option should be built into mainstream programming languages"

Enter Tcl's "lsort -dictionary". Not surprising from a language whose aim is to be as human-friendly as possible.

lsort {a1 a10 a100 a2 a3 a20 a30 a200 a300} ; # default ASCII sort
a1 a10 a100 a2 a20 a200 a3 a30 a300

lsort -dictionary {a1 a10 a100 a2 a3 a20 a30 a200 a300} ; natural sort
a1 a2 a3 a10 a20 a30 a100 a200 a300

See how life can be simple?

BTW, Tcl's license is BSD-style. Help yourself, grab the code.

El Fredo on December 13, 2007 5:58 AM

For differences in Human-sorting, consider the issue of surnames that start with Mc or Mac (e.g. McDonald, MacFarlane, McFarlane etc). A common problem here in Scotland.

Some organisations/individuals will sort "Mc" as if it reads "Mac" (I believe British Telecom do this in the phone book).
Others treat them separately.
Others place MC and Mac before all other names starting with M.
Others ignore the construct altogether and would file McFarlane under "F"!

Likewise, if we are sorting 'sensibly' then we also need to consider hyphenated and non-hyphenated words (a human would put "e-mail" in with the rest of the "email").

It is worth looking at Schwartzian Transforms for this:
http://en.wikipedia.org/wiki/Schwartzian_transform

Joe Cheng offers some examples of writing Schwartzian Transforms in C# 3.0:
http://jcheng.wordpress.com/2007/08/23/schwartzian-transform-in-c-30/

Graham Stewart on December 13, 2007 6:01 AM

Well windows 2000's explorer didn't sort file the natural way, so i've learned to add leading zeros..

Manu on December 13, 2007 6:09 AM

There are even thornier issues about sorting and searching when a user finds that Etienne is in the E section and tienne is listed past the Z area.
Any special language characters require special consideration to deal with and if you have more than one language set possible as input values then get set for some fun.

Mistral on December 13, 2007 6:11 AM

Hey Now Jeff,
This is an issue we all see often just like the when my focus is stolen. (http://www.codinghorror.com/blog/archives/001011.html) I like both of these posts since it's an issue we all deal with notice.
Coding Horror Fan,
Catto

Catto on December 13, 2007 6:14 AM

My solution for the number problem:

http://undev.spaces.live.com/blog/cns!6526A337A687E45D!142.entry

Of course realizing domain concerns may cause you to write your own alogorithm like Graham suggests. Most times you don't need that level of control.

Josh on December 13, 2007 6:15 AM

The filename isn't a good interface to photo tracking anyway. @Ross is right when he says it's no fun (re)naming the photos after a day of taking them. Even if you could name them as you were taking them, you wouldn't, it would interrupt the adhoc nature of taking photos. And names like birthday-cake.jpg, football-cheerleader.jpg and zoo-animals.jpg like @Michael Starke suggested arn't good either, those names are hardly descriptive enough to find one picture a group of files of any useful size (would you really take ONE picture that had MANY animals in it on a trip to the zoo anyway?)

The proper interface for sorting photos is VISUALLY, since photos are visual media. Displaying photo file icons as thumbnails in the file manager was a huge step in the right direction there. But it's still deficient. The photo file manager should work visually too, letting me move files around to "sort" them and remembering those locations (the big grid of photos in many photo management apps doesn't really help, except to browse all photos -- and improperly mirrors the real-world "contact sheet" concept which exists for different purposes).

But back to the issue: sorting. Users are ambiguous in what they want, or they don't even know what they want often times. I've had to handle support calls of the form (not recently, thank god):
"I can't find the file I saved yesterday, can you help me find it?"
"Sure, what did you name it?"
"I don't know."
The filename is THE interface to file management currently. I'm not sure there is much hope, unfortunately, for users who are unable to adjust to that by now (not saying it's perfect in 2007, but if you are going to be productive with certain tools (computers), you should know what's important, especially when the interface arguably sucks).

Our goals should be to make their jobs easier (but easier does not mean "requires no thinking"). This means, in many cases, providing MORE easily accessible options. Changing the sort algo should maybe be provided as part of the sorting interface. Click column headings to change the sort order (asc or desc), right click to select a sort algo for that column of data. Can't navigate your files when sorted ASCIIbetical? Change it to natural order. Still not finding it, then change to an algo that sorts by trying to interpret date formats within the filenames (prime candidate for providing after-market "sort algo plugin" functionality -- since you can't predict what users are going to want, so enable someone else to build it).

Some people think giving users FEWER options is better (uh, ahem, a lot of the Gnome interface), but all that does is make the user change their habits to do things a certain way. It's not about reducing the number of options or hiding options, it's about making them more easily and obviously accessible. It doesn't make sense to force someone to go to Edit | Preferences, then vgrep a huge list of options or checkboxes to change their default sort algo (if even it was an option), but it does make sense to build those options into the UI widget that is the common or accepted endpoint for that functionality.
Rather than debate which sort algo the users are expecting, we should enable users to exploit multiple sort algos and switch between them easily based on their mood or work style.

Axel on December 13, 2007 6:17 AM

If you dim the array as a string and use the sort method, it will produce results in the literal way, but if you dim the array as an integer, it usually works in the logical way. (In VB.NET)

Zak on December 13, 2007 6:17 AM

Yea for Ruby's sort_by method on Enumerable objects!

a = ["x1x10", "x10x1", "x5x5", "b55c5", "x3b4", "x3b2"]

p a.sort_by {|string| string.downcase.scan(/([a-z]+|\d+)/).flatten.collect {|s| s.match(/\d/) ? s.to_i : s } }

Ok, so it's a bit more complicated than that (of course). Wasn't taking into account non alphanumeric characters:

p a.sort_by {|string| string.downcase.scan(/([a-z]+|[^\d]+)/i).flatten.collect {|s| s.match(/\d/) ? s.to_i : s } }

It's probably still not perfect, but it's workable now, at least.

Brian on December 13, 2007 6:20 AM

Will,
"Clippy could pop up out of nowhere", nice ridiculing of my point. But then you've called other posters idiots and the like so I think your level of class is obvious for all to see.

Also, you completely miss my point. It is simply that the user could ask for sorting by their favorite colour, the magnitude of the planets, dates closest to christmas etc etc. As a developer you must inform them that it is not necessarily the best approach (for example how will other users understand the sort). But if they still want it, it's then your job to give them what they expect.

You're not a Python zealot by any chance?

Ash on December 13, 2007 6:20 AM

An interesting side-effect of the Explorer "human" sort is that some programs create numbered files without the leading zeros because they know they'll be shown in the right order in Explorer. Then you bring that into another program, and you have the issues.

Scott on December 13, 2007 6:23 AM

@Michael Starke Paketep:
There are no stupid users, only stupid developers. Forcing users to do things a specific way or refusing to recognize that a users way of seeing things could be right is arrogant and blatantly idiotic. If a user wants to name his or her files yyyyy1, yyyyy2, etc. who do you think you are to tell they are wrong? You should not gently be prodding anyone or anything except your own ignorance and arrogance.

Right, sorry about the flaming, but the above posts came across so mindnumbingly stupid that it couldn't be helped. I'll get my coat.

Fake

Fake51 on December 13, 2007 6:26 AM

the gnarly i20n issues I've hinted at

I've seen (and created) more than my share of internationalization bugs, but this self-referential mis-shortening (should be i18n) is exceptionally sweet!

Mark Nelson on December 13, 2007 6:32 AM

Why all the drama about sorts? Sometimes it makes sense to sort it one way, sometimes the other. There's no way a programming language could intrinsically know when either is appropriate. So occasionally you need to insert a few lines of code. Oh horrors.

It's not like you need to re-invent the wheel each time either. Most shops have a pre-written function or at worst code to post in.

Next are we going to mull over the tragedy of lower/upper case conversion?

Karl on December 13, 2007 6:35 AM

Why are they choosing file names like that though?

Because they feel like it. Because they are mad. Because they are masochists.
Since when do they have to explain why they want something?
They are the users. They want something. It is not your business to know why or to tell them what they need. If you think you know better (and many times you really do know better what they need) then by all means do it, implement it, talk to them, show it to them, but you have to convince them to accept your suggestion. They dont have to convince you for the necessity of their demand.

As programmers It is our business to offer the best but in respect of the limits users pose to us no matter how we judge these limits. We can always warn or suggest or even refuse to do something we dont want for whatever reason, but we can never force the user to accept our suggestion. In IT like in life, purpose DOES NOT justify the means.

If software was politics, you would be a 'fasist'. (No offense)

In fact every kind of programmer can be described as a politician. Jeff do it. Write about it. Please. It will be fun.

bill on December 13, 2007 6:37 AM

What's iinternationalizationn?

JJM on December 13, 2007 6:39 AM

IanG on Tap (http://www.interact-sw.co.uk/iangblog) just blogged about this this morning. His solution: http://www.interact-sw.co.uk/iangblog/2007/12/13/natural-sorting

Max on December 13, 2007 6:42 AM

I think a lot of you have missed the point here. The post was not about filenames, it was about sorting *strings*. Natural sort makes more sense than ASCIIbetical in almost all situations.

Our bug tracking system sends us emails about bugs with a subject like "[ProjectName 10] Something went wrong". A natural sort on subject gives me bug 9 next to bug 10, rather than seperated by 8990 other bug reports.

Making a room booking application? Why isn't F9 before F10 in the room list?

Also, the most important thing to remember when deciding wether to "fix" this sort of problem is not wether or not the user is being stupid, but wether making the change will make your users happier and more productive. Forget everything else.

Mat Scales on December 13, 2007 6:44 AM

In case anyone is interested: My opinion on the matter is sufficiently described in http://nikolai.prokoschenko.de/archives/161 (technorati is slow to update)

Nikolai on December 13, 2007 6:44 AM

Leading zeros have the problem that they limit the collection size at 10^(z+1) - 1, and people just don't like the idea of having such a limit, because they don't know what to do if they change their mind. It's like line numbers in BASIC: you shouldn't have to deal with it yourself.

Also, normal users usually don't deal with mass renaming and Windows Explorer is not everything and the kitchen sink for renaming (which is a design decision). It's much easier to solve this using a tool like Lupas Rename, which has (for most users) a daunting complexity, if users even know such a tool exists.

@ people calling it a case of "stupid users":
Little natural language tricks like extracting the address from an e-mail signature, the name below "regards, xxx" and understanding that "thursday next week" means Date.Now().AddDays(7) are what adds UI polish to your application. It's the "they thought of -that-, too!" that's sorely lacking in a lot of applications.

Rob Janssen on December 13, 2007 6:48 AM

*DO* let Ned's 10-liner tempt you. You know you want to use Python. Embrace the Python. It is your destiny.

Daniel 'Dang' Griffith on December 13, 2007 6:49 AM

Leading zeros breaks down every time you need to increase the number of leading zeros in your file names, so it's at best a temporary fix.

Even if the file names aren't computer generated, people may use temporary names with the intent of making them more meaningful later, only to end up with large numbers of files sitting around with nothing more meaningful than the date and order in which they were generated (and often that's not much use, either).

Many people like the idea of sorting by date, but that only works if your brain works well with dates in the first place, and/or you have a rough idea of when you generated the particular file you're looking for. The only time I've found dates useful (being a person that rarely knows what day of the week it is, never mind the actual date), was sorting photographs after a long trip. In that particular case at least one of the photos taken within a given week (Wednesday through Tuesday) had a relevant name in it, so I could rename the folder that contained all of the photos for that week with that name.

Other than that, I always sort my data by subject matter and revision information. Photographs are sorted similarly, dates simply help with grouping them together before sorting by subject in cases where I took a number of related photos within a given range of dates (or on a particular date).

I guess the point I'm getting to is that everyone has different expectations when they're storing and sorting data. They may also have different expectations when they're dealing with different types of data. On the other hand, the only people that expect data to be sorted the way most computer applications sort it are people that are used to computers doing this sort of thing. The majority of the people that understand why computers sort the data that way should be working to fix it (and obviously some of them have, as file managers wouldn't do any differently from array sorts otherwise, and for a long time they didn't).

Vizeroth on December 13, 2007 6:56 AM

The leading zeros perspective is interesting. Apartment and office buildings will commonly number the units using the hundreds place for the floor and the tens and ones place for the unit number (limiting the number of units per floor to 99, a seemingly arbitrary limit, albeit practical), making the first nine units on the third floor as 301 through 309. Why aren't these 31 through 39?

Axel on December 13, 2007 7:00 AM

As a data warehouse developer, I'm no stranger to the way business people expect to see things sorted, and it's never ASCII... One thing I do is to add extra columns to my databases and put in "corrected" versions of the keys I'm using, so my table might look like:

Customer Number|Customer|SortKey1
MS948|Fred Smith|MS00000948
MS9215|John Star|MS00009215

..which means I get to use SQL's standard ASCII sorting (indexed for speed, if need be) rather than muck around with anything in client applications. The client presents the real key to the user, but sorts by the pre-calculated ASCII-sort-safe key, and I only have to do the hard work once, up front.

Matt Gibson on December 13, 2007 7:04 AM

@Rob Janssen I wholeheartedly agree. I got an email the other day and was mightily impressed when Apple's Mail.app detected the words "7pm next Tuesday" in it and wrapped them in a subtle little mouseover drop-down that prompted me to create a new appointment in my calendar for the right time and date, with the title being the subject of the email. Now _that's_ impressively user-friendly and integrated...

Matt Gibson on December 13, 2007 7:08 AM

What about z1.2.txt? Do these natural sort algorithms handle decimals? :)

troll on December 13, 2007 7:16 AM

Shouldn't any user-facing sorts be done with the locale's collation function? Isn't this its responsibility?

Robert Fisher on December 13, 2007 7:16 AM

Ash,

I am not a python zealot. I mostly code in C for Linux embedded systems and C++ (MFC) for Windows.

And I only called 1 person an idiot - the blogger (quoted by Jeff) who said:
"Silly me, I just figured that alphabetical sorting was such a common need (judging by the number of people asking how to do it I'm not wrong either) that I wouldn't have to write the damn thing. But I didn't count on the stupid factor. Jesus Christ people. You're programmers. You're almost all college graduates and none of you know what the f**k "Alphabetical" means. You should all be ashamed. If any of you are using your language's default sort algorithm, which is almost guaranteed to be ASCIIbetical (for good reason) to get alphabetical sorting you proceed to the nearest mirror and slap yourself repeatedly before returning to your desks and fixing your unit tests that didn't catch this problem."

I guess you didn't find any of that offensive, rude, or low-class.

And my point was that, in some cases, you CAN'T adapt to your users needs. You can only try to please some of them, some of the time (which is what Explorer does).

Maybe some of my posts were strongly worded. Two reasons:
1) I dislike that implication that developers are too stupid to give the users what they want.
2) I'm disappointed that everyone thinks this issue is an artifact of "asciibetical" sorting, rather than lexicographical ordering. I don't see how you can solve a problem without fully understanding it first.

Will on December 13, 2007 7:18 AM

@Matt: I'm constantly pissed off by Google Mail, which considers any long number a FedEx tracking number -- in Germany. This whole heuristics is entirely context- and language-specific and it needs better means.

Nikolai on December 13, 2007 7:25 AM

known Problem, easy to avoid. If i list such a directory in shell, i just use ls|sort -g. My approach implementing this would be, to determine first, which sort is wanted (ask the user). Then sort ascii as long as there are non-number chars, then sort the numbers as ints.

allo on December 13, 2007 7:29 AM

BTW, the fact that Win32, PHP, Objective-C have BUILT-IN support for "natural sort" ordering and that numerous 3rd-party libraries are available (including a C language library) proves that developers DO care about human factors and that we are not all stupid or hopelessly afflicted with Asperger's Syndrome, as Kate Rhodes and Jeff seem to think.

So I don't think we need to "proceed to the nearest mirror and slap [ourselves] repeatedly before returning to [our] desks", like Kate Rhodes suggested.

And I still have no problem with calling her an idiot. Especially when she thinks we don't know what the f--- "alphabetical order" means. I found that statement quite ironic, since "asciibetical order" is PRETTY MUCH the same as "alphabetical order", as long you're talking about case-insensitive comparisons. This problem has nothing to do with "alphabetical" versus "asciibetical". It is about "lexicographical order" (comparing one character at a time) versus "natural sort order" (i.e. treating groups of digits as numbers, rather than individual characters of text.).

Will on December 13, 2007 7:30 AM

Numeric collation in Cocoa, and the sort order in the Mac Finder is a result of the property UCOL_NUMERIC_COLLATION being set on an ICU collator. ICU (available in C/C++ and Java from icu-project.org) underlies MacOSX collation, and Apple contributed this functionality to ICU.

http://www.icu-project.org/apiref/icu4c/ucol_8h.html#583fbe7fc4a850e2fcc692e766d2826c2efca83794416797ef04abda570c6f5b

This will also work with any non-ASCII digits, arabic, indic, etc.

Srl on December 13, 2007 7:34 AM

As was mentioned by David Cumps early on, PHP has had natural sorting (natsort) for quite some time. The earliest comments for that function are from May of 2003.

More than enough time for Microsoft to have copied yet another idea... errr... have invented natural sort for .Net.

Same for Ruby and all those other lame languages that you know Jeff.
And I use the word lame because as you said:
"None of the common computer languages (that I know of) implement anything other than ASCIIbetical sorts."

And PHP cannot be called a non-common computer language.

rustyvz on December 13, 2007 7:35 AM

One last comment. Many people have pointed out that no "natural sorting" algorithm can be perfect, or please everyone in all situations. Others have pointed out that "natural sorting" is an ambiguous spec, whereas lexicographical ordering is mathematically well-defined.

Maybe, just maybe, this is why we haven't seen "natural order" string comparisons built into every single popular high-level language out there? Could that be the reason, as opposed to the knee-jerk assumption by the human factors crowd that all developers are morons and are out of touch with normal people?

Will on December 13, 2007 7:36 AM

This is the implementation I always use, is very good IMO.

http://www.codeproject.com/KB/recipes/csnsort.aspx

Iceman on December 13, 2007 7:41 AM

BTW, some digital cameras are smarter than others. Read about it here: "http://www.bbspot.com/News/2005/11/sony_photo_sharing.html".

Daniel 'Dang' Griffith on December 13, 2007 7:43 AM

Okay, one last comment.

On the same page as the C library implementation of natural string comparison...
http://sourcefrog.net/projects/natsort/

...there are links for the following implementations:
Javascript
Java
Python
Ruby
Perl
Macintosh (as a system extension)

Hope that's enough platforms for everyone.

Will on December 13, 2007 7:48 AM

paketep you surely make a great programmer with that attitude hahahah

LOL on December 13, 2007 7:51 AM

More about StrCmpLogicalW on MichKa's blog:
http://blogs.msdn.com/michkap/archive/2005/01/05/346933.aspx

Serge Wautier on December 13, 2007 7:52 AM

Probably I'm the only person here who has this problem, but try using Windows Explorer to work with a long list of files whose file names are GUIDs. Whatever order Windows Explorer uses to "sort" those, it sure ain't human-intuitivable.

Programmer 1's solution: "Why are you using GUIDs for file names?"
Programmer 2's solution: "They're sorted, you just don't see how."
Programmer 3's solution: "Who needs sorting, anyway? You can just search the files."
Programmer 4's solution: "Windows Explorer is stupid."

Beep! Sorry, you are not a winner. Please play again.

mike on December 13, 2007 7:57 AM

More comments»

The comments to this entry are closed.