BlogFileMaker

SortValues and UniqueValues: New FileMaker 16 Functions

By May 9, 2017 4 Comments
FileMaker 16 includes a host of new functions for parsing and manipulating text and binary data. The primary motivation for including these new functions was to support interactions with REST APIs.

Many of the text functions are targeted exclusively at working with JSON. However, two of them are more general-purpose and have been a long time coming.

These two functions are SortValues and UniqueValues.

FileMaker developers have had to make due with custom functions to do the job of these two, but now we can make use of native functions instead.

Let’s take a closer look at each one.

SortValues

SortValues ( values {; datatype ; locale } )

Datatype is an optional parameter taking one of these values:

  • 1 – text
  • 2 – number
  • 3 – date
  • 4 – time
  • 5 – timestamp

A positive value will sort the values in ascending order. A negative value will use a descending sort order. For example, -3 will sort a list of dates in descending order.

If you leave the data type blank or give it an invalid value, it will default to 1 (text, ascending).

As is the case with most of the other FileMaker list functions, a trailing return is included at the end of each non-empty result.

The other optional parameter is the locale. It allows you to specify the language of the values being sorted. Different languages follow different collating rules when sorting text. For example, SortValues ( “Tom¶Brian¶Aaron” ; 1 ; “Norwegian” ) = “Brian¶Tom¶Aaron¶”. This is because, in Norwegian (and in Danish), “AA” is treated as “Å” and sorted below “Z”.

If you leave the locale blank, it will default to the file locale, which is set based on the operating system used when the file was initially created. You can read more about this here:

The valid locale options are listed in the SortValues help page:

The SortValues function is not case-sensitive; however, case-sensitivity can be achieved by specifying “Unicode_Raw” as the locale. Doing so will sort uppercase letters before lowercase letters. For example: SortValues ( “b¶a¶c¶ZZZ” ; 1 ; “Unicode_Raw” ) = “ZZZ¶a¶b¶c¶”

Blank lines will not be ignored – except for the last trailing return. If the input list has one or more blank lines, those lines will sort to the top of the list (in the case of an ascending sort).

From my testing, the algorithm used for SortValues appears to be stable, meaning that values that are considered identical with respect to sorting will retain their relative order in the result. For example, “a” and “A” are considered equivalent values, and sorting a list of values which has “a” appearing before “A” will keep “a” in front of “A” in the result. This may seem obvious, but note that there are non-stable sorting algorithms out there where relative order isn’t guaranteed in the result.

UniqueValues

UniqueValues ( values {; datatype ; locale } )

The UniqueValues function removes duplicates from a list of values based on the specified data type and locale.

For the data type and locale parameters, the same considerations apply as with the SortValues function.

To determine if two values are equivalent, FileMaker uses the indexing definition of what a duplicate is. For text values, this means case insensitivity for most locales. But if you want to achieve case sensitivity, you can use the “Unicode_Raw” locale.

For example: UniqueValues ( “aaa¶AAA” ) = “aaa¶”, but  UniqueValues ( “aaa¶AAA” ; 1 ; “Unicode_Raw” ) = “aaa¶AAA¶”.

UniqueValues will retain the original order of the list, keeping the first value encountered; i.e. the values in the result will not be sorted or otherwise shuffled around.

For example: UniqueValues ( “B¶a¶Z¶b¶A¶z” ) = “B¶a¶Z¶”.

With the release of FileMaker 14, FileMaker Inc. has deprecated the ability to create Runtime versions of FileMaker databases. As a result, SortValues and UniqueValues (and indeed all the other new FileMaker 16 text parsing functions) are not supported with Runtimes and will return “?” if evaluated within a Runtime application.

Performance

Big O notation is used in computer science to describe how an algorithm performs in best-case, worst-case, and typical scenarios. For example, for a sorting algorithm, best-case means that the list is already sorted.

Typical big O complexities are O(n), O(n log n), and O(n^2), where n is the number of elements being operated on.

An algorithm’s big O complexity matters when dealing with larger data sets (or if you have to process a smaller set many times over). That is when you (or your users) will start to feel the pain of exponential-time algorithms.

Big O Complexity for SortValues

I ran some tests to see if I could identify the big O time complexity for SortValues. I plotted out the times it took to sort text lists with increasingly larger numbers of values and then overlaid that on top of O(n log n) and O(n^2) graphs. The input values consisted of randomly selected words, so this data is representative of neither best-case nor worst-case scenarios, but somewhere in between.

Big O Complexity chart

Figure 1. Big O time complexity for SortValues

I conducted 4 test runs, and they suggest that SortValues is an average-case O(n^2) algorithm.

Based on these observations (n^2 and stable sort), the underlying sorting algorithm is likely one of these:

  • Insertion sort
  • Bubble sort
  • Strand sort
  • Cocktail sort
  • Gnome sort
  • Odd-even sort

Comparison to Custom Functions

I was curious to see how the new native function compares to some of the sorting custom functions available in the FileMaker community.

The two custom functions used as comparisons are:

The new native SortValues function did perform better than the SortList custom function. However, when sorting very large lists (more than 10,000 values), the QuickSort custom function was faster. This is both surprising and unsurprising. Surprising, because I had expected that a native function would outperform a custom function. Unsurprising, because the QuickSort algorithm is known to be average-case O(n log n). In other words, now that we’ve observed SortValues as being O(n^2), it’s no surprise that an O(n log n) algorithm will overtake it once the size of the list being sorted reaches a certain point.

Chart of SortValues compared to Custom Functions

Figure 2. SortValues compared to Custom Functions

The good news is that sorting lists that large tends to be quite rare. And, when working with smaller data sets (less than 10-20K), the new native SortValues function performed faster than even QuickSort, which seems to indicate that its execution is more efficient per-iteration.

Chart comparing SortVlues with custom functions with smaller data sets

Figure 3. SortValues compared to custom functions with smaller data sets

Get the SortValues and UniqueValues Demo File

If you’d like to take a closer look at the test results or run your own tests, you can download the demo file:

Summary

SortValues and UniqueValues are welcome additions to the FileMaker text function library.

You can control case sensitivity by using the Unicode_Raw locale.

The underlying algorithm for SortValues appears to be O(n^2); however, for most real-world FileMaker applications, the data sets are too small for that to matter.

Reference

Mislav Kos

Mislav Kos

Mislav is a FileMaker developer and a Technical Solution Architect at Soliant Consulting.

4 Comments

  • Avatar Vincent says:

    Hi,
    I’m using BaseElement’s plugin sortvalues and unique for years. Could you add them in your very interesting performance comparison

    • Mislav Kos Mislav Kos says:

      Hi Vincent, thanks for the suggestion. I may do this later on, but it won’t happen in the near future as I am quite busy at the moment.

  • […] Soliant Consulting FileMaker 16 is Here Executive Summary┬áΓÇô┬áSummary of SoliantΓÇÖs Top 3 List: Card windows, Layout Object Inspector Palette, and Added JSON support Parsing JSON Create JSON Layout Objects Inspector Card Window SortValues and UniqueValues functions […]

  • Avatar Jeremy says:

    I haven’t seen that QuickSort custom function in years! I’ve used this more recently (https://github.com/jbante/FileMaker-Techniques/blob/master/CustomFunctions/Value/ValueSort-TimSort.fmfn). It’s roughly based on timsort, which has theoretically better best-case and worst-case performance than quicksort, though I haven’t empirically confirmed how these two implementations compare.

    I’m skeptical of the custom functions with larger lists. Linearithmic (O ( N Log N )) performance easily exceeds FileMaker’s custom function recursion limits. Note that the results from the quick sort function for even medium-size lists is “?”. The SortList function is able to extend this limit due to Agn├¿s’ characteristically clever use of Evaluate.

    Even then, the sort algorithm isn’t the only constraint on performance. Na├»ve parsing of any return-delimited list in FileMaker is quadratic (O ( n ^2 )), and more sophisticated parsing only improves that to O ( n^1.5 ). Parsing the list with functions exposed in FileMaker’s calculation engine is as much a limiting factor as the choice of sort algorithm. A linearithmic sort algorithm (e.g., quicksort, timsort, mergesort) on a return-delimited list implemented as a FileMaker custom function can only achieve O ( n^1.5 ) performance at best. One would hope that the built-in function could achieve better performance from being able to used cursor-based parsing of the list rather than the functions in the calculation engine.

Leave a Reply

Need to adjust your business processes quickly? We're helping clients use technology to keep their teams productive and running smoothly in these times of uncertainty. Our team can guide yours if you need help in these areas.

Talk to a Consultant