diff --git a/natsort-FEX/html/natsortfiles_doc.html b/natsort-FEX/html/natsortfiles_doc.html new file mode 100644 index 0000000000000000000000000000000000000000..d779ee6242d1018874bff1c6e6ed8e5a94036948 --- /dev/null +++ b/natsort-FEX/html/natsortfiles_doc.html @@ -0,0 +1,294 @@ + +<!DOCTYPE html + PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head> + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> + <!-- +This HTML was auto-generated from MATLAB code. +To make changes, update the MATLAB code and republish this document. + --><title>NATSORTFILES Examples</title><meta name="generator" content="MATLAB 7.11"><link rel="schema.DC" href="http://purl.org/dc/elements/1.1/"><meta name="DC.date" content="2018-03-25"><meta name="DC.source" content="natsortfiles_doc.m"><style type="text/css"> + +body { + background-color: white; + margin:10px; +} + +h1 { + color: #990000; + font-size: x-large; +} + +h2 { + color: #990000; + font-size: medium; +} + +/* Make the text shrink to fit narrow windows, but not stretch too far in +wide windows. 
*/ +p,h1,h2,div.content div { + max-width: 600px; + /* Hack for IE6 */ + width: auto !important; width: 600px; +} + +pre.codeinput { + background: #EEEEEE; + padding: 10px; +} +@media print { + pre.codeinput {word-wrap:break-word; width:100%;} +} + +span.keyword {color: #0000FF} +span.comment {color: #228B22} +span.string {color: #A020F0} +span.untermstring {color: #B20000} +span.syscmd {color: #B28C00} + +pre.codeoutput { + color: #666666; + padding: 10px; +} + +pre.error { + color: red; +} + +p.footer { + text-align: right; + font-size: xx-small; + font-weight: lighter; + font-style: italic; + color: gray; +} + + </style></head><body><div class="content"><h1>NATSORTFILES Examples</h1><!--introduction--><p>The function <a href="https://www.mathworks.com/matlabcentral/fileexchange/47434"><tt>NATSORTFILES</tt></a> sorts a cell array of filenames or filepaths (1xN char), taking into account any number values within the strings. This is known as a <i>natural order sort</i> or an <i>alphanumeric sort</i>. Note that MATLAB's inbuilt <a href="http://www.mathworks.com/help/matlab/ref/sort.html"><tt>SORT</tt></a> function sorts the character codes only (as does <tt>SORT</tt> in most programming languages).</p><p><tt>NATSORTFILES</tt> is not a naive natural-order sort, but splits and sorts filenames and file extensions separately, which means that <tt>NATSORTFILES</tt> sorts shorter filenames before longer ones: this is known as a <i>dictionary sort</i>. For the same reason filepaths are split at every path-separator character (either <tt>'\'</tt> or <tt>'/'</tt>), and each directory level is sorted separately. 
See the "Explanation" sections below for more details.</p><p>For sorting the rows of a cell array of strings use <a href="https://www.mathworks.com/matlabcentral/fileexchange/47433"><tt>NATSORTROWS</tt></a>.</p><p>For sorting a cell array of strings use <a href="https://www.mathworks.com/matlabcentral/fileexchange/34464"><tt>NATSORT</tt></a>.</p><!--/introduction--><h2>Contents</h2><div><ul><li><a href="#1">Basic Usage</a></li><li><a href="#2">Output 2: Sort Index</a></li><li><a href="#3">Output 3: Debugging Array</a></li><li><a href="#4">Example with DIR and a Cell Array</a></li><li><a href="#5">Example with DIR and a Structure</a></li><li><a href="#6">Explanation: Dictionary Sort</a></li><li><a href="#7">Explanation: Filenames</a></li><li><a href="#8">Explanation: Filepaths</a></li><li><a href="#9">Regular Expression: Decimal Numbers, E-notation, +/- Sign</a></li><li><a href="#10">Regular Expression: Interactive Regular Expression Tool</a></li></ul></div><h2>Basic Usage<a name="1"></a></h2><p>By default <tt>NATSORTFILES</tt> interprets consecutive digits as being part of a single integer; each number is considered to be as wide as one letter:</p><pre class="codeinput">A = {<span class="string">'a2.txt'</span>, <span class="string">'a10.txt'</span>, <span class="string">'a1.txt'</span>}; +sort(A) +natsortfiles(A) +</pre><pre class="codeoutput">ans = + 'a1.txt' 'a10.txt' 'a2.txt' +ans = + 'a1.txt' 'a2.txt' 'a10.txt' +</pre><h2>Output 2: Sort Index<a name="2"></a></h2><p>The second output argument is a numeric array of the sort indices <tt>ndx</tt>, such that <tt>Y = X(ndx)</tt> where <tt>Y = natsortfiles(X)</tt>:</p><pre class="codeinput">[~,ndx] = natsortfiles(A) +</pre><pre class="codeoutput">ndx = + 3 1 2 +</pre><h2>Output 3: Debugging Array<a name="3"></a></h2><p>The third output is a cell vector of cell arrays, where each cell array contains individual characters and numbers (after converting to numeric). 
This is useful for confirming that the numbers are being correctly identified by the regular expression. The cells of the cell vector correspond to the split directories, filenames, and file extensions. Note that the rows of the debugging cell arrays are <a href="https://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html">linearly indexed</a> from the input cell array.</p><pre class="codeinput">[~,~,dbg] = natsortfiles(A); +dbg{:} +</pre><pre class="codeoutput">ans = + 'a' [ 2] + 'a' [10] + 'a' [ 1] +ans = + '.' 't' 'x' 't' + '.' 't' 'x' 't' + '.' 't' 'x' 't' +</pre><h2>Example with DIR and a Cell Array<a name="4"></a></h2><p>One common situation is to use <a href="https://www.mathworks.com/help/matlab/ref/dir.html"><tt>DIR</tt></a> to identify files in a folder, sort them into the correct order, and then loop over them: below is an example of how to do this. Remember to <a href="https://www.mathworks.com/help/matlab/matlab_prog/preallocating-arrays.html">preallocate</a> all output arrays before the loop!</p><pre class="codeinput">D = <span class="string">'natsortfiles_test'</span>; <span class="comment">% directory path</span> +S = dir(fullfile(D,<span class="string">'*.txt'</span>)); <span class="comment">% get list of files in directory</span> +N = natsortfiles({S.name}); <span class="comment">% sort file names into order</span> +<span class="keyword">for</span> k = 1:numel(N) + disp(fullfile(D,N{k})) +<span class="keyword">end</span> +</pre><pre class="codeoutput">natsortfiles_test\A_1.txt +natsortfiles_test\A_1-new.txt +natsortfiles_test\A_1_new.txt +natsortfiles_test\A_2.txt +natsortfiles_test\A_3.txt +natsortfiles_test\A_10.txt +natsortfiles_test\A_100.txt +natsortfiles_test\A_200.txt +</pre><h2>Example with DIR and a Structure<a name="5"></a></h2><p>Users who need to access the <tt>DIR</tt> structure fields can use <tt>NATSORTFILES</tt>'s second output to sort <tt>DIR</tt>'s output structure into the correct order:</p><pre 
class="codeinput">D = <span class="string">'natsortfiles_test'</span>; <span class="comment">% directory path</span> +S = dir(fullfile(D,<span class="string">'*.txt'</span>)); <span class="comment">% get list of files in directory</span> +[~,ndx] = natsortfiles({S.name}); <span class="comment">% indices of correct order</span> +S = S(ndx); <span class="comment">% sort structure using indices</span> +<span class="keyword">for</span> k = 1:numel(S) + fprintf(<span class="string">'%-13s%s\n'</span>,S(k).name,S(k).date) +<span class="keyword">end</span> +</pre><pre class="codeoutput">A_1.txt 22-Jul-2017 09:13:23 +A_1-new.txt 22-Jul-2017 09:13:23 +A_1_new.txt 22-Jul-2017 09:13:23 +A_2.txt 22-Jul-2017 09:13:23 +A_3.txt 22-Jul-2017 09:13:23 +A_10.txt 22-Jul-2017 09:13:23 +A_100.txt 22-Jul-2017 09:13:23 +A_200.txt 22-Jul-2017 09:13:23 +</pre><h2>Explanation: Dictionary Sort<a name="6"></a></h2><p>Filenames and file extensions are separated by the extension separator: the period character <tt>'.'</tt>. Using a normal <tt>SORT</tt> the period gets sorted <i>after</i> all of the characters from 0 to 45 (including <tt>!"#$%&'()*+,-</tt>, the space character, and all of the control characters, e.g. newlines, tabs, etc). This means that a naive <tt>SORT</tt> or natural-order sort will sort some short filenames after longer filenames. 
In order to provide the correct dictionary sort, with shorter filenames first, <tt>NATSORTFILES</tt> splits and sorts filenames and file extensions separately:</p><pre class="codeinput">B = {<span class="string">'test_ccc.m'</span>; <span class="string">'test-aaa.m'</span>; <span class="string">'test.m'</span>; <span class="string">'test.bbb.m'</span>}; +sort(B) <span class="comment">% '-' sorts before '.'</span> +natsort(B) <span class="comment">% '-' sorts before '.'</span> +natsortfiles(B) <span class="comment">% correct dictionary sort</span> +</pre><pre class="codeoutput">ans = + 'test-aaa.m' + 'test.bbb.m' + 'test.m' + 'test_ccc.m' +ans = + 'test-aaa.m' + 'test.bbb.m' + 'test.m' + 'test_ccc.m' +ans = + 'test.m' + 'test-aaa.m' + 'test.bbb.m' + 'test_ccc.m' +</pre><h2>Explanation: Filenames<a name="7"></a></h2><p><tt>NATSORTFILES</tt> combines a dictionary sort with a natural-order sort, so that the number values within the filenames are taken into consideration:</p><pre class="codeinput">C = {<span class="string">'test2.m'</span>; <span class="string">'test10-old.m'</span>; <span class="string">'test.m'</span>; <span class="string">'test10.m'</span>; <span class="string">'test1.m'</span>}; +sort(C) <span class="comment">% Wrong numeric order.</span> +natsort(C) <span class="comment">% Correct numeric order, but longer before shorter.</span> +natsortfiles(C) <span class="comment">% Correct numeric order and dictionary sort.</span> +</pre><pre class="codeoutput">ans = + 'test.m' + 'test1.m' + 'test10-old.m' + 'test10.m' + 'test2.m' +ans = + 'test.m' + 'test1.m' + 'test2.m' + 'test10-old.m' + 'test10.m' +ans = + 'test.m' + 'test1.m' + 'test2.m' + 'test10.m' + 'test10-old.m' +</pre><h2>Explanation: Filepaths<a name="8"></a></h2><p>For the same reason, filepaths are split at each file path separator character (both <tt>'/'</tt> and <tt>'\'</tt> are considered to be file path separators) and every level of directory names is sorted separately. 
This ensures that the directory names are sorted with a dictionary sort and that any numbers are taken into consideration:</p><pre class="codeinput">D = {<span class="string">'A2-old\test.m'</span>;<span class="string">'A10\test.m'</span>;<span class="string">'A2\test.m'</span>;<span class="string">'AXarchive.zip'</span>;<span class="string">'A1\test.m'</span>}; +sort(D) <span class="comment">% Wrong numeric order, and '-' sorts before '\':</span> +natsort(D) <span class="comment">% correct numeric order, but longer before shorter.</span> +natsortfiles(D) <span class="comment">% correct numeric order and dictionary sort.</span> +</pre><pre class="codeoutput">ans = + 'A10\test.m' + 'A1\test.m' + 'A2-old\test.m' + 'A2\test.m' + 'AXarchive.zip' +ans = + 'A1\test.m' + 'A2-old\test.m' + 'A2\test.m' + 'A10\test.m' + 'AXarchive.zip' +ans = + 'AXarchive.zip' + 'A1\test.m' + 'A2\test.m' + 'A2-old\test.m' + 'A10\test.m' +</pre><h2>Regular Expression: Decimal Numbers, E-notation, +/- Sign<a name="9"></a></h2><p><tt>NATSORTFILES</tt> is a wrapper for <tt>NATSORT</tt>, which means all of <tt>NATSORT</tt>'s options are also supported. In particular the number recognition can be customized to detect numbers with decimal digits, E-notation, a +/- sign, or other specific features. This detection is defined by providing an appropriate regular expression: see <tt>NATSORT</tt> for details and examples.</p><pre class="codeinput">E = {<span class="string">'test24.csv'</span>,<span class="string">'test1.8.csv'</span>,<span class="string">'test5.csv'</span>,<span class="string">'test3.3.csv'</span>,<span class="string">'test12.csv'</span>}; +natsortfiles(E,<span class="string">'\d+\.?\d*'</span>) +</pre><pre class="codeoutput">ans = + 'test1.8.csv' 'test3.3.csv' 'test5.csv' 'test12.csv' 'test24.csv' +</pre><h2>Regular Expression: Interactive Regular Expression Tool<a name="10"></a></h2><p>Regular expressions are powerful and compact, but getting them right is not always easy. 
One aid is my interactive tool <a href="https://www.mathworks.com/matlabcentral/fileexchange/48930"><tt>IREGEXP</tt></a>, which lets you quickly try different regular expressions and see all of <a href="https://www.mathworks.com/help/matlab/ref/regexp.html"><tt>REGEXP</tt></a>'s outputs displayed and updated as you type.</p><p class="footer"><br> + Published with MATLAB® 7.11<br></p></div><!-- +##### SOURCE BEGIN ##### +%% NATSORTFILES Examples +% The function <https://www.mathworks.com/matlabcentral/fileexchange/47434 +% |NATSORTFILES|> sorts a cell array of filenames or filepaths (1xN char), +% taking into account any number values within the strings. This is known +% as a _natural order sort_ or an _alphanumeric sort_. Note that MATLAB's +% inbuilt <http://www.mathworks.com/help/matlab/ref/sort.html |SORT|> function +% sorts the character codes only (as does |SORT| in most programming languages). +% +% |NATSORTFILES| is not a naive natural-order sort, but splits and sorts +% filenames and file extensions separately, which means that |NATSORTFILES| +% sorts shorter filenames before longer ones: this is known as a _dictionary +% sort_. For the same reason filepaths are split at every path-separator +% character (either |'\'| or |'/'|), and each directory level is sorted +% separately. See the "Explanation" sections below for more details. +% +% For sorting the rows of a cell array of strings use +% <https://www.mathworks.com/matlabcentral/fileexchange/47433 |NATSORTROWS|>. +% +% For sorting a cell array of strings use +% <https://www.mathworks.com/matlabcentral/fileexchange/34464 |NATSORT|>. 
+% +%% Basic Usage +% By default |NATSORTFILES| interprets consecutive digits as being part of +% a single integer; each number is considered to be as wide as one letter: +A = {'a2.txt', 'a10.txt', 'a1.txt'}; +sort(A) +natsortfiles(A) +%% Output 2: Sort Index +% The second output argument is a numeric array of the sort indices |ndx|, +% such that |Y = X(ndx)| where |Y = natsortfiles(X)|: +[~,ndx] = natsortfiles(A) +%% Output 3: Debugging Array +% The third output is a cell vector of cell arrays, where each cell array +% contains individual characters and numbers (after converting to numeric). +% This is useful for confirming that the numbers are being correctly +% identified by the regular expression. The cells of the cell vector +% correspond to the split directories, filenames, and file extensions. +% Note that the rows of the debugging cell arrays are +% <https://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html +% linearly indexed> from the input cell array. +[~,~,dbg] = natsortfiles(A); +dbg{:} +%% Example with DIR and a Cell Array +% One common situation is to use <https://www.mathworks.com/help/matlab/ref/dir.html +% |DIR|> to identify files in a folder, sort them into the correct order, +% and then loop over them: below is an example of how to do this. +% Remember to <https://www.mathworks.com/help/matlab/matlab_prog/preallocating-arrays.html +% preallocate> all output arrays before the loop! 
+D = 'natsortfiles_test'; % directory path +S = dir(fullfile(D,'*.txt')); % get list of files in directory +N = natsortfiles({S.name}); % sort file names into order +for k = 1:numel(N) + disp(fullfile(D,N{k})) +end +%% Example with DIR and a Structure +% Users who need to access the |DIR| structure fields can use |NATSORTFILES|'s +% second output to sort |DIR|'s output structure into the correct order: +D = 'natsortfiles_test'; % directory path +S = dir(fullfile(D,'*.txt')); % get list of files in directory +[~,ndx] = natsortfiles({S.name}); % indices of correct order +S = S(ndx); % sort structure using indices +for k = 1:numel(S) + fprintf('%-13s%s\n',S(k).name,S(k).date) +end +%% Explanation: Dictionary Sort +% Filenames and file extensions are separated by the extension separator: +% the period character |'.'|. Using a normal |SORT| the period gets sorted +% _after_ all of the characters from 0 to 45 (including |!"#$%&'()*+,-|, +% the space character, and all of the control characters, e.g. newlines, +% tabs, etc). This means that a naive |SORT| or natural-order sort will +% sort some short filenames after longer filenames. In order to provide +% the correct dictionary sort, with shorter filenames first, |NATSORTFILES| +% splits and sorts filenames and file extensions separately: +B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'; 'test.bbb.m'}; +sort(B) % '-' sorts before '.' +natsort(B) % '-' sorts before '.' +natsortfiles(B) % correct dictionary sort +%% Explanation: Filenames +% |NATSORTFILES| combines a dictionary sort with a natural-order sort, so +% that the number values within the filenames are taken into consideration: +C = {'test2.m'; 'test10-old.m'; 'test.m'; 'test10.m'; 'test1.m'}; +sort(C) % Wrong numeric order. +natsort(C) % Correct numeric order, but longer before shorter. +natsortfiles(C) % Correct numeric order and dictionary sort. 
+%% Explanation: Filepaths +% For the same reason, filepaths are split at each file path separator +% character (both |'/'| and |'\'| are considered to be file path separators) +% and every level of directory names is sorted separately. This ensures +% that the directory names are sorted with a dictionary sort and that any +% numbers are taken into consideration: +D = {'A2-old\test.m';'A10\test.m';'A2\test.m';'AXarchive.zip';'A1\test.m'}; +sort(D) % Wrong numeric order, and '-' sorts before '\': +natsort(D) % correct numeric order, but longer before shorter. +natsortfiles(D) % correct numeric order and dictionary sort. +%% Regular Expression: Decimal Numbers, E-notation, +/- Sign +% |NATSORTFILES| is a wrapper for |NATSORT|, which means all of |NATSORT|'s +% options are also supported. In particular the number recognition can be +% customized to detect numbers with decimal digits, E-notation, a +/- sign, +% or other specific features. This detection is defined by providing an +% appropriate regular expression: see |NATSORT| for details and examples. +E = {'test24.csv','test1.8.csv','test5.csv','test3.3.csv','test12.csv'}; +natsortfiles(E,'\d+\.?\d*') +%% Regular Expression: Interactive Regular Expression Tool +% Regular expressions are powerful and compact, but getting them right is +% not always easy. One aid is my interactive tool +% <https://www.mathworks.com/matlabcentral/fileexchange/48930 |IREGEXP|>, +% which lets you quickly try different regular expressions and see all of +% <https://www.mathworks.com/help/matlab/ref/regexp.html |REGEXP|>'s +% outputs displayed and updated as you type. 
+##### SOURCE END ##### +--></body></html> \ No newline at end of file diff --git a/natsort-FEX/license.txt b/natsort-FEX/license.txt new file mode 100644 index 0000000000000000000000000000000000000000..3885884db04e8c263b7aea1a92685155bf2c7b5f --- /dev/null +++ b/natsort-FEX/license.txt @@ -0,0 +1,24 @@ +Copyright (c) 2018, Stephen Cobeldick +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the distribution + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. 
diff --git a/natsort-FEX/natsort.m b/natsort-FEX/natsort.m new file mode 100644 index 0000000000000000000000000000000000000000..44cea491bae178dfd99bebe64f0cab3c9b374ee5 --- /dev/null +++ b/natsort-FEX/natsort.m @@ -0,0 +1,330 @@ +function [X,ndx,dbg] = natsort(X,xpr,varargin) %#ok<*SPERR> +% Alphanumeric / Natural-Order sort the strings in a cell array of strings (1xN char). +% +% (c) 2012 Stephen Cobeldick +% +% Alphanumeric sort of a cell array of strings: sorts by character order +% and also by the values of any numbers that are within the strings. The +% default is case-insensitive ascending with integer number substrings: +% optional inputs control the sort direction, case sensitivity, and number +% matching (see the section "Number Substrings" below). +% +%%% Example: +% X = {'x2', 'x10', 'x1'}; +% sort(X) +% ans = 'x1' 'x10' 'x2' +% natsort(X) +% ans = 'x1' 'x2' 'x10' +% +%%% Syntax: +% Y = natsort(X) +% Y = natsort(X,xpr) +% Y = natsort(X,xpr,<options>) +% [Y,ndx] = natsort(X,...) +% [Y,ndx,dbg] = natsort(X,...) +% +% To sort filenames or filepaths use NATSORTFILES (File Exchange 47434). +% To sort the rows of a cell array of strings use NATSORTROWS (File Exchange 47433). +% +% See also NATSORTFILES NATSORTROWS SORT CELLSTR IREGEXP REGEXP SSCANF INTMAX +% +%% Number Substrings %% +% +% By default consecutive digit characters are interpreted as an integer. +% The optional regular expression pattern <xpr> permits the numbers to also +% include a +/- sign, decimal digits, exponent E-notation, or any literal +% characters, quantifiers, or look-around requirements. For more information: +% http://www.mathworks.com/help/matlab/matlab_prog/regular-expressions.html +% +% The substrings are then parsed by SSCANF into numeric variables, using +% either the *default format '%f' or the user-supplied format specifier. 
+% +% This table shows some example regular expression patterns for some common +% notations and ways of writing numbers (see section "Examples" for more): +% +% <xpr> Regular | Number Substring | Number Substring | SSCANF +% Expression: | Match Examples: | Match Description: | Format Specifier: +% ==============|==================|===============================|================== +% * \d+ | 0, 1, 234, 56789 | unsigned integer | %f %u %lu %i +% --------------|------------------|-------------------------------|------------------ +% (-|+)?\d+ | -1, 23, +45, 678 | integer with optional +/- sign| %f %d %ld %i +% --------------|------------------|-------------------------------|------------------ +% \d+\.?\d* | 012, 3.45, 678.9 | integer or decimal | %f +% --------------|------------------|-------------------------------|------------------ +% \d+|Inf|NaN | 123, 4, Inf, NaN | integer, infinite or NaN value| %f +% --------------|------------------|-------------------------------|------------------ +% \d+\.\d+e\d+ | 0.123e4, 5.67e08 | exponential notation | %f +% --------------|------------------|-------------------------------|------------------ +% 0[0-7]+ | 012, 03456, 0700 | octal prefix & notation | %o %i +% --------------|------------------|-------------------------------|------------------ +% 0X[0-9A-F]+ | 0X0, 0XFF, 0X7C4 | hexadecimal prefix & notation | %x %i +% --------------|------------------|-------------------------------|------------------ +% 0B[01]+ | 0B101, 0B0010111 | binary prefix & notation | %b (not SSCANF) +% --------------|------------------|-------------------------------|------------------ +% +% The SSCANF format specifier (including %b) can include literal characters +% and skipped fields. The octal, hexadecimal and binary prefixes are optional. 
+% For more information: http://www.mathworks.com/help/matlab/ref/sscanf.html +% +%% Debugging Output Array %% +% +% The third output is a cell array <dbg>, to check if the numbers have +% been matched by the regular expression <xpr> and converted to numeric +% by the SSCANF format. The rows of <dbg> are linearly indexed from <X>: +% +% [~,~,dbg] = natsort(X) +% dbg = +% 'x' [ 2] +% 'x' [10] +% 'x' [ 1] +% +%% Relative Sort Order %% +% +% The sort order of the number substrings relative to the characters +% can be controlled by providing one of the following string options: +% +% Option Token:| Relative Sort Order: | Example: +% =============|======================================|==================== +% 'beforechar' | numbers < char(0:end) | '1' < '#' < 'A' +% -------------|--------------------------------------|-------------------- +% 'afterchar' | char(0:end) < numbers | '#' < 'A' < '1' +% -------------|--------------------------------------|-------------------- +% 'asdigit' *| char(0:47) < numbers < char(48:end) | '#' < '1' < 'A' +% -------------|--------------------------------------|-------------------- +% +% Note that the digit characters have character values 48 to 57, inclusive. +% +%% Examples %% +% +%%% Multiple integer substrings (e.g. 
release version numbers): +% B = {'v10.6', 'v9.10', 'v9.5', 'v10.10', 'v9.10.20', 'v9.10.8'}; +% sort(B) +% ans = 'v10.10' 'v10.6' 'v9.10' 'v9.10.20' 'v9.10.8' 'v9.5' +% natsort(B) +% ans = 'v9.5' 'v9.10' 'v9.10.8' 'v9.10.20' 'v10.6' 'v10.10' +% +%%% Integer, decimal or Inf number substrings, possibly with +/- signs: +% C = {'test+Inf', 'test11.5', 'test-1.4', 'test', 'test-Inf', 'test+0.3'}; +% sort(C) +% ans = 'test' 'test+0.3' 'test+Inf' 'test-1.4' 'test-Inf' 'test11.5' +% natsort(C, '(-|+)?(Inf|\d+\.?\d*)') +% ans = 'test' 'test-Inf' 'test-1.4' 'test+0.3' 'test11.5' 'test+Inf' +% +%%% Integer or decimal number substrings, possibly with an exponent: +% D = {'0.56e007', '', '4.3E-2', '10000', '9.8'}; +% sort(D) +% ans = '' '0.56e007' '10000' '4.3E-2' '9.8' +% natsort(D, '\d+\.?\d*(E(+|-)?\d+)?') +% ans = '' '4.3E-2' '9.8' '10000' '0.56e007' +% +%%% Hexadecimal number substrings (possibly with '0X' prefix): +% E = {'a0X7C4z', 'a0X5z', 'a0X18z', 'aFz'}; +% sort(E) +% ans = 'a0X18z' 'a0X5z' 'a0X7C4z' 'aFz' +% natsort(E, '(?<=a)(0X)?[0-9A-F]+', '%x') +% ans = 'a0X5z' 'aFz' 'a0X18z' 'a0X7C4z' +% +%%% Binary number substrings (possibly with '0B' prefix): +% F = {'a11111000100z', 'a0B101z', 'a0B000000000011000z', 'a1111z'}; +% sort(F) +% ans = 'a0B000000000011000z' 'a0B101z' 'a11111000100z' 'a1111z' +% natsort(F, '(0B)?[01]+', '%b') +% ans = 'a0B101z' 'a1111z' 'a0B000000000011000z' 'a11111000100z' +% +%%% UINT64 number substrings (with full precision!): +% natsort({'a18446744073709551615z', 'a18446744073709551614z'}, [], '%lu') +% ans = 'a18446744073709551614z' 'a18446744073709551615z' +% +%%% Case sensitivity: +% G = {'a2', 'A20', 'A1', 'a10', 'A2', 'a1'}; +% natsort(G, [], 'ignorecase') % default +% ans = 'A1' 'a1' 'a2' 'A2' 'a10' 'A20' +% natsort(G, [], 'matchcase') +% ans = 'A1' 'A2' 'A20' 'a1' 'a2' 'a10' +% +%%% Sort direction: +% H = {'2', 'a', '3', 'B', '1'}; +% natsort(H, [], 'ascend') % default +% ans = '1' '2' '3' 'a' 'B' +% natsort(H, [], 'descend') +% ans = 
'B' 'a' '3' '2' '1' +% +%%% Relative sort-order of number substrings compared to characters: +% V = num2cell(char(32+randperm(63))); +% cell2mat(natsort(V, [], 'asdigit')) % default +% ans = '!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_' +% cell2mat(natsort(V, [], 'beforechar')) +% ans = '0123456789!"#$%&'()*+,-./:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_' +% cell2mat(natsort(V, [], 'afterchar')) +% ans = '!"#$%&'()*+,-./:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_0123456789' +% +%% Input and Output Arguments %% +% +%%% Inputs (*=default): +% X = CellArrayOfCharRowVectors, to be sorted into natural-order. +% xpr = CharRowVector, regular expression for number substrings, '\d+'*. +% <options> tokens can be entered in any order, as many as required: +% - Sort direction: 'descend'/'ascend'*. +% - Case sensitive/insensitive matching: 'matchcase'/'ignorecase'*. +% - Relative sort of numbers: 'beforechar'/'afterchar'/'asdigit'*. +% - The SSCANF number conversion format, e.g.: '%x', '%i', '%f'*, etc. +% +%%% Outputs: +% Y = CellArrayOfCharRowVectors, <X> sorted into natural-order. +% ndx = NumericArray, such that Y = X(ndx). The same size as <X>. +% dbg = CellArray of the parsed characters and number values. Each row is +% one input char vector, linear-indexed from <X>. To help debug <xpr>. 
+% +% [X,ndx,dbg] = natsort(X,xpr*,<options>) + +%% Input Wrangling %% +% +assert(iscell(X),'First input <X> must be a cell array.') +tmp = cellfun('isclass',X,'char') & cellfun('size',X,1)<2 & cellfun('ndims',X)<3; +assert(all(tmp(:)),'First input <X> must be a cell array of char row vectors (1xN char).') +% +% Regular expression: +if nargin<2 || isnumeric(xpr)&&isempty(xpr) + xpr = '\d+'; +else + assert(ischar(xpr)&&isrow(xpr),'Second input <xpr> must be a regular expression (char row vector).') +end +% +% Optional arguments: +tmp = cellfun('isclass',varargin,'char') & 1==cellfun('size',varargin,1) & 2==cellfun('ndims',varargin); +assert(all(tmp(:)),'All optional arguments must be char row vectors (1xN char).') +% Character case matching: +ChrM = strcmpi(varargin,'matchcase'); +ChrX = strcmpi(varargin,'ignorecase')|ChrM; +% Sort direction: +DrnD = strcmpi(varargin,'descend'); +DrnX = strcmpi(varargin,'ascend')|DrnD; +% Relative sort-order of numbers compared to characters: +RsoB = strcmpi(varargin,'beforechar'); +RsoA = strcmpi(varargin,'afterchar'); +RsoX = strcmpi(varargin,'asdigit')|RsoB|RsoA; +% SSCANF conversion format: +FmtX = ~(ChrX|DrnX|RsoX); +% +if nnz(FmtX)>1 + tmp = sprintf(', ''%s''',varargin{FmtX}); + error('Overspecified optional arguments:%s.',tmp(2:end)) +end +if nnz(DrnX)>1 + tmp = sprintf(', ''%s''',varargin{DrnX}); + error('Sort direction is overspecified:%s.',tmp(2:end)) +end +if nnz(RsoX)>1 + tmp = sprintf(', ''%s''',varargin{RsoX}); + error('Relative sort-order is overspecified:%s.',tmp(2:end)) +end +% +%% Split Strings %% +% +% Split strings into number and remaining substrings: +[MtS,MtE,MtC,SpC] = regexpi(X(:),xpr,'start','end','match','split',varargin{ChrX}); +% +% Determine lengths: +MtcD = cellfun(@minus,MtE,MtS,'UniformOutput',false); +LenZ = cellfun('length',X(:))-cellfun(@sum,MtcD); +LenY = max(LenZ); +LenX = numel(MtC); +% +dbg = cell(LenX,LenY); +NuI = false(LenX,LenY); +ChI = false(LenX,LenY); +ChA = char(double(ChI)); +% +ndx = 
1:LenX; +for k = ndx(LenZ>0) + % Determine indices of numbers and characters: + ChI(k,1:LenZ(k)) = true; + if ~isempty(MtS{k}) + tmp = MtE{k} - cumsum(MtcD{k}); + dbg(k,tmp) = MtC{k}; + NuI(k,tmp) = true; + ChI(k,tmp) = false; + end + % Transfer characters into char array: + if any(ChI(k,:)) + tmp = SpC{k}; + ChA(k,ChI(k,:)) = [tmp{:}]; + end +end +% +%% Convert Number Substrings %% +% +if nnz(FmtX) % One format specifier + fmt = varargin{FmtX}; + err = ['The supplied format results in an empty output from sscanf: ''',fmt,'''']; + pct = '(?<!%)(%%)*%'; % match an odd number of % characters. + [T,S] = regexp(fmt,[pct,'(\d*)([bdiuoxfeg]|l[diuox])'],'tokens','split'); + assert(isscalar(T),'Unsupported optional argument: ''%s''',fmt) + assert(isempty(T{1}{2}),'Format specifier cannot include field-width: ''%s''',fmt) + switch T{1}{3}(1) + case 'b' % binary + fmt = regexprep(fmt,[pct,'(\*?)b'],'$1%$2[01]'); + val = dbg(NuI); + if numel(S{1})<2 || ~strcmpi('0B',S{1}(end-1:end)) + % Remove '0B' if not specified in the format string: + val = regexprep(val,'(0B)?([01]+)','$2','ignorecase'); + end + val = cellfun(@(s)sscanf(s,fmt),val, 'UniformOutput',false); + assert(~any(cellfun('isempty',val)),err) + NuA(NuI) = cellfun(@(s)sum(pow2(s-'0',numel(s)-1:-1:0)),val); + case 'l' % 64-bit + NuA(NuI) = cellfun(@(s)sscanf(s,fmt),dbg(NuI)); %slow! + otherwise % double + NuA(NuI) = sscanf(sprintf('%s\v',dbg{NuI}),[fmt,'\v']); % fast! + end +else % No format specifier -> double + NuA(NuI) = sscanf(sprintf('%s\v',dbg{NuI}),'%f\v'); +end +% Note: NuA's class is determined by SSCANF or the custom binary parser. +NuA(~NuI) = 0; +NuA = reshape(NuA,LenX,LenY); +% +%% Debugging Array %% +% +if nargout>2 + dbg(:) = {''}; + for k = reshape(find(NuI),1,[]) + dbg{k} = NuA(k); + end + for k = reshape(find(ChI),1,[]) + dbg{k} = ChA(k); + end +end +% +%% Sort Columns %% +% +if ~any(ChrM) % ignorecase + ChA = upper(ChA); +end +% +ide = ndx.'; +% From the last column to the first... 
+for n = LenY:-1:1 + % ...sort the characters and number values: + [C,idc] = sort(ChA(ndx,n),1,varargin{DrnX}); + [~,idn] = sort(NuA(ndx,n),1,varargin{DrnX}); + % ...keep only relevant indices: + jdc = ChI(ndx(idc),n); % character + jdn = NuI(ndx(idn),n); % number + jde = ~ChI(ndx,n)&~NuI(ndx,n); % empty + % ...define the sort-order of numbers and characters: + jdo = any(RsoA)|(~any(RsoB)&C<'0'); + % ...then combine these indices in the requested direction: + if any(DrnD) % descending + idx = [idc(jdc&~jdo);idn(jdn);idc(jdc&jdo);ide(jde)]; + else % ascending + idx = [ide(jde);idc(jdc&jdo);idn(jdn);idc(jdc&~jdo)]; + end + ndx = ndx(idx); +end +% +ndx = reshape(ndx,size(X)); +X = X(ndx); +% +end +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%natsort \ No newline at end of file diff --git a/natsort-FEX/natsortfiles.m b/natsort-FEX/natsortfiles.m new file mode 100644 index 0000000000000000000000000000000000000000..858bcf36eb959dd41d26fa96d428b088a07e42f8 --- /dev/null +++ b/natsort-FEX/natsortfiles.m @@ -0,0 +1,169 @@ +function [X,ndx,dbg] = natsortfiles(X,varargin) +% Alphanumeric / Natural-Order sort of a cell array of filename/filepath strings (1xN char). +% +% (c) 2012 Stephen Cobeldick +% +% Alphanumeric sort of a cell array of filenames or filepaths: sorts by +% character order and also by the values of any numbers that are within +% the names. Filenames, file-extensions, and directories (if supplied) +% are split apart and are sorted separately: this ensures that shorter +% filenames sort before longer ones (i.e. thus giving a dictionary sort). +% +%%% Example: +% D = 'C:\Test'; +% S = dir(fullfile(D,'*.txt')); +% N = natsortfiles({S.name}); +% for k = 1:numel(N) +% fullfile(D,N{k}) +% end +% +%%% Syntax: +% Y = natsortfiles(X) +% Y = natsortfiles(X,xpr) +% Y = natsortfiles(X,xpr,<options>) +% [Y,ndx] = natsortfiles(X,...) +% [Y,ndx,dbg] = natsortfiles(X,...) 
+% +% To sort all of the strings in a cell array use NATSORT (File Exchange 34464). +% To sort the rows of a cell array of strings use NATSORTROWS (File Exchange 47433). +% +% See also NATSORT NATSORTROWS SORT CELLSTR IREGEXP REGEXP SSCANF DIR FILEPARTS FULLFILE +% +%% File Dependency %% +% +% NATSORTFILES requires the function NATSORT (File Exchange 34464). The inputs +% <xpr> and <options> are passed directly to NATSORT: see NATSORT for case +% sensitivity, sort direction, numeric substring matching, and other options. +% +%% Explanation %% +% +% Using SORT on filenames will sort any of char(0:45), including the printing +% characters ' !"#$%&''()*+,-', before the file extension separator character '.'. +% Therefore this function splits the name and extension and sorts them separately. +% +% Similarly the file separator character within filepaths can cause longer +% directory names to sort before shorter ones, as char(0:46)<'/' and char(0:91)<'\'. +% NATSORTFILES splits filepaths at each file separator character and sorts +% every level of the directory hierarchy separately, ensuring that shorter +% directory names sort before longer, regardless of the characters in the names. 
+% +%% Examples %% +% +% A = {'a2.txt', 'a10.txt', 'a1.txt'}; +% sort(A) +% ans = 'a1.txt' 'a10.txt' 'a2.txt' +% natsortfiles(A) +% ans = 'a1.txt' 'a2.txt' 'a10.txt' +% +% B = {'test_new.m'; 'test-old.m'; 'test.m'}; +% sort(B) % Note '-' sorts before '.': +% ans = +% 'test-old.m' +% 'test.m' +% 'test_new.m' +% natsortfiles(B) % Shorter names before longer (dictionary sort): +% ans = +% 'test.m' +% 'test-old.m' +% 'test_new.m' +% +% C = {'test2.m'; 'test10-old.m'; 'test.m'; 'test10.m'; 'test1.m'}; +% sort(C) % Wrong numeric order: +% ans = +% 'test.m' +% 'test1.m' +% 'test10-old.m' +% 'test10.m' +% 'test2.m' +% natsortfiles(C) % Correct numeric order, shorter names before longer: +% ans = +% 'test.m' +% 'test1.m' +% 'test2.m' +% 'test10.m' +% 'test10-old.m' +% +%%% Directory Names: +% D = {'A2-old\test.m';'A10\test.m';'A2\test.m';'A1archive.zip';'A1\test.m'}; +% sort(D) % Wrong numeric order, and '-' sorts before '\': +% ans = +% 'A10\test.m' +% 'A1\test.m' +% 'A1archive.zip' +% 'A2-old\test.m' +% 'A2\test.m' +% natsortfiles(D) % Shorter names before longer (dictionary sort): +% ans = +% 'A1archive.zip' +% 'A1\test.m' +% 'A2\test.m' +% 'A2-old\test.m' +% 'A10\test.m' +% +%% Input and Output Arguments %% +% +% See NATSORT for a full description of <xpr> and the <options>. +% +%%% Inputs (*=default): +% X = CellArrayOfCharRowVectors, with filenames or filepaths to be sorted. +% xpr = CharRowVector, regular expression to detect numeric substrings, '\d+'*. +% <options> can be supplied in any order and are passed directly to NATSORT. +% +%%% Outputs: +% Y = CellArrayOfCharRowVectors, filenames of <X> sorted into natural-order. +% ndx = NumericMatrix, same size as <X>. Indices such that Y = X(ndx). +% dbg = CellVectorOfCellArrays, size 1xMAX(2+NumberOfDirectoryLevels). +% Each cell contains the debug cell array for directory names, +% filenames, or file extensions. To help debug <xpr>. See NATSORT. 
+% +% [Y,ndx,dbg] = natsortfiles(X,*xpr,<options>) + +%% Input Wrangling %% +% +assert(iscell(X),'First input <X> must be a cell array.') +tmp = cellfun('isclass',X,'char') & cellfun('size',X,1)<2 & cellfun('ndims',X)<3; +assert(all(tmp(:)),'First input <X> must be a cell array of strings (1xN character).') +% +%% Split and Sort File Names/Paths %% +% +% Split full filepaths into file [path,name,extension]: +[pth,fnm,ext] = cellfun(@fileparts,X(:),'UniformOutput',false); +% Split path into {dir,subdir,subsubdir,...}: +pth = regexp(pth,'[^/\\]+','match'); % either / or \ as filesep. +len = cellfun('length',pth); +num = max(len); +vec{numel(len)} = []; +% +% Natural-order sort of the file extensions and filenames: +if nargout<3 % faster: + [~,ndx] = natsort(ext,varargin{:}); + [~,ids] = natsort(fnm(ndx),varargin{:}); +else % for debugging: + [~,ndx,dbg{num+2}] = natsort(ext,varargin{:}); + [~,ids,tmp] = natsort(fnm(ndx),varargin{:}); + [~,idd] = sort(ndx); + dbg{num+1} = tmp(idd,:); +end +ndx = ndx(ids); +% +% Natural-order sort of the directory names: +for k = num:-1:1 + idx = len>=k; + vec(:) = {''}; + vec(idx) = cellfun(@(c)c(k),pth(idx)); + if nargout<3 % faster: + [~,ids] = natsort(vec(ndx),varargin{:}); + else % for debugging: + [~,ids,tmp] = natsort(vec(ndx),varargin{:}); + [~,idd] = sort(ndx); + dbg{k} = tmp(idd,:); + end + ndx = ndx(ids); +end +% +% Return the sorted array and indices: +ndx = reshape(ndx,size(X)); +X = X(ndx); +% +end +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%natsortfiles \ No newline at end of file diff --git a/natsort-FEX/natsortfiles_doc.m b/natsort-FEX/natsortfiles_doc.m new file mode 100644 index 0000000000000000000000000000000000000000..d57a04190fc9b60d542bee29148934795f583ccf --- /dev/null +++ b/natsort-FEX/natsortfiles_doc.m @@ -0,0 +1,109 @@ +%% NATSORTFILES Examples +% The function <https://www.mathworks.com/matlabcentral/fileexchange/47434 +% |NATSORTFILES|> sorts a cell array of filenames or 
filepaths (1xN char), +% taking into account any number values within the strings. This is known +% as a _natural order sort_ or an _alphanumeric sort_. Note that MATLAB's +% inbuilt <http://www.mathworks.com/help/matlab/ref/sort.html |SORT|> function +% sorts the character codes only (as does |SORT| in most programming languages). +% +% |NATSORTFILES| is not a naive natural-order sort, but splits and sorts +% filenames and file extensions separately, which means that |NATSORTFILES| +% sorts shorter filenames before longer ones: this is known as a _dictionary +% sort_. For the same reason filepaths are split at every path-separator +% character (either |'\'| or |'/'|), and each directory level is sorted +% separately. See the "Explanation" sections below for more details. +% +% For sorting the rows of a cell array of strings use +% <https://www.mathworks.com/matlabcentral/fileexchange/47433 |NATSORTROWS|>. +% +% For sorting a cell array of strings use +% <https://www.mathworks.com/matlabcentral/fileexchange/34464 |NATSORT|>. +% +%% Basic Usage +% By default |NATSORTFILES| interprets consecutive digits as being part of +% a single integer; each number is considered to be as wide as one letter: +A = {'a2.txt', 'a10.txt', 'a1.txt'}; +sort(A) +natsortfiles(A) +%% Output 2: Sort Index +% The second output argument is a numeric array of the sort indices |ndx|, +% such that |Y = X(ndx)| where |Y = natsortfiles(X)|: +[~,ndx] = natsortfiles(A) +%% Output 3: Debugging Array +% The third output is a cell vector of cell arrays, where each cell array +% contains individual characters and numbers (after converting to numeric). +% This is useful for confirming that the numbers are being correctly +% identified by the regular expression. The cells of the cell vector +% correspond to the split directories, filenames, and file extensions. 
+% Note that the rows of the debugging cell arrays are +% <https://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html +% linearly indexed> from the input cell array. +[~,~,dbg] = natsortfiles(A); +dbg{:} +%% Example with DIR and a Cell Array +% One common situation is to use <https://www.mathworks.com/help/matlab/ref/dir.html +% |DIR|> to identify files in a folder, sort them into the correct order, +% and then loop over them: below is an example of how to do this. +% Remember to <https://www.mathworks.com/help/matlab/matlab_prog/preallocating-arrays.html +% preallocate> all output arrays before the loop! +D = 'natsortfiles_test'; % directory path +S = dir(fullfile(D,'*.txt')); % get list of files in directory +N = natsortfiles({S.name}); % sort file names into order +for k = 1:numel(N) + disp(fullfile(D,N{k})) +end +%% Example with DIR and a Structure +% Users who need to access the |DIR| structure fields can use |NATSORTFILES|'s +% second output to sort |DIR|'s output structure into the correct order: +D = 'natsortfiles_test'; % directory path +S = dir(fullfile(D,'*.txt')); % get list of files in directory +[~,ndx] = natsortfiles({S.name}); % indices of correct order +S = S(ndx); % sort structure using indices +for k = 1:numel(S) + fprintf('%-13s%s\n',S(k).name,S(k).date) +end +%% Explanation: Dictionary Sort +% Filenames and file extensions are separated by the extension separator: +% the period character |'.'|. Using a normal |SORT| the period gets sorted +% _after_ all of the characters from 0 to 45 (including |!"#$%&'()*+,-|, +% the space character, and all of the control characters, e.g. newlines, +% tabs, etc). This means that a naive |SORT| or natural-order sort will +% sort some short filenames after longer filenames. 
In order to provide +% the correct dictionary sort, with shorter filenames first, |NATSORTFILES| +% splits and sorts filenames and file extensions separately: +B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'; 'test.bbb.m'}; +sort(B) % '-' sorts before '.' +natsort(B) % '-' sorts before '.' +natsortfiles(B) % correct dictionary sort +%% Explanation: Filenames +% |NATSORTFILES| combines a dictionary sort with a natural-order sort, so +% that the number values within the filenames are taken into consideration: +C = {'test2.m'; 'test10-old.m'; 'test.m'; 'test10.m'; 'test1.m'}; +sort(C) % Wrong numeric order. +natsort(C) % Correct numeric order, but longer before shorter. +natsortfiles(C) % Correct numeric order and dictionary sort. +%% Explanation: Filepaths +% For the same reason, filepaths are split at each file path separator +% character (both |'/'| and |'\'| are considered to be file path separators) +% and every level of directory names is sorted separately. This ensures +% that the directory names are sorted with a dictionary sort and that any +% numbers are taken into consideration: +D = {'A2-old\test.m';'A10\test.m';'A2\test.m';'AXarchive.zip';'A1\test.m'}; +sort(D) % Wrong numeric order, and '-' sorts before '\': +natsort(D) % correct numeric order, but longer before shorter. +natsortfiles(D) % correct numeric order and dictionary sort. +%% Regular Expression: Decimal Numbers, E-notation, +/- Sign +% |NATSORTFILES| is a wrapper for |NATSORT|, which means all of |NATSORT|'s +% options are also supported. In particular the number recognition can be +% customized to detect numbers with decimal digits, E-notation, a +/- sign, +% or other specific features. This detection is defined by providing an +% appropriate regular expression: see |NATSORT| for details and examples. 
+E = {'test24.csv','test1.8.csv','test5.csv','test3.3.csv','test12.csv'}; +natsortfiles(E,'\d+\.?\d*') +%% Regular Expression: Interactive Regular Expression Tool +% Regular expressions are powerful and compact, but getting them right is +% not always easy. One option is to download my interactive tool +% <https://www.mathworks.com/matlabcentral/fileexchange/48930 |IREGEXP|>, +% which lets you quickly try different regular expressions and see all of +% <https://www.mathworks.com/help/matlab/ref/regexp.html |REGEXP|>'s +% outputs displayed and updated as you type. \ No newline at end of file
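The dictionary-sort explanation in natsortfiles_doc.m can be reproduced with built-in functions only. The sketch below is not how NATSORTFILES is implemented (it does no natural-order number handling and ignores directories); it only illustrates, for names that share one extension, why sorting the name part separately from the extension puts the shorter name first. The variable names `plain`, `fnm` and `split` are hypothetical and chosen for this illustration.

```matlab
% Illustration only: built-in SORT, FILEPARTS and CELLFUN, no NATSORTFILES.
B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'};

% Plain SORT compares character codes, and '-' (45) sorts before '.' (46),
% so the longer name 'test-aaa.m' lands ahead of 'test.m':
plain = sort(B);
% plain = {'test-aaa.m'; 'test.m'; 'test_ccc.m'}

% Split off the extension and sort on the name part alone (valid as a
% demonstration here because every file has the same extension '.m'):
[~,fnm] = cellfun(@fileparts,B,'UniformOutput',false);
[~,ndx] = sort(fnm);   % 'test' is a prefix of the other names, so it sorts first
split = B(ndx);
% split = {'test.m'; 'test-aaa.m'; 'test_ccc.m'}
```

With mixed extensions a second sort key would be needed, which is what NATSORTFILES provides by sorting the extensions, the filenames, and then each directory level in turn.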