Extract file extension: Difference between revisions

Revision as of 23:24, 22 April 2016

Write a program that takes one string argument representing the path to a file and returns the file's extension, or the null string if the file path has no extension. An extension appears after the last period in the file name and consists of one or more letters or numbers.

Show here the action of your routine on the following examples:

picture.jpg returns .jpg
http://mywebsite.com/picture/image.png returns .png
myuniquefile.longextension returns .longextension
IAmAFileWithoutExtension returns an empty string ""
/path/to.my/file returns an empty string as the period is in the directory name rather than the file
file.odd_one returns an empty string, as an extension (by this definition) cannot contain an underscore.
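
As a point of reference for the examples above, here is a minimal Python sketch of the task definition. It assumes that "letters or numbers" means ASCII letters and digits and that the returned value includes the leading period, as in the examples; the name `extract_extension` is made up for this illustration and does not come from any entry on this page.

```python
import re

# Assumption: an extension is a period followed by one or more ASCII
# letters or digits, sitting at the very end of the path.
_EXTENSION = re.compile(r"\.[A-Za-z0-9]+\Z")


def extract_extension(path: str) -> str:
    """Return the extension including its leading period, or "" if there is none."""
    match = _EXTENSION.search(path)
    return match.group(0) if match else ""


if __name__ == "__main__":
    examples = (
        "picture.jpg",
        "http://mywebsite.com/picture/image.png",
        "myuniquefile.longextension",
        "IAmAFileWithoutExtension",
        "/path/to.my/file",
        "file.odd_one",
    )
    for example in examples:
        print(f"{example!r} -> {extract_extension(example)!r}")
```

Anchoring the pattern at the end of the string is what rejects /path/to.my/file and file.odd_one: the text after the last period contains a character outside the allowed set, so nothing matches.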

ALGOL 68

Works with: ALGOL 68G version Any - tested with release 2.8.win32

<lang algol68># extracts a file-extension from the end of a pathname. The file extension is #
# defined as a dot followed by one or more letters or digits                   #

OP EXTENSION = ( STRING pathname )STRING:

   IF LWB pathname >= UPB pathname THEN
       # the pathname has 0 or 1 characters and so has no extension          #
       ""
   ELIF NOT isalnum( pathname[ UPB pathname ] ) THEN
       # the final character is not a letter or digit - no extension         #
       ""
   ELSE
       # could have an extension                                             #
       INT    pos := UPB pathname;
       WHILE pos > LWB pathname AND isalnum( pathname[ pos ] ) DO
           pos -:= 1
       OD;
       IF pathname[ pos ] = "." THEN
           # the character before the letters and digits was a "."           #
           pathname[ pos : ]
       ELSE
           # no "." before the letters and digits - no extension             #
           ""
       FI
   FI ; # EXTENSION #

# test the EXTENSION operator #

PROC test extension = ( STRING pathname, STRING expected extension )VOID:

   BEGIN
       STRING extension = EXTENSION pathname;
       write( ( ( pathname
                + " got extension: ("
                + extension
                + ") "
                + IF extension = expected extension THEN "" ELSE "NOT" FI
                + " as expected"
                )
              , newline
              )
            )
   END ; # test extension #

main:
( test extension( "picture.jpg",                            ".jpg"           )
; test extension( "http://mywebsite.com/picture/image.png", ".png"           )
; test extension( "myuniquefile.longextension",             ".longextension" )
; test extension( "IAmAFileWithoutExtension",               ""               )
; test extension( "/path/to.my/file",                       ""               )
; test extension( "file.odd_one",                           ""               )
)</lang>

Output:

picture.jpg got extension: (.jpg)  as expected
http://mywebsite.com/picture/image.png got extension: (.png)  as expected
myuniquefile.longextension got extension: (.longextension)  as expected
IAmAFileWithoutExtension got extension: ()  as expected
/path/to.my/file got extension: ()  as expected
file.odd_one got extension: ()  as expected
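
The backward scan used by the EXTENSION operator above (walk left over letters and digits, then check that the scan stopped on a period) can be sketched in Python as follows. This is only an illustration under the assumption that ASCII letters and digits are meant; the function name `extension_by_backward_scan` is hypothetical.

```python
import string

# Assumption: only ASCII letters and digits may appear in an extension.
_ALNUM = set(string.ascii_letters + string.digits)


def extension_by_backward_scan(pathname: str) -> str:
    """Scan left from the end over letters/digits, then look for a period."""
    pos = len(pathname) - 1
    while pos >= 0 and pathname[pos] in _ALNUM:
        pos -= 1
    scanned = len(pathname) - 1 - pos  # how many letters/digits were skipped
    if scanned > 0 and pos >= 0 and pathname[pos] == ".":
        # the character before the letters and digits was a "."
        return pathname[pos:]
    # nothing scanned, or no "." directly before the letters and digits
    return ""
```

No regular expression is needed, and the pathname is traversed at most once from the right.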

ALGOL W

<lang algolw>begin

   % extracts a file-extension from the end of a pathname.                   %
   % The file extension is defined as a dot followed by one or more letters  %
   % or digits. As Algol W only has fixed length strings we limit the        %
   % extension to 32 characters and the pathname to 256 (the longest string  %
   % allowed by Algol W)                                                     %
   string(32) procedure extension( string(256) value pathname ) ;
   begin

       integer pathPos;

       % position to the previous character in the pathname                  %
       procedure prev         ; pathPos := pathPos - 1;
       % get the character at pathPos from pathname                          %
       string(1) procedure ch ; pathname( pathPos // 1 );
       % checks for a letter or digit - assumes the letters are contiguous   %
       % in the character set - not true for EBCDIC                          %
       logical   procedure isLetterOrDigit( string(1) value c ) ;
                 ( c <= "z" and c >= "a" ) or ( c <= "Z" and c >= "A" )
                                           or ( c <= "9" and c >= "0" ) ;

       % find the length of the pathname with trailing blanks removed        %
       pathPos := 255;
       while pathPos >= 0 and ch = " " do prev;

       % extract the extension if possible                                   %
       if pathPos <= 0
       then ""       % no extension: 0 or 1 character pathname               %
       else if not isLetterOrDigit( ch )
       then ""       % no extension: last character not a letter/digit       %
       else begin
           while pathPos > 0 and isLetterOrDigit( ch ) do prev;
           if ch not = "."
           then ""   % no extension: letters/digits not preceded by "."      %
           else begin
               % have an extension                                           %
               string(32) ext;
               ext := " ";
               % algol W substring lengths must be compile-time constants    %
               % hence the loop to copy the extension characters             %
               for charPos := 0 until 31 do begin
                   if pathPos <= 255 then begin
                       ext( charPos // 1 ) := pathname( pathPos // 1 );
                       pathPos := pathPos + 1
                   end
               end for_charPos ;
               ext
           end
       end

   end extension ;

   % test the extension procedure                                            %
   procedure testExtension( string(256) value pathname
                          ; string(32)  value expectedExtension
                          ) ;
   begin
       string(32) ext;
       ext := extension( pathname );
       write( pathname( 0 // 40 )
            , " -> ("
            , ext( 0 // 16 )
            , ") "
            , if ext = expectedExtension then "" else "NOT"
            , " as expected"
            )
   end ; % test extension %
              
   testExtension( "picture.jpg",                            ".jpg"           );
   testExtension( "http://mywebsite.com/picture/image.png", ".png"           );
   testExtension( "myuniquefile.longextension",             ".longextension" );
   testExtension( "IAmAFileWithoutExtension",               ""               );
   testExtension( "/path/to.my/file",                       ""               );
   testExtension( "file.odd_one",                           ""               );

end.</lang>

Output:

picture.jpg                              -> (.jpg            )  as expected
http://mywebsite.com/picture/image.png   -> (.png            )  as expected
myuniquefile.longextension               -> (.longextension  )  as expected
IAmAFileWithoutExtension                 -> (                )  as expected
/path/to.my/file                         -> (                )  as expected
file.odd_one                             -> (                )  as expected

AWK

<lang AWK>
# syntax: GAWK -f EXTRACT_FILE_EXTENSION.AWK

BEGIN {

   arr[++i] = "picture.jpg"
   arr[++i] = "http://mywebsite.com/picture/image.png"
   arr[++i] = "myuniquefile.longextension"
   arr[++i] = "IAmAFileWithoutExtension"
   arr[++i] = "/path/to.my/file"
   arr[++i] = "file.odd_one"
   for (j=1; j<=i; j++) {
     printf("%-40s '%s'\n",arr[j],extract_ext(arr[j]))
   }
   exit(0)

}
function extract_ext(fn,  sep1,sep2,tmp) {

   while (fn ~ (sep1 = ":|\\\\|\\/")) { # ":" or "\" or "/"
     fn = substr(fn,match(fn,sep1)+1)
   }
   while (fn ~ (sep2 = "\\.")) { # "."
     fn = substr(fn,match(fn,sep2)+1)
     tmp = 1
   }
   if (fn ~ /_/ || tmp == 0) {
     return("")
   }
   return(fn)

} </lang>

Output:

picture.jpg                              'jpg'
http://mywebsite.com/picture/image.png   'png'
myuniquefile.longextension               'longextension'
IAmAFileWithoutExtension                 ''
/path/to.my/file                         ''
file.odd_one                             ''

C

<lang c>#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <stdio.h>

/* Returns a pointer to the extension of 'string'. If no extension is found,
 * then returns a pointer to the null-terminator of 'string'. */

char* file_ext(const char *string) {

   assert(string != NULL);
   char *ext = strrchr(string, '.');

   if (ext == NULL)
       return (char*) string + strlen(string);

   for (char *iter = ext + 1; *iter != '\0'; iter++) {
       if (!isalnum((unsigned char) *iter))
           return (char*) string + strlen(string);
   }

   return ext + 1;

}

int main(void) {

   const char *strings[] = {
       "picture.jpg",
       "http://mywebsite.con/picture/image.png",
       "myuniquefile.longextension",
       "IAmAFileWithoutExtension",
       "/path/to.my/file",
       "file.odd_one"
   };

   for (size_t i = 0; i < sizeof(strings) / sizeof(strings[0]); ++i) {
       printf("'%s' - '%s'\n", strings[i], file_ext(strings[i]));
   }

} </lang>

Output:

'picture.jpg' - 'jpg'
'http://mywebsite.con/picture/image.png' - 'png'
'myuniquefile.longextension' - 'longextension'
'IAmAFileWithoutExtension' - ''
'/path/to.my/file' - ''
'file.odd_one' - ''

C++

<lang cpp>#include <string>
#include <algorithm>
#include <iostream>
#include <vector>

std::string findExtension ( const std::string & filename ) {
   auto position = filename.find_last_of ( '.' ) ;
   if ( position == std::string::npos ) 
      return "" ;
   else {
      std::string extension ( filename.substr( position ) ) ;
      position = extension.find( '_' ) ;
      auto pos2 = extension.find( '/' ) ;
      if (( position != std::string::npos ) || ( pos2 != std::string::npos ))
         return "" ;
      else
         return extension ;
   }
}

int main( ) {

  std::vector<std::string> filenames {"picture.jpg" , "http://mywebsite.com/picture/image.png" ,
     "myuniquefile.longextension" , "IAmAFileWithoutExtension" , "/path/to.my/file" ,
     "file.odd_one" } ;
  std::vector<std::string> extensions( filenames.size( ) ) ;
  std::transform( filenames.begin( ) , filenames.end( ) , extensions.begin( ) , findExtension ) ;
  for ( int i = 0 ; i < filenames.size( ) ; i++ ) 
     std::cout << filenames[i] << " has extension : " << extensions[i] << " !\n" ;
  return 0 ;

} </lang>

Output:

picture.jpg has extension : .jpg !
http://mywebsite.com/picture/image.png has extension : .png !
myuniquefile.longextension has extension : .longextension !
IAmAFileWithoutExtension has extension :  !
/path/to.my/file has extension :  !
file.odd_one has extension :  !

C#

<lang C#>public static string ExtractExtension(string str)
{
    // Scan backwards from the end of the path. Every character after the
    // last '.' must be a letter or digit for the extension to be valid.
    for (int i = str.Length - 1; i >= 0; i--)
    {
        char c = str[i];
        if (c == '.')
        {
            // A trailing '.' has nothing after it, so there is no extension.
            return i == str.Length - 1 ? "" : str.Substring(i);
        }
        if (!char.IsLetterOrDigit(c))
        {
            // Hit '/', '_' or another invalid character before any '.'.
            return "";
        }
    }

    // No '.' anywhere in the path.
    return "";
}</lang>

Emacs Lisp

<lang Lisp>(file-name-extension "foo.txt") => "txt"</lang>

A missing extension (nil) is distinguished from an empty extension (""), but wrapping the call in (or ... "") gives "" for both if desired.

<lang Lisp>(file-name-extension "foo.") => ""
(file-name-extension "foo")  => nil</lang>

An Emacs backup suffix ~ or .~NUM~ is not part of the extension, but otherwise any characters are allowed.

<lang Lisp>(file-name-extension "foo.txt~")        => "txt"
(file-name-extension "foo.txt.~1.234~") => "txt"</lang>

Forth

<lang forth>: invalid? ( c -- f )

  toupper dup [char] A [char] Z 1+ within
  swap [char] 0 [char] 9 1+ within or 0= ;

: extension ( addr1 u1 -- addr2 u2 )

  dup 0= if exit then
  2dup over +
  begin 1- 2dup <= while dup c@ invalid? until then
  \ no '.' found
  2dup - 0> if 2drop dup /string exit then
  \ invalid char
  dup c@ [char] . <> if 2drop dup /string exit then
  swap -
  \ '.' is last char
  2dup 1+ = if drop dup then
  /string ;

: type.quoted ( addr u -- )

  [char] ' emit type [char] ' emit ;

: test ( addr u -- )

  2dup type.quoted ."  => " extension type.quoted cr ;

: tests

  s" picture.jpg" test
  s" http://mywebsite.com/picture/image.png" test
  s" myuniquefile.longextension" test
  s" IAmAFileWithoutExtension" test
  s" /path/to.my/file" test
  s" file.odd_one" test
  s" IDontHaveAnExtension." test ;</lang>

Output:

cr tests
'picture.jpg' => '.jpg'
'http://mywebsite.com/picture/image.png' => '.png'
'myuniquefile.longextension' => '.longextension'
'IAmAFileWithoutExtension' => ''
'/path/to.my/file' => ''
'file.odd_one' => ''
'IDontHaveAnExtension.' => ''
 ok

Go

<lang go>package main

import (
	"fmt"
	"path"
)

// An exact copy of `path.Ext` from Go 1.4.2 for reference:
func Ext(path string) string {
	for i := len(path) - 1; i >= 0 && path[i] != '/'; i-- {
		if path[i] == '.' {
			return path[i:]
		}
	}
	return ""
}

// A variation that handles the extra non-standard requirement
// that extensions shall only "consists of one or more letters or numbers".
//
// Note, instead of direct comparison with '0-9a-zA-Z' we could instead use:
//	case !unicode.IsLetter(rune(b)) && !unicode.IsNumber(rune(b)):
//		return ""
// even though this operates on bytes instead of Unicode code points (runes),
// it is still correct given the details of UTF-8 encoding.
func ext(path string) string {
	for i := len(path) - 1; i >= 0; i-- {
		switch b := path[i]; {
		case b == '.':
			return path[i:]
		case '0' <= b && b <= '9':
		case 'a' <= b && b <= 'z':
		case 'A' <= b && b <= 'Z':
		default:
			return ""
		}
	}
	return ""
}

func main() {
	tests := []string{
		"picture.jpg",
		"http://mywebsite.com/picture/image.png",
		"myuniquefile.longextension",
		"IAmAFileWithoutExtension",
		"/path/to.my/file",
		"file.odd_one",
		// Extra, with unicode
		"café.png",
		"file.resumé",
		// with unicode combining characters
		"cafe\u0301.png",
		"file.resume\u0301",
	}
	for _, str := range tests {
		std := path.Ext(str)
		custom := ext(str)
		fmt.Printf("%38s\t→ %-8q", str, custom)
		if custom != std {
			fmt.Printf("(Standard: %q)", std)
		}
		fmt.Println()
	}
}</lang>

Output:

                           picture.jpg	→ ".jpg"  
http://mywebsite.com/picture/image.png	→ ".png"  
            myuniquefile.longextension	→ ".longextension"
              IAmAFileWithoutExtension	→ ""      
                      /path/to.my/file	→ ""      
                          file.odd_one	→ ""      (Standard: ".odd_one")
                              café.png	→ ".png"  
                           file.resumé	→ ""      (Standard: ".resumé")
                             café.png	→ ".png"  
                          file.resumé	→ ""      (Standard: ".resumé")
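
The comment in the Go entry observes that scanning bytes rather than runes is still correct for UTF-8. A short Python sketch of why: every byte of a multi-byte UTF-8 sequence is 0x80 or above, so it can never be mistaken for an ASCII period, letter or digit. The function name `ext_bytes` is hypothetical, and unlike the Go `ext` this sketch also rejects a bare trailing period.

```python
def ext_bytes(path: str) -> str:
    """Byte-wise backward scan over the UTF-8 encoding of `path`."""
    data = path.encode("utf-8")
    for i in range(len(data) - 1, -1, -1):
        b = data[i]
        if b == ord("."):
            if i == len(data) - 1:
                return ""  # a trailing period has no extension characters
            # Everything after this period was ASCII, so the slice starts
            # on a character boundary and decodes cleanly.
            return data[i:].decode("utf-8")
        is_ascii_alnum = (
            ord("0") <= b <= ord("9")
            or ord("a") <= b <= ord("z")
            or ord("A") <= b <= ord("Z")
        )
        if not is_ascii_alnum:
            # Covers '/', '_' and every byte >= 0x80 of a multi-byte sequence.
            return ""
    return ""
```

Both the precomposed and the combining-character forms of "resumé" end in bytes above 0x7F, which is why they yield an empty extension here, just as in the Go output.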

Haskell

<lang Haskell>module FileExtension
   where

myextension :: String -> String
myextension s
   |not $ elem '.' s = ""
   |elem '/' extension || elem '_' extension = ""
   |otherwise = '.' : extension
      where
         extension = reverse ( takeWhile ( /= '.' ) $ reverse s ) </lang>

Output:

map myextension ["picture.jpg" , "http://mywebsite.com/picture/image.png" , "myuniquefile.longextension" ,
                      "IAmAFileWithoutExtension" , "/path/to.my/file" , "file.odd_one"]
[".jpg",".png",".longextension","","",""]

J

Implementation:

<lang J>require'regex'
ext=: '[.][a-zA-Z0-9]+$'&rxmatch ;@rxfrom ]</lang>

Obviously most of the work here is done by the regex implementation (pcre, if that matters - and this particular kind of expression tends to be a bit more concise expressed in perl than in J...).

Perhaps of interest is that this is an example of a J fork: here we have three verbs separated by spaces. Unlike a Unix system fork (which spins up a child process that is an almost exact clone of the currently running process), a J fork is three independently defined verbs. The two verbs on the edges get the fork's argument, and the verb in the middle combines those two results.

The left verb uses rxmatch to find the beginning position of the match and its length. The right verb is the identity function. The middle verb extracts the desired characters from the original argument. (For a non-match, the length of the "match" is zero so the empty string is extracted.)
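
For readers unfamiliar with J, the fork described above can be imitated in Python with a small higher-order function. This is only an analogy: `fork`, `rx_match`, `rx_from` and `identity` are names invented for this sketch, and the `(-1, 0)` result for a non-match is an assumed convention that merely mirrors the "zero length" behaviour mentioned in the text.

```python
import re
from typing import Callable, Tuple


def fork(left: Callable, middle: Callable, right: Callable) -> Callable:
    """Apply `left` and `right` to the argument, then combine with `middle`."""
    return lambda y: middle(left(y), right(y))


def rx_match(text: str) -> Tuple[int, int]:
    """Return (start, length) of the extension match, or (-1, 0) if none."""
    m = re.search(r"[.][a-zA-Z0-9]+\Z", text)
    return (m.start(), m.end() - m.start()) if m else (-1, 0)


def rx_from(span: Tuple[int, int], text: str) -> str:
    """Extract the characters described by `span` from `text`."""
    start, length = span
    return text[start:start + length] if length else ""


def identity(text: str) -> str:
    return text


# left verb finds the match, right verb passes the argument through,
# middle verb extracts the matched characters
ext = fork(rx_match, rx_from, identity)
```

Calling `ext('picture.jpg')` feeds the same string to both outer functions and hands their two results to `rx_from`, which is the shape of the J definition.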

Alternative non-regex implementation:

<lang J>ext=: #~ [: +./\ e.&'.' *. [: -. [: +./\. -.@e.&('.',AlphaNum_j_)</lang>

Task examples:

<lang J>   ext 'picture.jpg'
.jpg
   ext 'http://mywebsite.com/picture/image.png'
.png
   Examples=: 'picture.jpg';'http://mywebsite.com/picture/image.png';'myuniquefile.longextension';'IAmAFileWithoutExtension';'/path/to.my/file';'file.odd_one'
   ext each Examples
┌────┬────┬──────────────┬┬┬┐
│.jpg│.png│.longextension││││
└────┴────┴──────────────┴┴┴┘</lang>

Java

<lang java>public class Test {

   public static void main(String[] args) {
       String[] filenames = {"picture.jpg", "http://mywebsite.con/picture/image.png",
           "myuniquefile.longextension", "IAmAFileWithoutExtension", "/path/to.my/file",
           "file.odd_one"};

       for (String filename : filenames) {
           String ext = "null";
           int idx = filename.lastIndexOf('.');
           if (idx != -1) {
               String tmp = filename.substring(idx);
               if (tmp.matches("\\.[a-zA-Z0-9]+")) {
                   ext = tmp;
               }
           }
           System.out.println(filename + " -> " + ext);
       }
   }

}</lang>

Output:

picture.jpg -> .jpg
http://mywebsite.con/picture/image.png -> .png
myuniquefile.longextension -> .longextension
IAmAFileWithoutExtension -> null
/path/to.my/file -> null
file.odd_one -> null

jq

Pending resolution of the inconsistency in the task description as of this writing, the following definitions exclude the delimiting period.

In the first section, a version intended for jq version 1.4 is presented. A simpler definition using "match", a regex feature of subsequent versions of jq, is then given.

Works with: jq version 1.4

<lang jq>def file_extension:

 def alphanumeric: explode | unique
 | reduce .[] as $i
     (true;
      if . then $i | (97 <= . and . <= 122) or (65 <= . and . <= 90) or (48 <= . and . <= 57)
      else false
      end );
 rindex(".") as $ix
 | if $ix then .[1+$ix:] as $ext
   | if $ext|alphanumeric then $ext # or ".\($ext)" if the period is wanted
     else ""
     end
   else ""
   end;</lang>

Works with: jq version 1.5

<lang jq>def file_extension:

 match( "\\.([a-zA-Z0-9]*$)" ) // false
 | if . then .captures[0].string else "" end ;</lang>

Examples:

Using either version above gives the same results.

<lang jq>"picture.jpg",
"myuniquefile.longextension",
"http://mywebsite.com/picture/image.png",
"myuniquefile.longextension",
"IAmAFileWithoutExtension",
"/path/to.my/file",
"file.odd_one"
| "\(.) has extension: \"\(file_extension)\""</lang>

Output:

<lang sh>$ jq -r -n -f Extract_file_extension.jq
picture.jpg has extension: "jpg"
myuniquefile.longextension has extension: "longextension"
http://mywebsite.com/picture/image.png has extension: "png"
myuniquefile.longextension has extension: "longextension"
IAmAFileWithoutExtension has extension: ""
/path/to.my/file has extension: ""
file.odd_one has extension: ""</lang>

Lua

<lang Lua>-- Lua pattern docs at http://www.lua.org/manual/5.1/manual.html#5.4.1
function fileExt (filename) return filename:match("(%.%w+)$") or "" end

local testCases = {
    "picture.jpg",
    "http://mywebsite.com/picture/image.png",
    "myuniquefile.longextension",
    "IAmAFileWithoutExtension",
    "/path/to.my/file",
    "file.odd_one"
}
for _, example in pairs(testCases) do
    print(example .. ' -> "' .. fileExt(example) .. '"')
end</lang>

Output:

picture.jpg -> ".jpg"
http://mywebsite.com/picture/image.png -> ".png"
myuniquefile.longextension -> ".longextension"
IAmAFileWithoutExtension -> ""
/path/to.my/file -> ""
file.odd_one -> ""

Oforth

If the extension is not valid, this returns null rather than "". This is easy to change if "" is required.

<lang Oforth>: fileExt(s) { | i |

  s lastIndexOf('.') dup ->i ifNull: [ null return ]
  s extract(i 1+, s size) conform(#isAlpha) ifFalse: [ null return ]
  s extract(i, s size)

} </lang>

Output:

fileExt("picture.jpg") println
fileExt("http://mywebsite.com/picture/image.png") println
fileExt("myuniquefile.longextension") println
fileExt("IAmAFileWithoutExtension") println
fileExt("/path/to.my/file") println
fileExt("file.odd_one") println

Perl

<lang Perl>#!/usr/bin/perl
use strict ;
use warnings ;

sub extension {
   my $filename = shift ;
   if ( $filename !~ /\./ ) {
      return "" ;
   }
   else {
      my @parts = split ( /\./ , $filename ) ;
      my $extension = $parts[ -1 ] ;
      if ( $extension =~ /[\/_]/ ) {
         return "" ;
      }
      else {
         return ".$extension" ;
      }
   }
}

map { print ("file name: $_ , extension : " . extension( $_ ) . "\n" ) }
   ( "picture.jpg" , "http://mywebsite.com/picture/image.png" , 
     "myuniquefile.longextension" , "IAmAFileWithoutExtension" ,
     "/path/to.my/file" , "file.odd_one" ) ;
</lang>

Output:

file name: picture.jpg , extension : .jpg
file name: http://mywebsite.com/picture/image.png , extension : .png
file name: myuniquefile.longextension , extension : .longextension
file name: IAmAFileWithoutExtension , extension : 
file name: /path/to.my/file , extension : 
file name: file.odd_one , extension :

Perl 6

<lang perl6>sub extension ( Str $filename --> Str ) {

   given $filename.split(/\./)[*-1] {
       when $filename   { "" }
       when / <[\/_]> / { "" }
       default          { "." ~ $_ }
   }

}

say "$_ -> ", extension($_).perl for (

   'mywebsite.com/picture/image.png',
   'http://mywebsite.com/picture/image.png',
   'myuniquefile.longextension',
   'IAmAFileWithoutExtension',
   '/path/to.my/file',
   'file.odd_one',

)</lang>

Output:

mywebsite.com/picture/image.png -> ".png"
http://mywebsite.com/picture/image.png -> ".png"
myuniquefile.longextension -> ".longextension"
IAmAFileWithoutExtension -> ""
/path/to.my/file -> ""
file.odd_one -> ""

Phix

<lang Phix>function getExtension(string filename)

   for i=length(filename) to 1 by -1 do
       integer ch = filename[i]
       if ch='.' then return filename[i..$] end if
       if find(ch,"\\/_") then exit end if
   end for
   return ""

end function

constant tests = {"mywebsite.com/picture/image.png",

                 "http://mywebsite.com/picture/image.png",
                 "myuniquefile.longextension",
                 "IAmAFileWithoutExtension",
                 "/path/to.my/file",
                 "file.odd_one"}

for i=1 to length(tests) do

   printf(1,"%s ==> %s\n",{tests[i],getExtension(tests[i])})

end for</lang>

Output:

mywebsite.com/picture/image.png ==> .png
http://mywebsite.com/picture/image.png ==> .png
myuniquefile.longextension ==> .longextension
IAmAFileWithoutExtension ==>
/path/to.my/file ==>
file.odd_one ==>

PowerShell

<lang PowerShell> function extension($file){

   $ext = [System.IO.Path]::GetExtension($file)
   if (-not [String]::IsNullOrEmpty($ext)) {
       if($ext.IndexOf("_") -ne -1) {$ext = ""}
   }
   $ext

}
extension "picture.jpg"
extension "http://mywebsite.com/picture/image.png"
extension "myuniquefile.longextension"
extension "IAmAFileWithoutExtension"
extension "/path/to.my/file"
extension "file.odd_one"
</lang>

Output:

.jpg
.png
.longextension

Python

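Uses os.path.splitext and the extended tests from the Go example above. Note that os.path.splitext accepts any characters after the last period of the file name, so file.odd_one yields '.odd_one' rather than the empty string the task asks for.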

<lang python>Python 3.5.0a1 (v3.5.0a1:5d4b6a57d5fd, Feb  7 2015, 17:58:38) [MSC v.1900 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> tests = ["picture.jpg",
             "http://mywebsite.com/picture/image.png",
             "myuniquefile.longextension",
             "IAmAFileWithoutExtension",
             "/path/to.my/file",
             "file.odd_one",
             # Extra, with unicode
             "café.png",
             "file.resumé",
             # with unicode combining characters
             "cafe\u0301.png",
             "file.resume\u0301"]
>>> for path in tests:
    print("Path: %r -> Extension: %r" % (path, os.path.splitext(path)[-1]))

Path: 'picture.jpg' -> Extension: '.jpg'
Path: 'http://mywebsite.com/picture/image.png' -> Extension: '.png'
Path: 'myuniquefile.longextension' -> Extension: '.longextension'
Path: 'IAmAFileWithoutExtension' -> Extension: ''
Path: '/path/to.my/file' -> Extension: ''
Path: 'file.odd_one' -> Extension: '.odd_one'
Path: 'café.png' -> Extension: '.png'
Path: 'file.resumé' -> Extension: '.resumé'
Path: 'café.png' -> Extension: '.png'
Path: 'file.resumé' -> Extension: '.resumé'
>>> </lang>

Racket

<lang Racket>#lang racket

;; Note that for a real implementation, Racket has a
;; `filename-extension` in its standard library, but don't use it here
;; since it requires a proper name (fails on ""), returns a byte-string,
;; and handles path values so might run into problems with unicode
;; string inputs.

(define (string-extension x)
  (cadr (regexp-match #px"(\\.[[:alnum:]]+|)$" x)))

(define (string-extension/unicode x)
  (cadr (regexp-match #px"(\\.(?:\\p{L}|\\p{N}|\\p{M})+|)$" x)))

(define examples '("picture.jpg"
                   "http://mywebsite.com/picture/image.png"
                   "myuniquefile.longextension"
                   "IAmAFileWithoutExtension"
                   "/path/to.my/file"
                   "file.odd_one"
                   ""
                   ;; Extra, with unicode
                   "café.png"
                   "file.resumé"
                   ;; with unicode combining characters
                   "cafe\u0301.png"
                   "file.resume\u0301"))

(printf "Official task:\n")
(for ([x (in-list examples)])
  (printf "~s ==> ~s\n" x (string-extension x)))

(printf "\nWith unicode support:\n")
(for ([x (in-list examples)])
  (printf "~s ==> ~s\n" x (string-extension/unicode x)))
</lang>

Output:

Official task:
  "picture.jpg" ==> ".jpg"
  "http://mywebsite.com/picture/image.png" ==> ".png"
  "myuniquefile.longextension" ==> ".longextension"
  "IAmAFileWithoutExtension" ==> ""
  "/path/to.my/file" ==> ""
  "file.odd_one" ==> ""
  "" ==> ""
  "café.png" ==> ".png"
  "file.resumé" ==> ""
  "café.png" ==> ".png"
  "file.resumé" ==> ""

With unicode support:
  "picture.jpg" ==> ".jpg"
  "http://mywebsite.com/picture/image.png" ==> ".png"
  "myuniquefile.longextension" ==> ".longextension"
  "IAmAFileWithoutExtension" ==> ""
  "/path/to.my/file" ==> ""
  "file.odd_one" ==> ""
  "" ==> ""
  "café.png" ==> ".png"
  "file.resumé" ==> ".resumé"
  "café.png" ==> ".png"
  "file.resumé" ==> ".resumé"

REXX

Using this paraphrased Rosetta Code task's definition that:

a legal file extension only consists of mixed-case Latin letters and/or decimal digits.

<lang rexx>/*REXX pgm extracts the file extension (defined above from the RC task) from a file name*/
@.=                                              /*define default value for the @ array.*/
parse arg fID                                    /*obtain any optional arguments from CL*/
if fID\==''  then @.1 = fID                      /*use the filename from the C.L.       */
             else do                             /*No filename given? Then use defaults.*/
                  @.1 = 'picture.jpg'
                  @.2 = 'http://mywebsite.com/pictures/image.png'
                  @.3 = 'myuniquefile.longextension'
                  @.4 = 'IAmAFileWithoutExtension'
                  @.5 = '/path/to.my/file'
                  @.6 = 'file.odd_one'
                  end

   do j=1  while  @.j\=='';     x=               /*process  (all of)  the file name(s). */
   p=lastpos(.,@.j)                              /*find the last position of a period.  */
   if p\==0  then x=substr(@.j, p+1)             /*Found a dot?  Then get stuff after it*/
   if \datatype(x, 'A')   then x=                /*is it upper/lowercase letters|digits?*/
   if x==''  then x= " [null]"                   /*use a better name for a  "null".     */
             else x= . || x                      /*prefix the extension with a period.  */
   say 'file extension=' left(x, 20)     "for file name=" @.j
   end       /*j*/                               /*stick a fork in it,  we're all done. */</lang>

output when using the default (internal) inputs:

file extension= .jpg                 for file name= picture.jpg
file extension= .png                 for file name= http://mywebsite.com/pictures/image.png
file extension= .longextension       for file name= myuniquefile.longextension
file extension=  [null]              for file name= IAmAFileWithoutExtension
file extension=  [null]              for file name= /path/to.my/file
file extension=  [null]              for file name= file.odd_one

sed

Output:

.jpg
.png
.longextension
IAmAFileWithoutExtension

Sidef

<lang ruby>func extension (filename) {

  given(filename.split('.').last) {
      when(filename) { "" }
      when(/[\/_]/)  { "" }
      default        { "." + _ }
  }

}

['mywebsite.com/picture/image.png',

'http://mywebsite.com/picture/image.png',
'myuniquefile.longextension',
'IAmAFileWithoutExtension',
'/path/to.my/file',
'file.odd_one',

].each {|f| say "#{f} -> #{extension(f).dump}" }</lang>

Output:

mywebsite.com/picture/image.png -> ".png"
http://mywebsite.com/picture/image.png -> ".png"
myuniquefile.longextension -> ".longextension"
IAmAFileWithoutExtension -> ""
/path/to.my/file -> ""
file.odd_one -> ""

Tcl

Tcl's built-in file extension command already almost knows how to do this, except that it accepts any character after the dot. Just for fun, we'll enhance the built-in with a new subcommand that applies the limitation specified for this problem.

<lang Tcl>proc assert {expr} { ;# for "static" assertions that throw nice errors

   if {![uplevel 1 [list expr $expr]]} {
       set msg "{$expr}"
       catch {append msg " {[uplevel 1 [list subst -noc $expr]]}"}
       tailcall throw {ASSERT ERROR} $msg
   }

}

proc file_ext {file} {

   set res ""
   regexp {(\.[A-Za-z0-9]+)$} $file -> res
   return $res

}

set map [namespace ensemble configure file -map]
dict set map ext ::file_ext
namespace ensemble configure file -map $map

# and a test:

foreach {file ext} {

   picture.jpg	.jpg
   http://mywebsite.com/picture/image.png	.png
   myuniquefile.longextension	.longextension
   IAmAFileWithoutExtension	""
   /path/to.my/file	""
   file.odd_one	""

} {

   set res ""
   assert {[file ext $file] eq $ext}

}</lang>

VBScript

<lang vb>
Function GetExtension(s)
	If InStr(s,"/") Then
		arr_s = Split(s,"/")
		fname = arr_s(UBound(arr_s))
	ElseIf InStr(s,"\") Then
		arr_s = Split(s,"\")
		fname = arr_s(UBound(arr_s))
	Else
		fname = s
	End If
	GetExtension = ""
	If InStr(fname,".") Then
		arr_x = Split(fname,".")
		If InStr(1,arr_x(UBound(arr_x)),"_") = 0 Then
			GetExtension = "." & arr_x(UBound(arr_x))
		End If
	End If
End Function

'Testing the function
arr_t = Array("picture.jpg","http://mywebsite.com/picture/image.png",_
	"myuniquefile.longextension","IAmAFileWithoutExtension",_
	"/path/to.my/file","file.odd_one")

For i = 0 To UBound(arr_t)
	WScript.StdOut.WriteLine arr_t(i) & " -> " & GetExtension(arr_t(i))
Next
</lang>

Output:

picture.jpg -> .jpg
http://mywebsite.com/picture/image.png -> .png
myuniquefile.longextension -> .longextension
IAmAFileWithoutExtension -> ''
/path/to.my/file -> ''
file.odd_one -> ''

zkl

The File object has a method splitFileName that does just that, returning a list of the parts. The method knows about the OS it was compiled on (Unix, Windows).

<lang zkl>T("picture.jpg","http://mywebsite.com/picture/image.png",

 "myuniquefile.longextension","IAmAFileWithoutExtension",
 "/path/to.my/file","file.odd_one").apply(File.splitFileName).println();</lang>

Output:

L(L("","","picture",".jpg"),
L("","http://mywebsite.com/picture/","image",".png"),
L("","","myuniquefile",".longextension"),
L("","","IAmAFileWithoutExtension",""),
L("","/path/to.my/","file",""),
L("","","file",".odd_one"))

The last one is an odd duck so some code is in order:

<lang zkl>fcn exonly(fileName){

  var re=RegExp(0'|\.[a-zA-Z0-9]+$|);
  ext:=File.splitFileName(fileName)[-1];
  if(not re.matches(ext)) return("");
  ext

}</lang>

<lang zkl>T("picture.jpg","http://mywebsite.com/picture/image.png",

 "myuniquefile.longextension","IAmAFileWithoutExtension",
 "/path/to.my/file","file.odd_one").apply(exonly).println();</lang>

Output:

L(".jpg",".png",".longextension","","","")
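
The split-then-validate approach of exonly can be sketched in Python as well. This is only a rough analogue: it uses posixpath so that '/' is always the separator, returns three parts rather than zkl's four (there is no drive slot, and the directory has no trailing slash), and the names `split_file_name` and `ext_only` are invented for this illustration.

```python
import posixpath
import re


def split_file_name(path: str):
    """Return (directory, base name, extension) for a '/'-separated path."""
    directory, file_name = posixpath.split(path)
    base, ext = posixpath.splitext(file_name)
    return directory, base, ext


def ext_only(path: str) -> str:
    """Keep the split-off extension only if it is a period plus letters/digits."""
    ext = split_file_name(path)[-1]
    return ext if re.fullmatch(r"\.[a-zA-Z0-9]+", ext) else ""


if __name__ == "__main__":
    for name in ("picture.jpg", "/path/to.my/file", "file.odd_one"):
        print(split_file_name(name), repr(ext_only(name)))
```

As in the zkl entry, the splitting step happily reports .odd_one as an extension; it is the separate regular-expression check that rejects it.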