Extract file extension: Difference between revisions

From Rosetta Code

Revision as of 14:50, 5 May 2015

Extract file extension is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Write a program that takes one string argument representing the path to a file and returns the file's extension, or the null string if the file path has no extension. An extension appears after the last period in the file name and consists of one or more letters or numbers.

Show here the action of your routine on the following examples:

  1. picture.jpg returns .jpg
  2. http://mywebsite.com/picture/image.png returns .png
  3. myuniquefile.longextension returns .longextension
  4. IAmAFileWithoutExtension returns an empty string ""
  5. /path/to.my/file returns an empty string, as the period is in the directory name rather than the file name
  6. file.odd_one returns an empty string, as an extension (by this definition) cannot contain an underscore.
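
As a rough illustration of the definition above, here is a minimal Go sketch that takes everything after the last period and accepts it only if it is non-empty and purely alphanumeric. The names `extension` and `isAlnum` are hypothetical, and reading "letters or numbers" as ASCII letters and digits is an assumption, not something the task states.

```go
package main

import (
	"fmt"
	"strings"
)

// isAlnum reports whether b is an ASCII letter or digit.
// (Assumption: "letters or numbers" means ASCII only.)
func isAlnum(b byte) bool {
	return '0' <= b && b <= '9' || 'a' <= b && b <= 'z' || 'A' <= b && b <= 'Z'
}

// extension is a hypothetical helper: it returns the text from the last
// period onward if every byte after that period is alphanumeric and there
// is at least one such byte; otherwise it returns the empty string.
func extension(path string) string {
	dot := strings.LastIndexByte(path, '.')
	if dot < 0 || dot == len(path)-1 {
		return "" // no period, or nothing after it
	}
	for i := dot + 1; i < len(path); i++ {
		if !isAlnum(path[i]) {
			return "" // e.g. '/' (period was in a directory name) or '_'
		}
	}
	return path[dot:]
}

func main() {
	examples := []string{
		"picture.jpg",
		"http://mywebsite.com/picture/image.png",
		"myuniquefile.longextension",
		"IAmAFileWithoutExtension",
		"/path/to.my/file",
		"file.odd_one",
	}
	for _, p := range examples {
		fmt.Printf("%q -> %q\n", p, extension(p))
	}
}
```

Rejecting any non-alphanumeric byte after the last period covers both the directory-name case and the underscore case without treating path separators specially.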


C#

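<lang C#>public static string ExtractExtension(string str)
{
    // Scan backwards from the end of the string, looking for the last period.
    for (int i = str.Length - 1; i >= 0; i--)
    {
        char c = str[i];
        if (c == '.')
        {
            // A trailing period with nothing after it is not an extension.
            return i == str.Length - 1 ? "" : str.Substring(i);
        }

        // Anything other than an ASCII letter or digit (for example '/' or '_')
        // before a period is reached means there is no valid extension.
        bool isAlphanumeric = (c >= '0' && c <= '9')
                           || (c >= 'a' && c <= 'z')
                           || (c >= 'A' && c <= 'Z');
        if (!isAlphanumeric)
        {
            return "";
        }
    }
    return "";
}</lang>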

Go

<lang go>package main

import (
	"fmt"
	"path"
)

// An exact copy of `path.Ext` from Go 1.4.2 for reference:
func Ext(path string) string {
	for i := len(path) - 1; i >= 0 && path[i] != '/'; i-- {
		if path[i] == '.' {
			return path[i:]
		}
	}
	return ""
}

// A variation that handles the extra non-standard requirement
// that extensions shall only "consists of one or more letters or numbers".
//
// Note, instead of direct comparison with '0-9a-zA-Z' we could instead use:
//   case !unicode.IsLetter(rune(b)) && !unicode.IsNumber(rune(b)):
//       return ""
// even though this operates on bytes instead of Unicode code points (runes),
// it is still correct given the details of UTF-8 encoding.
func ext(path string) string {
	for i := len(path) - 1; i >= 0; i-- {
		switch b := path[i]; {
		case b == '.':
			return path[i:]
		case '0' <= b && b <= '9':
		case 'a' <= b && b <= 'z':
		case 'A' <= b && b <= 'Z':
		default:
			return ""
		}
	}
	return ""
}

func main() {
	tests := []string{
		"picture.jpg",
		"http://mywebsite.com/picture/image.png",
		"myuniquefile.longextension",
		"IAmAFileWithoutExtension",
		"/path/to.my/file",
		"file.odd_one",
		// Extra, with unicode
		"café.png",
		"file.resumé",
		// with unicode combining characters
		"cafe\u0301.png",
		"file.resume\u0301",
	}
	for _, str := range tests {
		std := path.Ext(str)
		custom := ext(str)
		fmt.Printf("%38s\t→ %-8q", str, custom)
		if custom != std {
			fmt.Printf("(Standard: %q)", std)
		}
		fmt.Println()
	}
}</lang>

Output:
                           picture.jpg	→ ".jpg"  
http://mywebsite.com/picture/image.png	→ ".png"  
            myuniquefile.longextension	→ ".longextension"
              IAmAFileWithoutExtension	→ ""      
                      /path/to.my/file	→ ""      
                          file.odd_one	→ ""      (Standard: ".odd_one")
                              café.png	→ ".png"  
                           file.resumé	→ ""      (Standard: ".resumé")
                             café.png	→ ".png"  
                          file.resumé	→ ""      (Standard: ".resumé")

J

Implementation:

<lang J>require'regex'
ext=: '[.][a-zA-Z0-9]+$'&rxmatch ;@rxfrom ]</lang>

Most of the work here is done by the regex implementation (PCRE, if that matters; this particular kind of expression tends to be a bit more concise in Perl than in J).

Perhaps of interest is that this is an example of a J fork: three verbs separated by spaces. Unlike a Unix fork (which spins up a child process that is an almost exact clone of the currently running process), a J fork combines three independently defined verbs. The two verbs on the edges each receive the fork's argument, and the verb in the middle combines their two results.

The left verb uses rxmatch to find the beginning position of the match and its length. The right verb is the identity function. The middle verb extracts the desired characters from the original argument. (For a non-match, the length of the "match" is zero so the empty string is extracted.)
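
For readers unfamiliar with J, the fork described above can be mimicked in Go. This is only a sketch: the names `fork`, `span`, `match`, `from` and `identity` are hypothetical, it needs Go 1.18 or later for generics, and it reports a non-match as a zero-length span at position 0, which is an assumption made for this sketch rather than a description of what rxmatch returns.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the J solution: a period followed by one or more
// ASCII letters or digits at the end of the string.
var extPattern = regexp.MustCompile(`[.][a-zA-Z0-9]+$`)

// span holds the start position and length of a match.
type span struct{ start, length int }

// fork mimics a J fork (f g h): f and h are each applied to the same
// argument, and g combines the two results.
func fork[A, B, C, D any](f func(A) B, g func(B, C) D, h func(A) C) func(A) D {
	return func(y A) D { return g(f(y), h(y)) }
}

// match plays the role of the left verb: locate the match.
// Assumption: a non-match is reported as a zero-length span at position 0.
func match(s string) span {
	loc := extPattern.FindStringIndex(s)
	if loc == nil {
		return span{0, 0}
	}
	return span{loc[0], loc[1] - loc[0]}
}

// from plays the role of the middle verb: extract the span from the string.
func from(m span, s string) string { return s[m.start : m.start+m.length] }

// identity plays the role of the right verb.
func identity(s string) string { return s }

// ext is the fork of the three functions above.
var ext = fork(match, from, identity)

func main() {
	for _, s := range []string{"picture.jpg", "/path/to.my/file", "file.odd_one"} {
		fmt.Printf("%q -> %q\n", s, ext(s))
	}
}
```

The zero-length span makes the non-match case fall out of the same slicing expression, mirroring the way the J version extracts an empty string.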

Task examples:

<lang J>   ext 'picture.jpg'
.jpg
   ext 'http://mywebsite.com/picture/image.png'
.png
   ext 'myuniquefile.longextension'
.longextension
   ext 'IAmAFileWithoutExtension'

   ext '/path/to.my/file'

   ext 'file.odd_one'

</lang>

Racket

<lang Racket>#lang racket

(define (string-suffix x)
  (define v (regexp-match #px"\\.[[:alnum:]]+$" x))
  (if v
      (car v)
      ""))

(for-each (lambda (x) (printf "~s ==> ~s\n"
                              x
                              (string-suffix x)))
          (list "picture.jpg"
                "http://mywebsite.com/picture/image.png"
                "myuniquefile.longextension"
                "IAmAFileWithoutExtension"
                "/path/to.my/file"
                "file.odd_one"
                ""))</lang>
Output:
"picture.jpg" ==> ".jpg"
"http://mywebsite.com/picture/image.png" ==> ".png"
"myuniquefile.longextension" ==> ".longextension"
"IAmAFileWithoutExtension" ==> ""
"/path/to.my/file" ==> ""
"file.odd_one" ==> ""
"" ==> ""

REXX

(Using this Rosetta Code task's paraphrased definition: a legal file extension consists only of mixed-case Latin letters and/or decimal digits.)

<lang rexx>/*REXX program extracts the (legal) file extension from a file name. */
@.=                                    /*define default value for array.*/
if arg(1)\==''  then @.1 = arg(1)      /*use the filename from the C.L. */
                else do                /*No filename given? Use defaults*/
                     @.1 = 'picture.jpg'
                     @.2 = 'http://mywebsite.com/pictures/image.png'
                     @.3 = 'myuniquefile.longextension'
                     @.4 = 'IAmAFileWithoutExtension'
                     @.5 = '/path/to.my/file'
                     @.6 = 'file.odd_one'
                     end

  do j=1  while @.j\=='';  $=@.j;  x=  /*process all of the file names. */
  p=lastpos(.,$)                       /*find last position of a period.*/
  if p\==0  then x=substr($,p+1)       /*Found?  Get the stuff after it.*/
  if \datatype(x,'A')  then x=         /*upper & lower case letters+digs*/
  if x==''  then x=' [null]'           /*use a better name for a "null".*/
            else x=. || x              /*prefix extension with a period.*/
  say 'file ext='left(x,20)       'for file name='$
  end       /*j*/
                                       /*stick a fork in it, we're done.*/</lang>

output using the default inputs:

file ext=.jpg                 for file name=picture.jpg
file ext=.png                 for file name=http://mywebsite.com/pictures/image.png
file ext=.longextension       for file name=myuniquefile.longextension
file ext= [null]              for file name=IAmAFileWithoutExtension
file ext= [null]              for file name=/path/to.my/file
file ext= [null]              for file name=file.odd_one

Tcl

Tcl's built-in file extension command almost does this already, except that it accepts any character after the dot. Just for fun, we'll extend the built-in file command with a new subcommand, ext, that applies the restriction specified for this task.

<lang Tcl>proc assert {expr} {    ;# for "static" assertions that throw nice errors
    if {![uplevel 1 [list expr $expr]]} {
        set msg "{$expr}"
        catch {append msg " {[uplevel 1 [list subst -noc $expr]]}"}
        tailcall throw {ASSERT ERROR} $msg
    }
}

proc file_ext {file} {
    set res ""
    # the task allows letters and digits, so accept both cases
    regexp {(\.[A-Za-z0-9]+)$} $file -> res
    return $res
}

# add "ext" as a new subcommand of the built-in file ensemble
set map [namespace ensemble configure file -map]
dict set map ext ::file_ext
namespace ensemble configure file -map $map

# and a test:
foreach {file ext} {
    picture.jpg                             .jpg
    http://mywebsite.com/picture/image.png  .png
    myuniquefile.longextension              .longextension
    IAmAFileWithoutExtension                ""
    /path/to.my/file                        ""
    file.odd_one                            ""
} {
    set res ""
    assert {[file ext $file] eq $ext}
}</lang>