How can I write a to_upper() or to_lower() function in F90?

How does one write an (Intel) Fortran 90 function that converts a string to lowercase (or, alternatively, uppercase)? I want to pass a character string to the function and have it return a character string, e.g.

program main
    implicit none

    character*32 :: origStr = "Hello, World!"
    character*32 :: newStr

    newStr = to_lower(origStr)
    write (*,*) newStr

end program main

such that this program outputs hello, world!.

I've been starting with the to_lower() subroutine found at RosettaCode, but I can't figure out how to write it as a function.

Thanks in advance!

PS -- Bonus points if you can do it with a string of unfixed length!

Lacreshalacrimal answered 25/5, 2012 at 18:9 Comment(0)


As the original author of this code, I'm pleased that it's of some help. I used to wonder why these functions were not built into Fortran. My guess is that they only work for a rather restricted set of letters, i.e. the ones used in English. If you have text in almost any other European language you will have characters with accents, and then converting them to upper or lower case is much harder. For example, e-grave in French turned into upper case is usually shown as just plain E (the grave accent gets lost), whereas e-acute keeps its accent. The designers of Fortran have always tried to provide facilities which suit a wide range of languages, and doing upper/lower case conversion in a multi-language way is not at all easy. At least that's my guess as to why you have to do it yourself.

Gawain answered 19/6, 2012 at 21:48 Comment(2)

Hi Clive, welcome to SO. I first started learning Fortran in 2006 from your book. Just wanted to say thank you. – Elimination 20/6, 2012 at 4:54

I haven't read your book, but thank you for your useful code and insightful comment! I recently passed the routine to yet another colleague. I'm going to have to add your name to it. :-) – Lacreshalacrimal 16/7, 2014 at 14:49


Wow -- even though I'd searched for over an hour, immediately after posting this, I found an answer here (under "Miscellaneous Fortran Hints and Tips").

The code I used is as follows (for to_upper):

function to_upper(strIn) result(strOut)
! Adapted from http://www.star.le.ac.uk/~cgp/fortran.html (25 May 2012)
! Original author: Clive Page

     implicit none

     character(len=*), intent(in) :: strIn
     character(len=len(strIn)) :: strOut
     integer :: i,j

     do i = 1, len(strIn)
          j = iachar(strIn(i:i))
          ! Lowercase ASCII letters sit 32 positions above their uppercase forms
          if (j >= iachar("a") .and. j <= iachar("z")) then
               strOut(i:i) = achar(j - 32)
          else
               strOut(i:i) = strIn(i:i)
          end if
     end do

end function to_upper

Hope this helps somebody!

Lacreshalacrimal answered 25/5, 2012 at 18:26 Comment(2)

this relies on achar and iachar being based on the ASCII table, which to my knowledge isn't standardized...(that being said, I do basically the same thing in my code and I've never had a compiler surprise me by not using the ASCII table ...) – Fidele 26/5, 2012 at 16:21

according to the FORTRAN 90 standard: "The intrinsic functions ACHAR and IACHAR provide conversions between these characters and the integers of the ASCII collating sequence." link. ICHAR will use the system's native character set (which is not necessarily ASCII). – Baun 27/5, 2012 at 12:30
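
The point made in the comment above can be sketched as the lower-case counterpart of the function in this answer. This is only a minimal sketch, not code from the original answer: the module name `ascii_case_sketch` and the `offset` constant are hypothetical names chosen for illustration, and only the 26 unaccented letters are handled. Because it uses `iachar` and `achar`, the range test and the offset refer to ASCII codes regardless of the processor's native character set.

```fortran
module ascii_case_sketch
    ! Hypothetical module name, used only for this illustration.
    implicit none
    private
    public :: to_lower

contains

    pure function to_lower(strIn) result(strOut)
        character(len=*), intent(in) :: strIn
        character(len=len(strIn))    :: strOut
        ! Distance between 'A' and 'a' in the ASCII collating sequence (32).
        integer, parameter :: offset = iachar("a") - iachar("A")
        integer :: i, j

        do i = 1, len(strIn)
            j = iachar(strIn(i:i))
            if (j >= iachar("A") .and. j <= iachar("Z")) then
                ! Uppercase ASCII letter: shift it into the lowercase range.
                strOut(i:i) = achar(j + offset)
            else
                ! Everything else (digits, punctuation, blanks) is copied as-is.
                strOut(i:i) = strIn(i:i)
            end if
        end do
    end function to_lower

end module ascii_case_sketch
```

The result always has the same length as the argument. For the "unfixed length" part of the question, one option (assuming a Fortran 2003 compiler) is to assign the result to a deferred-length variable declared as `character(len=:), allocatable :: s`, so that `s = to_lower("Hello, World!")` allocates `s` to the right length.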


Here's one that doesn't rely on the ASCII representation:

Pure Function to_upper (str) Result (string)

!   ==============================
!   Changes a string to upper case
!   ==============================

    Implicit None
    Character(*), Intent(In) :: str
    Character(LEN(str))      :: string

    Integer :: ic, i

    Character(26), Parameter :: cap = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    Character(26), Parameter :: low = 'abcdefghijklmnopqrstuvwxyz'

!   Capitalize each letter if it is lowercase
    string = str
    do i = 1, LEN_TRIM(str)
        ic = INDEX(low, str(i:i))
        if (ic > 0) string(i:i) = cap(ic:ic)
    end do

End Function to_upper

You can easily change this to to_lower by switching the low and cap strings in the loop.

Anderlecht answered 29/5, 2012 at 22:49 Comment(2)

I do not see why this should be more portable. The ASCII conversion functions are standard Fortran 90 and therefore also work on computers with a different collating sequence, be it EBCDIC or anything else. And because your program uses other Fortran 90 features as well, it requires a Fortran 90 compiler just as a program that uses achar does. – Oosperm 12/6, 2012 at 8:23

Fair enough. I'll remove the portable statement. I personally think this method is easier to understand at first glance than the ASCII version since it does not rely on knowledge of ASCII representation, but that's just an opinion :) – Anderlecht 12/6, 2012 at 13:54
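
The remark in this answer about switching the `low` and `cap` strings can be sketched with one shared helper that both directions call. This is an illustrative variation, not the answerer's code: the module name `table_case_sketch` and the helper `translate` with its `src`/`dst` arguments are hypothetical names introduced here.

```fortran
module table_case_sketch
    ! Hypothetical module name, used only for this illustration.
    implicit none
    private
    public :: to_upper, to_lower

    character(len=26), parameter :: cap = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    character(len=26), parameter :: low = 'abcdefghijklmnopqrstuvwxyz'

contains

    ! Replace every character of str found in src by the character at the
    ! same position in dst; all other characters are left unchanged.
    pure function translate(str, src, dst) result(string)
        character(len=*), intent(in) :: str, src, dst
        character(len=len(str))      :: string
        integer :: ic, i

        string = str
        do i = 1, len_trim(str)
            ic = index(src, str(i:i))
            if (ic > 0) string(i:i) = dst(ic:ic)
        end do
    end function translate

    pure function to_upper(str) result(string)
        character(len=*), intent(in) :: str
        character(len=len(str))      :: string
        string = translate(str, low, cap)
    end function to_upper

    pure function to_lower(str) result(string)
        character(len=*), intent(in) :: str
        character(len=len(str))      :: string
        string = translate(str, cap, low)
    end function to_lower

end module table_case_sketch
```

Keeping the two tables in one place means the direction of the conversion is decided only by the order of the arguments passed to the helper.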


A few years late, but in case it helps, here is a function that runs 35x faster at O3 and 3x faster at O0 (ifort 2023):

module upper_lower
implicit none
contains
pure function to_lower (str) Result (string)

!   ==============================
!   Changes a string to lower case
!   ==============================

    Character(*), Intent(In) :: str
    Character(LEN(str))      :: string

    Integer :: ic, i

    Character(26), Parameter :: cap = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    Character(26), Parameter :: low = 'abcdefghijklmnopqrstuvwxyz'

!   Convert each letter to lowercase if it is uppercase
    string = str
    do i = 1, LEN_TRIM(str)
        ic = INDEX(cap, str(i:i))
        if (ic > 0) string(i:i) = low(ic:ic)
    end do

end function to_lower

pure function to_lower_2 (str) Result (string)
    Character(*), Intent(In) :: str
    Character(LEN(str))      :: string

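    ! Native codes of 'A' and 'Z' and the upper-to-lower offset;
    ! these values assume the default character set is ASCII.
    integer, parameter :: wp= 32, la=65, lz= 90
    Integer :: c, icar

    do c= 1, len(str)
        icar= ichar(str(c:c))
        if (icar>=la.and.icar<=lz) icar= icar + wp
        string(c:c)= char(icar)
    end do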
end function to_lower_2

pure function to_lower_3(strIn) result(strOut)
    ! Adapted from the original code by Clive Page
    character(len=*), intent(in) :: strIn
    character(len=len(strIn)) :: strOut
    integer :: i,j

    do i = 1, len(strIn)
        j = iachar(strIn(i:i))
        if (j>= iachar("A") .and. j<=iachar("Z") ) then
            strOut(i:i) = achar(iachar(strIn(i:i))+32)
        else
            strOut(i:i) = strIn(i:i)
        end if
    end do

end function to_lower_3

pure function to_upper_2 (str) Result (string)
    Character(*), Intent(In) :: str
    Character(LEN(str))      :: string

    ! Native codes of 'a' and 'z'; these values assume the default character set is ASCII.
    integer, parameter :: wp= 32, BA=97, BZ= 122
    Integer :: c, icar

    do c= 1, len(str) 
        icar= ichar(str(c:c)) 
        if (icar>=BA.and.icar<=BZ) icar= icar - wp 
        string(c:c)= char(icar) 
    end do
end function to_upper_2
end module

program main
use upper_lower
implicit none

integer :: i
character(len=:),allocatable :: str
real(8) :: t1, t2, t3, time_start, time_finish

str = to_lower_2('Hello, World')
print *, str
str = to_lower_2('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
print *, str

call cpu_time(time_start)
do i = 1,10000000
    str = to_lower('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
end do
call cpu_time(time_finish)
t1 = time_finish-time_start
print *, 'Original Method: ', t1

call cpu_time(time_start)
do i = 1,10000000
    str = to_lower_2('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
end do
call cpu_time(time_finish)
t2 = time_finish-time_start
print *, 'New Method     : ', t2

call cpu_time(time_start)
do i = 1,10000000
    str = to_lower_3('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
end do
call cpu_time(time_finish)
t3 = time_finish-time_start
print *, 'Clive: ', t3

print *, 'speedup (t1/t3): ', t1/t3
print *, 'speedup (t2/t3): ', t2/t3

end program main

Output (O3):

 hello, world
 abcdefghijklmnopqrstuvwxyz
 Original Method:    5.48437500000000
 New Method     :   0.125000000000000
 Clive:   7.812500000000000E-002
 speedup (t1/t3):    70.2000000000000
 speedup (t2/t3):    1.60000000000000

EDIT: Following the remarks in the comments, I have updated the comparison. Indeed, Clive's method is faster at O3; at O0 the second method is about 1.5 times faster.

Steib answered 24/3, 2023 at 20:52 Comment(4)

You are not comparing to the original method. The original one is the one from Clive Page, also shown by jvriesem. You are only comparing to the slower alternative shown later by SethMMorton. The original method does the same basic operation as yours. – Oosperm 25/3, 2023 at 7:40

BTW, the original code by Clive Page is more portable because it uses iachar and achar instead of ichar and char. That is enough for portability to EBCDIC. What the chapter at Rosetta Code claims about necessary changes for EBCDIC is nonsense. This method with achar should work there just fine. See my comment under the answer of SethMMorton. – Oosperm 25/3, 2023 at 8:18

You are totally right!! I have edited the post accordingly to include the original code from Clive's page, and it does come out in front at O3. – Steib 25/3, 2023 at 9:26

Hi @VladimirFГероямслава you might be interested in this discussion I opened in the stdlib repository, github.com/fortran-lang/stdlib/issues/703, where several methods were compared! While Clive's method does give good performance, it is not systematically the fastest; the "slower" implementation actually came out on top for ifx, which was quite unexpected. – Steib 26/3, 2023 at 12:52
