
Task 1. Case Mapping

Available Marks: 7

Perl's regular expressions have been adopted by other systems, sometimes incompletely. Version 9.1 of the SAS enterprise database system, for example, didn't implement case mapping codes in replacement text. This task does so, but for any string.

Embedded in a string of printable ASCII characters are the following symbol sequences, which change the case of the characters that follow:

Code  State change
\u    The following single character is converted to upper case
\l    The following single character is converted to lower case
\U    All following characters are converted to upper case
\L    All following characters are converted to lower case
\E    Stop converting case

Rules

  • Only letters can be changed, so \u# or \L123\E do not alter the # or 123.
  • \u and \l can be nested inside \L and \U, so \Um\lcdonald\E becomes McDONALD.
  • \U or \L terminate any previous \U or \L without the need for \E.
  • All code sequences are removed from the result, including redundant ones such as repeated \E, \U or \L, or \u nested inside \U etc.
  • There are no error cases: a backslash followed by any character other than those above is left uninterpreted.
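
The rules above can be sketched compactly by splitting the string into code sequences and plain text. This is only an illustrative sketch, not the solution given further down: the name map_case is hypothetical, and it assumes that when \u and \l are both pending the later one wins, and that \E also cancels a pending \u or \l. The task statement does not settle either point.

```python
import re

def map_case(text):
    # Hypothetical helper: apply the five case codes to one line of text.
    run = None     # active \U or \L conversion: str.upper, str.lower or None
    once = None    # pending \u or \l conversion for the next character only
    out = []
    # The capturing group makes re.split keep the code sequences as tokens.
    for token in re.split(r"(\\[ulULE])", text):
        if token == r"\U":
            run = str.upper
        elif token == r"\L":
            run = str.lower
        elif token == r"\E":
            run = once = None          # assumption: \E also clears a pending \u or \l
        elif token == r"\u":
            once = str.upper           # assumption: a later \u or \l replaces an earlier one
        elif token == r"\l":
            once = str.lower
        elif token:
            # Plain text. A backslash that is not part of a code sequence
            # ends up here and is copied through unchanged.
            head, tail = token[0], token[1:]
            if once:
                head = once(head)      # a single-character code overrides \U or \L
                once = None
            elif run:
                head = run(head)
            out.append(head + (run(tail) if run else tail))
    return "".join(out)

print(map_case(r"\Um\lcdonald\E"))     # McDONALD
```

Because str.upper and str.lower leave digits and punctuation alone, the "only letters can be changed" rule needs no separate check here.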

Write a program that reads a file of lines representing perl-style case mapping strings, preceded by the number of lines, and displays the converted lines. There can be up to 20 lines and each line can have up to 100 characters.
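
The input layout described above (a count line, then that many strings) can be parsed as in the following sketch. The function name read_cases and its max_lines parameter are hypothetical, and capping the count at 20 is one possible reading of the stated limit.

```python
def read_cases(text, max_lines=20):
    # Hypothetical helper: return the strings that follow the leading count line.
    lines = text.splitlines()
    if not lines:
        return []
    # The first line gives how many strings follow; never take more than the limit.
    count = min(int(lines[0]), max_lines)
    return lines[1:1 + count]

print(read_cases("2\nfirst\nsecond\nignored"))   # ['first', 'second']
```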

Example

Input:

3
\Uunsw\E Computing
\uprogcomp \l2013\u.
T\LITLE CASE\E is also known as \L\uproper CASE\E.

Output:

UNSW Computing
Progcomp 2013.
Title case is also known as Proper case.

Test Data

You should test your program on the following data:
9
Mapping "\UuPpEr"\E and "\LlOWeR\E" case
\uNon-\lNested \using\lles
Nested \L\umc\uvEE\E and \Uo'l\le\la\lr\ly\E, \U\lR\Eedundant sequences.
Nested same-case: \Uab\uc\ud\uef\E and \L\lAB\lC\lDE\lF\E
Punctuation is unaffected\U:!@#$%^&*()_-+={}[]'";:,.<>/?1234567890\E
To get a \u\ precede it with either \l\u or \u\l.
\u\E is optional and can be redundant\E\E\E: \u\Uh\u\LeLLo[endofline] produces \Uh\LeLLo
Pointless repeats: \L\L\L\L\L\L\L\L\U\L\L\L\L\U\U\U\L\L\L\LX\E
Non-special sequences \t \n \e \A \$ \\



#-------------------------------------------------------------------------------
# Name:        Task 1. Case Mapping
# Purpose:
#
# Author:      Joseph.Lai
#
# Created:     22/05/2014
# Copyright:   (c) Joseph.Lai 2014
# Licence:     <your licence>
#-------------------------------------------------------------------------------
# List here instead of File Read for Testing purposes
list1 = [
3,
r"\Uunsw\E Computing",
r"\uprogcomp \l2013\u.",
r"T\LITLE CASE\E is also known as \L\uproper CASE\E.",
]

def doGetFile():
    # Read the input file and return its lines as a list of strings.
    # For testing, return list1 instead of reading the file.
    #return list1

    fileName = "Input Task1 Case Mapping.txt"

    # The with-statement closes the file even if reading fails
    with open(fileName, "r") as fo:
        # splitlines() creates a list from the file input, one entry per line
        return fo.read().splitlines()

def doProcess(sentence):
    newSentence = ""
    upperLetter = False     # \u seen: upper-case the next character only
    lowerLetter = False     # \l seen: lower-case the next character only
    upperWord = False       # \U seen: upper-case everything until \L or \E
    lowerWord = False       # \L seen: lower-case everything until \U or \E

    i = 0
    while i < len(sentence):
        ch = sentence[i]

        # Determine if Control Characters: a backslash followed by u, l, U, L or E
        if ch == "\\" and i + 1 < len(sentence) and sentence[i+1] in "ulULE":
            code = sentence[i+1]
            if code == "U":
                upperWord = True
                lowerWord = False
            elif code == "L":
                lowerWord = True
                upperWord = False
            elif code == "u":
                upperLetter = True
            elif code == "l":
                lowerLetter = True
            else:           # "E" stops all conversion
                upperLetter = False
                upperWord = False
                lowerLetter = False
                lowerWord = False
            i += 2          # the code sequence itself is removed from the result
            continue

        # Normal character. A backslash that does not start a code sequence is
        # uninterpreted, so it also arrives here and is copied to the result.
        # upper() and lower() leave digits and punctuation unchanged.
        if upperLetter:
            ch = ch.upper()
            upperLetter = False
        elif lowerLetter:
            ch = ch.lower()
            lowerLetter = False
        elif upperWord:
            ch = ch.upper()
        elif lowerWord:
            ch = ch.lower()
        newSentence = newSentence + ch
        i += 1
    return newSentence


# ==================================================================================
maxLines = 20        #maximum num of sentences in the input
maxLen = 100         #maximum num of letters in sentence
lines = doGetFile()  #not called "list", which would shadow the built-in type

# Process the List
# ==================================================================================
# The first line holds the number of sentences that follow
maxcount = int(lines[0]) if lines else 0
count = 0
for sentence in lines[1:]:
    if count >= maxLines:
        print("Maximum Records Reached")
        break
    if len(sentence)>maxLen:
        print("{} is too long by {} chars".format(sentence, len(sentence)-maxLen))
    newSentence = doProcess(sentence)
    print("{} : {}".format(sentence,newSentence))
    count+=1

print("Records Read: {} , Max: {}".format(count,maxcount))




Input File: Input Task1 Case Mapping.txt (contains the nine-line Test Data listed above)







