How to resolve the algorithm Strip control codes and extended characters from a string step by step in the C# programming language

Published on 12 May 2024 09:40 PM

Problem Statement

Strip control codes and extended characters from a string.

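The solution should demonstrate how to achieve each of the following results:

  • a string with control codes stripped (but extended characters not stripped)
  • a string with control codes and extended characters stripped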

In ASCII, the control codes have decimal values 0 through 31, plus 127. On an ASCII-based system, a string stripped of control codes has all of its characters within the range 32 to 126 decimal on the ASCII table. On a non-ASCII-based system, any character without a corresponding glyph in the ASCII range of 32 to 126 decimal is considered an extended character for the purpose of this task.
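The ranges in the task description can be written as two small predicates. This is only a minimal sketch: the class and method names (AsciiRanges, IsAsciiControl, IsPrintableAscii) are hypothetical and do not appear in the solution that follows.

```csharp
using System;

static class AsciiRanges
{
    // ASCII control codes: decimal 0 through 31, plus 127 (DEL).
    public static bool IsAsciiControl(char c) => c < 32 || c == 127;

    // Printable ASCII: decimal 32 (space) through 126 (~).
    public static bool IsPrintableAscii(char c) => c >= 32 && c <= 126;

    static void Main()
    {
        Console.WriteLine(IsAsciiControl('\t'));        // True  (tab is code 9)
        Console.WriteLine(IsPrintableAscii('A'));       // True  (code 65)
        Console.WriteLine(IsPrintableAscii('\u00E9'));  // False (code 233)
    }
}
```

Any character for which both predicates return false counts as an extended character under the task's definition.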

Let's start with the solution:

Step-by-step solution: Strip control codes and extended characters from a string in the C# programming language

Overview: The code below is a C# program that demonstrates how to strip control characters and extended characters from a given string.

Main Method:

  • The Main method declares the usual command-line argument array but does not use it.
  • It creates a test string that mixes plain ASCII text with symbol characters from outside the ASCII range.
  • It calls two methods, StripControlChars and StripExtended, each of which returns a filtered copy of the test string.
  • Finally, it prints the original, control-stripped, and extended-stripped versions of the test string.

StripControlChars Method:

  • This method takes a string as input and returns a new string without control characters.
  • It converts the string to a character array.
  • It uses a StringBuilder to build the new string, only adding characters that are not control characters.
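  • Char.IsControl returns true for characters in the Unicode control category: U+0000 to U+001F and U+007F to U+009F. This includes tab, carriage return, and line feed, and it reaches slightly beyond the ASCII control codes (0 to 31 and 127) by also covering U+0080 to U+009F.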
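To see the filter act on real control characters, the input can embed them with escape sequences. The sketch below repeats the same Char.IsControl approach in a hypothetical class named ControlStripDemo; the input strings are made up for illustration.

```csharp
using System;
using System.Text;

static class ControlStripDemo
{
    // Keep every char for which Char.IsControl returns false.
    public static string StripControlChars(string arg)
    {
        var buffer = new StringBuilder(arg.Length);
        foreach (char ch in arg)
            if (!Char.IsControl(ch)) buffer.Append(ch);
        return buffer.ToString();
    }

    static void Main()
    {
        // Tab (9), line feed (10), DEL (127) and NEL (U+0085) are all control characters.
        string input = "a\tb\nc\u007Fd\u0085e";
        Console.WriteLine(StripControlChars(input)); // abcde

        // U+00E9 is a letter, not a control character, so it is kept;
        // only the trailing carriage return and line feed are removed.
        Console.WriteLine(StripControlChars("caf\u00E9\r\n").Length); // 4
    }
}
```

The second call shows that this method leaves extended characters alone, which matches the first result the task asks for.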

StripExtended Method:

  • This method takes a string as input and returns a new string without extended characters.
  • It uses a StringBuilder to build the new string.
  • It converts each character to its numeric UTF-16 code unit value using Convert.ToUInt16.
  • It checks whether that value is within the printable ASCII range (32 to 126).
  • If the value is in this range, it appends the character to the StringBuilder.
  • Everything else is dropped: the ASCII control codes (0 to 31 and 127) as well as all extended characters (128 and above, such as accented letters, symbols, and emoji). This method alone therefore yields a string stripped of both control codes and extended characters.
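Because a .NET char is a single UTF-16 code unit, a character outside the Basic Multilingual Plane is stored as two chars (a surrogate pair). The sketch below, using a hypothetical class named ExtendedStripDemo and made-up inputs, suggests that the range check still removes such characters cleanly, since both halves of the pair are far above 126.

```csharp
using System;
using System.Text;

static class ExtendedStripDemo
{
    // Keep only chars whose UTF-16 code unit value is printable ASCII (32 to 126).
    public static string StripExtended(string arg)
    {
        var buffer = new StringBuilder(arg.Length);
        foreach (char ch in arg)
        {
            if (ch >= 32 && ch <= 126) buffer.Append(ch);
        }
        return buffer.ToString();
    }

    static void Main()
    {
        string emoji = "\U0001F600";                          // one code point, two UTF-16 code units
        Console.WriteLine(emoji.Length);                      // 2
        Console.WriteLine(StripExtended("hi" + emoji + "!")); // hi!

        // The tab (code 9) falls below the range, U+00E9 above it; both are removed.
        Console.WriteLine(StripExtended("a\tb\u00E9c"));      // abc
    }
}
```

Comparing the char directly against 32 and 126 is equivalent to the Convert.ToUInt16 call in the listing, because a char converts implicitly to its numeric value in a comparison.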

Sample Input and Output:

  • For the given test string, the program produces the following output:
Original: string of ☺☻♥♦⌂, may include control characters and other ilk.♫☼§►↔◄
Stripped of control codes: string of ☺☻♥♦⌂, may include control characters and other ilk.♫☼§►↔◄
Stripped of extended: string of , may include control characters and other ilk.
  • The control-stripped string is identical to the original: the symbols in the literal are ordinary Unicode characters above the ASCII range, not control codes, so Char.IsControl returns false for every character.
  • The extended-stripped string drops every symbol and keeps only the characters in the printable ASCII range of 32 to 126.
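A test string that contains real control codes makes the difference between the two results visible. The sketch below is an assumption-laden variant, not the original program: the class name EscapedTestDemo and the test string are hypothetical, and the two filters are written with LINQ instead of the StringBuilder loops used in the listing.

```csharp
using System;
using System.Linq;

static class EscapedTestDemo
{
    // LINQ variant of the control-code filter (swapped in for the StringBuilder loop).
    public static string StripControlChars(string s) =>
        new string(s.Where(c => !char.IsControl(c)).ToArray());

    // LINQ variant of the printable-ASCII filter.
    public static string StripExtended(string s) =>
        new string(s.Where(c => c >= 32 && c <= 126).ToArray());

    static void Main()
    {
        // Hypothetical input: BEL (\a), tab and line feed are control codes;
        // U+266B and U+00A7 are extended characters.
        string test = "bell\a tab\t symbols \u266B\u00A7 end\n";

        // Control codes are removed, the two symbols remain.
        Console.WriteLine("[{0}]", StripControlChars(test));

        // Control codes and symbols are both removed, leaving two adjacent spaces.
        Console.WriteLine("[{0}]", StripExtended(test)); // [bell tab symbols  end]
    }
}
```

The brackets in the output only mark where each result starts and ends, so the removed trailing line feed is easy to spot.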

Source code in the C# programming language

using System;
using System.Text;

namespace RosettaCode
{
    class Program
    {
        static void Main(string[] args)
        {
            string test = "string of ☺☻♥♦⌂, may include control characters and other ilk.♫☼§►↔◄";
            Console.WriteLine("Original: {0}", test);
            Console.WriteLine("Stripped of control codes: {0}", StripControlChars(test));
            Console.WriteLine("Stripped of extended: {0}", StripExtended(test));
        }

        static string StripControlChars(string arg)
        {
            char[] arrForm = arg.ToCharArray();
            StringBuilder buffer = new StringBuilder(arg.Length);//This many chars at most
            
            foreach(char ch in arrForm)
                if (!Char.IsControl(ch)) buffer.Append(ch);//Only add to buffer if not a control char

            return buffer.ToString();
        }

        static string StripExtended(string arg)
        {
            StringBuilder buffer = new StringBuilder(arg.Length); //Max length
            foreach(char ch in arg)
            {
                UInt16 num = Convert.ToUInt16(ch);//A .NET char is one UTF-16 code unit
                //Printable ASCII keeps the same values (32 to 126) in UTF-16; control codes fall below that range (or at 127) and extended characters above it
                if((num >= 32u) && (num <= 126u)) buffer.Append(ch);
            }
            return buffer.ToString();
        }
    }
}


  
