How to resolve the algorithm UTF-8 encode and decode step by step in the Ada programming language

Published on 12 May 2024 09:40 PM

Problem Statement

As described in UTF-8 and in Wikipedia, UTF-8 is a popular encoding of (multi-byte) Unicode code points into eight-bit octets. The goal of this task is to write an encoder that takes a Unicode code point (an integer representing a Unicode character) and returns a sequence of 1–4 bytes representing that character in the UTF-8 encoding. Then write the corresponding decoder, which takes a sequence of 1–4 UTF-8 encoded bytes and returns the corresponding Unicode character. Demonstrate the functionality of your encoder and decoder on the following five characters: A, ö, Ж, €, 𝄞.
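
The 1–4 byte rule comes down to a fixed bit layout, which can be sketched by hand as below. This is only an illustration of that layout, not part of the solution further down, which relies on the standard library. The names UTF8_Sketch, Byte_Array, Code_Point, Continuation, Encode and Decode are made up for this sketch, and it assumes well-formed input: it does not reject surrogates, overlong forms or malformed continuation bytes.

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure UTF8_Sketch is
   --  Hypothetical helper types for this sketch.
   type Byte_Array is array (Positive range <>) of Unsigned_8;
   subtype Code_Point is Unsigned_32 range 0 .. 16#10_FFFF#;

   --  Low six bits of Value, tagged as a continuation byte (10xxxxxx).
   function Continuation (Value : Unsigned_32) return Unsigned_8 is
     (Unsigned_8 (16#80# or (Value and 16#3F#)));

   function Encode (Code : Code_Point) return Byte_Array is
   begin
      if Code < 16#80# then          --  0xxxxxxx
         return (1 => Unsigned_8 (Code));
      elsif Code < 16#800# then      --  110xxxxx 10xxxxxx
         return (Unsigned_8 (16#C0# or Shift_Right (Code, 6)),
                 Continuation (Code));
      elsif Code < 16#1_0000# then   --  1110xxxx 10xxxxxx 10xxxxxx
         return (Unsigned_8 (16#E0# or Shift_Right (Code, 12)),
                 Continuation (Shift_Right (Code, 6)),
                 Continuation (Code));
      else                           --  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
         return (Unsigned_8 (16#F0# or Shift_Right (Code, 18)),
                 Continuation (Shift_Right (Code, 12)),
                 Continuation (Shift_Right (Code, 6)),
                 Continuation (Code));
      end if;
   end Encode;

   function Decode (Bytes : Byte_Array) return Code_Point is
      --  Payload bits of the lead byte, indexed by sequence length.
      --  A length outside 1 .. 4 raises Constraint_Error.
      Lead_Mask : constant array (1 .. 4) of Unsigned_32 :=
        (16#7F#, 16#1F#, 16#0F#, 16#07#);
      Result : Unsigned_32 :=
        Unsigned_32 (Bytes (Bytes'First)) and Lead_Mask (Bytes'Length);
   begin
      --  Each continuation byte contributes six more bits.
      for I in Bytes'First + 1 .. Bytes'Last loop
         Result := Shift_Left (Result, 6)
                   or (Unsigned_32 (Bytes (I)) and 16#3F#);
      end loop;
      return Result;
   end Decode;

   Euro : constant Byte_Array := Encode (16#20AC#);  --  the euro sign
begin
   for B of Euro loop
      Ada.Text_IO.Put (Unsigned_8'Image (B));
   end loop;
   Ada.Text_IO.New_Line;
   Ada.Text_IO.Put_Line (Unsigned_32'Image (Decode (Euro)));
end UTF8_Sketch;
```

For the euro sign (U+20AC) the first line of output lists the bytes 226 130 172, that is E2 82 AC in hexadecimal, and the second line prints 8364, the decimal value of 16#20AC#.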

Let's start with the solution:

Step-by-step solution: UTF-8 encode and decode in the Ada programming language

Source code in the Ada programming language

with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Integer_Text_IO;
with Ada.Text_IO;
with Ada.Wide_Wide_Text_IO;

procedure UTF8_Encode_And_Decode
is
   package TIO renames Ada.Text_IO;
   package WWTIO renames Ada.Wide_Wide_Text_IO;
   package WWS renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  Format i in hexadecimal, left-padded with fill to at least width characters.
   function To_Hex
     (i : in Integer;
      width : in Natural := 0;
      fill : in Character := '0') return String
   is
      holder : String(1 .. 20);
   begin
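      --  Put writes a based literal such as 16#41#, right-justified in holder.
      Ada.Integer_Text_IO.Put(holder, i, 16);
      declare
         --  Keep only the digits between the two '#' characters.
         hex : constant String := holder(Index(holder, "#")+1 .. holder'Last-1);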
         filled : String := Natural'Max(width, hex'Length) * fill;
      begin
         filled(filled'Last - hex'Length + 1 .. filled'Last) := hex;
         return filled;
      end;
   end To_Hex;

   input : constant Wide_Wide_String := "AöЖ€𝄞";
begin
   TIO.Put_Line("Character   Unicode    UTF-8 encoding (hex)");
   TIO.Put_Line(43 * '-');
   for WWC of input loop
      WWTIO.Put(WWC & "           ");
      declare
         filled : String := 11 * ' ';
         unicode : constant String := "U+" & To_Hex(Wide_Wide_Character'Pos(WWC), width => 4);
         --  Encode a one-character Wide_Wide_String; the result holds one UTF-8 byte per Character.
         utf8_string : constant String := WWS.Encode((1 => WWC));
      begin
         filled(filled'First .. filled'First + unicode'Length - 1) := unicode;
         TIO.Put(filled);
         for C of utf8_string loop
            TIO.Put(To_Hex(Character'Pos(C)) & " ");
         end loop;
         TIO.New_Line;
      end;
   end loop;
end UTF8_Encode_And_Decode;
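
The program above demonstrates only the encoding direction. A minimal sketch of the decoding half, assuming the same standard package Ada.Strings.UTF_Encoding.Wide_Wide_Strings, could look like this. The procedure name UTF8_Round_Trip and the type Code_Point_List are illustrative, and the sample characters are given as code points rather than literals so that the sketch does not depend on the source file encoding.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure UTF8_Round_Trip is
   package WWS renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  Hypothetical helper type for this sketch.
   type Code_Point_List is array (Positive range <>) of Natural;

   --  Code points of the five sample characters: A, ö, Ж, €, 𝄞.
   samples : constant Code_Point_List :=
     (16#41#, 16#F6#, 16#416#, 16#20AC#, 16#1D11E#);
begin
   for code of samples loop
      declare
         original : constant Wide_Wide_String :=
           (1 => Wide_Wide_Character'Val (code));
         --  Encode to UTF-8, then decode the bytes back again.
         utf8     : constant String := WWS.Encode (original);
         decoded  : constant Wide_Wide_String := WWS.Decode (utf8);
      begin
         Ada.Text_IO.Put_Line
           (Natural'Image (code) & " ->"
            & Natural'Image (utf8'Length) & " byte(s), round trip "
            & (if decoded = original then "ok" else "FAILED"));
      end;
   end loop;
end UTF8_Round_Trip;
```

The five samples should encode to 1, 2, 2, 3 and 4 bytes respectively, and every line should report that the decoded string equals the original.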


  
