Character strings


Introduction

In C++, there are multiple ways to represent character strings. In this article, we will discuss character strings represented as one-dimensional arrays with elements of type `char`, a representation that originates from the C language. These strings are also known as null-terminated byte strings (NTBS). In the internal representation, after the last valid character (byte, octet) in the string, there is the character '\0' – the character with ASCII code 0, also called the null character. Thus, to represent the word "copil" in C/C++, which has 5 characters, 6 bytes will be used with the values: 'c', 'o', 'p', 'i', 'l', '\0'.

Declaring a string

A string is declared in C++ like this:
    char s[11];

This declares an array of 11 characters, with indices 0, 1, ..., 10. The string s can store at most 10 useful characters, because the character '\0' is stored after the last useful character. A string can also be initialized when it is declared. The following examples declare strings and initialize them with the string "copil":

    char s[11] = "copil"; // only 6 characters are used
    char t[]="copil"; // 6 bytes are automatically allocated for the string t: the 5 letters and the null character \0
    char x[6]={'c','o','p','i','l','\0'}; // initialization is similar to that of an array - strings are arrays
    char z[]={'c','o','p','i','l','\0'}; // 6 bytes are automatically allocated for the string
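The sizes stated in the comments above can be checked by the compiler. The following is a minimal sketch that repeats the four declarations at file scope and verifies each array size with static_assert; the helper usefulLength is a hypothetical name introduced only for this illustration.

```cpp
#include <cstring>

char s[11] = "copil";                         // 11 bytes reserved, 6 of them used
char t[] = "copil";                           // size deduced: 5 letters + '\0'
char x[6] = {'c', 'o', 'p', 'i', 'l', '\0'};  // explicit element list
char z[] = {'c', 'o', 'p', 'i', 'l', '\0'};   // size deduced from 6 initializers

// sizeof reports the size of the whole array in bytes (sizeof(char) is 1).
static_assert(sizeof(s) == 11, "s keeps its declared size");
static_assert(sizeof(t) == 6, "t: 5 letters plus the null character");
static_assert(sizeof(x) == 6, "x keeps its declared size");
static_assert(sizeof(z) == 6, "z: one byte per initializer");

// Hypothetical helper: the number of useful characters, as reported by
// strlen; the terminating '\0' is not counted.
int usefulLength(const char* str) {
    return static_cast<int>(std::strlen(str));
}
```

All four arrays hold the same string, so usefulLength returns 5 for each of them, even though s occupies 11 bytes and the others 6.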

Displaying a string

A string can be displayed with the << operator, which inserts it into the output stream:

    cout << s << endl;

Reading a string

The >> operator can be used to extract a string from the input stream:

    cin >> s;

Because of the way the >> operator works, strings containing spaces cannot be read this way: characters are read only up to the first space, which is not stored in the string. To read strings containing spaces, we can use the getline method of the cin object (or of any other istream object):

    istream& getline(char* s, streamsize n);

It reads characters from the input stream (for cin, the keyboard) into the string s until the end-of-line character '\n' is encountered, but no more than n-1 characters. The '\n' character is not stored in s, but it is extracted from the stream. For example:

    cin.getline(s, 11);

In other words, getline reads the whole line and discards the ENTER. Here is a complete example:

    #include <iostream>
    using namespace std;

    int main()
    {
        char nume1[31], nume2[31];
        cout << "What's your name? (last name, first name) ";
        cin.getline(nume1, 31);
        cout << "What's your friend's name? ";
        cin.getline(nume2, 31);
        cout << "Your name is " << nume1 << endl;
        cout << "You are friends with " << nume2 << endl;
        return 0;
    }

Another way to read a string that may contain spaces is the get method of istream objects, which we do not cover here.

Referencing a character in the string. Traversing a string

Because strings are actually arrays, the [] operator is used to refer to a character in the string, as in the following example:

    char s[] = "abac"; // the array has 5 elements: the 4 letters and the null character '\0'
    cout << s[3];      // c
    s[1] = 'r';
    cout << s;         // arac
    cout << s[10];     // ??? undefined behavior: the array has no element with index 10

In many situations it is necessary to process each character in the string. This requires traversing the string, which is done much like traversing any other array. The difference is that, for a character string, the length is not explicitly known. It can be determined with the strlen function, but we can also control the traversal by relying on the fact that the null character '\0' appears after the last valid character in the string. The following examples traverse a string and display its characters separated by spaces:

    char s[11];
    cin >> s; // a word is read, without spaces
    int i = 0;
    while(s[i] != '\0')
    {
        cout << s[i] << " ";
        i ++;
    }

or, shorter:

    char s[11];
    cin >> s; // a word is read, without spaces
    for(int i = 0 ; s[i] ; i ++)
        cout << s[i] << " ";
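The same traversal pattern, stopping at the null character, can be wrapped in small functions. The sketch below is only an illustration: lengthOf and countChar are hypothetical names, and in real code the standard strlen function would normally replace the first one.

```cpp
#include <cstddef>

// Length of a C string, found by walking up to the terminating '\0'.
// Hypothetical name; it computes the same value as the standard strlen.
std::size_t lengthOf(const char* s) {
    std::size_t i = 0;
    while (s[i] != '\0')
        i++;
    return i;
}

// Number of occurrences of the character c in s.
// The loop condition s[i] is false exactly when s[i] is '\0'.
std::size_t countChar(const char* s, char c) {
    std::size_t count = 0;
    for (std::size_t i = 0; s[i]; i++)
        if (s[i] == c)
            count++;
    return count;
}
```

For the string "abac", lengthOf returns 4 and countChar with the letter 'a' returns 2. Neither function needs the size of the array, only the guarantee that the string is null-terminated.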

Solved problems

Problem 1
From the text.in file, a multi-line text consisting of letters of the English alphabet, spaces and NewLine characters is read. In the text.out file, the initial text will be displayed, in which all words of the maximum length will be replaced with their reverse (mirror). The rest of the words and their arrangement on the lines will remain unchanged. The total number of characters in the file is at most 5000.
Example:
text.in
I thought the same as you
But I got the code wrong
text.out
I thguoht the same as you
But I got the code wrong

Solution   
#include <fstream>
#include <utility> // swap
using namespace std;
ifstream fin("text.in");
ofstream fout("text.out");

int main()
{
    char s[5001];
    int lmax=0;
    fin.get(s,5001,EOF); //read the entire file
    //calculate the maximum length of a word
    for(int i=0;s[i];i++)
        if(s[i]!=' ' && s[i]!='\n')
        {
            int j=i;
            while(s[j] && s[j]!=' ' && s[j]!='\n')
                j++;
            if(j-i>lmax) lmax=j-i;
        }
    //reverse the words of maximum length
    for(int i=0;s[i];i++)
        if(s[i]!=' ' && s[i]!='\n')
        {
            int j=i;
            while(s[j] && s[j]!=' ' && s[j]!='\n')
                j++;
            if(j-i==lmax)
            {
                for(int x=i,y=j-1;x<y;x++,y--)
                    swap(s[x],s[y]);
            }
        }
    fout << s;
    return 0;
}
    

Problem 2

A word s consisting of no more than 100 lowercase letters is read, followed by a natural number p (p<=100). Display the words obtained by removing a sequence of p consecutive letters from s.
Example: for s="adina" and p=3, the resulting words are:
na aa ad

Solution
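#include <iostream>
#include <cstring>
using namespace std;

int main()
{
    char s[101];
    int p;
    cin >> s >> p;
    // strlen returns an unsigned value, so strlen(s)-p would wrap around
    // to a huge number when p exceeds the length; store the length in an int
    int n = strlen(s);
    for(int i = 0; i <= n - p; i++)
    {
        char t[101], aux[101];
        strcpy(t, s);           // copy of the whole word
        strcpy(aux, s + i + p); // the letters after the removed sequence
        strcpy(t + i, aux);     // overwrite the copy starting at position i
        cout << t << " ";
    }
    return 0;
}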
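The removal step can be isolated in a function that builds the result directly in an output buffer, without the intermediate aux string. This is a minimal sketch; removeRun is a hypothetical name, and the function assumes that 0 <= start, that start + p does not exceed the length of s, and that out has room for the result.

```cpp
#include <cstring>

// Writes into out the word obtained from s by deleting the p letters that
// start at index start.
// Assumptions (not checked): 0 <= start, start + p <= strlen(s), and out
// can hold strlen(s) - p + 1 bytes.
void removeRun(const char* s, int start, int p, char* out) {
    std::memcpy(out, s, start);              // the letters before the removed run
    std::strcpy(out + start, s + start + p); // the letters after it, plus '\0'
}
```

For s="adina" and p=3, calling removeRun with start equal to 0, 1 and 2 produces "na", "aa" and "ad", the same words as in the example.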

Practice exercises

1) Two words a and b are read, each consisting of at most 20 lowercase letters. Display, each on a separate line and with the letters separated by spaces:
- the letters that appear in both words
- the letters that appear in at least one of the words
- the letters that appear in only one of the words

Example:
    for the words adina and alina it will be displayed:
    a i n
    a d i l n
    d l
    

2) Read a sentence with a maximum of 200 letters and spaces and then a syllable made up of exactly 2 letters. Calculate and display the number of occurrences of the syllable in the sentence.

Example:
        Ana are mere
        re
        =>2

3) A word consisting of no more than 20 letters is read. Interchange the first half of the word with the second. If the word consists of an odd number of letters, then the middle letter will remain in place.

Examples: the word "cada" turns into "daca", and "alina" into "naial".

4) A natural number n is read, followed by n words made up of at most 20 letters each. Calculate and display how many of the n-1 words read after the first one have the first word as a suffix.

 Example: if n=6, and the words read are ion, new year, ionel, broth, million,
        pawn => 3 (3 words from the last 5 end with the suffix ion).

5) Two words a and b consisting of no more than 20 letters each are read. Display all the suffixes of the word a that have the property of being prefixes of the word b. If there are no such suffixes, display the message "does not exist".

 Example: for the words a="rebele" and b="elegant" the required suffixes are "ele" and
        "e" (not necessarily in this order).