We consider the following problem: Given two sequences of decimal digits. Determine the common digits in ascending order.
Depending on the number of digits in the two sequences, we can implement various solutions:
Search each term from the first sequence in the second sequence. The sequence with common values needs to be sorted.
If the sequences are sorted (or we sort them), proceed as above but use binary search. The sequence with common values is already sorted from the beginning.
If the sequences are sorted (or we sort them), we can use merging to find the common elements.
Characteristic Vector
None of the above solutions take into account a key property of the values in the sequence – that they are digits, i.e., between 0 and 9. We can proceed as follows:
For each sequence, use a vector with only 10 elements, indexed from 0 to 9; let's call these v[] and u[];
The indices of the elements in the sequence represent the digits – possible values of the elements in the sequence;
Initially, the elements of both vectors are 0;
For each value x (which is a digit) appearing in the first sequence, mark the corresponding element in vector v[] with 1, v[x] = 1;
Similarly, each element y from the second sequence will be marked in vector u[] with 1, u[y] = 1;
Finally, the common digits c (from 0 to 9) are those marked with 1 in both vector v[] and u[].
The two vectors v[] and u[] are characteristic vectors. Their elements characterize the digits between 0 and 9, indicating for each digit whether it is part of the corresponding sequence or not. Note that we do not need to store the elements of the two given sequences. We are only interested in the distinct values that appear in each sequence, regardless of their order.
Example:
When to Use the Characteristic Vector?
To use a characteristic vector, at least the following conditions must be met:
The order of input data does not matter
The input data are small values, natural numbers from an interval like [0, M], or can be equivalent to such numbers
Practically, we can use a characteristic vector if the available memory allows declaring a vector with an appropriate number of elements:
If the input data are in the range [0, 1000] or [0, 10000], we can use a characteristic vector
If the input data are in the range [0, 1 million], we can use a characteristic vector – although there might be a better method
If the input data are in the range [0, 1 billion], we cannot use a characteristic vector
Frequency Vector
Let's consider again a sequence of decimal digits. Determine the digit that appears most frequently. If there are multiple digits that appear the maximum number of times, determine the smallest (or largest, or all, etc.).