numbot

De
computatis

on counting information

Created by Wojciech Migda
Kraków, 2013-10-04

In a galaxy far far away...

the great constructor and designer Trurl was resting in his home when a sudden, loud knock on his door destroyed his peace. With grunts and curses muffled with each step, he approached the door. When he opened it in front of him stood a messenger:

‐ Sir Trurl, my King has an order for you...
Trurl smiled.

The Problem

The messenger introduced the problem to be solved: there are two distant alied kingdoms which have devised a communication using exchange of comets consisting of bits of information. They already know how to build them and send them into space, but so far they have failed when it actually came to extracting the information from bodies of the comets in a fast and reliable manner.

The Problem

Messenger

‐ Your task, Sir, will be to build two robots which will intercept each incoming comet and extract the bits of data carried inside, which then will be translated into a message.

The Problem

Robot

‐ The robots must have means of controlling them programmaticaly in C++. Specificaly, it must be possible to query how many bits of data were extracted from the comet and how much space (in bytes) is needed to store the received message.

The Problem

Trurl

‐ It is a deal, my friend. Tell your King that the robots will be ready in a week.
answered Trurl. The messenger bowed, hopped on his vehicle and swooshed away.

Solution #1

The robots were quite simple to build, and their infterface included these two methods:

/* get number of bits in the message */
int getSize(void) const;

/* get number of bytes of space to hold the message */
int getLength(void) const;

Solution #1

The messenger returned on a set date a week later. He brought the payment and left with the robots along with a nicely printed manual on how to program them.

Controller

Solution #1

Unfortunately, the very next day the messenger returned with a complaint.
The contractor hired by the King and appointed to write scripts to control the robots, despite having the manual, at one point made a mistake and used getLength instead of getSize.

He was in a hurry and the error was not caught during the code review because the names of the methods were not distinctive enough to avoid ambiguity.

Solution #2

Trurl agreed to rework the interface and promised that the costs will be covered by the warranty.
Learning from his mistakes he modified the interface renaming the two methods:

/* get number of bits */
int getSizeInBits(void) const;

/* get number of bytes */
int getSizeInBytes(void) const;

Response

A month later the Messenger arrived again.
This time he brought a letter from the King with words of gratitude for the delivered robots.
The letter also mentioned that the newly established communication between kingdoms has brought wealth and prosperity.

Another problem

Lastly, the King would like to upgrade the data storage system to use words instead of bytes, and that will necesitate modification of the interface. However, the other kingdom does not plan such storage upgrade so the solution must be backward compatible.

Solution #3

Trurl agreed to provide the extension. He added another method:

/* get number of bits */
int getSizeInBits(void) const;

/* get number of bytes */
int getSizeInBytes(void) const;

/* get number of words */
int getSizeInWords(void) const;
When he handed over the upgrade package to the Messenger he included a hefty bill, as this time the extra work was not covered by the warranty.

The story continues

After another two weeks the messenger arrived again. This time he had two new requests.

The story continues

  • there will be another storage upgrade. But to cut down on costs the King demands a solution which will work with any data granularity employed, be that bits, nibbles, bytes, words, double words, and so on.

The story continues

  • there was another issue with the interface found few days ago. Apparently, even though the API was used correctly, the contractor has used an integer variable which held number of bits to iterate over an array of bytes...
    Fortunately, this time code inspection was successful, but the King demands such an interface, which will effectively prohibit making such mistakes.

Trurl's misery

As the messenger left Trurl began to ponder. Despite his intense efforts he could not come up with a solution which would satisfy the King's request.
Finally, he broke in tears and lamented loudly.

Trurl

Help is on the way

Klapaucjusz

Trurl's cries were heard by his long time friend Klapaucius, who has just returned from the intergalactic C++11 congress and was passing by a Trurl's house.

Help is on the way

Trurl presented his story to Klapaucius, who in turn responded with a rant on custom measurement units, user-defined literals, constexpr keyword, and operator""().

Striving for perfection

When Trurl woke up the next day he sat down and while still remembering his friend's advice he sketched the ultimate solution.

Striving for perfection

This solution boils down to having just one method, which will return a result being of a new custom type representing general measure of information.

/* get amount of information */
Information getSize(void) const;

Construction

It will be possible to construct Information:

empty:

Information info; // zero information

copy constructed from other instance:

Information old; // source
Information current(old);

Assignment

It will only be possible to assign Information with other instance:

Information rhs; // source
Information lhs; // destination

lhs = rhs;

Comparisons

It will be possible to compare Information:

Information lhs, rhs;

lhs == rhs;
lhs != rhs;
lhs < rhs;
lhs <= rhs;
lhs > rhs;
lhs >= rhs;

Arithmetics

It will be possible to perform some calculations on Information.

Information lhs, rhs;

lhs += rhs;
lhs -= rhs;
lhs + rhs;
lhs - rhs;

/* these don't make sense */
lhs++;
++lhs;
lhs--;
--lhs;

Arithmetics

Information             lhs, rhs;
Information::value_t    scalar;

lhs * scalar;
scalar * rhs;

lhs / scalar; /* result is of type Information */
lhs / rhs; /* result is a scalar of type Information::value_t */

lhs.div(rhs); /* result is a quotient|remainder pair
               * represented by Information::pair_t */

Literals

Information will be specified in units. Units will be explicitly stated when a value of Information is declared.

Information info;

info = 27_bits;
info = 14_nibbles;
info = 47_bytes;
info = 3_octets;
info = 9_words;
info = 0_dwords;
info = 24_qwords;

Units

There will be predefined constant values of units of measurements.

Information bit; /* 1 bit*/
Information nibble; /* 1 nibble */
Information byte; /* 1 byte */
Information octet; /* 1 octet */
Information word; /* 1 word */
Information dword; /* 1 dword */
Information qword; /* 1 qword */

qword == 2 * dword == 4 * word == 8 * octet;
byte == CHAR_BIT * bit;
octet == 2 * nibble == 8 * bit;

Happy end

No details of Trurl's implementation have spilled to the public, but the popular story is that the King was truely delighted with the final solution and rewarded the constructor accordingly.

But the story doesn't end here. With a bit of an effort we can follow Trurl's steps and recreate his work.

Information class

First, let's define some types:

class Information
{
public:
/* core storage type */
typedef std::size_t value_t;

};

Information class

First, let's define some types:

class Information
{
public:
typedef std::size_t value_t;

/* pair_t represents result of div  */
typedef struct pair
{
    value_t     q; // quotient
    value_t     r; // remainder
    constexpr pair(const value_t q, const value_t r) : q(q), r(r) {}
} pair_t;

};

Information class

Now we can declare member data:

class Information
{
private:
/* amount of information, can use arbitrary units */
value_t     m_amount;

};

Information class

We will follow with construction:

class Information
{
public:
constexpr Information() : m_amount(0){}
constexpr Information(const Information & rhs) : m_amount(rhs.m_amount) {}

private:
constexpr Information(const value_t amount) : m_amount(amount) {}
};

constexpr

So what does this constexpr mean?

constexpr

In short:
if arguments to the expression declared as constexpr are constants or other constexpr expressions, then the value of such expression will be evaluated during compile time.


Note: body of a constexpr function must comprise of a single statement.

Information class

then operators:

class Information
{
public:
Information & operator+=(const Information &rhs)
{
    m_amount += rhs.m_amount;
    return *this;
}
Information & operator-=(const Information &rhs)
{
    m_amount -= rhs.m_amount;
    return *this;
}

};

Information class

and more operators:

class Information
{
public:
constexpr Information operator+(const Information & rhs) const
    {return Information(m_amount + rhs.m_amount);}
constexpr Information operator-(const Information & rhs) const
    {return Information(m_amount - rhs.m_amount);}
constexpr Information operator*(const value_t rhs) const
    {return Information(m_amount * rhs);}
constexpr Information operator/(const value_t rhs) const
    {return Information(m_amount / rhs);}

constexpr value_t operator/(const Information & rhs) const
    {return m_amount / rhs.m_amount;}
constexpr value_t operator%(const Information & rhs) const
    {return m_amount % rhs.m_amount;}
};

Information class

and even more operators:

class Information
{
public:
constexpr bool operator==(const Information & rhs) const
    {return m_amount == rhs.m_amount;}
constexpr bool operator!=(const Information & rhs) const
    {return m_amount != rhs.m_amount;}
constexpr bool operator<=(const Information & rhs) const
    {return m_amount <= rhs.m_amount;}
constexpr bool operator>=(const Information & rhs) const
    {return m_amount >= rhs.m_amount;}
constexpr bool operator<(const Information & rhs) const
    {return m_amount < rhs.m_amount;}
constexpr bool operator>(const Information & rhs) const
    {return m_amount > rhs.m_amount;}
};

Information class

basic conversions:

class Information
{
public:
constexpr value_t to(const Information & rhs) const
    {return *this / rhs;}

constexpr pair_t div(const Information & rhs) const
    {return pair(*this / rhs, *this % rhs);}

constexpr pair_t convert(const Information & rhs) const
    {return this->div(rhs);}
};

User-defined literals

C++11 introduced UDL syntax [over.literal]:


operator "" identifier(parameter-declaration-clause )


then we can write:

long operator "" _MHz(long double);

long frequency = 390.3_MHz;

User-defined literals

We will use UDLs to define these suffixes:

constexpr Information operator"" _bit(unsigned long long x)
    {return Information(x);}

constexpr Information operator"" _nibble(unsigned long long x)
    {return Information(x * 4u);}
constexpr Information operator"" _byte(unsigned long long x)
    {return Information(x * CHAR_BIT);}
constexpr Information operator"" _octet(unsigned long long x)
    {return Information(x * 8u);}
constexpr Information operator"" _word(unsigned long long x)
    {return Information(x * 16u);}
constexpr Information operator"" _dword(unsigned long long x)
    {return Information(x * 32u);}
constexpr Information operator"" _qword(unsigned long long x)
    {return Information(x * 64u);}

User-defined literals

but since Information(const value_t amount) is private we need to friend those operators:

class Information
{
friend constexpr Information operator"" _bit(unsigned long long x);
friend constexpr Information operator"" _nibble(unsigned long long x);
friend constexpr Information operator"" _byte(unsigned long long x);
friend constexpr Information operator"" _octet(unsigned long long x);
friend constexpr Information operator"" _word(unsigned long long x);
friend constexpr Information operator"" _dword(unsigned long long x);
friend constexpr Information operator"" _qword(unsigned long long x);
};

Convenience units

to finish, for convenience we will define information units:

/* in global scope */
constexpr Information bit(1_bit);
constexpr Information nibble(1_nibble);
constexpr Information byte(1_byte);
constexpr Information octet(1_octet);
constexpr Information word(1_word);
constexpr Information dword(1_dword);
constexpr Information qword(1_qword);

constexpr

What difference does it make?

constexpr

As an example let's take this definition:

// file main.cpp

Information  foo = 147_bit;
and look at the generated assembly x86 code...

constexpr

without constexpr we will get (default optimization):

foo:
	.zero	4  ;;; declaration in memory
...
	movl	$foo, %eax  ;;; at program start
	movl	$147, 4(%esp)
	movl	$0, 8(%esp)
	movl	%eax, (%esp)
	call	_Zli4_bity  ;;; ctor called
whereas having constexpr gives:
foo:
	.long	147  ;;; declaration in memory

constexpr

and with an -O3 optimization:

foo:
	.zero	4
...
	movl	$147, foo  ;;; runtime initialization
constexpr version:
foo:
	.long	147

constexpr

What about const definition?

// file main.cpp

Information  foo = 147_bit;
const Information  goo = 279_bit;
...
return foo + goo;

constexpr

regular code with -O3 optimization:

	.local	_ZL3goo  ;;; again, declaration in memory
	.comm	_ZL3goo,4,4
...
	movl	$279, _ZL3goo  ;;; and runtime initialization
...
	movl	_ZL3goo, %edx  ;;; foo + goo
	addl	foo, %edx      ;;; referencing memory location
constexpr version:
	movl	foo, %edx   ;;; foo + goo
	addl	$279, %edx  ;;; goo is just a numeric literal

Thank you

Q&A

https://github.com/WojciechMigda/information
http://wojciechmigda.github.io/de-computatis-presentation/#/