Madison, WI, United States
Joy Young ~~

2012/10/03

60 Details in C++ that you may not know

Name and Type

1.       The scope of a derived class is treated as nested in its base class when it comes to name lookup. (help understand hiding)
2.       decltype yields the declared type for a VARIABLE name, a reference type for an assignable (lvalue) EXPRESSION, and a non-reference type for a non-assignable EXPRESSION. e.g.
decltype(ref_int + 0) //int
decltype(val_int) // int
decltype ((val_int)) // int &
3.       In implicit type conversion, char is promoted to int, not short.
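The three notes above can be checked at compile time. This is only a small sketch: val_int and ref_int come from the example in item 2, while Base, Derived, hidden_and_visible and char_promotes_to_int are names made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

int val_int = 0;
int &ref_int = val_int;

// Item 2: a variable name gives its declared type.
static_assert(std::is_same<decltype(val_int), int>::value, "variable name");
static_assert(std::is_same<decltype(ref_int), int &>::value, "reference variable");
// A parenthesized name is an lvalue expression, so decltype adds a reference.
static_assert(std::is_same<decltype((val_int)), int &>::value, "lvalue expression");
// An arithmetic result is not assignable, so decltype gives a plain int.
static_assert(std::is_same<decltype(ref_int + 0), int>::value, "non-assignable expression");

// Item 3: char operands are promoted to int, not to short.
static_assert(std::is_same<decltype('a' + 'b'), int>::value, "char promotes to int");

// Item 1: Derived's scope is nested in Base's scope, so lookup stops at
// Derived::size and never reaches Base::size (hiding, not overloading).
struct Base {
    int size() const { return 1; }
};
struct Derived : Base {
    int size(int scale) const { return 2 * scale; }
};

int hidden_and_visible() {
    Derived d;
    // d.size() would not compile here: Base::size() is hidden.
    return d.Base::size() + d.size(3); // qualified lookup still finds it
}

bool char_promotes_to_int() {
    char c = 'x';
    return sizeof(c + c) == sizeof(int);
}
```

Calling d.size() without arguments is the line that fails to compile, which is the practical face of hiding.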

Pointer and Reference

4.       A past-the-end pointer is not an invalid pointer. Although it must not be dereferenced, it can be compared with other pointers into the same array. In contrast, ordering two pointers to unrelated objects with ‘<’ gives an unspecified result (comparing them for equality is still fine).
5.       A reference is always treated as the object it refers to: sizeof(ref), typeid(ref), etc. all give info about the underlying object, with only one exception: decltype(ref) gives the reference type.
6.       A pointer obtained from new Type[0] is not a nullptr but a past-the-end pointer; it serves just like vec.end() when the vector is empty.
7.       Use std::less<T>()(a, b) instead of ‘<’ when writing a template, because if T is a pointer type and the pointers do not point into the same array, the result of ‘<’ on them is unspecified, while std::less guarantees a consistent total order.
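A short sketch of items 4, 6 and 7 follows. The helper names (ordered_before, count_positive, zero_length_new_is_not_null) are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Item 7: std::less gives a consistent total order even for pointers
// that do not point into the same array.
template <typename T>
bool ordered_before(const T &a, const T &b) {
    return std::less<T>()(a, b);
}

// Item 4: 'last' is a past-the-end pointer. It is only compared,
// never dereferenced.
std::size_t count_positive(const int *first, const int *last) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first > 0) ++n;
    }
    return n;
}

// Item 6: a zero-length array new still returns a non-null pointer,
// which must be released with delete[] like any other.
bool zero_length_new_is_not_null() {
    int *p = new int[0];
    bool non_null = (p != nullptr);
    delete[] p;
    return non_null;
}
```

With two unrelated ints x and y, exactly one of ordered_before(&x, &y) and ordered_before(&y, &x) is true, which is what a sorted container of pointers needs.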

Ctor/Dtor

8.       A ctor can serve in static_cast: static_cast<T>(x) may call a converting ctor of T, even an explicit one.
9.       Inherited ctors keep the explicit/constexpr properties of the base ctors. Default arguments are not inherited. Instead the derived class gets multiple ctors in which each parameter with a default argument is successively omitted.
10.   If a class defines a dtor, even if it uses =default, the compiler will not synthesize a move operation for that class.
11.   The reason why the member initialization order depends on the declaration order rather than the order in which members appear in the initializer list is that the dtor must destroy them in reverse order, and there is no way for a dtor to know the order in which each ctor initialized the members.
12.   When you use ‘= default’ to force the compiler to generate a default move ctor for your class, but the compiler cannot (e.g. not all members are movable), the compiler will define it as deleted.
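Items 10 and 12 are easy to observe with a member that counts how it was transferred. This is a sketch only: Tracker, Plain, WithDtor, NoMove, Holder and move_falls_back_to_copy are all invented names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Records whether it was copied or moved into place.
struct Tracker {
    int copies;
    int moves;
    Tracker() : copies(0), moves(0) {}
    Tracker(const Tracker &o) : copies(o.copies + 1), moves(o.moves) {}
    Tracker(Tracker &&o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

struct Plain {
    Tracker t; // no copy-control members declared: move ctor is synthesized
};

struct WithDtor {
    Tracker t;
    ~WithDtor() = default; // item 10: even a defaulted dtor suppresses the move ctor
};

// True when T(std::move(x)) ended up copying the Tracker member.
template <typename T>
bool move_falls_back_to_copy() {
    T source;
    T target(std::move(source));
    return target.t.copies == 1 && target.t.moves == 0;
}

// Item 12: a defaulted move ctor silently becomes deleted
// when a member cannot be moved.
struct NoMove {
    NoMove() {}
    NoMove(const NoMove &) {}
    NoMove(NoMove &&) = delete;
};
struct Holder {
    NoMove m;
    Holder() {}
    Holder(Holder &&) = default; // accepted, but defined as deleted
};
static_assert(!std::is_move_constructible<Holder>::value,
              "the defaulted move ctor of Holder is deleted");
```

Some compilers warn on the Holder move ctor, which is a helpful hint that the = default did not do what was asked.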

Move

13.   The compiler uses the copy ctor for a move if the class has no move ctor. But if you declare none of the copy-control members (copy ctor, copy assignment operator, dtor) and all non-static data members are movable, you get a real synthesized move ctor from the compiler.
14.   IO stream objects can be moved.
15.   After a move operation, do not assume any particular state of the moved-from object. E.g. people may assume a moved-from vector has size() == 0 and that empty() on it returns true. Standard library types only promise a valid but unspecified state, so such calls are safe but their results are not something to rely on. A user-defined class may promise even less: perhaps the whole underlying pimpl object was moved away, and a call to empty() or size() would be forwarded through a null pointer.
16.   When a std container such as vector reallocates, it moves its elements only if they have a noexcept move ctor; otherwise it copies them, because it could not guarantee exception safety otherwise. (Is this part of the reason that VC11 does not support the noexcept keyword yet?)
17.   Classes that define a move ctor/assignment operator must also define their own copy operations; otherwise, those copy operations are defined as deleted by default.
18.   You can use a move_iterator when you want to move all elements from a container to another.
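Here is a sketch of items 16 and 18. move_all, ThrowingMove and SafeMove are made-up names; std::move_if_noexcept is the standard helper that expresses the "move only if it cannot throw" choice, though I am only assuming that a given library implementation uses it internally.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Item 18: move every element of src into a new vector.
std::vector<std::string> move_all(std::vector<std::string> &src) {
    std::vector<std::string> dst(std::make_move_iterator(src.begin()),
                                 std::make_move_iterator(src.end()));
    src.clear(); // the leftovers are moved-from strings, drop them
    return dst;
}

// Item 16: a move ctor without noexcept versus one with it.
struct ThrowingMove {
    ThrowingMove() {}
    ThrowingMove(const ThrowingMove &) {}
    ThrowingMove(ThrowingMove &&) {} // may throw as far as the compiler knows
};
struct SafeMove {
    SafeMove() {}
    SafeMove(const SafeMove &) {}
    SafeMove(SafeMove &&) noexcept {}
};

// move_if_noexcept hands out a const lvalue reference (so the copy ctor runs)
// when the move ctor might throw, and an rvalue reference otherwise.
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<ThrowingMove &>())),
                           const ThrowingMove &>::value,
              "throwing move: copy instead");
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<SafeMove &>())),
                           SafeMove &&>::value,
              "noexcept move: really move");
```

So marking a move ctor noexcept is not cosmetic: it decides whether a growing vector moves or copies the elements.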

Functions

19.   Using directives create an overload set of functions from different namespaces:
using namespace sp1; //has print(int)
using namespace sp2;//has print(double)

print(1); //sp1::print
print(1.2);//sp2::print
20.   When looking for a best match for overloaded functions, the non-member (normal) functions and member functions race together.
21.   Each function default parameter can have its default specified only once. Any subsequent declaration can add a default only for a parameter that has not previously had a default specified.
int f(int , int = 2); int f (int =1, int);// OK, incremental specification
22.   A static member can be used as a default argument of a member function before its definition and even before its declaration in the class body (but, of course, the declaration must exist somewhere in the class).
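Items 19, 21 and 22 fit in one small sketch. The function bodies and the Screen class are assumptions added for illustration; only the declarations come from the notes above.

```cpp
#include <cassert>
#include <string>

// Item 19: two namespaces each contribute one print overload.
namespace sp1 { inline std::string print(int) { return "sp1::print(int)"; } }
namespace sp2 { inline std::string print(double) { return "sp2::print(double)"; } }
using namespace sp1;
using namespace sp2;

// Item 21: defaults are added incrementally, one declaration at a time.
int f(int, int = 2);
int f(int = 1, int); // OK: the second parameter already has its default
int f(int a, int b) { return 10 * a + b; }

// Item 22: 'background' is used as a default argument before its declaration.
struct Screen {
    int clear(char c = background) const { return c; }
    static const char background = '#';
};
```

Here print(1) picks sp1::print and print(1.2) picks sp2::print, while f() evaluates as f(1, 2) and f(3) as f(3, 2).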

Member functions

23.   When using = default (as with other member functions), if it appears inside the class, the function is implicitly inline; if it appears on a definition outside the class, it is not inline by default.
24.   Besides const, you can place a reference qualifier on a member function to indicate whether *this must be an lvalue or an rvalue. E.g.
void mem_func(int) const &&;
means roughly mem_func(const Class && this_obj, int). The && here only restricts the call to rvalue objects; it does not move anything by itself. A & qualifier on operator=() avoids silly statements like: str0 + str1 = “abc”. Operator=() should certainly behave differently when *this is an rvalue or an lvalue.
Btw, it is nice but not supported by many compilers yet.
25.   = default may appear on the declaration inside the class or on a definition outside it, while = delete must appear on the first declaration of the function.
26.   We can provide a definition for pure virtual function, but we can only write it outside the class.
27.   A virtual function can have default arguments, but the one used is determined by the static type of the object expression in the call. There is no dynamic dispatch for default arguments.
28.   Member functions can also be defined as volatile, only volatile member functions can be called on volatile objects.
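A sketch of items 24 and 27, for compilers that do support reference qualifiers. Widget, Str, Base and Derived are invented for this example.

```cpp
#include <cassert>
#include <string>

// Item 24: overloading on the value category of *this.
struct Widget {
    std::string kind() const & { return "lvalue"; }
    std::string kind() const && { return "rvalue"; }
};

// A & qualifier on operator= rejects assignment to temporaries.
struct Str {
    int value;
    Str() : value(0) {}
    Str &operator=(int v) & { value = v; return *this; }
};
// Str() = 5;  // would not compile: *this is an rvalue

// Item 27: the default argument comes from the static type,
// the function body from the dynamic type.
struct Base {
    virtual int get(int x = 1) const { return x; }
    virtual ~Base() {}
};
struct Derived : Base {
    int get(int x = 2) const override { return x + 100; }
};

int call_through_base(const Base &b) {
    return b.get(); // always passes Base's default, 1
}
```

Passing a Derived to call_through_base runs Derived::get with Base's default argument, which is why giving different defaults to an overrider is usually a mistake.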

Lambda

29.   Lambdas are function objects with a deleted default ctor, deleted assignment operators and a default dtor, and they have a member ‘operator()’ which is a const member by default. Whether a lambda has a defaulted or deleted copy/move ctor depends on the captured data members.
30.   A value (as opposed to a reference) captured by a lambda is const inside the body. You can override its constness by adding the mutable keyword after the parameter list.
31.   In a lambda function, when the return type is not specified and the function body is just a return statement, the return type is inferred from it; otherwise it is void.
32.   We can omit the parameter list and return type of a lambda. So the simplest form and the fully spelled form are:
auto f = []{};
auto f = [captures](parameters) mutable -> returntype {}
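To make item 30 concrete, here is a small sketch (all function names are made up): a mutable lambda changes its own copy of the captured value, never the original variable.

```cpp
#include <cassert>

// The outer variable is untouched by calls to a mutable by-value lambda.
int outer_after_mutable_calls() {
    int n = 0;
    auto bump = [n]() mutable { return ++n; };
    bump();
    bump();
    return n; // the lambda only changed its private copy
}

// The copy lives inside the lambda object, so it persists between calls.
int second_call_result() {
    int n = 0;
    auto bump = [n]() mutable { return ++n; };
    bump();
    return bump();
}

// Item 32: parameter list and return type omitted.
int simplest_forms() {
    auto nothing = [] {};
    nothing();
    auto answer = [] { return 42; }; // return type inferred from the single return
    return answer();
}
```

Without mutable, the ++n line would not compile, because operator() is a const member by default (item 29).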

Template

33.   Template member function cannot be virtual.
34.   We can make a template type parameter a friend.
template <typename Type> class Bar{friend Type;};
It is OK when it is instantiated with a built-in type.
35.   Instantiation also happens (if not yet) when assigning the template function to a function pointer / reference.
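Items 34 and 35 in a sketch. Bar follows the snippet above, with a private member added so that friendship has something to unlock; Peer, secret_, twice and the two global variables are assumptions made for illustration.

```cpp
#include <cassert>

// Item 34: the template type parameter itself is the friend.
template <typename Type>
class Bar {
    friend Type;
    int secret_ = 42;
};

struct Peer {
    // Allowed because Bar<Peer> names Peer as a friend.
    static int peek(const Bar<Peer> &bar) { return bar.secret_; }
};

// With a built-in type the friend declaration is simply ignored.
Bar<int> bar_of_int;

// Item 35: initializing a function pointer or reference picks T
// and instantiates the template, even though nothing is called yet.
template <typename T>
T twice(T x) { return x + x; }

int (*twice_int)(int) = twice;          // instantiates twice<int>
double (&twice_double)(double) = twice; // instantiates twice<double>
```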

Access control

36.   Other than narrowing down the access control during inheritance (by public, protected, private derivation), we have a way to enlarge it:
class Derived: private Base{
public:
        using Base::protected_member;
};
Now, protected_member is public.
37.   The protected/public member can be accessed from derived class, even if it uses private derivation. The access control of private derivation has effect on how to use the derived class, not the base class.
38.   Normal access controls apply to pointers to members.

STL:

39.   The STL containers now have cbegin() and cend(), which are specifically designed for auto:
auto citer = vec.cbegin(); // always a const_iterator, regardless of the container type.
40.   std::swap() on two containers only invalidates the iterators on std::string. It does not invalidate iterators on other types of containers: they keep pointing to the same elements, which now live in the other container.
41.   Forward_list does not have size(). (I do not know why exactly)
42.   Forward_list is different: it uses insert_after, erase_after, etc., and an off-the-beginning iterator (before_begin()).
43.   The string library now has string-to-numeric conversion functions (stoi, stod, etc.).
44.   reverse_iterator.base() yields the adjacent position rather than the same one.
45.   Use make_pair, make_tuple to let compiler deduce types for you.
46.   For std::set, both const_iterator and (non-const) iterator are constant iterators: the elements cannot be modified through either.
47.   The bitset subscript operator [] counts from right to left.
48.   We can use std::function to store a function directly, but not when the function has overloaded versions:
int func(int, int);
int func(double, double);
std::function<int(int, int)> f = func; //which?
Use a pointer or reference to disambiguate:
int (&funcii )(int , int) = func;
std::function<int(int, int)> f =  funcii;
49.   On an fstream, seekg/tellg and seekp/tellp use the same marker. A stringstream is different: its read and write positions are tracked separately.
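A few of the STL notes (items 43, 44, 47 and 48) in one sketch. The bodies of the two func overloads are assumptions, since the notes only declare them, and the helper function names are made up.

```cpp
#include <bitset>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

int func(int a, int b) { return a + b; }   // assumed body
int func(double, double) { return -1; }     // assumed body

// Item 48: a typed reference selects one overload for std::function.
int call_int_overload() {
    int (&funcii)(int, int) = func;
    std::function<int(int, int)> f = funcii;
    return f(2, 3);
}

// Item 44: rbegin() refers to the last element, but its base() is end(),
// i.e. one position further along in forward order.
bool base_is_adjacent() {
    std::vector<int> v = {10, 20, 30};
    auto rit = v.rbegin();
    return *rit == 30 && rit.base() == v.end();
}

// Item 47: bit 0 is the rightmost character of the string form.
bool bitset_counts_from_right() {
    std::bitset<4> bits("0001");
    return bits[0] && !bits[3];
}

// Item 43: string-to-numeric conversion.
int parse_int(const std::string &text) {
    return std::stoi(text);
}
```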

Other

50.   Sizeof (*a_null_pointer) is valid. Sizeof (char) is guaranteed to be 1, always.
51.   Enumerator values need not be unique: enum {A = 1, B = 1, C = 1}; //it’s OK
52.   A definition of a nested class can be outside the enclosing class.
53.   A local class is legal but must be completely defined in the class body.
54.   A constexpr function is not required to always return a constant expression.
55.   A constexpr static data member does not need a definition outside the class when only its literal value is used. But it still needs one when run-time properties (like its address, via operator &) are required.
56.   Friendship is not inherited. But, a friend of Base class can access the members of Base part of a Derived object.
57.   Virtual base classes are always constructed prior to non-virtual base classes regardless of where they appear in the inheritance hierarchy.
58.   noexcept  specifier should not appear in a typedef or type alias.
59.   The compiler will generate (when needed) a noexcept ctor / copy-control members / dtor if the corresponding operations for all of its members promise not to throw.
60.    A function pointer that does not specify noexcept can point to noexcept functions, but NOT the other way around.
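Items 50, 51 and 60 can mostly be checked at compile time. The names Big, quiet, plain_ptr and enumerators_sum are hypothetical; the enum is the one from item 51.

```cpp
#include <cassert>

// Item 50: sizeof never evaluates its operand, so the null pointer
// is not actually dereferenced.
struct Big { char data[64]; };
Big *a_null_pointer = nullptr;
static_assert(sizeof(*a_null_pointer) == sizeof(Big), "operand is unevaluated");
static_assert(sizeof(char) == 1, "guaranteed by the language");

// Item 51: enumerators may share a value.
enum { A = 1, B = 1, C = 1 };
static_assert(A == B && B == C, "duplicates are fine");

int enumerators_sum() { return A + B + C; }

// Item 60: a pointer without noexcept may point to a noexcept function.
void quiet() noexcept {}
void (*plain_ptr)() = quiet;

static_assert(noexcept(quiet()), "calling the function directly is noexcept");
static_assert(!noexcept(plain_ptr()), "the promise is lost through the plain pointer");

// The reverse direction is rejected:
// void may_throw() {}
// void (*strict_ptr)() noexcept = may_throw; // error
```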

2012/08/01

A Simulation Way: Dynamically creating arbitrary dimensional array in a continuous memory block

I posted a piece of code for dynamically creating an arbitrary dimensional array in a continuous memory block. The purpose of that code was to create a raw pointer hierarchy which behaves like an array defined on the stack. I did not implement a multiarray class until recently because some implementations are already available online, like boost.multiarray, and what I mostly needed was code more compatible with an old code interface.
But when it comes to a larger project, a significant drawback of that method becomes apparent: we need to dereference a pointer many times (e.g. arr[2][4][1] needs 3 dereferences) to reach the data. Each dereference requires one memory access, so it is time consuming.
In this post, I will give another way to do the job: simulating the multi-array behavior with a container.
The idea is, we only and always use a 1-D memory block, but we define operator[] and operator* properly to map onto the correct index. Obviously, arr[2] and arr[2][3] have different return types, so a Proxy class is needed to simulate this hierarchy.
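Before the full code, here is the same idea reduced to two dimensions. This is a deliberately simplified sketch and not the MultiArray class itself: Grid2D, Row and at are names invented for this example, and it stores int in a std::vector to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Grid2D {
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_; // the single 1-D memory block

public:
    // Proxy returned by the first operator[]: it only remembers the
    // position computed so far, no pointer into a row table.
    class Row {
        Grid2D *owner_;
        std::size_t base_;

    public:
        Row(Grid2D *owner, std::size_t base) : owner_(owner), base_(base) {}
        int &operator[](std::size_t col) const {
            assert(col < owner_->cols_);
            // pure arithmetic: flat index = row * cols + col
            return owner_->data_[base_ * owner_->cols_ + col];
        }
    };

    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    Row operator[](std::size_t row) {
        assert(row < rows_);
        return Row(this, row);
    }

    // Flat access, to show both views reach the same element.
    int &at(std::size_t flat) { return data_[flat]; }
};
```

Writing g[2][1] on a Grid2D g(3, 4) touches the same element as g.at(2 * 4 + 1). The class below generalizes this by letting each proxy return a proxy of one dimension less.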
The code and an example test are pasted here. It requires a very recent compiler like GCC 4.7.1 with -std=c++0x, because I used template aliasing so as not to terrify a user who wants to get a reference to the returned Proxy object. (For more detail please refer to the comments.) Note that the testing examples used here are almost the same as the ones used in testing IdealArrBuilder.
It behaves just like a multi-dimensional array created on a continuous memory block. But it runs much faster because operator[] does not need to dereference a pointer, i.e. does not need to access memory; instead, it is optimized fully into pure index arithmetic. In my test, it runs 2~3 times faster than the raw-pointer one (made by IdealArrBuilder) and even faster than the naive 1-D array. The reason why it is even faster than the naive 1-D array, as far as I analyzed, is that when dereferencing [i][j][k] the compiler seems to understand it can reuse the i, j calculation, so it expands the inner loop: 5 into 1.
I also provided some useful functions: to reference the returned Proxy object, or to assign it to another MultiArray object of the same dimension.
Besides the multiarray operations, it also supports basic container operations.
Finally, I found it very useful to add some RAII to it, so it also behaves like a smart pointer.
That's the code, comments, and an example:
Please compile it with a compiler supporting template aliasing, or remove the related code; removing it does not affect the logic but complicates the usage.




#include <iostream>
#include <vector>
#include <cassert>
#include <cstddef> //size_t
#include <cstring> //memcpy, memset
#include <utility> //std::move

struct Foo
{
    int i, j;
};

template<class Type, size_t Dimension>
class MultiArray
{
    size_t _size;
    Type * _buffer;
    size_t _sizes[Dimension];

    template<typename ClientType, size_t CurDim>
    struct _Proxy
    {
        typedef _Proxy<ClientType, CurDim - 1> DerefType; //This is the type operator[] returns.
        size_t _base_pos; //This is the position provided from higher dim.
        size_t _cur_size;
        ClientType * _client_ptr;
        const static size_t DIMENSION = CurDim;
        //Just pass the argument level by level to the highest base
        _Proxy(size_t pos, ClientType * outter_ptr, size_t cur_size):_base_pos(pos), _cur_size(cur_size), _client_ptr(outter_ptr) {}
        DerefType operator[](size_t idx) const
        {
            size_t dim_size = _client_ptr->_sizes[Dimension - CurDim];
            //Like VC implementation of STL containers,
            //I do not check boundaries in Release version for better performance
            size_t pos = dim_size * _base_pos + idx;//To get the next position, we do this calculation
            size_t next_size = dim_size * _cur_size;//maintain current total size by this dimension to check access violation in Debug mode.
            assert(0 <= pos && pos < next_size);
            return DerefType(pos, _client_ptr, next_size);
        }
        inline DerefType operator * () const
        {
            return operator[](0);
        }
        operator MultiArray<Type, CurDim> ()//Conversion operator. The 3 of them (see below) are all the same essentially.
        {
            size_t sizes_beg = ClientType::DIMENSION - DIMENSION;
            MultiArray<Type, CurDim> arr_to_conv(_client_ptr->_sizes + sizes_beg);
            size_t pos = arr_to_conv.size() * _base_pos;
            memcpy(arr_to_conv.buffer(), _client_ptr->buffer() + pos ,  arr_to_conv.size() * sizeof (Type));
            return arr_to_conv;//it will invoke the move semantic, no painful copying and the resource is transferred to the outside object.
        }
    };
    template<typename ClientType>
    struct _Proxy<ClientType, 1>
    {
        typedef Type & DerefType;
        size_t _base_pos;
        size_t _cur_size;
        ClientType * _client_ptr;
        _Proxy(size_t pos, ClientType * outter_ptr, size_t cur_size):_base_pos(pos), _cur_size(cur_size), _client_ptr(outter_ptr) {}
        DerefType operator[](size_t idx) const
        {
            size_t dim_size = _client_ptr->_sizes[Dimension - 1];
            size_t pos = dim_size * _base_pos + idx;
            size_t total_size = dim_size * _cur_size;
            assert(0 <= pos && pos < total_size);
            return _client_ptr->_buffer[pos];
        }
        inline DerefType operator * () const
        {
            return operator[](0);
        }
        operator MultiArray<Type, 1> ()
        {
            size_t sizes_beg = _client_ptr->DIMENSION - 1;
            MultiArray<Type, 1> arr_to_conv(_client_ptr->_sizes + sizes_beg);
            size_t pos = arr_to_conv.size() * _base_pos;
            memcpy(arr_to_conv.buffer(), _client_ptr->buffer() + pos ,  arr_to_conv.size() * sizeof (Type));
            return arr_to_conv;
        }

    };
    template<typename ClientType>
    struct _Proxy<const ClientType, 1>
    {
        typedef Type DerefType;
        size_t _base_pos;
        size_t _cur_size;
        const ClientType * _client_ptr;
        _Proxy(size_t pos, const ClientType * outter_ptr, size_t cur_size):_base_pos(pos),  _cur_size(cur_size),_client_ptr(outter_ptr) {}
        DerefType operator[](size_t idx) const
        {
            size_t dim_size = _client_ptr->_sizes[Dimension - 1];
            size_t pos = dim_size * _base_pos + idx;
            size_t total_size = dim_size * _cur_size; //
            assert(0 <= pos && pos < total_size);
            return _client_ptr->_buffer[pos];
        }
        inline DerefType operator * () const
        {
            return operator[](0);
        }
        operator MultiArray<Type, 1> ()
        {
            size_t sizes_beg = _client_ptr->DIMENSION - 1;
            MultiArray<Type, 1> arr_to_conv(_client_ptr->_sizes + sizes_beg);
            size_t pos = arr_to_conv.size() * _base_pos;
            memcpy(arr_to_conv.buffer(), _client_ptr->buffer() + pos ,  arr_to_conv.size() * sizeof (Type));
            return arr_to_conv;
        }

    };



public:
    const static size_t DIMENSION = Dimension;
    template <size_t Dim>
    using Accessor = _Proxy<MultiArray, Dim>;
    template <size_t Dim>
    using Const_Accessor = _Proxy<const MultiArray, Dim>;

    typedef typename _Proxy<MultiArray, Dimension>::DerefType DerefType;
    typedef typename _Proxy<const MultiArray, Dimension>::DerefType CDerefType;
    inline Type& at(size_t idx)
    {
        assert(0 <= idx && idx < _size);
        return _buffer[idx];
    }
    inline Type at(size_t idx) const
    {
        assert(0 <= idx && idx < _size);
        return _buffer[idx];
    }
    DerefType  operator[](size_t idx)
    {
        return _Proxy<MultiArray, Dimension>(0, this, 1)[idx];
    }

    CDerefType operator[](size_t idx) const
    {
        return _Proxy<const MultiArray, Dimension>(0, this, 1)[idx];
    }
    DerefType operator *()
    {
        return operator[](0);
    }
    CDerefType operator * () const
    {
        return operator[](0);
    }
    template<typename ForwardIter>
    explicit MultiArray(ForwardIter sizes_beg = ForwardIter(), Type * buffer = nullptr): _size(0), _buffer(nullptr)
    {
        memset(_sizes, 0, Dimension * sizeof (size_t));
        reallocate(sizes_beg, buffer);
    }
    //no sizes info means we keep the current size but re-initialize the elements
    //Note instead of making it quite like a container, I also like to make it like a RAII resource wrapper.
    template<typename ForwardIter>
    void reallocate(ForwardIter sizes_iter = ForwardIter(), Type * buffer = nullptr)
    {
        delete [] _buffer;
        morph(sizes_iter);

        if (_size == 0)
        {
            _buffer = nullptr;
            return;
        }
        if (buffer == nullptr)
            _buffer = new Type[_size](); //I do not handle exceptions
        else
            _buffer = buffer;
    }
    //It is not yet a qualified smart pointer, but it behaves similarly so it can simplify my work.
    void reset(Type * buffer = nullptr)
    {
        if (buffer == nullptr)
        {
            clear();
            return ;
        }
        delete [] _buffer;
        _buffer = buffer;
    }

    //change the size of each dimension, but the buffer is not affected.
    template<typename ForwardIter>
    void morph(ForwardIter sizes_iter = ForwardIter())
    {
        if (sizes_iter == ForwardIter())
        {
            //we leave it peacefully if the iterator is invalid.
            return ;
        }
        _size = 1;
        for (size_t i = 0; i < Dimension ; ++i)
        {
            _sizes[i] = *(sizes_iter ++);
            _size *= _sizes[i];
        }

    }

    void clear()
    {
        memset(_sizes, 0, Dimension * sizeof (size_t));
        delete [] _buffer;
        _buffer = nullptr;
        _size = 0;
    }

    MultiArray(const MultiArray &to_copy): _size(0), _buffer(nullptr)
    {
        *this = to_copy;
    }
    MultiArray & operator = (const MultiArray &to_assign)
    {
        if (this == &to_assign)
            return *this;

        _size = to_assign._size;

        memcpy(_sizes, to_assign._sizes, Dimension * sizeof(size_t));

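        delete [] _buffer;
        _buffer = nullptr;
        if (_size != 0) //nothing to allocate or copy for an empty array
        {
            _buffer = new Type[_size](); //I do not handle exceptions
            //memcpy is only valid for trivially copyable Type (such as Foo)
            memcpy(_buffer, to_assign._buffer, _size * sizeof(Type));
        }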
        return *this;
    }
    MultiArray(MultiArray && to_move):  _size(0), _buffer(nullptr)
    {
        *this = std::move(to_move);
    }
    MultiArray & operator = (MultiArray && to_move_assign)
    {
        if (this == &to_move_assign)
            return *this;
        clear();
        _size = to_move_assign._size;
        memcpy(_sizes, to_move_assign._sizes, Dimension * sizeof(size_t));
        memset(to_move_assign._sizes, 0, Dimension * sizeof (size_t));
        _buffer = to_move_assign._buffer;
        to_move_assign._buffer = nullptr;
        to_move_assign._size = 0;

        return *this;
    }

    inline bool empty() const
    {
        return _size == 0;
    }
    inline size_t size() const
    {
        return _size;
    }
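    inline Type * buffer()
    {
        return _buffer;
    }
    //const overload, so the conversion operator of a const Proxy can read the buffer
    inline const Type * buffer() const
    {
        return _buffer;
    }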
    inline const size_t * sizes() const
    {
        return _sizes;
    }
    inline size_t dimension() const
    {
        return DIMENSION;
    }
    ~MultiArray()
    {
        clear();
    }


};




using namespace std;

int main()
{
    size_t sizes[] = {4, 5, 6, 2};//4 is the size of the highest dimension
    MultiArray<Foo, 4> arr(sizes);
    vector<size_t> sizes_vec = {1, 2, 3 ,4};
    MultiArray<Foo, 4> brr(sizes_vec.begin());// works too
    MultiArray<Foo, 1> arr1d(sizes);
    arr1d[2].i = 3;

    cout<< MultiArray<Foo, 4>::DIMENSION<<endl;
    cout<<arr1d[2].i<<endl;

    (****arr).i = 5;
    arr[2][3][0][0].i = 3;
    arr[2][3][0][1].j = 4;

    MultiArray<Foo, 4> crr( arr); //copy
    //naturally, these work too, but they are copying.
    MultiArray<Foo, 2> copy2d = arr[2][3];
    MultiArray<Foo, 1> copy1d = copy2d[0];
    cout<<copy2d.size()<<endl;
    cout<<copy2d[0][1].j<<endl;
    cout<<copy1d[1].j<<endl<<endl;

    //use accessor to reference the intermediate object so that no need to copy.
    //Note, the const qualifier is modifying Accessor not the array.
    MultiArray<Foo, 4>::Accessor<2> const & acc_ref = arr[2][3];//auto const & acc_ref works too
    acc_ref[3][1].i = 5;
    cout<<acc_ref[3][1].i<<endl<<endl;

    //copy and move semantic:
    auto arr_cp = arr;
    cout<<arr_cp[2][3][0][1].j<<endl;
    auto arr_mv = std::move(arr_cp);
    cout<<arr_cp.size()<<endl;//arr_cp is no longer holding any resource.
    cout<<arr_mv[2][3][0][1].j<<endl;
    cout<<endl;

    //the subscripting ability is demonstrated below.
    const auto &arrref = arr;
    cout<<arr[2][3][0][0].i<<endl;
    cout<<arrref[2][3][0][1].j<<endl;
    cout<<arrref[0][0][0][0].i<<endl; //arrref[0][0][0][0] <=>****arrref
    cout<<(****arrref).i<<endl;
    cout<<endl;
    cout<<arrref[2][3][0][1].j<<endl;
    cout<<(***arrref)[157].j<<endl; //157 = 2*5*6*2 + 3*6*2 + 0*2 + 1
    cout<<(**arrref)[78][1].j<<endl; // 78 = 2*5*6 + 3*6
    cout<<(*arrref)[13][0][1].j<<endl; // 13 = 2*5 + 3
    cout<<(*arr[2][3])[1].j<<endl;//*arr[2][3] <=> arr[2][3][0]
    cout<<(*(*arr)[13])[1].j<<endl;//(*arr)[13] <=> arr[2][3]
    cout<<arrref.at(157).j<<endl;

    arr[2][3][0][3].i = 3; //valid, this is allowed because the memory is allocated in one continuous block.

    //cout<<(*(*arr)[20])[1].i<<endl;//debug assertion failed because for the 2nd dimension, the idx must be below 4*5 = 20

    //arr.at(240).i = 7;//debug assertion failed because of access violation
    //cout<<(***arrref)[250].i<<endl; //debug assertion failed because of access violation
    //cout<<(**arrref)[130][1].i<<endl; //debug assertion failed because for the 3rd dimension, the idx must be below 4*5*6 = 120
    size_t sizes2[]= {12, 2, 2, 5};
    arr.morph(sizes2);
    cout<<arr[7][1][1][2].j<<endl;
    arr.reallocate(sizes2);
    cout<<arr[7][1][1][2].j<<endl;
    return 0;
}